[CDBI] How well does Class::DBI scale?

Rick Welykochy rick at praxis.com.au
Sun Dec 18 03:29:16 GMT 2005

Hello to all fans and users of Class::DBI,

This is an amazing module, one which I have only become familiar
with in the last month. I love the how it "fits" into perl very
nicely and hides the vagarities of DBI and "rows as hashes" in a
cozy and safe object model.

The project I have inherited is a bit of a worry, though. It uses Class::DBI
for a system having approx 50 tables, and with Parent -> Child relationships
(has_many) nest appprox 10 deep. As well, there are a few hundred Parent ->
Child (has_a) one-to-one relationships.

That said, the code I have inherited is tight, compact and the programmer
really knew how to use Class::DBI well. There is nary an SQL statement in
sight, with everything done behind the scenes using (unoptimised) Class::DBI

The problem is that for a relatively simple report of, say, 20 rows from
the top-most table (hierachically speaking), upwards of 4000 (or more)
individual SQL statements are fired off to the DB server (PostgreSQL in
this case). This is because the top-most table refers to relatively
trivial great-grandchild tables often for simple things like the name
of something whoe primary key is a (serial) ID.

My job is to optimise these, i.e. unravel the goodness that Class::DBI
objects are doing, write several (LARGE) SQL statements to get the
same result sets, and then map those back into Class::DBI using the
construct() method. This approach is working well. A page that used to
fire off 476 SQL statements now fires off less than 50 with a few hours
of work spent optimising. The goal is to optismise but leave the rest
of the code instact, i.e. keep consistently using the Class::DBI paradigm
without reverting back to DBI and hashes.

The process of optimising in this way is labour-intensive and error-prone.
Of course, I am logging and counting SQL statements to be able to
quantify and observe improvements. But there are many many slow areas
in the medium-sized system that will require heaps of manual optimisation.

My question to the list: has anyone else run into these problems (too
many SQLs) and if so, have they found an automatable solution?

A few notes:

1. yes, I have googled for this and found some discussion pointing out
    that Class::DBI can be an SQL hog

2. I have reverse engineered Class::DBI and observed the caching behaviour
    and am not too happy with it. For no apparent reason, sometimes CDBI
    goes back to the database when a perfectly good and completely
    "__flesh()'d" object is already in the database. This is especially
    annoying in sth_to_objects() when never seems to cache things very well.

3. I have played with columns( Essential => all cols ) but observe very little
    improvement (i.e. reduction) in the number of SQLs.

4. I am using Class::DBI 0.96 still (don't ask!)


More information about the ClassDBI mailing list