[CDBI] How well does Class::DBI scale?
rick at praxis.com.au
Sun Jan 22 04:30:19 GMT 2006
Thanks to all on the list who responded to my inquiries about
improving spped and efficiency in an app built around CDBI.
Rick Welykochy wrote:
> The problem is that for a relatively simple report of, say, 20 rows from
> the top-most table (hierachically speaking), upwards of 4000 (or more)
> individual SQL statements are fired off to the DB server (PostgreSQL in
> this case). This is because the top-most table refers to relatively
> trivial great-grandchild tables often for simple things like the name
> of something whoe primary key is a (serial) ID.
1. I did hand-optimise some queries, then manually create the required
CDBI data objects using the construct() method. This is a very time-
consuming process, and requires lots of testing and fiddle-farting
around with rather large SQL statements. The results are worth it if
you have the time to spend on it. Some of the optimisations results in
one SQL statement creating hundreds of CDBI objects - a great improvement
in speed and responsiveness. But, this approach kinda defeats the purpose
of CDBI, don't you think?
2. I audited the use of Essential columns in the app. Practically non-existent.
Since I was really getting tired of step 1. above, I diverted my attention
for a day and added all columns that were not bytea (or BLOB) to the
Essential columns lists for all 50 or so tables. The speed improvement
was anywhere from 1.5 to 2.5 X faster (with correspondingly less SQL
being fired at the PostgreSQL server). I'm happy with that, and it
took less than a days work and required very little testing since the
application code was not touched.
My client is happy with the results and I can now move on to writing
more application code.
My conclusion and advice: in these days of large memories, negafast CPUs
and heaps of disk space, err on the side of too many columns in Essential,
i.e. does it really matter if your query loads 5 columns or 25? Nope, I don't
think so. It is far more efficient (programmer-wise) to simply load
all columns that ain't really really big (ignore blobs) for all rows
from all tables. This guarantees that only rarely will CDBI have to return
to the database a second time to get more data for a row.
Like many optimisations, I could spend months more getting another
speed increase by a factor of 2 X or even 10 X on certain pages, but
it is not worth the effort.
Rick Welykochy || Praxis Services
Web (hosting), security and high-performance computing are the
three areas where Linux has more strength.
-- Bob Muglia, senior VP in charge of Windows Server development
More information about the ClassDBI