[CDBI] Make CDBI go fast

Bill Moseley moseley at hank.org
Thu Feb 15 05:07:34 GMT 2007

On Wed, Feb 14, 2007 at 08:50:20PM -0500, Michael G Schwern wrote:

Which of the items you listed do you think are the biggest performance
wins?

> * Make iterators fetch rows on demand. (Done: see rt.cpan.org 24959)
> In case y'all didn't know, CDBI pre-fetches all the rows out of a
> database when it searches and sticks that list into the iterator.
> It then creates objects on demand from that list.  This is really
> inefficient if you do, say, CDBI->search(...)->first and search()
> does a SELECT which returns a lot of rows.

I just looked through my code, and almost everywhere ->first is used
it's for queries that return one (or maybe just a few) rows.  The vast
majority of the time I want to fetch all the rows in a single query.
I guess the bulk of my searches are displaying a list of results a
page at a time, so I'm using LIMIT.

I'd rather see some of CDBI::Sweet's features added into CDBI, so that
using LIMIT is the way to avoid fetching unwanted rows.  I'd rather
use the memory than make extra trips to the database.
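For reference, here's roughly what that looks like with CDBI::Sweet's
paging support.  The class, table, and columns below are made up, and
I'm going from memory on the page() API, so treat this as a sketch:

```perl
package My::Track;
use base 'Class::DBI::Sweet';   # Sweet layers LIMIT/paging onto plain CDBI
__PACKAGE__->connection('dbi:SQLite:dbname=music.db', '', '');
__PACKAGE__->table('track');
__PACKAGE__->columns(All => qw/trackid title artist length/);

package main;

# Fetch page 2, ten rows per page, as one LIMIT/OFFSET query --
# instead of pulling every row and throwing most of them away.
my ($pager, $it) = My::Track->page(
    { artist => 'Foo' },
    { rows => 10, page => 2, order_by => 'title' },
);

# $pager is a Data::Page object, $it a normal CDBI iterator.
printf "showing %d-%d of %d\n",
    $pager->first, $pager->last, $pager->total_entries;
while ( my $track = $it->next ) {
    print $track->title, "\n";
}
```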

> * Inflate columns into objects on demand.

This would be nice.  Not sure how much of a speed killer that is,
though.  And I wonder if anyone is using that behavior to test which
columns inflate to other objects.

> * Allow searches to pre-flesh out column groups (so it takes 1 query
> instead of N)
> Right now, searches will only fetch the essential columns in its
> query.  Then if you use a non-essential column it must do another
> query for each object.  For lots of objects with several groups this
> can get expensive.  It would be handy if there was a syntax like:
>     my @objects = CDBI->search( col => 42, { flesh_out => \@groups }
>     );

I always thought the column groups were supposed to handle most cases.
But it would be handy to be able to override the essential group in
some cases -- I'd probably just like to be able to specify the
columns to fetch on a per-search basis.
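For anyone who hasn't played with column groups: the class declares up
front which columns every search fetches (Essential) and which are
pulled in lazily, one extra SELECT per group.  The table and column
names here are invented for illustration:

```perl
package My::CD;
use base 'Class::DBI';
__PACKAGE__->connection('dbi:SQLite:dbname=music.db', '', '');
__PACKAGE__->table('cd');

__PACKAGE__->columns(Primary => 'cdid');

# Columns fetched up front by every search:
__PACKAGE__->columns(Essential => qw/cdid title artist/);

# Lazily-fetched groups -- touching any column in a group triggers
# one extra SELECT that pulls in that whole group:
__PACKAGE__->columns(Details => qw/year label/);
__PACKAGE__->columns(Blobs   => qw/liner_notes cover_image/);
```

The flesh_out idea above would effectively let a search promote one of
those lazy groups to essential for just that query.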

> * Somehow make joins go fast
> The existing system of relationships doesn't take advantage of
> joins.

Yeah, I can't imagine using CDBI without CDBI::Sweet.

> * Construct a hierarchy of CDBI objects from a SELECT
> Use the results of one custom query to be used to create several
> objects.  In one query which returns columns from several tables you
> could create several objects per row.  This is probably related to
> making joins efficient.

Perhaps not a performance issue, but I often use views where it would
be nice to create several objects.  In cases where the view mostly
represents an existing class, I create the view class as a subclass.
But often the view includes columns from related classes, so it would
be handy to populate those objects too.  I do this manually in a few
instances now.
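The manual version leans on construct(), which builds a live object
from data you've already fetched instead of issuing a per-object
SELECT.  The view, classes, and column names below are hypothetical:

```perl
# Assumed view joining the cd and artist tables.
My::CD->set_sql(with_artist => q{
    SELECT cdid, title, artistid, name
    FROM   cd_artist_view
    WHERE  year = ?
});

sub cds_with_artists {
    my ($class, $year) = @_;
    my $sth = $class->sql_with_artist;
    $sth->execute($year);
    my @pairs;
    while ( my $row = $sth->fetchrow_hashref ) {
        # construct() skips the usual retrieve() round trip,
        # so one query populates two objects per row.
        my $cd     = My::CD->construct(
            { cdid => $row->{cdid}, title => $row->{title} });
        my $artist = My::Artist->construct(
            { artistid => $row->{artistid}, name => $row->{name} });
        push @pairs, [ $cd, $artist ];
    }
    return @pairs;
}
```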

> * A bulk insert method * A bulk delete method

Bulk inserts don't seem to come up too often, and for bulk deletes I
just write the SQL.  I actually find set_sql to be just fine for more
complex tasks.
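Something like this, say (statement name and column are made up):

```perl
My::Track->set_sql(purge_artist => q{
    DELETE FROM __TABLE__ WHERE artist = ?
});

# One DELETE statement instead of N delete() calls; DBI's execute()
# returns the number of rows affected, so you also find out whether
# anything was actually removed.
my $deleted = My::Track->sql_purge_artist->execute($artist_id);
```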

Let's see, I'm sure there are plenty of wish-list items people can
come up with.  Not related to performance, but I wish delete()
returned true only when it really deleted the object from the
database.  Sometimes that's important to know.

Bill Moseley
moseley at hank.org
