[CDBI] Make CDBI go fast

Bill Moseley moseley at hank.org
Thu Feb 15 05:07:34 GMT 2007


On Wed, Feb 14, 2007 at 08:50:20PM -0500, Michael G Schwern wrote:

Which of the items you listed do you think are the biggest performance
killers?


> * Make iterators fetch rows on demand. (Done: see rt.cpan.org 24959)
> 
> In case y'all didn't know, CDBI pre-fetches all the rows out of a
> database when it searches and sticks that list into the iterator.
> It then creates objects on demand from that list.  This is really
> inefficient if you do, say, CDBI->search(...)->first and search()
> does a SELECT which returns a lot of rows.

I just looked through my code and almost everywhere ->first is used
it's when I'm returning one (or maybe just a few) rows.  For the vast
majority of the time I want to fetch all the rows in a single query.
I guess the bulk of my searches are displaying a list of results a
page at a time so I'm using LIMIT.

I'd rather see some of CDBI::Sweet's features added into CDBI so using
LIMIT is the way to reduce fetching un-wanted rows.  I'd rather use
the memory then do extra trips to the database.


> * Inflate columns into objects on demand.

This would be nice.  Not sure how much of a speed killer that is,
though.  And I wonder if anyone is using that behavior to test which
columns inflate to other objects.


> * Allow searches to pre-flesh out column groups (so it takes 1 query
> instead of N)
> 
> Right now, searches will only fetch the essential columns in its
> query.  Then if you use a non-essential column it must do another
> query for each object.  For lots of objects with several groups this
> can get expensive.  It would be handy if there was a syntax like:
> 
>     my @objects = CDBI->search( col => 42, { flesh_out => \@groups }
>     );

I always think the column groups are suppose to handle most cases.
But, it would be handy to be able to override the essential group in
some cases -- I'd probably just like to be able to specify the
columns.


> * Somehow make joins go fast
> 
> The existing system of relationships doesn't take advantage of
> joins.

Ya, I've can't imagine CDBI without using CDBI::Sweet.


> * Construct a hierarchy of CDBI objects from a SELECT
> 
> Use the results of one custom query to be used to create several
> objects.  In one query which returns columns from several tables you
> could create several objects per row.  This is probably related to
> making joins efficient.

Perhaps not a performance issue, but I often use views where it would
be nice to create several objects.  In some cases where the view
mostly represents an existing class I create the view class as a sub
class.  But, often the view includes columns from related classes, so
it would be handy to populate those objects, too.  I do this manually
in a few instances now.


> * A bulk insert method * A bulk delete method

Bulk inserts don't see to come up too often, and for bulk deletes I
just write the sql.  I actually find set_sql to be just fine for more
complex tasks.


Let's see, I'm sure there's plenty of wish list items people can come
up with.  Not related to performance, but I wish delete() returned
true only when it really deleted the object from the database.
Sometimes that's important to know.



-- 
Bill Moseley
moseley at hank.org




More information about the ClassDBI mailing list