[CDBI] Make CDBI go fast
Michael G Schwern
schwern at gmail.com
Thu Feb 15 16:03:49 GMT 2007
Ok, let's answer everyone's questions in one shot. First, here's what I'm working on:
* What is this app?
I've been specifically tasked with speeding up an existing database-heavy web application. It uses CDBI heavily. It's big and does some hairy business logic. CDBI is a bottleneck; I've identified some of the specifics through profiling. They're having to write far too much SQL by hand to be performant. They've also hacked the guts of CDBI a little to optimize for their particular setup.
It's not a public web site. They're essentially developing a service/application which they sell directly to customers and which is used by a small number of paid users. The complexity comes from the hairy business logic and the large volume of data.
It does not have a good test suite.
I am optimizing *specifically* for this application. If it happens to be useful for other CDBI users, great, but that's not my concern. As much as possible will be sent back into CDBI, CDBI::mysql or released as a plugin.
* What database is it using?
MySQL 5 InnoDB. MySQL hasn't been much of a bottleneck, but its limitations wrt transactions and foreign key constraints have caused them to do more work in Perl than they should. Moving to Postgres is a possibility I'm exploring, but that's a whole other discussion.
MySQL means views are to be avoided (they're really inefficient), no custom types, no custom constraints, crappy foreign key constraints and crappy transactional isolation. Can you tell I've been using Postgres lately?
* What's the data like?
It's a large, normalized schema on the order of 100+ tables. It's replicating existing business logic which cannot easily be changed, so it's a bit crazy in places. Some tables have on the order of a million+ rows.
* What's performance like?
Performance is generally good, with most pages coming back in less than 4 seconds, but some can take hundreds of seconds. Ideally we'd like most page loads to get down under 1 second, with the occasional 10-20 second report. In effect, for it to have performance more like a stand-alone application than a web page.
* What about using another ORM?
The basic table classes are all generated from YAML schema files, so there is some flexibility to move to another ORM. As the tests are weak and the code is complex, I want to alter as little of the application code as possible. Any replacement would have to have a very good CDBI wrapper.
* What about Rose-DB?
I know nothing about it. I'm open to convincing. I don't see a CDBI compatibility layer.
* What about DBIx::Class or CDBI::Sweet?
Ditto, except they do have a CDBI compat layer. I'm told DBIx::Class was tried in the past and it didn't work out, but I don't have the details. If folks say it has the performance I'm looking for, I'm willing to give it a shot.