[CDBI] Class::DBI::Iterator

Matt S Trout dbix-class at trout.me.uk
Sun Aug 27 22:18:05 BST 2006

Tom Metro wrote:
> As mentioned in a prior post, I had a need to write some methods that 
> operated on collections of CDBI records, and found subclassing 
> Class::DBI::Iterator as one solution. Below are some of the challenges I 
> ran into and areas in which Class::DBI::Iterator seems deficient.
> In the existing code I was working on there was a parent record type and 
> a child record type with a parent->has_many(child) type relationship set 
> up. To flesh out the example a bit further, the child records held 
> information on events, which could either be single occurrences or 
> recurring. There was a requirement to be able to expand the recurring 
> events in memory into a collection of single events, and combine those 
> with any single events already in the collection. There was also a need 
> to return the collection sorted by date, and the ability to access 
> either the earliest or latest event in the collection.
> To accomplish the above in my Class::DBI::Iterator subclass I overrode 
> the constructor to return the collection in sorted order, and then added 
> a last() method to complement the existing first() method. That 
> permitted code like:
> $parent->child->first; # earliest event
> $parent->child->last;  # latest event
> And then an expand() method was added, which builds the expanded list, 
> and returns a new iterator object, so:
> $parent->child->expand->first; # earliest event in expanded list
> One of the complications I ran into when implementing this was the 
> sorting. Objects in the iterator's list are not inflated, so sorting on 
> anything but the ID requires first inflating the objects.
> The iterator class doesn't have an explicit method to inflate the 
> objects. That behavior is embedded in the next() iteration method. So to 
> inflate all the objects in the collection you have to loop over the 
> collection by repeatedly calling next(), or equivalently use the slice() 
> method.
> That seemed like functionality that should have been in 
> Class::DBI::Iterator. In my subclass I added:
> # return entire (inflated) collection as an array ref
> sub all {
>     my $self = shift;
>     my @data = $self->slice(0, $self->count - 1);
>     $self->reset;
>     return \@data;
> }
> (If incorporated into the parent class, perhaps a better name can be 
> chosen, and it should include "return @data if wantarray" to be 
> consistent with the rest of the API.)
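
A minimal sketch of the all() method described above, with the wantarray branch added. Sketch::Iterator is a hypothetical stand-in written only so the example runs without a database; it is not Class::DBI::Iterator itself, and it assumes only what the post states: slice() is built on next(), and next() advances $self->{_place}.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the iterator: next() advances _place, and
# slice() is built on next(), so it moves _place as well.
package Sketch::Iterator;

sub new   { my ($class, @items) = @_; return bless { _data => [@items], _place => 0 }, $class }
sub count { return scalar @{ $_[0]{_data} } }
sub reset { $_[0]{_place} = 0; return }

sub next {
    my $self = shift;
    return if $self->{_place} >= $self->count;
    return $self->{_data}[ $self->{_place}++ ];
}

sub slice {
    my ($self, $start, $end) = @_;
    $self->{_place} = $start;
    my @out;
    while ($self->{_place} <= $end) {
        my $item = $self->next;
        last unless defined $item;
        push @out, $item;
    }
    return @out;
}

# all() as in the post, returning a list or an array ref by context.
sub all {
    my $self = shift;
    my @data = $self->slice(0, $self->count - 1);
    $self->reset;    # slice() left _place at the end of the list
    return wantarray ? @data : \@data;
}

package main;

my $it   = Sketch::Iterator->new(qw(a b c));
my @list = $it->all;    # list context: the items themselves
my $ref  = $it->all;    # scalar context: an array reference
```

Because all() resets the position on the way out, the second call starts from the beginning of the list rather than from where slice() stopped.
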
> You'll note my all() method includes a call to $self->reset. That's 
> because slice() internally uses next(), which increments the internal 
> position index, $self->{_place}. Either this is an oversight, or it was 
> assumed that the caller would be discarding the original iterator object 
> and not care about the state of the position index. slice() should 
> probably be modified to incorporate a call to reset().
> The next minor problem was that Class::DBI::Iterator provides no methods 
> for accessing its internal data structure holding the collection. 
> There's the undocumented data() method, but that returns a copy of the 
> list, rather than providing modification access. That can be a problem 
> when you want to sort, and you don't want to return a new iterator 
> object. Modifying $self->{_data} seems to be the only option.
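
One way to picture the in-place sort together with first() and last() is the sketch below. Everything in it is an assumption made for illustration: Sketch::SortedIterator is a hypothetical class, the records are plain hashes that are already inflated, and the 'date' key holds YYYY-MM-DD strings so that a string comparison orders them correctly.

```perl
use strict;
use warnings;

package Sketch::SortedIterator;

# Hypothetical stand-in: keeps its records in $self->{_data}, the same
# slot the post describes modifying directly.
sub new {
    my ($class, @records) = @_;
    my $self = bless { _data => [@records], _place => 0 }, $class;
    return $self->sort_by_date;
}

# Reorder the internal list in place instead of building a new iterator.
sub sort_by_date {
    my $self = shift;
    @{ $self->{_data} } = sort { $a->{date} cmp $b->{date} } @{ $self->{_data} };
    $self->{_place} = 0;
    return $self;
}

# Simplified accessors for the two ends of the sorted list.
sub first { return $_[0]{_data}[0] }
sub last  { return $_[0]{_data}[-1] }

package main;

my $events = Sketch::SortedIterator->new(
    { name => 'review',  date => '2006-09-01' },
    { name => 'kickoff', date => '2006-08-01' },
    { name => 'launch',  date => '2006-10-01' },
);
my $earliest = $events->first->{name};
my $latest   = $events->last->{name};
```

Sorting inside the object keeps callers holding the same iterator, at the cost of reaching into a private slot that the parent class could rename.
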
> From an efficiency perspective, I'm not too keen on the idea of doing 
> sorts in Perl on data that was just pulled from the database. For this 
> particular application, the data sets are typically tiny, so it is 
> inconsequential, but I think the preferred approach would be to modify 
> the has_many() query to return the records in the desired order.
> What I'm less sure about is how CDBI handles modification to 
> collections. If I access $parent->child and get back a sorted list from 
> the database, then call $parent->add_to_child() to add another record to 
> the list, then call $parent->child again, does the newly added record 
> get returned in the collection? Is it in the correct sort position? Does 
> CDBI hit the database every time a has_many() method is called, or does 
> it cache collections also? (And what if I saved a reference to the first 
> iterator object, does it get updated? I doubt it.)
> The next problem I ran into was probably a result of the perhaps unusual 
> requirements for the expand() method. Because the expanded recurring 
> events were never going to be stored in the database and had 
> no direct relationship to a record in the database, they were 
> constructed as Class::Accessor objects, instead of CDBI objects. This 
> resulted in the iterator holding a mix of uninflated CDBI objects and 
> inflated Class::Accessor objects. This became a problem when it was time 
> to construct a new iterator object, which in turn wanted to sort the 
> list, which required inflating all of the objects.
> The next() method in Class::DBI::Iterator:
> sub next {
>     my $self = shift;
>     my $use  = $self->{_data}->[ $self->{_place}++ ] or return;
>     my @obj  = ($self->class->construct($use));
>         ...
> assumes the object is a CDBI object (as it calls construct()) and that 
> all objects in its collection are of exactly the same class (as passed 
> to the iterator's constructor and returned by $self->class). While it 
> may be rather unusual for that not to be the case, it also seems like it 
> will try and inflate an already inflated object (which might happen if 
> the calling code makes use of reset() and iterates over a collection 
> more than once).
> In my subclass I overrode next() as follows to skip over already 
> inflated objects:
> sub next {
>     my $self = shift;
>     my $use  = $self->{_data}->[ $self->{_place}++ ] or return;
>     # don't inflate an already inflated object
>     return $use if ref($use) ne 'HASH';
>     $self->{_place}--;
>     return $self->SUPER::next(@_);
> }
> This should probably be in the original code.
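
A variation on the override above, sketched under a few assumptions: it peeks at the current slot instead of incrementing and then decrementing _place, and it uses Scalar::Util's blessed() in place of the ref() comparison to decide whether an item is already an object. Sketch::Row, Sketch::BaseIterator, Sketch::MixedIterator and Sketch::Event are hypothetical names standing in for the CDBI classes so that the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical row class: construct() turns a raw hash into an object.
package Sketch::Row;
sub construct { my ($class, $raw) = @_; return bless {%$raw}, $class }

# Hypothetical parent iterator: next() always inflates what it finds.
package Sketch::BaseIterator;
sub new { my ($class, @items) = @_; return bless { _data => [@items], _place => 0 }, $class }
sub next {
    my $self = shift;
    my $use  = $self->{_data}[ $self->{_place}++ ] or return;
    return Sketch::Row->construct($use);
}

package Sketch::MixedIterator;
use Scalar::Util qw(blessed);
our @ISA = ('Sketch::BaseIterator');

sub next {
    my $self = shift;
    my $use  = $self->{_data}[ $self->{_place} ] or return;
    if (blessed $use) {    # already an object: hand it back untouched
        $self->{_place}++;
        return $use;
    }
    return $self->SUPER::next(@_);    # raw hash: let the parent inflate it
}

package main;

my $ready  = bless { name => 'expanded' }, 'Sketch::Event';
my $it     = Sketch::MixedIterator->new({ name => 'raw' }, $ready);
my $first  = $it->next;    # inflated by the parent class
my $second = $it->next;    # returned as-is
my $third  = $it->next;    # past the end: undef
```

For blessed objects the two tests agree; blessed() only differs in that an unblessed non-hash reference would still be passed to the parent rather than returned directly.
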

By the time you're finished writing all of this I suspect the code's going to 
look fairly similar to parts of DBIx::Class::ResultSet - we preferred 
resultsets as first-class concepts whereas CDBI (and Rose::DB::Object) are 
firmly wedded to the class-is-table, object-is-row philosophy.

It's probably worth having a close look at the ResultSet API and making a 
decision whether to port the parts of the API you need to your custom 
Class::DBI::Iterator class (or as a patch to CDBI core), or to switch your 
code across to DBIx::Class.

If you choose the first route I'll do my best to answer questions on here 
about the best way to copy stuff across; I've got a pretty good knowledge of 
both codebases. If you choose the second then we'll be happy to help on the 
dbix-class list but I'll get shouted at again if I mention it much more on here :)

      Matt S Trout       Offering custom development, consultancy and support
   Technical Director    contracts for Catalyst, DBIx::Class and BAST. Contact
Shadowcat Systems Ltd.  mst (at) shadowcatsystems.co.uk for more information

+ Help us build a better perl ORM: http://dbix-class.shadowcatsystems.co.uk/ +
