[CDBI] Class::DBI::Iterator

Tom Metro tmetro+cdbi at gmail.com
Sun Aug 27 21:00:32 BST 2006


As mentioned in a prior post, I had a need to write some methods that 
operated on collections of CDBI records, and found subclassing 
Class::DBI::Iterator as one solution. Below are some of the challenges I 
ran into and areas in which Class::DBI::Iterator seems deficient.

In the existing code I was working on there was a parent record type and 
a child record type with a parent->has_many(child) type relationship set 
up. To flesh out the example a bit further, the child records held 
information on events, which could either be single occurrences or 
recurring. There was a requirement to be able to expand the recurring 
events in memory into a collection of single events, and combine those 
with any single events already in the collection. There was also a need 
to return the collection sorted by date, and the ability to access 
either the earliest or latest event in the collection.

To accomplish the above in my Class::DBI::Iterator subclass I overrode 
the constructor to return the collection in sorted order, and then added 
a last() method to complement the existing first() method. That 
permitted code like:

$parent->child->first; # earliest event
$parent->child->last;  # latest event

And then an expand() method was added, which builds the expanded list, 
and returns a new iterator object, so:

$parent->child->expand->first; # earliest event in expanded list
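
The sorted constructor and last() described above can be sketched as follows. This is only an illustration: Sketch::Iterator and Sketch::EventIterator are hypothetical stand-ins, not Class::DBI::Iterator, and the records are plain hashes with an assumed ISO-formatted date field, so no inflation is needed.

```perl
use strict;
use warnings;

# Hypothetical stand-in base class (NOT the real Class::DBI::Iterator).
package Sketch::Iterator;

sub new {
    my ($class, $data) = @_;
    return bless { _data => [@$data], _place => 0 }, $class;
}

sub count { scalar @{ $_[0]{_data} } }
sub reset { $_[0]{_place} = 0 }

sub next {
    my $self = shift;
    return $self->{_data}[ $self->{_place}++ ];
}

sub first {
    my $self = shift;
    $self->reset;
    return $self->next;
}

# Subclass: the constructor sorts by date, and last() complements first().
package Sketch::EventIterator;
our @ISA = ('Sketch::Iterator');

sub new {
    my ($class, $data) = @_;
    # Assumption: dates are ISO 8601 strings, which sort correctly with cmp.
    my @sorted = sort { $a->{date} cmp $b->{date} } @$data;
    return $class->SUPER::new(\@sorted);
}

sub last {
    my $self = shift;
    return unless $self->count;
    return $self->{_data}[-1];
}

package main;

my $events = Sketch::EventIterator->new([
    { name => 'review',  date => '2006-09-15' },
    { name => 'kickoff', date => '2006-08-01' },
    { name => 'launch',  date => '2006-12-01' },
]);

print $events->first->{name}, "\n";   # earliest event: kickoff
print $events->last->{name},  "\n";   # latest event: launch
```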


One of the complications I ran into when implementing this was the 
sorting. Objects in the iterator's list are not inflated, so sorting on 
anything but the ID requires first inflating the objects.

The iterator class doesn't have an explicit method to inflate the 
objects. That behavior is embedded in the next() iteration method. So to 
inflate all the objects in the collection you have to loop over the 
collection by repeatedly calling next(), or equivalently use the slice() 
method.

That seemed like functionality that should have been in 
Class::DBI::Iterator. In my subclass I added:

# return entire (inflated) collection as an array ref
sub all {
     my $self = shift;

     my @data = $self->slice(0, $self->count - 1);
     $self->reset;
     return \@data;
}

(If incorporated into the parent class, perhaps a better name can be 
chosen, and it should include "return @data if wantarray" to be 
consistent with the rest of the API.)
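
A context-sensitive variant of all() might look like the sketch below. Sketch::Iterator is a hypothetical stand-in whose slice() is written to behave as described in this post (it is built on next()); it is not the library's code. The guard for an empty collection is an addition of this sketch.

```perl
use strict;
use warnings;

# Hypothetical stand-in (NOT the real Class::DBI::Iterator).
package Sketch::Iterator;

sub new {
    my ($class, $data) = @_;
    return bless { _data => [@$data], _place => 0 }, $class;
}

sub count { scalar @{ $_[0]{_data} } }
sub reset { $_[0]{_place} = 0 }

sub next {
    my $self = shift;
    return $self->{_data}[ $self->{_place}++ ];
}

# Built on next(), so it moves the position index as a side effect.
sub slice {
    my ($self, $start, $end) = @_;
    $self->{_place} = $start;
    my @items;
    while ($self->{_place} <= $end) {
        my $item = $self->next;
        last unless defined $item;
        push @items, $item;
    }
    return @items;
}

# Return the whole collection: a list in list context, else an array ref.
sub all {
    my $self = shift;
    my @data = $self->count ? $self->slice(0, $self->count - 1) : ();
    $self->reset;
    return wantarray ? @data : \@data;
}

package main;

my $iter = Sketch::Iterator->new([qw(a b c)]);
my @list = $iter->all;   # list context
my $ref  = $iter->all;   # scalar context
print scalar(@list), " items in list, ", scalar(@$ref), " via reference\n";
```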

You'll note my all() method includes a call to $self->reset. That's 
because slice() internally uses next(), which increments the internal 
position index, $self->{_place}. Either this is an oversight, or it was 
assumed that the caller would be discarding the original iterator object 
and not care about the state of the position index. slice() should 
probably be modified to incorporate a call to reset().
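
The side effect, and one way a subclass could hide it, can be sketched as follows. Both packages are hypothetical stand-ins that mimic the behavior described above, not the library's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in (NOT the real Class::DBI::Iterator).
package Sketch::Iterator;

sub new {
    my ($class, $data) = @_;
    return bless { _data => [@$data], _place => 0 }, $class;
}

sub reset { $_[0]{_place} = 0 }

sub next {
    my $self = shift;
    return $self->{_data}[ $self->{_place}++ ];
}

# Built on next(), so the position index is left just past the slice.
sub slice {
    my ($self, $start, $end) = @_;
    $self->{_place} = $start;
    my @items;
    while ($self->{_place} <= $end) {
        my $item = $self->next;
        last unless defined $item;
        push @items, $item;
    }
    return @items;
}

# Subclass whose slice() rewinds the iterator before returning.
package Sketch::ResettingIterator;
our @ISA = ('Sketch::Iterator');

sub slice {
    my $self  = shift;
    my @items = $self->SUPER::slice(@_);
    $self->reset;
    return @items;
}

package main;

my $plain = Sketch::Iterator->new([qw(a b c d)]);
$plain->slice(0, 1);
my $after_plain = $plain->next;   # 'c': slice() advanced the position

my $safe = Sketch::ResettingIterator->new([qw(a b c d)]);
$safe->slice(0, 1);
my $after_safe = $safe->next;     # 'a': position was reset

print "plain: $after_plain, resetting: $after_safe\n";
```

Saving the position before the slice and restoring it afterwards would be an alternative to reset() for callers that are in the middle of an iteration.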

The next minor problem was that Class::DBI::Iterator provides no methods 
for accessing its internal data structure holding the collection. 
There's the undocumented data() method, but that returns a copy of the 
list, rather than providing modification access. That can be a problem 
when you want to sort, and you don't want to return a new iterator 
object. Modifying $self->{_data} seems to be the only option.
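
The difference between sorting a copy and sorting the internal list can be sketched as follows. Sketch::Iterator is a hypothetical stand-in, and sort_by() is an invented name for illustration; neither is part of Class::DBI::Iterator.

```perl
use strict;
use warnings;

# Hypothetical stand-in (NOT the real Class::DBI::Iterator).
package Sketch::Iterator;

sub new {
    my ($class, $data) = @_;
    return bless { _data => [@$data], _place => 0 }, $class;
}

sub reset { $_[0]{_place} = 0 }

sub next {
    my $self = shift;
    return $self->{_data}[ $self->{_place}++ ];
}

# Returns a copy of the list, so changes to it never reach the iterator.
sub data { @{ $_[0]{_data} } }

# Hypothetical helper: sort the internal list in place and rewind.
sub sort_by {
    my ($self, $cmp) = @_;
    @{ $self->{_data} } = sort { $cmp->($a, $b) } @{ $self->{_data} };
    $self->reset;
    return $self;
}

package main;

my $iter = Sketch::Iterator->new([
    { date => '2006-09-15' },
    { date => '2006-08-01' },
]);

# Sorting the copy leaves the iterator's own order untouched.
my @items = $iter->data;
my @copy  = sort { $a->{date} cmp $b->{date} } @items;
my $first_before = $iter->next->{date};

# Sorting through the object changes what the same iterator returns.
$iter->sort_by(sub { $_[0]{date} cmp $_[1]{date} });
my $first_after = $iter->next->{date};

print "before: $first_before, after: $first_after\n";
```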


From an efficiency perspective, I'm not too keen on the idea of doing 
sorts in Perl on data that was just pulled from the database. For this 
particular application, the data sets are typically tiny, so it is 
inconsequential, but I think the preferred approach would be to modify 
the has_many() query to return the records in the desired order.

What I'm less sure about is how CDBI handles modification to 
collections. If I access $parent->child and get back a sorted list from 
the database, then call $parent->add_to_child() to add another record to 
the list, then call $parent->child again, does the newly added record 
get returned in the collection? Is it in the correct sort position? Does 
CDBI hit the database every time a has_many() method is called, or does 
it cache collections also? (And what if I saved a reference to the first 
iterator object, does it get updated? I doubt it.)


The next problem I ran into was probably a result of the perhaps unusual 
requirements for the expand() method. Because the expanded recurring 
events were never going to be stored in the database and had no direct 
relationship to a record in the database, they were 
constructed as Class::Accessor objects, instead of CDBI objects. This 
resulted in the iterator holding a mix of uninflated CDBI objects and 
inflated Class::Accessor objects. This became a problem when it was time 
to construct a new iterator object, which in turn wanted to sort the 
list, which required inflating all of the objects.

The next() method in Class::DBI::Iterator:

sub next {
     my $self = shift;
     my $use  = $self->{_data}->[ $self->{_place}++ ] or return;
     my @obj  = ($self->class->construct($use));
     ...

assumes the object is a CDBI object (as it calls construct()) and that 
all objects in its collection are of exactly the same class (as passed 
to the iterator's constructor and returned by $self->class). While it 
may be rather unusual for that not to be the case, it also seems like it 
will try to inflate an already inflated object (which might happen if 
the calling code makes use of reset() and iterates over a collection 
more than once).

In my subclass I overrode next() as follows to skip over already 
inflated objects:

sub next {
     my $self = shift;

     my $use  = $self->{_data}->[ $self->{_place}++ ] or return;

     # don't inflate an already inflated object
     return $use if ref($use) ne 'HASH';

     $self->{_place}--;
     return $self->SUPER::next(@_);
}

This should probably be in the original code.


Lastly, the Class::DBI::Iterator man page seems to be rather bare bones. 
None of the methods are documented, aside from the example usage 
shown. To effectively subclass it, it really needs to be fleshed out 
more. I'd be happy to contribute a patch to do that, and incorporate the 
above suggestions after I see what the feedback is on this thread.

  -Tom

-- 
Tom Metro
Venture Logic, Newton, MA, USA
"Enterprise solutions through open source."
Professional Profile: http://tmetro.venturelogic.com/


