[CDBI] Class::DBI::Iterator
Matt S Trout
dbix-class at trout.me.uk
Sun Aug 27 22:18:05 BST 2006
Tom Metro wrote:
> As mentioned in a prior post, I had a need to write some methods that
> operated on collections of CDBI records, and found subclassing
> Class::DBI::Iterator as one solution. Below are some of the challenges I
> ran into and areas in which Class::DBI::Iterator seems deficient.
>
> In the existing code I was working on there was a parent record type and
> a child record type with a parent->has_many(child) type relationship set
> up. To flesh out the example a bit further, the child records held
> information on events, which could either be single occurrences or
> recurring. There was a requirement to be able to expand the recurring
> events in memory into a collection of single events, and combine those
> with any single events already in the collection. There was also a need
> to return the collection sorted by date, and the ability to access
> either the earliest or latest event in the collection.
>
> To accomplish the above in my Class::DBI::Iterator subclass I overrode
> the constructor to return the collection in sorted order, and then added
> a last() method to complement the existing first() method. That
> permitted code like:
>
> $parent->child->first; # earliest event
> $parent->child->last; # latest event
>
> And then an expand() method was added, which builds the expanded list,
> and returns a new iterator object, so:
>
> $parent->child->expand->first; # earliest event in expanded list
>
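The sorted-constructor and last() idea described above can be sketched without Class::DBI at all. What follows is only an illustration under assumptions: My::EventIterator is a hypothetical stand-in rather than a subclass of Class::DBI::Iterator, the events are plain hash refs, and the 'date' and 'name' fields are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an iterator subclass; not Class::DBI::Iterator.
package My::EventIterator;

sub new {
    my ($class, @events) = @_;
    # Sort once at construction by an assumed 'date' field. ISO 8601
    # date strings order correctly under a plain string comparison.
    my @sorted = sort { $a->{date} cmp $b->{date} } @events;
    return bless { _data => \@sorted, _place => 0 }, $class;
}

sub count { scalar @{ $_[0]{_data} } }
sub first { $_[0]{_data}[0] }     # earliest event, undef if empty
sub last  { $_[0]{_data}[-1] }    # latest event, undef if empty

package main;

my $events = My::EventIterator->new(
    { name => 'b', date => '2006-09-01' },
    { name => 'a', date => '2006-08-27' },
    { name => 'c', date => '2006-10-15' },
);
print $events->first->{name}, " ", $events->last->{name}, "\n";
```

Sorting in the constructor keeps first() and last() trivial, at the cost of needing every element's sort key up front, which is the inflation problem discussed next.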
>
> One of the complications I ran into when implementing this was the
> sorting. Objects in the iterator's list are not inflated, so sorting on
> anything but the ID requires first inflating the objects.
>
> The iterator class doesn't have an explicit method to inflate the
> objects. That behavior is embedded in the next() iteration method. So to
> inflate all the objects in the collection you have to loop over the
> collection by repeatedly calling next(), or equivalently use the slice()
> method.
>
> That seemed like functionality that should have been in
> Class::DBI::Iterator. In my subclass I added:
>
> # return entire (inflated) collection as an array ref
> sub all {
>     my $self = shift;
>
>     my @data = $self->slice(0, $self->count - 1);
>     $self->reset;
>     return \@data;
> }
>
> (If incorporated into the parent class, perhaps a better name can be
> chosen, and it should include "return @data if wantarray" to be
> consistent with the rest of the API.)
>
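A context-aware variant of all(), as suggested in the parenthetical above, might look roughly like this. It is a sketch under assumptions: My::Iterator is a hypothetical stand-in that only mimics the _data/_place layout mentioned in this thread, its next() does no inflation, and saving and restoring the position (instead of resetting it) is a design choice of the sketch, not something the original all() does.

```perl
use strict;
use warnings;

# Hypothetical stand-in; not the real Class::DBI::Iterator.
package My::Iterator;

sub new   { my ($class, @rows) = @_; bless { _data => [@rows], _place => 0 }, $class }
sub count { scalar @{ $_[0]{_data} } }
sub reset { $_[0]{_place} = 0 }

# Stand-in for next(); the real method would also inflate the row.
sub next {
    my $self = shift;
    return if $self->{_place} >= $self->count;
    return $self->{_data}[ $self->{_place}++ ];
}

# Whole collection: a list in list context, an array ref otherwise.
# The caller's position is saved and restored, not reset to zero.
sub all {
    my $self  = shift;
    my $place = $self->{_place};
    $self->reset;
    my @data;
    while (defined(my $item = $self->next)) { push @data, $item }
    $self->{_place} = $place;
    return wantarray ? @data : \@data;
}

package main;

my $iter = My::Iterator->new(qw(a b c));
$iter->next;                 # advance past 'a'
my @list = $iter->all;       # list context: three items
my $ref  = $iter->all;       # scalar context: array ref
print scalar(@list), " ", scalar(@$ref), " ", $iter->next, "\n";
```

Restoring the saved position means a caller who is halfway through a loop can call all() without losing their place; a plain reset would silently restart the loop.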
> You'll note my all() method includes a call to $self->reset. That's
> because slice() internally uses next(), which increments the internal
> position index, $self->{_place}. Either this is an oversight, or it was
> assumed that the caller would be discarding the original iterator object
> and not care about the state of the position index. slice() should
> probably be modified to incorporate a call to reset().
>
> The next minor problem was that Class::DBI::Iterator provides no methods
> for accessing its internal data structure holding the collection.
> There's the undocumented data() method, but that returns a copy of the
> list, rather than providing modification access. That can be a problem
> when you want to sort, and you don't want to return a new iterator
> object. Modifying $self->{_data} seems to be the only option.
>
>
> From an efficiency perspective, I'm not too keen on the idea of doing
> sorts in Perl on data that was just pulled from the database. For this
> particular application, the data sets are typically tiny, so it is
> inconsequential, but I think the preferred approach would be to modify
> the has_many() query to return the records in the desired order.
>
> What I'm less sure about is how CDBI handles modification to
> collections. If I access $parent->child and get back a sorted list from
> the database, then call $parent->add_to_child() to add another record to
> the list, then call $parent->child again, does the newly added record
> get returned in the collection? Is it in the correct sort position? Does
> CDBI hit the database every time a has_many() method is called, or does
> it cache collections also? (And what if I saved a reference to the first
> iterator object, does it get updated? I doubt it.)
>
>
> The next problem I ran into was probably a result of the perhaps unusual
> requirements for the expand() method. Because the expanded recurring
> events were never going to be stored in the database and had
> no direct relationship to a record in the database, they were
> constructed as Class::Accessor objects, instead of CDBI objects. This
> resulted in the iterator holding a mix of uninflated CDBI objects and
> inflated Class::Accessor objects. This became a problem when it was time
> to construct a new iterator object, which in turn wanted to sort the
> list, which required inflating all of the objects.
>
> The next() method in Class::DBI::Iterator:
>
> sub next {
>     my $self = shift;
>     my $use = $self->{_data}->[ $self->{_place}++ ] or return;
>     my @obj = ($self->class->construct($use));
>     ...
>
> assumes the object is a CDBI object (as it calls construct()) and that
> all objects in its collection are of exactly the same class (as passed
> to the iterator's constructor and returned by $self->class). While it
> may be rather unusual for that not to be the case, it also seems like it
> will try to inflate an already inflated object (which might happen if
> the calling code makes use of reset() and iterates over a collection
> more than once).
>
> In my subclass I overrode next() as follows to skip over already
> inflated objects:
>
> sub next {
>     my $self = shift;
>
>     my $use = $self->{_data}->[ $self->{_place}++ ] or return;
>
>     # don't inflate an already inflated object
>     return $use if ref($use) ne 'HASH';
>
>     $self->{_place}--;
>     return $self->SUPER::next(@_);
> }
>
> This should probably be in the original code.
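For a collection that mixes raw row data with ready-made objects, one alternative to comparing ref() against 'HASH' is to test with Scalar::Util::blessed. The sketch below only illustrates that check under assumptions: My::Row, My::Synthetic and My::MixedIterator are hypothetical stand-ins (for a CDBI class, a Class::Accessor object and the iterator respectively), and construct() here just blesses a copy of the hash.

```perl
use strict;
use warnings;
use Scalar::Util ();

package My::Row;            # hypothetical stand-in for a CDBI class
sub construct { my ($class, $data) = @_; bless { %$data }, $class }

package My::Synthetic;      # hypothetical stand-in for a Class::Accessor object
sub new { my ($class, %args) = @_; bless { %args }, $class }

package My::MixedIterator;  # hypothetical; mirrors only the iterator's shape

sub new {
    my ($class, $row_class, @data) = @_;
    return bless { _class => $row_class, _data => [@data], _place => 0 }, $class;
}

sub next {
    my $self = shift;
    my $use = $self->{_data}[ $self->{_place}++ ] or return;
    # Anything already blessed is treated as inflated and passed through.
    return $use if Scalar::Util::blessed($use);
    # Plain hash refs are assumed to be uninflated row data.
    return $self->{_class}->construct($use);
}

package main;

my $mixed = My::MixedIterator->new(
    'My::Row',
    { id => 1 },                     # uninflated row data
    My::Synthetic->new(id => 2),     # already an object
);
my @classes;
while (my $obj = $mixed->next) { push @classes, ref $obj }
print "@classes\n";
```

The blessed() test passes through any object regardless of its underlying reference type, so it would still hold if the uninflated representation ever stopped being a plain hash ref.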
By the time you're finished writing all of this I suspect the code's going to
look fairly similar to parts of DBIx::Class::ResultSet - we preferred
resultsets as first-class concepts whereas CDBI (and Rose::DB::Object) are
firmly wedded to the class-is-table, object-is-row philosophy.
It's probably worth having a close look at the ResultSet API and making a
decision whether to port the parts of the API you need to your custom
Class::DBI::Iterator class (or as a patch to CDBI core), or to switch your
code across to DBIx::Class.
If you choose the first route I'll do my best to answer questions on here
about the best way to copy stuff across; I've got a pretty good knowledge of
both codebases. If you choose the second then we'll be happy to help on the
dbix-class list but I'll get shouted at again if I mention it much more on here :)
--
Matt S Trout, Technical Director, Shadowcat Systems Ltd.
Offering custom development, consultancy and support contracts for
Catalyst, DBIx::Class and BAST. Contact mst (at) shadowcatsystems.co.uk
for more information
+ Help us build a better perl ORM: http://dbix-class.shadowcatsystems.co.uk/ +