вторник, декабря 11, 2007

ActiveRecord's database shortcut methods

I see more and more people nowadays complain about other people using destroy_all to delete a bunch of records without understanding the performance decrease compared to using delete_all.

Well, I'll tell you what: in general they do the right thing. My opinion is that having such shortcut methods like delete_all (or update_attribute that issues an update command without triggering validation/callbacks) are evil. While ActiveRecord tries to provide a foundation for building your domain model (smart model, not just containers for data) those methods allow to mess everything up. E.g. you have a model, and when you decide to provide a means to delete all objects, you use delete_all because of better performance than destroy_all. Later on you decide to add some on_destroy callbacks, and then you find out that you need to change every call to delete_all to destroy_all.

Calling delete_all just breaks an invariant for your objects.

What it would be nice is to optimize destroy_all to just call delete_all if nothing else is required. The main reason to call destroy_all is the on_destroy callback. So, ActiveRecord could check if there is any callbacks and if there is no callbacks, then just call delete_all operation.

The same could be done for updates (and it was discussed for a long time): update only those fields that have changed. Then, instead of calling "special" method to update just one field, you will just update the required field and call regular #save. AR will detect all changed fields and issue an update command for just those fields.

How this can be implemented ? The ideal solution is to keep a set of "unmodified" data and compare object's data to that "unmodified" set on save. But on big objects it could require two times more memory. Instead, it could be easier to track what fields were assigned values to and set "changed" flag for that field. If the flag wasn't yet set and the assigned value differs from the one that it is now, then set the flag. If the flag was already set, then don't update it. Of course, it will sometimes update the field to the same value if you first set it to some other value and then set the value that was before. But I believe that such cases are rare and it is not a problem if we update it with the same value.
Storing a boolean flag per database column is not an issue also.