Why Your Active Record Implementation Is an Example of Poor Software Design

Sunday, May 13th, 2012 | Beginner tips, Version 4.2 | by Romans

Through Agile Toolkit I’m sharing with the world my vision for a better Active Record implementation. I believe that the widely popular Active Record implementations are examples of bad software design. Here is why.

What is Active Record?

The Active Record pattern has become hugely popular thanks to its adoption in Ruby on Rails and other frameworks (see the Active Record article on Wikipedia). Typically, a class is generated from the database table schema, and every field of the table is reflected as a property of the model. The class has several constructors that load a model directly by ID or search for records. Only one record can be loaded at a time.

Why Is Classic Active Record Broken?

No Testability

When the object is created, its constructors do not accept a reference to the database engine. Instead, they pull it from some global namespace. That makes it very difficult to use Active Record with multiple database configurations or in tests.

Static methods / Constructors

A static method of a class that creates a new instance of that class is what I call a “constructor” here. Active Record classes come with multiple such constructors, each loading a record in a different way. In some implementations the number of constructors depends on the table structure: loadByCode(), for example, would only exist if the table has a “code” field.

Abusive properties

A table can potentially contain any field, so the model cannot have any “reserved” properties. That makes it virtually impossible to build any decent logic into the abstract model class.

Code generation = duplication

I am always amazed how the same people who preach about code reuse, and about how awful it is to copy-paste code, then use code generators that effectively build thousands of lines of code for them. It might look good in your version control system, but it is a very bad development practice.

One table = one class

While frameworks typically allow you to specify a “parent class” in the YAML definition, in practice it is rarely used. We know that one table may sometimes contain different entity types, but Active Record implementations do not help in separating them into different classes.

Operations with multiple records

By definition, an Active Record object can hold only one record. If you need to iterate through multiple records, you end up creating and destroying an object per record, which introduces performance overhead.

Lack of conditioning

Active Record typically allows you to load ANY record present in the table. In many implementations it is virtually impossible to restrict loading to certain types of data. Soft deletion is a good example: implementing it is often a major effort.

Recipe for an Improved Active Record

Dependency Injection

A model may require many things. The database driver object is one, but it might also need other information. Passing multiple objects to the constructor is troublesome and inconsistent. My suggestion is to pass just one object that contains links to all the other necessary resources.

This object can be passed through the factory class. I have solved this problem by having each object carry a reference to such an object; whenever a new object is created, it receives that reference too. This is the “api” class, which can be used to reach the database connection: $this->api->db. In practice there may be multiple API classes, which makes it possible to inject dependencies into any object.
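A minimal sketch of this idea follows. The class names (`Api`, `AbstractObject`, `FakeDb`) and the `add()` factory method are assumptions made for illustration; only the `$this->api->db` access pattern comes from the text above, and this is not presented as Agile Toolkit's actual code.

```php
<?php
// Hypothetical container holding shared resources (illustrative only).
class Api
{
    public $db;

    public function __construct($db)
    {
        $this->db = $db;
    }
}

// Base class: every object carries a reference to the same Api instance.
abstract class AbstractObject
{
    public $api;

    // Factory method: a newly created object receives the creator's reference.
    public function add(string $class): AbstractObject
    {
        $object = new $class();
        $object->api = $this->api;
        return $object;
    }
}

class Model extends AbstractObject
{
    public function describeConnection(): string
    {
        // No global lookup: the connection comes from the injected object.
        return $this->api->db->name;
    }
}

// Stand-in for a database driver; a test can inject a fake like this one.
class FakeDb
{
    public $name;

    public function __construct(string $name)
    {
        $this->name = $name;
    }
}

$root = new Model();
$root->api = new Api(new FakeDb('test-db'));
$child = $root->add('Model');
```

Because the connection is reached only through the injected object, a test can hand the model a fake driver, and two models can point at two different databases in the same request.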

Avoiding Constructors

If a model object can be in a state where it is not associated with a database record, then the same object can be reused multiple times. If the object is created first and a load() method is then used to load a record from the database, load() can be called again later to load further records without creating a new model instance every time.

That also means other methods can be used for loading data. As a developer, you may define new loading methods in your model classes or override the default ones.
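The sketch below shows one way such a reusable model could look. It assumes an in-memory array as a stand-in for the database table, and the `loaded()` and `get()` helpers are hypothetical names chosen for the example; `load()` and `loadByCode()` are the names used in this article.

```php
<?php
// Illustrative model that can exist without a loaded record.
class Model
{
    public $id = null;
    protected $data = [];
    protected $rows; // stand-in for a database table: id => row

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function loaded(): bool
    {
        return $this->id !== null;
    }

    // Replace the current record with another one; the object is reused.
    public function load($id): self
    {
        if (!isset($this->rows[$id])) {
            throw new RuntimeException("Record $id not found");
        }
        $this->id = $id;
        $this->data = $this->rows[$id];
        return $this;
    }

    public function get(string $field)
    {
        return $this->data[$field] ?? null;
    }
}

class Product extends Model
{
    // An extra loader written by hand in the model class,
    // instead of a generated static constructor.
    public function loadByCode(string $code): self
    {
        foreach ($this->rows as $id => $row) {
            if ($row['code'] === $code) {
                return $this->load($id);
            }
        }
        throw new RuntimeException("No record with code $code");
    }
}

$product = new Product([
    1 => ['code' => 'A1', 'name' => 'Apple'],
    2 => ['code' => 'B2', 'name' => 'Banana'],
]);

$wasLoaded = $product->loaded();                   // false: no record yet
$first = $product->load(1)->get('name');           // same object, record 1
$second = $product->loadByCode('B2')->get('name'); // same object, record 2
```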

Avoiding using properties for fields

In a database, one field is usually the primary key. In most cases it is called “id”, but not always. If we want to introduce a property in our model that always refers to this primary key, in a classic implementation of Active Record it may clash with a non-primary-key field of the same name.

Fortunately, PHP objects can also act as arrays. With $model['name'] = 'John', requests can be easily routed to and saved in an internal array without polluting the model’s property namespace. $model->id can then always contain the value of the record’s primary key. Other operations are possible through the set() and get() methods, for example $model->set($array).

In this implementation, other model properties can be used internally by the model’s business logic without affecting any fields.
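As a rough illustration, PHP's built-in `ArrayAccess` interface is enough to route array-style access into an internal array. The class below is a simplified sketch, not the Agile Toolkit implementation; the internal `$data` property name is an assumption.

```php
<?php
class Model implements ArrayAccess
{
    public $id = null;      // always the primary key, never a table field
    protected $data = [];   // field values live here, not in properties

    // Accepts either set('name', 'John') or set(['name' => 'John', ...]).
    public function set($field, $value = null): self
    {
        if (is_array($field)) {
            foreach ($field as $key => $val) {
                $this->data[$key] = $val;
            }
        } else {
            $this->data[$field] = $value;
        }
        return $this;
    }

    public function get(string $field)
    {
        return $this->data[$field] ?? null;
    }

    // ArrayAccess methods simply delegate to the internal array.
    public function offsetExists($offset): bool
    {
        return isset($this->data[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->get($offset);
    }

    public function offsetSet($offset, $value): void
    {
        $this->set($offset, $value);
    }

    public function offsetUnset($offset): void
    {
        unset($this->data[$offset]);
    }
}

$model = new Model();
$model->id = 7;
$model['name'] = 'John';
$model->set(['email' => 'john@example.com', 'api' => 'a field called api']);
```

A field may now be called anything, even the name of an internal property, because fields and properties no longer share a namespace.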

Avoid Code Generation

Now that we have freed up the properties of the Model class, we no longer need code generation. Instead, we can configure model fields in PHP and store more meta-information for each field.

Not only that, but adding fields dynamically is now possible through the addField method. Native PHP calls are faster than using PHP to parse another file format. Native PHP is also more powerful and lets you define fields much more flexibly.
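Here is a small sketch of fields defined in plain PHP. Only the `addField` name comes from the text above; the `Field` class, its chainable `type()`, `caption()` and `mandatory()` setters, and the `init()` hook are hypothetical and chosen just to show how meta-information could be attached.

```php
<?php
// Hypothetical field definition holding meta-information.
class Field
{
    public $name;
    public $type = 'string';
    public $caption;
    public $mandatory = false;

    public function __construct(string $name)
    {
        $this->name = $name;
        // Default caption derived from the field name.
        $this->caption = ucfirst(str_replace('_', ' ', $name));
    }

    public function type(string $type): self
    {
        $this->type = $type;
        return $this;
    }

    public function caption(string $caption): self
    {
        $this->caption = $caption;
        return $this;
    }

    public function mandatory(bool $flag = true): self
    {
        $this->mandatory = $flag;
        return $this;
    }
}

class Model
{
    protected $fields = [];

    public function __construct()
    {
        $this->init();
    }

    // Subclasses declare their fields here instead of in generated code.
    protected function init(): void
    {
    }

    public function addField(string $name): Field
    {
        return $this->fields[$name] = new Field($name);
    }

    public function getField(string $name): Field
    {
        return $this->fields[$name];
    }

    public function getFieldNames(): array
    {
        return array_keys($this->fields);
    }
}

class User extends Model
{
    protected function init(): void
    {
        $this->addField('name')->caption('Full name')->mandatory();
        $this->addField('email');
        $this->addField('is_admin')->type('boolean');
    }
}

$user = new User();
// Fields can still be added at run time, with no generator involved.
$user->addField('last_login')->type('datetime');
```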

One table = Many Classes

Now we can inherit from our models to create additional models and change behavior in the newly created classes. There is no longer a one-to-one relation between tables and classes. You can also remove field definitions in your subclasses if a field is not used for that business case.

Now you can create models for “User”, “Admin” and “Moderator” based on the same code base and the same database table. Moderator, however, would have more methods/actions and might also be able to address more fields in the table.

The ability to extend models is a powerful strategy. It allows you to leave your existing code intact and add a new model for each new use case. This greatly reduces the amount of testing you need to perform after a structural change to your database.
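The following sketch shows several classes sharing one table. The `removeField()` method, the `init()` hook, the `PublicProfile` class and the field names are assumptions made for the example.

```php
<?php
class Model
{
    public $table;
    protected $fields = [];

    public function __construct()
    {
        $this->init();
    }

    protected function init(): void
    {
    }

    public function addField(string $name): void
    {
        $this->fields[$name] = true;
    }

    // Hypothetical helper for dropping a field a subclass does not need.
    public function removeField(string $name): void
    {
        unset($this->fields[$name]);
    }

    public function getFieldNames(): array
    {
        return array_keys($this->fields);
    }
}

class User extends Model
{
    public $table = 'user';

    protected function init(): void
    {
        $this->addField('name');
        $this->addField('email');
        $this->addField('password');
    }
}

// Same table, more fields and more actions.
class Moderator extends User
{
    protected function init(): void
    {
        parent::init();
        $this->addField('ban_reason');
    }

    public function canModerate(): bool
    {
        return true;
    }
}

// Same table, fewer fields for a narrower business case.
class PublicProfile extends User
{
    protected function init(): void
    {
        parent::init();
        $this->removeField('password');
    }
}

$moderator = new Moderator();
$profile = new PublicProfile();
```

Existing code that uses `User` keeps working unchanged, while new use cases get their own class on top of the same table.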

Operations with multiple rows

Now that the Active Record object has a state with no data, it can also be used as an iterator over multiple records. What is great about our implementation is that for each iteration we simply need to load data from PDO into our internal array.

But let’s also give a model the ability to have conditions, which are automatically applied when loading data. By using conditions you can narrow down the selection of a model. Some models can even have default conditions: an “Admin” model, for example, would only iterate through records with is_admin="Y".
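To make this concrete, here is a sketch built on PHP's `IteratorAggregate` interface. It assumes an in-memory array in place of a PDO result set, and the `addCondition()` and `init()` names are hypothetical. The point to notice is that a single model object is refilled on every iteration.

```php
<?php
class Model implements IteratorAggregate
{
    public $id = null;
    protected $data = [];
    protected $conditions = [];
    protected $rows; // stand-in for rows fetched through PDO: id => row

    public function __construct(array $rows)
    {
        $this->rows = $rows;
        $this->init();
    }

    protected function init(): void
    {
    }

    public function addCondition(string $field, $value): self
    {
        $this->conditions[$field] = $value;
        return $this;
    }

    public function get(string $field)
    {
        return $this->data[$field] ?? null;
    }

    protected function matches(array $row): bool
    {
        foreach ($this->conditions as $field => $value) {
            if (($row[$field] ?? null) !== $value) {
                return false;
            }
        }
        return true;
    }

    // Each step only swaps the internal array; no new objects are created.
    public function getIterator(): Traversable
    {
        foreach ($this->rows as $id => $row) {
            if (!$this->matches($row)) {
                continue;
            }
            $this->id = $id;
            $this->data = $row;
            yield $id => $this;
        }
    }
}

class Admin extends Model
{
    protected function init(): void
    {
        // Default condition applied to every load and iteration.
        $this->addCondition('is_admin', 'Y');
    }
}

$admin = new Admin([
    1 => ['name' => 'Ann', 'is_admin' => 'Y'],
    2 => ['name' => 'Bob', 'is_admin' => 'N'],
    3 => ['name' => 'Cid', 'is_admin' => 'Y'],
]);

$names = [];
$sameObject = true;
foreach ($admin as $id => $record) {
    $names[] = $record->get('name');
    $sameObject = $sameObject && ($record === $admin);
}
```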

Full Conditioning

With transactions available, we can now insert a record into the database and then attempt to load it. If the existing model conditions allow the record to be loaded, it was saved properly. Otherwise the newly saved record does not conform to the conditions and the transaction must be rolled back.

With this, the “Admin” model can no longer save non-admin users into the database. Full conditioning gives us strong assurance that no widget, piece of code or developer working with the Admin model can bypass its restrictions.

What’s even more valuable in SaaS applications is the ability to introduce a condition based on the currently logged-in user. Removing the need to always check a record’s author and having it done automatically gives great peace of mind to security-focused people.
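The insert-then-reload check can be sketched as follows. To keep the example self-contained it uses a tiny in-memory `FakeTable` with snapshot-based rollback instead of real PDO transactions; that class, along with `tryLoad()`, `addCondition()` and `init()`, is an assumption made for illustration.

```php
<?php
// In-memory stand-in for a transactional table (real code would use PDO).
class FakeTable
{
    public $rows = [];
    private $snapshot = [];
    private $nextId = 1;

    public function beginTransaction(): void
    {
        $this->snapshot = $this->rows;
    }

    public function commit(): void
    {
        $this->snapshot = [];
    }

    public function rollBack(): void
    {
        $this->rows = $this->snapshot;
    }

    public function insert(array $row): int
    {
        $id = $this->nextId++;
        $this->rows[$id] = $row;
        return $id;
    }
}

class Model
{
    public $id = null;
    protected $data = [];
    protected $conditions = [];
    protected $table;

    public function __construct(FakeTable $table)
    {
        $this->table = $table;
        $this->init();
    }

    protected function init(): void
    {
    }

    public function addCondition(string $field, $value): self
    {
        $this->conditions[$field] = $value;
        return $this;
    }

    public function set(string $field, $value): self
    {
        $this->data[$field] = $value;
        return $this;
    }

    // Succeeds only if the row exists and satisfies every condition.
    public function tryLoad(int $id): bool
    {
        $row = $this->table->rows[$id] ?? null;
        if ($row === null) {
            return false;
        }
        foreach ($this->conditions as $field => $value) {
            if (($row[$field] ?? null) !== $value) {
                return false;
            }
        }
        $this->id = $id;
        $this->data = $row;
        return true;
    }

    // Insert, then try to load the record back through the conditions.
    public function save(): self
    {
        $this->table->beginTransaction();
        $id = $this->table->insert($this->data);
        if (!$this->tryLoad($id)) {
            $this->table->rollBack();
            throw new RuntimeException('Saved record does not match model conditions');
        }
        $this->table->commit();
        return $this;
    }
}

class Admin extends Model
{
    protected function init(): void
    {
        $this->addCondition('is_admin', 'Y');
    }
}

$table = new FakeTable();
$admin = (new Admin($table))->set('name', 'Ann')->set('is_admin', 'Y')->save();

$rejected = false;
try {
    (new Admin($table))->set('name', 'Bob')->set('is_admin', 'N')->save();
} catch (RuntimeException $e) {
    $rejected = true; // the non-admin row was rolled back
}
```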

Agile Toolkit delivers an Improved Active Record Approach and more

The model implementation in Agile Toolkit offers all of the solutions described above and much more, including Joins, Expressions, Traversing and Behaviors. Find out more: http://agiletoolkit.org/doc/modeltable