This article is an answer to the question: "Is it OK that my Entities extend some 3rd party class?"
Consider this beautiful PHP code:
class Book {
    protected $title;
    protected $is_read = false;

    function makeRead() {
        $this->is_read = true;
    }
}
Isn't it nice when you can load a book from the database just like this:
$book = BookStore::loadBy('title', 'How to PHP');
$book->makeRead();
The reality, however, is riddled with practical problems, and the approach above simply does not always work.
I have looked through various PHP ORMs and non-PHP persistence mappers during some extensive research. In this article, I want to share my findings and some of the conclusions that led to my own solution to this problem in the Agile Data persistence framework. Hopefully I will be able to answer why some frameworks ask your Book class to extend a Model/Entity class and others don't.
Persistence frameworks have come a long way, but they originated in non-network environments. The core idea is that an object's data is stored in a local file. When the data is needed, it is loaded (hydration) and the object's attributes are populated. Changes you make to the attributes are stored back.
Frameworks are designed to "make us, developers, care less". They solve some of our problems, whether image watermarking, URL routing or long-term data storage. In this example, if we continue to "not care how data is persisted", we get into trouble:
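As a rough illustration of that load, modify and store cycle, here is a minimal sketch that hydrates an object from a local JSON file and writes the changes back. The `FileStore` and `Note` classes, their methods and the file name are hypothetical names invented for this example; they are not part of any framework discussed in this article.

```php
<?php
// Hypothetical file-backed store, for illustration only.
class FileStore
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    // Hydration: read the stored data and populate the object's public properties.
    public function load(object $object): void
    {
        if (!is_file($this->path)) {
            return; // nothing stored yet, keep the defaults
        }
        $data = json_decode(file_get_contents($this->path), true);
        foreach ($data as $name => $value) {
            if (property_exists($object, $name)) {
                $object->$name = $value;
            }
        }
    }

    // Persisting: write the current public property values back to the file.
    public function save(object $object): void
    {
        file_put_contents($this->path, json_encode(get_object_vars($object)));
    }
}

// Hypothetical object whose data lives in the file.
class Note
{
    public $title = '';
    public $is_read = false;
}

$store = new FileStore(sys_get_temp_dir() . '/note-example.json');

$note = new Note();
$note->title = 'How to PHP';
$note->is_read = true;
$store->save($note);

$copy = new Note();
$store->load($copy); // $copy now carries the stored values
```

With a local file this round trip is cheap, which is why the "just load the object" idea works well there and gets harder once the data sits behind a network connection.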
foreach ($book->chapters as $chapter) {
    echo $chapter->title.' by '.$chapter->author->name;
}
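To see why, here is a minimal sketch of what a naive lazy-loading mapper might do behind a loop like this one. `FakeDb`, `LazyChapter` and the sample rows are hypothetical stand-ins, not code from any real ORM; the fake database does nothing except count the queries it receives.

```php
<?php
// Hypothetical in-memory "database" that counts queries.
class FakeDb
{
    public $queries = 0;
    private $chapters = [
        ['title' => 'Intro',   'author_id' => 1],
        ['title' => 'Classes', 'author_id' => 2],
        ['title' => 'Closing', 'author_id' => 1],
    ];
    private $authors = [1 => 'Ann', 2 => 'Bob'];

    public function chaptersOfBook(): array
    {
        $this->queries++; // one query for the list of chapters
        return $this->chapters;
    }

    public function authorName(int $id): string
    {
        $this->queries++; // one more query for every author lookup
        return $this->authors[$id];
    }
}

// Hypothetical chapter object that loads its author only when asked.
class LazyChapter
{
    public $title;
    private $db;
    private $authorId;

    public function __construct(FakeDb $db, array $row)
    {
        $this->db = $db;
        $this->title = $row['title'];
        $this->authorId = $row['author_id'];
    }

    public function authorName(): string
    {
        return $this->db->authorName($this->authorId);
    }
}

$db = new FakeDb();
$lines = [];
foreach ($db->chaptersOfBook() as $row) {
    $chapter = new LazyChapter($db, $row);
    $lines[] = $chapter->title . ' by ' . $chapter->authorName();
}
// 3 chapters were listed with 1 + 3 = 4 queries instead of a single join.
```

The loop looks innocent, but the number of queries grows with the number of chapters, and none of that is visible in the calling code.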
"Not caring" is not always practical. Not long, you run into issues and the initial concept of "relieving you from being aware that data is stored elsewhere" crumbles.
When the concept's fundamental principle fails, it can be patched up or a new approach can be created. So how do some popular frameworks deal with it?
Doctrine pretty much chooses to patch the failed concept. It tries to make your entities look clean and simple. But things are not simple and that's why your comments now influence your code. Meet annotations:
class User
{
    /**
     * @ORM\Id @ORM\Column @ORM\GeneratedValue
     * @dummy
     * @var int
     */
    private $id;

    /**
     * @ORM\Column(type="string")
     * @Assert\NotEmpty
     * @Assert\Email
     * @var string
     */
    private $email;
}
Annotations give the Persistence Manager (the code that stores and loads data) more knowledge of your object. PHP does not have good type handling, so you annotate the types. If your database has a field named 'my-field', you cannot store it in a property of the same name, because a hyphen is not valid in a PHP property name, so you have to map it, again, through an annotation.
These ORMs don't even try to sell the concept of "not caring how data is persisted". Choosing practicality over conceptual design makes the developer painfully aware that data is stored elsewhere, and the "you'd better help me fetch it" concept takes over.
A practical solution requires PHP code (not annotations) to augment the "Entity". In other words, your Entity must now extend a class supplied by the framework to allow:
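To illustrate how comments can end up driving behaviour, here is a minimal sketch that reads a made-up `@Column(name="...")` tag from doc comments through PHP's reflection API and uses it to map a 'my-field' column onto a property. The tag syntax, the `Product` class and the `columnMap()` and `hydrate()` helpers are assumptions made for this example; this is not Doctrine's annotation parser, which is considerably more elaborate.

```php
<?php
// Hypothetical entity: the @Column tag is a simplified, made-up annotation.
class Product
{
    /**
     * @Column(name="my-field")
     */
    public $myField;

    /**
     * @Column(name="title")
     */
    public $title;
}

// Build a [column name => property name] map by reading the doc comments.
function columnMap(string $class): array
{
    $map = [];
    foreach ((new ReflectionClass($class))->getProperties() as $property) {
        $doc = $property->getDocComment(); // false when there is no doc comment
        if ($doc !== false && preg_match('/@Column\(name="([^"]+)"\)/', $doc, $m)) {
            $map[$m[1]] = $property->getName();
        }
    }
    return $map;
}

// Copy a database row into an object using that map.
function hydrate(object $object, array $row): void
{
    foreach (columnMap(get_class($object)) as $column => $property) {
        if (array_key_exists($column, $row)) {
            $object->$property = $row[$column];
        }
    }
}

$product = new Product();
hydrate($product, ['my-field' => 42, 'title' => 'Widget']);
```

Note that the mapping only works as long as the comments are still present at runtime, which is one reason comment-driven configuration can feel fragile.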
Agile Data has no Entities. It has Model, but it is different. Here is how: "In Agile Data, a Model instance represents a Set of Records":
$book = new Book($db);
// set of all Book entities
$chapter = $book->withID(1)->ref('Chapter');
// set of all Chapters related to Book with id=1
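To illustrate the "set of records" idea without depending on Agile Data itself, here is a minimal sketch of an object that only describes which rows and columns are wanted and defers any fetching. `RecordSet` and its `where()`, `only()` and `toSql()` methods are hypothetical names chosen for this example; they are not the Agile Data API.

```php
<?php
// Hypothetical class: one object describes a whole set of rows, not a single row.
class RecordSet
{
    private $table;
    private $conditions = [];
    private $fields = ['*'];

    public function __construct(string $table)
    {
        $this->table = $table;
    }

    // Narrow the set; returns a new object so the original set stays intact.
    public function where(string $field, $value): self
    {
        $copy = clone $this;
        $copy->conditions[$field] = $value;
        return $copy;
    }

    // Limit which columns will be fetched for this set.
    public function only(string ...$fields): self
    {
        $copy = clone $this;
        $copy->fields = $fields;
        return $copy;
    }

    // Describe the set as a parameterised query; nothing is fetched here.
    public function toSql(): array
    {
        $sql = 'select ' . implode(', ', $this->fields) . ' from ' . $this->table;
        if ($this->conditions) {
            $parts = [];
            foreach (array_keys($this->conditions) as $field) {
                $parts[] = $field . ' = ?';
            }
            $sql .= ' where ' . implode(' and ', $parts);
        }
        return [$sql, array_values($this->conditions)];
    }
}

$chapters = new RecordSet('chapter');               // all chapters
$ofBook   = $chapters->where('book_id', 1);         // chapters of book 1
[$sql, $params] = $ofBook->only('title')->toSql();  // still no data loaded
```

Because the object stands for a set rather than a loaded row, narrowing it or restricting its columns costs nothing until something actually iterates over it.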
Next, let's see how those different approaches help us address a practical problem.
Since we are now aware that the data is elsewhere, we want to fine-tune and minimize the amount of data retrieved and sent back to the database. How will different approaches deal with this task?
foreach ($book->chapters as $chapter) {
    echo $chapter->title.' by '.$chapter->author->name;
}
Our original code is not efficient. That's because iteration is not aware of what information we will need. Persistence Mappers that allow iterating related objects will have to make a choice from two equally bad options:
Doctrine can approach the problem with some DQL code:
$q = $em->createQuery("select partial b.{id,title} from MyApp\Domain\Book b");
but if you also want to get author.name in the same query, it becomes more and more cryptic: https://stackoverflow.com/a/9505215/204819
CakePHP (and a few other ORMs) solves the issue by using a separate read-only stream object, the query:
$query = $book->find()->select([
        'Book.id',
        'Book.title'
    ])
    ->contain([
        'RealestateAttributes' => [
            'fields' => [
                'Author.name',
            ]
        ]
    ])
    ->where($condition);
Similarly to Doctrine, this requires the developer to know the database intimately.
I understand how important selective queries are, especially in bigger applications. It's very important to let developers choose which data they receive without going into database implementation details.
How do we solve this problem in Agile Data?
Agile Data defines fields, references and referenced fields all in one place:
class Book extends \atk4\data\Model {
    public $table = 'book';

    function init() {
        parent::init();
        $this->hasMany('Chapter', new Chapter());
    }
}

class Chapter extends \atk4\data\Model {
    public $table = 'chapter';

    function init() {
        parent::init();
        $this->addField('title');
        $this->hasOne('author_id', new Author())
            ->addField('author', 'full_name');
    }
}

class Author extends \atk4\data\Model {
    public $table = 'author';

    function init() {
        parent::init();
        $this->addField('full_name');
    }
}
Yet when you create a model instance you can specify which fields to work with:
$book = new Book($db);
$chapter = $book->withID(1)->ref('Chapter');
$chapter->onlyFields('title', 'author');
Because Book extends Model, you can use its onlyFields, ref and withID methods. In contrast, Doctrine uses a raw user-defined Entity class, and CakePHP diverts query-specific logic into a separate Query object. Are we asking for trouble by making Model too smart? Well, ... no.
Firstly, Model and Persistence are fully separate in Agile Data, keeping it clean, but there is another major difference:
When Doctrine populates 100 instances of a class that extends nothing and contains only a few properties, that's actually not bad. Other PHP ORMs hack around this and try to avoid populating their heavy Entity objects by giving you plenty of other options, such as working with a Query class or fetching result arrays.
I saw the opportunity here to introduce a different approach.
In Agile Data, 1 model = a set of records. But it is a set you can operate on: update it, call methods and traverse it. The next two examples show how you can select specific fields for your query:
foreach ($book->ref('Chapter')->onlyFields('title', 'author') as $chapter) {
    echo $chapter['title'].' by '.$chapter['author'];
}
This uses a single query that fetches only 3 fields from the database, yet:
The design of Agile Data allows you to query without any database logic. You only need to decide which fields you want with your records. Quite often, you don't even have to make this choice yourself:
So suppose you want to display a CRUD listing the chapters of a book with a specific title and allow the user to edit those records. And then restrict it to only specific fields?
That's easily manageable through PHP code alone:
// Agile UI - Create a fully interactive CRUD for editing chapters of a specific book
$book = new Book($db);
$book->loadBy('title', 'How to PHP');
$crud = $app->add(new CRUD());
$crud->setModel($book->ref('Chapter'), ['title', 'author']);
So I wanted to give you some advice. Not on which data persistence to use, but rather on learning how to evaluate them.
If you use a database abstraction framework that magically saves and loads data, it is most likely to be impractical. You may run into issues with large data sets.
Whether through annotations, a YAML file or PHP method calls, one way or another you must tell the Persistence Mapper about your data. Some ways may involve extra file parsing, code generation or caching. Others may rely on the magical qualities of PHP.
I think that the more straightforward the approach, the better.
From the framework design perspective, extending is good, because it's more efficient and elegant than hacking around. Yet be aware: when your data framework spits 100 copies of your class at you, this code won't scale.
There are not many data persistence frameworks that try to address actual problems in a consistent and complete way. In this article I have mentioned only a few features of Agile Data, but I invite you to have a closer look at: