22. 7. 2010

Repository pattern

What is the repository pattern, and why does it make sense to apply it in our applications? Let's start with a short look at the history of software design to better understand why the pattern is needed today.


History
In the old days, programs were quite simple. User interface, logic and data access code usually lived in one file or one project, and ended up in a single executable. Complexity wasn't high, it somehow worked and everybody was happy. As requirements got more complex and programs grew bigger and bigger, we started to realize that in order to tame the chaos, several rules and patterns had to be introduced. We started to divide the functionality into logical pieces and to decouple those pieces from each other. We also discovered the importance of unit testing, and that good code is also easy to test. We came to understand that it is a good thing to separate the user interface (the views) from the domain, and the domain from the data access code and the data storage. The repository pattern helps in exactly these scenarios.


The repository pattern is an interface between your domain model and the data mapping functionality. It acts like an in-memory collection of domain entities (objects).


What does this mean in the real world? Consider this blog. Assume the data is stored in a SQL database. Let's say we are using the NHibernate data mapper to save our domain model to the database. What is our domain model, what are our domain entities? Nothing more than classes like Blog, Post and Comment. So far so good. I take an instance of, let's say, Comment, and using NHibernate I can very conveniently save it to the underlying data storage, in our example a SQL database. Now, let's repeat the definition of the pattern: an interface between the domain model (Post, Blog, Comment) and the data mapper (NHibernate, and in the end the SQL database). Why do we need to put something between the domain and the data mapper if everything works so nicely?

When not to use it
If your domain model isn't very complex and your data mapper is designed to aid unit testing, then the benefit will be low compared to the cost of one more level of complexity. It is important to remember that the repository pattern adds another layer of indirection between the model and the data mapper. However, we all know that even a very simple one-button application can grow into a big enterprise application.

Clean model
However, if your model is bigger than that of a simple application, your application will benefit from applying the repository pattern. The important thing is that the repository interface hides the implementation of the data mapper (3rd party or home grown). The repository acts as a collection of domain entities (e.g. the collection of Comments in a Post in a Blog). Only this interface is visible to your domain functionality and to the other layers. This ensures that your domain won't get polluted with data mapper code that naturally does not belong to the domain functionality. It helps you keep things separated and under control.

Unit testing
Additionally, working with an interface rather than with concrete classes greatly helps unit testing. It is very easy to mock the interface or to implement a dummy stub of it. Remember that code without tests is legacy code.
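As a minimal sketch of such a dummy stub, the following hand-written in-memory implementation stands in for the database in a test. It uses the IPostRepository interface presented later in this post; InMemoryPostRepository, PostService and TextLength are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public interface IPostRepository
{
    void Insert(Post post);
    Post GetById(int id);
    void Update(Post post);
    void Delete(Post post);
}

// Hypothetical test double: keeps posts in a dictionary instead of a database.
public class InMemoryPostRepository : IPostRepository
{
    private readonly Dictionary<int, Post> posts = new Dictionary<int, Post>();

    public void Insert(Post post) { posts.Add(post.Id, post); }

    public Post GetById(int id)
    {
        Post post;
        return posts.TryGetValue(id, out post) ? post : null;
    }

    public void Update(Post post) { posts[post.Id] = post; }

    public void Delete(Post post) { posts.Remove(post.Id); }
}

// Hypothetical domain code under test; it only knows the interface.
public class PostService
{
    private readonly IPostRepository repository;

    public PostService(IPostRepository repository) { this.repository = repository; }

    // Returns the length of the post text, or 0 if the post does not exist.
    public int TextLength(int postId)
    {
        Post post = repository.GetById(postId);
        return post == null || post.Text == null ? 0 : post.Text.Length;
    }
}

public static class StubDemo
{
    public static void Main()
    {
        var repository = new InMemoryPostRepository();
        repository.Insert(new Post { Id = 1, Text = "Hello" });

        var service = new PostService(repository);
        Console.WriteLine(service.TextLength(1)); // existing post
        Console.WriteLine(service.TextLength(2)); // missing post
    }
}
```

No database, connection string or data mapper is needed to exercise PostService here, which is the whole point of testing against the interface.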

Loosely coupled design
The last benefit is a nice side effect, but I consider it good practice: separate all (3rd party) tools from the application where possible. This allows you to replace one tool with a different one. Want to switch from NHibernate to Entity Framework in the .NET world? Not a very big problem. Of course, it depends on the complexity, but it still gives you the confidence that you can do it if it is really needed, for whatever reason.
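The following sketch illustrates the idea under simplifying assumptions: two interchangeable repository implementations stand in for two different data mappers (here just a dictionary and a list, not real NHibernate or Entity Framework code), and the client class stays the same for both. All class names except Post and IPostRepository are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public interface IPostRepository
{
    void Insert(Post post);
    Post GetById(int id);
    void Update(Post post);
    void Delete(Post post);
}

// Stand-in for a repository built on one data mapper (a dictionary here).
public class DictionaryPostRepository : IPostRepository
{
    private readonly Dictionary<int, Post> posts = new Dictionary<int, Post>();

    public void Insert(Post post) { posts[post.Id] = post; }
    public Post GetById(int id)
    {
        Post post;
        return posts.TryGetValue(id, out post) ? post : null;
    }
    public void Update(Post post) { posts[post.Id] = post; }
    public void Delete(Post post) { posts.Remove(post.Id); }
}

// Stand-in for a repository built on a different data mapper (a list here).
public class ListPostRepository : IPostRepository
{
    private readonly List<Post> posts = new List<Post>();

    public void Insert(Post post) { posts.Add(post); }
    public Post GetById(int id) { return posts.Find(p => p.Id == id); }
    public void Update(Post post) { Delete(post); Insert(post); }
    public void Delete(Post post) { posts.RemoveAll(p => p.Id == post.Id); }
}

// Client code: depends on the interface only, so swapping tools does not touch it.
public class BlogPublisher
{
    private readonly IPostRepository repository;

    public BlogPublisher(IPostRepository repository) { this.repository = repository; }

    // Stores a post and reads its text back through the repository.
    public string Publish(int id, string text)
    {
        repository.Insert(new Post { Id = id, Text = text });
        return repository.GetById(id).Text;
    }
}

public static class SwapDemo
{
    public static void Main()
    {
        // Only this composition code knows which implementation is in use.
        var first = new BlogPublisher(new DictionaryPostRepository());
        var second = new BlogPublisher(new ListPostRepository());

        Console.WriteLine(first.Publish(1, "Hello"));
        Console.WriteLine(second.Publish(1, "Hello"));
    }
}
```

In a real application the two implementations would wrap real data mappers, and the choice between them would typically be made in one place, for example in the DI container configuration.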

What does the interface look like
A repository is usually strongly domain oriented. In other words, we usually work with an interface that is dedicated to a concrete domain entity, and whose methods have strong domain names and input/output types. Imagine we have the domain entities (classes) Blog, Post and Comment. We would like to read and write posts from and to the data storage (a database). The common term for this functionality is CRUD (create, read, update, delete). Every repository should support CRUD. Consider the following C# code:

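using System.Collections.Generic;

// Domain entity: a blog post together with its comments.
public class Post
{
  public int Id { get; set; }
  public string Text { get; set; }
  // Comment is another domain entity (its class is not shown here).
  public IList<Comment> Comments { get; set; }
}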

public interface IPostRepository
{
  // create
  void Insert(Post post);
 
  // read
  Post GetById(int id);
 
  // update
  void Update(Post post);
 
  // delete
  void Delete(Post post);
}
 
The above interface is dedicated to the Post entity, and with this design you get strong typing. The client of the repository works with the interface as with an in-memory collection of data. It does not care (and does not want to care) what is going on behind the scenes (translating the method call to SQL, creating the SQL command, executing the command, converting the result to domain objects). This greatly simplifies the design of the client and ensures good testability.


A domain oriented repository is the suggested way of working with the repository pattern. It helps with the Dependency Injection (DI)/Inversion of Control (IoC) pattern and aids readability and strong typing.

Usually the data mappers provide the above functionality in some form. In most cases the implementation of the interface just wraps the data mapper's methods to provide domain oriented functionality for the entity (e.g. inserting a new Post).
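A rough sketch of such a wrapper might look as follows. The IDataMapper interface is a hypothetical, simplified stand-in for a real data mapper API (it is not NHibernate's or any other library's actual interface), and RecordingMapper is a toy implementation that exists only so the example runs without a database.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
    public string Text { get; set; }
}

public interface IPostRepository
{
    void Insert(Post post);
    Post GetById(int id);
    void Update(Post post);
    void Delete(Post post);
}

// Hypothetical, simplified data mapper API (an assumption for this sketch).
public interface IDataMapper
{
    void Save(object entity);
    T Load<T>(int id) where T : class;
    void Update(object entity);
    void Delete(object entity);
}

// The repository: a thin, domain-named wrapper around the generic mapper.
public class PostRepository : IPostRepository
{
    private readonly IDataMapper mapper;

    public PostRepository(IDataMapper mapper) { this.mapper = mapper; }

    public void Insert(Post post) { mapper.Save(post); }
    public Post GetById(int id) { return mapper.Load<Post>(id); }
    public void Update(Post post) { mapper.Update(post); }
    public void Delete(Post post) { mapper.Delete(post); }
}

// Toy mapper: records the calls it receives and remembers only the
// most recently saved entity instead of issuing SQL.
public class RecordingMapper : IDataMapper
{
    public readonly List<string> Calls = new List<string>();
    private object lastSaved;

    public void Save(object entity) { Calls.Add("Save"); lastSaved = entity; }
    public T Load<T>(int id) where T : class { Calls.Add("Load " + id); return lastSaved as T; }
    public void Update(object entity) { Calls.Add("Update"); lastSaved = entity; }
    public void Delete(object entity) { Calls.Add("Delete"); lastSaved = null; }
}

public static class WrapperDemo
{
    public static void Main()
    {
        var mapper = new RecordingMapper();
        IPostRepository repository = new PostRepository(mapper);

        repository.Insert(new Post { Id = 1, Text = "Hello" });
        Post loaded = repository.GetById(1);

        Console.WriteLine(loaded.Text);
        Console.WriteLine(string.Join(", ", mapper.Calls));
    }
}
```

The client only ever sees Insert(Post) and GetById(int); the generic, object-based mapper calls stay hidden inside PostRepository.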

Complex queries
The above interface is nice and easy, but it is also very simple and surely won't satisfy all our needs. Usually we need to run complex queries, and the interface needs to support them. This means nothing more than extending the repository interface with functionality that lets the client specify which entities it wants back. Of course, the repository is limited by the functionality of the underlying data mapper, but the good news is that most data mappers these days support complex queries. The repository just has to wrap them in a domain oriented way, that is, expose methods with strong domain names and types. This is called the Specification pattern (domain oriented), as opposed to the Query pattern (generic) that is usually implemented by the data mapper. But this is a topic for another post.
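Until then, a small sketch may make the idea more concrete. It assumes a hypothetical ISpecification interface and a Find method added to the repository; the repository interface is trimmed to Insert and Find for brevity, and the in-memory implementation simply filters a list, whereas a real repository would translate the specification into a query for the data mapper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Post
{
    public int Id { get; set; }
    public string Text { get; set; }
}

// Hypothetical specification contract: a named, reusable query criterion.
public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// Domain-named specification: posts whose text mentions a given word.
public class PostsMentioning : ISpecification<Post>
{
    private readonly string word;

    public PostsMentioning(string word) { this.word = word; }

    public bool IsSatisfiedBy(Post candidate)
    {
        return candidate.Text != null &&
               candidate.Text.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0;
    }
}

// Repository interface extended with a specification-based query
// (trimmed to two members to keep the sketch short).
public interface IPostRepository
{
    void Insert(Post post);
    IList<Post> Find(ISpecification<Post> specification);
}

// In-memory implementation that evaluates the specification per entity.
public class InMemoryPostRepository : IPostRepository
{
    private readonly List<Post> posts = new List<Post>();

    public void Insert(Post post) { posts.Add(post); }

    public IList<Post> Find(ISpecification<Post> specification)
    {
        return posts.Where(p => specification.IsSatisfiedBy(p)).ToList();
    }
}

public static class SpecificationDemo
{
    public static void Main()
    {
        IPostRepository repository = new InMemoryPostRepository();
        repository.Insert(new Post { Id = 1, Text = "Repository pattern" });
        repository.Insert(new Post { Id = 2, Text = "Unit testing" });
        repository.Insert(new Post { Id = 3, Text = "Specification PATTERN" });

        IList<Post> matches = repository.Find(new PostsMentioning("pattern"));
        Console.WriteLine(matches.Count);
    }
}
```

The client expresses what it wants in domain terms (PostsMentioning) and never writes a query in the data mapper's own language.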

Generic Repository
For those hungry minds who don't want to wait, you can check out the Generic Repository .NET project at http://code.google.com/p/genericrepository/. It is a project that provides built-in interfaces for repositories, unit of work, transactions and so on. The goal is to provide base classes for various data mappers like NHibernate, Linq2Sql and Entity Framework, and for NoSQL mappers like RavenDb, Mongo or CouchDB, so you don't have to re-implement the same functionality over and over again in your projects (which also causes bugs and decreases the quality of the product). With it, you get access to a data storage very quickly. Java, Ruby and other developers can benefit from studying the interfaces. The main entry point is the base generic repository interface for all strongly typed domain repositories. Complex queries are handled using the specification pattern. The test projects provide simple usage examples and tests. More documentation is available on the project page.
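To give a feeling for what a base generic repository interface can look like, here is a small sketch. It is not taken from the Generic Repository project; IRepository, InMemoryRepository and the other names are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

public class Post
{
    public int Id { get; set; }
    public string Text { get; set; }
}

// Hypothetical generic base interface holding the shared CRUD members.
public interface IRepository<TEntity, TKey>
{
    void Insert(TEntity entity);
    TEntity GetById(TKey id);
    void Update(TEntity entity);
    void Delete(TEntity entity);
}

// The strongly typed domain repository inherits CRUD and can add
// domain-specific members of its own.
public interface IPostRepository : IRepository<Post, int> { }

// Reusable base class, so CRUD is written once for all entities.
public class InMemoryRepository<TEntity> : IRepository<TEntity, int>
    where TEntity : class
{
    private readonly Dictionary<int, TEntity> items = new Dictionary<int, TEntity>();
    private readonly Func<TEntity, int> keyOf;

    // keyOf tells the base class how to read an entity's identifier.
    public InMemoryRepository(Func<TEntity, int> keyOf) { this.keyOf = keyOf; }

    public void Insert(TEntity entity) { items.Add(keyOf(entity), entity); }

    public TEntity GetById(int id)
    {
        TEntity entity;
        return items.TryGetValue(id, out entity) ? entity : null;
    }

    public void Update(TEntity entity) { items[keyOf(entity)] = entity; }

    public void Delete(TEntity entity) { items.Remove(keyOf(entity)); }
}

// Concrete repository: only supplies what is specific to Post.
public class InMemoryPostRepository : InMemoryRepository<Post>, IPostRepository
{
    public InMemoryPostRepository() : base(post => post.Id) { }
}

public static class GenericDemo
{
    public static void Main()
    {
        IPostRepository posts = new InMemoryPostRepository();
        posts.Insert(new Post { Id = 7, Text = "Generic repository" });
        Console.WriteLine(posts.GetById(7).Text);
    }
}
```

The client still works with the domain-named IPostRepository, while the repetitive CRUD code lives in a single generic base class.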
