One way to optimize Entity Framework queries is to gather all the information you KNOW you will need at the time the query is executed, to avoid additional deferred queries. A common way of doing this is the .Include() method, which specifies additional tables/entities to be pulled when the query is run. There’s a catch, though: the .Include() method takes a string to indicate the objects to retrieve. This is a significant problem for Agile developers, or in any situation where your database or object schema is likely to undergo churn or refactoring. Consider the following excerpt from an Entity Framework 4 EDMX file, where we have a Job entity and a parent JobType entity.
When we retrieve a job and want to include the job type information as part of the same query, the initial implementation would look something like this.
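Here is a minimal sketch of that string-based pattern. To keep it runnable without a database, `FakeQuery<T>` is a hypothetical stand-in for an EF query, not the real EF type; it only mimics the contract of a string-based Include, where the path is checked at run time instead of by the compiler. The `Job` and `JobType` classes are simplified assumptions based on the entities described above.

```csharp
using System;
using System.Reflection;

// Hypothetical entities mirroring the Job -> JobType relationship described above.
public class JobType
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Job
{
    public int Id { get; set; }
    public JobType JobType { get; set; }
}

// Stand-in for an EF query (NOT the real EF class). It only mimics the
// string-based Include contract: the compiler never sees inside the string.
public class FakeQuery<T>
{
    public FakeQuery<T> Include(string path)
    {
        // Walk each dot-separated segment and look it up by reflection.
        Type current = typeof(T);
        foreach (string segment in path.Split('.'))
        {
            PropertyInfo property = current.GetProperty(segment);
            if (property == null)
            {
                throw new InvalidOperationException(
                    "'" + current.Name + "' has no property named '" + segment + "'.");
            }
            current = property.PropertyType;
        }
        return this;
    }
}

public static class Program
{
    public static void Main()
    {
        var jobs = new FakeQuery<Job>();

        // Compiles and runs: the string happens to match a property on Job.
        jobs.Include("JobType");

        // Also compiles, but blows up only when executed, e.g. after a rename.
        try
        {
            jobs.Include("JobCategory");
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine("Run-time failure: " + ex.Message);
        }
    }
}
```

The stand-in fails as soon as Include is called, which is a simplification. Either way, the point is the same: the compiler is happy with both strings.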
I don’t know about you, but the use of a magic string value to represent the parent table leaves me with a feeling of impending doom. If you decide to rename your parent entity to JobCategory, or introduce another entity between JobType and Job, your code will fail and you won’t catch it until execution time.
Magic strings are undergoing something of a renaissance in Microsoft technologies, and it’s a little disconcerting. ASP.NET MVC is riddled with them, and a group of like-minded developers has come up with the wonderful T4MVC template to eliminate or greatly reduce the need for these pesky buggers. Anything that insulates your code against refactoring like this is a tremendous boon.
In our current project, we are using the ADO.NET Self-Tracking Entity Generator Template to create entities that keep track of their own changes but which are ignorant of their storage mechanism. Our application server tier is implemented in WCF, and this template is well-suited to architectures where the consuming application and the application server are both .NET-based.
Eliminating the magic strings requires a minor modification to the T4 template that generates the entities. Our insertion begins after line 194 of this template, which ends the region that generates complex properties for our entities, and before navigation properties are generated. Here’s what we’re inserting.
When T4 is run again, we get something like the following added to our entity definition for the Job class.
A public static inner class is generated for each entity, allowing us to rewrite our EF query like so:
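As a rough illustration, here is a hand-written approximation of what such a generated inner class and the rewritten call could look like. The inner class name `Includes` is an assumption made for this sketch and may not match what the template actually emits, and the `context.Jobs` call in the comment uses a hypothetical context.

```csharp
using System;

public class JobType
{
    public int Id { get; set; }
}

// Hand-written approximation of the generated entity. The nested class name
// "Includes" is an assumption for this sketch.
public partial class Job
{
    public int Id { get; set; }
    public JobType JobType { get; set; }

    public static class Includes
    {
        // One constant per navigation property, regenerated from the model.
        public const string JobType = "JobType";
    }
}

public static class Program
{
    public static void Main()
    {
        // With EF, the call would read something like (context is hypothetical):
        //   context.Jobs.Include(Job.Includes.JobType)
        // If JobType is renamed in the model, the regenerated class no longer
        // defines Includes.JobType, and this line stops compiling.
        string path = Job.Includes.JobType;
        Console.WriteLine(path); // prints "JobType"
    }
}
```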
Are the magic strings still there? Absolutely. But they’re defined in such a way that if your database schema or entity model changes significantly, your code will break at compile time, which is vastly preferable to a run-time code bomb.
OK, what about those queries where we need to retrieve entities that are more than one level away? Consider the following additional excerpt from the EDMX. A job is at a location, and a location is tied to a state/province.
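One way to handle the two-level case is to build the path from the per-entity constants. Include accepts dot-separated paths such as "Location.StateProvince", so the segments only need to be joined. This is a sketch under assumptions: `IncludePath.Combine` is a hypothetical helper and `Includes` is an assumed inner class name, neither of which comes from the template itself.

```csharp
using System;

public class StateProvince
{
    public int Id { get; set; }
}

public class Location
{
    public int Id { get; set; }
    public StateProvince StateProvince { get; set; }

    public static class Includes
    {
        public const string StateProvince = "StateProvince";
    }
}

public class Job
{
    public int Id { get; set; }
    public Location Location { get; set; }

    public static class Includes
    {
        public const string Location = "Location";
    }
}

// Hypothetical helper: joins segments with '.', the separator used for
// multi-level Include paths.
public static class IncludePath
{
    public static string Combine(params string[] segments)
    {
        return string.Join(".", segments);
    }
}

public static class Program
{
    public static void Main()
    {
        // Each segment is a compile-time constant, so renaming either
        // navigation property breaks the build instead of the query.
        string path = IncludePath.Combine(
            Job.Includes.Location,
            Location.Includes.StateProvince);

        Console.WriteLine(path); // prints "Location.StateProvince"
    }
}
```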
Hey, what’s with the “new”? Well, if you are using any sort of entity inheritance, the generated inner class on a child entity uses the new keyword to hide the inner class of the same name inherited from the parent class. This will come up with either of the common inheritance schemes in EF4, table-per-hierarchy (TPH) or table-per-type (TPT). If you are using TPT, be aware that it can carry a real performance cost, since queries across the hierarchy tend to produce large, join-heavy SQL, so measure before you commit to it.
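To make the hiding concrete, here is a small sketch with a hypothetical `ContractJob` entity deriving from `Job`. The `Includes` class name and the `Contractor` entity are assumptions for illustration only.

```csharp
using System;

public class JobType
{
    public int Id { get; set; }
}

public class Contractor
{
    public int Id { get; set; }
}

public class Job
{
    public int Id { get; set; }
    public JobType JobType { get; set; }

    public static class Includes
    {
        public const string JobType = "JobType";
    }
}

// Hypothetical derived entity; the mapping (TPH or TPT) makes no difference
// to how the C# classes look.
public class ContractJob : Job
{
    public Contractor Contractor { get; set; }

    // "new" hides the Includes class inherited from Job. Without it, the
    // compiler emits warning CS0108 (member hides inherited member).
    public new static class Includes
    {
        // Static classes cannot inherit from one another, so the parent's
        // entries are repeated here.
        public const string JobType = Job.Includes.JobType;
        public const string Contractor = "Contractor";
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ContractJob.Includes.JobType);    // prints "JobType"
        Console.WriteLine(ContractJob.Includes.Contractor); // prints "Contractor"
    }
}
```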