One way to optimize Entity Framework queries is to gather all the information you KNOW you will need at the time the query is executed, avoiding additional deferred queries. A common way of doing this is the .Include() method, which specifies additional tables/entities to be pulled when the query runs. There’s a catch: the .Include() method takes a string to indicate the objects to retrieve. This presents a significant problem for Agile developers, or in any situation where your database or object schema is likely to undergo significant churn or refactoring. Consider the following excerpt from an Entity Framework 4 EDMX file, where we have a Job entity and a parent JobType entity.
When we retrieve a job and want to include the job type information as part of the same query, the initial implementation would look something like this.
- Job job = entityContext.Jobs
- .Include("JobType")
- .SingleOrDefault(j => j.JobId == jobId);
I don’t know about you, but the use of a magic string value to represent the parent table leaves me with a feeling of impending doom. If you decide to rename your parent entity to JobCategory, or introduce another entity between JobType and Job, your code will fail and you won’t catch it until execution time.
Magic strings are undergoing something of a renaissance in Microsoft technologies, and it’s a little disconcerting. ASP.NET MVC is riddled with them, and a group of like-minded developers has come up with the wonderful T4MVC template to eliminate or greatly reduce the need for these pesky buggers. Insulating your code against refactoring this way is a tremendous boon.
In our current project, we are using the ADO.NET Self-Tracking Entity Generator Template to create entities that keep track of their own changes but which are ignorant of their storage mechanism. Our application server tier is implemented in WCF, and this template is well-suited to architectures where the consuming application and the application server are both .NET-based.
Eliminating the magic strings requires a minor modification to the T4 template that generates the entities. Our insertion begins after line 194 of this template, which ends the region that generates complex properties for our entities, and before navigation properties are generated. Here’s what we’re inserting.
- region.Begin("Include Reference Names");
- #>
- public static <#=(entity.BaseType == null ? "" : "new ")#>class IncludeReferences
- {
- <#
- foreach (NavigationProperty navProperty in entity.NavigationProperties.Where(np => np.DeclaringType == entity))
- {
- #>
- public static readonly string For<#=code.Escape(navProperty)#> = "<#=code.Escape(navProperty)#>";
- <#
- }
- #>
- }
- <#
- region.End();
When T4 is run again, we get something like the following added to our entity definition for the Job class.
- #region Include Reference Names
- public static class IncludeReferences
- {
- public static readonly string ForJobType = "JobType";
- }
- #endregion
A public static nested class is generated for each entity, allowing us to rewrite our EF query like so:
- Job job = entityContext.Jobs
- .Include(Job.IncludeReferences.ForJobType)
- .SingleOrDefault(j => j.JobId == jobId);
Are the magic strings still there? Absolutely. But they’re defined in such a way that if your database schema or entity model changes significantly, your code will break at compile time, which is vastly preferable to a run-time code bomb.
OK, what about those queries where we need to retrieve entities that are more than one level away? Consider the following additional excerpt from the EDMX. A job is at a location, and a location is tied to a state/province.
No problem.
- Job job = entityContext.Jobs
- .Include(Job.IncludeReferences.ForJobType)
- .Include(Job.IncludeReferences.ForLocation)
- .Include(Job.IncludeReferences.ForLocation + "." + Location.IncludeReferences.ForStateProvince)
- .FirstOrDefault(j => j.JobId == jobId);
This can now devolve into a heated debate vis-à-vis the relative merits of appending strings vs. String.Format() vs. String.Join(). Nothing like bringing a gun to a knife-fight.
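Whichever side of that debate you land on, the joining can be tucked away behind a small helper. What follows is only a sketch: IncludePath is a hypothetical class of my own naming, not part of Entity Framework or the template, and the Job and Location classes are hand-written stand-ins for the generated entities.

```csharp
using System;

// Stand-ins for the generated entities; only the nested constants matter here.
public class Job
{
    public static class IncludeReferences
    {
        public static readonly string ForJobType = "JobType";
        public static readonly string ForLocation = "Location";
    }
}

public class Location
{
    public static class IncludeReferences
    {
        public static readonly string ForStateProvince = "StateProvince";
    }
}

// Hypothetical helper (an assumption, not an EF API): joins navigation
// property names into the dotted path that .Include() expects.
public static class IncludePath
{
    public static string Combine(params string[] segments)
    {
        if (segments == null || segments.Length == 0)
        {
            throw new ArgumentException("At least one segment is required.", "segments");
        }
        return string.Join(".", segments);
    }
}
```

The two-level include from the query above would then read .Include(IncludePath.Combine(Job.IncludeReferences.ForLocation, Location.IncludeReferences.ForStateProvince)).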
FINAL NOTE:
Hey, what’s with the “new”? Well, if you are using any sort of entity inheritance, your child entities need the new keyword so that their IncludeReferences class deliberately hides the one inherited from the parent class. This will come up if you are using the inheritance schemes commonly set up in EF4, table-per-hierarchy (TPH) and table-per-type (TPT). If you are using TPT, be aware that there are significant performance considerations.
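To make the hiding concrete, here is a hand-written sketch of roughly what the template output looks like for a parent and a child entity. ContractJob and its Contractor navigation property are hypothetical names invented for this illustration; they are not in the model described above.

```csharp
using System;

// Parent entity: no base type, so the template emits no "new" modifier.
public class Job
{
    public static class IncludeReferences
    {
        public static readonly string ForJobType = "JobType";
    }
}

// Hypothetical child entity. Without "new" the compiler raises warning
// CS0108, because this nested class hides Job.IncludeReferences.
public class ContractJob : Job
{
    public static new class IncludeReferences
    {
        // Only navigation properties declared on ContractJob itself appear
        // here, mirroring the DeclaringType filter in the template.
        public static readonly string ForContractor = "Contractor";
    }
}
```

One consequence worth knowing: ContractJob.IncludeReferences contains only the child's own names, so the inherited JobType reference is still reached through Job.IncludeReferences.ForJobType.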

Code First CTP5 already gives the option to get rid of the magic .Include() strings. See the first eager loading example here – http://blogs.msdn.com/b/adonet/archive/2011/01/31/using-dbcontext-in-ef-feature-ctp5-part-6-loading-related-entities.aspx
Thanks for the update. Will be nice to have that as an addition once it’s out of CTP status. The solution above offers a performance benefit as well (no lambda expression at runtime).
Thanks again!
Also, not needed in this case since CTP5 solves it, but a good way to check that the expression returns a member is to cast expression.Body to MemberExpression with the as operator.
In that solution you could check whether the result of the cast is null and throw an ArgumentException or something similar.
This way you can stop developers from passing other kinds of expressions, like Include(x => x.ToString()).
(hope my english it’s understandable)
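The check described in the comment above might look something like the sketch below. IncludeName and its For method are hypothetical names chosen for illustration, not part of EF or CTP5, and the Job and JobType classes are bare stand-ins for the real entities.

```csharp
using System;
using System.Linq.Expressions;

// Bare stand-ins for the entities discussed in the post.
public class JobType { }

public class Job
{
    public JobType JobType { get; set; }
}

// Hypothetical helper (an assumption, not an EF API).
public static class IncludeName
{
    // Returns the name of the member selected by the lambda, or throws
    // if the lambda body is anything other than a member access.
    public static string For<TEntity, TProperty>(Expression<Func<TEntity, TProperty>> selector)
    {
        if (selector == null)
        {
            throw new ArgumentNullException("selector");
        }

        // The "as" cast yields null for method calls, constants, and so on.
        var member = selector.Body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "Expression must be a simple member access, such as x => x.JobType.",
                "selector");
        }

        return member.Member.Name;
    }
}
```

With this in place, IncludeName.For<Job, JobType>(j => j.JobType) yields the string JobType, while IncludeName.For<Job, string>(j => j.ToString()) throws an ArgumentException. Note that this sketch only returns the last member in a chain, so a multi-level selector would need extra handling.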