
LINQ and the Law of Demeter

During Mike Wagg and Mark Needham’s talk on applying functional constructs to your C# code I had a bit of an epiphany. Derick Bailey recently wrote about how extension methods don’t count towards the Law of Demeter. For those of you unfamiliar with the Law of Demeter, it says that you shouldn’t play with your friends’ members. You can look at them but you shouldn’t touch them. For a much drier and more complete description you should read the Wikipedia article on the Law of Demeter.

Now a lot of the application of the Law of Demeter comes down to feel. As a geek this is annoying. I would much prefer something you can measure and then say you are breaking the Law of Demeter. Instead we have to have conversations about how I feel you’re breaking the Law of Demeter and you can argue that you feel you’re not. How… unscientific.

So all I can offer is a bit of insight into what I feel contributes to the Law of Demeter in terms of LINQ and extension methods in a more general sense.

Take this example LINQ statement:

company.Employees.Where(employee => employee.Salary > 10)

If we embark on a purely dot-counting exercise we’ll see that we have three. In my eyes there are only two relevant dots, the one before Employees and the one before Salary, as I look at the statement like this:

company.Employees.Where(employee => employee.Salary > 10)
       ^                                    ^

If you consider what we are doing we are looking at the employees of the company, which is one step. Then for each employee we are inspecting their salary, which is a second step. The LINQ statement is not doing extra traversal for us, it’s just helping us write less code.

Does that mean this LINQ statement is ok? No. Two relevant dots start my spider sense tingling. I can smell that some encapsulation is required. What we really want is to pull this business logic into our domain:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why would you do this? It seems like a lot of additional work for such a simple LINQ statement. This is one of the things that Mike Wagg highlighted that I would echo. If you don’t encapsulate this logic from the start you will end up with 10 easy-to-write LINQ statements at various places in your code that do the exact same thing. After all, it’s easier to write the query again than to refactor other calls back into your domain.
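To make that concrete, here is a minimal sketch of a domain class that owns the rule. The Employee constructor, the Hire method and the HighSalaryThreshold constant are my own assumptions for illustration; only HighlyPaidEmployees and the value 10 come from the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; }
    public decimal Salary { get; }

    public Employee(string name, decimal salary)
    {
        Name = name;
        Salary = salary;
    }
}

public class Company
{
    // Hypothetical named constant; the post uses the literal 10.
    private const decimal HighSalaryThreshold = 10m;

    private readonly List<Employee> employees = new List<Employee>();

    // Hypothetical helper so the sketch has some data to query.
    public void Hire(Employee employee) => employees.Add(employee);

    // The one place that knows what "highly paid" means.
    public IEnumerable<Employee> HighlyPaidEmployees()
    {
        return employees.Where(employee => employee.Salary > HighSalaryThreshold);
    }
}

public static class Program
{
    public static void Main()
    {
        var company = new Company();
        company.Hire(new Employee("Alice", 12m));
        company.Hire(new Employee("Bob", 8m));

        // Callers ask the company; they never see the salary comparison.
        foreach (var employee in company.HighlyPaidEmployees())
        {
            Console.WriteLine(employee.Name);
        }
    }
}
```

If the definition of highly paid changes, there is one method to edit rather than ten scattered queries to hunt down.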

Also notice how we now have only a single dot per method. Demeter would be pleased:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why can we ignore most LINQ statements when thinking about Demeter? It’s because of what the code would be like if they weren’t being used:

public IEnumerable<Employee> HighlyPaidEmployees()
{
    foreach (var employee in employees)
    {
        if (employee.Salary > 10)
            yield return employee;
    }
}
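As a quick sanity check that Where is doing nothing more than this loop, here is a small sketch that runs both forms over the same data. It uses plain integers as stand-in salaries, and the class and method names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WhereEquivalence
{
    // Hand-rolled filter: one loop, one test, no extra traversal.
    public static IEnumerable<int> WithLoop(IEnumerable<int> salaries)
    {
        foreach (var salary in salaries)
        {
            if (salary > 10)
                yield return salary;
        }
    }

    // The same filter expressed with LINQ.
    public static IEnumerable<int> WithLinq(IEnumerable<int> salaries)
    {
        return salaries.Where(salary => salary > 10);
    }

    public static void Main()
    {
        var salaries = new[] { 5, 12, 10, 30 };

        // Both forms yield the same items in the same order.
        Console.WriteLine(WithLoop(salaries).SequenceEqual(WithLinq(salaries)));
    }
}
```

Both versions touch each item once and look at a single value, which is why the dot in front of Where adds nothing to the traversal count.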

When using your own extension methods, the logic contained within them may not be as flat and simple. Just by using a single extension method you may be breaking the Law of Demeter. If the method you are calling breaks the Law of Demeter then by calling that method you also break the Law of Demeter. Think of it as handling stolen goods: just because you didn’t steal them doesn’t mean you’re not culpable.
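Here is a rough sketch of what I mean. The ManagedBy extension method and the Manager, Department and Employee classes are hypothetical, invented purely to show the shape of the problem.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Manager
{
    public string Name { get; set; }
}

public class Department
{
    public Manager Manager { get; set; }
}

public class Employee
{
    public string Name { get; set; }
    public Department Department { get; set; }
}

public static class EmployeeExtensions
{
    // Hypothetical extension method. The call site shows a single dot,
    // but the body still walks employee -> Department -> Manager.
    public static IEnumerable<Employee> ManagedBy(
        this IEnumerable<Employee> employees, Manager manager)
    {
        return employees.Where(employee => employee.Department.Manager == manager);
    }
}

public static class Program
{
    public static void Main()
    {
        var dave = new Manager { Name = "Dave" };
        var sandra = new Manager { Name = "Sandra" };
        var employees = new List<Employee>
        {
            new Employee { Name = "Alice", Department = new Department { Manager = dave } },
            new Employee { Name = "Bob", Department = new Department { Manager = sandra } },
        };

        // Looks innocent here, yet it inherits the traversal hidden in ManagedBy.
        foreach (var employee in employees.ManagedBy(dave))
        {
            Console.WriteLine(employee.Name);
        }
    }
}
```

The calling code reads cleanly, but it is only as well behaved as the method it leans on.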

The Law of Demeter boils down to the number of layers you traverse as part of your statements. Here's a LINQ statement that would make Demeter cry:

company.Employees.Where(employee => employee.Department.Manager == dave)

What we are doing is getting everyone who is in the department managed by Dave. We are traversing down to the employees of the company, up to their department and across to their manager.

company.Employees.Where(employee => employee.Department.Manager == dave)
       ^                                    ^          ^
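One way to stop Demeter crying, in the same spirit as HighlyPaidEmployees, is to let each object answer a question about its own neighbour. This is only a sketch under my own assumptions: the IsManagedBy, WorksFor, EmployeesManagedBy and Hire names are hypothetical and not from any real codebase.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Manager
{
    public string Name { get; set; }
}

public class Department
{
    private readonly Manager manager;

    public Department(Manager manager)
    {
        this.manager = manager;
    }

    // The department answers questions about its own manager.
    public bool IsManagedBy(Manager candidate) => manager == candidate;
}

public class Employee
{
    private readonly Department department;

    public string Name { get; }

    public Employee(string name, Department department)
    {
        Name = name;
        this.department = department;
    }

    // The employee asks its department rather than exposing it to callers.
    public bool WorksFor(Manager candidate) => department.IsManagedBy(candidate);
}

public class Company
{
    private readonly List<Employee> employees = new List<Employee>();

    public void Hire(Employee employee) => employees.Add(employee);

    // One relevant dot per method: each layer only talks to the next one.
    public IEnumerable<Employee> EmployeesManagedBy(Manager manager)
    {
        return employees.Where(employee => employee.WorksFor(manager));
    }
}

public static class Program
{
    public static void Main()
    {
        var dave = new Manager { Name = "Dave" };
        var sandra = new Manager { Name = "Sandra" };

        var company = new Company();
        company.Hire(new Employee("Alice", new Department(dave)));
        company.Hire(new Employee("Bob", new Department(sandra)));

        foreach (var employee in company.EmployeesManagedBy(dave))
        {
            Console.WriteLine(employee.Name);
        }
    }
}
```

Whether the extra methods are worth it is still a matter of feel, but the traversal now happens one layer at a time instead of all in one statement.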

Conversely, here’s a LINQ statement that is no worse than the one we started with:

company.Employees
    .Where(employee => employee.Salary > 10)
    .Where(employee => employee.Manager == sandra)
    .Where(employee => employee.StartDate < DateTime.Today.AddYears(-2))

Why is this only as bad as the first example even though there are a hell of a lot more dots? It's because we're looking at a company's employees, then searching for the ones who are highly paid, work for Sandra and have been at the company for more than two years. Every filter inspects a direct property of the employee, so the number of traversals is the same even though there's a lot more code.

The one thing I would want you to take away from this is that the Law of Demeter is not a dot-counting exercise. Think about what your code is doing rather than what it looks like and you’ll have a much better idea of whether Demeter wants to hurt you.