I have moved my active blog over to tumblr. I've maintained this blog for reference but will be posting to http://www.robustsoftware.co.uk instead. I've pointed my Feedburner feed to tumblr so if you're subscribed already you should already have switched with me.

Currying with C#

At the Brighton altnetbeers on Tuesday night we ended up talking quite a bit about functional languages. One of the concepts I often forget the meaning of is currying. Mike Hadlow was kind enough to remind me of what it was and I thought I’d write a blog post explaining what it was in terms of C# to help me remember in the future.

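Currying is what lets you call a function without providing all the required parameters. Instead of failing to compile, functional languages such as F# will return a function that takes the unspecified arguments as parameters. (Strictly speaking, currying is the transformation of a multi-argument function into a chain of single-argument functions; supplying only some of the arguments is partial application, which currying makes possible.)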

Now that probably makes bugger all sense without knowing what currying is in the first place. As this is meant to be an explanation I’ll show you an example of how we can emulate currying in C#. This is almost better than showing you it in a functional language as it will show the nuts and bolts of what functional languages do for you.

I’ll be dropping my usual use of var as specifying the types should make it easier to follow what’s going on.

We’ll start with a simple method to add two numbers together:

int Sum(int x, int y)
{
    return x + y;
}

At the moment we can only call our method by providing both arguments:

int sum = Sum(2, 3);

What currying does is create an overload that lets you call the method with one argument. But rather than pass a default value for the second argument and call the original function, it instead returns a function that accepts the second argument.

Wow, this is hard to explain in words. Maybe some code will make it clearer:

Func<int, int> Sum(int x)
{
    return y => Sum(x, y);
}

This lets you store an instance of a function that performs a certain manipulation like this:

Func<int, int> addTwo = Sum(2);

int three = addTwo(1);
int five = addTwo(3);
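
Writing an overload by hand for every method gets old quickly. As a rough sketch, the same trick can be done once with a generic helper. The ToCurried name and the CurryingSketch class are made up for this illustration and aren't part of the original code:

```csharp
using System;

public static class CurryingSketch
{
    // Hypothetical helper: turns a two-argument function into a chain of
    // one-argument functions, which is what the hand-written overload does.
    public static Func<T1, Func<T2, TResult>> ToCurried<T1, T2, TResult>(
        Func<T1, T2, TResult> f)
    {
        return x => y => f(x, y);
    }

    public static int Sum(int x, int y)
    {
        return x + y;
    }

    public static void Main()
    {
        Func<int, Func<int, int>> curriedSum = ToCurried<int, int, int>(Sum);
        Func<int, int> addTwo = curriedSum(2);

        Console.WriteLine(addTwo(1)); // 3
        Console.WriteLine(addTwo(3)); // 5
    }
}
```

The helper only covers two-argument functions; each extra argument would need another overload with one more nested lambda.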

With a single function this is a novelty. The power comes when you can curry functions together. To demonstrate this we'll need to add a multiply method:

int Multiply(int x, int y)
{
    return x * y;
}

Func<int, int> Multiply(int x)
{
    return y => Multiply(x, y);
}

Functional languages also let you specify functions as arguments in place of specific values. That starts getting a little trickier to emulate in C#:

An hour or two passes whilst I work out how to do this!

It took me a while to work it out but you need to get a whole lot more functional to do this bit. That means all our methods have to start using Func<int> instead of plain int. In hindsight this makes sense because in functional languages functions are values you pass around like any other, hence the name. For this emulation even numbers are represented by functions that return the number in question rather than the number itself; I'll refer to these as literals.

This is what this actually looks like:

Func<int> Sum(Func<int> x, Func<int> y)
{
    return Literal(x() + y());
}

Func<Func<int>, Func<int>> Sum(Func<int> x)
{
    return y => Sum(x, y);
}

Func<int> Multiply(Func<int> x, Func<int> y)
{
    return Literal(x() * y());
}

Func<Func<int>, Func<int>> Multiply(Func<int> x)
{
    return y => Multiply(x, y);
}

Func<int> Literal(int x)
{
    return () => x;
}

The Literal method isn’t strictly required, but it makes the code a lot clearer than having empty lambdas dotted about the place.

To take a step back for a second, our original sums:

Func<int, int> addTwo = Sum(2);

int three = addTwo(1);
int five = addTwo(3);

Will now look like this:

Func<Func<int>, Func<int>> addTwo = Sum(Literal(2));

Func<int> three = addTwo(Literal(1));
Func<int> five = addTwo(Literal(3));

It's more verbose but it lets us curry functions together which was our original aim.

On to the currying function itself. Now if you're allergic to angled brackets and lambdas you are going to want to skip this chunk of code. It's pretty horrendous.

Func<Func<int>, Func<int>> Curry(Func<Func<int>, Func<int>> x, Func<Func<int>, Func<int>> y)
{
    return a => y(x(a));
}

This creates a new function which, when passed a literal, invokes the x function with it and then returns the result of passing that on to the y function. (Strictly speaking, chaining two functions like this is function composition rather than currying, but the name Curry is kept here.) Again, an example will probably help demonstrate what's going on.

Func<Func<int>, Func<int>> addTwo = Sum(Literal(2));
Func<Func<int>, Func<int>> timesThree = Multiply(Literal(3));
Func<Func<int>, Func<int>> addTwoTimesThree = Curry(addTwo, timesThree);

Func<int> twentyOne = addTwoTimesThree(Literal(5));

Because of how the code is now structured, we could keep nesting these simple functions together to create much more complex functions. This is the root of the power of functional languages.
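
To make the nesting concrete, here is a minimal sketch that feeds the result of one Curry call into another. It repeats the single-argument Sum, Multiply, Literal and Curry methods from above so that it stands alone; the NestingSketch class and the AddTwoTimesThreeMinusOne method are invented for this example:

```csharp
using System;

public static class NestingSketch
{
    public static Func<int> Literal(int x)
    {
        return () => x;
    }

    public static Func<Func<int>, Func<int>> Sum(Func<int> x)
    {
        return y => Literal(x() + y());
    }

    public static Func<Func<int>, Func<int>> Multiply(Func<int> x)
    {
        return y => Literal(x() * y());
    }

    public static Func<Func<int>, Func<int>> Curry(
        Func<Func<int>, Func<int>> x,
        Func<Func<int>, Func<int>> y)
    {
        return a => y(x(a));
    }

    // Hypothetical example: builds ((n + 2) * 3) - 1 purely by nesting.
    public static Func<Func<int>, Func<int>> AddTwoTimesThreeMinusOne()
    {
        Func<Func<int>, Func<int>> addTwo = Sum(Literal(2));
        Func<Func<int>, Func<int>> timesThree = Multiply(Literal(3));
        Func<Func<int>, Func<int>> minusOne = Sum(Literal(-1));

        return Curry(Curry(addTwo, timesThree), minusOne);
    }

    public static void Main()
    {
        Func<int> twenty = AddTwoTimesThreeMinusOne()(Literal(5));
        Console.WriteLine(twenty()); // 20
    }
}
```

Because every step has the same Func<Func<int>, Func<int>> shape, the output of one Curry call is a valid input to the next, which is what makes the nesting possible.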

Well that was interesting. I definitely understand currying and functional languages a bit better now and I hope someone else has learnt something in the process.

I’ve put a full code listing up on gist in case anyone wants to grab it and play about with it.

Mapping a GUID to a string with NHibernate

In a project I’m working on at the moment I needed to map a GUID to a string as it didn’t seem to be working out of the box with my SQLite database. It was saving it to whatever type of column I gave it in what looked like byte form, leading to some crazy characters. However, it was falling over when trying to hydrate the field from that column.

Now I’m probably epic failing and there’s a much easier way to do this, but I ended up creating a custom user type for it, which I've made available on gist. I won’t be surprised if there’s an easier way, so please let me know about it with a comment. I’ve also included the Fluent NHibernate class map to show how to use it if you’re not familiar with custom user types.

This got me moving again and googling how to do this came up blank, hence the blog post.

LINQ and the Law of Demeter

During Mike Wagg and Mark Needham’s talk on applying functional constructs to your C# code I had a bit of an epiphany. Derick Bailey recently wrote about how extension methods don’t count towards the Law of Demeter. For those of you unfamiliar with the Law of Demeter, it says that you shouldn’t play with your friends’ members. You can look at them but you shouldn’t touch them. For a much drier and more complete description you should read the Wikipedia article on the Law of Demeter.

Now a lot of the application of the Law of Demeter comes down to feel. As a geek this is annoying. I would much prefer something you can measure and then say you are breaking the Law of Demeter. Instead we have to have conversations about how I feel you’re breaking the Law of Demeter and you can argue that you feel you’re not. How… unscientific.

So all I can offer is a bit of insight into what I feel contributes to the Law of Demeter in terms of LINQ and extension methods in a more general sense.

Take this example LINQ statement:

company.Employees.Where(employee => employee.Salary > 10)

If we embark on a purely dot counting exercise we’ll see that we have three. In my eyes there are only two relevant dots: the one in company.Employees and the one in employee.Salary. The dot before Where doesn’t count.

If you consider what we are doing we are looking at the employees of the company, which is one step. Then for each employee we are inspecting their salary, which is a second step. The LINQ statement is not doing extra traversal for us, it’s just helping us write less code.

Does that mean this LINQ statement is ok? No. Two relevant dots start my spider sense tingling. I can smell that some encapsulation is required. What we really want is to pull this business logic into our domain:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why would you do this? It seems like a lot of additional work for such a simple LINQ statement. This is one of the things that Mike Wagg highlighted that I would echo. If you don’t encapsulate this logic from the start you will end up with 10 easy-to-write LINQ statements at various places in your code that do the exact same thing. After all it’s easier to write the query again rather than refactor other calls back into your domain.

Also notice how we now only have a single dot per method. Demeter would be pleased:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why can we ignore most LINQ statements when thinking about Demeter? It’s because of what the code would be like if they weren’t being used:

public IEnumerable<Employee> HighlyPaidEmployees()
{
    foreach (var employee in employees)
    {
        if (employee.Salary > 10)
            yield return employee;
    }
}

When using your own extension methods, the logic contained within them may not be as flat and simple. Just by using a single extension method you may be breaking the Law of Demeter. If the method you are calling breaks the Law of Demeter then by calling that method you also break the Law of Demeter. Think of it as handling stolen goods: just because you didn’t steal them doesn’t mean you’re not culpable.

The Law of Demeter boils down to the number of layers you traverse as part of your statements. Here's a LINQ statement that would make Demeter cry:

company.Employees.Where(employee => employee.Department.Manager == dave)

What we are doing is getting everyone who is in the department managed by Dave. We are traversing down to the employees of the company (company.Employees), up to their department (employee.Department) and across to their manager (Department.Manager).

Conversely, here’s a LINQ statement that is no worse than the one we started with:

company.Employees
    .Where(employee => employee.Salary > 10)
    .Where(employee => employee.Manager == sandra)
    .Where(employee => employee.StartDate < DateTime.Today.AddYears(-2))

Why is this no worse even though there are a hell of a lot more dots? It's because we're looking at a company's employees, then searching for the ones who are highly paid, work for Sandra and have been at the company more than two years. The number of traversals is the same even though there's a lot more code.
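
The Dave statement could be tamed the same way as HighlyPaidEmployees, by letting each object answer the question about its own members. This is only a sketch of one possible shape. The IsManagedBy, EmployeesManagedBy and Hire methods are made up for illustration, and I'm assuming a manager is just another Employee:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Department
{
    public Employee Manager { get; set; }

    // The department answers questions about its own manager.
    public bool IsManagedBy(Employee candidate)
    {
        return Manager == candidate;
    }
}

public class Employee
{
    public string Name { get; set; }
    public Department Department { get; set; }

    // The employee asks its department, so callers never reach through it.
    public bool IsManagedBy(Employee manager)
    {
        return Department != null && Department.IsManagedBy(manager);
    }
}

public class Company
{
    private readonly List<Employee> employees = new List<Employee>();

    public void Hire(Employee employee)
    {
        employees.Add(employee);
    }

    // One traversal per method: company -> employees, nothing deeper.
    public IEnumerable<Employee> EmployeesManagedBy(Employee manager)
    {
        return employees.Where(employee => employee.IsManagedBy(manager));
    }
}

public static class DemeterSketch
{
    public static void Main()
    {
        Employee dave = new Employee { Name = "Dave" };
        Department sales = new Department { Manager = dave };
        Company company = new Company();
        company.Hire(new Employee { Name = "Alice", Department = sales });
        company.Hire(new Employee { Name = "Bob", Department = new Department() });

        foreach (Employee employee in company.EmployeesManagedBy(dave))
            Console.WriteLine(employee.Name); // Alice
    }
}
```

Each method now only talks to its immediate friend, at the cost of a couple of small forwarding methods.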

The one thing I would want you to take away from this is that the Law of Demeter is not a dot counting exercise. Think about what your code is doing rather than what it looks like and you’ll have a much better idea of whether Demeter wants to hurt you.