I have moved my active blog over to tumblr. I've maintained this blog for reference but will be posting to http://www.robustsoftware.co.uk instead. I've pointed my Feedburner feed at tumblr, so if you're already subscribed you should have switched with me automatically.

Currying with C#

At the Brighton altnetbeers on Tuesday night we ended up talking quite a bit about functional languages. One of the concepts I often forget the meaning of is currying. Mike Hadlow was kind enough to remind me, and I thought I'd write a blog post explaining it in terms of C# to help me remember in the future.

Currying is what happens when you call a function without providing all the required parameters. Instead of failing to compile, functional languages such as F# will return a function that takes the unspecified arguments as parameters.
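In C# terms, the shape of a curried function can be sketched with nested lambdas. This is just a one-line preview of where we're heading; sum, addTwo and five are illustrative names:

Func<int, Func<int, int>> sum = x => y => x + y;

Func<int, int> addTwo = sum(2); // supply the first argument now
int five = addTwo(3);           // supply the second argument later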

Now that probably makes bugger all sense without knowing what currying is in the first place. As this is meant to be an explanation, I'll show you an example of how we can emulate currying in C#. This is almost better than showing it to you in a functional language, as it exposes the nuts and bolts of what functional languages do for you.

I’ll be dropping my usual use of var as specifying the types should make it easier to follow what’s going on.

We’ll start with a simple method to add two numbers together:

int Sum(int x, int y)
{
    return x + y;
}

At the moment we can only call our method by providing both arguments:

int sum = Sum(2, 3);

What currying does is create an overload that lets you call the method with one argument. But rather than passing a default value for the second argument and calling the original function, it returns a function that takes the second argument and completes the call.

Wow, this is hard to explain in words. Maybe some code will make it clearer:

Func<int, int> Sum(int x)
{
    return y => Sum(x, y);
}

This lets you store a function that performs a particular manipulation, like this:

Func<int, int> addTwo = Sum(2);

int three = addTwo(1);
int five = addTwo(3);

With a single function this is a novelty. The power comes when you can curry functions together. To demonstrate this we'll need to add a multiply method:

int Multiply(int x, int y)
{
    return x * y;
}

Func<int, int> Multiply(int x)
{
    return y => Multiply(x, y);
}

Functional languages also let you specify functions as arguments in place of specific values. That starts getting a little trickier to emulate in C#:

(An hour or two passes whilst I work out how to do this!)

It took me a while to work it out, but you need to get a whole lot more functional to do this bit. That means all our methods have to start using Func<int> instead of plain int. In hindsight this was obvious: in functional languages everything is a function, hence the name. Even numbers are represented by functions, referred to as literals, that return the number in question rather than the number itself.

This is what that actually looks like:

Func<int> Sum(Func<int> x, Func<int> y)
{
    return Literal(x() + y());
}

Func<Func<int>, Func<int>> Sum(Func<int> x)
{
    return y => Sum(x, y);
}

Func<int> Multiply(Func<int> x, Func<int> y)
{
    return Literal(x() * y());
}

Func<Func<int>, Func<int>> Multiply(Func<int> x)
{
    return y => Multiply(x, y);
}

Func<int> Literal(int x)
{
    return () => x;
}

The Literal method isn't strictly required, but it makes the code a lot clearer than having empty lambdas dotted about the place.

To take a step back for a second, our original sums:

Func<int, int> addTwo = Sum(2);

int three = addTwo(1);
int five = addTwo(3);

Will now look like this:

Func<Func<int>, Func<int>> addTwo = Sum(Literal(2));

Func<int> three = addTwo(Literal(1));
Func<int> five = addTwo(Literal(3));

It's more verbose but it lets us curry functions together which was our original aim.

On to the currying function itself. If you're allergic to angle brackets and lambdas you are going to want to skip this chunk of code. It's pretty horrendous (in an awesome kind of way).

Func<Func<int>, Func<int>> Curry(Func<Func<int>, Func<int>> x, Func<Func<int>, Func<int>> y)
{
    return a => y(x(a));
}

What this does is create a new function which, when passed a literal, invokes the x function and then passes the result on to the y function. Again, an example will probably help demonstrate what's going on.

Func<Func<int>, Func<int>> addTwo = Sum(Literal(2));
Func<Func<int>, Func<int>> timesThree = Multiply(Literal(3));
Func<Func<int>, Func<int>> addTwoTimesThree = Curry(addTwo, timesThree);

Func<int> twentyOne = addTwoTimesThree(Literal(5));

Because of how the code is now structured, we could keep nesting these simple functions together to create much more complex ones. This is the root of the power of functional languages.
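For example, building on the functions we already have, a composed function can itself be composed:

Func<Func<int>, Func<int>> addOne = Sum(Literal(1));
Func<Func<int>, Func<int>> addTwoTimesThreeAddOne = Curry(Curry(addTwo, timesThree), addOne);

Func<int> twentyTwo = addTwoTimesThreeAddOne(Literal(5)); // ((5 + 2) * 3) + 1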

Well that was interesting. I definitely understand currying and functional languages a bit better now and I hope someone else has learnt something in the process.

I’ve put a full code listing up on gist in case anyone wants to grab it and play about with it.

Mapping a GUID to a string with NHibernate

In a project I'm working on at the moment I needed to map a GUID to a string, as it didn't seem to work out of the box with my SQLite database. It was saving the GUID to whatever type of column I gave it in what looked like byte form, leading to some crazy characters. However, it was falling over when trying to hydrate the field from that column.

Now I'm probably epic failing and there's a much easier way to do this, but I ended up creating a custom user type, which I've made available on gist. I won't be surprised if there's an easier way; please let me know about it with a comment. I've also included the Fluent NHibernate class map to show how to use it if you're not familiar with custom user types.
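For reference, a minimal sketch of what such a user type might look like. This is from memory of the classic NHibernate IUserType interface of the time and is not necessarily identical to what's in the gist:

using System;
using System.Data;
using NHibernate;
using NHibernate.SqlTypes;
using NHibernate.UserTypes;

public class GuidAsStringUserType : IUserType
{
    // store the GUID in a string column of 36 characters
    public SqlType[] SqlTypes
    {
        get { return new[] { SqlTypeFactory.GetString(36) }; }
    }

    public Type ReturnedType
    {
        get { return typeof(Guid); }
    }

    public object NullSafeGet(IDataReader rs, string[] names, object owner)
    {
        var value = (string)NHibernateUtil.String.NullSafeGet(rs, names[0]);
        return value == null ? (object)null : new Guid(value);
    }

    public void NullSafeSet(IDbCommand cmd, object value, int index)
    {
        NHibernateUtil.String.NullSafeSet(cmd, value == null ? null : value.ToString(), index);
    }

    public new bool Equals(object x, object y) { return object.Equals(x, y); }
    public int GetHashCode(object x) { return x.GetHashCode(); }
    public object DeepCopy(object value) { return value; } // Guid is immutable
    public bool IsMutable { get { return false; } }
    public object Replace(object original, object target, object owner) { return original; }
    public object Assemble(object cached, object owner) { return cached; }
    public object Disassemble(object value) { return value; }
}

The Fluent NHibernate side would then be something along the lines of Map(x => x.SomeGuidProperty).CustomType<GuidAsStringUserType>(); (SomeGuidProperty being whatever GUID property you're mapping).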

This got me moving again. Googling for how to do this came up blank, hence the blog post.

LINQ and the Law of Demeter

During Mike Wagg and Mark Needham's talk on applying functional constructs to your C# code I had a bit of an epiphany. Derick Bailey recently wrote about how extension methods don't count towards the Law of Demeter. For those of you unfamiliar with the Law of Demeter, it says that you shouldn't play with your friends' members. You can look at them but you shouldn't touch them. For a much drier and more complete description you should read the Wikipedia article on the Law of Demeter.

Now a lot of the application of the Law of Demeter comes down to feel. As a geek I find this annoying. I would much prefer something you can measure so you can say definitively that someone is breaking the Law of Demeter. Instead we have to have conversations about how I feel you're breaking the Law of Demeter, and you can argue that you feel you're not. How… unscientific.

So all I can offer is a bit of insight into what I feel contributes to the Law of Demeter in terms of LINQ and extension methods in a more general sense.

Take this example LINQ statement:

company.Employees.Where(employee => employee.Salary > 10)

If we embark on a purely dot counting exercise we'll see that we have three. In my eyes only two of them are relevant: the dot after company and the dot after employee inside the lambda. The dot before Where is just plumbing.

If you consider what we are doing, we are looking at the employees of the company, which is one step. Then for each employee we are inspecting their salary, which is a second step. The LINQ statement is not doing extra traversal for us; it's just helping us write less code.

Does that mean this LINQ statement is OK? No. Two relevant dots set my spider sense tingling. I can smell some encapsulation is required. What we really want is to pull this business logic into our domain:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why would you do this? It seems like a lot of additional work for such a simple LINQ statement. This is one of the things that Mike Wagg highlighted and that I would echo. If you don't encapsulate this logic from the start, you will end up with 10 easy-to-write LINQ statements in various places in your code that do the exact same thing. After all, it's easier to write the query again than to refactor the other calls back into your domain.

Also notice how we now have only a single dot per method; Demeter would be pleased:

company.HighlyPaidEmployees()

public IEnumerable<Employee> HighlyPaidEmployees()
{
    return employees.Where(employee => employee.Salary > 10);
}

Why can we ignore most LINQ statements when thinking about Demeter? It’s because of what the code would be like if they weren’t being used:

public IEnumerable<Employee> HighlyPaidEmployees()
{
    foreach (var employee in employees)
    {
        if (employee.Salary > 10)
            yield return employee;
    }
}

When using your own extension methods, the logic contained within them may not be as flat and simple. Just by using a single extension method you may be breaking the Law of Demeter: if the method you are calling breaks the Law of Demeter, then by calling that method you also break it. Think of it as handling stolen goods: just because you didn't steal them doesn't mean you're not culpable.
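As a made-up illustration, this extension method presents a single innocent-looking dot at the call site while the guilty traversal lives inside it:

public static class EmployeeExtensions
{
    public static IEnumerable<Employee> ManagedBy(this IEnumerable<Employee> employees, Employee manager)
    {
        // the extra traversals are still here, just hidden from the caller
        return employees.Where(employee => employee.Department.Manager == manager);
    }
}

So company.Employees.ManagedBy(dave) reads like one step but still traverses down to the department and across to the manager.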

The Law of Demeter boils down to the number of layers you traverse as part of your statements. Here's a LINQ statement that would make Demeter cry:

company.Employees.Where(employee => employee.Department.Manager == dave)

What we are doing is getting everyone who is in the department managed by Dave. We are traversing down to the employees of the company, up to their department and across to their manager: three relevant dots, three traversals.

Conversely, here's a LINQ statement that is no worse than the one we started with:

company.Employees
    .Where(employee => employee.Salary > 10)
    .Where(employee => employee.Manager == sandra)
    .Where(employee => employee.StartDate < DateTime.Today.AddYears(-2))

Why is this no worse, even though there are a hell of a lot more dots? It's because we're looking at a company's employees, then filtering to the ones who are highly paid, work for Sandra and have been at the company for more than two years. The number of traversals is the same even though there's a lot more code.
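That said, the earlier encapsulation advice still applies to this query; something like the following (the method name is mine):

public IEnumerable<Employee> LongServingHighlyPaidEmployeesOf(Employee manager)
{
    return employees
        .Where(employee => employee.Salary > 10)
        .Where(employee => employee.Manager == manager)
        .Where(employee => employee.StartDate < DateTime.Today.AddYears(-2));
}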

The one thing I would want you to take away from this is that the Law of Demeter is not a dot counting exercise. Think about what your code is doing rather than what it looks like and you’ll have a much better idea of whether Demeter wants to hurt you.

Describing the journey

Often when explaining something it's easy to jump straight to the end goal, saying: "This is how to do it. Isn't it awesome?!" At this point you will get a blank stare and quite possibly a "why?" This can be frustrating: you're showing something that's an order of magnitude better than the current solution but they can't see it.

This isn’t their problem. They aren’t idiots. It’s your fault.

In order to understand a solution you have to understand each step that was taken to reach it. You may think you're doing them a favour by helping them avoid all the trials and tribulations you went through, but you're not. You're doing them a disservice. Understanding the journey taken will give them a deeper knowledge and understanding of the destination.

If, when you were learning to ride a bike, your dad had ridden past shouting "Look, isn't this great? Now you do it", you'd have tried, faceplanted and gone home crying. The pain of learning to ride a bike would have been your burden alone. You might still have learnt to ride, in a rather painful way, but you would have been more likely to give up. What your dad actually did was hold on to you, put stabilisers on your bike and teach you each part of the process individually. He shared in your pain, invested in you and ultimately shared in your success.

If you really invest in teaching something to someone you'll tell the story. Make sure they understand the initial problem, then explain the various approaches you took before selecting the final one. Describe the ones you dismissed out of hand and why. With all this knowledge the student will be in a position to say "this is awesome" and the teacher won't have to.

Do anything less and you aren’t sharing your knowledge, you’re sharing a solution. If all you share are solutions you aren’t enabling people to create their own.

What’s so good about OpenRasta?

I've been proclaiming the greatness of OpenRasta to anyone unfortunate enough to start talking to me about web development recently. I thought it was about time I recorded the reasons I love it so much somewhere everyone can see.

OpenRasta is by no means a finished product but it has principles at its core that I value greatly and give it massive potential.

Testability

Unless you've been hiding under a rock for the past few years you'll know how important I and many others think testability is. If you don't know why it's important I suggest you go and check out Scott Bellware's blog. He has written a series of posts recently about how accountability and constant verification are vital in the search for productivity.

OpenRasta has been built with tests from the ground up and it shines through the whole code base. I've never seen a code base like it; it's beautiful. The entire framework is a testament to the SOLID design principles, which means extending it is child's play. When you're used to frameworks like ASP.NET MVC, where you will eventually code yourself into a sealed or extremely hard-to-test corner, OpenRasta is like a typhoon of fresh air.

Building upon a testable framework makes your code easier to test. For example, how many magically bound properties do you think your handlers (the equivalent of controllers) have? Zero. Zilch. They don’t even have an interface, never mind a base class. I’ll let you digest that one for a moment. No base class, no monolithic controller context, no magically appearing model state.

So what happens when you want to look at the details of a request or any of the state that OpenRasta has created as part of your request? OpenRasta, to enable its testability, revolves around a dependency resolver. All the details of the request are registered with this dependency resolver, so all you need to do is take the relevant interface (yes, I did say interface, not class; welcome to a world where everything you wish had an interface does) as a dependency of your handler. It really is that simple.
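To illustrate, a handler might look something like this (CustomerHandler and Customer are names I've made up; IRequest is one of the interfaces the dependency resolver can hand you):

public class CustomerHandler
{
    readonly IRequest request;

    // the dependency resolver supplies IRequest when it creates the handler
    public CustomerHandler(IRequest request)
    {
        this.request = request;
    }

    public object Get(int id)
    {
        // request details are available without any magic base class
        return new Customer { Id = id };
    }
}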

Extensibility

From the testability you get a great deal of extensibility. There are several ways to jump in at various points of the request in order to manipulate it. The main ones are pipeline contributors and operation interceptors, which I've shown how to implement before. OpenRasta comes with a default set of contributors and interceptors but gives you the ability to remove or replace any of them. Sure, you can get yourself in a real mess if you don't know what you're doing, but I like my tools sharp, powerful and dangerous. Especially something as pivotal as the framework I'm building upon.

You can of course implement your own binders, URI resolvers, etc. if you so wish. You could alter OpenRasta to work just like ASP.NET MVC if you desired. Though I wouldn’t recommend it!

As part of extensibility you can’t overlook the fact that this is true open source. Wish there was an enhancement to the core of OpenRasta or an extra method on an interface? Discuss it on the mailing list, code it (with tests of course) and submit a patch.

RESTful

What I want to see from a web framework is a deep understanding of web standards and the HTTP protocol. You will not find many people who know more about RFCs and HTTP than Sebastien Lambla; it's actually quite disturbing how much he knows about them.

OpenRasta revolves around resources not controllers, generating URIs from types and individual objects rather than from strings or, at best, lambda expressions. This takes a little time to get used to but it's more powerful and easier to test. Why is it easier to test? You probably saw this coming: URI resolution is of course done behind an interface. Ahhhhh. So much easier than mocking a controller context.

OpenRasta also ships with content type negotiation. Your handler operations (akin to controller actions) return straight resources or operation results with a response resource attached. How this resource is returned to the client is an entirely separate function. Want to get a response in JSON instead of XHTML? Add .AsJsonDataContract() to your resource registration and just ask for it with the relevant Accept header. OpenRasta will transcode your resource to JSON for you and return it to the client.
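As a rough sketch of the registration (the resource and handler names are mine, and this is from memory of OpenRasta's fluent configuration API):

using (OpenRastaConfiguration.Manual)
{
    ResourceSpace.Has.ResourcesOfType<Customer>()
        .AtUri("/customers/{id}")
        .HandledBy<CustomerHandler>()
        .AsJsonDataContract();
}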

If you directly return a resource from your operation, it is assumed that the request went OK and a response code of 200 OK is sent back to the client. If you want to report a missing resource you can return an OperationResult.NotFound, which will return 404 Not Found rather than just displaying your 404 view with a response of 200 OK. Why does this matter? It makes your JavaScript cleaner, as you can request the resource in JSON format and check the response code rather than having to send a flag or some other workaround.
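In handler terms that might look like this (the repository and Customer are again invented for the example):

public object Get(int id)
{
    var customer = repository.Get(id);

    if (customer == null)
        return new OperationResult.NotFound(); // sent to the client as 404 Not Found

    return customer; // sent as 200 OK with the customer as the response resource
}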

Community

There's a growing following for OpenRasta; it's being presented by Kyle Baley at MIX10 and this has caused the lovers of OpenRasta to pull together. We're intending to improve the documentation, giving more examples of its use and generally helping Seb knock the edges off what is a great framework.

The OpenRasta mailing list is very active with people providing very rapid responses to any questions. If you uncover a bug Seb will often have a fix committed the very same day, with regression tests of course.

What next?

In the near future I'm going to churn out a series of blog posts, or perhaps screencasts, showing how I leveraged vanilla OpenRasta to create a simple wiki. Hopefully this will help more people get started with OpenRasta.

If there’s anything in particular you’d like me to cover leave a comment below.

Questioning commonly held opinions

I've been looking into Rails recently, partly to learn Rails itself, partly to try and gain inspiration for the direction to take both the framework we use at work and Jamaica. It's triggered me to question a lot of things that I'm doing in .NET, so I'm going to attempt to record my thoughts.

Having testing and persistence baked in

One thing that has struck me is that I'm watching Railscasts from years ago and they are demonstrating solutions to problems ASP.NET MVC is struggling with now. Rails really does have testing baked in, but often not through the use of abstractions; instead they have made integration tests easy to write. It appears the entire request, all the way through to rendering the view, can be processed in memory without having to fire up a server. This is something OpenRasta is capable of and one of the reasons I get so excited about it.

Another benefit Rails has is that it comes with a "model" baked in. This is something that is sorely lacking from ASP.NET MVC. Whereas the .NET world (particularly the ALT.NET sphere) favours NHibernate for persistence, Rails instead uses a simple but well-implemented active record library. Why have they chosen this? Partially because it works so well with a dynamic language: everything is DRY, as code can be generated from the database at runtime. It's also damn quick; something that has always bugged me about NHibernate is the 10-20 seconds it takes to create the session factory before you can get started.

Because NHibernate is so powerful and configurable it can quite frankly be bloody confusing for most developers; I get asked to help with problems almost daily. With active record everything is kept simple: this object is a row in that table. There aren't often complex mapping files; it just works. Rails' active record implementation has a layer of syntactic sugar based around pattern matching on missing methods, which looks nice and works well. However, in .NET we have LINQ, which is statically typed, very powerful and can serve the same purpose.

Why do we choose NHibernate?

The reasons given for choosing NHibernate are persistence ignorance and reducing the impedance mismatch, amongst others. Why do we insist on ignoring the fact that our applications are running on top of a relational database? Whatever happened to embracing the technologies we are using to get the best out of them?

NHibernate lets you persist your domain models directly. So what? Whose domain models are really that far beyond dumb data stores? I'd wager a hell of a lot fewer than the number who claim to be doing domain-driven design. Also, why does your domain have to be persisted directly? Why couldn't you provide an empty domain object with a few active record objects and have it process them and spit other active record objects out the other side? Something CQRS-ish. Then your domain would contain nothing but business logic.

What does this mean?

I'm looking away from NHibernate; no tool should be chosen without question, even the great NHibernate. I just don't think it is serving my needs as well as it could, so I'm going to look for alternatives. SubSonic is top of my list, followed by revisiting LINQ to SQL; hell, I might even take a look at the Entity Framework.

While we're at it, why do we choose relational databases without question? I've been working with Lucene.NET a lot recently; it's really easy to use and you'd get full-text indexing for free if you used it as a data store. Only want to deal with aggregate roots? Store all the data of that aggregate root in a single document; documents can easily emulate nested structures.
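As a flavour of what that might look like, here's a sketch from memory of the Lucene.NET 2.x API, with a made-up Customer aggregate:

var directory = FSDirectory.Open(new DirectoryInfo("index"));
var writer = new IndexWriter(directory, new StandardAnalyzer(), true);

// one document per aggregate root; nested objects flatten into prefixed fields
var doc = new Document();
doc.Add(new Field("Id", customer.Id.ToString(), Field.Store.YES, Field.Index.NOT_ANALYZED));
doc.Add(new Field("Name", customer.Name, Field.Store.YES, Field.Index.ANALYZED));
doc.Add(new Field("Address.City", customer.Address.City, Field.Store.YES, Field.Index.ANALYZED));

writer.AddDocument(doc);
writer.Close();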

Random deviation onto REST

I’m also being influenced by how REST appeals to me. It seems like a simple yet powerful way of dealing with the web. If we consider everything we work with as a resource, a representation of an idea, why can’t that representation be stored as an active record? The transition of an idea between resources could actually be a record transferring between tables. The old representation could even hold a reference to what the idea became, allowing you to redirect to the current representation. Hell, wasn’t the web made for storing and linking documents before it got bastardised into the beautiful beast we have today? Perhaps a document database would be ideally suited!

Pain points of ASP.NET MVC

Talking of REST brings me back to another problem I have with ASP.NET MVC. The routing revolves around controllers and actions rather than resources and HTTP methods. In order to generate URIs, you will at some point end up having to manipulate strings or create anonymous objects. How do Rails and OpenRasta deal with this? By using resources and their types to determine the URIs. This is so much more powerful and, dare I say, easier to use and understand. I can see it empowering the use of generics within .NET and it cuts out a hell of a lot of crap around URI generation.

Ah, cutting out the crap. ASP.NET MVC is testable, until you want to test something that isn't just a controller action. Like an HTML helper, URI generation or code that stores the IP address of the request. Most of these things are kind of testable, but you have to mock out a complex object graph like HttpContextBase or ControllerContext. "We got rid of HttpContext.Current and replaced it with something slightly less monolithic and slightly less sealed, go us!" Know how OpenRasta deals with this? It has dependency injection at its core and there's an interface that lets you retrieve everything you need, either separately or together. There's everything from IRequest to ICommunicationContext (the equivalent of ControllerContext), and all points in between. You need it, you can retrieve it and it's bloody easy to test.

Wrapping up

There are a lot of random ideas here and I can't say which I'll end up using. Perhaps none of them, perhaps an unforeseen combination. If nothing else, writing this down has helped me distil a few of them into clearer thoughts.

Here’s to questioning everything you believe to be true.

Test-Driven Development – 3 Years On

I've been questioning my own practices recently, seeing if there were places I could improve, both in quality and productivity. Part of this process involved an evaluation of my test-driven development (TDD) approach to building software.

I have gained and learnt so much by practicing TDD. I believe there is no better way to instil the SOLID principles, both in yourself and in your code, than to practice TDD. Violate any of the principles and you feel the pain in your tests: they will either become monolithic and hard to write or continuously break.

Here lies a chicken-and-egg situation: to write good, robust tests you have to understand SOLID; to understand SOLID you have to be testing your code. This creates quite a hurdle that many people don't put forth the effort to overcome. That's a shame, but I think it's an excellent way of separating the wheat from the chaff. In order to become proficient at testing you either need a natural talent for writing good code or the persistence to break through to the required level of understanding. Both of these qualities are equally valuable; having both would be fantastic.

The quality of code produced by TDD has never been in doubt in my mind. What I am questioning is the return on investment (ROI) of each test I write. Sometimes I feel I am writing a test for the sake of writing a test, producing very little value in the process. The scenario where this is most apparent is the sort of code where, had you not written it properly, either nothing would happen or it would blow up in your face. Code with truly binary levels of success, often with a single path through it.

This is the elephant in the room of all TDD discussions: when am I testing too much? The standard response of "NEVER!" is a lie, but you have to have practiced TDD for a decent while before you can judge when you're going to be writing tests with little worth. I've identified a subset of tests that I'm probably wasting my time in writing, which are clouding the overall message of my test suite. The problem is that I am responsible for mentoring several developers in how and when to craft tests. I have to lead by example, and until the whole team reaches a higher level of understanding of TDD I have to test everything, despite knowing some of the tests I write have little to no value.

How can I test everything without writing these low-value tests? Currently I'm looking at integration acceptance tests that describe the behaviour of the system. These will verify that the several layers of the application interact as expected in given scenarios to produce the desired behaviour. They will likely have meaty setups and meaty verifications, but they will remove the need for multiple low-ROI tests. They are likely to be more brittle than unit tests, but so long as I verify outcomes rather than interactions they should be fairly robust. What I'd love to experiment with is getting the stakeholders to help me write these acceptance tests, but this will have to wait until I've settled on a style for writing them. There's nothing that harms the adoption of a practice more than the first interaction being a bumbling mess because you're not sure what you're doing!
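As a sketch of the style I have in mind (NUnit-ish, with every domain type below invented for the example):

[Test]
public void Placing_an_order_reduces_the_stock_level()
{
    // meaty setup: real collaborators wired together, no mocks
    var warehouse = new Warehouse();
    warehouse.Receive(new Stock("widget", 10));
    var orders = new OrderService(warehouse);

    orders.Place(new Order("widget", 3));

    // meaty verification: assert on the outcome, not the interactions
    Assert.AreEqual(7, warehouse.StockLevel("widget"));
}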

As a company, we are also looking to hire testers this year. This is another thing stopping me from cutting out the low ROI tests. If I do not test my code, it will not get tested through anything but a manual process. Once we have testers I may be able to just write the tests that drive the behaviour and design of my code, leaving the full suite of tests to be developed by the testers.

So what have I learnt over the past few years? Is test-driven development worth the effort? Most definitely. The quality of my code has come on leaps and bounds, and every major refactor of a well-tested code base is a revelation. I find I write less code to do more, and writing less code is always a good thing.

If I'm honest, it's harder to be sure I spend less time fixing bugs, but I'm confident it's the case. If you're just starting out with TDD, or unit testing in any form, start tracking the hours spent developing versus bug fixing. It would be interesting to see some empirical values on the subject.

Here’s to future years of TDD. It’s going to be interesting to see where my practices go.