Seeing how Chris Houser, aka Chouser, aka @chrishouser, one of the co-authors of the Joy Of Clojure, shared a link to my post on twitter, and with a few tweets:
I figured I should follow up the start of that conversation, and tease apart my current thinking about loops. The short version is:
I never want to have to write a loop again.
As this is quite the bold statement, some elaboration is in order.
First I realized a number of years ago that every time I would write a loop, I would write a lot of duplicate code. The code wasn’t always identical, but it had the same structure, so much so, that many IDEs now have ‘for loop’ templates. I had realized there were certain structures to the loops, but I never had the “AH-HA!” moment to truly recognize the patterns.
Then a series of events started to come together: I read Structure and Interpretation of Computer Programs for the first time, we moved on to .NET 3.5 at work and were able to take advantage of LINQ, I started looking into Ruby, and finally, I started digging into Clojure.
Now I had heard people talk about some of the looping patterns before, hearing people talk about Smalltalk and Ruby, but it wasn’t until I started getting into Clojure that it fully clicked in that there really are only a handful of different loops: projection (map, select, transformation), collection (reduce, collect, aggregate, accumulate, sum, multiple), filtering (filter, exclude, include, where), grouping (group, frequencies), and maybe a couple of others I am leaving out now, as well as composition of the different looping constructs.
Projection
The projection looping pattern simply takes a collection of elements and applies some projection against them to get a new view of the objects. The Hello World example of this is the squares of a set of numbers.
The imperative version in C#, or any C-style language would be written as:
var squares = new List<int>();
for(int i = 1; i <= 10; 1++)
{
squares.Add(i*i);
}
return squares;
In Clojure this would be:
(map #(* % %) (range 1 11))
In C# with LINQ this is:
Enumerable.Range(1, 10).Select(x => x * x);
Collection
The collection looping pattern is about getting a single value out, though there may be cases when multiple values come out. The Hello World example of this tends to be the sum of the numbers in a sequence, or the multiple of the numbers.
The imperative version in C#, or any C-style language would be written as:
var sum = 0;
for(int i = 1; i <= 10; 1++)
{
sum += i;
}
return sum;
In Clojure this would be:
(reduce + (range 1 11))
In C# with LINQ this is:
Enumerable.Range(1, 10).Sum();
Filtering
The filter looping pattern is to find a subset of items that match a criteria.
The Hello World example of this tends to be getting only the even numbers in a sequence.
The imperative version in C#, or any C-style language would be written as:
var numbers = new List<int>();
for(int i = 1; i <= 10; 1++)
{
if (!IsEven(i))
continue;
numbers.Add(i);
}
return numbers;
In Clojure this would be:
(filter even? (range 1 11))
In C# with LINQ this is:
Enumerable.Range(1, 10).Where(x => IsEven(x));
Grouping
The grouping looping pattern is to create a set of groups of items, where each group match the same criteria.
The Hello World example of this tends to be grouping numbers of a sequence into those that are even, and those that are not.
The imperative version in C#, or any C-style language would be written as:
var groups = new Dictionary<bool, List<int>>();
for(int i = 1; i <= 10; 1++)
{
var key = IsEven(i);
if (!groups.ContainsKey(key))
{
groups[key] = new List<int>();
}
groups[key].Add(i);
}
return groups;
In Clojure this would be:
(group-by even? (range 1 11))
In C# with LINQ this is:
Enumerable.Range(1, 10).GroupBy(x => IsEven(x));
In this type of looping, the differences between the imperative and the declarative styles really start to show, even in a simple example such as this.
What about for loops as a counter?
As you can see above, I was just going against a range of numbers, and if needed, would go against range generated with a increment.
OK. So!?
Well…
First, my disclaimer. I am sure I have some of the pattern names wrong, but I have tried to identify them generically to help describe the differences though name only. Second, I am by far not the first one to identify these, I am only trying to document my understanding at the time, and help spread knowledge about these concepts. I would love feedback on the correct pattern names, and the others that I might have left out, or missed, so that I can help keep this as accurate as possible.
Second, I realize that these are all trivial examples, but I hope they illustrate enough to clarify following point I am about to make, if you haven’t already bought into it.
The beauty of these is that the loops become easily composable, as well as the program is now freed from the implementation of these operations, which can be changed separately from the program. The operations could be lazily evaluated and only done on demand; they could be parsed and fed into one giant function that exhibits the same results but tuned to be able to require only one pass through the items; if the language, or program is idempotent, then they could parallelized either by splitting off work into smaller pieces and assembling the results together, or farmed out to multiple worker processes so that we have multiple subprograms working against a problem and then we get a result from those (first one wins, or some kind of voting system). But when I am writing the program those are details I don’t care about.
This allows me to first express what I want the program to do, and then (after) I have expressed my intent and am sure that it is correct, I, or preferably someone much smarter than me, can go back and determine how, and any better ways, to execute my intent.
So in summary, to take Steve McConnell’s quote in Code Complete:
…if you work for 10 years, do you get 10 years of experience or do you get 1 year of experience 10 times?
and ask:
…if you have been writing loops for 10 loops, have you just been writing same loop all 10 years?
–Proctor