Category Archives: Functional Programming

Clojure function has-factors-in?

Just another quick post this evening to share a new function I created as part of cleaning up my solution to Problem 1 of Project Euler.

I was just responding to a comment on Google+ on my update sharing the post Project Euler in Clojure – Problem 16, and I saw that the commenter had his own solution to Problem 1. In sharing my solution, I realized that I could clean it up even further, and added a function has-factors-in?. These updates have also been pushed to my Project Euler in Clojure GitHub repository for those interested.

(defn has-factors-in? [n coll]
  (some #(factor-of? % n) coll))

Where before I had:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(or (factor-of? 3 %) (factor-of? 5 %)) (range n)))))

It now becomes:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(has-factors-in? % [3 5]) (range n)))))

This change makes my solution read even more like the problem statement given.
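
For readers more at home in JavaScript, here is a rough sketch of the same idea. The names isFactorOf, hasFactorsIn and problem1 are my own stand-ins for the Clojure functions (factor-of? and sum are not shown in this post, so their behaviour here is an assumption), and the built-in Array methods play the part of some, filter and the summing step.

```javascript
// Assumed counterpart of factor-of?: true when `factor` divides `n` evenly.
function isFactorOf(factor, n) {
  return n % factor === 0;
}

// Counterpart of has-factors-in?: true when any member of `factors` divides `n`.
function hasFactorsIn(n, factors) {
  return factors.some((factor) => isFactorOf(factor, n));
}

// Sum of all numbers below `limit` that have a factor in [3, 5].
function problem1(limit = 1000) {
  return Array.from({ length: limit }, (_, i) => i) // 0 .. limit-1, like (range n)
    .filter((n) => hasFactorsIn(n, [3, 5]))
    .reduce((sum, n) => sum + n, 0);
}

console.log(problem1(10)); // 0 + 3 + 5 + 6 + 9 = 23
console.log(problem1());   // 233168
```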

Your thoughts?

–Proctor

John Backus on the Assignment Statement

The assignment statement is the von Neumann bottle-neck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer’s bottleneck does.

John Backus, 1977
ACM Turing Award Lecture,
Communications of the ACM
August 1978, Volume 21, Number 8

How Clojure is breaking my brain – Loops

Chris Houser, aka Chouser, aka @chrishouser, one of the co-authors of The Joy of Clojure, shared a link to my post on Twitter and followed up with a few tweets, so I figured I should follow up the start of that conversation and tease apart my current thinking about loops. The short version is:

I never want to have to write a loop again.

As this is quite the bold statement, some elaboration is in order.

First, I realized a number of years ago that every time I wrote a loop, I wrote a lot of duplicate code. The code wasn’t always identical, but it had the same structure, so much so that many IDEs now have ‘for loop’ templates. I had realized there were certain structures to the loops, but I never had the “AH-HA!” moment to truly recognize the patterns.

Then a series of events started to come together: I read Structure and Interpretation of Computer Programs for the first time, we moved on to .NET 3.5 at work and were able to take advantage of LINQ, I started looking into Ruby, and finally, I started digging into Clojure.

Now I had heard people talk about some of the looping patterns before, in the context of Smalltalk and Ruby, but it wasn’t until I started getting into Clojure that it fully clicked that there really are only a handful of different loops: projection (map, select, collect, transformation), collection (reduce, aggregate, accumulate, sum, product), filtering (filter, exclude, include, where), grouping (group, frequencies), and maybe a couple of others I am leaving out now, as well as compositions of the different looping constructs.

Projection

The projection looping pattern simply takes a collection of elements and applies some projection against them to get a new view of the objects. The Hello World example of this is the squares of a set of numbers.
The imperative version in C#, or any C-style language, would be written as:

var squares = new List<int>();
for(int i = 1; i <= 10; i++)
{
  squares.Add(i*i);
}
return squares;

In Clojure this would be:

(map #(* % %) (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Select(x => x * x);

Collection

The collection looping pattern is about getting a single value out, though there may be cases when multiple values come out. The Hello World example of this tends to be the sum of the numbers in a sequence, or the product of the numbers.

The imperative version in C#, or any C-style language, would be written as:

var sum = 0;
for(int i = 1; i <= 10; i++)
{
  sum += i;
}
return sum;

In Clojure this would be:

(reduce + (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Sum();

Filtering

The filter looping pattern is to find a subset of items that match some criteria.

The Hello World example of this tends to be getting only the even numbers in a sequence.

The imperative version in C#, or any C-style language, would be written as:

var numbers = new List<int>();
for(int i = 1; i <= 10; i++)
{
  if (!IsEven(i))
    continue;

  numbers.Add(i);
}
return numbers;

In Clojure this would be:

(filter even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Where(x => IsEven(x));

Grouping

The grouping looping pattern is to create a set of groups of items, where the items in each group match the same criteria.

The Hello World example of this tends to be grouping numbers of a sequence into those that are even, and those that are not.

The imperative version in C#, or any C-style language, would be written as:

var groups = new Dictionary<bool, List<int>>();
for(int i = 1; i <= 10; i++)
{
  var key = IsEven(i);
  if (!groups.ContainsKey(key))
  {
    groups[key] = new List<int>();
  }
  groups[key].Add(i);
}
return groups;

In Clojure this would be:

(group-by even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).GroupBy(x => IsEven(x));

In this type of looping, the differences between the imperative and the declarative styles really start to show, even in a simple example such as this.

What about for loops as a counter?

As you can see above, I was just going against a range of numbers, and if needed, would go against a range generated with an increment.
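
As a small sketch of what "a range generated with an increment" might look like, here is a JavaScript version. The range helper is hypothetical (JavaScript has no built-in one), and its argument order is just my assumption, modelled on Clojure's (range start end step).

```javascript
// Hypothetical helper: numbers from `start` (inclusive) up to `end` (exclusive),
// stepping by `step`. This is the one place an explicit counter loop still lives.
function range(start, end, step = 1) {
  if (step <= 0) {
    throw new RangeError("step must be positive");
  }
  const result = [];
  for (let value = start; value < end; value += step) {
    result.push(value);
  }
  return result;
}

// The "counter" becomes data, and the projection pattern is applied to it.
const oddSquares = range(1, 11, 2).map((x) => x * x);
console.log(oddSquares); // [ 1, 9, 25, 49, 81 ]
```

The counting loop is written once inside the helper, and everything that used to need its own counter now just asks for the range it wants.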

OK. So!?

Well…

First, my disclaimer. I am sure I have some of the pattern names wrong, but I have tried to identify them generically to help describe the differences through name only. Also, I am by far not the first one to identify these; I am only trying to document my understanding at the time, and help spread knowledge about these concepts. I would love feedback on the correct pattern names, and on any others that I might have left out or missed, so that I can help keep this as accurate as possible.

Second, I realize that these are all trivial examples, but I hope they illustrate enough to clarify the following point I am about to make, if you haven’t already bought into it.

The beauty of these is that the loops become easily composable, and the program is freed from the implementation of these operations, which can be changed separately from the program. The operations could be lazily evaluated and only done on demand. They could be parsed and fed into one giant function that produces the same results but is tuned to require only one pass through the items. If the language or program is free of side effects, they could be parallelized, either by splitting the work into smaller pieces and assembling the results together, or by farming it out to multiple worker processes so that we have multiple subprograms working against a problem and then take a result from those (first one wins, or some kind of voting system). But when I am writing the program, those are details I don’t care about.

This allows me to first express what I want the program to do, and then, after I have expressed my intent and am sure that it is correct, I, or preferably someone much smarter than me, can go back and determine how, and in what better ways, to execute my intent.
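
To make the composition point a little more concrete, here is a minimal JavaScript sketch using only the built-in Array methods. The helper names (isEven, square, add) are my own, and the "fused" version is hand-written for illustration; it is only meant to show that the same intent can be executed in one pass, not how any particular runtime would do it.

```javascript
const numbers = Array.from({ length: 10 }, (_, i) => i + 1); // 1 .. 10

const isEven = (x) => x % 2 === 0;
const square = (x) => x * x;
const add = (a, b) => a + b;

// Composed: filter, then project, then collect. Reads like the intent,
// but walks the data three times.
const composed = numbers.filter(isEven).map(square).reduce(add, 0);

// Fused: the same intent collapsed by hand into a single pass.
const fused = numbers.reduce(
  (total, x) => (isEven(x) ? total + square(x) : total),
  0
);

console.log(composed, fused); // 220 220
```

The calling code only cares that both produce the sum of the squares of the even numbers; which strategy runs underneath is exactly the kind of detail that can be decided later.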

So in summary, to take Steve McConnell’s quote in Code Complete:
…if you work for 10 years, do you get 10 years of experience or do you get 1 year of experience 10 times?
and ask:
…if you have been writing loops for 10 years, have you just been writing the same loop all 10 years?

–Proctor

How Clojure is breaking my brain – Javascript

As I have been digging into Clojure, and working through the Project Euler problems to have something to program in Clojure, I have discovered that Clojure is really starting to break my brain.

One example of this I encountered recently was in some JavaScript I had to modify and extend.

The specific JavaScript that I had to modify was functionality whose purpose was to find an identifier that corresponded to a given range. As it existed, though, the data I needed to operate against was defined in two different arrays: one which had all of the numbers for the ranges for the different options, and a second array which had the identifiers for the different options.

So take a given set of data (the following ranges are adjoining in this example, but that does not always hold true), which would be outlined as such:

  • When the value is between a min of 0 and max of 4 the identifier is ‘A’,
  • When the value is between a min of 5 and max of 17 the identifier is ‘B’,
  • When the value is between a min of 18 and a max of 33 the identifier is ‘C’, and
  • When the value is between a min of 34 and a max of 51 the identifier is ‘D’.

So given the criteria outlined above, the previous version of the JavaScript had two arrays that were defined as:

var ranges = [0, 4, 5, 17, 18, 33, 34, 51];
var identifiers = ['A', 'B', 'C', 'D'];

And the code that did the checks for the identifier that corresponded to a given value was defined as follows:

function someFunctionFor2Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1]) 
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  return null;
};
function someFunctionFor3Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  return null;
};
function someFunctionFor4Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  if (currentValue >= ranges[6] && currentValue <= ranges[7])
    return identifiers[3];

  return null;
};

If you look at the three functions above, you can see that they are the same, just with a different number of if clauses depending on how many criteria were possible.

This is the first way that Clojure has broken my brain. I saw this, and immediately started thinking:

“Shouldn’t this be expressed in a different structure, one that makes the relationship between the id and the range explicit? Something like a map with the keys and values that I could destructure? With that nested in some kind of sequence, where I don’t care about the number of items, but just filter out the ones that match…”

As I am doing this in JavaScript this became an array (sequence) of JSON objects (maps).

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

This now allowed me to iterate over the sequence of JSON objects, filter down to the ones that meet the criteria, and then get the id for the first one. Which leads me to the second way that Clojure has started to break my brain and wriggle into it like the larvae of a Ceti Eel of Ceti Alpha V.

The twist was that I started by writing a couple of for loops in JavaScript to get the different results needed. As I was starting to type out the third loop, I realized I was writing Yet Another Loop, and what I really wanted were higher order functions for JavaScript that could operate on a collection of items. I pulled out the search engine and started looking for what higher order functions were available in JavaScript, but didn’t find any built in. I debated writing my own, but decided to investigate further to see if I could find a library, as this seemed like it should be a solved problem for as long as JavaScript has been around, and found Underscore.js. This gave me the ability not only to use higher order functions but also to chain and compose them in a reader-friendly way, resulting in functions that now look like:

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

function isInRange(currentValue, candidate) {
  return (candidate.min <= currentValue && currentValue <= candidate.max);
}

function findIdentifierFor(currentValue) {
  // Assumes some range matches; .id would throw on undefined if none did.
  return _.chain(newStructure)
          .find(function(candidate) {return isInRange(currentValue, candidate);})
          .value()
          .id;
}
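
As an aside, here is a sketch of the same lookup without a library, assuming a JavaScript engine that supports ES2015, where Array.prototype.find is built in. This is not the code from the project; the name options is my own, and returning null when nothing matches is an assumption modelled on what the original three functions did.

```javascript
const options = [
  { id: 'A', min: 0, max: 4 },
  { id: 'B', min: 5, max: 17 },
  { id: 'C', min: 18, max: 33 },
  { id: 'D', min: 34, max: 51 },
];

function isInRange(currentValue, candidate) {
  return candidate.min <= currentValue && currentValue <= candidate.max;
}

function findIdentifierFor(currentValue) {
  // find returns the first matching element, or undefined when none match.
  const match = options.find((candidate) => isInRange(currentValue, candidate));
  return match ? match.id : null;
}

console.log(findIdentifierFor(20)); // C
console.log(findIdentifierFor(99)); // null
```

Adding a fifth option is now a one-line change to the data rather than a new function with another if clause.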

So in ending this segment of How Clojure is Breaking My Brain, I am yet again reminded of the quote that is quite popular in functional programming circles from Alan J. Perlis in his “Epigrams in Programming” article for ACM SIGPLAN.

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

–Proctor