Tag Archives: Clojure

DFW Clojure User Group

This is for those of you not on the Clojure Google Group mailing list.

There is now a Clojure User Group for the Dallas/Fort Worth metroplex. Alex Robbins has setup a meetup group which can be found here http://www.meetup.com/DFW-Clojure/.

The kickoff meeting is set for Wednesday, September 19, 2012.

Thanks to Alex and the sponsors of this user group. Let’s make it a good one.

–Proctor

Clojure function has-factors-in?

Just another quick post this evening to share a new function I created as part of cleaning up my solution to Problem 1 of Project Euler.

Was just responding to a comment on Google+ on my update sharing the post Project Euler in Clojure – Problem 16, and I saw the commenter had his own solution to problem 1. In sharing my solution I realized that I could clean up my results even further, and added a function has-factors-in?. These updates have also been pushed to my Project Euler in Clojure Github repository for those interested.

(defn has-factors-in? [n coll]
  (some #(factor-of? % n) coll))

Where before I had:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(or (factor-of? 3 %) (factor-of? 5 %))) (range n))))

It now becomes:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(has-factors-in? % [3 5]) (range n)))))

This change makes my solution read even more like the problem statement given.

Your thoughts?

–Proctor

Project Euler in Clojure – Problem 16

Here is my solution to Problem 16 of Project Euler. As always, my progress you can tracked on GitHub at https://github.com/stevenproctor/project-euler-clojure.

Problem 16 of Project Euler is:

What is the sum of the digits of the number 2^1000

This problem was straight forward since I already had the function digits-of defined from problem8. I was able to be very declarative in my problem, so much so, that it reads as the problem statement you are asked to solve.

(defn problem16
  ([] (problem16 1000))
  ([n] (sum (digits-of (expt 2 n)))))

As always, any feedback you have for me is greatly appreciated.

–Proctor

Project Euler in Clojure – Problem 15

Here is my solution to Problem 15 of Project Euler. As always, my progress you can tracked on GitHub at https://github.com/stevenproctor/project-euler-clojure.

Problem 15 of Project Euler is to find the starting number with the longest Collatz sequence, summarized from the problem page as:

Starting in the top left corner of a 22 grid,
there are 6 routes (without backtracking) to the bottom right corner.

How many routes are there through a 2020 grid?

I started this problem, by trying to tracing and counting the routes through grids of 2×2, 3×3, and 4×4, and even setled in and did a 5×5 square. Having these numbers, and knowing I had two choices for ever position I was in, except for when the path got to the far edge and bottom, I had a hint at the growth rate of the problem. I tried some powers of 2 with the relationship of the numbers, and some factorials with the numbers. After seeing some possible relationships with the factorials that might be leading me in the right direction, I tried a number of permutation calculations, and the combination calculations. Having seen the numbers show up in different combination results, I then spent time back-calculating from those numbers into my ns, and found that the pattern seemed to match 2n Choose n.

The source code to this was the factorial function:

(defn factorial [n]
  (multiply (range 1M (inc (bigdec n)))))

And, I could have done it recursively, but I figured I would just operate against the sequence of numbers, especially now that the reducers are available in the Clojure 1.5-Alpha 3 release (at the time of this writing) of Clojure. After I get through a few more problems (of which I am working ahead of these posts), I am thinking it would be interesting to run the same Project Euler Problems against 1.4 and 1.5 using the reducers library, just substituting map/reduce for the reduce/combine functionality, and seeing how much effort it takes to move them over, as well as the differences in the timings of the different problems.

The other base function I needed was a combination function:

(defn combination [n k]
  (cond (zero? n) 0
        (zero? k) 1
        :else (/ (factorial n) (* (factorial (- n k)) (factorial k)))))

This function just does the basic calculation for combinations, from the formula:

\frac{n!}{\big((n-k)! * k!\big)}

With that, and my stumbling upon the matching of the fact that {2n}\choose{n} is the solution to the number of paths through the square the function problem15 is defined as:

(defn problem15
  ([] (problem15 20))
  ([n] (combination (+ n n) n)))

As always, any feedback you have for me is greatly appreciated.

–Proctor

Aspect Oriented Timing in C# with Castle Windsor

I was making some refurbishments on some reporting code in our application that used EF and was suffering from the Select N+1 problem. If truth, it was much worse, as it was an Select N+1 problem up to 6 levels deep depending on where the report was run from.

I was changing the code to use a denormalized view from the database, and then run a SQL Query using Entity Framework. When doing this I was asked to get the timings of the report, both against the new way, and the existing way.

As this is incidental to what I was really trying to do, I did not want to litter timing code, and logging mechanisms into classes that already existed. This smelled of Aspect Oriented Programming (AOP). While I had not done anything using AOP before, I knew that it was great for cross-cutting concerns like logging, timings, etc. Having been digging into Clojure and LISP recently, this also seemed like cases of the :before, :after and :around methods in Common LISP, or the similar behavior in Eiffel as pointed out in Bertrand Meyer’s Object Oriented Software Construction, not to mention the time function in Clojure which is a function whose single concern is simply the to manage capturing the timing a function passed into it. My hope was to simplify, or un-complect, the code, and keep those concerns separate.

In our project, we have Castle Windsor setup as the IOC container, and Windsor supports a type of Aspect Oriented Programming using something called Interceptors. I found documentation on setting it up on a post by Ayende, and one Andre Loker. The issue was some of the places I wanted to setup the capturing of the timings were in different areas than where the handlers were registered for Windsor.

After some hunting around, I managed to come up with being able to add an interceptor to an already registered component by using the following line of code, where the IReportingService is the class I want to time the method calls around, and the LogTimingInterceptor is the class that captures the timing of the method call and sends it to the logger:

container.Kernel.GetHandler(typeof(IReportingService)).ComponentModel.Interceptors.Add(new InterceptorReference(typeof(LogTimingInterceptor)));

Hope someone else can find this useful,
–Proctor

John Backus on the Assignment Statement

The assignment statement is the von Neumann bottle-neck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer’s bottleneck does.

John Backus, 1977
ACM Turing Award Lecture,
Communications of the ACM
August 1978, Volume 2, Number 8

How Clojure is breaking my brain – Loops

Seeing how Chris Houser, aka Chouser, aka @chrishouser, one of the co-authors of the Joy Of Clojure, shared a link to my post on twitter, and with a few tweets:

I figured I should follow up the start of that conversation, and tease apart my current thinking about loops.  The short version is:

I never want to have to write a loop again.

As this is quite the bold statement, some elaboration is in order.

First I realized a number of years ago that every time I would write a loop, I would write a lot of duplicate code. The code wasn’t always identical, but it had the same structure, so much so, that many IDEs now have ‘for loop’ templates. I had realized there were certain structures to the loops, but I never had the “AH-HA!” moment to truly recognize the patterns.

Then a series of events started to come together: I read Structure and Interpretation of Computer Programs for the first time, we moved on to .NET 3.5 at work and were able to take advantage of LINQ, I started looking into Ruby, and finally, I started digging into Clojure.

Now I had heard people talk about some of the looping patterns before, hearing people talk about Smalltalk and Ruby, but it wasn’t until I started getting into Clojure that it fully clicked in that there really are only a handful of different loops: projection (map, select, transformation), collection (reduce, collect, aggregate, accumulate, sum, multiple), filtering (filter, exclude, include, where), grouping (group, frequencies), and maybe a couple of others I am leaving out now, as well as composition of the different looping constructs.

Projection

The projection looping pattern simply takes a collection of elements and applies some projection against them to get a new view of the objects. The Hello World example of this is the squares of a set of numbers.
The imperative version in C#, or any C-style language would be written as:

var squares = new List<int>();
for(int i = 1; i <= 10; 1++)
{
  squares.Add(i*i);
}
return squares;

In Clojure this would be:

(map #(* % %) (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Select(x => x * x);

Collection

The collection looping pattern is about getting a single value out, though there may be cases when multiple values come out. The Hello World example of this tends to be the sum of the numbers in a sequence, or the multiple of the numbers.

The imperative version in C#, or any C-style language would be written as:

var sum = 0;
for(int i = 1; i <= 10; 1++)
{
  sum += i;
}
return sum;

In Clojure this would be:

(reduce + (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Sum();

Filtering

The filter looping pattern is to find a subset of items that match a criteria.

The Hello World example of this tends to be getting only the even numbers in a sequence.

The imperative version in C#, or any C-style language would be written as:

var numbers = new List<int>();
for(int i = 1; i <= 10; 1++)
{
  if (!IsEven(i))
    continue;

  numbers.Add(i);
}
return numbers;

In Clojure this would be:

(filter even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Where(x => IsEven(x));

Grouping

The grouping looping pattern is to create a set of groups of items, where each group match the same criteria.

The Hello World example of this tends to be grouping numbers of a sequence into those that are even, and those that are not.

The imperative version in C#, or any C-style language would be written as:

var groups = new Dictionary<bool, List<int>>();
for(int i = 1; i <= 10; 1++)
{
  var key = IsEven(i);
  if (!groups.ContainsKey(key))
  {
    groups[key] = new List<int>();
  }
  groups[key].Add(i);
}
return groups;

In Clojure this would be:

(group-by even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).GroupBy(x => IsEven(x));

In this type of looping, the differences between the imperative and the declarative styles really start to show, even in a simple example such as this.

What about for loops as a counter?

As you can see above, I was just going against a range of numbers, and if needed, would go against range generated with a increment.

OK. So!?

Well…

First, my disclaimer. I am sure I have some of the pattern names wrong, but I have tried to identify them generically to help describe the differences though name only. Second, I am by far not the first one to identify these, I am only trying to document my understanding at the time, and help spread knowledge about these concepts. I would love feedback on the correct pattern names, and the others that I might have left out, or missed, so that I can help keep this as accurate as possible.

Second, I realize that these are all trivial examples, but I hope they illustrate enough to clarify following point I am about to make, if you haven’t already bought into it.

The beauty of these is that the loops become easily composable, as well as the program is now freed from the implementation of these operations, which can be changed separately from the program. The operations could be lazily evaluated and only done on demand; they could be parsed and fed into one giant function that exhibits the same results but tuned to be able to require only one pass through the items; if the language, or program is idempotent, then they could parallelized either by splitting off work into smaller pieces and assembling the results together, or farmed out to multiple worker processes so that we have multiple subprograms working against a problem and then we get a result from those (first one wins, or some kind of voting system). But when I am writing the program those are details I don’t care about.

This allows me to first express what I want the program to do, and then (after) I have expressed my intent and am sure that it is correct, I, or preferably someone much smarter than me, can go back and determine how, and any better ways, to execute my intent.

So in summary, to take Steve McConnell’s quote in Code Complete:
…if you work for 10 years, do you get 10 years of experience or do you get 1 year of experience 10 times?
and ask:
…if you have been writing loops for 10 loops, have you just been writing same loop all 10 years?

–Proctor

How Clojure is breaking my brain – Javascript

As I have been digging into Clojure, and working through the Project Euler problems for having something to program in Clojure, I have discovered that Clojure is really starting to break my brain.

One example of this I encountered recently was in some JavaScript I had to modify and extend.

The specific JavaScript that I had to modify was to extend functionality whose purpose was to try and find an identifier that corresponded to a given range. As it existed though, the data I needed to operate against was defined in two different arrays, one which had all of the numbers for the ranges for the different options, and a second arrays which had the identifiers for the different options.

So for a given set of data (the following ranges are adjoining in this example, but that does not always hold true), which would be outlined as such:

  • When the value is between a min of 0 and max of 4 the identifier is ‘A’,
  • When the value is between a min of 5 and max of 17 the identifier is ‘B’,
  • When the value is between a min of 18 and a max of 33 the identifier is ‘C’, and
  • When the value is between a min of 34 and a max of 51 the identifier is ‘D’.

So given the criteria outlined above, the previous version of the JavaScript had two arrays that were defined as:

var ranges = [0, 4, 5, 17, 18, 33, 34, 51];
var identifiers = ['A', 'B', 'C', 'D'];

And the code that did the checks for the the identifier that corresponded to a given value were defined as follows:

function someFunctionFor2Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1]) 
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  return null;
};
function someFunctionFor3Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  return null;
};
function someFunctionFor4Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  if (currentValue >= ranges[6] && currentValue <= ranges[7])
    return identifiers[3];

  return null;
};

If you look at the three functions above, you can see that they are the same, just with different number of if clauses depending on how many criteria were possible.

This is the first way that Clojure has broken my brain. I saw this, and immediately started thinking:

“Shouldn’t this be expressed in a different structure, one that makes the relationship between the id and the range explicit? Something like a map with the keys and values that I could destructure? With that nested in some kind of sequence, where I don’t care about the number of items, but just filter out the ones that match…”

As I am doing this in JavaScript this became an array (sequence) of JSON objects (maps).

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

This now allowed me to iterate over the sequence of JSON objects, and filter out the ones that meet a criteria, and then get the id for the first one. Which leads me to the second way that Clojure has started to break my brain and wriggle into it like the larvae of a Ceti Eel of Ceti Alpha V.

Where the way Clojure was twisting my brain, was that I started this by writing a couple of for loops in JavaScript to get the different results needed. As I was starting to type out the the third loop, I realized I was writing Yet Another Loop, and what I really wanted were higher order functions for JavaScript that could operate on a collection of items. I pulled out the search engine, and started looking for what higher order functions were available in JavaScript, but didn’t find any built in. I debated on writing my own, but decided I should investigate further to see if I could find any libraries as this seemed like it should be a solved problem for as long as JavaScript has been around, and found UnderscoreJS. This gave me the ability to not only use higher order functions but to be able to chain them and compose them in a reader friendly way, resulting in functions that now look like:

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

function isInRange(currentValue, candidate) {
  return (candidate.min <= currentValue && curentValue <= candidate.max);
}

function findIdentifierFor(currentValue) {
  return = _.chain(newStructure)
            .find(function(candidate) {return isInRange(currentValue, candidate);})
            .value()
            .id;
}

So in ending this segment of How Clojure is Breaking My Brain, I am yet again reminded of the quote that is quite popular in the functional programming circles from Alan J. Perlis in his “Epigrams in Programming” article for ACM SIGPLAN.

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

–Proctor

Setting the title in Terminal on OS X

Working through my Project Euler problems in Clojure, I wound up having a number of Terminal windows open at the same time: one for the REPL, one for a prompt to commit to my Project Euler git repo, one for a Midje program, one to start the Light Table Playground, and various others with other things I had going on.

With all of these windows, getting stacked on top of one another I was losing track of which Terminal window was which when I went to the Window menu for Terminal to get to the window I wanted.

I found this post Mac OS X Change the Terminal Window Title and then followed the instructions to get a nice little shell program setup to set the title of the window.

echo -n -e "33]0;$107"

Where I use the $1 instead the title itself, or a environment variable, per his two examples, to be able to have the first argument to the script, so now I can just run:

title "Light Table Playground"

I just figured I would share my find for anyone out there who hasn’t yet stumbled across it yet.

–Proctor

Project Euler in Clojure – Problem 14

Here is my solution to Problem 14 of Project Euler. As always, my progress you can tracked on GitHub at https://github.com/stevenproctor/project-euler-clojure.

Problem 14 of Project Euler is to find the starting number with the longest Collatz sequence, summarized from the problem page as:

The following iterative sequence is defined for the set of positive integers:
n -> n/2 (n is even)
n -> 3n + 1 (n is odd)
[...]
Which starting number, under one million, produces the longest chain?

The first function I needed was a function to get the next number in the Collatz sequence.

(defn next-collatz [n]
  (cond (<= n 1) nil
   	(even? n) (/ n 2)
        (odd? n) (+ (* 3 n) 1)))

This function mimics the above definition of the sequence, with the distinction of the check to see if the number is less than, or equal to, 1. While not strictly necessary, when I was generating numbers from the REPL, I noticed that the sequence never really terminates, but ends in a repeating sequence of 4 -> 2 -> 1 -> 4 -> 2 -> 1 ->.... This can be seen by running the following at the REPL.

(take 15 (iterate next-collatz 8))

Wich results in the sequence (8 4 2 1 4 2 1 4 2 1 4 2 1 4 2). I used the condition (<= n 1) as a guard condition to as a way to be able to terminate the sequence and make it finite. This could probably be removed at this point, as I did not even use the nil return as a check, except that it would serve as a reminder of the function when playing at the REPL.

The next function I defined was a function to get the count of the numbers in the sequence. I defined it this way in the hopes of trying to memoize the values, but I kept getting an OutOfMemoryException when I tried to use it so I took the memoization away.

(defn collatz-count-for [n]
  {:pre  [(pos? n)]}
  (if (= n 1) 1
      (inc (collatz-count-for (next-collatz n)))))

In this function, I decided I would add a precondition to the function requiring that the parameter to the function be positive. This way I could avoid having to try and generate a Collatz sequence for negative numbers, as they are outside the scope of this problem. The function first checks if the number to generate the sequence for is 1, and if so, the count of the sequence returns 1. For all other positive numbers, we get the length of the sequence for the next Collatz number in the sequence, and increment that value, and return that as the Collatz count.

As I was writing this up, I decided I would take another stab at this, and did some searching to try and find a way to change the memory allocation for the project in the REPL. I found the following post How to Set JVM Memory for Clojure REPL in Emacs, and added the following to my project.clj file.

:jvm-opts ["-Xmx768M"]

When I ran with the following memoized version of the collatz-count-for, it now runs much quicker than it did before. Previously it was a rough mental counting of about 10-15 seconds, and now the first time is runs was 2-3 seconds, and sub second on runs after that even.

(def collatz-count-for (memoize 
  (fn [n]
    {:pre [(pos? n)]}
    (if (= n 1) 1
      (inc (collatz-count-for (next-collatz n)))))))

The changes needed to memoize the function collatz-count-for is that the defn is now a def, which assigns the identifier of collatz-count-for to the result of passing a now nameless function, with the previous function body, to the function memoize. This allows the function body to remain the same, and only the declaration of the function has to now change.

With those pieces to the problem in place, we can now find the number that generates the longest Collatz sequence.

(defn problem14
  ([] (problem14 1000000))
  ([n] (first (reduce
                (fn [memo x]
                  (if (> (second x) (second memo)) x memo))
                (map #(vector % (collatz-count-for %)) (range 1 (inc n)))))))

I started by mapping over the numbers from 1 to 1000000, to create vectors representing the pair of the number and the sequence length for that number. An example of which results in the following sequence (shown to the first 10 numbers):

([1 1] [2 2] [3 8] [4 3] [5 6] [6 9] [7 17] [8 4] [9 20] [10 7] ...)

I then feed that sequence into reduce, with a function to find the pair with the largest sequence length by calling second on the vector, resulting in the pair of number and sequence length with the largest sequence length. And having found that vector, I use first to find what the number with the largest sequence length is.

This seems like it might be generic enough (idiomatic pattern) that I was missing something that was already defined to do the map and reduce on a structure like this, but I could not find anything. If anybody knows of anything I would love to see how that might be made more expressive.

–Proctor