Category Archives: Clojure

Running Clojure shell scripts in *nix environments

I was recently trying to create a basic piece of Clojure code to play with “real-time” log file parsing by playing with futures. The longer term goal of the experiment is to be able to tail -f a log file pipe that into my Clojure log parser as input.

As I wasn’t sure exactly what I would need to be doing, I wanted an easy way to run some code quickly without having to rebuild the jars through Leiningen every time I wanted to try something, in a manner similar to the way I am thinking I will be using it if the experiment succeeds.

I created a file test_input with the following lines:

1 hello
2 test
3 abacus
4 qwerty
5 what
6 dvorak

With this in place, my goal was to be able to run something like cat test_file | parser_concept. After a bit of searching I found the lein-exec plugin for Leiningen, and after very minor setup I was able to start iterating with input piped in from elsewhere.

The first step was to open my profiles.clj file in my ~/.lein directory. I made sure lein-exec was specified in my user plugins as so:

{:user {:plugins [[lein-exec "0.2.1"]
                  ;other plugins for lein
                 ]}}

With this in place I just put the following line at the top of my script.clj file:

#!/usr/bin/env lein exec

I then changed the permissions of script.clj file to make it executable, I was able to run the following and have my code run against the input.

cat test_input | ./script.clj

I will be posting a follow up entry outlining my next step of experimenting with “processing” each line read in as a future.

Looking for a new job

The company I have been working for had to just go through a round of layoffs, due to not getting one of their main contracts renewed do to lack of funding by the state, and I have been included in that round.

My primary experience has been in C# on the Microsoft .Net framework for the past 10 years. Of those 10 years, 9 of them were working against the same product, and product suite allowing me to learn how some inconsequential decisions are not so inconsequential long term, and the value of care and commitment to a code base can be.

I am thinking that this would be a good opportunity to open myself up in trying to venture into a new language and toolset such as Ruby or Clojure. This is my open announcement for anybody who would like to poach a .Net developer if you are having problems finding developers fluent in your programming language.

–Proctor

DFW Clojure User Group

This is for those of you not on the Clojure Google Group mailing list.

There is now a Clojure User Group for the Dallas/Fort Worth metroplex. Alex Robbins has setup a meetup group which can be found here http://www.meetup.com/DFW-Clojure/.

The kickoff meeting is set for Wednesday, September 19, 2012.

Thanks to Alex and the sponsors of this user group. Let’s make it a good one.

–Proctor

Clojure function has-factors-in?

Just another quick post this evening to share a new function I created as part of cleaning up my solution to Problem 1 of Project Euler.

Was just responding to a comment on Google+ on my update sharing the post Project Euler in Clojure – Problem 16, and I saw the commenter had his own solution to problem 1. In sharing my solution I realized that I could clean up my results even further, and added a function has-factors-in?. These updates have also been pushed to my Project Euler in Clojure Github repository for those interested.

(defn has-factors-in? [n coll]
  (some #(factor-of? % n) coll))

Where before I had:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(or (factor-of? 3 %) (factor-of? 5 %))) (range n))))

It now becomes:

(defn problem1
  ([] (problem1 1000))
  ([n] (sum (filter #(has-factors-in? % [3 5]) (range n)))))

This change makes my solution read even more like the problem statement given.

Your thoughts?

–Proctor

Project Euler in Clojure – Problem 16

Here is my solution to Problem 16 of Project Euler. As always, my progress you can tracked on GitHub at https://github.com/stevenproctor/project-euler-clojure.

Problem 16 of Project Euler is:

What is the sum of the digits of the number 2^1000

This problem was straight forward since I already had the function digits-of defined from problem8. I was able to be very declarative in my problem, so much so, that it reads as the problem statement you are asked to solve.

(defn problem16
  ([] (problem16 1000))
  ([n] (sum (digits-of (expt 2 n)))))

As always, any feedback you have for me is greatly appreciated.

–Proctor

Project Euler in Clojure – Problem 15

Here is my solution to Problem 15 of Project Euler. As always, my progress you can tracked on GitHub at https://github.com/stevenproctor/project-euler-clojure.

Problem 15 of Project Euler is to find the starting number with the longest Collatz sequence, summarized from the problem page as:

Starting in the top left corner of a 22 grid,
there are 6 routes (without backtracking) to the bottom right corner.

How many routes are there through a 2020 grid?

I started this problem, by trying to tracing and counting the routes through grids of 2×2, 3×3, and 4×4, and even setled in and did a 5×5 square. Having these numbers, and knowing I had two choices for ever position I was in, except for when the path got to the far edge and bottom, I had a hint at the growth rate of the problem. I tried some powers of 2 with the relationship of the numbers, and some factorials with the numbers. After seeing some possible relationships with the factorials that might be leading me in the right direction, I tried a number of permutation calculations, and the combination calculations. Having seen the numbers show up in different combination results, I then spent time back-calculating from those numbers into my ns, and found that the pattern seemed to match 2n Choose n.

The source code to this was the factorial function:

(defn factorial [n]
  (multiply (range 1M (inc (bigdec n)))))

And, I could have done it recursively, but I figured I would just operate against the sequence of numbers, especially now that the reducers are available in the Clojure 1.5-Alpha 3 release (at the time of this writing) of Clojure. After I get through a few more problems (of which I am working ahead of these posts), I am thinking it would be interesting to run the same Project Euler Problems against 1.4 and 1.5 using the reducers library, just substituting map/reduce for the reduce/combine functionality, and seeing how much effort it takes to move them over, as well as the differences in the timings of the different problems.

The other base function I needed was a combination function:

(defn combination [n k]
  (cond (zero? n) 0
        (zero? k) 1
        :else (/ (factorial n) (* (factorial (- n k)) (factorial k)))))

This function just does the basic calculation for combinations, from the formula:

\frac{n!}{\big((n-k)! * k!\big)}

With that, and my stumbling upon the matching of the fact that {2n}\choose{n} is the solution to the number of paths through the square the function problem15 is defined as:

(defn problem15
  ([] (problem15 20))
  ([n] (combination (+ n n) n)))

As always, any feedback you have for me is greatly appreciated.

–Proctor

Aspect Oriented Timing in C# with Castle Windsor

I was making some refurbishments on some reporting code in our application that used EF and was suffering from the Select N+1 problem. If truth, it was much worse, as it was an Select N+1 problem up to 6 levels deep depending on where the report was run from.

I was changing the code to use a denormalized view from the database, and then run a SQL Query using Entity Framework. When doing this I was asked to get the timings of the report, both against the new way, and the existing way.

As this is incidental to what I was really trying to do, I did not want to litter timing code, and logging mechanisms into classes that already existed. This smelled of Aspect Oriented Programming (AOP). While I had not done anything using AOP before, I knew that it was great for cross-cutting concerns like logging, timings, etc. Having been digging into Clojure and LISP recently, this also seemed like cases of the :before, :after and :around methods in Common LISP, or the similar behavior in Eiffel as pointed out in Bertrand Meyer’s Object Oriented Software Construction, not to mention the time function in Clojure which is a function whose single concern is simply the to manage capturing the timing a function passed into it. My hope was to simplify, or un-complect, the code, and keep those concerns separate.

In our project, we have Castle Windsor setup as the IOC container, and Windsor supports a type of Aspect Oriented Programming using something called Interceptors. I found documentation on setting it up on a post by Ayende, and one Andre Loker. The issue was some of the places I wanted to setup the capturing of the timings were in different areas than where the handlers were registered for Windsor.

After some hunting around, I managed to come up with being able to add an interceptor to an already registered component by using the following line of code, where the IReportingService is the class I want to time the method calls around, and the LogTimingInterceptor is the class that captures the timing of the method call and sends it to the logger:

container.Kernel.GetHandler(typeof(IReportingService)).ComponentModel.Interceptors.Add(new InterceptorReference(typeof(LogTimingInterceptor)));

Hope someone else can find this useful,
–Proctor

John Backus on the Assignment Statement

The assignment statement is the von Neumann bottle-neck of programming languages and keeps us thinking in word-at-a-time terms in much the same way the computer’s bottleneck does.

John Backus, 1977
ACM Turing Award Lecture,
Communications of the ACM
August 1978, Volume 2, Number 8

How Clojure is breaking my brain – Loops

Seeing how Chris Houser, aka Chouser, aka @chrishouser, one of the co-authors of the Joy Of Clojure, shared a link to my post on twitter, and with a few tweets:

I figured I should follow up the start of that conversation, and tease apart my current thinking about loops.  The short version is:

I never want to have to write a loop again.

As this is quite the bold statement, some elaboration is in order.

First I realized a number of years ago that every time I would write a loop, I would write a lot of duplicate code. The code wasn’t always identical, but it had the same structure, so much so, that many IDEs now have ‘for loop’ templates. I had realized there were certain structures to the loops, but I never had the “AH-HA!” moment to truly recognize the patterns.

Then a series of events started to come together: I read Structure and Interpretation of Computer Programs for the first time, we moved on to .NET 3.5 at work and were able to take advantage of LINQ, I started looking into Ruby, and finally, I started digging into Clojure.

Now I had heard people talk about some of the looping patterns before, hearing people talk about Smalltalk and Ruby, but it wasn’t until I started getting into Clojure that it fully clicked in that there really are only a handful of different loops: projection (map, select, transformation), collection (reduce, collect, aggregate, accumulate, sum, multiple), filtering (filter, exclude, include, where), grouping (group, frequencies), and maybe a couple of others I am leaving out now, as well as composition of the different looping constructs.

Projection

The projection looping pattern simply takes a collection of elements and applies some projection against them to get a new view of the objects. The Hello World example of this is the squares of a set of numbers.
The imperative version in C#, or any C-style language would be written as:

var squares = new List<int>();
for(int i = 1; i <= 10; 1++)
{
  squares.Add(i*i);
}
return squares;

In Clojure this would be:

(map #(* % %) (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Select(x => x * x);

Collection

The collection looping pattern is about getting a single value out, though there may be cases when multiple values come out. The Hello World example of this tends to be the sum of the numbers in a sequence, or the multiple of the numbers.

The imperative version in C#, or any C-style language would be written as:

var sum = 0;
for(int i = 1; i <= 10; 1++)
{
  sum += i;
}
return sum;

In Clojure this would be:

(reduce + (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Sum();

Filtering

The filter looping pattern is to find a subset of items that match a criteria.

The Hello World example of this tends to be getting only the even numbers in a sequence.

The imperative version in C#, or any C-style language would be written as:

var numbers = new List<int>();
for(int i = 1; i <= 10; 1++)
{
  if (!IsEven(i))
    continue;

  numbers.Add(i);
}
return numbers;

In Clojure this would be:

(filter even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).Where(x => IsEven(x));

Grouping

The grouping looping pattern is to create a set of groups of items, where each group match the same criteria.

The Hello World example of this tends to be grouping numbers of a sequence into those that are even, and those that are not.

The imperative version in C#, or any C-style language would be written as:

var groups = new Dictionary<bool, List<int>>();
for(int i = 1; i <= 10; 1++)
{
  var key = IsEven(i);
  if (!groups.ContainsKey(key))
  {
    groups[key] = new List<int>();
  }
  groups[key].Add(i);
}
return groups;

In Clojure this would be:

(group-by even? (range 1 11))

In C# with LINQ this is:

  Enumerable.Range(1, 10).GroupBy(x => IsEven(x));

In this type of looping, the differences between the imperative and the declarative styles really start to show, even in a simple example such as this.

What about for loops as a counter?

As you can see above, I was just going against a range of numbers, and if needed, would go against range generated with a increment.

OK. So!?

Well…

First, my disclaimer. I am sure I have some of the pattern names wrong, but I have tried to identify them generically to help describe the differences though name only. Second, I am by far not the first one to identify these, I am only trying to document my understanding at the time, and help spread knowledge about these concepts. I would love feedback on the correct pattern names, and the others that I might have left out, or missed, so that I can help keep this as accurate as possible.

Second, I realize that these are all trivial examples, but I hope they illustrate enough to clarify following point I am about to make, if you haven’t already bought into it.

The beauty of these is that the loops become easily composable, as well as the program is now freed from the implementation of these operations, which can be changed separately from the program. The operations could be lazily evaluated and only done on demand; they could be parsed and fed into one giant function that exhibits the same results but tuned to be able to require only one pass through the items; if the language, or program is idempotent, then they could parallelized either by splitting off work into smaller pieces and assembling the results together, or farmed out to multiple worker processes so that we have multiple subprograms working against a problem and then we get a result from those (first one wins, or some kind of voting system). But when I am writing the program those are details I don’t care about.

This allows me to first express what I want the program to do, and then (after) I have expressed my intent and am sure that it is correct, I, or preferably someone much smarter than me, can go back and determine how, and any better ways, to execute my intent.

So in summary, to take Steve McConnell’s quote in Code Complete:
…if you work for 10 years, do you get 10 years of experience or do you get 1 year of experience 10 times?
and ask:
…if you have been writing loops for 10 loops, have you just been writing same loop all 10 years?

–Proctor

How Clojure is breaking my brain – Javascript

As I have been digging into Clojure, and working through the Project Euler problems for having something to program in Clojure, I have discovered that Clojure is really starting to break my brain.

One example of this I encountered recently was in some JavaScript I had to modify and extend.

The specific JavaScript that I had to modify was to extend functionality whose purpose was to try and find an identifier that corresponded to a given range. As it existed though, the data I needed to operate against was defined in two different arrays, one which had all of the numbers for the ranges for the different options, and a second arrays which had the identifiers for the different options.

So for a given set of data (the following ranges are adjoining in this example, but that does not always hold true), which would be outlined as such:

  • When the value is between a min of 0 and max of 4 the identifier is ‘A’,
  • When the value is between a min of 5 and max of 17 the identifier is ‘B’,
  • When the value is between a min of 18 and a max of 33 the identifier is ‘C’, and
  • When the value is between a min of 34 and a max of 51 the identifier is ‘D’.

So given the criteria outlined above, the previous version of the JavaScript had two arrays that were defined as:

var ranges = [0, 4, 5, 17, 18, 33, 34, 51];
var identifiers = ['A', 'B', 'C', 'D'];

And the code that did the checks for the the identifier that corresponded to a given value were defined as follows:

function someFunctionFor2Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1]) 
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  return null;
};
function someFunctionFor3Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  return null;
};
function someFunctionFor4Options(currentValue) {
  if (currentValue >= ranges[0] && currentValue <= ranges[1])
    return identifiers[0];
  if (currentValue >= ranges[2] && currentValue <= ranges[3])
    return identifiers[1];
  if (currentValue >= ranges[4] && currentValue <= ranges[5])
    return identifiers[2];
  if (currentValue >= ranges[6] && currentValue <= ranges[7])
    return identifiers[3];

  return null;
};

If you look at the three functions above, you can see that they are the same, just with different number of if clauses depending on how many criteria were possible.

This is the first way that Clojure has broken my brain. I saw this, and immediately started thinking:

“Shouldn’t this be expressed in a different structure, one that makes the relationship between the id and the range explicit? Something like a map with the keys and values that I could destructure? With that nested in some kind of sequence, where I don’t care about the number of items, but just filter out the ones that match…”

As I am doing this in JavaScript this became an array (sequence) of JSON objects (maps).

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

This now allowed me to iterate over the sequence of JSON objects, and filter out the ones that meet a criteria, and then get the id for the first one. Which leads me to the second way that Clojure has started to break my brain and wriggle into it like the larvae of a Ceti Eel of Ceti Alpha V.

Where the way Clojure was twisting my brain, was that I started this by writing a couple of for loops in JavaScript to get the different results needed. As I was starting to type out the the third loop, I realized I was writing Yet Another Loop, and what I really wanted were higher order functions for JavaScript that could operate on a collection of items. I pulled out the search engine, and started looking for what higher order functions were available in JavaScript, but didn’t find any built in. I debated on writing my own, but decided I should investigate further to see if I could find any libraries as this seemed like it should be a solved problem for as long as JavaScript has been around, and found UnderscoreJS. This gave me the ability to not only use higher order functions but to be able to chain them and compose them in a reader friendly way, resulting in functions that now look like:

var newStructure = [ {id: 'A', min: 0, max: 4 },
                     {id: 'B', min: 5, max: 17 },
                     {id: 'C', min: 18, max: 33 },
                     {id: 'D', min: 34, max: 51 } ];

function isInRange(currentValue, candidate) {
  return (candidate.min <= currentValue && curentValue <= candidate.max);
}

function findIdentifierFor(currentValue) {
  return = _.chain(newStructure)
            .find(function(candidate) {return isInRange(currentValue, candidate);})
            .value()
            .id;
}

So in ending this segment of How Clojure is Breaking My Brain, I am yet again reminded of the quote that is quite popular in the functional programming circles from Alan J. Perlis in his “Epigrams in Programming” article for ACM SIGPLAN.

It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.

–Proctor