Category Archives: Clojure

Papillon – a new interceptor library for Clojure(Script)

Announcing Papillon, 0.0.1-PREVIEW.

This is an early alpha release to get feedback and discussion going in the community. We are using it in production, but we went with a PREVIEW version to signal that these are still early days for the library. Although we are going to strive for the stability typical of libraries in the Clojure ecosystem, we recognize that as people other than us use it, it may require some updates.

Why a new interceptor library?

At work (Guaranteed Rate, who allowed us to open source the project), we run ClojureScript on AWS Lambdas (Node runtime) for many of the micro-services our team owns. Some of the Lambdas are triggered by AWS API Gateway and put items on either an SNS Topic or an SQS Queue, while other Lambdas consume from SQS Queues.

A decent portion of the codebase was based on JavaScript Promises, with Promise chaining over a “state” map (until finally-style error handling for resource clean-up was needed), so we thought Interceptors were not a far reach for the team to adapt to.

We looked at Pedestal and Sieppari, but various aspects didn’t look to fit in with our desire to use the Interceptor pattern in AWS Lambdas, so we took a shot at rolling our own.

Goals

Clojure Common

As mentioned above, we run ClojureScript on Node runtime for our AWS Lambdas, so we needed a solution that covers both Clojure and ClojureScript.

While we are currently only targeting Clojure and ClojureScript support, as that is what we use in our deployments at work, our goal is to stick to Clojure core as much as possible to help keep Papillon available across as many of the different Clojure runtimes as possible (e.g. Clojure.NET, ClojurErl, ClojureDart, and Babashka; more would also be welcomed).

Interceptor focused

Pedestal interceptors are fantastic, but they are part of Pedestal, and while we could have used Pedestal’s interceptor namespace only, it does have dependencies on logging in Pedestal, and the interceptors in Pedestal are not quite as isolated from the rest of Pedestal as we would have liked.

Decouple the interceptors from HTTP requests.

Sieppari was more focused on Interceptors only, but was based on the idea of a Request/Response model.

With our goal to have interceptors be the prevalent pattern in our AWS Lambdas, we needed something that would fit both the HTTP style of synchronous AWS Lambdas and the asynchronous AWS Lambdas that consume their items from SQS queues. Contorting an SQS message into an HTTP Request/Response was something we wanted to avoid.

Minimal

We focused on what seemed to be the core of interceptors which is having an execution chain of interceptors to run, and running a context map through that chain and back out.
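To make "through that chain and back out" concrete, here is a minimal sketch of the idea. It is not Papillon's API: the function name execute and the :enter / :leave keys are assumptions for illustration, and the sketch ignores asynchronous results and error handling entirely.

```clojure
;; Illustrative only: `execute`, :enter and :leave are assumed names,
;; not taken from Papillon.
(defn execute
  "Runs ctx through every :enter function in order, then through every
  :leave function in reverse order. Interceptors missing a stage are skipped."
  [ctx chain]
  (let [run (fn [ctx stage ixs]
              (reduce (fn [c ix] ((get ix stage identity) c)) ctx ixs))]
    (-> ctx
        (run :enter chain)
        (run :leave (reverse chain)))))

(def chain
  [{:name  :a
    :enter #(update % :trace conj :a-in)
    :leave #(update % :trace conj :a-out)}
   {:name  :b
    :enter #(update % :trace conj :b-in)}])

(execute {:trace []} chain)
;; => {:trace [:a-in :b-in :a-out]}
```

The context map goes in, each interceptor gets a turn on the way in and on the way out, and the context map comes back.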

We have tried to leave out most everything else, as we found the interceptor chains can easily be modified to include many orthogonal concerns, allowing us to decomplect the interceptor execution from other useful but separate concerns.

One example of something we have left out is logging. While it is useful to log the path through the interceptor chain, and modifications to the execution chain, we didn’t want to pick a logging library that consuming applications would inherit our decision on.

Data First

We also found that given the concept of the interceptor chain being just data, we could get logging (and various other concerns, e.g. benchmarking, tracing, etc.), included by manipulating the interceptor chain using interleave with a repeated sequence of a logging specific interceptor.
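As a rough sketch of that interleave trick, assuming interceptors are plain maps with a :name and an optional :enter function (these keys and the interceptor names are made up for illustration, not taken from Papillon):

```clojure
;; Hypothetical logging interceptor: prints the keys on the context
;; and returns the context unchanged.
(def log-ix
  {:name  :log
   :enter (fn [ctx]
            (println "context keys:" (keys ctx))
            ctx)})

;; Hypothetical domain chain; only the names matter for this example.
(def domain-chain
  [{:name :parse} {:name :validate} {:name :store}])

;; Put the logging interceptor in front of every domain interceptor.
;; interleave stops when the shorter (finite) sequence runs out.
(def logged-chain
  (vec (interleave (repeat log-ix) domain-chain)))

(mapv :name logged-chain)
;; => [:log :parse :log :validate :log :store]
```

The domain chain itself never mentions logging; the concern is added by transforming the data that describes the chain.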

This ability to treat both the context as data and the control flow as data, allowed us to keep the core flow of domain logic as interceptors, distinct from logging and other developer related concerns, allowing us to highlight the core context.

Given that the control flow is data, and available on the context, it allowed us to play with ideas like setting up support for a Common Lisp style Condition System as seen in the examples folder.

Clojure Core Libraries Based

We stuck with core.async and the ReadPort as our asynchronous mechanism. As a library, we didn’t want to commit you to another asynchronous library just because you use Papillon. Clojure (JVM) already has various asynchronous constructs that work with ReadPort, and we piggy-backed on ReadPort in ClojureScript land to allow you to use JavaScript Promises as a ReadPort.

The goal was to give you something that would work out of the box with the various tools for asynchronous programming that Clojure and ClojureScript give you, without making you implement yet another protocol to adapt to.

Get more discussion on Interceptors starting again

We don’t expect that this will become the next big hit and that everyone will start using it in their code, but we do hope that by publishing and promoting “Yet Another Interceptor Library” we can get more discussion on interceptors started again.

Side Note:  I almost did name it yail (Yet Another Interceptor Library), but it didn't feel like it hit 'yet another' level so happy to keep that one available for someone else to use in the hope that the idea of interceptors gets to that point 🤞.

Those of us in our group who were pushing this project forward think interceptors are a valuable “well kept secret” of the Clojure ecosystem, and would love to see more usages of them in the community.

We would also love to see some more abuses of interceptors, because that helps find the edges of what can (or cannot) and should (or should not) be done with them.

Thank You to Guaranteed Rate

We are thankful that Guaranteed Rate has allowed us to take this project to make our work lives better and open source it as Papillon to share with the Clojure community.

Ibid.

During my interview with Gene Kim for Functional Geekery, Episode 128, Gene talked about a problem he had been asking different people to solve with a functional approach, to see how he could make his Clojure solution more idiomatic.

His problem was “rewriting” Ibid. entries in citation references, to get the authors’ names instead of the Ibid. value, as Ibid. is a shorthand that stands for “the authors listed in the entry before this”.

As he was describing this problem, I was picturing the general pseudo-code with a pattern match in my head. To be fair, this has come from a number of years of getting used to thinking in a functional style as well as thinking in a pattern matching style.

The following Erlang code is a close representation of the pseudo-code that was in my head.

-module(ibid).

-export([ibid/1]).

ibid(Authors) ->
    ibid(Authors, []).

ibid([], UpdatedAuthors) ->
    {ok, lists:reverse(UpdatedAuthors)};
ibid(["Ibid." | _], []) ->
    {error, "No Previous Author for 'Ibid.' citation"};
ibid(["Ibid." | T], UpdatedAuthors=[H | _]) ->
    ibid(T, [H | UpdatedAuthors]);
ibid([H | T], UpdatedAuthors) ->
    ibid(T, [H | UpdatedAuthors]).

Running this in the Erlang shell using erl results in the following:

> ibid:ibid(["Mike Nygard", "Gene Kim", "Ibid.", "Ibid.", "Nicole Forsgren", "Ibid.", "Jez Humble", "Gene Kim", "Ibid."]).
{ok,["Mike Nygard","Gene Kim","Gene Kim","Gene Kim",
     "Nicole Forsgren","Nicole Forsgren","Jez Humble","Gene Kim",
     "Gene Kim"]}
> ibid:ibid(["Ibid."]).
{error,"No Previous Author for 'Ibid.' citation"}

Throughout the editing of the podcast, I continued to think about his problem, and how I would approach it in Clojure without built-in pattern matching, and came up with the following using a cond instead of a pure pattern matching solution:

(defn
  update_ibids
  ([authors] (update_ibids authors []))
  ([[citation_author & rest_authors :as original_authors] [last_author & _ :as new_authors]]
    (let [ibid? (fn [author] (= "Ibid." author))]
      (cond
        (empty? original_authors) (reverse new_authors)
        (and (ibid? citation_author) (not last_author))
          (throw (Exception. "Found `Ibid.` with no previous author"))
        :else (recur
          rest_authors
          (cons
            (if (ibid? citation_author)
                last_author
                citation_author)
            new_authors))))))

And if we run this in the Clojure REPL we get the following:

user=> (def references ["Gene Kim", "Jez Humble", "Ibid.", "Gene Kim", "Ibid.", "Ibid.", "Nicole Forsgren", "Micheal Nygard", "Ibid."])

user=> (update_ibids [])
()
user=> (update_ibids ["Ibid."])
Execution error at user/update-ibids (REPL:8).
Found `Ibid.` with no previous author
user=> (update_ibids references)
("Gene Kim" "Jez Humble" "Jez Humble" "Gene Kim" "Gene Kim" "Gene Kim" "Nicole Forsgren" "Micheal Nygard" "Micheal Nygard")

That solution didn’t sit well with me (and if there is a more idiomatic way to write it I would love to see some of your solutions as well), and because of that, I wanted to see what could be done using the core.match library, which moves towards the pseudo-code I was picturing.

(ns ibid
  (:require [clojure.core.match :refer [match]]))


(defn
  update_ibids
  ([authors] (update_ibids authors []))
  ([orig updated]
    (match [orig updated]
      [[] new_authors] (reverse new_authors)
      [["Ibid." & _] []] (throw (Exception. "Found `Ibid.` with no previous author"))
      [["Ibid." & r] ([last_author & _] :seq) :as new_authors] (recur r (cons last_author new_authors))
      [[author & r] new_authors] (recur r (cons author new_authors)) )))

And if you are trying this yourself, don’t forget to add to your deps.edn file:

{:deps
  {org.clojure/core.match {:mvn/version "0.3.0"}}}

After the first couple of itches were scratched, Gene shared on Twitter Stephen Mcgill’s solution and his solution inspired by Stephen’s.

https://twitter.com/RealGeneKim/status/1201922587346866176

(Edit 2022-05-02 : I took out the Twitter embed and changed the embed to be an HTML link to Twitter if you are interested in seeing the post as it was pointed out that tracking cookies were being dropped by Twitter, in an effort to reduce cookies being dropped by this site.)

And then, just for fun (or “just for defun” if you prefer the pun intended version), I did a version in LFE (Lisp Flavored Erlang) due to it being a Lisp with built in pattern matching from being on the Erlang runtime.

(defmodule ibid
  (export (ibid 1)))


(defun ibid [authors]
  (ibid authors '[]))


(defun ibid
  ([[] updated]
    (tuple 'ok (: lists reverse updated)))
  (((cons "Ibid." _) '[])
    (tuple 'error "No Previous Author for 'Ibid.' citation"))
  ([(cons "Ibid." authors) (= (cons h _) updated)]
    (ibid authors (cons h updated)))
  ([(cons h rest) updated]
    (ibid rest (cons h updated))))

Which if we call it in LFE’s REPL gives us the following:

lfe> (: ibid ibid '["Mike Nygard" "Gene Kim" "Ibid." "Ibid." "Nicole Forsgren" "Ibid." "Jez Humble" "Gene Kim" "Ibid."])
#(ok
  ("Mike Nygard"
   "Gene Kim"
   "Gene Kim"
   "Gene Kim"
   "Nicole Forsgren"
   "Nicole Forsgren"
   "Jez Humble"
   "Gene Kim"
   "Gene Kim"))
lfe> (: ibid ibid '["Ibid."])
#(error "No Previous Author for 'Ibid.' citation")

If you have different solutions shoot them my way as I would love to see them, and if there looks to be interest, and some responses, I can create a catalog of different solutions similar to what Eric Normand does on his weekly challenges with his PurelyFunctional.tv Newsletter.
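To get the ball rolling, here is one more variation as a sketch: it carries the previous author forward with reductions instead of explicit recursion. The name update-ibids-reductions is made up for this example. An Ibid. can only lack a previous author when it is the very first entry, so that is the only case checked.

```clojure
(defn update-ibids-reductions
  "Replaces each \"Ibid.\" with the most recent non-Ibid. author.
  Throws if the first entry is \"Ibid.\", as it has nothing to refer to."
  [authors]
  (when (= "Ibid." (first authors))
    (throw (ex-info "Found `Ibid.` with no previous author"
                    {:authors authors})))
  ;; reductions yields the seed (nil) followed by each running result;
  ;; rest drops the seed.
  (rest (reductions (fn [previous author]
                      (if (= "Ibid." author) previous author))
                    nil
                    authors)))

(update-ibids-reductions ["Gene Kim" "Ibid." "Jez Humble" "Ibid." "Ibid."])
;; => ("Gene Kim" "Gene Kim" "Jez Humble" "Jez Humble" "Jez Humble")
```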

Guest on Three Devs and a Maybe

I was recently asked to be on the podcast Three Devs and a Maybe, and am happy to announce that episode is now live.

I talked about my background of getting into computers, my route into functional programming, Clojure, Erlang, and a little bit of background on how I decided to start Functional Geekery.

Make sure to take a listen to Episode 71 – Erlang and Clojure with Steven Proctor and let me know what you think.

Thanks to Edd Mann and Michael Budd for the invite, and a great conversation.

–Proctor

Ten Episodes of Functional Geekery Live!

I just released the tenth episode of my podcast Functional Geekery.

I had been thinking about doing a podcast for a while, in return for all the information I get out of listening to other podcasts as part of my “Automobile University”, but could never come up with the niche. It occurred to me about 3am when I was taking a shift to get our little one, Ada (yes, after that Ada), who was about 5 months old at the time, back to sleep; the best ideas come when you are least prepared to think about them.

I told my wife about my “crazy idea”, and explained to her what my goal was, and got her support for this experiment I was wanting to do, and told her I could do this in a very lean manner. I had a headset with microphone already, and told her it would just be a domain, hosting, and recording setup. My goal was to see if I couldn’t start a podcast for only about $100 investment with all things totaled. I would shoot to see if I could get at least 10 episodes done, to amortize the cost to be about $10 an episode.

I figured if nobody listened, but I could have 10 interesting conversations, the learning and exposure to ideas from those conversations would easily outweigh that initial investment, and the podcast would give me a good way to reach out to people I would love to talk to but probably wouldn’t have had the opportunity to talk to anytime soon.

I want to give a sincere heartfelt Thank You to all my guests so far, and everybody else I have reached out to so far to get initial conversations going about being a guest. Everyone has been much more receptive and open than I could have ever imagined. Everybody has been kind, and the worst I have gotten was some deferrals due to busy schedules, which I can appreciate. This has been even more of an honor, as most of the people I have reached out to had likely never heard of me when I sent my emails to them asking if they would do me the honor of being a guest on the podcast I was starting. Thank you all for your support, and kindness, and if you ever have more things you would like to talk about, all of you always have an open invitation back.

I also want to thank everybody who has listened, and shared the podcast with others. I have gotten a much better reception and response than I realistically imagined. Thank you for your shares, (re)tweets, comments, and suggestions. If you have anything else you want to share I would love to hear from you. If you need to find the best way to contact me, just head over to Functional Geekery’s About page.

Don’t worry, I am not planning on going anywhere at this point. I have another recording “in the can”, and am working to line up some more great guests. I also have a large list of people I would love to talk to at some point, and would hate to end before I got to use the podcast as a reason to be able to have an interesting conversation with them as well. 😉

As always, a giant Thank You goes out to David Belcher for the logo, who took my rough idea of a logo, and transformed it into something brilliant.

And an even bigger THANK YOU goes out to my wife, who has let me pursue this “crazy idea”.

Your host,
–Proctor

Functional Geekery

In a previous post I mentioned I had a new project in the works. I was a guest on the Ruby Rogues podcast and made an announcement there, but for those who didn’t catch that episode, I am now announcing Functional Geekery, a podcast about functional programming.

After some issues with getting the hosting setup properly, and working with the hosting provider’s support for a couple of issues, the first episode is ready to go live! I will be working on getting it in the iTunes store, and some of the other podcasting services, but in the meantime, you can find it online.

I am hoping to have a wide range of guests and topics, from Clojure, to Erlang, to JavaScript, to F#, as well as Scala, Haskell, and functional programming in languages like C# and Ruby. If you have any suggestions on shows, topics, or guests, check out the About Page on the site to submit ideas.

–Proctor

New Project in the Works….

I have a new project in the works.

If you are interested in functional programming I am hoping that it will be of interest to you. One of my goals of the yet to be announced project is that it will provide something for all levels of experience.

–Proctor

Software Development Podcasts – 2013 Edition

I was recently chatting with some coworkers about podcasts I listen to, so I thought I should document that list for easy sharing and to find some gems I am missing.

I have taken advantage of my commute time and turned my commute into Automobile University as talked about by Zig Ziglar. I heard this idea via some fitness blogs I was reading where the trainers were talking about ways to continuously improve, and decided I would apply that idea to my commute, walks, or even running errands.

The other thing I have started taking advantage of is the ability of podcast players to play at double speed. Most podcasts out there do well at one-and-a-half or double speed, and I have heard that some players even support three-times speed. This allows you to dramatically increase your consumption rate if you can follow along at those speeds. You may not understand everything that is said, but you can always go back and re-listen to sections if needed, let it broaden your known unknowns, and at the least it should help to remove some of your unknown unknowns.

I did a listing of Software Development Podcasts previously, and am going to try to make this a yearly or bi-yearly update, based on how frequently the list of podcasts in my rotation changes.

.NET Podcasts

.NET Rocks
The Tablet Show – The same two hosts of .NET Rocks, but with a focus on tablets and tablet development
Run As Radio – One of the hosts of .NET Rocks and The Tablet Show, focused on server administration, mostly on the Microsoft platform.
Hanselminutes
Herding Code
Deep Fried Bytes

Ruby Podcasts

Ruby Rogues – Panel discussion on various Ruby related topics and projects.

Clojure Podcasts

The Cognicast – Formerly Think Relevance podcast
Mostly λazy – Infrequent updates, but enjoyed the episodes that have been released

JavaScript Podcasts

JavaScript Jabber – Panel discussion on JavaScript topics, started by the host who started Ruby Rogues. The first episodes were hard to listen to due to some negativity, but I have picked it up again in the 50s episode numbers, and am working my way back as I get a chance.

Erlang Podcasts

Mostly Erlang – Panel discussion mostly about Erlang, but touches on related topics and other functional programming languages and how they relate to Erlang.

General

The Changelog – Podcast about Open Source projects from The Changelog
The Wide Teams Podcast – Hosted by one of the panelists of Ruby Rogues, with a focus on distributed software development, with the goal to find out the good and the bad experiences and help share information on how distributed teams work.
Software Engineering Radio – Recently I have only been finding a few shows on topics that seem interesting, but have a large backlog of shows with interesting topics.
GitMinutes – Podcast covering Git source control management.

Newcomers

These are podcasts that I have only listened to a couple of episodes of, either because they have only released a couple, or have just started trying them.

Think Distributed – They only have a few episodes out so far, but some good info from the ones they have done.
Giant Robots Smashing into other Giant Robots – Podcast from Thoughtbot on technical topics. Just started listening to it recently, and plan on digging into the backlog of shows.

On my list to check out

Food Fight – Podcast on DevOps
The Freelancers Show – Started by the same host as JavaScript Jabber and Ruby Rogues, about freelancing. I would think the information would be relevant even to full time employees working to build one’s career.

If you have any other podcasts that are good please feel free to add your list of podcasts that I have left out to the comments.

**Updated 2013-10-24 7:54 CDT to include link to previous list of Software Development Podcasts
**Updated 2013-10-24 22:13 CDT to include The Changelog, a “podcast covering what’s new and interesting in open source”
**Updated 2013-10-24 22:28 CDT to include GitMinutes

Lumberjack – lumberjack.nginx (version 0.1.0)

As I posted last time, lumberjack is my start of a log line analyzer/visualizer project in Clojure. This write up will cover the version 0.1.0 lumberjack.nginx namespace.

As this is version 0.1.0, and to get it out the door, I am only parsing Nginx log lines in the following format, as all the log lines I have needed to parse match it.

173.252.110.27 - - [18/Mar/2013:15:20:10 -0500] "PUT /logon" 404 1178 "http://shop.github.com/products/octopint-set-of-2" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"

The function nginx-logs takes a sequence of Nginx log filenames and returns a sequence of hashes, one per log line, by calling process-logfile on each file.

(defn nginx-logs [filenames]
  (mapcat process-logfile filenames))

The function process-logfile takes a single filename and gets the lines from the file using slurp, and then maps over each of the lines using the function parse-line.

(defn- logfile-lines [filename]
  (string/split-lines (slurp filename)))

(defn process-logfile [filename]
    (map parse-line (logfile-lines filename)))

At this point, this is sufficient for what I need, but I have created an issue on the Github project to address large log files, and the ability to lazily read in the lines so the whole log file does not have to reside in memory.
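One possible direction for that issue, as a sketch rather than what lumberjack does: line-seq over a reader yields lines lazily, and reducing inside with-open keeps only one line in play at a time. The name reduce-log-lines is hypothetical.

```clojure
(require '[clojure.java.io :as io])

(defn reduce-log-lines
  "Folds f over the lines of the file at filename, starting from init.
  Lines are read lazily, and the reduce completes inside with-open,
  before the reader is closed."
  [f init filename]
  (with-open [rdr (io/reader filename)]
    (reduce f init (line-seq rdr))))

;; Example: count the lines in a small temporary file.
(def tmp-log (java.io.File/createTempFile "access" ".log"))
(spit tmp-log "line one\nline two\nline three\n")

(reduce-log-lines (fn [n _line] (inc n)) 0 tmp-log)
;; => 3
```

The catch with a lazy sequence over a reader is that it must be consumed before the reader closes, which is why this version takes the reducing function as an argument instead of returning the lines.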

The function parse-line holds a regex and matches each line against the pattern. The vector parts lists the keywords that name each group of the regex, in order. parse-line reduces over the indexed parts, starting from an empty hash, and associates each keyword with the element of match (the result of re-find) at the same index.

(def parts [:original
            :ip
            :timestamp
            :request-method
            :request-uri
            :status-code
            :response-size
            :referrer])

(defn parse-line [line]
  (let [parsed-line {}
        pattern #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})? - - \[(.*)\] \"(\w+) ([^\"]*)\" (\d{3}) (\d+) \"([^\"]*)\".*"
        match (re-find pattern line)]
    (reduce (fn [memo [idx part]]
                (assoc memo part (nth match idx)))
            parsed-line (map-indexed vector parts))))

Looking at this again a few days later, I went and created an issue to pull the definition of pattern out of the let, and even out of the parse-line function, into its own definition. I also want to go back and remove parsed-line from the let, as it does not need to be declared there; the empty hash can be passed straight to the reduce. It was set up that way before I refactored to a reduce, when I was just associating keys one at a time to the index of match as I was adding parts of the log entry.
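A sketch of where that clean-up could end up, with the pattern pulled out into its own definition (log-line-pattern is a name picked for this example) and zipmap standing in for the reduce. One behavioral difference to be aware of: when a line does not match, this version returns an empty hash instead of a hash whose values are all nil.

```clojure
(def parts [:original
            :ip
            :timestamp
            :request-method
            :request-uri
            :status-code
            :response-size
            :referrer])

;; Group 0 is the whole match, which lines up with :original in parts.
(def log-line-pattern
  #"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})? - - \[(.*)\] \"(\w+) ([^\"]*)\" (\d{3}) (\d+) \"([^\"]*)\".*")

(defn parse-line [line]
  ;; re-find returns [whole-match group-1 group-2 ...] or nil;
  ;; zipmap pairs each keyword with the element at the same position.
  (zipmap parts (re-find log-line-pattern line)))

(def sample-line
  "173.252.110.27 - - [18/Mar/2013:15:20:10 -0500] \"PUT /logon\" 404 1178 \"http://shop.github.com/products/octopint-set-of-2\" \"Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)\"")

(select-keys (parse-line sample-line) [:ip :request-method :status-code])
;; => {:ip "173.252.110.27", :request-method "PUT", :status-code "404"}

(parse-line "not a log line")
;; => {}
```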

Any comments on this are welcome, and I will be posting details on the other files soon as well.

Thanks,
–Proctor

Lumberjack – Log file parsing and analysis for Clojure

I have just pushed a 0.1.0 version of a new project called Lumberjack. The goal is to be a library of functions to help parse and analyze log files in Clojure.

At work I have to occasionally pull down log files and do some visualization of log files from our Nginx webservers. I decided that this could be a useful project to play with to help me on my journey with Clojure and Open Source Software.

This library will read in a set of Nginx log files from a sequence, and parse them to a structure to be able to analyze them. It currently also provides functionality to be able to visualize the data as a set of time series graphs using Incanter, as that is currently the only graphing library I have seen so far.

A short future list of things I would like to be able to support that come to mind very quickly, and not at all comprehensive:

Update to support use of BufferedReader for very long log files so the whole file does not have to reside in memory before parsing, and take advantage of laziness.
The ability to only construct records with a subset of the parsed data, such as request type, and timestamp.
The ability to parse log lines of different types, e.g. Apache, IIS or other formats
Additional graphs other than time series, e.g. bar graphs to show number of hits based off of IP Address.
Possibility of using futures, or another concurrency mechanism, to do some of the parsing and transformation of log lines into the data structures when working on large log files.

The above are just some of my thoughts on things that might fit well as updates to this as I start to use this more and flesh out more use cases.

I would love comments on my code, and any other feedback that you may have. This is still early but I wanted to put something out there that might be of some use to others as well.

You can find Lumberjack on my Github account at https://github.com/stevenproctor/lumberjack.

Thanks for your comments and support.
–Proctor

Log File parsing with Futures in Clojure

As the follow up to my post Running Clojure shell scripts in *nix enviornments, here is how I implemented an example using futures to parse lines read from standard in, as if the input were piped from a tail, and write the result of parsing each line to standard out.

First, because I want to run this as a script from the command line, I add this as the first line of the script:

 
#!/usr/bin/env lein exec

As well, I will also be wanting to use the join function from the clojure.string namespace.

 
(use '[clojure.string :only (join)])

When dealing with futures I knew I would need an agent to wrap standard out, so that writes coming from different threads are applied one at a time.

(def out (agent *out*))

I also wanted to separate each line by a new line so I created a function writeln. The function takes a Java Writer and calls write and flush on each line passed in to the function:

(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "\n"))
    .flush))

Next I have my function to analyze the line, as well as sending the result of that function to the agent via the send-off function.

(defn analyze-line [line]
  (str line "   " (join "  " (map #(join ":" %) (sort-by val > (frequencies line))))))

(defn process-line [line]
  (send-off out writeln (analyze-line line)))

The analyze-line function is just some sample code to return a string of the line and the frequencies of each character in the line passed in. The process-line function takes a line and calls send-off to the agent out for the function writeln with the results of calling the function analyze-line.

With all of these functions defined I now need to just loop continuously and process lines that are not empty, and call process-line for each line as a future.

(loop []
  (let [line (read-line)]
    ;; read-line returns nil at end of input, which ends the loop
    (when line
      (future (process-line line))
      (recur))))