Papillon – a new interceptor library for Clojure(Script)

Announcing Papillon, 0.0.1-PREVIEW.

This is an early alpha release to get feedback and discussion in the community. We are using it in production, but we went with an PREVIEW version to show still this is early days of the library, and although we are going to strive for the stability of libraries in the Clojure ecosystem, we recognize that as people other than us use it, it may require some updates.

Why a new interceptor library?

At work, Guaranteed Rate, who allowed us to open source the project, we run ClojureScript on AWS Lambdas (Node runtime) for many of the micro-services our team owns. The Lambdas are a combination of Lambdas triggered by AWS API Gateway which put items on either and SNS Topic or SQS Queue, with other Lambdas consuming from SQS Queues.

There was a decent portion of the codebase that was based of JavaScript Promises, and using Promise chaining using a “state” map (until finally style error handling for resource clean-up was needed), so we thought Interceptors was not a far reach for the team to adapt to.

We looked at Pedestal and Sieppari, but various aspects didn’t look to fit in with our desire to use the Interceptor pattern in AWS Lambdas, so we took a shot at rolling our own.

Goals

Clojure Common

As mentioned above, we run ClojureScript on Node runtime for our AWS Lambdas, so we needed a solution that covers both Clojure and ClojureScript.

While we are currently only targeting Clojure and ClojureScript support, as that is what we use in our deployments at work, our goal is to stick to Clojure core a much as possible to help keep the Papillon available across as many of the different Clojure runtimes as possible (e.g. Clojure.NET, ClojurErl, ClojureDart, Babashka, and more would also be welcomed).

Interceptor focused

Pedestal interceptors are fantastic, but they are part of Pedestal, and while we could have used Pedestal’s interceptor namespace only, it does have dependencies on logging in Pedestal, and the interceptors in Pedestal are not quite as isolated from the rest of Pedestal as we would have liked.

Decouple the interceptors from HTTP requests.

Sieppari was more focused on Interceptors only, but was based on the idea of a Request/Response model.

With our goal to have interceptors be the prevalent pattern in our AWS Lambdas, we needed something that would fit with both the HTTP style of synchronous AWS Lambdas, as well as the asynchronous AWS Lambdas that consume their items from SQS queues, the idea of contorting a SQS message into a HTTP Request/Response was something we wanted to avoid.

Minimal

We focused on what seemed to be the core of interceptors which is having an execution chain of interceptors to run, and running a context map through that chain and back out.

We have tried to leave out most everything else, as we found the interceptor chains can easily be modified to include many orthogonal concerns, allowing us to decomplect the interceptor execution from other useful but separate concerns.

One example of something we have left out is logging. While it is useful to log the path through the interceptor chain, and modifications to the execution chain, we didn’t want to pick a logging library that consuming applications would inherit our decision on.

Data First

We also found that given the concept of the interceptor chain being just data, we could get logging (and various other concerns, e.g. benchmarking, tracing, etc.), included by manipulating the interceptor chain using interleave with a repeated sequence of a logging specific interceptor.

This ability to treat both the context as data and the control flow as data, allowed us to keep the core flow of domain logic as interceptors, distinct from logging and other developer related concerns, allowing us to highlight the core context.

Given that the control flow is data, and available on the context, it allowed us to play with ideas like setting up support for a Common Lisp style Condition System as seen in the examples folder.

Clojure Core Libraries Based

We stuck with core.async and the ReadPort as our asynchronous mechanism. If we are a library, we didn’t want to commit you to a library because you use Papillon. Clojure (JVM) already has various asynchronous constructs that work with ReadPort and we piggy-backed on ReadPort in ClojureScript land to allow you to use JavaScript Promises as a ReadPort.

The Goal was to give you something that would work out of the box with the various tools to do asynchronous programming that Clojure and ClojureScript give you without making you implement yet another protocol to adapt to.

Get more discussion on Interceptors starting again

We don’t expect that this will become the next big hit and everyone will start using this in their code, but we do hope that by publishing and promoting “Yet Another Interceptor Library”

Side Note:  I almost did name it yail (Yet Another Interceptor Library), but it didn't feel like it hit 'yet another' level so happy to keep that one available for someone else to use in the hope that the idea of interceptors gets to that point 🤞.

Those of us in our group who were pushing this project forward think interceptors are a valuable and a “well kept secret” of the Clojure ecosystem, and would love to see more usages of them in the community.

We also would love to see some more abuses of interceptors as well, because it helps find the edges of what can(not) and should (not) be done with them.

Thank You to Guaranteed Rate

We are thankful that Guaranteed Rate has allowed us to take this project to make our work lives better and open source it as Papillon to share with the Clojure community.

Ibid.

During my interview with Gene Kim on for Functional Geekery, Episode 128, Gene talked about how he had a problem he was asking different people for how they would solve it nicely with a functional approach, to see how to improve his Clojure solution to be more idiomatic.

His problem was on “rewriting” Ibid. entries in citation references, to get the authors names instead of the Ibid. value, as Ibid. is a shorthand that stands for “the authors listed in the entry before this”.

As he was describing this problem, I was picturing the general pseudo-code with a pattern match in my head. To be fair, this has come from a number of years of getting used to thinking in a functional style as well as thinking in a pattern matching style.

The following Erlang code is a close representation to the pseudo-code that was in my head.

-module(ibid).

-export([ibid/1]).

ibid(Authors) ->
    ibid(Authors, []).

ibid([], UpdatedAuthors) ->
    {ok, lists:reverse(UpdatedAuthors)};
ibid(["Ibid." | _], []) ->
    {error, "No Previous Author for 'Ibid.' citation"};
ibid(["Ibid." | T], UpdatedAuthors=[H | _]) ->
    ibid(T, [H | UpdatedAuthors]);
ibid([H | T], UpdatedAuthors) ->
    ibid(T, [H | UpdatedAuthors]).

Running this in the Erlang shell using erl results in the following

> ibid:ibid(["Mike Nygard", "Gene Kim", "Ibid.", "Ibid.", "Nicole Forsgren", "Ibid.", "Jez Humble", "Gene Kim", "Ibid."]).
{ok,["Mike Nygard","Gene Kim","Gene Kim","Gene Kim",
     "Nicole Forsgren","Nicole Forsgren","Jez Humble","Gene Kim",
     "Gene Kim"]}
> ibid:ibid(["Ibid."]).
{error,"No Previous Author for 'Ibid.' citation"}

Throughout the editing of the podcast, I continued to think about his problem, and how I would approach it in Clojure without built-in pattern matching, and came up with the following using a cond instead of a pure pattern matching solution:

(defn
  update_ibids
  ([authors] (update_ibids authors []))
  ([[citation_author & rest_authors :as original_authors] [last_author & _ :as new_authors]]
    (let [ibid? (fn [author] (= "Ibid." author))]
      (cond
        (empty? original_authors) (reverse new_authors)
        (and (ibid? citation_author) (not last_author))
          (throw (Exception. "Found `Ibid.` with no previous author"))
        :else (recur
          rest_authors
          (cons
            (if (ibid? citation_author)
                last_author
                citation_author)
            new_authors))))))

And if we run this in the Clojure REPL we get the following:

user=> (def references ["Gene Kim", "Jez Humble", "Ibid.", "Gene Kim", "Ibid.", "Ibid.", "Nicole Forsgren", "Micheal Nygard", "Ibid."])

user=> (update_ibids [])
()
user=> (update_ibids ["Ibid."])
Execution error at user/update-ibids (REPL:8).
Found `Ibid.` with no previous author
user=> (update_ibids references)
("Gene Kim" "Jez Humble" "Jez Humble" "Gene Kim" "Gene Kim" "Gene Kim" "Nicole Forsgren" "Micheal Nygard" "Micheal Nygard")

That solution didn’t sit well with me (and if there is a more idiomatic way to write it I would love some of your solutions as well), and because of that, I wanted to see what could be done using the core.match library, which moves towards the psuedo-code I was picturing.

(ns ibid
  (:require [clojure.core.match :refer [match]]))


(defn
  update_ibids
  ([authors] (update_ibids authors []))
  ([orig updated]
    (match [orig updated]
      [[] new_authors] (reverse new_authors)
      [["Ibid." & _] []] (throw (Exception. "Found `Ibid.` with no previous author"))
      [["Ibid." & r] ([last_author & _] :seq) :as new_authors] (recur r (cons last_author new_authors))
      [[author & r] new_authors] (recur r (cons author new_authors)) )))

And if you are trying this yourself, don’t forget to add to your deps.edn file:

{:deps
  {org.clojure/core.match {:mvn/version "0.3.0"}}

After the first couple of itches were scratched, Gene shared on Twitter Stephen Mcgill’s solution and his solution inspired by Stephen’s.

https://twitter.com/RealGeneKim/status/1201922587346866176

(Edit 2022-05-02 : I took out the Twitter embed and changed the embed to be an HTML link to Twitter if you are interested in seeing the post as it was pointed out that tracking cookies were being dropped by Twitter, in an effort to reduce cookies being dropped by this site.)

And then, just for fun (or “just for defun” if you prefer the pun intended version), I did a version in LFE (Lisp Flavored Erlang) due to it being a Lisp with built in pattern matching from being on the Erlang runtime.

(defmodule ibid
  (export (ibid 1)))


(defun ibid [authors]
  (ibid authors '[]))


(defun ibid
  ([[] updated]
    (tuple 'ok (: lists reverse updated)))
  (((cons "Ibid." _) '[])
    (tuple 'error "No Previous Author for 'Ibid.' citation"))
  ([(cons "Ibid." authors) (= (cons h _) updated)]
    (ibid authors (cons h updated)))
  ([(cons h rest) updated]
    (ibid rest (cons h updated))))

Which if we call it in LFE’s REPL gives us the following:

lfe> (: ibid ibid '["Mike Nygard" "Gene Kim" "Ibid." "Ibid." "Nicole Forsgren" "Ibid." "Jez Humble" "Gene Kim" "Ibid."])
#(ok
  ("Mike Nygard"
   "Gene Kim"
   "Gene Kim"
   "Gene Kim"
   "Nicole Forsgren"
   "Nicole Forsgren"
   "Jez Humble"
   "Gene Kim"
   "Gene Kim"))
lfe> (: ibid ibid '["Ibid."])
#(error "No Previous Author for 'Ibid.' citation")

If you have different solutions shoot them my way as I would love to see them, and if there looks to be interest, and some responses, I can create a catalog of different solutions similar to what Eric Normand does on his weekly challenges with his PurelyFunctional.tv Newsletter.

Deeper Down the Custom Built Keyboard Rabbit Hole

In my last post, Starting Towards A Custom Built Keyboard, I left off with my pining over the idea of an ErgoDox; a split keyboard, with full programmability of the key mappings.

A new co-worker and some Obsessive Compulsive Syndrome

Not long after starting a new job, a co-worker was intrigued by my Das Keyboard, but couldn’t touch-type, so I let him borrow my Code Keyboard with Cherry clear switches.

After an extended weekend with it, he was determined to learn to touch-type so he could try out my Das Keyboard with its Cherry blue switches.

It was at this point his (officially diagnosed) Obsessive Compulsive Syndrome kicked in.

He dove in and learned to touch-type, and soon he was ready to try out the unlabeled keys of the Das Keyboard which had been “taunting him” from my desk at work.

After just a few days with the Code, and then the Das Keyboard, he fell (back) in love with mechanical keyboards.

Of course, since he was learning to appreciate touch-typing, instead of just hunting-and-pecking, I introduced him to Steve Losh’s Modern Space Cadet post.

Due to his Obsessive Compulsive Syndrome, combined with love of the hunt at thrift stores, he found some older keyboards like the IBM Model M, and and old cop car keyboard by TG3.

As the old IBM Model M’s don’t support USB, he found the TMK Firmware project by Hasu, and assembled some converters which would allow him to use his older keyboards with his current computers.

If you are curious on his journey down the rabbit hole, he has been starting to document it on his blog at www.clintgeek.com.

The Ultimate Hacking Keyboard

Meanwhile, I was still on the hunt for a split ergonomic programmable keyboard.

In August of 2015 I stumbled across a pre-announcement of the Ultimate Hacking Keyboard, and after listening to Laci on The Changelog in December of 2015, I was intrigued enough to pre-order a board.

My thought was it would net me a programmable keyboard, as well as a split keyboard, without having to assemble it myself, and if it would turn out I wasn’t a fan, I could likely sell it to someone who missed the pre-order.

To top it off, it outlined that it was going to have magnets that would hold it together in a non-split mode.

Being uncertain of if I would even like a split keyboard, I saw a reduced risk in the purchase, in that if I tried the split mode and didn’t like it, I could always fall back to non-split mode for the keyboard, yet still have something that was programmable.

And of course, hardware being as it is, it has (understandably) been delayed a number of times as the project has progressed, but should hopefully be arriving this August (2017) as of last update.

Sierra and Karabiner

While awaiting the Ultimate Hacking Keyboard, and watching my co-worker dive deeper in to customizing his keyboards, I was sitting in a holding pattern before upgrading to OSX Sierra as it meant incompatibility with Karabiner due to changes in OSX.

Karabiner Elements was eventually announced as the replacement for Seil and Karabiner, but didn’t give the full extensibility of Karabiner, and I wasn’t going to lose my “Hyper key” from my keyboard without a battle.

Because of this, my co-worker was busy trying to sell me on the idea of building converters for my keyboards myself, but kept un-selling himself on the idea, as “a USB-to-USB protocol wasn’t quite supported”.

The “sales” cycle continued in this manner until he finally mentioned:

“Now, if your keyboards supported the PS/2 protocol, that would be very easy.”

Getting ready to make the dive

With those magic words, he convinced me to build a set of converters for my keyboards, as building the converters would mean:

I can specify the key-mappings I want at the firmware level in the converter, and not have to worry about operating system compatibility of key remapping software;
it also gives the added bonus of being able to bring my keyboard to someone else’s desk to pair, plug it in, and have my keyboard work as I am used to;
remove the converter so that someone can use my computer with the standard key-mappings they are used to;
or possibly even using someone else’s keyboard and using my converter with it to get my key mappings.

With a previous order of a nicer soldering iron over the $3 one from Harbor Freight I had, and an order from Amazon for:

I was ready to start on building some converters for my keyboards.

Starting Towards A Custom Built Keyboard

As someone who works with computers and uses a keyboard, on a near daily basis, and for 10-12+ hours in most cases, having a great keyboard that fits you is right up there with a great bed, really nice monitor, and good desk environment.

Distant Early Warning

My fascination with keyboards go back a number of years, in that whenever I would go into Best Buy, CompUSA, or some other computer sales related stores, I would make sure to try out the keyboards they had on display.

Given all that I am about to lay out, in retrospect, I knew there was something about the importance of a good keyboard, but wasn’t yet cognizant of what a good keyboard even was, settling on nice rubber-dome Logitect keyboards usually.

Enter Das Keyboard

This all started to change about the end of 2009, when Das Keyboards Ultimate Model S was announced.

I was a touch typist, mostly, and I specify mostly, as that I would touch type, but then start letting my eyes wander from the screen where they would eventually wind up looking at the keyboard and my typing pace would slow to a crawl.

Seeing that there was a supposedly really nice keyboard at a reasonable price, given the expected lifetime of a good keyboard, I cashed in my “Christmas Credit”, and asked for only one present, a Das Keyboard Ultimate Model S.

After some selling on the idea, in that professionals invest in professional grade tools; that I wanted the blank keycaps to force myself to not let my eyes wander, because there would be nothing to watch; and that this should last me years of use, if not decades, on Christmas Morning I opened up and started click-clacking away.

Jeff Attwood, Steve Losh, and a co-worker named Pete Young

Then 4 years ago happened, I was following Jeff Atwood’s (a.k.a. CodingHorror) blog posts and he announced the Code Keyboard.

After hearing about how nice the Cherry MX Clears are if you like the Blues (which were the key switches in the Das Model S that I had), and catching the first glimpse of customizability with the Code having dip-switches to change between Mac/Windows Mode, QWERTY/Dvorak/Colemak options, and others, I was able to get an order in on the second round to get a clears.

While we were waiting, my co-worker Pete, who was also excited about keyboards, and further down the path than I, started bringing in his various keyboards, and we would exchange and see how different keyboards felt after a day of usage, getting an initial feel of what we like.

And I believe it was Pete, (and if it wasn’t him that shared it with me and I shared it with him instead, he will still get the credit), showed me Steve Losh’s A Modern Space Cadet post, and I started the path down to using some customization of my keyboard layout as outlined in the post.

Pining over the ErgoDox

At some point I came across the ErgoDox online. I don’t remember where it was originally, but that put the Atreus on my radar and a few other variants of “travel” keyboards.

Somewhere in this time range as well, I heard someone refer to building and customizing your own keyboard as a software developer was akin to a Jedi building their own lightsaber.

Truth be told, I was intimidated. The cost for the parts, and me having to do it myself, made it both a dream, and also very intimidating.

Intimidating as my knowledge of soldering was:
– that there was an soldering iron (which wasn’t spelled how it sounded to my ears),
– flux (not to be confused with the capacitor),
– the solder itself,
– used for connecting electrical “stuff” together,
– and that a soldering iron could easily be confused with a very badly operating wood burner.

But the ErgoDox still remains the dream, and that dream is getting closer to being reality.

In the next few posts, I will outline how I have since started making faster progress to the dream of a fully custom built keyboard and having my own perfect keyboard.

Ruby Tuesday – composing filter

As mentioned in the last Ruby Tuesday on using compose and map, I ended with saying that we would take a look at how we would take advantage of composition for our filter.

As a refresher, here is our Sequence module.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

So lets start taking a look at how we would be able to compose our filter lambdas together by seeing if we can find all numbers that are multiples of three, and then multiples of five.

multiple_of_three = ->(x) { x % 3 == 0}
# => #<Proc:0x007faeb28926b0@(pry):1 (lambda)>
multiple_of_five = ->(x) { x % 5 == 0}
# => #<Proc:0x007faeb3866c58@(pry):2 (lambda)>

Sequence.compose.([Sequence.filter.(multiple_of_three),
                   Sequence.filter.(multiple_of_five)]).((1..100))
=> [15, 30, 45, 60, 75, 90]

The catch again is that we are traversing the first list for all 100 elements to find those items that are multiples of 3, and then filtering the filtered list to find only those that are multiples of 5.

What if we could do this in one pass, and compose the predicates together?

And since we want to test this to make sure we get a good result, we will compose those predicates together and just test our composed function with the number 15.

three_and_five = Sequence.compose.([multiple_of_three, multiple_of_five])
=> #<Proc:0x007faeb4905d20 (lambda)>
three_and_five.(15)
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

It fails with a NoMethodError telling us that we can’t use the method % on an object of type TrueClass.

Don’t we want to get back a true instead of an error though?

If we rollback the compose into it’s nested function call version, it starts to become clear where the error is coming from.

multiple_of_five.(multiple_of_three.(15))
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

So what is happening is that we get back true from calling multiple_of_three with 15, and we then pass that true into multiple_of_five, when what we wanted to do was to pass in the 15 and make sure it was a multiple of 5 as well.

What we really want is to evaluate every lambda with the value of 15, and then then see if all of the predicate functions succeed in their checks.

So we will start with a very naive, and “un-curried” version, to prove out our concept.

Our first pass will be to map over each predicate and invoke it with the value we want to test resulting in a list of booleans if it succeeded or failed. We then reduce all of those items together via an and operation to get an overall success status.

def my_all_succeed_naive(predicates, value)
  check_results = Sequence.map.(->(f) {f.(value)}, predicates)
  Sequence.reduce.(true, ->(accum, item) {accum && item}, check_results)
end

my_all_succeed_naive([multiple_of_three, multiple_of_five], 15)
# => true
my_all_succeed_naive([multiple_of_three, multiple_of_five], 14)
# => false
my_all_succeed_naive([multiple_of_three, multiple_of_five], 5)
# => false

This looks like this works, as we get 15 is a multiple of 3 and a multiple of 5, but we still check if the item is a multiple of 5, even if our multiple of 3 check failed.

Could we do better, and short circuit our evaluation if we get a false?

Let’s try.

def my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

So let’s take a look at the difference between the two, just to make sure we are on the right track.

First we will create a “long running predicate check” to add to our chain.

[22] pry(main)> pass_after = ->(x, value) { sleep(x); true }.curry
=> #<Proc:0x007faeb310d498 (lambda)>

Then we will time it using Benchmark#measure (and don’t forget to require 'benchmark' first though). First with a success, and then with a failure against the first predicate.

Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 15)
end
# => #<Benchmark::Tms:0x007faeb28f24c0
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0046792929642834,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>


Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb6018c88
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=2.5073997676372528e-05,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

We can see that it takes three seconds to run if we succeed, but only a fraction of a second if we fail on the first check. And just to make sure our earlier assumptions were correct, we will benchmark the same predicate list against the naive version with the value of 14, so it will fail on the first check of a multiple of three.

Benchmark.measure do
  my_all_succeed_naive([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb487d218
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0028793679666705,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

And it does indeed take just over 3 seconds to complete.

So let’s add this to our Sequence module, and get it to be able to be curried.

So what if we did that for just a single string, not even in an list of some sort?

def self.my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

@@all_succeed = method(:my_all_succeed).curry

def self.all_succeed
  @@all_succeed
end

And we check that we can use it off our Sequence module partially applied.

Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(14)
# => false
Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(15)
# => true

So now we can get back to how we would use this with filter by using our new Sequence::all_succeed.

three_and_five_multiple = Sequence.all_succeed.([multiple_of_three, multiple_of_five])
# => #<Proc:0x007faeb28685e0 (lambda)>
Sequence.filter.(three_and_five_multiple).((1..100))
# => [15, 30, 45, 60, 75, 90]

And there we go, we have now composed our predicates into one function which we can then pass to Sequence::filter and only have to walk through the list once.

As we have seen how to compose predicates together for an “and” style composition, next week we will look at building an “or” style composition of predicates.

–Proctor

Erlang Thursday – ETS selects, continuations, and concurrent inserts

At the end of last week’s Erlang Thursday, I said we would continue looking at the behavior of the select functions in the ets module.

So before we do any experimentation, we setup our test ETS tables, and this time we will also create a table of type ordered_set.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
TestOrderedSetTable = ets:new(ordered_set_table, [public, ordered_set]).
% 16402
TestTable = ets:new(ets_table, [public]).
% 20499
ets:give_away(TestTable, SomeProcess, []).
% true
ets:give_away(TestOrderedSetTable, SomeProcess, []).
% true

Next we will load our test ETS table with some dummy data, leaving some gaps in the sequence, allowing us to fill those gaps in later.

[[ets:insert(TestTable, {X, X}) || X <- lists:seq(1, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]
[[ets:insert(TestOrderedSetTable, {X, X}) || X <- lists:seq(1, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]

We then do a select to get all of the records from the table so we can see how the results are ordered for the different table types.

ets:select(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{15,15},
%  {25,25},
%  {13,13},
%  {21,21},
%  {11,11},
%  {1,1},
%  {23,23},
%  {7,7},
%  {3,3},
%  {9,9},
%  {19,19},
%  {29,29},
%  {27,27},
%  {17,17},
%  {5,5}]
ets:select(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{1,1},
%  {3,3},
%  {5,5},
%  {7,7},
%  {9,9},
%  {11,11},
%  {13,13},
%  {15,15},
%  {17,17},
%  {19,19},
%  {21,21},
%  {23,23},
%  {25,25},
%  {27,27},
%  {29,29}]

The ets module also has a function ets:select_reverse, so let’s take a quick stop and see what that does for our ETS tables.

ets:select_reverse(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{15,15},
%  {25,25},
%  {13,13},
%  {21,21},
%  {11,11},
%  {1,1},
%  {23,23},
%  {7,7},
%  {3,3},
%  {9,9},
%  {19,19},
%  {29,29},
%  {27,27},
%  {17,17},
%  {5,5}]
ets:select_reverse(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{29,29},
%  {27,27},
%  {25,25},
%  {23,23},
%  {21,21},
%  {19,19},
%  {17,17},
%  {15,15},
%  {13,13},
%  {11,11},
%  {9,9},
%  {7,7},
%  {5,5},
%  {3,3},
%  {1,1}]

If we look at the results of ets:select/2 and ets:select_reverse/2, we see that for TestTable we get the same result, and for TestOrderedSetTable we get the results in a reverse order, which is what the documentation for ets:select_reverse/2 states. Which makes sense if you think about it,

With that brief diversion out of the way, lets run our same match_spec()s from above, but limit the results to 5 records so we get a continuation back.

{Result, Continuation} = ets:select(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}], 5).
% {[{19,19},{29,29},{27,27},{17,17},{5,5}],
% {20499,214,5,<<>>,[],0}}
{OrdSetResult, OrdSetContinuation} = ets:select(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}], 5).
% {[{1,1},{3,3},{5,5},{7,7},{9,9}],{16402,9,[],5,<<>>,[],0,0}}

And with those continuations, we will see what the next results we would fetch would be.

ets:select(Continuation).
% {[{1,1},{23,23},{7,7},{3,3},{9,9}],{20499,111,5,<<>>,[],0}}
ets:select(OrdSetContinuation).
% {[{11,11},{13,13},{15,15},{17,17},{19,19}],
%  {16402,19,[],5,<<>>,[],0,0}}

Remember those “gaps” we left in our sequence of numbers we used to create tuples?

Time to “fill in” those gaps of the sequence to see what happens if we fetch with our existing continuation as data gets populated concurrently.

[[ets:insert(TestOrderedSetTable, {X, X}) || X <- lists:seq(2, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]
[[ets:insert(TestTable, {X, X}) || X <- lists:seq(2, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]

Now we re-run our ets:select/1 functions with the same continuations as before.

ets:select(Continuation).
% {[{12,12},{7,7},{3,3},{10,10},{9,9}],
%  {20499,224,5,<<>>,[],0}}
ets:select(OrdSetContinuation).
% {[{10,10},{11,11},{12,12},{13,13},{14,14}],
%  {16402,14,[],5,<<>>,[],0,0}}

If we compare that to before we can see the we now have even number items in the list. For our TestTable if we look above at the Continuation value itself, we ahve the continuation point as 214, since that is the only thing that has changed between that continuation and the resulting continuations from calling ets:select(Continuation).. So with just a number it is hard to infer just how we might expect the continuation to change.

The OrdSetContinuation on the other hand, has a 9 as its second element in the tuple, after the ETS table id of 16402. This also happens to be the key of the last tuple in the result set, which matches up with the 19 and 14 in the other continuations. So in the case of the ordered set, we can infer that as part of the continuation for an ETS table of type ordered_set, the continuation tells us the specific key of the last record that was returned, and we continue from that record regardless of any concurrent inserts that may have taken place.

Next time we will take a look at ets:is_compiled_ms/1 and how match specs might play in with continuations based off reading the documentation about ets:is_compiled_ms/1.

–Proctor

Ruby Tuesday – compose and map

As mentioned last week, now that we have our compose function we will take a look at some of the properties of we get when using our map and compose functions together.

Here is our Sequence class as we left it with compose added to it.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

Like last week we have a list of names, and our goal is to get the list of “first” names back and have them be capitalized.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]
# => ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]

So we start with our base functions that we are going to use in our map calls.

split = ->(delimiter, str) {str.split(delimiter)}.curry
# => #<Proc:0x007ffdfa203a00 (lambda)>
whitespace_split = split.(" ")
# => #<Proc:0x007ffdfa8391e0 (lambda)>
first = ->(xs){ xs[0] }.curry
# => #<Proc:0x007ffdfa2a26a0 (lambda)>
capitalize = ->(s) { s.capitalize }
# => #<Proc:0x007ffdfa210ed0@(pry):13 (lambda)>

And we create our different instances of map calls, which are partially applied map functions with the appropriate lambda passed to them.

name_parts = Sequence.map.(whitespace_split)
# => #<Proc:0x007ffdfa1a0248 (lambda)>
firsts = Sequence.map.(first)
# => #<Proc:0x007ffdfa169568 (lambda)>
capitalize_all = Sequence.map.(capitalize)
# => #<Proc:0x007ffdfa86a1f0 (lambda)>

And as we saw last week, we can nest the function calls together,

capitalize_all.(firsts.(name_parts.(names)))
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

or we can use compose to create a “pipeline” of function calls.

capitalized_first_names = Sequence.compose.([name_parts, firsts, capitalize_all])
# => #<Proc:0x007ffdfa1c1c18 (lambda)>
capitalized_first_names.(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Here’s where things can start to get interesting.

In our capitalized first names example, we go through the list once per transformation we want to apply.

First we transform the list of names into a list of split names, which we transform into a list of only the first items from the source list, and then finally we transform that into a list of the capitalized names.

That could be a lot of processing if we had more transformations, and/or a much longer list. This seems like a lot of work.

Let’s look at it from near the other extreme; the case of if we only had one name in our list.

capitalized_first_names.(['tARa cHAsE'])
=> ["Tara"]

In this case, the fact that we have a list at all is almost incidental.

For a list of only one item, we split the string, get the first element, and capitalize that value.

So what if we did that for just a single string, not even in an list of some sort?

capitalize_first_name = Sequence.compose.([whitespace_split, first, capitalize])
# => #<Proc:0x007ffdfa121100 (lambda)>
capitalize_first_name.("tARa cHAsE")
# => "Tara"

We can compose each of these individual operations together to get a function that will transform a name into the result we want.

And since we have a “transformation” function, we can pass that function to our map function for a given list of names.

Sequence.map.(capitalize_first_name, names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Lo and behold, we get the same results as above for when we composed our map function calls.

Sequence.compose.([Sequence.map.(whitespace_split),
                   Sequence.map.(first),
                   Sequence.map.(capitalize)]
                 ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]
Sequence.map.(Sequence.compose.([whitespace_split,
                                 first,
                                 capitalize])
             ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

This leads us to the property that the composition of the map of function f and the map of function g is equivalent to the map of the composition of functions f and g.

$\big(map\ f \circ map\ g\big)\ list = map\big( f \circ g \big) list$

Where the circle ( $\circ$ ) symbol represents the function compose expressed using mathematical notation.

Because of this, we can now only traverse the sequence via map once and do the composed transformation functions on each item as we encounter it without having to go revisit it again.

Next time we will take a look at how we can do the same kind of operation on filter, as we can’t just pipeline the results of one filter on through to another, since filter returns a boolean value which is not what we would want to feed through to the next filter call.

–Proctor

Erlang Thursday – Using ETS select with a limit

In last week’s Erlang Thursday we continued to explore ets:select/2 and seeing its use when combined with using ets:fun2ms to generate the match_spec()s.

This week we will take a look at the other versions of select that the ets module provides.

Yet again we will do our new playground ETS table setup, so if we crash our shell session we don’t loose the table.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
TestTable = ets:new(ets_table, [public]).
% 16402
ets:give_away(TestTable, SomeProcess, []).
% true

Next we will load our test ETS table with a bunch of test “products”. For ease of example, we will just use a number for the product id, and a random price ending in .99.

[[ets:insert(TestTable, {ProductId, random:uniform(100) + 0.99})
  || ProductId <- lists:seq(1, 10000) ]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true|...]]

We will create a match_spec() to find items in their twenties (and we will go ahead and round 19.99 up to 20 just because).

ProductsInTheTwenties = ets:fun2ms(fun({Product, Price})
                                     when Price >= 19.99 andalso Price < 30
                                     -> {Product, Price}
                                   end).
% [{{'$1','$2'},
%   [{'andalso',{'>=','$2',19.99},{'<','$2',30}}],
%   [{{'$1','$2'}}]}]

And if we use ets:select/2 against our table with this match spec, we get all of the results back in one query as we saw previously.

ets:select(TestTable, ProductsInTheTwenties).
% [{4351,29.99},
%  {635,19.99},
%  {6005,20.99},
%  {3742,27.99},
%  {5956,29.99},
%  {3753,28.99},
%  {6653,25.99},
%  {5151,28.99},
%  {2693,27.99},
%  {4253,21.99},
%  {7636,23.99},
%  {1935,19.99},
%  {9044,22.99},
%  {7797,22.99},
%  {2147,23.99},
%  {2574,26.99},
%  {7575,29.99},
%  {2130,28.99},
%  {4908,27.99},
%  {2218,22.99},
%  {9848,21.99},
%  {7632,26.99},
%  {3562,21.99},
%  {3130,27.99},
%  {575,26.99},
%  {4622,28.99},
%  {5678,25.99},
%  {4022,...},
%  {...}|...]

But the ets module also gives us a way to limit the results if we would like, using ets:select/3 and giving a limit of the number of results to return at a time.

So let’s use ets:select/3 and give a limit of 10 and see what happens.

ets:select(TestTable, ProductsInTheTwenties, 10).
% {[{9027,27.99},
%   {7347,29.99},
%   {7282,20.99},
%   {9386,24.99},
%   {5415,25.99},
%   {4032,29.99},
%   {8105,25.99},
%   {4634,24.99},
%   {1275,20.99},
%   {234,20.99}],
%  {16402,576,10,<<>>,[],0}}

We get a tuple back instead of a list of results. The first item in the tuple is a list of our first ten results we specified, the second is some bizarre looking tuple, which if we look at the documentation for ets:select/3 represents something referred to as a continuation.

So we run our query again, and bind the results this time.

{Results, Continuation} = ets:select(TestTable, ProductsInTheTwenties, 10).
% {[{9027,27.99},
%   {7347,29.99},
%   {7282,20.99},
%   {9386,24.99},
%   {5415,25.99},
%   {4032,29.99},
%   {8105,25.99},
%   {4634,24.99},
%   {1275,20.99},
%   {234,20.99}],
%  {16402,576,10,<<>>,[],0}}

So we have this continuation, but what is it and what does it mean for us to have it.

In short, it can be thought of as an immutable bookmark. It represents not only what page we are in for our query results, but also the book we are reading (our query).

This allows us to quickly pick up where we previously left off in our results set by passing the continuation to ets:select/1.

ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And because it is our special immutable bookmark, every time we use that bookmark it takes us to the same starting point in the same book, and we only read the same number of maximum pages as originally set as our limit.

So no matter how many times we call ets:select/1 with our same continuation, we will get the same results each time.

ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}
ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}
ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And if we look at the resulting tuple, we see that we get a different tuple for our next continuation.

{SecondResults, SecondContinuation} = ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And we can pick up that new continuation, and use that in our next call to ets:select/1 to get the next set of results, with another continuation.

ets:select(SecondContinuation).
% {[{8569,19.99},
%   {1805,28.99},
%   {6819,23.99},
%   {9313,28.99},
%   {9527,27.99},
%   {1737,29.99},
%   {700,26.99},
%   {142,25.99},
%   {6792,29.99},
%   {9295,29.99}],
%  {16402,513,10,<<>>,[],0}}

And if we have a query for which we have exhausted our results set, we get an '$end_of_table' atom.

ets:select(TestTable, [{{'$1', '$2'}, [{'<', '$2', 0}], ['$$']}], 10).
% '$end_of_table'

The ability to specify a limit and have a continuation is also available via on match with ets:match/3 and ets:match/1, and match_object via ets:match_object/3 and ets:match_object/1.

Next week, we will continue looking at the various select functions in ets as we look into their behavior with and ordered set, will look at select vs select_reverse, and play with and see how continuations work if we get some new entries inserted in the results when using a continuation.
–Proctor

Ruby Tuesday – Refactoring towards compose

We have seen filter, map, reduce, partial application, and updating the former functions to be able to take advantage of partial application, so how much further can we go?

In this case, we will take a look at how we can chain the functions together to build up bigger building blocks out of their smaller components.

First as a reminder, here is our Sequence class that we have built up over the past few posts.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end
end

Suppose we want to get a list of capitalized “first” names from a list of names, and we have a bunch of smaller functions that handle the different part of the transformation process that we can reuse.

It might look like the following.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris"]

name_parts_map = Sequence.map.(->(name) {name.split})
# => #<Proc:0x007fea82200638 (lambda)>
first_map = Sequence.map.(->(xs) {xs[0]})
#=> #<Proc:0x007fea82842780 (lambda)>
capitalize_map = Sequence.map.(->(s) {s.capitalize})
# => #<Proc:0x007fea82a05ce8 (lambda)>
initials_map = Sequence.map.(->(strings) {first_map.(strings)})
# => #<Proc:0x007fea82188638 (lambda)>

capitalize_map.(first_map.(name_parts_map.(names)))
# => ["Jane", "John", "Arthur", "Lori"]

And if we wanted to get a list of the initials as a list themselves, we might have something like this.

initials_map.(name_parts_map.(names))
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

And maybe somewhere else, we need to do some mathematical operations, like transform numbers into one less than their square.

square_map = Sequence.map.(->(i) {i*i})
# => #<Proc:0x007fea82183ef8 (lambda)>
dec_map = Sequence.map.(->(i) {i-1})
# => #<Proc:0x007fea82a37018 (lambda)>
dec_map.(square_map.([1,2,3,4,5]))
# => [0, 3, 8, 15, 24]

And yet another place, we have a calculation to turn Fahrenheit into Celsius.

minus_32 = ->(x) {x-32}
# => #<Proc:0x007fea8298d4c8@(pry):36 (lambda)>
minus_32_map = Sequence.map.(minus_32)
# => #<Proc:0x007fea82955488 (lambda)>
five_ninths = ->(x) {x*5/9}
# => #<Proc:0x007fea83836330@(pry):38 (lambda)>
five_ninths_map = Sequence.map.(five_ninths)
# => #<Proc:0x007fea821429f8 (lambda)>
five_ninths_map.(minus_32_map.([0, 32, 100, 212]))
# => [-18, 0, 37, 100]

Setting aside for the moment, all of the Procs making up other Procs if this is still foreign to you, there is a pattern here that we have been doing in all of these examples to compose a larger function out of a number of smaller functions. The pattern is that we are taking the return result of calling one function with a value, and feeding that into the next function, lather, rinse, and repeat.

Does this sound familiar?

This is our reduce.

We can use reduce to define a function compose that will take a list of functions as our items, and an initial value for our functions, and our function to apply will be to call the function in item against our accumulated value.

That’s a bit of a mouthful, so let’s look at it as code, and we will revisit that statement.

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@compose = method(:my_compose).curry

  def self.compose
    @@compose
  end

Now that we have the code to reference, let’s go back and inspect against what was described above.

First we create a lambda apply_fn, which will be our “reducing” function to apply to the accumulator and each item in the list, which in the case of my_compose is a list of functions to call.

apply_fn like all our “reducing” functions so far takes in an accumulator value, the result of the composed function calls so far, and the current item, which is the function to call. The result for the new accumulator value is the result of applying the function with the accumulator as its argument.

We were able to build yet another function out of our reduce, but this time we operated on a list of functions as our values.

Let that sink in for a while.

So let’s see how we use that.

We will start with creating a composed function to map the Fahrenheit to Celsius conversion and see what different temperatures are in Celsius, including the past few days of highs and lows here at DFW airport.

f_to_c_map = Sequence.compose.([minus_32_map, five_ninths_map])
# => #<Proc:0x007fea82a1f9b8 (lambda)>
f_to_c_map.([0, 32, 100, 212])
# => [-18, 0, 37, 100]
dfw_highs_in_celsius = f_to_c_map.([66, 46, 55, 48, 64, 68])
# => [18, 7, 12, 8, 17, 20]
dfw_lows_in_celsius = f_to_c_map.([35, 27, 29, 35, 45, 40])
# => [1, -3, -2, 1, 7, 4]

And if we take a look at the initials above and compose the map calls together, we get the following.

get_initials_map = Sequence.compose.([name_parts_map, initials_map])
# => #<Proc:0x007fea82108ff0 (lambda)>
get_initials_map.(names)
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

Doing the same for our capitalized first names we get:

capitalized_first_names_map = Sequence.compose.([name_parts_map, first_map, capitalize_map])
# => #<Proc:0x007fea821d1108 (lambda)>
capitalized_first_names_map.(names)
# => ["Jane", "John", "Arthur", "Lori"]

By having our compose function, we are able to be more explicit that capitalized_first_names_map is, along with the rest of the examples, just a composition of smaller functions that have been assembled together in an data transformation pipeline.

They don’t have any other logic other than being the result of chaining the other functions together to get some intended behavior.

Not only that, but we can now reuse our capitalized_first_names_map mapping function against other lists of names nicely, since we have it able to be partially applied as well.

capitalized_first_names_map.(["bob cratchit", "pete ross", "diana prince", "tara chase"])
# => ["Bob", "Pete", "Diana", "Tara"]

Even better is that compose can work on any function (Proc or lambda) that takes a single argument.

Such as a Fahrenheit to Celsius function that operates against a single value instead of a list.

f_to_c = Sequence.compose.([minus_32, five_ninths])
# => #<Proc:0x007fea8294c4c8 (lambda)>
f_to_c.(212)
# => 100
f_to_c.(32)
# => 0
Sequence.map.(f_to_c, [0, 32, 100, 212])
# => [-18, 0, 37, 100]

Next week well will look at some other properties of our functions and show how compose can potentially help us in those cases as well.

–Proctor

Erlang Thursday – ETS, match_specs, and functions

In last week’s Erlang Thursday I concluded with showing how we can take advantage of using ets:select but take advantage of making our queries more expressive.

First we will need a new ETS table, so we start with a new public “Products” table, and do our standard of creating a new process and giving ownership of the table away.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
Products = ets:new(products, [public]).
% 16402
ets:give_away(Products, SomeProcess, []).
% true

Next we will load our “Products” into the table.

In our case, we are just creating a “product” with the “name” as a binary and a “price in CWC” (Common World Currency) as an integer.

[[ ets:insert(Products, {integer_to_binary(X), X}) || X <- lists:seq(1, 100) ]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true|...]]

As we saw last week, we can manually build up a list of tuples into the match_spec() to run our query, say for items less than 10 CWCs.

ets:select(Products, [{{'$1', '$2'}, [{'<', '$2', 10}], ['$1']}]).
% [<<"8">>,<<"6">>,<<"5">>,<<"3">>,<<"7">>,<<"1">>,<<"4">>,
%  <<"9">>,<<"2">>]

We can also find those item names that are more than 10 CWCs and less than 25 CWCS.

ets:select(Products, [{{'$1', '$2'}, [{'>', '$2', 10}, {'<', '$2', 25}], ['$1']}]).
% [<<"11">>,<<"15">>,<<"23">>,<<"20">>,<<"21">>,<<"14">>,
%  <<"12">>,<<"13">>,<<"16">>,<<"19">>,<<"17">>,<<"18">>,
%  <<"22">>,<<"24">>]

But this isn’t necessarily clear, as we are using numerical values for the items in the tuple, and lists of tuples with lists of tuples inside them.

Enter ets:fun2ms/1 to the rescue.

ets:fun2ms/1 will take a function and will turn that function into a match_spec().

This allows us to write a function that takes a tuple of Product and Cost, and will return the Product if the Cost is less than 10.

ets:fun2ms(fun({Product, Cost}) when Cost < 10 -> Product end).
% [{{'$1','$2'},[{'<','$2',10}],['$1']}]

We can also have compound checks in our guard clauses on the functions we pass to ets:fun2ms/1, such as and clauses,

Between_25_And_35_CWC = ets:fun2ms(fun({Product, Cost}) when Cost > 25, Cost < 35 -> Product end).
% [{{'$1','$2'},[{'>','$2',25},{'<','$2',35}],['$1']}]
ets:select(Products, Between_25_And_35_CWC).
% [<<"30">>,<<"33">>,<<"32">>,<<"29">>,<<"28">>,<<"26">>,
%  <<"34">>,<<"27">>,<<"31">>]

as well as or style clauses.

While this is useful it does have its limitations, as this parse transform on the function, so you can’t use everything that you would be able to with a normal function.

ets:fun2ms(fun({Product, Cost}) when Cost > 90 -> lists:reverse(binary:bin_to_list(Product)) end).
# Error: Unknown error code {122,lists,reverse}
# {error,transform_error}

But then again, the results part of the match_spec(), doesn’t support advanced functionality anyways.

ets:select(Products, [{{'$1', '$2'}, [{'<', 90, '$2'}], [binary:bin_to_list('$1')]}]).
# ** exception error: bad argument
#      in function  binary:bin_to_list/1
#         called as binary:bin_to_list('$1')

But even with its limitations, ets:fun2ms/1 still does a good job to help make our ETS queries more expressive. Not only can we reference a function with expressive variable names over just $X, as well as give guard clauses instead of just guard tuples, but we can also use those variable names in our results as well, and do the basic formatting as part of the function.

And make sure to check back in next week as we continue with looking at the different versions of ets:select.

–Proctor