Deeper Down the Custom Built Keyboard Rabbit Hole

In my last post, Starting Towards A Custom Built Keyboard, I left off pining over the idea of an ErgoDox: a split keyboard with fully programmable key mappings.

A new co-worker and some Obsessive Compulsive Disorder

Not long after starting a new job, a co-worker was intrigued by my Das Keyboard, but couldn’t touch-type, so I let him borrow my Code Keyboard with Cherry clear switches.

After an extended weekend with it, he was determined to learn to touch-type so he could try out my Das Keyboard with its Cherry blue switches.

It was at this point that his (officially diagnosed) Obsessive Compulsive Disorder kicked in.

He dove in and learned to touch-type, and soon he was ready to try out the unlabeled keys of the Das Keyboard which had been “taunting him” from my desk at work.

After just a few days with the Code, and then the Das Keyboard, he fell (back) in love with mechanical keyboards.

Of course, since he was learning to appreciate touch-typing, instead of just hunting-and-pecking, I introduced him to Steve Losh’s Modern Space Cadet post.

Due to his Obsessive Compulsive Disorder, combined with his love of the hunt at thrift stores, he found some older keyboards like the IBM Model M, and an old cop car keyboard by TG3.

As the old IBM Model Ms don't support USB, he found the TMK firmware project by Hasu and assembled some converters that allow him to use his older keyboards with his current computers.

If you are curious about his journey down the rabbit hole, he has started to document it on his blog at www.clintgeek.com.

The Ultimate Hacking Keyboard

Meanwhile, I was still on the hunt for a split ergonomic programmable keyboard.

In August of 2015 I stumbled across a pre-announcement of the Ultimate Hacking Keyboard, and after listening to Laci on The Changelog in December of 2015, I was intrigued enough to pre-order a board.

My thought was that it would net me a programmable keyboard as well as a split keyboard without my having to assemble it, and if it turned out I wasn't a fan, I could likely sell it to someone who missed the pre-order.

To top it off, the announcement outlined that the keyboard would have magnets to hold it together in a non-split mode.

Being uncertain whether I would even like a split keyboard, I saw reduced risk in the purchase: if I tried the split mode and didn't like it, I could always fall back to non-split mode, yet still have something programmable.

And of course, hardware being what it is, the project has understandably been delayed a number of times as it progressed, but as of the last update the keyboard should hopefully be arriving this August (2017).

Sierra and Karabiner

While awaiting the Ultimate Hacking Keyboard, and watching my co-worker dive deeper into customizing his keyboards, I was sitting in a holding pattern before upgrading to macOS Sierra, as upgrading meant incompatibility with Karabiner due to changes in the operating system.

Karabiner-Elements was eventually announced as the replacement for Seil and Karabiner, but it didn't provide the full extensibility of Karabiner, and I wasn't going to lose the "Hyper key" on my keyboard without a battle.

Because of this, my co-worker was busy trying to sell me on the idea of building converters for my keyboards myself, but he kept un-selling himself on the idea, as "a USB-to-USB protocol wasn't quite supported".

The “sales” cycle continued in this manner until he finally mentioned:

“Now, if your keyboards supported the PS/2 protocol, that would be very easy.”

Getting ready to make the dive

With those magic words, he convinced me to build a set of converters for my keyboards, as building the converters would mean:

  • I could specify the key mappings I want at the firmware level in the converter, and not have to worry about operating system compatibility of key-remapping software;
  • I could bring my keyboard to someone else's desk to pair, plug it in, and have it work as I am used to;
  • I could remove the converter so that someone can use my computer with the standard key mappings they are used to;
  • or I could even plug someone else's keyboard into my converter to get my key mappings.

Having previously ordered a nicer soldering iron than the $3 Harbor Freight one I had, and having placed an order with Amazon for:

I was ready to start on building some converters for my keyboards.

Starting Towards A Custom Built Keyboard

As someone who works with computers and uses a keyboard nearly every day, often for 10-12+ hours, having a great keyboard that fits you is right up there with a great bed, a really nice monitor, and a good desk environment.

Distant Early Warning

My fascination with keyboards goes back a number of years: whenever I would go into Best Buy, CompUSA, or some other computer store, I would make sure to try out the keyboards they had on display.

Given all that I am about to lay out, in retrospect I knew there was something important about a good keyboard, but I wasn't yet cognizant of what a good keyboard even was, usually settling on nice rubber-dome Logitech keyboards.

Enter Das Keyboard

This all started to change around the end of 2009, when Das Keyboard's Ultimate Model S was announced.

I was a touch typist, mostly. I specify mostly because I would touch-type, but then let my eyes wander from the screen until they eventually wound up looking at the keyboard, and my typing pace would slow to a crawl.

Seeing that there was a supposedly really nice keyboard at a reasonable price, given the expected lifetime of a good keyboard, I cashed in my “Christmas Credit”, and asked for only one present, a Das Keyboard Ultimate Model S.

After some selling on the idea (professionals invest in professional-grade tools; I wanted the blank keycaps to force myself not to let my eyes wander, because there would be nothing to watch; and it should last me years of use, if not decades), on Christmas morning I opened it up and started click-clacking away.

Jeff Atwood, Steve Losh, and a co-worker named Pete Young

Then, about 4 years ago, while I was following Jeff Atwood's (a.k.a. CodingHorror) blog posts, he announced the Code Keyboard.

After hearing how nice the Cherry MX Clears are if you like the Blues (which were the key switches in the Das Model S that I had), and catching a first glimpse of customizability in the Code's DIP switches for Mac/Windows mode, QWERTY/Dvorak/Colemak layouts, and other options, I was able to get an order in on the second round for a board with Clears.

While we were waiting, my co-worker Pete, who was also excited about keyboards and further down the path than I was, started bringing in his various keyboards, and we would exchange them to see how different boards felt after a day of usage, getting an initial feel for what we liked.

And I believe it was Pete (and if it was actually I who shared it with him, he will still get the credit) who showed me Steve Losh's A Modern Space Cadet post, and I started down the path of customizing my keyboard layout as outlined in the post.

Pining over the ErgoDox

At some point I came across the ErgoDox online. I don't remember where it was originally, but it put the Atreus and a few other "travel" keyboard variants on my radar.

Somewhere in this time range, I also heard someone say that, for a software developer, building and customizing your own keyboard is akin to a Jedi building their own lightsaber.

Truth be told, I was intimidated. The cost of the parts, and the fact that I would have to do it myself, made it both a dream and very intimidating.

Intimidating, as my knowledge of soldering was limited to:
– that there was a soldering iron (which wasn't spelled how it sounded to my ears),
– flux (not to be confused with the capacitor),
– the solder itself,
– that it was all used for connecting electrical "stuff" together,
– and that a soldering iron could easily be confused with a very badly operating wood burner.

But the ErgoDox still remains the dream, and that dream is getting closer to being reality.

In the next few posts, I will outline how I have since started making faster progress to the dream of a fully custom built keyboard and having my own perfect keyboard.

Ruby Tuesday – composing filter

As mentioned in the last Ruby Tuesday on using compose and map, I ended with saying that we would take a look at how we would take advantage of composition for our filter.

As a refresher, here is our Sequence module.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

So let's start taking a look at how we can compose our filter lambdas together, by seeing if we can find all numbers that are multiples of three, and then multiples of five.

multiple_of_three = ->(x) { x % 3 == 0}
# => #<Proc:0x007faeb28926b0@(pry):1 (lambda)>
multiple_of_five = ->(x) { x % 5 == 0}
# => #<Proc:0x007faeb3866c58@(pry):2 (lambda)>
Sequence.compose.([Sequence.filter.(multiple_of_three),
                   Sequence.filter.(multiple_of_five)]).((1..100))
# => [15, 30, 45, 60, 75, 90]

The catch again is that we are traversing the first list for all 100 elements to find those items that are multiples of 3, and then filtering the filtered list to find only those that are multiples of 5.
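To make the two-pass cost concrete, here is a small sketch instrumenting the predicates with call counters. The counters are hypothetical additions for illustration, and plain `Enumerable#select` stands in for `Sequence.filter` (the traversal behavior is the same).

```ruby
# Count how many times each predicate runs when we chain two filters.
calls = Hash.new(0)
count_three = ->(x) { calls[:three] += 1; x % 3 == 0 }
count_five  = ->(x) { calls[:five]  += 1; x % 5 == 0 }

step_one = (1..100).select { |x| count_three.(x) }
step_two = step_one.select { |x| count_five.(x) }

step_two      # => [15, 30, 45, 60, 75, 90]
calls[:three] # => 100 (every element checked)
calls[:five]  # => 33 (once per multiple of three)
```

So the second filter only walks the 33 survivors, but we still traverse two lists in total.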

What if we could do this in one pass, and compose the predicates together?

And since we want to test this to make sure we get a good result, we will compose those predicates together and just test our composed function with the number 15.

three_and_five = Sequence.compose.([multiple_of_three, multiple_of_five])
# => #<Proc:0x007faeb4905d20 (lambda)>
three_and_five.(15)
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

It fails with a NoMethodError telling us that we can’t use the method % on an object of type TrueClass.

Don’t we want to get back a true instead of an error though?

If we roll back the compose into its nested-function-call version, it starts to become clear where the error is coming from.

multiple_of_five.(multiple_of_three.(15))
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

So what is happening is that we get back true from calling multiple_of_three with 15, and we then pass that true into multiple_of_five, when what we wanted to do was to pass in the 15 and make sure it was a multiple of 5 as well.

What we really want is to evaluate every lambda with the value of 15, and then see if all of the predicate functions succeed in their checks.

So we will start with a very naive, and “un-curried” version, to prove out our concept.

Our first pass will map over each predicate and invoke it with the value we want to test, resulting in a list of booleans indicating whether each check succeeded or failed. We then reduce those booleans together via an and operation to get an overall success status.

def my_all_succeed_naive(predicates, value)
  check_results = Sequence.map.(->(f) {f.(value)}, predicates)
  Sequence.reduce.(true, ->(accum, item) {accum && item}, check_results)
end

my_all_succeed_naive([multiple_of_three, multiple_of_five], 15)
# => true
my_all_succeed_naive([multiple_of_three, multiple_of_five], 14)
# => false
my_all_succeed_naive([multiple_of_three, multiple_of_five], 5)
# => false

It looks like this works, as we see that 15 is a multiple of 3 and a multiple of 5, but we still check whether the item is a multiple of 5 even when the multiple-of-3 check failed.

Could we do better, and short circuit our evaluation if we get a false?

Let’s try.

def my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

So let’s take a look at the difference between the two, just to make sure we are on the right track.

First we will create a “long running predicate check” to add to our chain.

pass_after = ->(x, value) { sleep(x); true }.curry
# => #<Proc:0x007faeb310d498 (lambda)>

Then we will time it using Benchmark.measure (don't forget to require 'benchmark' first though). First with a success, and then with a failure against the first predicate.

Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 15)
end
# => #<Benchmark::Tms:0x007faeb28f24c0
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0046792929642834,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>


Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb6018c88
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=2.5073997676372528e-05,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

We can see that it takes three seconds to run if we succeed, but only a fraction of a second if we fail on the first check. And just to make sure our earlier assumptions were correct, we will benchmark the same predicate list against the naive version with the value of 14, so it will fail on the first check of a multiple of three.

Benchmark.measure do
  my_all_succeed_naive([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb487d218
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0028793679666705,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

And it does indeed take just over 3 seconds to complete.

So let’s add this to our Sequence module, and get it to be able to be curried.


def self.my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

@@all_succeed = method(:my_all_succeed).curry

def self.all_succeed
  @@all_succeed
end

And we check that we can use it off our Sequence module partially applied.

Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(14)
# => false
Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(15)
# => true

So now we can get back to how we would use this with filter by using our new Sequence::all_succeed.

three_and_five_multiple = Sequence.all_succeed.([multiple_of_three, multiple_of_five])
# => #<Proc:0x007faeb28685e0 (lambda)>
Sequence.filter.(three_and_five_multiple).((1..100))
# => [15, 30, 45, 60, 75, 90]

And there we go, we have now composed our predicates into one function which we can then pass to Sequence::filter and only have to walk through the list once.
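As a quick sanity check on that "only once" claim, here is a sketch counting predicate invocations for the combined predicate. The counter is a hypothetical addition for illustration, and plain `Enumerable#select` stands in for `Sequence.filter`.

```ruby
# Count invocations of the combined predicate during one filter pass.
checks = 0
three_and_five = ->(x) { checks += 1; x % 3 == 0 && x % 5 == 0 }

result = (1..100).select { |x| three_and_five.(x) }

result # => [15, 30, 45, 60, 75, 90]
checks # => 100 (one visit per element, versus 133 for the chained filters)
```

The short-circuiting && inside the combined predicate also means the multiple-of-5 check only runs for elements that already passed the multiple-of-3 check.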

As we have seen how to compose predicates together for an “and” style composition, next week we will look at building an “or” style composition of predicates.

–Proctor

Erlang Thursday – ETS selects, continuations, and concurrent inserts

At the end of last week’s Erlang Thursday, I said we would continue looking at the behavior of the select functions in the ets module.

So before we do any experimentation, we set up our test ETS tables, and this time we will also create a table of type ordered_set.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
TestOrderedSetTable = ets:new(ordered_set_table, [public, ordered_set]).
% 16402
TestTable = ets:new(ets_table, [public]).
% 20499
ets:give_away(TestTable, SomeProcess, []).
% true
ets:give_away(TestOrderedSetTable, SomeProcess, []).
% true

Next we will load our test ETS table with some dummy data, leaving some gaps in the sequence, allowing us to fill those gaps in later.

[[ets:insert(TestTable, {X, X}) || X <- lists:seq(1, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]
[[ets:insert(TestOrderedSetTable, {X, X}) || X <- lists:seq(1, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]

We then do a select to get all of the records from the table so we can see how the results are ordered for the different table types.

ets:select(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{15,15},
%  {25,25},
%  {13,13},
%  {21,21},
%  {11,11},
%  {1,1},
%  {23,23},
%  {7,7},
%  {3,3},
%  {9,9},
%  {19,19},
%  {29,29},
%  {27,27},
%  {17,17},
%  {5,5}]
ets:select(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{1,1},
%  {3,3},
%  {5,5},
%  {7,7},
%  {9,9},
%  {11,11},
%  {13,13},
%  {15,15},
%  {17,17},
%  {19,19},
%  {21,21},
%  {23,23},
%  {25,25},
%  {27,27},
%  {29,29}]

The ets module also has a function ets:select_reverse, so let’s take a quick stop and see what that does for our ETS tables.

ets:select_reverse(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{15,15},
%  {25,25},
%  {13,13},
%  {21,21},
%  {11,11},
%  {1,1},
%  {23,23},
%  {7,7},
%  {3,3},
%  {9,9},
%  {19,19},
%  {29,29},
%  {27,27},
%  {17,17},
%  {5,5}]
ets:select_reverse(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}]).
% [{29,29},
%  {27,27},
%  {25,25},
%  {23,23},
%  {21,21},
%  {19,19},
%  {17,17},
%  {15,15},
%  {13,13},
%  {11,11},
%  {9,9},
%  {7,7},
%  {5,5},
%  {3,3},
%  {1,1}]

If we look at the results of ets:select/2 and ets:select_reverse/2, we see that for TestTable we get the same result, and for TestOrderedSetTable we get the results in reverse order, which is what the documentation for ets:select_reverse/2 states. This makes sense if you think about it, as a table of type set has no defined ordering to reverse.

With that brief diversion out of the way, let's run our same match_spec()s from above, but limit the results to 5 records so we get a continuation back.

{Result, Continuation} = ets:select(TestTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}], 5).
% {[{19,19},{29,29},{27,27},{17,17},{5,5}],
% {20499,214,5,<<>>,[],0}}
{OrdSetResult, OrdSetContinuation} = ets:select(TestOrderedSetTable, [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}], 5).
% {[{1,1},{3,3},{5,5},{7,7},{9,9}],{16402,9,[],5,<<>>,[],0,0}}

And with those continuations, we will see what the next results we would fetch would be.

ets:select(Continuation).
% {[{1,1},{23,23},{7,7},{3,3},{9,9}],{20499,111,5,<<>>,[],0}}
ets:select(OrdSetContinuation).
% {[{11,11},{13,13},{15,15},{17,17},{19,19}],
%  {16402,19,[],5,<<>>,[],0,0}}

Remember those “gaps” we left in our sequence of numbers we used to create tuples?

Time to “fill in” those gaps of the sequence to see what happens if we fetch with our existing continuation as data gets populated concurrently.

[[ets:insert(TestOrderedSetTable, {X, X}) || X <- lists:seq(2, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]
[[ets:insert(TestTable, {X, X}) || X <- lists:seq(2, 30, 2)]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true]]

Now we re-run our ets:select/1 functions with the same continuations as before.

ets:select(Continuation).
% {[{12,12},{7,7},{3,3},{10,10},{9,9}],
%  {20499,224,5,<<>>,[],0}}
ets:select(OrdSetContinuation).
% {[{10,10},{11,11},{12,12},{13,13},{14,14}],
%  {16402,14,[],5,<<>>,[],0,0}}

If we compare that to before, we can see that we now have even-numbered items in the list. For our TestTable, if we look above at the Continuation value itself, we have 214 as the continuation point, since that is the only thing that changed between that continuation and the continuations resulting from calling ets:select(Continuation). So with just a number, it is hard to infer just how we might expect the continuation to change.

The OrdSetContinuation on the other hand, has a 9 as its second element in the tuple, after the ETS table id of 16402. This also happens to be the key of the last tuple in the result set, which matches up with the 19 and 14 in the other continuations. So in the case of the ordered set, we can infer that as part of the continuation for an ETS table of type ordered_set, the continuation tells us the specific key of the last record that was returned, and we continue from that record regardless of any concurrent inserts that may have taken place.

Next time we will take a look at ets:is_compiled_ms/1 and how match specs might play in with continuations based off reading the documentation about ets:is_compiled_ms/1.

–Proctor

Ruby Tuesday – compose and map

As mentioned last week, now that we have our compose function, we will take a look at some of the properties we get when using our map and compose functions together.

Here is our Sequence module as we left it, with compose added to it.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

Like last week we have a list of names, and our goal is to get the list of “first” names back and have them be capitalized.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]
# => ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]

So we start with our base functions that we are going to use in our map calls.

split = ->(delimiter, str) {str.split(delimiter)}.curry
# => #<Proc:0x007ffdfa203a00 (lambda)>
whitespace_split = split.(" ")
# => #<Proc:0x007ffdfa8391e0 (lambda)>
first = ->(xs){ xs[0] }.curry
# => #<Proc:0x007ffdfa2a26a0 (lambda)>
capitalize = ->(s) { s.capitalize }
# => #<Proc:0x007ffdfa210ed0@(pry):13 (lambda)>

And we create our different instances of map calls, which are partially applied map functions with the appropriate lambda passed to them.

name_parts = Sequence.map.(whitespace_split)
# => #<Proc:0x007ffdfa1a0248 (lambda)>
firsts = Sequence.map.(first)
# => #<Proc:0x007ffdfa169568 (lambda)>
capitalize_all = Sequence.map.(capitalize)
# => #<Proc:0x007ffdfa86a1f0 (lambda)>

And as we saw last week, we can nest the function calls together,

capitalize_all.(firsts.(name_parts.(names)))
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

or we can use compose to create a “pipeline” of function calls.

capitalized_first_names = Sequence.compose.([name_parts, firsts, capitalize_all])
# => #<Proc:0x007ffdfa1c1c18 (lambda)>
capitalized_first_names.(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Here’s where things can start to get interesting.

In our capitalized first names example, we go through the list once per transformation we want to apply.

First we transform the list of names into a list of split names, which we transform into a list of only the first items from the source list, and then finally we transform that into a list of the capitalized names.

That could be a lot of processing if we had more transformations, and/or a much longer list. This seems like a lot of work.
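To see just how much traversal that is, here is a sketch counting element visits across the three passes. The visit counter is a hypothetical addition for illustration, and plain `Enumerable#map` stands in for `Sequence.map`.

```ruby
# Count element visits across three chained map passes.
visits = 0
names = ["jane doe", "john doe", "arthur dent"]

split_all   = names.map       { |n| visits += 1; n.split(" ") }
firsts_only = split_all.map   { |parts| visits += 1; parts[0] }
capitalized = firsts_only.map { |s| visits += 1; s.capitalize }

capitalized # => ["Jane", "John", "Arthur"]
visits      # => 9 (three passes over three elements)
```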

Let’s look at it from near the other extreme: the case where we only have one name in our list.

capitalized_first_names.(['tARa cHAsE'])
# => ["Tara"]

In this case, the fact that we have a list at all is almost incidental.

For a list of only one item, we split the string, get the first element, and capitalize that value.

So what if we did that for just a single string, not even in a list of some sort?

capitalize_first_name = Sequence.compose.([whitespace_split, first, capitalize])
# => #<Proc:0x007ffdfa121100 (lambda)>
capitalize_first_name.("tARa cHAsE")
# => "Tara"

We can compose each of these individual operations together to get a function that will transform a name into the result we want.

And since we have a “transformation” function, we can pass that function to our map function for a given list of names.

Sequence.map.(capitalize_first_name, names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Lo and behold, we get the same results as above for when we composed our map function calls.

Sequence.compose.([Sequence.map.(whitespace_split),
                   Sequence.map.(first),
                   Sequence.map.(capitalize)]
                 ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]
Sequence.map.(Sequence.compose.([whitespace_split,
                                 first,
                                 capitalize])
             ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

This leads us to the property that the composition of the map of function f and the map of function g is equivalent to the map of the composition of functions f and g.

(map f ∘ map g) list = map (f ∘ g) list

Where the circle symbol (∘) represents function composition expressed in mathematical notation.

Because of this, we can traverse the sequence via map only once, performing the composed transformation on each item as we encounter it, without having to revisit it again.
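The fused version of the earlier visit-counting sketch bears this out: one pass, with the composed transformation applied per element. As before, the counter is a hypothetical addition and plain `Enumerable#map` stands in for `Sequence.map`.

```ruby
# One traversal, applying the whole composed transformation per element.
visits = 0
names = ["jane doe", "john doe", "arthur dent"]

fused = names.map { |n| visits += 1; n.split(" ")[0].capitalize }

fused  # => ["Jane", "John", "Arthur"]
visits # => 3 (one pass over three elements)
```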

Next time we will take a look at how we can do the same kind of operation with filter, as we can't just pipeline the results of one filter through to another, since a filter's predicate returns a boolean value, which is not what we would want to feed through to the next predicate.

–Proctor

Erlang Thursday – Using ETS select with a limit

In last week’s Erlang Thursday we continued to explore ets:select/2, seeing its use when combined with ets:fun2ms to generate the match_spec()s.

This week we will take a look at the other versions of select that the ets module provides.

Yet again we will do our new playground ETS table setup, so that if we crash our shell session we don't lose the table.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
TestTable = ets:new(ets_table, [public]).
% 16402
ets:give_away(TestTable, SomeProcess, []).
% true

Next we will load our test ETS table with a bunch of test “products”. For ease of example, we will just use a number for the product id, and a random price ending in .99.

[[ets:insert(TestTable, {ProductId, random:uniform(100) + 0.99})
  || ProductId <- lists:seq(1, 10000) ]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true|...]]

We will create a match_spec() to find items in their twenties (and we will go ahead and round 19.99 up to 20 just because).

ProductsInTheTwenties = ets:fun2ms(fun({Product, Price})
                                     when Price >= 19.99 andalso Price < 30
                                     -> {Product, Price}
                                   end).
% [{{'$1','$2'},
%   [{'andalso',{'>=','$2',19.99},{'<','$2',30}}],
%   [{{'$1','$2'}}]}]

And if we use ets:select/2 against our table with this match spec, we get all of the results back in one query as we saw previously.

ets:select(TestTable, ProductsInTheTwenties).
% [{4351,29.99},
%  {635,19.99},
%  {6005,20.99},
%  {3742,27.99},
%  {5956,29.99},
%  {3753,28.99},
%  {6653,25.99},
%  {5151,28.99},
%  {2693,27.99},
%  {4253,21.99},
%  {7636,23.99},
%  {1935,19.99},
%  {9044,22.99},
%  {7797,22.99},
%  {2147,23.99},
%  {2574,26.99},
%  {7575,29.99},
%  {2130,28.99},
%  {4908,27.99},
%  {2218,22.99},
%  {9848,21.99},
%  {7632,26.99},
%  {3562,21.99},
%  {3130,27.99},
%  {575,26.99},
%  {4622,28.99},
%  {5678,25.99},
%  {4022,...},
%  {...}|...]

But the ets module also gives us a way to limit the results if we would like, using ets:select/3 and giving a limit of the number of results to return at a time.

So let’s use ets:select/3 and give a limit of 10 and see what happens.

ets:select(TestTable, ProductsInTheTwenties, 10).
% {[{9027,27.99},
%   {7347,29.99},
%   {7282,20.99},
%   {9386,24.99},
%   {5415,25.99},
%   {4032,29.99},
%   {8105,25.99},
%   {4634,24.99},
%   {1275,20.99},
%   {234,20.99}],
%  {16402,576,10,<<>>,[],0}}

We get a tuple back instead of a list of results. The first item in the tuple is a list of the first ten results we specified; the second is a bizarre-looking tuple which, if we look at the documentation for ets:select/3, represents something referred to as a continuation.

So we run our query again, and bind the results this time.

{Results, Continuation} = ets:select(TestTable, ProductsInTheTwenties, 10).
% {[{9027,27.99},
%   {7347,29.99},
%   {7282,20.99},
%   {9386,24.99},
%   {5415,25.99},
%   {4032,29.99},
%   {8105,25.99},
%   {4634,24.99},
%   {1275,20.99},
%   {234,20.99}],
%  {16402,576,10,<<>>,[],0}}

So we have this continuation, but what is it, and what does it mean for us to have it?

In short, it can be thought of as an immutable bookmark. It represents not only what page we are in for our query results, but also the book we are reading (our query).

This allows us to quickly pick up where we previously left off in our results set by passing the continuation to ets:select/1.

ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And because it is our special immutable bookmark, every time we use that bookmark it takes us to the same starting point in the same book, and we only read the same number of maximum pages as originally set as our limit.

So no matter how many times we call ets:select/1 with our same continuation, we will get the same results each time.

ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}
ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}
ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And if we look at the resulting tuple, we see that we get a different tuple for our next continuation.

{SecondResults, SecondContinuation} = ets:select(Continuation).
% {[{2533,24.99},
%   {1357,22.99},
%   {564,21.99},
%   {9086,22.99},
%   {5265,25.99},
%   {4030,22.99},
%   {2802,25.99},
%   {8254,27.99},
%   {7088,26.99},
%   {3062,27.99}],
%  {16402,960,10,<<>>,[{6792,29.99},{9295,29.99}],2}}

And we can pick up that new continuation, and use that in our next call to ets:select/1 to get the next set of results, with another continuation.

ets:select(SecondContinuation).
% {[{8569,19.99},
%   {1805,28.99},
%   {6819,23.99},
%   {9313,28.99},
%   {9527,27.99},
%   {1737,29.99},
%   {700,26.99},
%   {142,25.99},
%   {6792,29.99},
%   {9295,29.99}],
%  {16402,513,10,<<>>,[],0}}

And if we have a query for which we have exhausted our results set, we get an '$end_of_table' atom.

ets:select(TestTable, [{{'$1', '$2'}, [{'<', '$2', 0}], ['$$']}], 10).
% '$end_of_table'

The ability to specify a limit and get a continuation is also available for match, via ets:match/3 and ets:match/1, and for match_object, via ets:match_object/3 and ets:match_object/1.

Next week, we will continue looking at the various select functions in ets: we will look into their behavior with an ordered set, compare select and select_reverse, and see how continuations behave when new entries get inserted into the results while we are using a continuation.
–Proctor

Ruby Tuesday – Refactoring towards compose

We have seen filter, map, and reduce, as well as partial application and updating those functions to take advantage of it, so how much further can we go?

In this case, we will take a look at how we can chain the functions together to build up bigger building blocks out of their smaller components.

First as a reminder, here is our Sequence class that we have built up over the past few posts.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end
end

Suppose we want to get a list of capitalized “first” names from a list of names, and we have a bunch of smaller functions that handle the different parts of the transformation process that we can reuse.

It might look like the following.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris"]

name_parts_map = Sequence.map.(->(name) {name.split})
# => #<Proc:0x007fea82200638 (lambda)>
first_map = Sequence.map.(->(xs) {xs[0]})
#=> #<Proc:0x007fea82842780 (lambda)>
capitalize_map = Sequence.map.(->(s) {s.capitalize})
# => #<Proc:0x007fea82a05ce8 (lambda)>
initials_map = Sequence.map.(->(strings) {first_map.(strings)})
# => #<Proc:0x007fea82188638 (lambda)>

capitalize_map.(first_map.(name_parts_map.(names)))
# => ["Jane", "John", "Arthur", "Lori"]

And if we wanted to get a list of the initials as a list themselves, we might have something like this.

initials_map.(name_parts_map.(names))
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

And maybe somewhere else, we need to do some mathematical operations, like transform numbers into one less than their square.

square_map = Sequence.map.(->(i) {i*i})
# => #<Proc:0x007fea82183ef8 (lambda)>
dec_map = Sequence.map.(->(i) {i-1})
# => #<Proc:0x007fea82a37018 (lambda)>
dec_map.(square_map.([1,2,3,4,5]))
# => [0, 3, 8, 15, 24]

And yet another place, we have a calculation to turn Fahrenheit into Celsius.

minus_32 = ->(x) {x-32}
# => #<Proc:0x007fea8298d4c8@(pry):36 (lambda)>
minus_32_map = Sequence.map.(minus_32)
# => #<Proc:0x007fea82955488 (lambda)>
five_ninths = ->(x) {x*5/9}
# => #<Proc:0x007fea83836330@(pry):38 (lambda)>
five_ninths_map = Sequence.map.(five_ninths)
# => #<Proc:0x007fea821429f8 (lambda)>
five_ninths_map.(minus_32_map.([0, 32, 100, 212]))
# => [-18, 0, 37, 100]

Setting aside, for the moment, all of the Procs making up other Procs (if that is still foreign to you), there is a pattern in all of these examples: we compose a larger function out of a number of smaller functions. We take the result of calling one function with a value and feed it into the next function; lather, rinse, repeat.

Does this sound familiar?

This is our reduce.

We can use reduce to define a function compose that takes a list of functions as our items and an initial value, where the operation to apply calls the function in the current item against our accumulated value.

That’s a bit of a mouthful, so let’s look at it as code, and we will revisit that statement.

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@compose = method(:my_compose).curry

  def self.compose
    @@compose
  end

Now that we have the code to reference, let’s go back and inspect against what was described above.

First we create a lambda apply_fn, which will be our “reducing” function to apply to the accumulator and each item in the list, which in the case of my_compose is a list of functions to call.

apply_fn, like all our “reducing” functions so far, takes in an accumulator value (the result of the composed function calls so far) and the current item (the function to call). The new accumulator value is the result of applying that function with the accumulator as its argument.

We were able to build yet another function out of our reduce, but this time we operated on a list of functions as our values.

Let that sink in for a while.
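As a minimal standalone sketch (using plain Ruby lambdas and Enumerable#reduce rather than the Sequence module; the helper names here are just for illustration), we can see that reducing over a list of functions applies them left to right, so the first function in the list is the first one called:

```ruby
# A minimal compose built directly on Enumerable#reduce, for illustration.
# Each step feeds the accumulated value into the next function in the list.
compose = lambda do |functions, initial|
  functions.reduce(initial) { |accum, fn| fn.(accum) }
end

add_one = ->(x) { x + 1 }
double  = ->(x) { x * 2 }

compose.([add_one, double], 3)  # (3 + 1) * 2 => 8
compose.([double, add_one], 3)  # (3 * 2) + 1 => 7
```

Note the ordering: unlike mathematical function composition, which applies the rightmost function first, this reduce-based compose applies the functions in the order they appear in the list.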

So let’s see how we use that.

We will start with creating a composed function to map the Fahrenheit to Celsius conversion and see what different temperatures are in Celsius, including the past few days of highs and lows here at DFW airport.

f_to_c_map = Sequence.compose.([minus_32_map, five_ninths_map])
# => #<Proc:0x007fea82a1f9b8 (lambda)>
f_to_c_map.([0, 32, 100, 212])
# => [-18, 0, 37, 100]
dfw_highs_in_celsius = f_to_c_map.([66, 46, 55, 48, 64, 68])
# => [18, 7, 12, 8, 17, 20]
dfw_lows_in_celsius = f_to_c_map.([35, 27, 29, 35, 45, 40])
# => [1, -3, -2, 1, 7, 4]

And if we take a look at the initials above and compose the map calls together, we get the following.

get_initials_map = Sequence.compose.([name_parts_map, initials_map])
# => #<Proc:0x007fea82108ff0 (lambda)>
get_initials_map.(names)
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

Doing the same for our capitalized first names we get:

capitalized_first_names_map = Sequence.compose.([name_parts_map, first_map, capitalize_map])
# => #<Proc:0x007fea821d1108 (lambda)>
capitalized_first_names_map.(names)
# => ["Jane", "John", "Arthur", "Lori"]

By having our compose function, we can be more explicit that capitalized_first_names_map, along with the rest of the examples, is just a composition of smaller functions that have been assembled together into a data transformation pipeline.

They don’t have any logic other than being the result of chaining the other functions together to get some intended behavior.

Not only that, but we can now nicely reuse our capitalized_first_names_map mapping function against other lists of names, since it can be partially applied as well.

capitalized_first_names_map.(["bob cratchit", "pete ross", "diana prince", "tara chase"])
# => ["Bob", "Pete", "Diana", "Tara"]

Even better is that compose can work on any function (Proc or lambda) that takes a single argument.

Such as a Fahrenheit to Celsius function that operates against a single value instead of a list.

f_to_c = Sequence.compose.([minus_32, five_ninths])
# => #<Proc:0x007fea8294c4c8 (lambda)>
f_to_c.(212)
# => 100
f_to_c.(32)
# => 0
Sequence.map.(f_to_c, [0, 32, 100, 212])
# => [-18, 0, 37, 100]

Next week we will look at some other properties of our functions and show how compose can potentially help us in those cases as well.

–Proctor

Erlang Thursday – ETS, match_specs, and functions

In last week’s Erlang Thursday I concluded by promising to show how we can take advantage of ets:select while making our queries more expressive.

First we will need a new ETS table, so we start with a new public “Products” table, and follow our standard practice of creating a new process and giving ownership of the table away.

Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.52.0>
Products = ets:new(products, [public]).
% 16402
ets:give_away(Products, SomeProcess, []).
% true

Next we will load our “Products” into the table.

In our case, we are just creating a “product” with the “name” as a binary and a “price in CWC” (Common World Currency) as an integer.

[[ ets:insert(Products, {integer_to_binary(X), X}) || X <- lists:seq(1, 100) ]].
% [[true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true,true,true,true,true,true,true,true,
%   true,true,true,true,true|...]]

As we saw last week, we can manually build up a match_spec() as a list of tuples to run our query, say for items less than 10 CWCs.

ets:select(Products, [{{'$1', '$2'}, [{'<', '$2', 10}], ['$1']}]).
% [<<"8">>,<<"6">>,<<"5">>,<<"3">>,<<"7">>,<<"1">>,<<"4">>,
%  <<"9">>,<<"2">>]

We can also find those item names that are more than 10 CWCs and less than 25 CWCs.

ets:select(Products, [{{'$1', '$2'}, [{'>', '$2', 10}, {'<', '$2', 25}], ['$1']}]).
% [<<"11">>,<<"15">>,<<"23">>,<<"20">>,<<"21">>,<<"14">>,
%  <<"12">>,<<"13">>,<<"16">>,<<"19">>,<<"17">>,<<"18">>,
%  <<"22">>,<<"24">>]

But this isn’t necessarily clear, as we are using numbered variables for the items in the tuple, and lists of tuples with lists of tuples inside them.

Enter ets:fun2ms/1 to the rescue.

ets:fun2ms/1 will take a function and will turn that function into a match_spec().

This allows us to write a function that takes a tuple of Product and Cost, and will return the Product if the Cost is less than 10.

ets:fun2ms(fun({Product, Cost}) when Cost < 10 -> Product end).
% [{{'$1','$2'},[{'<','$2',10}],['$1']}]

We can also have compound checks in our guard clauses on the functions we pass to ets:fun2ms/1, such as and clauses,

Between_25_And_35_CWC = ets:fun2ms(fun({Product, Cost}) when Cost > 25, Cost < 35 -> Product end).
% [{{'$1','$2'},[{'>','$2',25},{'<','$2',35}],['$1']}]
ets:select(Products, Between_25_And_35_CWC).
% [<<"30">>,<<"33">>,<<"32">>,<<"29">>,<<"28">>,<<"26">>,
%  <<"34">>,<<"27">>,<<"31">>]

as well as or style clauses.

While this is useful, it does have its limitations, as this is a parse transform on the function, so you can’t use everything that you would be able to in a normal function.

ets:fun2ms(fun({Product, Cost}) when Cost > 90 -> lists:reverse(binary:bin_to_list(Product)) end).
% Error: Unknown error code {122,lists,reverse}
% {error,transform_error}

But then again, the results part of the match_spec() doesn’t support advanced functionality anyway.

ets:select(Products, [{{'$1', '$2'}, [{'<', 90, '$2'}], [binary:bin_to_list('$1')]}]).
% ** exception error: bad argument
%      in function  binary:bin_to_list/1
%         called as binary:bin_to_list('$1')

But even with its limitations, ets:fun2ms/1 still does a good job of making our ETS queries more expressive. Not only can we write a function with expressive variable names instead of just $X, and guard clauses instead of guard tuples, but we can also use those variable names in our results and do basic formatting as part of the function.

And make sure to check back in next week as we continue with looking at the different versions of ets:select.

–Proctor

Ruby Tuesday – Partial Application of map, filter, and reduce

Now that we have covered how to get to a basic implementation of map, filter, and reduce in Ruby, as well as how to take advantage of Method#curry, we are going to see how we can get some extra power from our code by combining their use.

The Ruby Enumerable versions of map, reduce, and select operate against a specific object, such as an array of users.

class User
  def initialize(name:, active:)
    @name = name
    @active = active
  end

  def active?
    @active
  end

  def name
    @name
  end
end

users = [User.new(name: "johnny b. goode", active: true),
         User.new(name: "jasmine", active: true),
         User.new(name: "peter piper", active: false),
         User.new(name: "mary", active: true),
         User.new(name: "elizabeth", active: true),
         User.new(name: "jennifer", active: false)]


users.map{|user| user.name}
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]

users.select{|user| user.active?}
# => [#<User:0x007fa37a13eb68 @active=true, @name="johnny b. goode">,
#  #<User:0x007fa37a13eaa0 @active=true, @name="jasmine">,
#  #<User:0x007fa37a13e910 @active=true, @name="mary">,
#  #<User:0x007fa37a13e848 @active=true, @name="elizabeth">]

If we want the names of a different collection, we need to call map (and use the same block) directly on that collection as well; like if we had a collection of active Users.

users.select{|user| user.active?}.map{|user| user.name}
# => ["johnny b. goode", "jasmine", "mary", "elizabeth"]

This can be made a little more generic by having the methods get_user_names and get_active_users defined on User, but this still leaves us a bit shallow, so let’s see what else we can do based on what we have seen so far.

We will try it with our versions of map, filter, and reduce, and see how we can distill some of this logic and raise the level of abstraction higher to make it more generic.

module Sequence
  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end
end

And we look at how we call it using our Sequence module defined above.

Sequence.my_map(->(user) {user.name}, users)
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]
Sequence.my_filter(->(user) {user.active?}, users)
# => [#<User:0x007fa37a13eb68 @active=true, @name="johnny b. goode">,
#  #<User:0x007fa37a13eaa0 @active=true, @name="jasmine">,
#  #<User:0x007fa37a13e910 @active=true, @name="mary">,
#  #<User:0x007fa37a13e848 @active=true, @name="elizabeth">]

Granted, at this point, this is not a great improvement, if any, on its own.

BUT….

Since we take the collection to operate on as the last argument to our methods, we can combine our versions of my_map, my_filter, my_reduce with partial application to get a Proc that will do a specific operation against any Enumerable.
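To see why having the collection last matters, here is a small sketch (using a plain curried lambda; the names map_fn, double_all, and on_123 are just illustrative) contrasting data-last with data-first argument order:

```ruby
# Data-last: currying fixes the operation first, leaving a reusable
# Proc that takes any collection later.
map_fn = lambda { |f, items| items.map { |item| f.(item) } }.curry
double_all = map_fn.(->(x) { x * 2 })
double_all.([1, 2, 3])  # => [2, 4, 6]
double_all.([10, 20])   # => [20, 40]

# Data-first: currying would fix the collection instead, which is
# rarely the part we want to reuse.
map_data_first = lambda { |items, f| items.map { |item| f.(item) } }.curry
on_123 = map_data_first.([1, 2, 3])
on_123.(->(x) { x + 1 })  # => [2, 3, 4]
```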

Let’s see how this would work.

First, we will update our Sequence module to have a map, filter, and reduce that can be partially applied.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end
end

Next, with the ability to partially apply our versions of map, filter, and reduce, we can save these off to variables that we can invoke later, just passing in the Users enumerable we wish to operate against.

names = Sequence.map.(->(user) {user.name})
# => #<Proc:0x007f9c53155990 (lambda)>
names.(users)
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]
get_active = Sequence.filter.(->(user) {user.active?})
# => #<Proc:0x007f9c531af3f0 (lambda)>
get_active.(users)
# => [#<User:0x007f9c5224e960 @active=true, @name="johnny b. goode">,
#  #<User:0x007f9c5224e8c0 @active=true, @name="jasmine">,
#  #<User:0x007f9c5224e780 @active=true, @name="mary">,
#  #<User:0x007f9c5224e6e0 @active=true, @name="elizabeth">]

Or if we want to, we can use the Symbol#to_proc so we don’t have to define our lambda for checking if an item is active?.

get_active = Sequence.filter.(:active?.to_proc)
# => #<Proc:0x007f9c531e62d8 (lambda)>
get_active.(users)
# => [#<User:0x007f9c5224e960 @active=true, @name="johnny b. goode">,
#  #<User:0x007f9c5224e8c0 @active=true, @name="jasmine">,
#  #<User:0x007f9c5224e780 @active=true, @name="mary">,
#  #<User:0x007f9c5224e6e0 @active=true, @name="elizabeth">]

And now that we have our partial applied functions, we can also chain our calls together to get the names of active Users.

names.(get_active.(users))
# => ["johnny b. goode", "jasmine", "mary", "elizabeth"]

Not only that, but say we have some collections of Product objects,

class Product
  attr_reader :id, :name, :brand

  def initialize(id:, name:, active:, brand:)
    @id = id
    @name = name
    @active = active
    @brand = brand
  end

  def active?
    @active
  end
end

products = [Product.new(id: 0, name: "Prefect", active: false, brand: "Ford"),
            Product.new(id: 7, name: "SICP", active: true, brand: "MIT Press"),
            Product.new(id: 16, name: "HTDP", active: true, brand: "MIT Press"),
            Product.new(id: 17, name: "MRI", active: true, brand: "Ruby"),
            Product.new(id: 42, name: "HHGTTG", active: true, brand: "HHGTTG"),
            Product.new(id: 53, name: "Windows 3.1", active: false, brand: "Microsoft")]

Session objects,

class Session
  def initialize(name:, duration:)
    @name = name
    @duration = duration
  end

  def name
    @name
  end

  def active?
    @duration < 15
  end
end

sessions = [Session.new(name: "session A", duration: 3),
            Session.new(name: "session A", duration: 30),
            Session.new(name: "session A", duration: 17),
            Session.new(name: "session A", duration: 9),
            Session.new(name: "session A", duration: 1)]

and even SalesLead objects.

class SalesLead
  def initialize(lead:, active:)
    @lead = lead
    @active = active
  end

  def active?
    @active && @lead.active?
  end

  def name
    @lead.name
  end
end

leads = [SalesLead.new(lead: User.new(name: "lead 1", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 2", active: true), active: false),
         SalesLead.new(lead: User.new(name: "lead 3", active: false), active: true),
         SalesLead.new(lead: User.new(name: "lead 4", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 5", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 6", active: false), active: false)]

And because our Product class has the methods name and active?, we can use our names and get_active variables that hold our partially applied Procs against the list of Products,

names.(products)
# => ["Prefect", "SICP", "HTDP", "MRI", "HHGTTG", "Windows 3.1"]
get_active.(products)
# => [#<Product:0x007f9c530b7dd0 @active=true, @brand="MIT Press", @id=7, @name="SICP">,
#  #<Product:0x007f9c530b7ce0 @active=true, @brand="MIT Press", @id=16, @name="HTDP">,
#  #<Product:0x007f9c530b7bf0 @active=true, @brand="Ruby", @id=17, @name="MRI">,
#  #<Product:0x007f9c530b7ad8 @active=true, @brand="HHGTTG", @id=42, @name="HHGTTG">]
names.(get_active.(products))
# => ["SICP", "HTDP", "MRI", "HHGTTG"]

Sessions,

names.(sessions)
# => ["session A", "session A", "session A", "session A", "session A"]
get_active.(sessions)
# => [#<Session:0x007f9c52153560 @duration=3, @name="session A">,
#  #<Session:0x007f9c52152f98 @duration=9, @name="session A">,
#  #<Session:0x007f9c52152ca0 @duration=1, @name="session A">]
names.(get_active.(sessions))
# => ["session A", "session A", "session A"]

and SalesLeads,

names.(leads)
# => ["lead 1", "lead 2", "lead 3", "lead 4", "lead 5", "lead 6"]
get_active.(leads)
# => [#<SalesLead:0x007f9c5212a020
#   @active=true,
#   @lead=#<User:0x007f9c5212a160 @active=true, @name="lead 1">>,
#  #<SalesLead:0x007f9c521292d8
#   @active=true,
#   @lead=#<User:0x007f9c52129468 @active=true, @name="lead 4">>,
#  #<SalesLead:0x007f9c52128e00
#   @active=true,
#   @lead=#<User:0x007f9c52129080 @active=true, @name="lead 5">>]
names.(get_active.(leads))
# => ["lead 1", "lead 4", "lead 5"]

With this in mind, we will update the definition of our names and get_active to show that it is not just “users” it operates against, but any item.

names = Sequence.map.(->(x) {x.name})
# => #<Proc:0x007f9c53107f38 (lambda)>
get_active = Sequence.filter.(->(x) {x.active?})
# => #<Proc:0x007f9c530bfcd8 (lambda)>

So with this, we have taken our map and select, which in Ruby’s Enumerable class worked only against a specific collection, and, without redefining them and moving them into a method to live on User somewhere, we now have Procs that are applicable to anything whose items have the expected methods, and these can be defined and used anywhere.

–Proctor

Erlang Thursday – More ETS data matching (and querying)

In today’s Erlang Thursday we continue from last week in looking at getting data from ETS.

To refresh, we have a module markov_words, and for this week we have added a new function markov_words:create_word_triples/1.

-module(markov_words).

-export([create_word_pairs/1,
         create_word_triples/1]).

-spec create_word_pairs(string()) -> list({string(), string()}).
create_word_pairs(Text) ->
  Words = string:tokens(Text, " \t\n"),
  create_word_pairs([], Words).

-spec create_word_triples(string()) -> list({string(), string(), string()}).
create_word_triples(Text) ->
  Words = string:tokens(Text, " \t\n"),
  create_word_triples(Words, []).


create_word_pairs(WordPairs, [_Word|[]]) ->
    WordPairs;
create_word_pairs(WordPairs, [Word|Words]) ->
    [Following|_] = Words,
    UpdatedWordPairs = [{Word, Following} | WordPairs],
    create_word_pairs(UpdatedWordPairs, Words).


create_word_triples([_Word, _SecondWord | []], WordTriples) ->
    WordTriples;
create_word_triples([FirstWord | Words], WordTriples) ->
    [SecondWord, Following | _] = Words,
    UpdatedWordTriples = [{FirstWord, SecondWord, Following} | WordTriples],
    create_word_triples(Words, UpdatedWordTriples).

The excuse for having this new function is that it would allow us to get more refined Markov chains, picking the probability of the next word based on a state that is the compound key of the last two words seen.

With our module updated and defined, we get back to our Erlang shell to test things out, by compiling our module and loading up our intro text into a variable.

c(markov_words).
% {ok,markov_words}

ToTC = "It was the best of times, it was the worst of times,
it was the age of wisdom, it was the age of foolishness,
it was the epoch of belief, it was the epoch of incredulity,
it was the season of Light, it was the season of Darkness,
it was the spring of hope, it was the winter of despair,
we had everything before us, we had nothing before us,
we were all going direct to Heaven,
we were all going direct the other way--in short,
the period was so far like the present period,
that some of its noisiest authorities insisted on its
being received, for good or for evil, in the superlative
degree of comparison only.

There were a king with a large jaw and a queen with a
plain face, on the throne of England; there were a king
with a large jaw and a queen with a fair face,
on the throne of France. In both countries it was
clearer than crystal to the lords of the State preserves
of loaves and fishes, that things in general were
settled for ever.".

We create our fresh ETS table for this week, create a new process to own it, and give it away (in case we type something wrong and cause the current session of the shell to crash).

MarkovWords = ets:new(markov_word_tuples, [public, duplicate_bag]).
% 16402
Fun = fun() -> receive after infinity -> ok end end.
% #Fun<erl_eval.20.54118792>
SomeProcess = spawn(Fun).
% <0.58.0>
ets:give_away(MarkovWords, SomeProcess, []).
% true

This week, in addition to adding our word pair tuples to ETS, we will also add in our new word triple tuples to ETS in the same table.

[[ ets:insert(MarkovWords, WordPair) || WordPair <- markov_words:create_word_pairs(ToTC)]].

[[ ets:insert(MarkovWords, WordTriple) || WordTriple <- markov_words:create_word_triples(ToTC)]].

Since we have both word pairs and word triples in the same ETS table, we can see that with ets:match_object/2 we can specify a match_pattern() that matches only the two-tuples,

ets:match_object(MarkovWords, {"of", '$1'}).
% [{"of","loaves"},
%  {"of","the"},
%  {"of","France."},
%  {"of","England;"},
%  {"of","comparison"},
%  {"of","its"},
%  {"of","despair,"},
%  {"of","hope,"},
%  {"of","Darkness,"},
%  {"of","Light,"},
%  {"of","incredulity,"},
%  {"of","belief,"},
%  {"of","foolishness,"},
%  {"of","wisdom,"},
%  {"of","times,"},
%  {"of","times,"}]

or a match_pattern() that will only match the three-tuples.

ets:match_object(MarkovWords, {"of", '$1', '$2'}).
% [{"of","loaves","and"},
%  {"of","the","State"},
%  {"of","France.","In"},
%  {"of","England;","there"},
%  {"of","comparison","only."},
%  {"of","its","noisiest"},
%  {"of","despair,","we"},
%  {"of","hope,","it"},
%  {"of","Darkness,","it"},
%  {"of","Light,","it"},
%  {"of","incredulity,","it"},
%  {"of","belief,","it"},
%  {"of","foolishness,","it"},
%  {"of","wisdom,","it"},
%  {"of","times,","it"},
%  {"of","times,","it"}]

Whereas if we use ets:lookup/2 with the key, we get all items with that key, regardless of the tuple size.

ets:lookup(MarkovWords, "of").
% [{"of","loaves"},
%  {"of","the"},
%  {"of","France."},
%  {"of","England;"},
%  {"of","comparison"},
%  {"of","its"},
%  {"of","despair,"},
%  {"of","hope,"},
%  {"of","Darkness,"},
%  {"of","Light,"},
%  {"of","incredulity,"},
%  {"of","belief,"},
%  {"of","foolishness,"},
%  {"of","wisdom,"},
%  {"of","times,"},
%  {"of","times,"},
%  {"of","loaves","and"},
%  {"of","the","State"},
%  {"of","France.","In"},
%  {"of","England;","there"},
%  {"of","comparison","only."},
%  {"of","its","noisiest"},
%  {"of","despair,","we"},
%  {"of","hope,","it"},
%  {"of","Darkness,","it"},
%  {"of","Light,",[...]},
%  {"of",[...],...},
%  {[...],...},
%  {...}|...]

And unlike ets:lookup/2, with ets:match_object/2 we can match on any tuple element, and not just the key.

ets:match_object(MarkovWords, {'$1', "the", '$2'}).
% [{"on","the","throne"},
%  {"on","the","throne"},
%  {"direct","the","other"},
%  {"short,","the","period"},
%  {"like","the","present"},
%  {"of","the","State"},
%  {"to","the","lords"},
%  {"in","the","superlative"},
%  {"was","the","winter"},
%  {"was","the","spring"},
%  {"was","the","season"},
%  {"was","the","season"},
%  {"was","the","epoch"},
%  {"was","the","epoch"},
%  {"was","the","age"},
%  {"was","the","age"},
%  {"was","the","worst"},
%  {"was","the","best"}]

And like ets:match_object/2, ets:match/2 can match based off the tuple itself as well.

ets:match(MarkovWords, {"was", "the", '$1'}).
% [["winter"],
%  ["spring"],
%  ["season"],
%  ["season"],
%  ["epoch"],
%  ["epoch"],
%  ["age"],
%  ["age"],
%  ["worst"],
%  ["best"]]

But sometimes we might want finer-grained control over how the results are given back to us, such as a single list of items instead of a nested list of strings. Or maybe we even have some criteria that we want to hold true as part of our selections on the data.

Enter ets:select/2.

ets:select/2 takes the table as its first argument, and a match_spec() as its second argument.

The match_spec() is a list of three-tuples, where the first element is the match pattern, the second element is a list of guard clause tuples, and the last element is a term representation of the result for each match.

If we want to call ets:select/2 and have it align with ets:match/2 our call looks like the following.

ets:select(MarkovWords, [{{"was", "the", '$1'}, [], [['$1']]}]).
% [["winter"],
%  ["spring"],
%  ["season"],
%  ["season"],
%  ["epoch"],
%  ["epoch"],
%  ["age"],
%  ["age"],
%  ["worst"],
%  ["best"]]

The second argument is a match_spec() containing a single three-tuple, which consists of:
1) a match_pattern() of {"was", "the", '$1'}, which is the same thing we gave to ets:match/2,
2) [], an empty list of guard condition tuples, and
3) [['$1']] for the result term, which is the list of terms we want each result formatted as; in this case we want each result to be in its own list.

If we just wanted to get the words themselves as a list, we can update the result term part of the match_spec() to be ['$1'] instead.

ets:select(MarkovWords, [{{"was", "the", '$1'}, [], ['$1']}]).
% ["winter","spring","season","season","epoch","epoch","age",
%  "age","worst","best"]

If we wanted something that looked more like an ets:match_object/2 result set, we can use the result term of '$_', which signifies the whole object.

ets:select(MarkovWords, [{{"was", "the", '$1'}, [], ['$_']}]).
% [{"was","the","winter"},
%  {"was","the","spring"},
%  {"was","the","season"},
%  {"was","the","season"},
%  {"was","the","epoch"},
%  {"was","the","epoch"},
%  {"was","the","age"},
%  {"was","the","age"},
%  {"was","the","worst"},
%  {"was","the","best"}]

And if we want to match on only one of the items and capture the other items in the tuple, we can use the result of '$$', which returns all of the match variables in a list, ordered by variable number as opposed to position in the match_pattern().

ets:select(MarkovWords, [{{"was", '$1', '$2'}, [], ['$$']}]).
% [["clearer","than"],
%  ["so","far"],
%  ["the","winter"],
%  ["the","spring"],
%  ["the","season"],
%  ["the","season"],
%  ["the","epoch"],
%  ["the","epoch"],
%  ["the","age"],
%  ["the","age"],
%  ["the","worst"],
%  ["the","best"]]

ets:select(MarkovWords, [{{"was", '$2', '$1'}, [], ['$$']}]).
% [["than","clearer"],
%  ["far","so"],
%  ["winter","the"],
%  ["spring","the"],
%  ["season","the"],
%  ["season","the"],
%  ["epoch","the"],
%  ["epoch","the"],
%  ["age","the"],
%  ["age","the"],
%  ["worst","the"],
%  ["best","the"]]

With ets:select/2 we also get the ability to specify multiple three-tuples in our match_spec(). This allows us to find all word triples that have either "of" or "the" as the middle word.

ets:select(MarkovWords, [{{'$1', "the", '$2'}, [], ['$_']}, {{'$1', "of", '$2'}, [], ['$_']}]).
% [{"some","of","its"},
%  {"on","the","throne"},
%  {"on","the","throne"},
%  {"direct","the","other"},
%  {"preserves","of","loaves"},
%  {"throne","of","France."},
%  {"throne","of","England;"},
%  {"worst","of","times,"},
%  {"short,","the","period"},
%  {"winter","of","despair,"},
%  {"degree","of","comparison"},
%  {"epoch","of","incredulity,"},
%  {"epoch","of","belief,"},
%  {"spring","of","hope,"},
%  {"like","the","present"},
%  {"of","the","State"},
%  {"age","of","foolishness,"},
%  {"age","of","wisdom,"},
%  {"best","of","times,"},
%  {"season","of","Darkness,"},
%  {"season","of","Light,"},
%  {"to","the","lords"},
%  {"in","the","superlative"},
%  {"was","the","winter"},
%  {"was","the","spring"},
%  {"was","the",[...]},
%  {"was",[...],...},
%  {[...],...},
%  {...}|...]

And with guard clauses, we can find the third item of the three-tuples that start with "was", where that item comes later in the dictionary than the word in the second position of the tuple.

ets:select(MarkovWords, [{{"was", '$1', '$2'}, [{'<', '$1', '$2'}], ['$2']}]).
% ["than","winter","worst"]

So with this week’s post we have seen other ways of using ets:match/2 and ets:match_object/2, what they offer over just using ets:lookup/2 with a key, and how to take advantage of even more powerful querying with ets:select/2.

Next week, we will look at more ways to use ets:select/2, and how we can use some other ets module functions to help create queries that are easier to understand at a glance.

–Proctor