Tag Archives: Ruby Tuesday

Ruby Tuesday – composing filter

As mentioned in the last Ruby Tuesday on using compose and map, I ended with saying that we would take a look at how we would take advantage of composition for our filter.

As a refresher, here is our Sequence module.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

So lets start taking a look at how we would be able to compose our filter lambdas together by seeing if we can find all numbers that are multiples of three, and then multiples of five.

multiple_of_three = ->(x) { x % 3 == 0}
# => #<Proc:0x007faeb28926b0@(pry):1 (lambda)>
multiple_of_five = ->(x) { x % 5 == 0}
# => #<Proc:0x007faeb3866c58@(pry):2 (lambda)>
Sequence.compose.([Sequence.filter.(multiple_of_three),
                   Sequence.filter.(multiple_of_five)]).((1..100))
=> [15, 30, 45, 60, 75, 90]

The catch again is that we are traversing the first list for all 100 elements to find those items that are multiples of 3, and then filtering the filtered list to find only those that are multiples of 5.

What if we could do this in one pass, and compose the predicates together?

And since we want to test this to make sure we get a good result, we will compose those predicates together and just test our composed function with the number 15.

three_and_five = Sequence.compose.([multiple_of_three, multiple_of_five])
=> #<Proc:0x007faeb4905d20 (lambda)>
three_and_five.(15)
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

It fails with a NoMethodError telling us that we can’t use the method % on an object of type TrueClass.

Don’t we want to get back a true instead of an error though?

If we rollback the compose into it’s nested function call version, it starts to become clear where the error is coming from.

multiple_of_five.(multiple_of_three.(15))
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

So what is happening is that we get back true from calling multiple_of_three with 15, and we then pass that true into multiple_of_five, when what we wanted to do was to pass in the 15 and make sure it was a multiple of 5 as well.

What we really want is to evaluate every lambda with the value of 15, and then then see if all of the predicate functions succeed in their checks.

So we will start with a very naive, and “un-curried” version, to prove out our concept.

Our first pass will be to map over each predicate and invoke it with the value we want to test resulting in a list of booleans if it succeeded or failed. We then reduce all of those items together via an and operation to get an overall success status.

def my_all_succeed_naive(predicates, value)
  check_results = Sequence.map.(->(f) {f.(value)}, predicates)
  Sequence.reduce.(true, ->(accum, item) {accum && item}, check_results)
end

my_all_succeed_naive([multiple_of_three, multiple_of_five], 15)
# => true
my_all_succeed_naive([multiple_of_three, multiple_of_five], 14)
# => false
my_all_succeed_naive([multiple_of_three, multiple_of_five], 5)
# => false

This looks like this works, as we get 15 is a multiple of 3 and a multiple of 5, but we still check if the item is a multiple of 5, even if our multiple of 3 check failed.

Could we do better, and short circuit our evaluation if we get a false?

Let’s try.

def my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

So let’s take a look at the difference between the two, just to make sure we are on the right track.

First we will create a “long running predicate check” to add to our chain.

[22] pry(main)> pass_after = ->(x, value) { sleep(x); true }.curry
=> #<Proc:0x007faeb310d498 (lambda)>

Then we will time it using Benchmark#measure (and don’t forget to require 'benchmark' first though). First with a success, and then with a failure against the first predicate.

Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 15)
end
# => #<Benchmark::Tms:0x007faeb28f24c0
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0046792929642834,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>


Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb6018c88
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=2.5073997676372528e-05,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

We can see that it takes three seconds to run if we succeed, but only a fraction of a second if we fail on the first check. And just to make sure our earlier assumptions were correct, we will benchmark the same predicate list against the naive version with the value of 14, so it will fail on the first check of a multiple of three.

Benchmark.measure do
  my_all_succeed_naive([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb487d218
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0028793679666705,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

And it does indeed take just over 3 seconds to complete.

So let’s add this to our Sequence module, and get it to be able to be curried.

So what if we did that for just a single string, not even in an list of some sort?

def self.my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

@@all_succeed = method(:my_all_succeed).curry

def self.all_succeed
  @@all_succeed
end

And we check that we can use it off our Sequence module partially applied.

Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(14)
# => false
Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(15)
# => true

So now we can get back to how we would use this with filter by using our new Sequence::all_succeed.

three_and_five_multiple = Sequence.all_succeed.([multiple_of_three, multiple_of_five])
# => #<Proc:0x007faeb28685e0 (lambda)>
Sequence.filter.(three_and_five_multiple).((1..100))
# => [15, 30, 45, 60, 75, 90]

And there we go, we have now composed our predicates into one function which we can then pass to Sequence::filter and only have to walk through the list once.

As we have seen how to compose predicates together for an “and” style composition, next week we will look at building an “or” style composition of predicates.

–Proctor

Ruby Tuesday – compose and map

As mentioned last week, now that we have our compose function we will take a look at some of the properties of we get when using our map and compose functions together.

Here is our Sequence class as we left it with compose added to it.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

Like last week we have a list of names, and our goal is to get the list of “first” names back and have them be capitalized.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]
# => ["jane doe", "john doe", "arthur dent", "lori lemaris", "DIANA PRINCE"]

So we start with our base functions that we are going to use in our map calls.

split = ->(delimiter, str) {str.split(delimiter)}.curry
# => #<Proc:0x007ffdfa203a00 (lambda)>
whitespace_split = split.(" ")
# => #<Proc:0x007ffdfa8391e0 (lambda)>
first = ->(xs){ xs[0] }.curry
# => #<Proc:0x007ffdfa2a26a0 (lambda)>
capitalize = ->(s) { s.capitalize }
# => #<Proc:0x007ffdfa210ed0@(pry):13 (lambda)>

And we create our different instances of map calls, which are partially applied map functions with the appropriate lambda passed to them.

name_parts = Sequence.map.(whitespace_split)
# => #<Proc:0x007ffdfa1a0248 (lambda)>
firsts = Sequence.map.(first)
# => #<Proc:0x007ffdfa169568 (lambda)>
capitalize_all = Sequence.map.(capitalize)
# => #<Proc:0x007ffdfa86a1f0 (lambda)>

And as we saw last week, we can nest the function calls together,

capitalize_all.(firsts.(name_parts.(names)))
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

or we can use compose to create a “pipeline” of function calls.

capitalized_first_names = Sequence.compose.([name_parts, firsts, capitalize_all])
# => #<Proc:0x007ffdfa1c1c18 (lambda)>
capitalized_first_names.(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Here’s where things can start to get interesting.

In our capitalized first names example, we go through the list once per transformation we want to apply.

First we transform the list of names into a list of split names, which we transform into a list of only the first items from the source list, and then finally we transform that into a list of the capitalized names.

That could be a lot of processing if we had more transformations, and/or a much longer list. This seems like a lot of work.

Let’s look at it from near the other extreme; the case of if we only had one name in our list.

capitalized_first_names.(['tARa cHAsE'])
=> ["Tara"]

In this case, the fact that we have a list at all is almost incidental.

For a list of only one item, we split the string, get the first element, and capitalize that value.

So what if we did that for just a single string, not even in an list of some sort?

capitalize_first_name = Sequence.compose.([whitespace_split, first, capitalize])
# => #<Proc:0x007ffdfa121100 (lambda)>
capitalize_first_name.("tARa cHAsE")
# => "Tara"

We can compose each of these individual operations together to get a function that will transform a name into the result we want.

And since we have a “transformation” function, we can pass that function to our map function for a given list of names.

Sequence.map.(capitalize_first_name, names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

Lo and behold, we get the same results as above for when we composed our map function calls.

Sequence.compose.([Sequence.map.(whitespace_split),
                   Sequence.map.(first),
                   Sequence.map.(capitalize)]
                 ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]
Sequence.map.(Sequence.compose.([whitespace_split,
                                 first,
                                 capitalize])
             ).(names)
# => ["Jane", "John", "Arthur", "Lori", "Diana"]

This leads us to the property that the composition of the map of function f and the map of function g is equivalent to the map of the composition of functions f and g.

\big(map\ f \circ map\ g\big)\ list = map\big( f \circ g \big) list

Where the circle (\circ) symbol represents the function compose expressed using mathematical notation.

Because of this, we can now only traverse the sequence via map once and do the composed transformation functions on each item as we encounter it without having to go revisit it again.

Next time we will take a look at how we can do the same kind of operation on filter, as we can’t just pipeline the results of one filter on through to another, since filter returns a boolean value which is not what we would want to feed through to the next filter call.

–Proctor

Ruby Tuesday – Refactoring towards compose

We have seen filter, map, reduce, partial application, and updating the former functions to be able to take advantage of partial application, so how much further can we go?

In this case, we will take a look at how we can chain the functions together to build up bigger building blocks out of their smaller components.

First as a reminder, here is our Sequence class that we have built up over the past few posts.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end
end

Suppose we want to get a list of capitalized “first” names from a list of names, and we have a bunch of smaller functions that handle the different part of the transformation process that we can reuse.

It might look like the following.

names = ["jane doe", "john doe", "arthur dent", "lori lemaris"]

name_parts_map = Sequence.map.(->(name) {name.split})
# => #<Proc:0x007fea82200638 (lambda)>
first_map = Sequence.map.(->(xs) {xs[0]})
#=> #<Proc:0x007fea82842780 (lambda)>
capitalize_map = Sequence.map.(->(s) {s.capitalize})
# => #<Proc:0x007fea82a05ce8 (lambda)>
initials_map = Sequence.map.(->(strings) {first_map.(strings)})
# => #<Proc:0x007fea82188638 (lambda)>

capitalize_map.(first_map.(name_parts_map.(names)))
# => ["Jane", "John", "Arthur", "Lori"]

And if we wanted to get a list of the initials as a list themselves, we might have something like this.

initials_map.(name_parts_map.(names))
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

And maybe somewhere else, we need to do some mathematical operations, like transform numbers into one less than their square.

square_map = Sequence.map.(->(i) {i*i})
# => #<Proc:0x007fea82183ef8 (lambda)>
dec_map = Sequence.map.(->(i) {i-1})
# => #<Proc:0x007fea82a37018 (lambda)>
dec_map.(square_map.([1,2,3,4,5]))
# => [0, 3, 8, 15, 24]

And yet another place, we have a calculation to turn Fahrenheit into Celsius.

minus_32 = ->(x) {x-32}
# => #<Proc:0x007fea8298d4c8@(pry):36 (lambda)>
minus_32_map = Sequence.map.(minus_32)
# => #<Proc:0x007fea82955488 (lambda)>
five_ninths = ->(x) {x*5/9}
# => #<Proc:0x007fea83836330@(pry):38 (lambda)>
five_ninths_map = Sequence.map.(five_ninths)
# => #<Proc:0x007fea821429f8 (lambda)>
five_ninths_map.(minus_32_map.([0, 32, 100, 212]))
# => [-18, 0, 37, 100]

Setting aside for the moment, all of the Procs making up other Procs if this is still foreign to you, there is a pattern here that we have been doing in all of these examples to compose a larger function out of a number of smaller functions. The pattern is that we are taking the return result of calling one function with a value, and feeding that into the next function, lather, rinse, and repeat.

Does this sound familiar?

This is our reduce.

We can use reduce to define a function compose that will take a list of functions as our items, and an initial value for our functions, and our function to apply will be to call the function in item against our accumulated value.

That’s a bit of a mouthful, so let’s look at it as code, and we will revisit that statement.

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@compose = method(:my_compose).curry

  def self.compose
    @@compose
  end

Now that we have the code to reference, let’s go back and inspect against what was described above.

First we create a lambda apply_fn, which will be our “reducing” function to apply to the accumulator and each item in the list, which in the case of my_compose is a list of functions to call.

apply_fn like all our “reducing” functions so far takes in an accumulator value, the result of the composed function calls so far, and the current item, which is the function to call. The result for the new accumulator value is the result of applying the function with the accumulator as its argument.

We were able to build yet another function out of our reduce, but this time we operated on a list of functions as our values.

Let that sink in for a while.

So let’s see how we use that.

We will start with creating a composed function to map the Fahrenheit to Celsius conversion and see what different temperatures are in Celsius, including the past few days of highs and lows here at DFW airport.

f_to_c_map = Sequence.compose.([minus_32_map, five_ninths_map])
# => #<Proc:0x007fea82a1f9b8 (lambda)>
f_to_c_map.([0, 32, 100, 212])
# => [-18, 0, 37, 100]
dfw_highs_in_celsius = f_to_c_map.([66, 46, 55, 48, 64, 68])
# => [18, 7, 12, 8, 17, 20]
dfw_lows_in_celsius = f_to_c_map.([35, 27, 29, 35, 45, 40])
# => [1, -3, -2, 1, 7, 4]

And if we take a look at the initials above and compose the map calls together, we get the following.

get_initials_map = Sequence.compose.([name_parts_map, initials_map])
# => #<Proc:0x007fea82108ff0 (lambda)>
get_initials_map.(names)
# => [["j", "d"], ["j", "d"], ["a", "d"], ["l", "l"]]

Doing the same for our capitalized first names we get:

capitalized_first_names_map = Sequence.compose.([name_parts_map, first_map, capitalize_map])
# => #<Proc:0x007fea821d1108 (lambda)>
capitalized_first_names_map.(names)
# => ["Jane", "John", "Arthur", "Lori"]

By having our compose function, we are able to be more explicit that capitalized_first_names_map is, along with the rest of the examples, just a composition of smaller functions that have been assembled together in an data transformation pipeline.

They don’t have any other logic other than being the result of chaining the other functions together to get some intended behavior.

Not only that, but we can now reuse our capitalized_first_names_map mapping function against other lists of names nicely, since we have it able to be partially applied as well.

capitalized_first_names_map.(["bob cratchit", "pete ross", "diana prince", "tara chase"])
# => ["Bob", "Pete", "Diana", "Tara"]

Even better is that compose can work on any function (Proc or lambda) that takes a single argument.

Such as a Fahrenheit to Celsius function that operates against a single value instead of a list.

f_to_c = Sequence.compose.([minus_32, five_ninths])
# => #<Proc:0x007fea8294c4c8 (lambda)>
f_to_c.(212)
# => 100
f_to_c.(32)
# => 0
Sequence.map.(f_to_c, [0, 32, 100, 212])
# => [-18, 0, 37, 100]

Next week well will look at some other properties of our functions and show how compose can potentially help us in those cases as well.

–Proctor

Ruby Tuesday – Partial Application of map, filter, and reduce

Now that we have covered how to get to a basic implementation of map, filter, and reduce in Ruby, as well as how to take advantage of Method#curry, we are going to see how we can get some extra power from our code by combining their use.

In the Ruby versions of Enumerables map, reduce, and select, it operates against a specific object, such as an array of users.

class User
  def initialize(name:, active:)
    @name = name
    @active = active
  end

  def active?
    @active
  end

  def name
    @name
  end
end

users = [User.new(name: "johnny b. goode", active: true),
         User.new(name: "jasmine", active: true),
         User.new(name: "peter piper", active: false),
         User.new(name: "mary", active: true),
         User.new(name: "elizabeth", active: true),
         User.new(name: "jennifer", active: false)]


users.map{|user| user.name}
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]

users.select{|user| user.active?}
# => [#<User:0x007fa37a13eb68 @active=true, @name="johnny b. goode">,
#  #<User:0x007fa37a13eaa0 @active=true, @name="jasmine">,
#  #<User:0x007fa37a13e910 @active=true, @name="mary">,
#  #<User:0x007fa37a13e848 @active=true, @name="elizabeth">]

If we want the names of a different collection, we need to call map (and use the same block) directly on that collection as well; like a if we had a collection of active Users.

users.select{|user| user.active?}.map{|user| user.name}
=> ["johnny b. goode", "jasmine", "mary", "elizabeth"]

This can be made a little more generic by having the methods get_user_names and get_active_users defined on User, but this still leaves us a bit shallow, so let’s see what else we can do base of what we have seen so far.

We will try it with our versions of map, filter, and reduce, and see how we can distill some of this logic, and raise the level of abstraction hight to make it more generic.

module Sequence
  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end
end

And we look at how we call it using our Sequence module defined above.

Sequence.my_map(->(user) {user.name}, users)
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]
Sequence.my_filter(->(user) {user.active?}, users)
# => [#<User:0x007fa37a13eb68 @active=true, @name="johnny b. goode">,
#  #<User:0x007fa37a13eaa0 @active=true, @name="jasmine">,
#  #<User:0x007fa37a13e910 @active=true, @name="mary">,
#  #<User:0x007fa37a13e848 @active=true, @name="elizabeth">]

Granted, at this point, this is not a great improvement, if any, on it’s own.

BUT….

Since we take the collection to operate on as the last argument to our methods, we can combine our versions of my_map, my_filter, my_reduce with partial application to get a Proc that will do a specific operation against any Enumerable.

Let’s see how this would work.

First, we will update our Sequence module to have a map, filter, and reduce that can be partially applied.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end
end

Next with the ability to partially applied versions of our map, filter, and reduce, we can now save these off to variables that we can invoke later and just pass in the Users enumerable we wish to operate against.

names = Sequence.map.(->(user) {user.name})
# => #<Proc:0x007f9c53155990 (lambda)>
names.(users)
# => ["johnny b. goode", "jasmine", "peter piper", "mary", "elizabeth", "jennifer"]
get_active = Sequence.filter.(->(user) {user.active?})
# => #<Proc:0x007f9c531af3f0 (lambda)>
get_active.(users)
# => [#<User:0x007f9c5224e960 @active=true, @name="johnny b. goode">,
#  #<User:0x007f9c5224e8c0 @active=true, @name="jasmine">,
#  #<User:0x007f9c5224e780 @active=true, @name="mary">,
#  #<User:0x007f9c5224e6e0 @active=true, @name="elizabeth">]

Or if we want to, we can use the Symbol#to_proc so we don’t have to define our lambda for checking if an item is active?.

get_active = Sequence.filter.(:active?.to_proc)
=> #<Proc:0x007f9c531e62d8 (lambda)>
get_active.(users)
=> [#<User:0x007f9c5224e960 @active=true, @name="johnny b. goode">,
 #<User:0x007f9c5224e8c0 @active=true, @name="jasmine">,
 #<User:0x007f9c5224e780 @active=true, @name="mary">,
 #<User:0x007f9c5224e6e0 @active=true, @name="elizabeth">]

And now that we have our partial applied functions, we can also chain our calls together to get the names of active Users.

names.(get_active.(users))
=> ["johnny b. goode", "jasmine", "mary", "elizabeth"]

Not only that, but say we have some collections of Product objects,

class Product
  attr_reader :id, :name, :brand

  def initialize(id:, name:, active:, brand:)
    @id = id
    @name = name
    @active = active
    @brand = brand
  end

  def active?
    @active
  end
end

products = [Product.new(id: 0, name: "Prefect", active: false, brand: "Ford"),
            Product.new(id: 7, name: "SICP", active: true, brand: "MIT Press"),
            Product.new(id: 16, name: "HTDP", active: true, brand: "MIT Press"),
            Product.new(id: 17, name: "MRI", active: true, brand: "Ruby"),
            Product.new(id: 42, name: "HHGTTG", active: true, brand: "HHGTTG"),
            Product.new(id: 53, name: "Windows 3.1", active: false, brand: "Microsoft")]

Session objects,

class Session
  def initialize(name:, duration:)
    @name = name
    @duration = duration
  end

  def name
    @name
  end

  def active?
    @duration < 15
  end
end

sessions = [Session.new(name: "session A", duration: 3),
            Session.new(name: "session A", duration: 30),
            Session.new(name: "session A", duration: 17),
            Session.new(name: "session A", duration: 9),
            Session.new(name: "session A", duration: 1)]

and even SalesLead objects.

class SalesLead
  def initialize(lead:, active:)
    @lead = lead
    @active = active
  end

  def active?
    @active && @lead.active?
  end

  def name
    @lead.name
  end
end

leads = [SalesLead.new(lead: User.new(name: "lead 1", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 2", active: true), active: false),
         SalesLead.new(lead: User.new(name: "lead 3", active: false), active: true),
         SalesLead.new(lead: User.new(name: "lead 4", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 5", active: true), active: true),
         SalesLead.new(lead: User.new(name: "lead 6", active: false), active: false)]

And because our Product class has the methods name and active?, we can use our names and get_active variables that hold our partially applied Procs against the list of Products,

names.(products)
# => ["Prefect", "SICP", "HTDP", "MRI", "HHGTTG", "Windows 3.1"]
get_active.(products)
# => [#<Product:0x007f9c530b7dd0 @active=true, @brand="MIT Press", @id=7, @name="SICP">,
#  #<Product:0x007f9c530b7ce0 @active=true, @brand="MIT Press", @id=16, @name="HTDP">,
#  #<Product:0x007f9c530b7bf0 @active=true, @brand="Ruby", @id=17, @name="MRI">,
#  #<Product:0x007f9c530b7ad8 @active=true, @brand="HHGTTG", @id=42, @name="HHGTTG">]
names.(get_active.(products))
# => ["SICP", "HTDP", "MRI", "HHGTTG"]

Sessions,

names.(sessions)
# => ["session A", "session A", "session A", "session A", "session A"]
get_active.(sessions)
# => [#<Session:0x007f9c52153560 @duration=3, @name="session A">,
#  #<Session:0x007f9c52152f98 @duration=9, @name="session A">,
#  #<Session:0x007f9c52152ca0 @duration=1, @name="session A">]
names.(get_active.(sessions))
# => ["session A", "session A", "session A"]

and SalesLeads,

names.(leads)
# => ["lead 1", "lead 2", "lead 3", "lead 4", "lead 5", "lead 6"]
get_active.(leads)
# => [#<SalesLead:0x007f9c5212a020
#   @active=true,
#   @lead=#<User:0x007f9c5212a160 @active=true, @name="lead 1">>,
#  #<SalesLead:0x007f9c521292d8
#   @active=true,
#   @lead=#<User:0x007f9c52129468 @active=true, @name="lead 4">>,
#  #<SalesLead:0x007f9c52128e00
#   @active=true,
#   @lead=#<User:0x007f9c52129080 @active=true, @name="lead 5">>]
names.(get_active.(leads))
# => ["lead 1", "lead 4", "lead 5"]

With this in mind, we will update the definition of our names and get_active to show that it is not just “users” it operates against, but any item.

names = Sequence.map.(->(x) {x.name})
# => #<Proc:0x007f9c53107f38 (lambda)>
get_active = Sequence.filter.(->(x) {x.active?})
# => #<Proc:0x007f9c530bfcd8 (lambda)>

So with this, we have now been able to take our map and select from Ruby’s Enumerable class that worked on a specific collection only, without being redefined and moved into a method to live on User somewhere, we now have a Proc that is recognized to be applicable to anything that accepts an item of that form, and these can be defined and used anywhere.

–Proctor

Ruby Tuesday – Partial Application

As we continue with the theme we have been pursuing in the last couple of posts, we take a brief pit stop and look at partial application before we move on to our next method we want to define.

Partial application is the ability to provide only a subset of arguments to a function or method, and return a new function/method to be called later that will keep the original context.

We will start our examples with the two methods double and triple.

def double(y)
  2 * y
end

def triple(y)
  3 * y
end

double(3)
# => 6

triple(5)
# => 15

We can think of these methods as defined in terms of a more generic function multiply that we call with a hard coded value.

def multiply(x, y)
  x * y
end

def double(y)
  multiply(2, y)
end

def triple(y)
  multiply(3, y)
end

What partial application allows us to do is to define double and triple in terms of multiply, by calling it with the first argument only, the 2 for the method double, and saving the resulting function to be invoked later.

To do this in Ruby we can use Method#curry, or Proc#curry.

Method#curry will return a new Proc that can then be invoked with only a subset of its arguments.

method(:multiply).curry
# => #<Proc:0x007fc02b225950 (lambda)>

So for our double and triple functionality, we can make those be variables which hold the resulting Proc of passing in their value to multiply, and invoke them by only passing in the value we want to double, or triple.

double = method(:multiply).curry.(2)
# => #<Proc:0x007fc02b1eeea0 (lambda)>
double.(8)
# => 16

triple = method(:multiply).curry.(3)
# => #<Proc:0x007fc02b186260 (lambda)>
triple.(17)
# => 51

At this point you might be wondering what this gets you, as the examples of double, triple, and multiply might seem a bit simplistic at best, and maybe even contrived.

I would agree; it is a simple example, but mainly to show as an example of what partial application is, and now we will take a look at our filter, map, and reduce from the previous posts and update them to show some of the power of partial application.

As a reminder this is the map, filter, and reduce as defined previously.

def map(items, do_map)
  reduce([], items, lambda do |accumulator, item|
    accumulator.dup << do_map.call(item)
  end)
end

def filter(items, predicate)
  reduce([], items, lambda do |accumulator, item|
    if (predicate.call(item))
      accumulator.dup << item
    else
      accumulator
    end
  end)
end

def reduce(initial, items, operation)
  return nil if items.empty?

  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

We will update these definitions to be able to take the items collection last, as that is the most general argument.

def map(f, items)
  do_map = lambda do |accumulator, item|
    accumulator.dup << f.call(item)
  end

  reduce([], do_map, items)
end

def filter(predicate, items)
  do_filter = lambda do |accumulator, item|
    if (predicate.call(item))
      accumulator.dup << item
    else
      accumulator
    end
  end

  reduce([], do_filter, items)
end

def reduce(initial, operation, items)
  return nil unless items.any?{|item| true}

  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

You might be wondering why I said that the collections of items is the most general argument.

That was because, if we want to sum a sequence of numbers, double a sequence of numbers, or even pick out the even numbers, we can do that against a number of different collections of items.

reduce(0, :+.to_proc, (1..5))
# => 15
reduce(0, :+.to_proc, [2, 4, 6, 8])
# => 20

map(double, (2..7))
# => [4, 6, 8, 10, 12, 14]
map(double, [1, 2, 3, 5, 8, 13])
# => [2, 4, 6, 10, 16, 26]

filter(lambda{|x| x.even?}, (5..10))
# => [6, 8, 10]
filter(lambda{|x| x.even?}, (0..100).step(10))
# => [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

And because we now have the generic items last, we can use Method#curry to build up Procs to call that represent the parts that are the same and give those Procs descriptive names.

concat = method(:reduce).curry.("", :+.to_proc)
# => #<Proc:0x007fc02b207f68 (lambda)>

concat.(("a".."d"))
# => "abcd"
concat.(["alpha", "beta", "gamma", "delta"])
# => "alphabetagammadelta"


sum = method(:reduce).curry.(0, :+.to_proc)
# => #<Proc:0x007fc02b25c450 (lambda)>

sum.((1..10))
# => 55
sum.((2..20).step(2))
# => 110


evens_only = method(:filter).curry.(lambda{|x| x.even?})
# => #<Proc:0x007fc02b155688 (lambda)>

evens_only.([1, 2, 3, 5, 8, 13, 21])
# => [2, 8]
evens_only.([1, 2, 4, 7, 11])
# => [2, 4]


doubles = method(:map).curry.(double)
# => #<Proc:0x007fc02b0c0308 (lambda)>

doubles.((1..10))
# => [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
doubles.([2, 4, 6, 8])
# => [4, 8, 12, 16]

And in the last example of doubles we did a curry of map, and used the partially applied function double as the argument to be partially applied to map.

So with partial application, we can start to abstract common behavior out into Procs that can operate against more generic data.

One last example would be filtering items against those that are active. With partial application, we can take filter and create a partially applied version that is given a Proc that will check to see if an item is active?.

active_items = method(:filter).curry.(lambda{|item| item.active?})
# => #<Proc:0x007fc02b207f68 (lambda)>

With this active_items Proc, we can then use that against any collection of objects as long as they support the method active?, e.g. Users, Orders, Sessions, Blog Posts, etc.

active_users = active_items.(users)

active_blog_posts = active_items.(blog_posts)

active_sessions = active_items.(sessions)

active_orders = active_items.(orders)

As you can hopefully start to see, we can start to get some very small, focused, and powerful functions that are nicely abstracted to work against a broader range of input.

Next week, we will take the new versions of map, filter, and reduce, along with partial application, to show how we can take these smaller pieces of code and reuse and assemble them together to get more advanced behaviors.

–Proctor

Ruby Tuesday – Refactoring Towards Creating reduce

As we continue with the theme we have been pursuing in the last couple of posts, we will look at refactoring to reduce, and will take a look at how we can use this with what we have built on from previous posts.

Again for reference, here is our setup data of a User object

require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:,
                 is_active:,
                 date_of_birth: nil,
                 date_of_death: nil,
                 languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end

and our list of users.

alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)

users = [alan_kay, john_mccarthy, robert_virding,
         dennis_ritchie, james_gosling, matz, nobody]

In our theoretical code base we have some code that will find the oldest language creator.

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next unless user.date_of_death.nil?
    next if user.date_of_birth.nil?
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
# => "Alan Kay"

That is pretty nasty, so let’s see if we can clean it up some and see what happens.

First, inside the for we have both and if and unless, so let’s refactor the unless to be an if.

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next if (not user.date_of_death.nil?)
    next if user.date_of_birth.nil?
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

pry(main)> oldest_language_creator(users).name
# => "Alan Kay"

And while we are at it, we will refactor out the conditions in the ifs to give them clarifying names.

def alive?(user)
  user.date_of_death.nil?
end

def has_known_birthday?(user)
  not user.date_of_birth.nil?
end

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next if not alive?(user)
    next if not has_known_birthday?(user)
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
# => "Alan Kay"

Still works, and that we now have multiple if in our for loop, we can think back to a couple of posts ago, and realize we have a couple of filters happening in our list, and then some logic around who has the earliest birth date.

So let’s refactor out the filters and see what our method starts to look like.

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})

  oldest = nil
  for user in with_birthdays do
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
#=> "Alan Kay"

This next refactoring might be a bit of a jump, but for me, I am not too fond with starting out with a nil and having to check that every time, since it will be fixed on the first time around, so let’s clean that up.

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users, 
                          lambda{|user| has_known_birthday?(user)})

  oldest, *rest = with_birthdays
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

Let’s refactor out our for loop into another method, so we can look at it on its own.

def user_with_earliest_birthday(users)
  oldest, *rest = users
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})
  user_with_earliest_birthday(with_birthdays)
end

Now we have a pattern here, and it has been present in our filter and map as well, if you can see it, so let’s see if we can identify it.

def user_with_earliest_birthday(users)
  oldest, *rest = users
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end

def map(items, do_map)
  results = []
  for item in items do
    results << do_map.call(item)
  end
  results
end

If you haven’t detangled it, the pattern is:
1. We have some initial value,
2. and for every item in a list we do some operation against that item value and the current accumulated value, which results in a new value
3. we return the accumulated value.

With our user_with_earliest_birthday method, the initial accumlated value is the first user, the operation is a compare against that with each item to find the oldest user so far, and we return the oldest user.

With filter, the initial accumulated value is an empty Array, the operation is an append if some critera is met, and the return is the accumlated array for those items that meet that criteria.

With map, the initial accumulated value is an empty Array, the operation is an append of the result of a transformation function on each value, and the return is the accumlated array for the transformed results.

This pattern is called reduce.

So what would this look like generically???

def reduce(initial, items, operation)
  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

Let’s write our user_with_earliest_birthday using this new reduce then, and consume it in our oldest_language_creator

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})

  reduce(with_birthdays.first, with_birthdays.drop(1),
         lambda do |oldest, user|
           oldest.date_of_birth > user.date_of_birth ? user : oldest
         end)
end

Our accumulator starts with the first user in the list, uses the rest of the list to iterate through, and then will return either the accumulator (oldest_so_far) or the current item (user), which would be assigned to the accumulator value for the next iteration.

So how would we write our map and filter to use this new reduce?

def map(items, do_map)
  reduce([], items,
         lambda do |accumulator, item|
           accumulator.dup << do_map.call(item)
         end)
end

def filter(items, predicate)
  reduce([], items, lambda do |accumulator, item|
    if (predicate.call(item))
      accumulator.dup << item
    else
      accumulator
    end
  end)
end

For our new map, our operation is to call the do_map lambda given to the function, and add the transformed value to a duplicate of the original accumulator. While in these cases, it is not necessary to duplicate the original accumulator, I did so here to mirror that in reduce, we are getting what could be considered a completely new value, such as we have with our oldest_language_creator version that uses reduce.

And for our new filter our operation either returns the original accumulator, or adds the item to a new copy of the accumulated list if the predicate passed to filter returns truth. Again, we could leave out the duplication, but for purity sake, and working out the logic we will keep it in there.

So let’s step through our new filter and see what happens one step at a time with it now using reduce.

filter((1..9), lambda{|item| item.odd?})

If we inline reduce substituting the variables given to filter, it looks like the following.

reduce([], (1..9), lambda do |accumulator, item|
  if (lambda{|item| item.odd?}.call(item))
    accumulator.dup << item
  else
    accumulator
  end
end)

And if we expand the body of reduce, and rename it to filter_odds, we get

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = lambda do |accumulator, item|
      if (lambda{|item| item.odd?}.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end.call(accumulator, item)
  end
  accumulator
end

And we inline the calls to the lambda that came is from the predicate

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = lambda do |accumulator, item|
      if (item.odd?)
        accumulator.dup << item
      else
        accumulator
      end
    end.call(accumulator, item)
  end
  accumulator
end

and inline the lambda for the operation to given to reduce

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = if (item.odd?)
                    accumulator.dup << item
                  else
                    accumulator
                  end
  end
  accumulator
end

And we can see how through filter and reduce, we get back to something that looks like the orignal filtering out of odd numbers from a list.

And to test out reduce further, let’s add some numbers together.

We will call reduce with our initial accumulated “sum” of 0, the numbers from 1 to 10, and a lambda that adds the two numbers together to produce a new running sum.

reduce(0, (1..10), lambda{|accum, item| accum + item})
# => 55

And we do the same for a reduce that computes the product of a list of numbers.

This time our initial accumulator value is 1 which is the identity operation of multiplication.

reduce(1, (1..10), lambda{|accum, item| accum * item})
# => 3628800

But if we call it with an empty list, we return 1 still

reduce(1, [], lambda{|accum, item| accum * item})
# => 1

So we need to clean up our reduce some to make it more robust in the case of reducing against empty lists.

def reduce(initial, items, operation)
  return nil if items.empty?

  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

And now our reduce handles empty lists nicely, or at the least, a little more sanely.

reduce(1, [], lambda{|accum, item| accum * item})
# => nil

With all of that, we have refactored our code into something close to Ruby’s Enumerable#select, expect that we return nil if the enumerable is empty, instead of the initial value for the accumulator.

–Proctor

Ruby Tuesday – Refactoring Towards Creating map

Today’s Ruby Tuesday continues from where we left off with last week’s look at refactoring to filter.

For reference, we had a User class,

require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:, 
                 is_active:,
                 date_of_birth: nil,
                 date_of_death: nil,
                 languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end

a list of User objects,

alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)

users = [alan_kay, john_mccarthy, robert_virding, 
         dennis_ritchie, james_gosling, matz, nobody]

and a helper method to get the list of names for a list of Users.

def get_names_for(users)
  names = []
  for user in users do
    names << user.name
  end
  names
end

get_names_for(users)
=> ["Alan Kay", "John McCarthy", "Robert Virding",
    "Dennis Ritchie", "James Gosling", "Yukihiro Matsumoto", ""]

Elsewhere in our (imaginary, but based off real events with names changed to protect the innocent) code base, we have some logic to get a listing of languages created by the users.

def get_languages(users)
  languages = []
  for user in users do
    languages << user.languages_created
  end
  languages
end

get_languages(users)
# => [["Smalltalk", "Squeak"], ["Lisp"],
      ["Erlang", "LFE"], ["C"], ["Java"], ["Ruby"], []]

And yet somewhere else, there is logic to get a listing of the years different users were born.

def get_birth_years(users)
  birth_years = []
  for user in users do
    birth_years << (user.date_of_birth ? user.date_of_birth.year : nil)
  end
  birth_years
end

get_birth_years(users)
# => [1940, 1927, nil, 1941, 1955, 1965, nil]

As with the filter we looked at last week, we have quite a bit of duplication of logic in all of these methods.

If we turn our head and squint a little, we can see the methods all look something like this:

def transform_to(items)
  results = []
  for item in items do
    results << do_some_transformation(item)
  end
  results
end

This method:

  1. takes a list of items to iterate over
  2. creates a working result set
  3. iterates over every item in the items given and for each item
    • some transformation of the item into a new value is computed and
    • the result is added to the working results set
  4. the end results are returned

The only thing that is different between each of the functions above, once we have rationalized the variable names, is the transformation to be done on each item in the list.

And this transformation that is the different part is just calling a function on that item, also called map in Mathematics, which Wolfram Alpha defines as:

A map is a way of associating unique objects to every element in a given set. So a map f:A|->B from A to B is a function f such that for every a element A, there is a unique object f(a) element B. The terms function and mapping are synonymous for map.

So we will “map” over all of the items to get a new list of items, which makes our generic function look like the following, after we update names to match our new terminology.

def map(items)
  results = []
  for item in items do
    results << do_map(item)
  end
  results
end

This is starting to come together, but we still don’t have anything specific for what do_map represents yet.

We will follow our previous example in filter and make the generic function we want to call a anonymous function, specifically a lambda in Ruby, and pass that in to our map method.

def map(items, do_map)
  results = []
  for item in items do
    results << do_map.call(item)
  end
  results
end

Time to test it out by using our previous calls and making the specifics a lambda.

map(users, lambda{|user| user.languages_created})
# => [["Smalltalk", "Squeak"], ["Lisp"],
      ["Erlang", "LFE"], ["C"], ["Java"], ["Ruby"], []]
map(users, lambda{|user| user.name})
# => ["Alan Kay", "John McCarthy", "Robert Virding",
      "Dennis Ritchie", "James Gosling", "Yukihiro Matsumoto", ""]
map(users, lambda{|user| user.date_of_birth ? user.date_of_birth.year : nil})
# => [1940, 1927, nil, 1941, 1955, 1965, nil]

And to test if we did get this to be generic enough to work against lists of other types, we’ll do some conversions from characters to Integers, Integers to characters, and cube some integers.

map( ("a".."z"), lambda{|char| char.ord})
# => [97, 98, 99, 100, 101, 102, 103, 104, 105, 106,
      107, 108, 109, 110, 111, 112, 113, 114, 115,
      116, 117, 118, 119, 120, 121, 122]
map((65..90), lambda{|ascii_value| ascii_value.chr})
# => ["A", "B", "C", "D", "E", "F", "G", "H", "I",
      "J", "K", "L", "M", "N", "O", "P", "Q", "R",
      "S", "T", "U", "V", "W", "X", "Y", "Z"]
map((1..7), lambda{|i| i*i*i})
# => [1, 8, 27, 64, 125, 216, 343]

So like last week’s post, where we were able to genericize the logic about conditionally plucking out items from a list based of some condition, we were able to genericize the transformation of a list of one set of values into a list of another set of values.

Which if you are familiar to Ruby, you will likely recognize as Enumerable#map, a.k.a. Enumerable#select, but now you have seen how you could have went down the road to creating your own, if Ruby hadn’t already provided it for you.

-Proctor

Ruby Tuesday – Refactoring towards creating filter

Today’s Ruby Tuesday takes a look at the concept of filter, a.k.a. select in Ruby, and how we could create our own version of it through some refactoring.

Filter is a function/method that can really start to change the way you think about your programs, and start helping you to take advantage of smaller building blocks that compose, or assemble, together to create nice reusable pieces of code.

To get an understanding of when and where filter can be powerful, and how you could create filter on your own if not already given to you as Enumerable#select, we’ll look at some “typical” style code that would look like something you are likely to have encountered in your current code base, or past code bases.

For this guide, we have a User class, and we will require 'date' since we want the user to have a date of birth, and a date of death, since we will be using some historical figures in the world of Computer Science.

require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:, is_active:, date_of_birth: nil,
                 date_of_death: nil, languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end

We create add some User objects of creators of various programming languages, and add them to an Array of Users.

alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)

users = [alan_kay, john_mccarthy, robert_virding, 
         dennis_ritchie, james_gosling, matz, nobody]

For most of our cases, we will want an easy way to see what users we have as a result of some operation, so lets define a method that returns a list of just the names for a given list of users.

def get_names_for(users)
  names = []
  for user in users do
    names << user.name
  end
  names
end

So somewhere in our code base we have an area of code that wants to get only the active users from a given list of User objects.

We do our standard for loop, as we would do in so many languages, and we have an if clause that checks the active? method on a User. We do some other processing on that list which we will represent as putsing out the names of the result.

active_users = []
for user in users do
  if (user.active?)
    active_users << user
  end
end

puts "\n\nThe active users' names are:..."
puts get_names_for(active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil

Somewhere else in our code base, we have something that wants a list of the language creators that are still alive, because wouldn’t it be cool that we might happen to have the chance to get to have lunch with them during a conference.

alive_users = []
for user in users do
  if (not user.date_of_death)
    alive_users << user
  end
end

puts "\n\nThe alive users' names are:..."
puts get_names_for(alive_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
# 
# => nil

And again, the puts just represents some processing of that list.

Yet somewhere else, we have some code that looks for those people that we know to have created more than one programming language.

users_created_more_than_one_language = []
for user in users do
  if (user.languages_created.count > 1)
    users_created_more_than_one_language << user
  end
end

puts "\n\nThe names for users who have created more than one language:..."
puts get_names_for(users_created_more_than_one_language)
# Alan Kay
# Robert Virding
# => nil

If we take a look at our three segments of code above, after a while, if you haven’t already, you will start to notice that they all are very, very similar.

They all:
– create an empty array and assign it to a variable that represents the working list of items that meet some condition,
– iterate over all the items in the list of users
– for each item, it checks some condition,
– add the user to the working copy variable if the condition is true
– return the working copy of items that meet the condition.

If we renamed the working variable to be the same name, the only thing that would be different in the code segments, is the conditional that is checked as part of the if clause.

matching = []
for user in users do
  if (something_specific_goes_here)
    matching << user
  end
end
matching

For a number of languages, you would have to live with that duplication, but this is Ruby, so we can use an escape hatch to make this code more generic and abstract.

The only thing that is different, is the conditional, a.k.a. the predicate. The term predicate signifies a method, or function, that returns a boolean result.

So if we want to abstract out the filtering out of items that match some predicate condition, we can use a lambda, or Proc, call that predicate passing in the User to see if we get a true returned.

def filter(users, predicate)
  matching = []
  for user in users do
    if (predicate.call(user))
      matching << user
    end
  end
  matching
end

We now have something that looks like we can re-use it elsewhere for a list of Users.

So let’s test it out, by redoing the previous checks to use the new filter method we just defined.

First we will call our new filter, and pass it a lambda that looks at the count of the languages that user created.

puts "\n\nThe names for users who have created more than one language (using `filter` method):..."
multi_language_creators = filter(users, lambda{|u| u.languages_created.count > 1 })
puts get_names_for(multi_language_creators)
# Alan Kay
# Robert Virding
# => nil

Looks to be the same as the previous version.

Let’s try it with finding those that don’t have a known date of death.

puts "\n\nThe names for users who are not dead (using `filter` method):..."
not_dead_users = filter(users, lambda{|u| not u.date_of_death})
puts get_names_for(not_dead_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
# 
# => nil

So far, so good.

Finally, we try it for active users.

puts "\n\nThe names for users who are active (using `filter` method):..."
filtered_active_users = filter(users, lambda{|u| u.active?})
puts get_names_for(filtered_active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil

Yay! We have extracted a common pattern of our code out into something that represents a higher abstraction of filtering out users with a certain condition from a list of User objects.

Not only that, we have separated the concern of iterating over items and checking each item, from the concern of the actual condition we care about.

This seems pretty useful, and something that would apply beyond just a list of User objects.

Let’s see if we can do this for some Array of numbers as well.

Integers in Ruby have an even? method that we can use to know if a number is even.

puts "is 1 even???"
puts 1.even?

To get the even numbers from an Array of numbers, we have some code that looks very familiar.

even_numbers = []
for i in [1, 2, 3, 4, 5, 6, 7] do
  if (i.even?)
    even_numbers << i
  end
end

Let’s try out our new filter method, and see if we can use it on a list of numbers, and only get back those that are even.

puts "\n\nEven numbers"
evens = filter([1, 2, 3, 4, 5, 6, 7], lambda{|i| i.even?})
puts evens
# 2
# 4
# 6
# => nil

That works!!! Let the celebration commence!!!

Well, let it commence after we clean up our filter method to let it represent that it is for more than just a list of Users.

def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end

Instead of users, we change it to be items, and an individual item instead of user in that list we loop through.

We can also use our filter method against Ranges and Hashes.

puts "\n\nOdd numbers"
odds = filter((1..7), lambda{|i| i.odd?})
puts odds
# 1
# 3
# 5
# 7
# => nil
puts filter({1 => :a, 2 => :b, 3 => :c, 4 => :d}, lambda{|(key, value)| key.even?}).inspect
# [[2, :b], [4, :d]]
# => nil

So by taking advantage of lambdas, Procs, or even blocks in Ruby, we have been able to extract out a recurring pattern in our code and give it a name.

Not only that, but saw how we could write the start of one do it ourselves, and with some work, we could get it to return a proper hash instead of a list of key-value lists.

-Proctor

Ruby Tuesday – Array Decomposition

Today’s Ruby Tuesday takes a look at Array Decomposition.

Ruby gives you some ability to destructure, or decompose, an Array into its component pieces.

Say we have an Array which represents a point in two-dimensional Cartesian space, we can decompose that array into its x and y coordinates, by doing an assignment of x and y to the point represented by the given Array, if we wrap them in parenthesis.

(x, y) = [1, -1]
# => [1, -1]
x
# => 1
y
# => -1

If we want to decompose only part of the Array, and save off anything at the end, we can use a * before the last variable in the assignment.

(_, second, *rest) = [:a, :b, :c, :d]
# => [:a, :b, :c, :d]

The capturing even works, if there are less items in the array than there are variables to decompose into.

(_, second, *rest) = [:a]
# => [:a]
second
# => nil
rest
# => []

We can also use Array Decomposition in method arguments to decompose a passed in Array to individual variable for specific elements.

def cartesian_point((x, y))
  puts "x: #{x}, y: #{y}"
end
# => :cartesian_point

cartesian_point([1, -1])
# x: 1, y: -1
# => nil

Above I mentioned that we can capture the “rest” of the Array we aren’t wanting to decompose at this point by using a * (called the “splat” operator).

(head, *rest) = [1, 2, 3, 4, 5]
# => [1, 2, 3, 4, 5]
head
# => 1
rest
# => [2, 3, 4, 5]

So let’s see what we get when we use the splat to capture the rest of an Array that is smaller than what is at the end.

(first, *tail) = [1]
# => [1]
first
# => 1
tail
# => []

We get an empty list.

Let’s see what we get on the decomposition for an empty list, into the items, and a capturing “rest” variable.

(h, *t) = []
# => []
h
# => nil
t
# => []

We get a nil as part of the binding, and we still get an empty array the “rest” of the items.

This means we can use destructing to handle recursing a list, and have a guard clause that checks if we received and empty Array by checking if the head of the Array is nil, and then just recurse by passing in the tail as the argument when recursing.

def recurse_list((head, *tail))
  if (head != nil)
    puts head
    recurse_list(tail)
  else
    puts "all done"
  end
end
# => :recurse_list

recurse_list([1, 2, 3, 4, 5, 6, 7])
# 1
# 2
# 3
# 4
# 5
# 6
# 7
# all done
# => nil

–Proctor

Ruby Tuesday – SSL Version In Ruby

Today’s Ruby Tuesday takes a look at the OpenSSL::SSL::SSLContext#ssl_version.

At work today, I was pulled into a bit of a “fire”, where I was told that one of the sets of services our app at work depends on is going to be removing support for all but TLS 1.2.

Just doing a basic Net::HTTP.get did not get us a result back, and we first had to figure out how to get TLS 1.2 as something that Ruby as the client would say it supports.

Before we can start testing any of this, we need to require openssl and net/http.

require 'openssl'
# => true
require 'net/http'
# => true

It turns out when you don’t specify a version the default version is SSLv23, which can be found by looking at DEFAULT_PARAMS on the SSLContext.

OpenSSL::SSL::SSLContext::DEFAULT_PARAMS[:ssl_version]
=> "SSLv23"

Since TLSv1.2, is a “higher” version than SSLv23, we were getting errors back because we a connection using TLSv1.2 would never be negotiated.

To get the ability to support TLS generically, instead of just hardcoding a specific version, e.g. TLSv1, or TLSv1_2, the ssl_version needs to be set to nil to tell Ruby’s OpenSSL components to use TLS instead of SSL.

This opened up the question of if we set OpenSSL to be TLS instead of SSL, would the TLS Protocol negotiate down to SSL if we happen to have an endpoint we need to talk to that doesn’t currently support TLS, but only SSL.

Playing around with Net::HTTP.start I was able to play with sending HTTPs requests using different settings for the ssl_version.

As I was also testing against a local instance of nginx that would only support SSLv3 using a self-signed certificate, I had the veryfy_mode to VERIFY_NONE for testing. Note that I do NOT recommend this for real use cases.

The first helper method I created doesn’t specify a ssl_version option, so it just uses the default ssl_version setting.

def test_url_no_version(url)
  Net::HTTP.start(url.hostname, nil,
                  use_ssl: url.scheme == "https",
                  verify_mode: OpenSSL::SSL::VERIFY_NONE ) do |http|
    response = http.request(Net::HTTP::Get.new(url))
    puts response.inspect
    response
  end
end
# => :test_url_no_version

The the next helper method I created sets the ssl_version option to nil to allow it to use TLS instead of SSL.

def test_url_ssl_version_is_nil(url)
  Net::HTTP.start(url.hostname, nil,
                  use_ssl: url.scheme == "https",
                  verify_mode: OpenSSL::SSL::VERIFY_NONE,
                  ssl_version: nil ) do |http|
    response = http.request(Net::HTTP::Get.new(url))
    puts response.inspect
    response
  end
end
# => :test_url

The the last helper method I created sets the ssl_version option to :SSLv3, which is the only version the webserver is setup to handle.

def test_url_ssl3(url)
  Net::HTTP.start(url.hostname, nil,
                  use_ssl: url.scheme == "https",
                  verify_mode: OpenSSL::SSL::VERIFY_NONE,
                  ssl_version: :SSLv3 ) do |http|
    response = http.request(Net::HTTP::Get.new(url))
    puts response.inspect
    response
  end
end
# => :test_url_ssl3

Now that we have these, we can test the results of asking the test webserver for a request and see what happens. We will also use Google as a baseline to compare against.

First we will test the verion where the ssl_version is set to nil. This would tell us if it would fall back to try a SSL variant.

test_url_ssl_version_is_nil(URI("https://www.google.com"))
# #<Net::HTTPOK 200 OK readbody=true>
# => #<Net::HTTPOK 200 OK readbody=true>
test_url_ssl_version_is_nil(URI("https://localhost/index.html"))
# OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unsupported protocol
# from /Users/proctor/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/net/http.rb:923:in `connect'

Google returns successfully, but the local test doesn’t so it doesn’t look to fall back to SSL from TLS.

Next we try with the default setting for ssl_version.

test_url_no_version(URI("https://www.google.com"))
# #<Net::HTTPOK 200 OK readbody=true>
# => #<Net::HTTPOK 200 OK readbody=true>
test_url_no_version(URI("https://localhost/index.html"))
# OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=SSLv2/v3 read server hello A: unsupported protocol
# from /Users/proctor/.rvm/rubies/ruby-2.2.3/lib/ruby/2.2.0/net/http.rb:923:in `connect'

Google still returns successfully, but the local test case still doesn’t work.

Finally we will test with specifying SSL v3 specifically.

test_url_ssl3(URI("https://www.google.com"))
# #<Net::HTTPOK 200 OK readbody=true>
# => #<Net::HTTPOK 200 OK readbody=true>
test_url_ssl3(URI("https://localhost/index.html"))
# #<Net::HTTPOK 200 OK readbody=true>
# => #<Net::HTTPOK 200 OK readbody=true>

And for this, Google and the local test both work. So we have shown that with the right ssl version specified, we do get a response back from our local test server, but the fallback from TLS to SSL doesn’t happen.

–Proctor