Ruby Tuesday – composing filter

As mentioned in the last Ruby Tuesday on using compose and map, I ended with saying that we would take a look at how we would take advantage of composition for our filter.

As a refresher, here is our Sequence module.

module Sequence

  def self.my_map(f, items)
    do_map = lambda do |accumulator, item|
      accumulator.dup << f.call(item)
    end

    my_reduce([], do_map, items)
  end

  def self.my_filter(predicate, items)
    do_filter = lambda do |accumulator, item|
      if (predicate.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end

    my_reduce([], do_filter, items)
  end

  def self.my_reduce(initial, operation, items)
    return nil unless items.any?{|item| true}

    accumulator = initial
    for item in items do
      accumulator = operation.call(accumulator, item)
    end
    accumulator
  end

  def self.my_compose(functions, initial)
    apply_fn = ->(accum, fn) { fn.(accum) }
    my_reduce(initial, apply_fn, functions)
  end

  @@map = method(:my_map).curry
  @@filter = method(:my_filter).curry
  @@reduce = method(:my_reduce).curry
  @@compose = method(:my_compose).curry

  def self.map
    @@map
  end

  def self.filter
    @@filter
  end

  def self.reduce
    @@reduce
  end

  def self.compose
    @@compose
  end
end

So lets start taking a look at how we would be able to compose our filter lambdas together by seeing if we can find all numbers that are multiples of three, and then multiples of five.

multiple_of_three = ->(x) { x % 3 == 0}
# => #<Proc:0x007faeb28926b0@(pry):1 (lambda)>
multiple_of_five = ->(x) { x % 5 == 0}
# => #<Proc:0x007faeb3866c58@(pry):2 (lambda)>
Sequence.compose.([Sequence.filter.(multiple_of_three),
                   Sequence.filter.(multiple_of_five)]).((1..100))
=> [15, 30, 45, 60, 75, 90]

The catch again is that we are traversing the first list for all 100 elements to find those items that are multiples of 3, and then filtering the filtered list to find only those that are multiples of 5.

What if we could do this in one pass, and compose the predicates together?

And since we want to test this to make sure we get a good result, we will compose those predicates together and just test our composed function with the number 15.

three_and_five = Sequence.compose.([multiple_of_three, multiple_of_five])
=> #<Proc:0x007faeb4905d20 (lambda)>
three_and_five.(15)
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

It fails with a NoMethodError telling us that we can’t use the method % on an object of type TrueClass.

Don’t we want to get back a true instead of an error though?

If we rollback the compose into it’s nested function call version, it starts to become clear where the error is coming from.

multiple_of_five.(multiple_of_three.(15))
# NoMethodError: undefined method `%' for true:TrueClass
# from (pry):2:in `block in __pry__'

So what is happening is that we get back true from calling multiple_of_three with 15, and we then pass that true into multiple_of_five, when what we wanted to do was to pass in the 15 and make sure it was a multiple of 5 as well.

What we really want is to evaluate every lambda with the value of 15, and then then see if all of the predicate functions succeed in their checks.

So we will start with a very naive, and “un-curried” version, to prove out our concept.

Our first pass will be to map over each predicate and invoke it with the value we want to test resulting in a list of booleans if it succeeded or failed. We then reduce all of those items together via an and operation to get an overall success status.

def my_all_succeed_naive(predicates, value)
  check_results = Sequence.map.(->(f) {f.(value)}, predicates)
  Sequence.reduce.(true, ->(accum, item) {accum && item}, check_results)
end

my_all_succeed_naive([multiple_of_three, multiple_of_five], 15)
# => true
my_all_succeed_naive([multiple_of_three, multiple_of_five], 14)
# => false
my_all_succeed_naive([multiple_of_three, multiple_of_five], 5)
# => false

This looks like this works, as we get 15 is a multiple of 3 and a multiple of 5, but we still check if the item is a multiple of 5, even if our multiple of 3 check failed.

Could we do better, and short circuit our evaluation if we get a false?

Let’s try.

def my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

So let’s take a look at the difference between the two, just to make sure we are on the right track.

First we will create a “long running predicate check” to add to our chain.

[22] pry(main)> pass_after = ->(x, value) { sleep(x); true }.curry
=> #<Proc:0x007faeb310d498 (lambda)>

Then we will time it using Benchmark#measure (and don’t forget to require 'benchmark' first though). First with a success, and then with a failure against the first predicate.

Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 15)
end
# => #<Benchmark::Tms:0x007faeb28f24c0
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0046792929642834,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>


Benchmark.measure do
  my_all_succeed([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb6018c88
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=2.5073997676372528e-05,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

We can see that it takes three seconds to run if we succeed, but only a fraction of a second if we fail on the first check. And just to make sure our earlier assumptions were correct, we will benchmark the same predicate list against the naive version with the value of 14, so it will fail on the first check of a multiple of three.

Benchmark.measure do
  my_all_succeed_naive([multiple_of_three, pass_after.(3), multiple_of_five], 14)
end
# => #<Benchmark::Tms:0x007faeb487d218
#  @cstime=0.0,
#  @cutime=0.0,
#  @label="",
#  @real=3.0028793679666705,
#  @stime=0.0,
#  @total=0.0,
#  @utime=0.0>

And it does indeed take just over 3 seconds to complete.

So let’s add this to our Sequence module, and get it to be able to be curried.

So what if we did that for just a single string, not even in an list of some sort?

def self.my_all_succeed(predicates, value)
  for predicate in predicates do
    return false unless predicate.(value)
  end
  true
end

@@all_succeed = method(:my_all_succeed).curry

def self.all_succeed
  @@all_succeed
end

And we check that we can use it off our Sequence module partially applied.

Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(14)
# => false
Sequence.all_succeed.([multiple_of_three, multiple_of_five]).(15)
# => true

So now we can get back to how we would use this with filter by using our new Sequence::all_succeed.

three_and_five_multiple = Sequence.all_succeed.([multiple_of_three, multiple_of_five])
# => #<Proc:0x007faeb28685e0 (lambda)>
Sequence.filter.(three_and_five_multiple).((1..100))
# => [15, 30, 45, 60, 75, 90]

And there we go, we have now composed our predicates into one function which we can then pass to Sequence::filter and only have to walk through the list once.

As we have seen how to compose predicates together for an “and” style composition, next week we will look at building an “or” style composition of predicates.

–Proctor