Ruby Tuesday – Refactoring Towards Creating reduce

As we continue with the theme we have been pursuing in the last couple of posts, we will look at refactoring to reduce, and will take a look at how we can use this with what we have built on from previous posts.

Again for reference, here is our setup data of a User object

require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:,
                 is_active:,
                 date_of_birth: nil,
                 date_of_death: nil,
                 languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end

and our list of users.

alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)

users = [alan_kay, john_mccarthy, robert_virding,
         dennis_ritchie, james_gosling, matz, nobody]

In our theoretical code base we have some code that will find the oldest language creator.

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next unless user.date_of_death.nil?
    next if user.date_of_birth.nil?
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
# => "Alan Kay"

That is pretty nasty, so let’s see if we can clean it up some and see what happens.

First, inside the for we have both and if and unless, so let’s refactor the unless to be an if.

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next if (not user.date_of_death.nil?)
    next if user.date_of_birth.nil?
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

pry(main)> oldest_language_creator(users).name
# => "Alan Kay"

And while we are at it, we will refactor out the conditions in the ifs to give them clarifying names.

def alive?(user)
  user.date_of_death.nil?
end

def has_known_birthday?(user)
  not user.date_of_birth.nil?
end

def oldest_language_creator(users)
  oldest = nil
  for user in users do
    next if not alive?(user)
    next if not has_known_birthday?(user)
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
# => "Alan Kay"

Still works, and that we now have multiple if in our for loop, we can think back to a couple of posts ago, and realize we have a couple of filters happening in our list, and then some logic around who has the earliest birth date.

So let’s refactor out the filters and see what our method starts to look like.

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})

  oldest = nil
  for user in with_birthdays do
    if (oldest.nil? || oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

oldest_language_creator(users).name
#=> "Alan Kay"

This next refactoring might be a bit of a jump, but for me, I am not too fond with starting out with a nil and having to check that every time, since it will be fixed on the first time around, so let’s clean that up.

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users, 
                          lambda{|user| has_known_birthday?(user)})

  oldest, *rest = with_birthdays
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

Let’s refactor out our for loop into another method, so we can look at it on its own.

def user_with_earliest_birthday(users)
  oldest, *rest = users
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})
  user_with_earliest_birthday(with_birthdays)
end

Now we have a pattern here, and it has been present in our filter and map as well, if you can see it, so let’s see if we can identify it.

def user_with_earliest_birthday(users)
  oldest, *rest = users
  for user in rest do
    if (oldest.date_of_birth > user.date_of_birth)
      oldest = user
    end
  end
  oldest
end

def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end

def map(items, do_map)
  results = []
  for item in items do
    results << do_map.call(item)
  end
  results
end

If you haven’t detangled it, the pattern is:
1. We have some initial value,
2. and for every item in a list we do some operation against that item value and the current accumulated value, which results in a new value
3. we return the accumulated value.

With our user_with_earliest_birthday method, the initial accumlated value is the first user, the operation is a compare against that with each item to find the oldest user so far, and we return the oldest user.

With filter, the initial accumulated value is an empty Array, the operation is an append if some critera is met, and the return is the accumlated array for those items that meet that criteria.

With map, the initial accumulated value is an empty Array, the operation is an append of the result of a transformation function on each value, and the return is the accumlated array for the transformed results.

This pattern is called reduce.

So what would this look like generically???

def reduce(initial, items, operation)
  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

Let’s write our user_with_earliest_birthday using this new reduce then, and consume it in our oldest_language_creator

def oldest_language_creator(users)
  alive_users = filter(users, lambda{|user| alive?(user)})
  with_birthdays = filter(alive_users,
                          lambda{|user| has_known_birthday?(user)})

  reduce(with_birthdays.first, with_birthdays.drop(1),
         lambda do |oldest, user|
           oldest.date_of_birth > user.date_of_birth ? user : oldest
         end)
end

Our accumulator starts with the first user in the list, uses the rest of the list to iterate through, and then will return either the accumulator (oldest_so_far) or the current item (user), which would be assigned to the accumulator value for the next iteration.

So how would we write our map and filter to use this new reduce?

def map(items, do_map)
  reduce([], items,
         lambda do |accumulator, item|
           accumulator.dup << do_map.call(item)
         end)
end

def filter(items, predicate)
  reduce([], items, lambda do |accumulator, item|
    if (predicate.call(item))
      accumulator.dup << item
    else
      accumulator
    end
  end)
end

For our new map, our operation is to call the do_map lambda given to the function, and add the transformed value to a duplicate of the original accumulator. While in these cases, it is not necessary to duplicate the original accumulator, I did so here to mirror that in reduce, we are getting what could be considered a completely new value, such as we have with our oldest_language_creator version that uses reduce.

And for our new filter our operation either returns the original accumulator, or adds the item to a new copy of the accumulated list if the predicate passed to filter returns truth. Again, we could leave out the duplication, but for purity sake, and working out the logic we will keep it in there.

So let’s step through our new filter and see what happens one step at a time with it now using reduce.

filter((1..9), lambda{|item| item.odd?})

If we inline reduce substituting the variables given to filter, it looks like the following.

reduce([], (1..9), lambda do |accumulator, item|
  if (lambda{|item| item.odd?}.call(item))
    accumulator.dup << item
  else
    accumulator
  end
end)

And if we expand the body of reduce, and rename it to filter_odds, we get

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = lambda do |accumulator, item|
      if (lambda{|item| item.odd?}.call(item))
        accumulator.dup << item
      else
        accumulator
      end
    end.call(accumulator, item)
  end
  accumulator
end

And we inline the calls to the lambda that came is from the predicate

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = lambda do |accumulator, item|
      if (item.odd?)
        accumulator.dup << item
      else
        accumulator
      end
    end.call(accumulator, item)
  end
  accumulator
end

and inline the lambda for the operation to given to reduce

def filter_odds()
  accumulator = []
  for item in (1..9) do
    accumulator = if (item.odd?)
                    accumulator.dup << item
                  else
                    accumulator
                  end
  end
  accumulator
end

And we can see how through filter and reduce, we get back to something that looks like the orignal filtering out of odd numbers from a list.

And to test out reduce further, let’s add some numbers together.

We will call reduce with our initial accumulated “sum” of 0, the numbers from 1 to 10, and a lambda that adds the two numbers together to produce a new running sum.

reduce(0, (1..10), lambda{|accum, item| accum + item})
# => 55

And we do the same for a reduce that computes the product of a list of numbers.

This time our initial accumulator value is 1 which is the identity operation of multiplication.

reduce(1, (1..10), lambda{|accum, item| accum * item})
# => 3628800

But if we call it with an empty list, we return 1 still

reduce(1, [], lambda{|accum, item| accum * item})
# => 1

So we need to clean up our reduce some to make it more robust in the case of reducing against empty lists.

def reduce(initial, items, operation)
  return nil if items.empty?

  accumulator = initial
  for item in items do
    accumulator = operation.call(accumulator, item)
  end
  accumulator
end

And now our reduce handles empty lists nicely, or at the least, a little more sanely.

reduce(1, [], lambda{|accum, item| accum * item})
# => nil

With all of that, we have refactored our code into something close to Ruby’s Enumerable#select, expect that we return nil if the enumerable is empty, instead of the initial value for the accumulator.

–Proctor