Ruby Tuesday – Refactoring towards creating filter

Today’s Ruby Tuesday takes a look at the concept of filter, a.k.a. select in Ruby, and how we could create our own version of it through some refactoring.

Filter is a function/method that can really change the way you think about your programs. It helps you take advantage of smaller building blocks that compose, or assemble, together to create nice reusable pieces of code.

To see when and where filter is powerful, and how you could write it yourself if it weren’t already given to you as Enumerable#select, we’ll look at some “typical” code, the kind you have likely encountered in your current code base or in past ones.

For this guide we have a User class, and we require 'date' because each user can have a date of birth and a date of death; we will be using some historical figures from the world of Computer Science.

require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:, is_active:, date_of_birth: nil,
                 date_of_death: nil, languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end

We create some User objects for the creators of various programming languages, and add them to an Array of Users.

alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)

users = [alan_kay, john_mccarthy, robert_virding, 
         dennis_ritchie, james_gosling, matz, nobody]

For most of our cases, we will want an easy way to see which users we have as the result of some operation, so let’s define a method that returns a list of just the names for a given list of users.

def get_names_for(users)
  names = []
  for user in users do
    names << user.name
  end
  names
end

So somewhere in our code base we have an area of code that wants to get only the active users from a given list of User objects.

We do our standard for loop, as we would in so many languages, with an if clause that checks the active? method on a User. We then do some other processing on that list, which we will represent here by printing the names of the result with puts.

active_users = []
for user in users do
  if (user.active?)
    active_users << user
  end
end

puts "\n\nThe active users' names are:..."
puts get_names_for(active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil

Somewhere else in our code base, we have something that wants a list of the language creators who are still alive, because wouldn’t it be cool to have the chance to have lunch with them during a conference?

alive_users = []
for user in users do
  if (not user.date_of_death)
    alive_users << user
  end
end

puts "\n\nThe alive users' names are:..."
puts get_names_for(alive_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
# 
# => nil

And again, the puts just represents some processing of that list.

Yet somewhere else, we have some code that looks for those people that we know to have created more than one programming language.

users_created_more_than_one_language = []
for user in users do
  if (user.languages_created.count > 1)
    users_created_more_than_one_language << user
  end
end

puts "\n\nThe names for users who have created more than one language:..."
puts get_names_for(users_created_more_than_one_language)
# Alan Kay
# Robert Virding
# => nil

If we take a look at our three segments of code above, we will start to notice, if we haven’t already, that they are all very, very similar.

They all:
– create an empty array and assign it to a variable that represents the working list of items that meet some condition,
– iterate over all the items in the list of users,
– check some condition for each item,
– add the user to the working list if the condition is true,
– end up with the working list of items that meet the condition.

If we gave the working variable the same name in each segment, the only thing that would differ between them is the conditional that is checked as part of the if clause.

matching = []
for user in users do
  if (something_specific_goes_here)
    matching << user
  end
end
matching

For a number of languages, you would have to live with that duplication, but this is Ruby, so we can use an escape hatch to make this code more generic and abstract.

The only thing that is different is the conditional, a.k.a. the predicate. The term predicate signifies a method, or function, that returns a boolean result.
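As a rough sketch of that idea, a predicate can be held in a lambda and invoked with call. The name is_even is made up purely for illustration and is not part of the code in this post.

```ruby
# Hypothetical predicate, named here only for illustration.
# It takes one argument and answers true or false.
is_even = lambda { |n| n.even? }

puts is_even.call(2)
# true
puts is_even.call(3)
# false
```

The lambda does nothing until call is sent to it, which is what lets us hand the check to some other piece of code to run later.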

So if we want to abstract out the filtering of items that match some predicate condition, we can pass the predicate in as a lambda, or Proc, and call it with each User to see if we get true back.

def filter(users, predicate)
  matching = []
  for user in users do
    if (predicate.call(user))
      matching << user
    end
  end
  matching
end

We now have something that looks like we can re-use it elsewhere for a list of Users.

So let’s test it out, by redoing the previous checks to use the new filter method we just defined.

First we will call our new filter, and pass it a lambda that looks at the count of the languages that user created.

puts "\n\nThe names for users who have created more than one language (using `filter` method):..."
multi_language_creators = filter(users, lambda{|u| u.languages_created.count > 1 })
puts get_names_for(multi_language_creators)
# Alan Kay
# Robert Virding
# => nil

Looks to be the same as the previous version.

Let’s try it with finding those that don’t have a known date of death.

puts "\n\nThe names for users who are not dead (using `filter` method):..."
not_dead_users = filter(users, lambda{|u| not u.date_of_death})
puts get_names_for(not_dead_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
# 
# => nil

So far, so good.

Finally, we try it for active users.

puts "\n\nThe names for users who are active (using `filter` method):..."
filtered_active_users = filter(users, lambda{|u| u.active?})
puts get_names_for(filtered_active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil

Yay! We have extracted a common pattern of our code out into something that represents a higher abstraction of filtering out users with a certain condition from a list of User objects.

Not only that, we have separated the concern of iterating over items and checking each item, from the concern of the actual condition we care about.
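One way to picture that separation, as a sketch only: the predicates can be given names and kept apart from the looping code. The Person struct and the lambda names created_many and created_ruby are assumptions made for this example; filter is repeated so the snippet runs on its own.

```ruby
# Copy of the filter method from above, repeated so this sketch is self-contained.
def filter(users, predicate)
  matching = []
  for user in users do
    matching << user if predicate.call(user)
  end
  matching
end

# Hypothetical stand-in for the User class, just enough for this example.
Person = Struct.new(:name, :languages_created)

people = [Person.new("Alan Kay", ["Smalltalk", "Squeak"]),
          Person.new("Yukihiro Matsumoto", ["Ruby"])]

# Hypothetical named predicates; neither one knows anything about looping.
created_many = lambda { |person| person.languages_created.count > 1 }
created_ruby = lambda { |person| person.languages_created.include?("Ruby") }

filter(people, created_many).each { |person| puts person.name }
# Alan Kay
filter(people, created_ruby).each { |person| puts person.name }
# Yukihiro Matsumoto
```

The loop lives in one place, and each condition can be read, named, and reused without touching it.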

This seems pretty useful, and something that would apply beyond just a list of User objects.

Let’s see if we can do this for some Array of numbers as well.

Integers in Ruby have an even? method that we can use to know if a number is even.

puts "is 1 even???"
puts 1.even?

To get the even numbers from an Array of numbers, we have some code that looks very familiar.

even_numbers = []
for i in [1, 2, 3, 4, 5, 6, 7] do
  if (i.even?)
    even_numbers << i
  end
end

Let’s try out our new filter method, and see if we can use it on a list of numbers, and only get back those that are even.

puts "\n\nEven numbers"
evens = filter([1, 2, 3, 4, 5, 6, 7], lambda{|i| i.even?})
puts evens
# 2
# 4
# 6
# => nil

That works!!! Let the celebration commence!!!

Well, let it commence after we clean up our filter method to let it represent that it is for more than just a list of Users.

def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end

Instead of users, the parameter is now items, and each element of the list we loop through is an item instead of a user.

We can also use our filter method against Ranges and Hashes.

puts "\n\nOdd numbers"
odds = filter((1..7), lambda{|i| i.odd?})
puts odds
# 1
# 3
# 5
# 7
# => nil
puts filter({1 => :a, 2 => :b, 3 => :c, 4 => :d}, lambda{|(key, value)| key.even?}).inspect
# [[2, :b], [4, :d]]
# => nil

So by taking advantage of lambdas, Procs, or even blocks in Ruby, we have been able to extract out a recurring pattern in our code and give it a name.
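For the block flavor, a minimal sketch might look like the following. The method name filter_with_block is an assumption chosen for this example; it uses yield in place of predicate.call.

```ruby
# Hypothetical block-taking variant of filter; the name is an assumption.
def filter_with_block(items)
  matching = []
  for item in items do
    # yield hands the item to the caller's block and gives back the block's result
    matching << item if yield(item)
  end
  matching
end

evens = filter_with_block([1, 2, 3, 4, 5, 6, 7]) { |i| i.even? }
puts evens.inspect
# [2, 4, 6]

# The built-in Enumerable#select is called in the same style.
selected = [1, 2, 3, 4, 5, 6, 7].select { |i| i.even? }
puts selected.inspect
# [2, 4, 6]
```

Called this way, our method reads just like Enumerable#select, which is the version Ruby already gives us.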

Not only that, but we saw how we could write the start of one ourselves, and with some work, we could get it to return a proper hash instead of a list of key-value lists.
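As one possible take on that last bit of work, here is a sketch of a variant that builds a Hash back up from the matching pairs. The name filter_hash, and the choice to pass the key and value to the predicate as two arguments, are assumptions made for this example.

```ruby
# Hypothetical variant that rebuilds a Hash from the matching pairs.
def filter_hash(hash, predicate)
  matching = {}
  for key, value in hash do
    # The predicate receives the key and the value as two separate arguments.
    matching[key] = value if predicate.call(key, value)
  end
  matching
end

letters = {1 => :a, 2 => :b, 3 => :c, 4 => :d}
even_keys = filter_hash(letters, lambda { |key, value| key.even? })

puts(even_keys == {2 => :b, 4 => :d})
# true

# The built-in Hash#select also returns a Hash.
puts(letters.select { |key, value| key.even? } == even_keys)
# true
```

A fuller version would probably pick the result type based on what was passed in, but that is beyond what this sketch tries to do.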

-Proctor
