Today’s Ruby Tuesday takes a look at the concept of filter, a.k.a. select in Ruby, and how we could create our own version of it through some refactoring.
Filter is a function (or method) that can change the way you think about your programs, and help you take advantage of smaller building blocks that compose, or assemble, together into nice reusable pieces of code.
To get an understanding of when and where filter can be powerful, and how you could create filter on your own if it were not already given to you as Enumerable#select, we’ll look at some “typical” code, the kind you are likely to have encountered in your current code base or in past ones.
For this guide, we have a User class. We require 'date' because each user has a date of birth and a date of death; our users will be some historical figures from the world of Computer Science.
require 'date'
class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:, is_active:, date_of_birth: nil,
                 date_of_death: nil, languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end
We create some User objects for the creators of various programming languages, and add them to an Array of Users.
alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "",
                  is_active: false)
users = [alan_kay, john_mccarthy, robert_virding,
         dennis_ritchie, james_gosling, matz, nobody]
For most of our cases, we will want an easy way to see what users we have as a result of some operation, so let’s define a method that returns a list of just the names for a given list of users.
def get_names_for(users)
  names = []
  for user in users do
    names << user.name
  end
  names
end
So somewhere in our code base we have an area of code that wants to get only the active users from a given list of User objects.
We write our standard for loop, as we would in so many languages, with an if clause that checks the active? method on each User. We then do some other processing on that list, which we will represent by printing the names of the result with puts.
active_users = []
for user in users do
  if (user.active?)
    active_users << user
  end
end
puts "\n\nThe active users' names are:..."
puts get_names_for(active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil
Somewhere else in our code base, we have something that wants a list of the language creators who are still alive, because wouldn’t it be cool to have the chance to have lunch with one of them during a conference?
alive_users = []
for user in users do
  if (not user.date_of_death)
    alive_users << user
  end
end
puts "\n\nThe alive users' names are:..."
puts get_names_for(alive_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
#   (a blank line, because the last matching user has an empty name)
# => nil
And again, the puts just represents some processing of that list.
Yet somewhere else, we have some code that looks for those people that we know to have created more than one programming language.
users_created_more_than_one_language = []
for user in users do
  if (user.languages_created.count > 1)
    users_created_more_than_one_language << user
  end
end
puts "\n\nThe names for users who have created more than one language:..."
puts get_names_for(users_created_more_than_one_language)
# Alan Kay
# Robert Virding
# => nil
If we take a look at our three segments of code above, you will start to notice, if you haven’t already, that they are all very, very similar.
They all:
– create an empty array and assign it to a variable that represents the working list of items that meet some condition,
– iterate over all the items in the list of users,
– check some condition for each item,
– add the user to the working list if the condition is true,
– return the working list of items that meet the condition.
If we renamed the working variable to the same name in each, the only thing that would differ between the code segments is the conditional checked in the if clause.
matching = []
for user in users do
  if (something_specific_goes_here)
    matching << user
  end
end
matching
For a number of languages, you would have to live with that duplication, but this is Ruby, so we can use an escape hatch to make this code more generic and abstract.
The only thing that is different is the conditional, a.k.a. the predicate. The term predicate signifies a method, or function, that returns a boolean result.
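To make the term a bit more concrete, here is a small sketch of two predicates written as lambdas. The names `is_even` and `has_name` are made up for this illustration and do not appear anywhere else in this post.

```ruby
# Two predicates: each takes one argument and returns true or false.
is_even = lambda { |n| n.even? }
has_name = lambda { |name| !name.empty? }

# A lambda does nothing until it is invoked with `call`.
puts is_even.call(4)
# true
puts is_even.call(7)
# false
puts has_name.call("")
# false
```

Because a lambda is just a value, it can be stored in a variable or handed to a method as an argument.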
So if we want to abstract out the filtering of items that match some predicate condition, we can pass in a lambda, or Proc, as the predicate, and call it with each User to see if we get true returned.
def filter(users, predicate)
  matching = []
  for user in users do
    if (predicate.call(user))
      matching << user
    end
  end
  matching
end
We now have something that looks like we can re-use it elsewhere for a list of Users.
So let’s test it out, by redoing the previous checks to use the new filter method we just defined.
First we will call our new filter, and pass it a lambda that looks at the count of the languages that user created.
puts "\n\nThe names for users who have created more than one language (using `filter` method):..."
multi_language_creators = filter(users, lambda{|u| u.languages_created.count > 1 })
puts get_names_for(multi_language_creators)
# Alan Kay
# Robert Virding
# => nil
Looks to be the same as the previous version.
Let’s try it with finding those that don’t have a known date of death.
puts "\n\nThe names for users who are not dead (using `filter` method):..."
not_dead_users = filter(users, lambda{|u| not u.date_of_death})
puts get_names_for(not_dead_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
#   (a blank line, because the last matching user has an empty name)
# => nil
So far, so good.
Finally, we try it for active users.
puts "\n\nThe names for users who are active (using `filter` method):..."
filtered_active_users = filter(users, lambda{|u| u.active?})
puts get_names_for(filtered_active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil
Yay! We have extracted a common pattern of our code out into something that represents a higher abstraction of filtering out users with a certain condition from a list of User objects.
Not only that, we have separated the concern of iterating over items and checking each item, from the concern of the actual condition we care about.
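One way to picture that separation, as a rough sketch: the conditions can be given names of their own and kept apart from the looping code. The `Person` struct, the sample data, and the predicate names are stand-ins invented for this example, and `filter` is repeated so the snippet runs on its own.

```ruby
# Stand-in for the User class, invented for this example.
Person = Struct.new(:name, :languages_created)

def filter(users, predicate)
  matching = []
  for user in users do
    matching << user if predicate.call(user)
  end
  matching
end

people = [Person.new("Alan Kay", ["Smalltalk", "Squeak"]),
          Person.new("Dennis Ritchie", ["C"])]

# The conditions live here, each with a name of its own...
created_many = lambda { |p| p.languages_created.count > 1 }
created_c = lambda { |p| p.languages_created.include?("C") }

# ...and the looping lives in filter, which never has to change.
filter(people, created_many).each { |p| puts p.name }
# Alan Kay
filter(people, created_c).each { |p| puts p.name }
# Dennis Ritchie
```

Adding a new condition now means writing one more lambda, not one more loop.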
This seems pretty useful, and something that would apply beyond just a list of User objects.
Let’s see if we can do this for some Array of numbers as well.
Integers in Ruby have an even? method that we can use to know if a number is even.
puts "is 1 even???"
puts 1.even?
To get the even numbers from an Array of numbers, we have some code that looks very familiar.
even_numbers = []
for i in [1, 2, 3, 4, 5, 6, 7] do
  if (i.even?)
    even_numbers << i
  end
end
Let’s try out our new filter method, and see if we can use it on a list of numbers, and only get back those that are even.
puts "\n\nEven numbers"
evens = filter([1, 2, 3, 4, 5, 6, 7], lambda{|i| i.even?})
puts evens
# 2
# 4
# 6
# => nil
That works!!! Let the celebration commence!!!
Well, let it commence after we clean up our filter method to let it represent that it is for more than just a list of Users.
def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end
Instead of users, the parameter is now items, and each element we loop over is an item instead of a user.
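Since Ruby already gives us this behavior as Enumerable#select, a quick sanity check is to compare the two on the same input. This is only a sketch: `filter` is repeated here so the snippet runs on its own, and `numbers` and `is_even` are names chosen for the example.

```ruby
def filter(items, predicate)
  matching = []
  for item in items do
    matching << item if predicate.call(item)
  end
  matching
end

numbers = [1, 2, 3, 4, 5, 6, 7]
is_even = lambda { |i| i.even? }

# select takes its condition as a block instead of a lambda argument.
puts filter(numbers, is_even) == numbers.select { |i| i.even? }
# true
puts filter((1..7), is_even) == (1..7).select { |i| i.even? }
# true
```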
We can also use our filter method against Ranges and Hashes.
puts "\n\nOdd numbers"
odds = filter((1..7), lambda{|i| i.odd?})
puts odds
# 1
# 3
# 5
# 7
# => nil
puts filter({1 => :a, 2 => :b, 3 => :c, 4 => :d}, lambda{|(key, value)| key.even?}).inspect
# [[2, :b], [4, :d]]
# => nil
So by taking advantage of lambdas, Procs, or even blocks in Ruby, we have been able to extract out a recurring pattern in our code and give it a name.
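We only used lambdas above, so here is a minimal sketch of what a block-taking variant might look like. The name `filter_with_block` is made up for this example; the body is the same loop, with `yield` in place of `predicate.call`.

```ruby
# filter_with_block is a name made up for this sketch.
def filter_with_block(items)
  matching = []
  for item in items do
    # yield hands the item to the block supplied by the caller
    matching << item if yield(item)
  end
  matching
end

evens = filter_with_block([1, 2, 3, 4, 5, 6, 7]) { |i| i.even? }
puts evens
# 2
# 4
# 6
```

Written this way, the call has the same shape as a call to Enumerable#select.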
Not only that, but we saw how we could write the start of one ourselves, and with some work, we could get it to return a proper hash instead of a list of key-value lists.
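As one possible take on that last bit of work, here is a sketch of a Hash-specific variant that collects the matches into a new Hash. The name `filter_hash` is made up for this example, and it assumes the predicate still receives a single key-value pair, as in the Hash example above.

```ruby
# filter_hash is a name made up for this sketch.
def filter_hash(hash, predicate)
  matching = {}
  for key, value in hash do
    # the predicate still receives one [key, value] pair
    matching[key] = value if predicate.call([key, value])
  end
  matching
end

letters = {1 => :a, 2 => :b, 3 => :c, 4 => :d}
result = filter_hash(letters, lambda{|(key, value)| key.even?})
expected = {2 => :b, 4 => :d}
puts result == expected
# true
```

Ruby’s own Hash#select behaves this way as well, returning a Hash when called on a Hash.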
-Proctor