Today’s Ruby Tuesday takes a look at the concept of filter, a.k.a. select in Ruby, and how we could create our own version of it through some refactoring.
Filter is a function/method that can change the way you think about your programs, and help you take advantage of smaller building blocks that compose, or assemble, together to create nice reusable pieces of code.
To get an understanding of when and where filter can be powerful, and how you could create filter on your own if it were not already given to you as Enumerable#select, we’ll look at some “typical” code of the kind you are likely to have encountered in your current or past code bases.
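As a quick refresher before we start, here is a minimal sketch of the built-in method we are refactoring towards. The numbers array is just an illustrative value, not part of the code base used in the rest of this post.

```ruby
# Enumerable#select keeps the elements for which the block returns a truthy value.
numbers = [1, 2, 3, 4, 5, 6]   # illustrative data only
evens = numbers.select { |n| n.even? }
p evens   # prints [2, 4, 6]
```

In Ruby 2.6 and later, filter is also available as an alias for select, so the name used in this post matches one you can call directly.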
For this guide, we have a User class, and we require 'date' because we want each user to have a date of birth and a date of death, as we will be using some historical figures from the world of Computer Science.
require 'date'

class User
  attr_reader :name, :date_of_birth, :date_of_death, :languages_created

  def initialize(name:, is_active:, date_of_birth: nil, date_of_death: nil, languages_created: [])
    @name = name
    @is_active = is_active
    @date_of_birth = date_of_birth
    @date_of_death = date_of_death
    @languages_created = languages_created
  end

  def active?
    @is_active
  end

  def to_s
    inspect
  end
end
We create some User objects for the creators of various programming languages, and add them to an Array of Users.
alan_kay = User.new(name: "Alan Kay",
                    is_active: true,
                    date_of_birth: Date.new(1940, 5, 17),
                    languages_created: ["Smalltalk", "Squeak"])
john_mccarthy = User.new(name: "John McCarthy",
                         is_active: true,
                         date_of_birth: Date.new(1927, 9, 4),
                         date_of_death: Date.new(2011, 10, 24),
                         languages_created: ["Lisp"])
robert_virding = User.new(name: "Robert Virding",
                          is_active: true,
                          languages_created: ["Erlang", "LFE"])
dennis_ritchie = User.new(name: "Dennis Ritchie",
                          is_active: true,
                          date_of_birth: Date.new(1941, 9, 9),
                          date_of_death: Date.new(2011, 10, 12),
                          languages_created: ["C"])
james_gosling = User.new(name: "James Gosling",
                         is_active: true,
                         date_of_birth: Date.new(1955, 5, 19),
                         languages_created: ["Java"])
matz = User.new(name: "Yukihiro Matsumoto",
                is_active: true,
                date_of_birth: Date.new(1965, 4, 14),
                languages_created: ["Ruby"])
nobody = User.new(name: "", is_active: false)

users = [alan_kay, john_mccarthy, robert_virding, dennis_ritchie, james_gosling, matz, nobody]
For most of our cases, we will want an easy way to see what users we have as a result of some operation, so let’s define a method that returns a list of just the names for a given list of users.
def get_names_for(users)
  names = []
  for user in users do
    names << user.name
  end
  names
end
So somewhere in our code base we have an area of code that wants to get only the active users from a given list of User objects.
We do our standard for loop, as we would in so many languages, and we have an if clause that checks the active? method on a User. We do some other processing on that list, which we will represent by using puts to print out the names of the result.
active_users = []
for user in users do
  if (user.active?)
    active_users << user
  end
end

puts "\n\nThe active users' names are:..."
puts get_names_for(active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil
Somewhere else in our code base, we have something that wants a list of the language creators that are still alive, because wouldn’t it be cool to have the chance to have lunch with them during a conference?
alive_users = []
for user in users do
  if (not user.date_of_death)
    alive_users << user
  end
end

puts "\n\nThe alive users' names are:..."
puts get_names_for(alive_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
#
# => nil
And again, the puts just represents some processing of that list.
Yet somewhere else, we have some code that looks for those people that we know to have created more than one programming language.
users_created_more_than_one_language = []
for user in users do
  if (user.languages_created.count > 1)
    users_created_more_than_one_language << user
  end
end

puts "\n\nThe names for users who have created more than one language:..."
puts get_names_for(users_created_more_than_one_language)
# Alan Kay
# Robert Virding
# => nil
If we take a look at the three segments of code above, you will notice, if you haven’t already, that they are all very, very similar.
They all:
– create an empty array and assign it to a variable that represents the working list of items that meet some condition,
– iterate over all the items in the list of users,
– check some condition for each item,
– add the user to the working list if the condition is true,
– return the working list of items that meet the condition.
If we gave the working variable the same name in each segment, the only thing that would differ between them is the conditional that is checked as part of the if clause.
matching = []
for user in users do
  if (something_specific_goes_here)
    matching << user
  end
end
matching
For a number of languages, you would have to live with that duplication, but this is Ruby, so we can use an escape hatch to make this code more generic and abstract.
The only thing that is different is the conditional, a.k.a. the predicate. The term predicate signifies a method, or function, that returns a boolean result.
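To make the term a little more concrete, here is a small sketch of two predicates, one written as a lambda and one as a Proc. The names is_even and is_long are made up for illustration; the only thing that matters is that each responds to call and answers with true or false.

```ruby
# A predicate answers a yes/no question about its argument.
is_even = lambda { |n| n.even? }           # lambda form (illustrative name)
is_long = Proc.new { |s| s.length > 5 }    # Proc form (illustrative name)

puts is_even.call(4)             # true
puts is_even.call(7)             # false
puts is_long.call("Ruby")        # false
puts is_long.call("Smalltalk")   # true
```

Both objects are invoked the same way, which is what lets a method accept either one without caring which it was given.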
So if we want to abstract out the filtering of items that match some predicate condition, we can take a lambda, or Proc, as the predicate, and call that predicate, passing in the User, to see if we get true returned.
def filter(users, predicate)
  matching = []
  for user in users do
    if (predicate.call(user))
      matching << user
    end
  end
  matching
end
We now have something that looks like we can re-use it elsewhere for a list of Users.
So let’s test it out by redoing the previous checks to use the new filter method we just defined.
First we will call our new filter, and pass it a lambda that looks at the count of the languages that user created.
puts "\n\nThe names for users who have created more than one language (using `filter` method):..."
multi_language_creators = filter(users, lambda{|u| u.languages_created.count > 1 })
puts get_names_for(multi_language_creators)
# Alan Kay
# Robert Virding
# => nil
Looks to be the same as the previous version.
Let’s try it with finding those that don’t have a known date of death.
puts "\n\nThe names for users who are not dead (using `filter` method):..."
not_dead_users = filter(users, lambda{|u| not u.date_of_death})
puts get_names_for(not_dead_users)
# Alan Kay
# Robert Virding
# James Gosling
# Yukihiro Matsumoto
#
# => nil
So far, so good.
Finally, we try it for active users.
puts "\n\nThe names for users who are active (using `filter` method):..."
filtered_active_users = filter(users, lambda{|u| u.active?})
puts get_names_for(filtered_active_users)
# Alan Kay
# John McCarthy
# Robert Virding
# Dennis Ritchie
# James Gosling
# Yukihiro Matsumoto
# => nil
Yay! We have extracted a common pattern of our code out into something that represents a higher abstraction: filtering a list of User objects down to the users that meet a certain condition.
Not only that, we have separated the concern of iterating over items and checking each item, from the concern of the actual condition we care about.
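One way to see that separation is to give the conditions names of their own and hand them to the same filter. The sketch below is self-contained, so it uses a small hypothetical Person struct with placeholder data in place of the User class, and the predicate names is_active and made_several are made up for illustration.

```ruby
# Hypothetical stand-in for the User class so the sketch runs on its own.
Person = Struct.new(:name, :active, :languages)

# Same shape as the filter method from this post: iterate, ask the predicate, collect.
def filter(items, predicate)
  matching = []
  for item in items do
    matching << item if predicate.call(item)
  end
  matching
end

# Placeholder data, not real people.
people = [
  Person.new("first",  true,  ["X", "Y"]),
  Person.new("second", true,  ["Z"]),
  Person.new("third",  false, ["V", "W"])
]

# The conditions live on their own, apart from the looping.
is_active    = lambda { |person| person.active }
made_several = lambda { |person| person.languages.count > 1 }

active_names  = filter(people, is_active).map(&:name)
several_names = filter(people, made_several).map(&:name)
both_names    = filter(filter(people, is_active), made_several).map(&:name)

p active_names    # ["first", "second"]
p several_names   # ["first", "third"]
p both_names      # ["first"]
```

The looping code is written once, and each new question we want to ask of the list only costs us a one-line predicate.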
This seems pretty useful, and something that would apply beyond just a list of User objects.
Let’s see if we can do this for some Array of numbers as well.
Integers in Ruby have an even? method that we can use to know if a number is even.
puts "is 1 even???"
puts 1.even?
To get the even numbers from an Array of numbers, we have some code that looks very familiar.
even_numbers = []
for i in [1, 2, 3, 4, 5, 6, 7] do
  if (i.even?)
    even_numbers << i
  end
end
Let’s try out our new filter method, and see if we can use it on a list of numbers, and only get back those that are even.
puts "\n\nEven numbers"
evens = filter([1, 2, 3, 4, 5, 6, 7], lambda{|i| i.even?})
puts evens
# 2
# 4
# 6
# => nil
That works!!! Let the celebration commence!!!
Well, let it commence after we clean up our filter method to let it represent that it is for more than just a list of Users.
def filter(items, predicate)
  matching = []
  for item in items do
    if (predicate.call(item))
      matching << item
    end
  end
  matching
end
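As a sanity check, we can compare this generic filter against the built-in Enumerable#select on the same input. This is only a sketch: it repeats the filter definition so it runs on its own, and the name is_even is made up for illustration.

```ruby
# The generic filter, repeated so this sketch is self-contained.
def filter(items, predicate)
  matching = []
  for item in items do
    matching << item if predicate.call(item)
  end
  matching
end

is_even = lambda { |i| i.even? }   # illustrative predicate name
numbers = [1, 2, 3, 4, 5, 6, 7]

ours    = filter(numbers, is_even)
builtin = numbers.select(&is_even)   # & hands the lambda to select as its block

p ours             # [2, 4, 6]
p builtin          # [2, 4, 6]
p ours == builtin  # true
```

For Arrays, at least, our version and the built-in agree, which is a good sign that we have rebuilt the same idea.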
Instead of users, the parameter is now named items, and each element we loop through in that list is an item instead of a user.
We can also use our filter method against Ranges and Hashes.
puts "\n\nOdd numbers"
odds = filter((1..7), lambda{|i| i.odd?})
puts odds
# 1
# 3
# 5
# 7
# => nil

puts filter({1 => :a, 2 => :b, 3 => :c, 4 => :d}, lambda{|(key, value)| key.even?}).inspect
# [[2, :b], [4, :d]]
# => nil
So by taking advantage of lambdas, Procs, or even blocks in Ruby, we have been able to extract out a recurring pattern in our code and give it a name.
Not only that, but we saw how we could start writing one ourselves, and with some work, we could get it to return a proper hash instead of a list of key-value lists.
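As a rough idea of what that extra work might look like, here is one possible sketch. The method name filter_with_block is hypothetical, and it differs from the filter above in two assumed ways: it takes a block instead of a lambda argument, and it converts the result back to a Hash when it was given a Hash.

```ruby
# Hypothetical variant of filter: the predicate is a block, and a Hash
# input produces a Hash result instead of a list of key-value pairs.
def filter_with_block(items)
  matching = []
  for item in items do
    matching << item if yield(item)
  end
  # Array#to_h turns [[key, value], ...] pairs back into a Hash.
  items.is_a?(Hash) ? matching.to_h : matching
end

odds = filter_with_block(1..7) { |i| i.odd? }
p odds   # [1, 3, 5, 7]

even_keys = filter_with_block({1 => :a, 2 => :b, 3 => :c, 4 => :d}) { |(key, _value)| key.even? }
p even_keys.class   # Hash
p even_keys.keys    # [2, 4]
```

Checking is_a?(Hash) is just the simplest thing that works for this example; other collection types would still come back as plain Arrays.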
-Proctor