Category Archives: Ruby

Ruby Tuesday – String#bytesize

Today’s Ruby Tuesday takes a look at String#bytesize.

String#bytesize returns how many bytes the string uses to represent itself. For standard ASCII characters, the result of bytesize is the same as the length.

"Hello, World!".bytesize
# => 13
"Hello, World!".length
# => 13

But if we go to Google Translate, have it translate “Hello, World” into Japanese, and take that result as the string to call length and bytesize on, we see they can be drastically different.

"こんにちは世界".bytesize
# => 21
"こんにちは世界".length
# => 7

And if you take a string that is just an emoji, say the floppy disk (💾), and call bytesize on that string, you get a value of 4 returned, where the length of the string is only 1.

Why is this important?

The big reason is that bytesize shows how much space a string actually takes up in your storage mechanism of choice (e.g. memory, a database, or flat files). If you only check the length of a string against the maximum size of a field, the character count can be under the limit while the number of bytes in the string is still too big to store.
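As a minimal sketch of that kind of check, assuming a field limited to a fixed number of bytes and UTF-8 encoded strings (the helper name fits_in_bytes? is made up for this example):

```ruby
# Hypothetical helper: true when the string fits within max_bytes of storage.
def fits_in_bytes?(string, max_bytes)
  string.bytesize <= max_bytes
end

name = "こんにちは世界"    # 7 characters, 21 bytes in UTF-8

name.length <= 10          # => true, a character-count check passes
fits_in_bytes?(name, 10)   # => false, a 10 byte field would overflow
fits_in_bytes?(name, 21)   # => true
```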

–Proctor

Ruby Tuesday – String#each_char

Today’s Ruby Tuesday covers String#each_char.

String#each_char takes a block, and passes each character in the string to the block.

"hello".each_char{|char| print char.upcase}; puts
# HELLO
# => nil

Note that String#each_char does not modify the original string, and that the result of any work done in the block must be captured somehow if you want to keep it.

"hello".each_char{|char| char.upcase}
# => "hello"
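One way to capture the work done in the block, sketched here with an accumulator array (the variable name upcased is just for illustration):

```ruby
upcased = []
# The block's return value is discarded, so push each result somewhere we can keep it.
"hello".each_char { |char| upcased << char.upcase }

upcased       # => ["H", "E", "L", "L", "O"]
upcased.join  # => "HELLO"
```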

If the string is empty, the block is never invoked, as there are no characters in the string to call the block with.

"".each_char{|char| puts "xxx"}; puts "done"
# done
# => nil

If no block is given, String#each_char returns an enumerator, which opens up all of the other methods that Enumerable provides.

"hello".each_char.map{|char| char.upcase}.join
# => "HELLO"
"hello".each_char.select{|char| ["a", "e", "i", "o", "u"].include?(char)}.join
# => "eo"
"hello".each_char.reject{|char| ["a", "e", "i", "o", "u"].include?(char)}.join
# => "hll"

But then again, String provides a method chars which returns all of the characters in the string as an array.

"hello".chars
# => ["h", "e", "l", "l", "o"]
"".chars
# => []

Which means we also get the full Enumerable on that as well.

"hello".chars.select{|char| ["a", "e", "i", "o", "u"].include?(char)}.join
# => "eo"
"hello".chars.reject{|char| ["a", "e", "i", "o", "u"].include?(char)}.join
# => "hll"

Ruby Tuesday – String#casecmp

Today’s Ruby Tuesday covers String#casecmp.

String#casecmp is a somewhat oddly named method: what it does is a case-insensitive string comparison.

The return value of String#casecmp is either -1, 0, or 1, depending on whether the string casecmp is called on is less than, equal to, or greater than the string passed in as an argument, ignoring case.

"foobar".casecmp("FOOBAR")
# => 0
"abcdeft".casecmp("ABCDEFG")
# => 1
"abcdefg".casecmp("ABCDEFG")
# => 0
"A".casecmp("b")
# => -1
"a".casecmp("B")
# => -1
"z".casecmp("A")
# => 1
"Z".casecmp("a")
# => 1
"z" <=> "A"
# => 1
"A" <=> "Z"
# => -1
"a" <=> "Z"
# => 1
"Z" <=> "a"
# => -1

This can be handy if you would otherwise match two strings by calling downcase or upcase on both of them, and it is clearer about what you are trying to accomplish with the comparison.

You can also take advantage of String#casecmp if you ever need to sort items by their name regardless of case.
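A small sketch of that comparison, wrapped in a helper whose name (same_ignoring_case?) is made up for this example:

```ruby
# Hypothetical helper: wraps the casecmp idiom so the intent reads clearly.
def same_ignoring_case?(left, right)
  left.casecmp(right).zero?
end

same_ignoring_case?("foobar", "FOOBAR")  # => true
same_ignoring_case?("foobar", "FOOBAZ")  # => false

# The downcase version gives the same answer, but builds two temporary strings.
"foobar".downcase == "FOOBAR".downcase   # => true
```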

["foo", "a", "Z", "Foo", "buzz", "FOO"].sort do |a, b|
  result = a.casecmp(b)
  if (result == 0)
    result = a <=> b
  end
  result
end
# => ["a", "buzz", "FOO", "Foo", "foo", "Z"]

–Proctor

Ruby Tuesday – String#rjust

Today’s Ruby Tuesday is on String#rjust.

String#rjust takes an integer, N, and returns a string of length N with the string that rjust was invoked on right aligned.

"Name".rjust 20
# => "                Name"
"Email".rjust 20
# => "               Email"
"Password".rjust 20
# => "            Password"

If the integer value passed to rjust is less than or equal to the length of the string to right justify, the return value is a new string with the same content as the original. We can see that it is a new string by comparing the object_id of the string rjust is called on with the object_id of the resulting string.

"foobar".rjust(4)
# => "foobar"
f = "foobar"
# => "foobar"
f.object_id
# => 70206438160440
f.rjust(2).object_id
# => 70206429587280
f.rjust(2).object_id
# => 70206438091960

String#rjust can also take a non-empty string as its second argument, and uses that string as the characters to pad the result with.

"Password".rjust(20, '_')
# => "____________Password"
"Password".rjust(20, '_-_')
# => "_-__-__-__-_Password"
"Password".rjust(20, ("a".."z").to_a.join)
# => "abcdefghijklPassword"
"Password".rjust(20, "")
# ArgumentError: zero width padding
# from (pry):16:in `rjust'
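A common use of the padding argument is zero-padding numbers to a fixed width. A quick sketch, with made-up invoice numbers:

```ruby
invoice_numbers = [7, 42, 1234]

# Convert each number to a string, then pad on the left with "0" up to 6 characters.
padded = invoice_numbers.map { |number| number.to_s.rjust(6, "0") }
# => ["000007", "000042", "001234"]
```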

–Proctor

Ruby Tuesday – String#intern

Today’s Ruby Tuesday covers String#intern.

If you are not familiar with the concept of interning, it is when the programming language or VM uses the same reference instead of multiple copies of equal objects. This is possible when the objects are immutable: if two objects are the same and can never change, it is safe to represent them as the exact same thing.

In Ruby, String objects are mutable, so they must be compared by using an equality operator, and they get different object ids, since they can be changed at any time.

Enter String#intern. String#intern returns an immutable object that represents the contents of the string, and that reference can be reused wherever the interned string is needed.

And the result of interning a string in Ruby? A symbol.

'foo'.intern
# => :foo
'bar'.intern
# => :bar

Besides letting you create symbols from a dynamic set of shared data, such as CSV or database table headers, String#intern gives you as the developer a nice way to create some complex symbols without needing to worry about the correct way of quoting them.

'Bar'.intern
# => :Bar
'string with spaces'.intern
# => :"string with spaces"
'comic cursing: @*%$*^!!!'.intern
# => :"comic cursing: @*%$*^!!!"
'"'.intern
# => :"\""
"'".intern
# => :"'"
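As a rough sketch of the header use case, assuming the headers and row below came from a CSV file (the data is invented for illustration):

```ruby
headers = ["Name", "Email Address", "sign-up date"]
row     = ["Ada", "ada@example.com", "2015-06-02"]

# Intern each header so it can be used as a symbol key.
keys = headers.map(&:intern)
# => [:Name, :"Email Address", :"sign-up date"]

record = keys.zip(row).to_h
record[:"Email Address"]
# => "ada@example.com"
```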

As we can see below, every time we reference the string "foo", it creates a new object for that string, as seen by the different object_ids for x, y, and z. But once we intern them, even though they were different objects, they share the same interned object.

x = "foo"
# => "foo"
y = "foo"
# => "foo"
z = "foo"
# => "foo"
x.object_id
# => 70297488464940
y.object_id
# => 70297488429980
z.object_id
# => 70297488410060
x.intern.object_id
# => 592808
y.intern.object_id
# => 592808
z.intern.object_id
# => 592808
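The same point can be checked without reading object ids by using equal?, which tests object identity. A small sketch (dup is used here only to guarantee two distinct string objects):

```ruby
a = "foo".dup
b = "foo".dup

a == b                     # => true, same contents
a.equal?(b)                # => false, two separate String objects
a.intern.equal?(b.intern)  # => true, one shared Symbol
```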

String#intern is also aliased as String#to_sym.

'foo'.to_sym
# => :foo
'bar'.to_sym
# => :bar

I wanted to highlight this as String#intern instead of String#to_sym, as having an understanding about interning is something that is useful beyond just Ruby String objects, and is applicable across multiple programming languages.

–Proctor

Ruby Tuesday – Random::new_seed

Today’s Ruby Tuesday is on Random::new_seed.

Random::new_seed returns a new arbitrary seed value. By generating and capturing the seed value, we can have multiple instances of Random generate the same sequence of random numbers if they were constructed with the same seed.

seed = Random.new_seed
# => 90121465857858294451245401342699150799
Random.new(seed).rand(1_000_000_000)
# => 966720783
Random.new(seed).rand(1_000_000_000)
# => 966720783
Random.new(seed).rand(1_000_000_000)
# => 966720783
Random.new(seed).rand(1_000_000_000)
# => 966720783
Random.new(seed).rand(1_000_000_000)
# => 966720783

In what real world case would we care about capturing a seed?

One example where this becomes useful is creating random sets of test data, especially when one is trying to do a very, very, basic version of generative testing.

def random_list(size, seed=Random.new_seed)
  puts "seed used to generate list was: #{seed}"
  prng = Random.new(seed)
  (1..size).map{|_| prng.rand(1_000_000)}
end


random_list(10)
# seed used to generate list was: 186039884741241642189311371060927079314
# => [333029, 833700, 863953, 325452, 761340, 165891, 818711, 35680, 970562, 926764]
random_list(10)
# seed used to generate list was: 195630211850073328706621905093237636602
# => [414039, 78807, 761787, 93581, 912224, 334025, 139492, 597469, 191557, 637405]
random_list(10)
# seed used to generate list was: 305942993230783695517144566027975028636
# => [459072, 417794, 851547, 51516, 299288, 859682, 514847, 356177, 436546, 63844]

By defining a helper method like random_list above, and having it print out the seed it was using, if we use this list in a test, and that test case fails, we can reproduce the test case by getting the list generated by using the appropriate seed.

random_list(10, 195630211850073328706621905093237636602)
# seed used to generate list was: 195630211850073328706621905093237636602
# => [414039, 78807, 761787, 93581, 912224, 334025, 139492, 597469, 191557, 637405]
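The same idea can be folded into a helper that reports the seed only when something goes wrong. This is a sketch rather than anything from a testing library, and with_reported_seed is a name made up for this example:

```ruby
# Hypothetical helper: yields a seeded generator and, if the block raises,
# re-raises with the seed in the message so the failing run can be replayed.
def with_reported_seed(seed = Random.new_seed)
  yield Random.new(seed)
rescue StandardError => error
  raise RuntimeError, "#{error.message} (replay with seed #{seed})"
end

# Passing run: the block's value is returned unchanged.
sorted = with_reported_seed(42) do |prng|
  Array.new(5) { prng.rand(100) }.sort
end

# Failing run: the error message now carries the seed.
begin
  with_reported_seed(42) { |_prng| raise "list was not sorted" }
rescue RuntimeError => error
  message = error.message
end
message
# => "list was not sorted (replay with seed 42)"
```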

–Proctor

Ruby Tuesday – Functional fizzbuzz

Today we are going to take a quick break from looking at some different library methods for today’s Ruby Tuesday post.

Last Friday I was going back and forth with a teammate over our chat tool about some different programming problems we could use to do some very basic screening of candidates, and I mentioned that if nothing else there is always fizzbuzz.

fizzbuzz, for anybody who at this point has yet to encounter it, even through blog posts such as this one, takes a number N and, for each number from 1 to N, prints out fizz if the number is a multiple of three, buzz if it is a multiple of five, fizzbuzz if it is a multiple of both three and five, or else just the number itself if none of the conditions hold true.

While fizzbuzz isn’t a great problem and, depending on the circles you are in, can be pretty worthless because most candidates have seen it already, it can still have its place. Sadly, there are developer candidates we encounter who have a hard time translating the problem above, which states the algorithm in English, directly into code. The reason I think fizzbuzz can be interesting, as I mentioned to my coworker, is that it gives candidates a chance to show off some more advanced topics, if they wish.

After mentioning that benefit to him on Friday, I came across the post “Bro, Do You Even FizzBuzz?!?”, which covers fizzbuzz in Clojure and shows off a solution that does not use the modulus operator to determine the result.

I decided I would attempt to translate the solution to Ruby, to prove out the point of our conversation.

def do_fizzbuzz(n)
  fizz = ['', '', 'fizz'].cycle.lazy
  buzz = ['', '', '', '', 'buzz'].cycle.lazy
  fizzbuzz = fizz.zip(buzz).map(&:join)
  puts (1..n).
         zip(fizzbuzz).
         map {|number, translation| if translation.empty? then number else translation end}.
         join("\n")
end

First, we declare the “fizzes” as a lazy repeating array of blank strings, where every third is the string ‘fizz’, by using cycle and lazy. We do the same for the occurrences of buzz, but have it be every fifth item in an array.

We then zip the fizz and buzz lazy enumerations together, and then map over that result joining the two strings together, e.g. ["", ""].join, ["fizz", ""].join, ["", "buzz"].join, or ["fizz", "buzz"].join.

After we have our lazy enumeration of fizzbuzz translations, we take the range of numbers from 1 to N, zip that together with the translations, pick the translation if it is not an empty string, otherwise pick the number, and then join all of the results together with a newline separator, so each entry will print on its own line.

There you have it, a functional programming style fizzbuzz solution that does not use the modulus operator to arrive at a solution.
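For comparison, here is a sketch of a variant that returns the lines instead of printing them, which makes it easier to check. It swaps the lazy zip for Enumerator#next on the two cycles, so it is not the same technique as above, just the same no-modulus idea; fizzbuzz_lines is a made-up name:

```ruby
# Hypothetical variant: builds the fizzbuzz lines for 1..n without the modulus operator.
def fizzbuzz_lines(n)
  fizz = ['', '', 'fizz'].cycle            # 'fizz' on every third call to next
  buzz = ['', '', '', '', 'buzz'].cycle    # 'buzz' on every fifth call to next
  (1..n).map do |number|
    translation = fizz.next + buzz.next
    translation.empty? ? number.to_s : translation
  end
end

fizzbuzz_lines(5)
# => ["1", "2", "fizz", "4", "buzz"]
fizzbuzz_lines(15).last
# => "fizzbuzz"
```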

–Proctor

Ruby Tuesday – String#squeeze

Today’s Ruby Tuesday is on String#squeeze.

String#squeeze replaces runs of repeated characters in a source string down to a single character in the returned string.

"foo".squeeze
# => "fo"
"bar".squeeze
# => "bar"
"Mississippi".squeeze
# => "Misisipi"

String#squeeze can also take a string as an argument. If a string is passed in, that string is seen as a set of characters to use for squeezing against the source string.

"Mississippi".squeeze("")
# => "Mississippi"
"Mississippi".squeeze("s")
# => "Misisippi"
"Mississippi".squeeze("abcprt")
# => "Mississipi"
"Mississippi".squeeze("sp")
# => "Misisipi"

If the string contains a - between two characters, it is seen as a range of characters to include in the set of characters to match for squeezing against the source string. If the - is the first or last character, it is seen as a literal - character by itself.

"Mississippi".squeeze("a-r")
# => "Mississipi"
"Mississippi".squeeze("q-z")
# => "Misisippi"
"Mississippi".squeeze("pr-t")
# => "Misisipi"
"^^--~".squeeze('-~^')
# => "^-~"

If the string starts with a ^, the set is negated: every character except the ones given in the string is squeezed. If the ^ is not the first character, it is just seen as a normal ^ character.

"Mississippi".squeeze("^a-r")
# => "Misisippi"
"^^--~".squeeze('-^')
# => "^-~"

Backslashes can also be used to escape characters from any special meaning. When using the backslash character in a double quoted string, you have to escape that backslash with another backslash for it to reach String#squeeze as an escape character.

"^^--~".squeeze('\^')
# => "^--~"
"^^--~".squeeze("\\^")
# => "^--~"
"^^--~~".squeeze('~\-^')
# => "^-~"

String#squeeze doesn’t modify the source string given, but returns a new instance of a string for the result. If you do need to modify the string in place for some reason, there exists a String#squeeze! as well, but like many other Ruby methods that end in a !, if the original item is not modified a nil is returned.

str = "Mississippi"
# => "Mississippi"
str.squeeze
# => "Misisipi"
str
# => "Mississippi"
str.squeeze!
# => "Misisipi"
str
# => "Misisipi"
"bar".squeeze!
# => nil
"foo".squeeze!
# => "fo"
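A typical place this shows up is collapsing runs of spaces. A short sketch with a made-up input string:

```ruby
messy = "too    many   spaces   here"

# Only runs of the listed character (a space) are squeezed.
messy.squeeze(" ")
# => "too many spaces here"

# With no argument every run is squeezed, including the "oo" in "too".
messy.squeeze
# => "to many spaces here"
```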

–Proctor

Ruby Tuesday – Array#fetch

Today’s Ruby Tuesday covers a variation of one of my personal favorite set of Ruby methods, Array#fetch.

Array#fetch takes an integer as an argument and returns the element at that index, using a 0-based indexing scheme.

icao = [:alpha, :bravo, :charlie, :delta, :foxtrot]
# => [:alpha, :bravo, :charlie, :delta, :foxtrot]
icao.fetch 1
# => :bravo
icao.fetch 0
# => :alpha

It can also get you elements counting backward from the end of the array.

icao.fetch -1
# => :foxtrot
icao.fetch -4
# => :bravo

If an index is out of bounds of the array, either positive or negative, an IndexError is raised.

icao.fetch 5
# IndexError: index 5 outside of array bounds: -5...5
# from (pry):5:in `fetch'
icao.fetch -7
# IndexError: index -7 outside of array bounds: -5...5
# from (pry):11:in `fetch'

At this point, I can hear you saying:

So what?! I can do this using a subscript to an array and not get exceptions but a nil when something is not found.
–You

icao[4]
# => :foxtrot
icao[0]
# => :alpha
icao[-1]
# => :foxtrot
icao[-7]
# => nil
icao[5]
# => nil

And you would be right.

But where you might run into trouble with just using a subscript index, is if you have nils in your array.

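some_nils = [:alpha, nil, nil, :delta]  # assumed example array; the original definition is not shown
# => [:alpha, nil, nil, :delta]
some_nils[2]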
# => nil
some_nils[100]
# => nil

This is where Array#fetch makes its way to one of my favorite Ruby methods.

When you decide to use Array#fetch over just subscripts, you can specify a default value that you get back if the index is out of bounds.

icao.fetch 2, :fubar
# => :charlie
icao.fetch 5, :snafu
# => :snafu
icao.fetch -7, :fubar
# => :fubar

And to top it off, you can even pass a whole block which is called with the index passed to fetch, so you can do more detailed logic if needed.

icao.fetch(-7) { |index| "SNAFU occurred trying to fetch with index #{index}" }
# => "SNAFU occurred trying to fetch with index -7"
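To tie this back to the nil problem, here is a small sketch with a made-up array where nil is a legitimate stored value. With a subscript the two cases look identical, while fetch with a default tells them apart:

```ruby
readings = [72, nil, 68]   # hypothetical data; nil means "no reading was taken"

readings[1]                         # => nil
readings[5]                         # => nil, indistinguishable from the line above

readings.fetch(1, :out_of_bounds)   # => nil, the slot exists and holds nil
readings.fetch(5, :out_of_bounds)   # => :out_of_bounds
```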

–Proctor

Ruby Tuesday – Array#sample

Today’s Ruby Tuesday is on Array#sample.

Array#sample has a couple of different forms it can take.

The first form it can take is a version where no arguments are passed; in this case it returns a random element from the array it was called on, or nil if the array is empty.

[1, 2, 3, 4, 5].sample
# => 4
[1, 2, 3, 4, 5].sample
# => 1
[1, 2, 3, 4, 5].sample
# => 2
[].sample
# => nil
[:heart, :diamond, :club, :spade].sample
# => :club

Array#sample can also take as its argument a count of the number of random items to be returned.

[1, 2, 3, 4, 5].sample 2
# => [1, 2]
[1, 2, 3, 4, 5].sample 2
# => [2, 1]
[1, 2, 3, 4, 5].sample 2
# => [1, 3]
[1, 2, 3, 4, 5].sample 2
# => [3, 5]
[1, 2, 3, 4, 5].sample 2
# => [5, 2]

If the count is greater than, or equal to, the number of items in the array, then all of the items in the array are returned, in a random order.

[:heart, :diamond, :club, :spade].sample 6
# => [:heart, :diamond, :spade, :club]

This has two subtle but important consequences. First if we call sample with a count on an empty array, an empty array is returned, instead of a nil like we would get back if we just call Array#sample with no count argument. This includes calling Array#sample with a count of zero on an empty array.

[].sample
# => nil
[].sample 0
# => []
[].sample 1
# => []
[].sample 3
# => []

Second, it shows that as we sample the array, we are not going to get duplicate elements, unless they were in the array to begin with.

On both of these forms, a random number generator can also be passed. This can give us a repeatable generation sequence as we can prime the generator with a given seed.

generatorA = Random.new 1
# => #<Random:0x007f86c3925878>
(1..100).to_a.sample(random: generatorA)
# => 38
(1..100).to_a.sample(random: generatorA)
# => 13
(1..100).to_a.sample(random: generatorA)
# => 73
generatorB = Random.new 1
# => #<Random:0x007f86c497dd10>
(1..100).to_a.sample(random: generatorB)
# => 38
(1..100).to_a.sample(random: generatorB)
# => 13
(1..100).to_a.sample(random: generatorB)
# => 73
(1..100).to_a.sample(random: generatorB)
# => 10
(1..100).to_a.sample(random: generatorA)
# => 10

(1..100).to_a.sample(4, random: generatorA)
# => [76, 6, 82, 66]
(1..100).to_a.sample(4, random: generatorA)
# => [17, 2, 79, 74]
(1..100).to_a.sample(4, random: generatorB)
# => [76, 6, 82, 66]
(1..100).to_a.sample(4, random: generatorB)
# => [17, 2, 79, 74]

And if we want to pick a random card from a deck of cards, or even draw a 5 card hand we can use Array#sample on an array of “playing cards”.

suit = [:heart, :diamond, :club, :spade]
# => [:heart, :diamond, :club, :spade]
rank = [:ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, :jack, :queen, :king]
# => [:ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, :jack, :queen, :king]
suit.product(rank).sample 1
# => [[:heart, :jack]]
suit.product(rank).sample 5
# => [[:club, 2], [:diamond, 2], [:spade, 7], [:club, 8], [:spade, :king]]
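Combining the two ideas, a seeded generator makes a dealt hand repeatable. A sketch assuming a standard 52 card deck (the seed 2015 is arbitrary):

```ruby
suit = [:heart, :diamond, :club, :spade]
rank = [:ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, :jack, :queen, :king]
deck = suit.product(rank)   # 52 [suit, rank] pairs

# Two generators primed with the same seed deal the same hand.
hand_a = deck.sample(5, random: Random.new(2015))
hand_b = deck.sample(5, random: Random.new(2015))

hand_a == hand_b    # => true
hand_a.uniq.size    # => 5, sample never picks the same card twice
```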

–Proctor