Ruby Tuesday – String#bytesize

Today’s Ruby Tuesday takes a look at String#bytesize.

String#bytesize returns how many bytes the string uses to represent itself in its current encoding (UTF-8 by default in modern Ruby). For a string made up only of ASCII characters, the result of bytesize is the same as the length.

"Hello, World!".bytesize
# => 13
"Hello, World!".length
# => 13

But if we have Google Translate render “Hello, World” in Japanese and call length and bytesize on the result, the two values differ drastically, because each of these characters takes three bytes in UTF-8.

"こんにちは世界".bytesize
# => 21
"こんにちは世界".length
# => 7

And if you take a string that is just an emoji, say the floppy disk (💾), and call bytesize on that string, you get 4, while the length of the string is only 1.
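
The emoji case can be checked the same way as the earlier examples. This is a small sketch that assumes the string is UTF-8 encoded; the variable name floppy is only for illustration, and the emoji is written as a \u escape so the example does not depend on how the file itself is saved.

```ruby
# "\u{1F4BE}" is the floppy disk emoji; a \u escape always yields a UTF-8 string.
floppy = "\u{1F4BE}"

floppy.bytesize
# => 4
floppy.length
# => 1

# The four bytes UTF-8 uses for this single character:
floppy.bytes
# => [240, 159, 146, 190]
```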

Why is this important?

The big reason is that bytesize shows roughly how much space a string takes up in your storage mechanism of choice (e.g. memory, database, or flat files). If you only check the length of a string against the max size of a field, the string can pass that check while the number of bytes it needs is more than the field has available.
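
As a rough sketch of that failure mode, suppose a field can hold at most 20 bytes. The limit and the helper names fits_in_field? and truncate_to_bytes are made up for this example and are not part of Ruby; String#byteslice and String#scrub are the real methods doing the work, and the strings are assumed to be UTF-8.

```ruby
# Hypothetical limit: a storage field that holds at most 20 bytes.
MAX_BYTES = 20

# Hypothetical helper: compare the byte count, not the character count.
def fits_in_field?(string, max_bytes = MAX_BYTES)
  string.bytesize <= max_bytes
end

# Hypothetical helper: cut the string down to the byte limit.
# byteslice can cut through the middle of a multibyte character,
# so scrub("") drops any incomplete bytes left at the end.
# (It also drops any invalid bytes that were already in the string.)
def truncate_to_bytes(string, max_bytes = MAX_BYTES)
  string.byteslice(0, max_bytes).scrub("")
end

greeting = "こんにちは世界"

greeting.length <= MAX_BYTES
# => true
fits_in_field?(greeting)
# => false
truncate_to_bytes(greeting)
# => "こんにちは世"
truncate_to_bytes(greeting).bytesize
# => 18
```

The length check passes at 7 characters, but the string needs 21 bytes, one more than the field allows. Truncating by bytes keeps the six characters that fit whole (18 bytes) rather than leaving two stray bytes of the seventh.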

–Proctor