Today’s Ruby Tuesday takes a look at String#bytesize.
String#bytesize returns how many bytes the string uses to represent itself. For standard ASCII characters, the result of bytesize is the same as the length.
"Hello, World!".bytesize # => 13
"Hello, World!".length # => 13
But if we go to Google Translate, have it translate “Hello, World” into Japanese, and take that result as the string to call length and bytesize on, we see they can be drastically different.
"こんにちは世界".bytesize # => 21
"こんにちは世界".length # => 7
And if you take a string that is just an emoji, say the floppy disk (💾), and call bytesize on that string, you get a value of 4 returned, whereas the length of the string is only 1.
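The emoji case can be checked the same way as the earlier examples. This is a small sketch that assumes the strings are UTF-8 encoded (Ruby's default source encoding); the variable names are only for illustration.

```ruby
# Assumes UTF-8 strings (Ruby's default source encoding).
floppy = "💾"

floppy.length   # => 1
floppy.bytesize # => 4
floppy.bytes    # => [240, 159, 146, 190]

# Byte count of each character in a mixed ASCII/emoji string.
mixed = "Hi 💾"
mixed.each_char.map(&:bytesize) # => [1, 1, 1, 4]
```

The ASCII characters each take one byte, while the emoji takes four, which is why length and bytesize drift apart as soon as non-ASCII characters show up.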
Why is this important?
The big reason is that bytesize shows how much space a string actually takes up in your storage mechanism of choice (e.g. memory, a database, or flat files). If you only check the length of a string against the max size of a field, the string can pass that check and still contain more bytes than the field can store.
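As a rough sketch of that pitfall, here is a check that compares bytes rather than characters. The 20-byte limit, the MAX_FIELD_BYTES constant, and the fits_in_field? helper are all made up for this example; they are not part of Ruby or of any particular database library.

```ruby
# Hypothetical limit for illustration: a field that stores at most 20 bytes.
MAX_FIELD_BYTES = 20

# Hypothetical helper: true when the string's byte count fits the limit.
def fits_in_field?(value, max_bytes = MAX_FIELD_BYTES)
  value.bytesize <= max_bytes
end

greeting = "こんにちは世界"

greeting.length <= MAX_FIELD_BYTES # => true  (7 characters)
fits_in_field?(greeting)           # => false (21 bytes)
fits_in_field?("Hello, World!")    # => true  (13 bytes)
```

The length check passes for the Japanese greeting, but the byte check catches that it would not fit in a 20-byte field.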
–Proctor