程式扎記: [ FAQ ] How to convert string to bytes in Ruby?

Saturday, November 15, 2014

Source From Here
Question
How do I extend the String class, and attach a method named to_bytes?

How-To
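Ruby already has a String#each_byte method for iterating over a string's bytes, plus the closely related String#bytes, which returns the bytes as an array of integers (Ruby 2.0+).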
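Since the question asks specifically for a to_bytes method, here is a minimal sketch that reopens String and delegates to the built-in String#bytes. The name to_bytes comes from the question and is not part of Ruby's core API, and the sketch assumes Ruby 2.0 or later, where String#bytes returns an Array.

```ruby
# Reopen String to add the to_bytes method the question asks for.
# to_bytes is a hypothetical name, not a core Ruby method;
# it simply delegates to the built-in String#bytes.
class String
  def to_bytes
    bytes # Array of Integers, one per byte (Ruby 2.0+)
  end
end

ascii_bytes = "AaBb123".to_bytes
p ascii_bytes
```

Reopening a core class affects every string in the program, so in larger code bases a refinement or a plain helper method may be a safer choice.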

Prior to Ruby 1.9, strings were effectively sequences of bytes: a character was assumed to be a single byte. That's fine for ASCII text and for single-byte encodings such as Windows-1252 and ISO-8859-1, but it fails badly with Unicode, which is increasingly common on the web. Ruby 1.9+ is encoding-aware: a string is treated as a sequence of characters, each of which may be several bytes long.

So, if you want to manipulate text one byte at a time, you'll need to ensure your input is ASCII, or at least uses a single-byte character set. If the input may contain multi-byte characters, use String#each_char, String#split(//), or String#unpack with the U directive (for example 'U*') instead.
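To make the byte-versus-character distinction concrete, here is a small sketch using a UTF-8 string with one two-byte character. The sample string and the variable names are illustrative assumptions, not part of the original answer.

```ruby
# "h\u00E9llo" is "héllo"; é (U+00E9) takes two bytes in UTF-8.
word = "h\u00E9llo"

char_count = word.length          # number of characters
byte_count = word.bytesize        # number of bytes
chars      = word.each_char.to_a  # one element per character
codepoints = word.unpack('U*')    # Unicode code points as Integers
raw_bytes  = word.bytes           # individual bytes as Integers

puts "#{char_count} characters, #{byte_count} bytes"
p chars
p codepoints
p raw_bytes
```

The character count and the byte count differ by one here, which is exactly the case where byte-wise processing would split the é in half.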

Example of each_byte and unpack()
>> str = "AaBb123"
>> str.each_byte { |b| printf("%c->%d\n", b, b) } # Visit each byte of the string
A->65
a->97
B->66
b->98
1->49
2->50
3->51
=> "AaBb123"

>> str.unpack('CCCC') # Take the first 4 bytes; C = 8-bit unsigned integer (unsigned char)
=> [65, 97, 66, 98]
>> vax_bo = str.unpack('v')[0] # v = 16-bit unsigned integer, VAX (little-endian) byte order
=> 24897 # 'v' reads the first two bytes, "A" (0x41) and "a" (0x61), as one little-endian 16-bit value: 0x6141 = 24897
>> printf("%016b\n", vax_bo) # Print vax_bo as a 16-digit binary number
0110000101000001
=> nil

>> bytes = str.unpack('CC') # Take the first two bytes
=> [65, 97]
# Little-endian order stores the least significant byte at the lowest address,
# so bytes[0] is the low byte and bytes[1] is the high byte.
>> printf("%08b%08b\n", bytes[1], bytes[0]) # Print the high byte first, then the low byte
0110000101000001 # Same bit pattern as vax_bo above
=> nil
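As a cross-check on the byte-order discussion above, the following sketch compares the little-endian 'v' directive with its big-endian counterpart 'n', rebuilds the little-endian value by hand, and packs it back into a string. The variable names are illustrative assumptions.

```ruby
str = "AaBb123"

little = str.unpack('v')[0]  # first two bytes read as little-endian: 0x6141
big    = str.unpack('n')[0]  # first two bytes read as big-endian:    0x4161

# Rebuild the little-endian value manually: the first byte is the low byte,
# the second byte is shifted up to become the high byte.
low, high = str.unpack('CC')
manual = low | (high << 8)

# Array#pack is the inverse of String#unpack for the same directive.
packed = [little].pack('v')

printf("little=%d big=%d manual=%d packed=%s\n", little, big, manual, packed)
```

Seeing the same two bytes produce two different integers makes it clear that the directive, not the data, decides the byte order.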

