Unpack signed little-endian in Ruby

Developer https://www.devze.com 2023-02-15 09:03 Source: Internet

So I'm working on some MongoDB protocol stuff. All integers are signed little-endian. Using Ruby's standard Array#pack method, I can convert from an integer to the binary string I want just fine:

positive_one = Array(1).pack('V')   #=> "\x01\x00\x00\x00"
negative_one = Array(-1).pack('V')  #=> "\xFF\xFF\xFF\xFF"

However, going the other way, the String#unpack method has the 'V' format documented as specifically returning unsigned integers:

positive_one.unpack('V').first #=> 1
negative_one.unpack('V').first #=> 4294967295

There's no formatter for signed little-endian byte order. I'm sure I could play games with bit-shifting, or write my own byte-mangling method that doesn't use array packing, but I'm wondering if anyone else has run into this and found a simple solution. Thanks very much.


After unpacking with "V", you can apply the following conversion:

class Integer
  def to_signed_32bit
    if self & 0x8000_0000 == 0x8000_0000
      self - 0x1_0000_0000  
    else
      self
    end
  end
end

You'll need to change the magic constants 0x1_0000_0000 (which is 2**32) and 0x8000_0000 (2**31) if you're dealing with other sizes of integers.
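As a rough sketch of how this plugs in after unpack('V'), and of what "changing the constants" looks like for other widths: the helper name to_signed and its bits parameter are made up for illustration and are not part of the answer above.

```ruby
# Hypothetical generalization of the conversion above: reinterpret an
# unsigned integer of the given bit width as a two's complement value.
def to_signed(value, bits = 32)
  sign_bit = 1 << (bits - 1)   # 0x8000_0000 when bits == 32
  modulus  = 1 << bits         # 0x1_0000_0000 when bits == 32
  (value & sign_bit).zero? ? value : value - modulus
end

negative_one = [-1].pack('V')                    # four 0xFF bytes
to_signed(negative_one.unpack('V').first)        # -1
to_signed([1].pack('V').unpack('V').first)       # 1
to_signed([-2].pack('v').unpack('v').first, 16)  # -2 ('v' is 16-bit little-endian)
```

Because 'V' and 'v' always read little-endian, this does not depend on the byte order of the machine running it.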


Edit: I originally misunderstood which direction you were converting (see the comment). After thinking about it some more, I believe the solution is still the same. Here is the updated method. It does exactly the same thing, but the comments should explain the result:

def convertLEToNative( num )
    # Convert a given 4 byte integer from little-endian to the running
    # machine's native endianness.  The pack('V') operation takes the
    # given number and converts it to little-endian (which means that
    # if the machine is little-endian, no conversion occurs).  On a
    # big-endian machine, the pack('V') will swap the bytes because
    # that's what it has to do to convert from big to little endian.
    # Since the number is already little-endian, the swap has the
    # opposite effect (converting from little-endian to big-endian),
    # which is what we want. In both cases, the unpack('l') just
    # produces a signed integer from those bytes, in the machine's
    # native endianness.
    # Note: unpack returns a one-element array; call .first on the
    # result to get the integer itself.
    Array(num).pack('V').unpack('l')
end

Probably not the cleanest, but this will convert the byte array.

def convertLEBytesToNative( bytes )
    if [1].pack('V').unpack('l').first == 1
        # machine is already little-endian
        bytes.unpack('l')
    else
        # machine is big-endian
        convertLEToNative( Array(bytes.unpack('l')))
    end
end
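Here is a compact sketch of the same idea that returns a plain integer rather than a one-element array. The names little_endian_host? and le_bytes_to_int32 are invented for this illustration; the technique (read with native 'l', then swap via pack('V') only on a big-endian host) is the one described above.

```ruby
# Hypothetical helper: true when the native 32-bit layout matches 'V'
# (little-endian), detected by comparing two packings of the same value.
def little_endian_host?
  [1].pack('l') == [1].pack('V')
end

# Hypothetical helper: decode 4 little-endian bytes as a signed integer.
def le_bytes_to_int32(bytes)
  value = bytes.unpack('l').first       # signed, native byte order
  return value if little_endian_host?   # native order is already little-endian
  [value].pack('V').unpack('l').first   # big-endian host: swap the bytes
end

le_bytes_to_int32("\xFF\xFF\xFF\xFF")   # -1
le_bytes_to_int32("\x01\x00\x00\x00")   # 1
```

The check runs on every call here for simplicity; the result could just as well be computed once and stored.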


This question has a method for converting signed to unsigned that might be helpful. It also has a pointer to the bindata gem which looks like it will do what you want.

BinData::Int16le.read("\000\f") # 3072

[edited to remove the not-quite-right s unpack directive]


For the sake of posterity, here's the method I eventually came up with before spotting Paul Rubel's link to the "classical method". It's kludgy and based on string manipulation, so I'll probably scrap it, but it does work, so someone might find it interesting for some other reason someday:

# Returns a signed integer from the given little-endian binary string.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  bits = str.reverse.unpack('B*').first   # Get the 0s and 1s
  if bits[0] == '0'   # We're a positive number; life is easy
    bits.to_i(2)
  else                # Get the twos complement
    comp, flip = "", false
    bits.reverse.each_char do |bit|
      comp << (flip ? bit.tr('10','01') : bit)
      flip = true if !flip && bit == '1'
    end
    ("-" + comp.reverse).to_i(2)
  end
end

UPDATE: Here's the simpler refactoring, using a generalized arbitrary-length form of Ken Bloom's answer:

# Returns a signed integer from the given little-endian binary string.
# The string may be any length that is a multiple of 4 bytes.
# @param [String] str
# @return [Integer]
def self.bson_to_int(str)
  arr, bits, num = str.unpack('V*'), 0, 0
  arr.each do |int|
    num += int << bits
    bits += 32
  end
  num >= 2**(bits-1) ? num - 2**bits : num  # Convert from unsigned to signed
end
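A quick sanity check of the refactored version, offered as a sketch. It assumes Ruby 1.9.3 or later, where pack and unpack accept a `<` suffix on signed directives, so 'l<' and 'q<' read signed 32-bit and 64-bit little-endian values directly. The BsonInt module is a hypothetical wrapper, there only so that the method has somewhere to live.

```ruby
# Hypothetical wrapper module; the method body is the one shown above.
module BsonInt
  def self.bson_to_int(str)
    arr, bits, num = str.unpack('V*'), 0, 0
    arr.each do |int|
      num += int << bits
      bits += 32
    end
    num >= 2**(bits-1) ? num - 2**bits : num  # Convert from unsigned to signed
  end
end

int32 = [-1].pack('V')    # 4 bytes
int64 = [-5].pack('q<')   # 8 bytes, little-endian (needs Ruby 1.9.3+)

# Both sides of each comparison should agree.
BsonInt.bson_to_int(int32) == int32.unpack('l<').first   # both give -1
BsonInt.bson_to_int(int64) == int64.unpack('q<').first   # both give -5
```

On a Ruby new enough to have them, the 'l<' and 'q<' directives may be all that is needed for fixed 32-bit and 64-bit fields; the hand-rolled method still covers longer inputs.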