开发者

Can I construct a BigInt with any byte array (Scala)?

开发者 https://www.devze.com 2022-12-09 23:33 出处:网络
I\'m trying to represent the result of an MD5 hash in the shortest possible string. It seems a waste to just turn it into a hex string and let G through Z go to waste.

I'm trying to represent the result of an MD5 hash in the shortest possible string. It seems a waste to just turn it into a hex string and let G through Z go to waste.

One idea I have had is getting the MD5 hash of my input as an array of bytes and constructing a BigInt with it. I can then call toString(36), and get the number as a base-36 in 开发者_如何学Goa string (-?[0-9a-z]*, the number can be positive or negative). It works for me.

Problem is, I'm not sure that a BigInt can be constructed with any array of bytes, and I can't prove it with testing (at least not in a timely way!). I assume so, because I understand that a BigInt can be of arbitrary size. I can't use this method until I know for sure that it will work for all possible outputs. So, can anyone tell me whether it will work for all inputs (or how to easily convert a byte array so it can be represented in base 36).

Clarification: I have the implementation, I'm asking about the behaviour over the whole domain (i.e. 00000000000000000000000000000000 to FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF)


Building on your feedback above, the following implementation will reliably encode/decode an arbitrary byte array:

package blevins.example

object BigIntEncoder {
  val radix = 36

  implicit def byteArrayToString(ba: Array[Byte]): String = {
    new java.math.BigInteger(addByte(ba)).toString(radix)
  }

  implicit def stringToByteArray(s: String): Array[Byte] = {
    stripByte(new java.math.BigInteger(s, radix).toByteArray)
  }

  def addByte(ba: Array[Byte]): Array[Byte] = {
    val h = new Array[Byte](1)
    h(0) = 0x01
    h ++ ba
  }

  def stripByte(ba: Array[Byte]): Array[Byte] = {
    ba.slice(1,ba.size)
  }

}

Note that we are adding an extra 0x01 byte to the head of the array to avoid any side effects from taking the two-complement of the byte array.

EDIT: The testing involved to prove this out is documented here: http://cleverlytitled.blogspot.com/2009/10/scalacheck.html


Wouldn't Base64 encoding be shorter than Base36? You can find plenty of implementations around.

But, to actually answer the question:

  // Make a big randomly-filled byte array
  val random = scala.util.Random
  val arraySize = 8543
  val bytes: Array[Byte] = new Array[Byte](arraySize) // make some big array
  random.nextBytes(bytes) // fill it randomly

  // Make a BigInt out of it and the corresponding base36 string representation
  val bi: BigInt = new BigInt(new java.math.BigInteger(bytes))
  val strRep: String = bi.toString(36)

  // Make a new BigInt out of the string rep.  Does it match?
  val bi2: BigInt = new BigInt(new java.math.BigInteger(strRep, 36))
  if (bi == bi2) {
      println("yippee!!")
  }

  // Make a new byte array out of the BigInt.  Does it match the original array?
  val bytes2: Array[Byte] = bi2.toByteArray
  if (bytes deepEquals bytes2) {
      println("yippee again!!")
  }
0

精彩评论

暂无评论...
验证码 换一张
取 消