I'd like to get the exact sequence of bits from a file into a string using Python 3. There are several questions on this topic which come close, but don't quite answer it. So far, I have this:
>>> data = open('file.bin', 'rb').read()
>>> data
'\xa1\xa7\xda4\x86G\xa0!e\xab7M\xce\xd4\xf9\x0e\x99\xce\xe94Y3\x1d\xb7\xa3d\xf9\x92\xd9\xa8\xca\x05\x0f$\xb3\xcd*\xbfT\xbb\x8d\x801\xfanX\x1e\xb4^\xa7l\xe3=\xaf\x89\x86\xaf\x0e8\xeeL\xcd|*5\xf16\xe4\xf6a\xf5\xc4\xf5\xb0\xfc;\xf3\xb5\xb3/\x9a5\xee+\xc5^\xf5\xfe\xaf]\xf7.X\x81\xf3\x14\xe9\x9fK\xf6d\xefK\x8e\xff\x00\x9a>\xe7\xea\xc8\x1b\xc1\x8c\xff\x00D>\xb8\xff\x00\x9c9...'
>>> bin(data[:][0])
'0b11111111'
OK, I can get a base-2 number, but I don't understand why data[:][x], and I still have the leading 0b开发者_如何学运维. It would also seem that I have to loop through the whole string and do some casting and parsing to get the correct output. Is there a simpler way to just get the sequence of 01's without looping, parsing, and concatenating strings?
Thanks in advance!
I would first precompute the string representation for all values 0..255
bytetable = [("00000000"+bin(x)[2:])[-8:] for x in range(256)]
or, if you prefer bits in LSB to MSB order
bytetable = [("00000000"+bin(x)[2:])[-1:-9:-1] for x in range(256)]
then the whole file in binary can be obtained with
binrep = "".join(bytetable[x] for x in open("file", "rb").read())
If you are OK using an external module, this uses bitstring:
>>> import bitstring
>>> bitstring.BitArray(filename='file.bin').bin
'110000101010000111000010101001111100...'
and that's it. It just makes the binary string representation of the whole file.
It is not quite clear what the sequence of bits is meant to be. I think it would be most natural to start at byte 0 with bit 0, but it actually depends on what you want.
So here is some code to access the sequence of bits starting with bit 0 in byte 0:
def bits_from_char(c):
i = ord(c)
for dummy in range(8):
yield i & 1
i >>= 1
def bits_from_data(data):
for c in data:
for bit in bits_from_char(c):
yield bit
for bit in bits_from_data(data):
# process bit
(Another note: you would not need data[:][0]
in your code. Simply data[0]
would do the trick, but without copying the whole string first.)
To convert raw binary data such as b'\xa1\xa7\xda4\x86'
into a bitstring that represents the data as a number in binary system (base-2) in Python 3:
>>> data = open('file.bin', 'rb').read()
>>> bin(int.from_bytes(data, 'big'))[2:]
'1010000110100111110110100011010010000110...'
See Convert binary to ASCII and vice versa.
精彩评论