Write a string of 1's and 0's to a binary file?

Developer https://www.devze.com 2023-04-02 18:51 Source: Internet
I want to take a string of 1's and 0's and convert it into an actual binary file. (Simply writing the string of 1's and 0's to a file would just produce an ASCII file containing the bytes "00110001" and "00110000".) I would prefer to do this in Python or directly from a bash shell, but Java or C is fine too. This is probably a one-time use.

Thanks.


In Python, use the int built-in function to convert the string of 0s and 1s to a number:

>>> int("00100101", 2)
37

Then use the chr built-in to convert an 8-bit integer (that is, one in the inclusive range 0-255) to a character.

>>> chr(_)
'%'

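The result of chr can be written to a file (opened in binary mode) with the file.write method. This works as written in Python 2; in Python 3, chr returns a text string, which a binary-mode file will not accept, so build the byte with bytes([n]) instead.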
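
For Python 3, the steps above can be sketched as follows. This is a minimal illustration, not the original answer's code: the function names bits_to_byte and write_one_byte are made up for the example, and it assumes the input is exactly 8 characters long.

```python
def bits_to_byte(bits: str) -> bytes:
    """Convert a string of exactly 8 '0'/'1' characters into one byte."""
    if len(bits) != 8:
        raise ValueError("expected exactly 8 bits")
    # int(..., 2) parses the text as base 2; bytes([n]) wraps 0-255 in a bytes object.
    return bytes([int(bits, 2)])


def write_one_byte(bits: str, path: str) -> None:
    """Write the single byte described by `bits` to `path` (hypothetical helper)."""
    with open(path, "wb") as f:  # "wb" = write, binary mode
        f.write(bits_to_byte(bits))
```

Calling write_one_byte("00100101", "somefile") should leave a one-byte file containing the character '%'.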


If you've got more than 8 characters to convert (and I'm presuming you do) then you'll need something like this (using Python):

>>> b = '0010101001010101010111101010100101011011'
>>> bytearray(int(b[x:x+8], 2) for x in range(0, len(b), 8))
bytearray(b'*U^\xa9[')

This splits the bit string into 8-character chunks (if your string's length isn't a multiple of 8, you should pad it first), converts each chunk into an integer, and then converts the sequence of integers into a bytearray. That bytearray, assumed here to be stored in a variable named the_bytearray, can be written directly to your binary file (there's no need to convert it to a string):

>>> with open('somefile', 'wb') as f:
...     f.write(the_bytearray)
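
The padding step mentioned above is not shown in the answer, so here is one possible sketch of it. The function name bitstring_to_bytes and the pad_left parameter are assumptions made for illustration; whether to pad on the left or the right depends on what the bits mean, so both are offered.

```python
def bitstring_to_bytes(bits: str, pad_left: bool = False) -> bytes:
    """Pad `bits` with zeros to a multiple of 8, then pack it 8 bits per byte."""
    remainder = len(bits) % 8
    if remainder:
        padding = "0" * (8 - remainder)
        # Left padding preserves the numeric value; right padding preserves
        # the position of the leading bits.
        bits = padding + bits if pad_left else bits + padding
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def write_bitstring(bits: str, path: str) -> None:
    """Write the packed bytes to `path` in binary mode (hypothetical helper)."""
    with open(path, "wb") as f:
        f.write(bitstring_to_bytes(bits))
```

With the 40-bit example string from above, no padding is needed and the result matches the bytearray shown earlier.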

If you have more tasks like this then there are libraries that can help, for example here's the same conversion using my bitstring module:

>>> from bitstring import BitArray
>>> with open('somefile', 'wb') as f:
...     BitArray(bin=b).tofile(f)


Any language that can do shifting can combine numbers of any radix. While I'm a fan of the different ways in which languages make this kind of manipulation easy, never forget that behind all of this is some very basic maths.

In this case, binary is just simple powers of 2, so:

    1 << 0 = 1
    1 << 1 = 2
    1 << 2 = 4
    1 << 3 = 8

and so on...

If you're taking the binary string 10100101, you can easily convert it to a byte as follows:

    (1 << 7) + (0 << 6) + (1 << 5) + (0 << 4) + (0 << 3) + (1 << 2) + (0 << 1) + 1

This assumes that you've gone through and converted each "0" or "1" to its numeric form first.

This will start getting a bit tedious if you're dealing with more bits than the 8 above, but since you're doing a byte at a time, a simple byte array in your chosen language will suffice, allowing you to push each byte in turn.

It's worth mentioning also that the same process can be used for other bases, and if you don't have a shift facility, a simple multiplication will generally work just as well.
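
As a rough illustration of both points, here is a small Python sketch. The function names are made up for this example: the first version uses shifts and only suits binary, while the second replaces the shift with a multiplication and so should work for any base from 2 to 36.

```python
def bits_to_int_shift(bits: str) -> int:
    """Accumulate a binary string MSB first using shift and OR."""
    value = 0
    for ch in bits:
        if ch not in "01":
            raise ValueError("not a binary digit: %r" % ch)
        # Shift what we have one place left, then drop the new bit into place.
        value = (value << 1) | (1 if ch == "1" else 0)
    return value


def digits_to_int(digits: str, base: int = 2) -> int:
    """Same idea with multiplication, so the base need not be a power of 2."""
    value = 0
    for ch in digits:
        value = value * base + int(ch, base)  # int(ch, base) decodes one digit
    return value
```

Both functions give 165 for "10100101", and digits_to_int("A5", 16) gives the same value.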

If you label the columns with their binary place values, you'll easily see what I mean. Taking the above example (remember, it's all powers of 2):

    1   0  1  0  0 1 0 1
    128 64 32 16 8 4 2 1 = 128 + 32 + 4 + 1 = 165

Not part of the question, but related... and taking it one step further:

Hexadecimal uses the digits 0 to F (16 values), each of which fits into 4 bits, so:

    1010 0101 (8+2) (4+1) - Binary using powers of 2 only on 4 bits (8 4 2 1)
    10   5    (Decimal) - (10 << 4) + 5 = 165
    A    5    (Hexadecimal)
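
The nibble arithmetic above can be checked with a short Python sketch. The helper name split_nibbles is an assumption made for illustration.

```python
def split_nibbles(bits: str) -> tuple:
    """Split an 8-bit string into its high and low 4-bit values."""
    if len(bits) != 8:
        raise ValueError("expected exactly 8 bits")
    return int(bits[:4], 2), int(bits[4:], 2)


high, low = split_nibbles("10100101")  # 1010 and 0101
value = (high << 4) + low              # recombine the two nibbles into one byte
hex_text = format(value, "02X")        # two uppercase hex digits
```

Here high and low come out as 10 and 5, value as 165, and hex_text as "A5", matching the table.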


In Java you have the built-in method Integer.parseInt(String strBinaryNumber, int radix).

It works like this:

    String strBinaryNumber = "00100101";
    System.out.println(Integer.parseInt(strBinaryNumber, 2));

The output will be: 37

However, an exception of type NumberFormatException is thrown if any of the following situations occurs:

  1. The first argument is null or is a string of length zero.
  2. The radix is either smaller than Character.MIN_RADIX or larger than Character.MAX_RADIX.
  3. Any character of the string is not a digit of the specified radix, except that the first character may be a minus sign '-' ('\u002D') provided that the string is longer than length 1.
  4. The value represented by the string is not a value of type int.


This isn't all that practical, but here's one way it could be done in a shell script. Note that it uses bc.

#!/bin/bash

# Name of your output file
OFILE="output.txt"

# A goofy wrapper to convert a sequence of 8 1s and 0s into an 8-bit number, expressed in hex
function bstr_to_byte()
{
    echo "obase=16;ibase=2;$1" | bc
}


# Build input string from stdin
#   This can be done using pipes ( echo "1010101..." | ./binstr.sh )
#   Or "interactively", so long as you enter q on its own line when you are done entering your
#       binary string.
ISTR=""
while read data; do
    if [[ ${data} != "q" ]] ; then
        ISTR="${ISTR}${data}"
    else
        break
    fi
done

# Byte-by-byte conversion
# ${#ISTR} is bash's built-in string length, so no external expr call is needed
while [[ ${#ISTR} -ge 8 ]] ; do
    # Copy the first 8 characters
    BSTR=${ISTR:0:8}
    # Drop them from the input string
    ISTR=${ISTR:8}
    # Convert the byte-string into a byte
    BYTE=$(bstr_to_byte $BSTR)

    # Debug print
    ##echo "$BSTR => [ ${BYTE} ]"

    # Write character to file
    echo -en "\x${BYTE}" >> ${OFILE}

    # Check for empty ISTR, which will cause error on iteration
    if [[ -z ${ISTR} ]] ; then
        ##echo "String parsed evenly"
        break
    fi
done

##echo "Remaining, unparsed characters: ${ISTR}"

If you name the script binstr.sh, it can be run by piping to stdin, e.g.:

echo "11001100" | ./binstr.sh

You can check this with something like hexdump, e.g. hexdump output.txt

I should point out that this assumes that your string is being entered with the MSB first. It will also simply discard any number of "bits" that don't form a complete byte. You could change this, or just make sure you pad your input sufficiently.
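
For comparison, the same conventions (MSB first, trailing bits that don't fill a byte dropped) can be sketched in Python, together with a read-back check that plays the role of hexdump. The function names are invented for this example, and unlike the script, which appends to its output file, this version overwrites it.

```python
import os
import tempfile


def write_bits_truncating(bits: str, path: str) -> str:
    """Write complete bytes (MSB first) to `path`; return the leftover bits."""
    usable = len(bits) - len(bits) % 8  # largest multiple of 8 that fits
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    with open(path, "wb") as f:
        f.write(data)
    return bits[usable:]


def read_bits(path: str) -> str:
    """Read a binary file back as a string of 1s and 0s, 8 per byte."""
    with open(path, "rb") as f:
        return "".join(format(byte, "08b") for byte in f.read())


def demo() -> tuple:
    """Round-trip 11 bits through a temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "output.bin")
        leftover = write_bits_truncating("11001100101", path)
        return read_bits(path), leftover
```

In demo(), the first 8 bits are written as one byte and the final three bits ("101") are returned as leftover rather than written.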

Lastly, there are some debugging lines I left in there but commented out with double # signs.
