开发者

Is it possible to read/write bits from a file using JAVA?

开发者 https://www.devze.com 2023-01-26 08:20 出处:网络
To read/write binary files, I am using DataInputStream/DataOutputStream, they have this method writeByte()/readByte(), but what I want to do is read/write bits? Is it possible?

To read/write binary files, I am using DataInputStream/DataOutputStream, they have this method writeByte()/readByte(), but what I want to do is read/write bits? Is it possible?

I want to use it for a compression al开发者_开发技巧gorithm, so when I am compressing I want to write 3 bits(for one number and there are millions of such numbers in a file) and if I write a byte at everytime I need to write 3 bits, I will write loads of redundant data...


It's not possible to read/write individual bits directly, the smallest unit you can read/write is a byte.

You can use the standard bitwise operators to manipulate a byte though, so e.g. to get the lowest 2 bits of a byte, you'd do

byte b = in.readByte();
byte lowBits = b&0x3;

set the low 4 bits to 1, and write the byte:

b |= 0xf;
out.writeByte(b);

(Note, for the sake of efficiency you might want to read/write byte arrays and not single bytes)


There's no way to do it directly. The smallest unit computers can handle is a byte (even booleans take up a byte). However you can create a custom stream class that packs a byte with the bits you want then writes it. You can then make a wrapper for this class who's write function takes some integral type, checks that it's between 0 and 7 (or -4 and 3 ... or whatever), extracts the bits in the same way the BitInputStream class (below) does, and makes the corresponding calls to the BitOutputStream's write method. You might be thinking that you could just make one set of IO stream classes, but 3 doesn't go into 8 evenly. So if you want optimum storage efficiency and you don't want to work really hard you're kind of stuck with two layers of abstraction. Below is a BitOutputStream class, a corresponding BitInputStream class, and a program that makes sure they work.

import java.io.IOException;
import java.io.OutputStream;

class BitOutputStream {

    private OutputStream out;
    private boolean[] buffer = new boolean[8];
    private int count = 0;

    public BitOutputStream(OutputStream out) {
        this.out = out;
    }

    public void write(boolean x) throws IOException {
        this.count++;
        this.buffer[8-this.count] = x;
        if (this.count == 8){
            int num = 0;
            for (int index = 0; index < 8; index++){
                num = 2*num + (this.buffer[index] ? 1 : 0);
            }

            this.out.write(num - 128);

            this.count = 0;
        }
    }

    public void close() throws IOException {
        int num = 0;
        for (int index = 0; index < 8; index++){
            num = 2*num + (this.buffer[index] ? 1 : 0);
        }

        this.out.write(num - 128);

        this.out.close();
    }

}

I'm sure there's a way to pack the int with bit-wise operators and thus avoid having to reverse the input, but I don't what to think that hard.

Also, you probably noticed that there is no local way to detect that the last bit has been read in this implementation, but I really don't want to think that hard.

import java.io.IOException;
import java.io.InputStream;

class BitInputStream {

    private InputStream in;
    private int num = 0;
    private int count = 8;

    public BitInputStream(InputStream in) {
        this.in = in;
    }

    public boolean read() throws IOException {
        if (this.count == 8){
            this.num = this.in.read() + 128;
            this.count = 0;
        }

        boolean x = (num%2 == 1);
        num /= 2;
        this.count++;

        return x;
    }

    public void close() throws IOException {
        this.in.close();
    }

}

You probably know this, but you should put a BufferedStream in between your BitStream and FileStream or it'll take forever.

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Random;

class Test {

    private static final int n = 1000000;

    public static void main(String[] args) throws IOException {

        Random random = new Random();

        //Generate array

        long startTime = System.nanoTime();

        boolean[] outputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            outputArray[index] = random.nextBoolean();
        }

        System.out.println("Array generated in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Write to file

        startTime = System.nanoTime();

        BitOutputStream fout = new BitOutputStream(new BufferedOutputStream(new FileOutputStream("booleans.bin")));

        for (int index = 0; index < n; index++){
            fout.write(outputArray[index]);
        }

        fout.close();

        System.out.println("Array written to file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Read from file

        startTime = System.nanoTime();

        BitInputStream fin = new BitInputStream(new BufferedInputStream(new FileInputStream("booleans.bin")));

        boolean[] inputArray = new boolean[n];
        for (int index = 0; index < n; index++){
            inputArray[index] = fin.read();
        }

        fin.close();

        System.out.println("Array read from file in " + (double)(System.nanoTime() - startTime)/1000/1000/1000 + " seconds.");

        //Delete file
        new File("booleans.bin").delete();

        //Check equality

        boolean equal = true;
        for (int index = 0; index < n; index++){
            if (outputArray[index] != inputArray[index]){
                equal = false;
                break;
            }
        }

        System.out.println("Input " + (equal ? "equals " : "doesn't equal ") + "output.");
    }

}


Please take a look at my bit-io library https://github.com/jinahya/bit-io, which can read and write non-octet-aligned values such as a 1-bit boolean or 17-bit unsigned integer.

<dependency>
  <!-- resides in central repo -->
  <groupId>com.googlecode.jinahya</groupId>
  <artifactId>bit-io</artifactId>
  <version>1.0-alpha-13</version>
</dependency>

This library reads and writes arbitrary-length bits.

final InputStream stream;
final BitInput input = new BitInput(new BitInput.StreamInput(stream));

final int b = input.readBoolean(); // reads a 1-bit boolean value
final int i = input.readUnsignedInt(3); // reads a 3-bit unsigned int
final long l = input.readLong(47); // reads a 47-bit signed long

input.align(1); // 8-bit byte align; padding


final WritableByteChannel channel;
final BitOutput output = new BitOutput(new BitOutput.ChannelOutput(channel));

output.writeBoolean(true); // writes a 1-bit boolean value
output.writeInt(17, 0x00); // writes a 17-bit signed int
output.writeUnsignedLong(54, 0x00L); // writes a 54-bit unsigned long

output.align(4); // 32-bit byte align; discarding


InputStreams and OutputStreams are streams of bytes.

To read a bit you'll need to read a byte and then use bit manipulation to inspect the bits you care about. Likewise, to write bits you'll need to write bytes containing the bits you want.


Yes and no. On most modern computers, a byte is the smallest addressable unit of memory, so you can only read/write entire bytes at a time. However, you can always use bitwise operators to manipulate the bits within a byte.


Bits are packaged in bytes and apart from VHDL/Verilog I have seen no language that allows you to append individual bits to a stream. Cache up your bits and pack them into a byte for a write using a buffer and bitmasking. Do the reverse for read, i.e. keep a pointer in your buffer and increment it as you return individually masked bits.


Afaik there is no function for doing this in the Java API. However you can of course read a byte and then use bit manipulation functions. Same goes for writing.


If you are just writing bits to a file, Java's BitSet class might be worth a look at. From the javadoc:

This class implements a vector of bits that grows as needed. Each component of the bit set has a boolean value. The bits of a BitSet are indexed by nonnegative integers. Individual indexed bits can be examined, set, or cleared. One BitSet may be used to modify the contents of another BitSet through logical AND, logical inclusive OR, and logical exclusive OR operations.

You are able to convert BitSets to long[] and byte[] to save data to a file.


The below code should work

    int[] mynumbers = {3,4};
    BitSet compressedNumbers = new BitSet(mynumbers.length*3);
    // let's say you encoded 3 as 101 and 4 as 010
    String myNumbersAsBinaryString = "101010"; 
    for (int i = 0; i < myNumbersAsBinaryString.length(); i++) {
        if(myNumbersAsBinaryString.charAt(i) == '1')
            compressedNumbers.set(i);
    }
    String path = Resources.getResource("myfile.out").getPath();
    ObjectOutputStream outputStream = null;
    try {
        outputStream = new ObjectOutputStream(new FileOutputStream(path));
        outputStream.writeObject(compressedNumbers);
    } catch (IOException e) {
        e.printStackTrace();
    }
0

精彩评论

暂无评论...
验证码 换一张
取 消