java.util.BitSet -- set() doesn't work as expected

Developer https://www.devze.com 2022-12-31 04:37 Source: Internet
Am I missing something painfully obvious? Or does just nobody in the world actually use java.util.BitSet?

The following test fails:

@Test
public void testBitSet() throws Exception {
    BitSet b = new BitSet();
    b.set(0, true);
    b.set(1, false);
    assertEquals(2, b.length());
}

It's really unclear to me why I don't end up with a BitSet of length 2 and the value 10. I peeked at the source for java.util.BitSet, and on casual inspection it seems to fail to make sufficient distinction between a bit that's been set false and a bit that has never been set to any value...

Note that explicitly setting the size of the BitSet in the constructor has no effect, e.g.:

BitSet b = new BitSet(2);


Your highest set bit (as in "set to 1") is bit 0, so the length should be 1.

See the JavaDoc for length:

public int length()

Returns the "logical size" of this BitSet: the index of the highest set bit in the BitSet plus one. Returns zero if the BitSet contains no set bits.

Maybe you're looking for size(), although that can be higher than two, because bits are allocated in larger units (64-bit words in the standard implementation).
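To make the distinction concrete, here is a minimal sketch that replays the two calls from the question. The class name LengthDemo and the helper fromQuestion are made up for illustration; only documented BitSet methods are used.

```java
import java.util.BitSet;

class LengthDemo {

    // Replays the question's two calls and returns the resulting BitSet.
    static BitSet fromQuestion() {
        BitSet b = new BitSet();
        b.set(0, true);   // bit 0 becomes 1
        b.set(1, false);  // bit 1 was already 0, so nothing changes
        return b;
    }

    public static void main(String[] args) {
        BitSet b = fromQuestion();
        System.out.println(b.length()); // 1: highest set bit is index 0, plus one
        System.out.println(b.get(1));   // false, the same answer as for any untouched index
        System.out.println(b);          // {0}
    }
}
```

Setting a bit to false is indistinguishable from never touching it, which is why the test in the question sees a length of 1 rather than 2.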


People do use BitSet; however, they use it for something other than what you intend. It's probably best to think of BitSet as a very compact, memory-efficient form of Set<Integer> that has the peculiar property that you can't put negative numbers into it.

It's very common with BitSets to use them in the pattern of

for (int id = set.nextSetBit(0); id >= 0; id = set.nextSetBit(id + 1)) {
  // do stuff to a set index
}

after you do something to fill them up. This is equivalent to iterating over the elements of the Set.
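As a rough illustration of the Set<Integer> analogy, the sketch below fills a BitSet and then walks it with that loop. IterationDemo and members are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

class IterationDemo {

    // Collects the indices of all set bits in ascending order.
    static List<Integer> members(BitSet set) {
        List<Integer> out = new ArrayList<>();
        for (int id = set.nextSetBit(0); id >= 0; id = set.nextSetBit(id + 1)) {
            out.add(id);
        }
        return out;
    }

    public static void main(String[] args) {
        BitSet set = new BitSet();
        set.set(3);    // like Set.add(3)
        set.set(70);
        set.set(5);
        set.clear(5);  // like Set.remove(5)
        System.out.println(members(set)); // [3, 70]
    }
}
```

On Java 8 and later, set.stream() gives the same ascending traversal as an IntStream.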


This puzzled me too; I'm not sure of the rationale behind BitSet's current, rather unexpected behavior. However, since the class is not final, we can use some embrace-and-extend tactics and do the following to get a fixed BitSet with the length semantics you expected:

import java.util.BitSet;

/**
 * Variation of BitSet which does NOT interpret the highest bit synonymous with
 * its length.
 *
 * @author casper.bang@gmail.com
 */
public class FixedBitSet extends BitSet{

    int fixedLength;

    public FixedBitSet(int fixedLength){
        super(fixedLength);
        this.fixedLength = fixedLength;
    }

    @Override
    public int length() {
        return fixedLength;
    }
}


Given that the BitSet is backed by a long[], the size of a BitSet created with the default constructor is 64 (because one long is 64 bits). The size grows in multiples of 64, and for some reason the class does not keep track of the number of bits you intended to represent when you use the constructor that takes an int.
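A small sketch of what this looks like in practice. The values in the comments are what OpenJDK reports; the Javadoc only promises that size() is the number of bits of space actually in use, so treat the exact numbers as implementation detail. SizeDemo and sizeFor are hypothetical names.

```java
import java.util.BitSet;

class SizeDemo {

    // Returns size() of a BitSet created with the given initial capacity in bits.
    static int sizeFor(int nbits) {
        return new BitSet(nbits).size();
    }

    public static void main(String[] args) {
        System.out.println(new BitSet().size());    // 64 on OpenJDK
        System.out.println(sizeFor(2));             // 64: rounded up to one long
        System.out.println(sizeFor(65));            // 128: two longs
        System.out.println(new BitSet(2).length()); // 0: no bit is set yet
    }
}
```

The constructor argument is only a capacity hint, which is why neither size() nor length() returns 2 for new BitSet(2).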


// Abhay Dandekar

import java.util.BitSet;

public class TestBitSet {

    public static void main(String[] args) {

        BitSet bitSet = new BitSet();
        System.out.println("State 0 : " + bitSet.size() + " : " + bitSet.length() );

        bitSet.set(0, true);
        bitSet.set(1, true);
        System.out.println("State 1 : " + bitSet.size() + " : " + bitSet.length() );

        bitSet.set(2, false);
        bitSet.set(3, false);
        System.out.println("State 2 : " + bitSet.size() + " : " + bitSet.length() );

        bitSet.set(4, true);
        System.out.println("State 3 : " + bitSet.size() + " : " + bitSet.length() );

    }
}

A simple Java program to show what happens inside. Some points to note:

  1. BitSet is backed by a long[]

  2. All the default values are false

  3. length() returns the index of the highest "true" value in the set, plus one.

The output below should be able to explain itself :

State 0 : 64 : 0

State 1 : 64 : 2

State 2 : 64 : 2

State 3 : 64 : 5

So, points to conclude:

  1. Do not use length() to infer the number of bits that have been modified

  2. BitSet can be used in scenarios like Bloom filters. More on Bloom filters can be googled .. ;)
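If what you actually need is a count, a sketch along these lines may help. cardinality() counts the bits that are currently true; BitSet does not record which indices were merely touched, so that has to be tracked separately. CountDemo, summarize and the touched counter are assumptions made for illustration.

```java
import java.util.BitSet;

class CountDemo {

    // Writes values[i] to bit i and returns {length, cardinality, touched}.
    static int[] summarize(boolean[] values) {
        BitSet bits = new BitSet();
        int touched = 0; // our own bookkeeping; BitSet does not record this
        for (int i = 0; i < values.length; i++) {
            bits.set(i, values[i]);
            touched++;
        }
        return new int[] { bits.length(), bits.cardinality(), touched };
    }

    public static void main(String[] args) {
        int[] r = summarize(new boolean[] { true, true, false, true, false });
        System.out.println("length      = " + r[0]); // 4: highest true bit is index 3
        System.out.println("cardinality = " + r[1]); // 3: three bits are true
        System.out.println("touched     = " + r[2]); // 5: five calls to set
    }
}
```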

Hope this helps

Regards,

Abhay Dandekar


Good one, Casper! Your small improvement should indeed have been present in the original BitSet definition in Java. I also suggest this (append() and concat() are useful in various cases):

import java.util.BitSet;

public class fixBitSet extends BitSet {

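  // Number of bits written so far: highest index passed to set(int, boolean), plus one.
  public int fsize = 0;

  // Only this overload tracks fsize; set(int), set(int, int), clear(...) etc. do not.
  @Override
  public void set(int k, boolean value) {
    if (k >= fsize)
      fsize = k + 1;
    super.set(k, value);
  }

  // Appends all fsize bits of bs (including the false ones) after this set's bits.
  public void append(fixBitSet bs) {
    for (int k = 0; k < bs.fsize; k++)
      super.set(fsize + k, bs.get(k));
    fsize += bs.fsize;
  }

  // Concatenates the given sets, in order, into a new fixBitSet.
  public static fixBitSet concat(fixBitSet[] vbs) {
    final fixBitSet bs = new fixBitSet();
    for (fixBitSet xbs : vbs)
      bs.append(xbs);
    return bs;
  }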

}