开发者

Extremely large Boolean list in Python

开发者 https://www.devze.com 2022-12-08 13:30 出处:网络
I want to create an object in python that is a collection of around 200,000,000 true/false values. So that I can most effectively change or recall any given true/false value, so that I can quickly det

I want to create an object in python that is a collection of around 200,000,000 true/false values. So that I can most effectively change or recall any given true/false value, so that I can quickly determine if any given number, like 123,456,000 is true or false or change its value.

Is the best way to do this a list? or an array? or a class? or just a long int using bit operations? or something else?

I'm a bit noob so you may have to spell things out for me more th开发者_运维知识库an if I were asking the question in one of the other languages I know better. Please give me examples of how operating on this object would look.

Thanks


You can try the bitarray module, or write a similar thing using an array of integers yourself.


"quickly determine if any given number, like 123,456,000 is" in the "true" set or "false" set.

This is what a set is for.

The "true" set is a set of all the numbers.

To make a number's boolean flag "true", add it to the true set.

To make a number's boolean flag "false", remove it from the true set.

Life will be much simpler.


Have you considered using a lightweight database like SQLite?


You might also like to try the bitstring module, which is pure Python. Internally it's all stored as a byte array and the bit masking and shifting is done for you:

from bitstring import BitArray
# Initialise with two hundred million zero bits
s = BitArray(200000000)
# Set a few bits to 1
s.set(1, [76, 33, 123456000])
# And test them
if s.all([33, 76, 123456000]):
    pass

The other posters are correct though that a simple set might be a better solution to your particular problem.


At first glance, the Python BitVector module sounds like it does exactly what you want. It's available at http://cobweb.ecn.purdue.edu/~kak/dist/BitVector-1.5.1.html and since it is pure Python code, it will run on any platform with no compiling needed.

You mentioned that you need some speed in getting and setting any arbitrary true-false value. For that you need to use a Python array, rather than a list, and if you go to the above URL and browse the source code for BitVector you can see that it does indeed rely on Python arrays.

Ideally, you would encapsulate what you are doing in a class that subclasses from BitVector, i.e.

class TFValues(BitVector):
   pass

That way you can do things like add a list to contain associated info such as the name of a particular TF value.


If the bits which are set are mostly consecutive, there's also the option to store a list of ranges, e.g. the PyPI module https://pypi.org/project/range_set/ which is API compatible to Python's set class.

0

精彩评论

暂无评论...
验证码 换一张
取 消

关注公众号