开发者

Is there a B-Tree Database or framework in Python?

开发者 https://www.devze.com 2023-01-19 08:48 出处:网络
I 开发者_JAVA百科heard that B-Tree databases are faster than Hash tables, so I thought of using a B-Tree database for my project. Is there any existing framework in python which allows us to use such

I 开发者_JAVA百科heard that B-Tree databases are faster than Hash tables, so I thought of using a B-Tree database for my project. Is there any existing framework in python which allows us to use such Data structure or will I have to code from scratch?


The only reason to choose a B-Tree over a hash table, either in memory or with block storage (as in a database) is to support queries other than equal. A b-tree permits you perform range queries with good performance. Many key-value stores (such as berkley db) don't make this externally visible, though, because they still hash the keys, but this still lets you iterate over the whole dataset quickly and stably (iterators remain valid even if there are adds or deletes, or the tree must be rebalanced).

If you don't need range queries, and you don't need concurrent iteration, then you don't need b-trees, use a hash table, it will be faster at any scale.

Edit: I've had occasion for the above to actually be true; for this, the blist package seems to be the most complete implementation of a sorted container library.


you should really check out zodb. http://www.zodb.org/en/latest/

i made a monography about it long go, though its in spanish http://sourceforge.net/projects/banta/files/Labs/zodb/Monografia%20-%20ZODB.pdf/download

Information in english is all over the place.


Program what you are trying to do first, then optimize if needed. Period.

EDIT:

http://pypi.python.org/pypi/blist

Drop in replacement for python's built in list.


SQLite3 uses B+ Trees internally, but it sounds like you may want a key-value store. Try Berkeley DB for that. If you don't need transactions, try HDF5. If you want a distributed key-value store, there is also http://scalien.com/keyspace/, but that is a server-client type system which would open up all sorts of NoSQL key-value stores.

All of these systems will be O(log(n)) for insertion and retrieval, so they will probably be slower than the hash tables that you're currently using.

Kyoto Cabinet offers a hash tree, so that may be more of what you're looking at since it should be O(1) for insertion and retrieval, but you can't do in-order traversal if you need that (although since you're currently using hash trees, this shouldn't be an issue).

http://fallabs.com/kyotocabinet/

If you're looking for performance, you will need to have the speed critical items implemented in a compiled language and then have a wrapper API in Python.


Here there is a good btree pure python implementation. You can adapt it if needed.


You might want to have a look at mxBeeBase which is part of the eGenix mx Base Distribution. It includes a fast on-disk B+Tree implementation and provides storage classes which allow building on-disk dictionaries or databases in Python.

0

精彩评论

暂无评论...
验证码 换一张
取 消