I'd like to create a SQLITE file that includes 5,000,000 tinyint values (between 0 - 256). According to my calculation, this file should use roughly 5[MB].
Yet, I am getting a file of size closer to 50[MB].
Here's the code that I use:
import sqlite3
# Create and populate a database
db = sqlite3.connect('/home/mine/temp.db')
db.execute("CREATE TABLE IF NOT EXISTS test(id INT2)")
for i in range(0, 5000000):
db.execute("INSERT INTO tes开发者_如何学运维t(id) VALUES({0})".format(0))
db.commit()
db.close()
Any ideas how to optimize things so that it takes much less space?
There's a lot more to a database than just space for 5 million tiny ints. SQLite has published details of their file format. It seems to say you're going to need at least 9 bytes per row.
SQLite uses two distinct variants of the B-Tree structure:
- The table B-Tree, which uses 64-bit integer values for keys. In a table B-Tree, an associated database record (section 2.3.2) is stored along with each entry. Table B-Tree structures are described in detail in section 2.3.4.
- The index B-Tree, which uses database records as keys. Index B-Tree structures are described in detail in section 2.3.3.
Have you considered using a two-column Value/Frequency table? It'd make queries a little more complicated (e.g., you'd need SUM(Value * Freq) / SUM(Freq)
instead of AVG(Value)
), but you'd only need at most 256 rows instead of 5 million.
精彩评论