I'm looking at solutions to store a massive quantity of information consuming the less possible disk space.
The informatio开发者_如何学JAVAn structure is very simple and the queries will also be very simple. I've looked at solutions like Apache Cassandra and relations databases but couldn't find a comparison where disk usage is mentioned.
Any ideas on this would be great.
Speaking about Apache Cassandra - it's just a disk space hog. 200 MB of logs resulted in 1.2 GB files produced by Cassandra - and the keyspace was just 4 columns with 200 length strings.
Take a look at Oracle Berkeley DB - very simple robust database (key/value):
"Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the handheld device to the datacenter, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes."
Redis might worth a check if you can store your data in key-value
Newest version of Microsoft's SQL Server (2008) supports several levels of compression (row compression and page compression, in addition to backup compression). Might be worth investigating.
Some relevant resources:
- Linchi Shea shows that compression can sometimes improve performance
- Official MS Best Pracices doc for SQL 2008 compression
精彩评论