开发者

Structure for large volume of semi-persistent data?

开发者 https://www.devze.com 2023-03-16 11:24 出处:网络
I need to track a large volume of inotify messages for a set of files that, during their lifetime, will move between several specific directories, with inodes intact; I need to track the movement of t

I need to track a large volume of inotify messages for a set of files that, during their lifetime, will move between several specific directories, with inodes intact; I need to track the movement of these inodes, as well as create/delete and changes to a file's content. There will be many hundreds of changes per second.

Because of limited resources, I cant store it all in RAM (or disk, or a database).

Luckily, most of these files will be deleted in short order; the file content- and movement-history just need to be stored for later analysis. The files that are not deleted immediately will end up staying in a particular directory for a known period of time.

So it seems to me that I need a data structure that is partially 开发者_如何学Cstored in RAM, and partially saved to disk; part of the portion saved to disk will need to be recalled (the files not deleted), but most will not. I will not need to query the data, only access it by an identifier (the file name, which is [A-Z0-9]{8}). It would be helpful to be able to configure when the file data is flushed to disk.

Does such a beast exist?

Edit: I've asked a related question.


Why not a database? Say SQLite.

While SQLite isn't the most efficient storage mechanism in terms of space there are a number of advantages -- the first and foremost is that is an SQL RDBMS. The amount of memory SQLite uses (to temporarily cache data) can be configured through the cache_size pragma.

If SQLite isn't an option, what about one of the "key value stores"? They range from distributed client/server in-memory (e.g. memcached) to local embedded disk-based (e.g BDB) to memory-with-a-persistent-backing-for-overflow and anywhere in-between, etc. They do not have the SQL DDL/DQL (although some might allow relationships), but are efficient at what they do -- store keys and values.

Of course, one could always implement a LRU structure (say a basic sorted list with a limit) with overflow to a simple extensible disk-based hash implementation... but... consider above first :) [There may also be some micro-KV libraries/source out there].

Happy coding.

0

精彩评论

暂无评论...
验证码 换一张
取 消