What is the best database for this type of situation with 10,000 inserts a second?

https://www.devze.com 2023-04-12 05:34 (source: web)
I have an application that needs to save a large number of this type of class/information:

public struct PrimaryPacket
{
    public uint IPAddress;
    public ushort UDPPort;
    public ushort TCPPort;
    public uint RequestID;
    public byte Hop;
    public byte FreePrimaries;
    public byte FreeSecondaries;
    public ushort Length;
    public byte[] Data;
}

Currently I am using this for storing the items, with the key being the IPAddress and RequestID in a uint[] array:

ConcurrentDictionary<uint[], PrimaryPacket> Packets = new ConcurrentDictionary<uint[], PrimaryPacket>();

However, I need to store an extremely large number of these -- about 10,000 new items per second, kept for up to one hour -- and the memory usage becomes insane.

If I were to use a database for this, would it be any more efficient (and less of a memory hog)? And should I be looking at using MySQL or something like MongoDB?


If you just want a log of all the data so that you can restore your state in the case of a restart, a simple flat file will work fine. Any system can keep up with writing 1MB/sec as long as you buffer the writes. If you're going to be doing random access, though, it's a different story.
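A minimal sketch of that append-only log, assuming a fixed-size header per record followed by the variable-length payload (the file name and field values are illustrative, not from the question):

```csharp
using System;
using System.IO;

class PacketLog
{
    static void Main()
    {
        // A FileStream with a large buffer batches many small records
        // into one OS write, which is what keeps 10k records/sec cheap.
        using var stream = new FileStream("packets.log", FileMode.Append,
                                          FileAccess.Write, FileShare.Read,
                                          bufferSize: 1 << 16); // 64 KB buffer
        using var writer = new BinaryWriter(stream);

        // Illustrative record: fixed-size header, then the payload bytes.
        uint ipAddress = 0x0A000001;
        uint requestId = 42;
        byte[] data = { 1, 2, 3 };

        writer.Write(ipAddress);
        writer.Write(requestId);
        writer.Write((ushort)data.Length);
        writer.Write(data);
    } // disposing the writer/stream flushes the buffer to disk
}
```

On restart you would replay the file front to back to rebuild in-memory state; records older than an hour can be skipped during the replay.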

You mentioned that you're going to have 10k inserts per second. Even without reads (writing only), it will take a lot of work and fairly expensive hardware to get that kind of bandwidth for random access.

Since you are only going to be keeping an hour's worth of data (36M records), it will probably be much cheaper and easier to just store the data in memory. Assuming it takes 100 bytes to store all the data for a record, you'll only need an additional 4GB. Because it requires 4GB just to store the data, I'm going to assume you have a 64-bit machine.

Your current implementation of ConcurrentDictionary<uint[], PrimaryPacket> has some problems, though.

First of all, using a uint[] as a dictionary key is a bad idea: arrays use reference equality, so two different arrays with the same contents are not considered equal -- you'll never be able to look up anything in your dictionary! Since the key is 8 bytes, I would recommend a value type such as a ulong, a KeyValuePair<uint, uint>, or a custom struct. I would not recommend a Tuple<uint, uint> because it carries something like 24 bytes of overhead.
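For example, the two 32-bit values can be packed into a single ulong key (MakeKey is an illustrative name, not from the original code):

```csharp
using System;
using System.Collections.Concurrent;

class KeyDemo
{
    // Pack IPAddress into the high 32 bits and RequestID into the low 32 bits.
    static ulong MakeKey(uint ipAddress, uint requestId)
        => ((ulong)ipAddress << 32) | requestId;

    static void Main()
    {
        var packets = new ConcurrentDictionary<ulong, string>();
        packets[MakeKey(0x0A000001, 42)] = "payload";

        // Unlike uint[] keys, two ulong keys built from the same
        // values compare equal, so the lookup succeeds:
        Console.WriteLine(packets.ContainsKey(MakeKey(0x0A000001, 42))); // True
    }
}
```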

Second, you are defining PrimaryPacket as a struct. For an object this large, you will likely get better performance defining it as a class: every dictionary read or write of a struct copies the whole value, while a class only copies an 8-byte reference.
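The change is just the declaration keyword; the fields stay exactly as in the question:

```csharp
// Same fields as the question's struct, declared as a class so the
// dictionary stores an 8-byte reference instead of copying the value.
public class PrimaryPacket
{
    public uint IPAddress;
    public ushort UDPPort;
    public ushort TCPPort;
    public uint RequestID;
    public byte Hop;
    public byte FreePrimaries;
    public byte FreeSecondaries;
    public ushort Length;
    public byte[] Data;
}
```

Note the byte[] Data field already puts the payload on the heap either way, so the struct was never fully "inline" to begin with.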


10,000 per second?! I hope that is just in short bursts, because otherwise that's 864 million per day (roughly 86% of eBay's daily transaction volume). I would always recommend DB dumps for high volume like this.

Check out:

http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis

I've worked a bit with Cassandra which is great for high-volume writing.
