
Is CouchDB good for a lot of documents with file attachments across multiple servers?

Developer https://www.devze.com 2023-03-24 01:58 Source: Internet

I would love to hear your thoughts about CouchDB and whether it would handle my use case.

I will have a database where I store documents of about 20 KB each, with an attachment of 1-10 MB for each.

  1. Will Couch handle a database of 10 TB or more per server with my schema? (In a 4U case you can put 24 2 TB drives; is this too much per Couch node? There will be very few reads, so I don't need speed.)

  2. Will Couch be able to replicate all documents with attachments?

  3. How about splitting all the data across multiple servers (for example, 4 nodes)? Will it handle that many attachments?

What problems do you see here?

If you need more info, please ask :)


I don't think you will hit a physical limitation with a 10TB file; that is, I don't think Couch has some built-in "can't use files bigger than X" limit with X being < 10TB.

However.

The biggest issue is file compaction. In order to reclaim space, Couch needs to compact the file. This effectively means copying the live data into a new file. So, for some period at least, 10TB needs to be 20TB, as it duplicates the live data in the new copy.

If you are mostly appending to the file, that is, you are simply adding new data and not updating or overwriting old data, then this will be less of a problem, as compaction won't gain you that much. If your data is basically static, then I would build the file, compact it a final time, and be done with it.
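To put rough numbers on the compaction headroom, here is a minimal sketch in Python. It assumes the old file stays on disk until the new copy, holding only the live data, is complete, and it ignores view indexes and filesystem overhead. The function name and the `live_fraction` parameter are mine, for illustration only.

```python
TB = 10**12  # decimal terabyte, in bytes


def compaction_peak_bytes(file_bytes: int, live_fraction: float) -> int:
    """Rough peak disk usage while a compaction is running.

    Assumption: old file + new file (live data only) coexist until the end.
    """
    if not 0.0 <= live_fraction <= 1.0:
        raise ValueError("live_fraction must be between 0 and 1")
    return file_bytes + int(file_bytes * live_fraction)


# Append-only data: nearly everything is live, so the peak is about double.
print(compaction_peak_bytes(10 * TB, 1.0) / TB)  # 20.0
# Half of the file is stale revisions: less headroom is needed.
print(compaction_peak_bytes(10 * TB, 0.5) / TB)  # 15.0
```

The append-only case is the worst one for headroom and the least rewarding, since there is little stale data to reclaim.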

There are "3rd party" sharding solutions for Couch; Lounge is a popular one.
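To illustrate the general idea of sharding by document ID, here is a small Python sketch. It is not how Lounge is implemented; it is just a plain hash-and-modulo picker, and the node URLs are hypothetical.

```python
import hashlib

# Hypothetical node URLs, for illustration only.
NODES = [
    "http://couch1.example:5984",
    "http://couch2.example:5984",
    "http://couch3.example:5984",
    "http://couch4.example:5984",
]


def node_for(doc_id: str, nodes=NODES) -> str:
    """Pick the node that owns a document, based on a hash of its ID."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap around.
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


# The same ID always maps to the same node.
print(node_for("invoice-2011-000123"))
```

With plain modulo, changing the number of nodes remaps most documents, so I would settle on the node count (or use a consistent-hashing scheme) before loading terabytes of attachments.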

When I approach a Couch solution, the primary thing to consider is what your query criteria are. Couch is all about the views, really. What kind of views are you looking at? If you're simply storing data by some simple key (file name, the date, or whatever), you may well be better off simply using a file system and an appropriate directory structure, frankly.

So I'd like to hear more about the views you plan to use, since you don't intend to do a lot of reading.
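For reference, this is roughly what a "simple key" view looks like, sketched in Python. The design document layout (`_id` starting with `_design/`, a `views` object with a JavaScript `map` function as a string) is standard CouchDB; the field name `file_name`, the view name `by_file_name`, the database name, and the host are assumptions for illustration.

```python
import json
from urllib.parse import quote

# Hypothetical field and view names; only the overall layout is CouchDB's.
design_doc = {
    "_id": "_design/files",
    "views": {
        "by_file_name": {
            # Map functions are JavaScript, stored in the document as a string.
            "map": "function (doc) { if (doc.file_name) { emit(doc.file_name, null); } }"
        }
    },
}


def view_url(base: str, db: str, key: str) -> str:
    """Build the URL that looks up one key in the view (keys are JSON-encoded)."""
    encoded_key = quote(json.dumps(key))
    return f"{base}/{db}/_design/files/_view/by_file_name?key={encoded_key}"


print(json.dumps(design_doc, indent=2))
print(view_url("http://localhost:5984", "archive", "report.pdf"))
```

If a lookup like this is all you ever need, that is the case where a directory structure on a file system does the same job.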

Addenda:

You still haven't mentioned what kind of queries you're looking for. The queries are, effectively, THE design component, especially for a Couch DB since it gets more and more difficult to add new queries on large datasets.

When you said attachments, I assumed you meant attachments to the Couch DB payload (since it can handle attachments).

So, all that said, you could easily create a metadata document capturing whatever information you want to capture, and as part of that document add a path name to the actual file stored on the file system. This will reduce the overall size of the Couch file dramatically, which makes the maintenance faster and more efficient. You lose some of the "self-contained" part of having it all in a single document, of course.
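A minimal sketch of such a metadata document in Python, assuming the file already sits on the file system. The field names (`file_path`, `file_size`, `file_sha256`) and the helper are hypothetical; the checksum is my addition so that a moved or damaged file can be detected later.

```python
import hashlib
import os
import tempfile


def build_metadata_doc(path: str, extra: dict) -> dict:
    """Return a small JSON-ready document that points at a file on disk."""
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in 1 MB chunks so a 10 MB attachment never sits fully in memory.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            sha.update(chunk)
    doc = dict(extra)
    doc.update({
        "file_path": os.path.abspath(path),
        "file_size": os.path.getsize(path),
        "file_sha256": sha.hexdigest(),
    })
    return doc


# Usage: a throwaway file stands in for a real attachment.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"hello")
doc = build_metadata_doc(tmp.name, {"title": "example"})
os.remove(tmp.name)
print(doc)
```

The resulting document stays small, and it is the only thing that would be stored in Couch; the file itself never goes into the database file.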
