More efficient to store text as file or in DB?

Developer https://www.devze.com 2022-12-13 08:23 Source: Internet
Imagine you're dealing with many strings of text that are about 10,000 characters long entered by users. Would it be more efficient to write those automatically onto pages or input them onto a table in a database? I hope that question is clear enough...


It depends on what sort of "efficiency" you're aiming for.

Here's what I mean:

  • will you be changing the content of your text strings?
  • what sorts of searches will you be doing?
  • when you extract the text, what do you do with it?

My opinion is that provided you're not going to change the content much, nor perform much analysis, you're better off with the database.


10k isn't particularly large, so either is fine. I would personally use the database, as it will allow you to easily search through the text.


It depends on how you're accessing them, but normally using the file system (FS) would result in better performance. The obvious reason is that the DB is another layer built on top of the FS, so using the FS directly saves you the DBMS operations. That assumes no extra heavy processing on your side: for example, keep hundreds of named files rather than one big bloated file, ordered in some special way, that you then need to parse.
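A minimal sketch of the "one named file per text" idea, in Python. The helper names `save_text` and `load_text`, the numeric ID scheme, and the temporary directory are assumptions made for illustration; they are not from the original answer.

```python
import tempfile
from pathlib import Path


def save_text(root: Path, text_id: int, text: str) -> Path:
    """Write one user text to its own file, named after its ID."""
    path = root / f"{text_id}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def load_text(root: Path, text_id: int) -> str:
    """Read a text back by ID; no parsing of a larger file is needed."""
    return (root / f"{text_id}.txt").read_text(encoding="utf-8")


# Hypothetical storage directory for the example.
root = Path(tempfile.mkdtemp())
save_text(root, 1, "x" * 10_000)
loaded = load_text(root, 1)
```

Lookup by ID is a single file read here, which is the case where skipping the DBMS layer is most likely to pay off.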


I'm wondering if SQLite would be the best of both worlds, or at least, the best database for that size of job.
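For comparison, a rough sketch of what the SQLite route could look like using Python's standard `sqlite3` module. The table name `texts`, its columns, and the helpers `add_text` and `get_text` are hypothetical, and an in-memory database stands in for a real file.

```python
import sqlite3

# In-memory database for the example; a real app would pass a file path.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE texts ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " body TEXT NOT NULL)"
)


def add_text(conn: sqlite3.Connection, user_id: int, body: str) -> int:
    """Insert one text and return the row ID SQLite assigned to it."""
    cur = conn.execute(
        "INSERT INTO texts (user_id, body) VALUES (?, ?)", (user_id, body)
    )
    conn.commit()
    return cur.lastrowid


def get_text(conn: sqlite3.Connection, text_id: int):
    """Return the stored text, or None if the ID does not exist."""
    row = conn.execute(
        "SELECT body FROM texts WHERE id = ?", (text_id,)
    ).fetchone()
    return row[0] if row else None


new_id = add_text(conn, 42, "y" * 10_000)
```

The whole database lives in a single file with no server process, which is why it may sit between the two options discussed above.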


The real answer here is what you're going to do with these strings.

Databases are meant to be able to quickly return specific records. If you're just going to SELECT * FROM Table and then concat it all together, there's no point in using a database.

However, if you have a relation between your data that you want to be able to search, then a database will likely be more efficient.

For example, do you want to be able to pull up all the text records from a set of users on a set of dates? Find all records from users who match some records?

A database will likely handle these kinds of loads more efficiently than a naive file-based implementation, and probably still faster than a decent one, even though the file-based approach avoids some access layers.
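As an illustration of the "set of users on a set of dates" kind of query, here is a small sketch with `sqlite3`. The `entries` table, its columns, the index, and the sample rows are all made up for the example; dates are stored as ISO `YYYY-MM-DD` strings so that string comparison matches date order.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE entries ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL,"
    " created_on TEXT NOT NULL,"  # ISO date string, e.g. 2022-12-05
    " body TEXT NOT NULL)"
)
# An index on the searched columns lets the database avoid scanning every row.
db.execute("CREATE INDEX idx_entries_user_date ON entries (user_id, created_on)")
db.executemany(
    "INSERT INTO entries (user_id, created_on, body) VALUES (?, ?, ?)",
    [
        (1, "2022-12-01", "first text"),
        (1, "2022-12-10", "second text"),
        (2, "2022-12-05", "third text"),
        (3, "2022-12-05", "fourth text"),
    ],
)


def entries_for(db, user_ids, start, end):
    """Return (user_id, created_on, body) rows for the given users and date range."""
    placeholders = ",".join("?" for _ in user_ids)
    sql = (
        "SELECT user_id, created_on, body FROM entries "
        f"WHERE user_id IN ({placeholders}) "
        "AND created_on BETWEEN ? AND ? "
        "ORDER BY created_on, id"
    )
    return db.execute(sql, [*user_ids, start, end]).fetchall()


rows = entries_for(db, [1, 2], "2022-12-01", "2022-12-05")
```

Doing the same with plain files would mean maintaining your own index of which file belongs to which user and date, which is roughly the work the database already does.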


There are a lot of considerations. As others said, either approach would work fine for a small number (thousands) of 10k rows.

But what does the rest of your app do? If it does everything in the database, then I'd be inclined to put this there as well; the opposite is true as well.

And how will you be selecting these? Do you need to do complex text searches? If so, a database might not be the best. Or, would you be adding new attributes, searching on those attributes - or matching them against data in other tables? In this common case a database would be better.

And if your data is really vast (many millions of 10k rows) and your performance requirements aren't terribly high - you may want to compress them and store them in the file system.
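A minimal sketch of the compress-then-store idea using Python's standard `zlib` module. The `.txt.z` suffix, the helper names, and the sample text are assumptions for illustration, and the actual space saving depends heavily on what the user texts look like.

```python
import tempfile
import zlib
from pathlib import Path


def save_compressed(root: Path, text_id: int, text: str) -> Path:
    """Compress a text with zlib and write it to its own file."""
    path = root / f"{text_id}.txt.z"
    path.write_bytes(zlib.compress(text.encode("utf-8")))
    return path


def load_compressed(root: Path, text_id: int) -> str:
    """Read a compressed file back and return the original text."""
    data = (root / f"{text_id}.txt.z").read_bytes()
    return zlib.decompress(data).decode("utf-8")


# Hypothetical storage directory and a roughly 10k-character sample.
store = Path(tempfile.mkdtemp())
sample = "The quick brown fox jumps over the lazy dog. " * 222
stored_path = save_compressed(store, 7, sample)
```

Every read now pays for decompression, which is why this fits best when performance requirements are modest.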

Lastly, how important is data quality? Given the features of a good database it's much easier to guarantee good data quality with a database.
