What is the best solution to work with archived HUGE amount of records in SQL Server 2008?_问答_开发者

What is the best solution to work with archived HUGE amount of records in SQL Server 2008?

开发者 https://www.devze.com 2023-03-30 01:34 出处：网络

I\'m developing a project to gather ebanking transactions which is not online with .NET. It means that I\'ll get some text files containing the transactions for previous day. In total, it contains ab

I'm developing a project to gather ebanking transactions which is not online with .NET. It means that I'll get some text files containing the transactions for previous day. In total, it contains about 2,000,000 rows per day.

I want to have great performance in searching on the last month transactions (maximum 3 seconds), but I want to be able to search in the older transactions (maximum 30 seconds). Archive searching is based on CardNumber, TransactionNumber, and TransactionDate. I mean the archive search scena开发者_开发知识库rio is static and we don't want to search on the other columns or get any kind of reports.

I'm thinking of archiving strategy.

There are some options:

Tuning and optimizing indexes on the main table.
Partitioning the main table.
Moving old transactions to another database, nightly.
Moving old transactions to a text file, every hour (TextFile, XmlSerialization, BinarySerialization) and then search in memory.
Using other open source NON-RDBMS databases (like Lucene engine in text search).

First of all, I want to know which strategy is the best matching one for this scenario?

Also, how many records supported by known database engines (like SQL Server 2008, Oracle, Sqlite, MySql,...)? When should we think of Table Partitioning?

I definitely think partitioning is going to work best here, probably monthly partitions. You can switch out old partitions (move them to separate filegroups over time, and mark them as read-only) but still have them available for querying. 2MM rows a day is not really all that huge, but if you're collecting that in the same partition forever some tasks/queries are eventually not going to scale. You need to be very diligent about how you set up your clustered index, non-clustered indexes, and partitioning scheme/function. If you're not already familiar with partitioning I strongly recommend budgeting some time to get familiar and become an expert with it in your test/dev/staging environments before letting it loose on production. This is probably a good a starting point as any, but don't be scared to search blogs for practical and real-world advice outside the official documentation as well.