Extract-Transform-Load from Oracle to SQL Server with C#

Developer https://www.devze.com 2023-02-25 02:13 Source: web
I've been hanging around SO for a couple of days and found the community great! I'm certain that someone can give me an opinion on my question:

We have raw data on an Oracle server, and our query fetches around 312 MB of data (just the columns we need, joined across a couple of tables). This is done using Oracle.DataAccess.Client with FetchSize = RowSize * 512, which is around 3.9 MB. I'm currently using OracleDataReader and do the processing in its while() loop.

The results of the processing are then written to SQL Server using prepared, parameterized IF EXISTS() UPDATE ELSE INSERT statements. From this DB we can return the data we need directly in Excel after joining the dimension tables.

I'm looking at ways to improve the performance. Is it possible to multi-thread BULK INSERTs and UPDATEs, and also to multi-thread the processing of rows (say, start 4 threads and divide incoming rows evenly between them)? And does something like BULK UPDATE exist in SQL Server 2005?

I'm not looking for copy-paste code; I'm more interested in existing best practices or patterns.

Best Regards,


I'm not sure what dividing incoming rows between threads would gain you, given that they all have to be inserted into the same target table. Inserting into the database will likely be the bottleneck, and multiple threads on the client will probably just end up waiting for each other.

My normal approach for this would be to use SqlBulkCopy to read the rows from the OracleDataReader and insert them into an empty staging table in the target database.
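
A rough sketch of that step, assuming the staging table's columns line up by position with the columns of the Oracle query. The class and method names, the connection strings, the query text, the staging table name, and the batch size of 10000 are all placeholders of mine, not anything from the question:

```csharp
using System.Data.SqlClient;
using Oracle.DataAccess.Client;

static class StagingLoader
{
    // All four arguments are hypothetical placeholders supplied by the caller.
    public static void CopyToStaging(string oracleConnStr, string sqlConnStr,
                                     string selectSql, string stagingTable)
    {
        using (OracleConnection source = new OracleConnection(oracleConnStr))
        using (SqlConnection target = new SqlConnection(sqlConnStr))
        {
            source.Open();
            target.Open();

            using (OracleCommand cmd = new OracleCommand(selectSql, source))
            using (OracleDataReader reader = cmd.ExecuteReader())
            {
                // Same fetch sizing as described in the question.
                reader.FetchSize = cmd.RowSize * 512;

                using (SqlBulkCopy bulk = new SqlBulkCopy(target))
                {
                    bulk.DestinationTableName = stagingTable;
                    bulk.BatchSize = 10000;     // assumed value; tune by measuring
                    bulk.BulkCopyTimeout = 0;   // 0 = no time limit
                    // Streams rows straight from the reader into the table,
                    // with no per-row INSERT round trip.
                    bulk.WriteToServer(reader);
                }
            }
        }
    }
}
```

SqlBulkCopy maps columns by ordinal position unless you add entries to its ColumnMappings collection, so if the staging table's column order differs from the query's, map them explicitly. Any per-row processing that cannot be expressed in SQL would have to happen before this step or be moved into the batch step on the server.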

Then process the staging table in batches, UPSERTing into the target table. Each batch would be a transaction.
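
SQL Server 2005 has no MERGE statement (it arrived in 2008), so each batch is typically an UPDATE joined to the staging table followed by an INSERT of the rows that are not there yet. A minimal sketch, assuming a made-up schema: dbo.Staging(StagingId INT IDENTITY, BusinessKey, Amount) and dbo.Target(BusinessKey, Amount), with BusinessKey unique within staging. None of these names come from the question:

```csharp
using System.Data;
using System.Data.SqlClient;

static class StagingUpserter
{
    // Hypothetical tables and columns; adjust to the real schema.
    const string UpsertBatchSql = @"
UPDATE t SET t.Amount = s.Amount
FROM dbo.Target AS t
JOIN dbo.Staging AS s ON s.BusinessKey = t.BusinessKey
WHERE s.StagingId BETWEEN @from AND @to;

INSERT INTO dbo.Target (BusinessKey, Amount)
SELECT s.BusinessKey, s.Amount
FROM dbo.Staging AS s
WHERE s.StagingId BETWEEN @from AND @to
  AND NOT EXISTS (SELECT 1 FROM dbo.Target AS t
                  WHERE t.BusinessKey = s.BusinessKey);";

    public static void UpsertInBatches(string sqlConnStr, int batchSize)
    {
        using (SqlConnection conn = new SqlConnection(sqlConnStr))
        {
            conn.Open();

            int maxId;
            using (SqlCommand max = new SqlCommand(
                "SELECT ISNULL(MAX(StagingId), 0) FROM dbo.Staging;", conn))
            {
                maxId = (int)max.ExecuteScalar();
            }

            // Walk the staging table in StagingId ranges, one transaction each.
            for (int from = 1; from <= maxId; from += batchSize)
            {
                using (SqlTransaction tx = conn.BeginTransaction())
                using (SqlCommand cmd = new SqlCommand(UpsertBatchSql, conn, tx))
                {
                    cmd.Parameters.Add("@from", SqlDbType.Int).Value = from;
                    cmd.Parameters.Add("@to", SqlDbType.Int).Value = from + batchSize - 1;
                    cmd.ExecuteNonQuery();
                    // If anything throws before this line, disposing the
                    // transaction rolls the batch back.
                    tx.Commit();
                }
            }
        }
    }
}
```

The identity column is only there to give the batches a cheap range to slice on; if the staging table is filled by a bulk load, that column would need to be left out of the load through explicit column mappings. Keeping each batch in its own transaction bounds log growth and lock duration, and a failed batch can be retried without redoing the earlier ones.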
