Using Parallel Extensions or Parallel LINQ with LINQ Take

I have a database with about 5 million rows in it. I am trying to generate XML strings for the rows and push them to a service. Instead of sending them one at a time, the service supports taking 1000 records at a time. At the moment this is quite slow, taking upwards of 10 seconds per 1000 records (including writing back to the database and uploading to the service).

I tried to get the following code working, but it fails: I get a crash when I run it. Any ideas?

    var data = <insert LINQ query here>;
    int take = 1000;
    int left = data.Count();

    Parallel.For(0, left / 1000, i =>
        {
            data.Skip(i * 1000).Take(1000)...
            // Generate XML here.
            // Write to the service here.
            // Mark items in the database as generated.
        });
    // Get companies which are still marked as not generated.
    // Create XML.
    // Write to the service.

I get a crash telling me that the index is out of bounds. If left is 5 million, the loop counter should be no more than 5000; multiplying that by 1000 again should give no more than 5 million. I wouldn't mind if it worked for a bit and then failed, but it just fails right after the SQL query!


I think it doesn't like your last index value: it should be left / 1000 - 1, not left / 1000:

Parallel.For(0, left / 1000 - 1, i =>
        {
            data.Skip(i * 1000).Take(1000)...
            // Generate XML here.
            // Write to the service here.
            // Mark items in the database as generated.
        });


I suspect the index out of bounds error is caused by code other than what is currently shown.

That being said, this could be handled in a much cleaner manner. Instead of this approach, you should consider switching to a custom partitioner. It will be dramatically more efficient, because each call to Skip/Take forces a re-evaluation of your sequence. A minimal sketch of the range-partitioner idea follows.
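
For illustration, here is a minimal sketch, assuming the query results can be materialized into memory once. Record, FetchRows, GenerateXml, PushToService, and MarkGenerated are hypothetical placeholders standing in for the real query and service calls:

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading.Tasks;

    class Record { public int Id; }

    class BatchUploader
    {
        // Hypothetical stand-ins for the real LINQ query and service calls.
        static List<Record> FetchRows()
        {
            // Run the database query exactly once and materialize it.
            return Enumerable.Range(0, 5000).Select(i => new Record { Id = i }).ToList();
        }

        static string GenerateXml(List<Record> batch)
        {
            return "<batch count=\"" + batch.Count + "\"/>";
        }

        static void PushToService(string xml) { /* upload to the service */ }

        static void MarkGenerated(List<Record> batch) { /* write back to the database */ }

        static void Main()
        {
            List<Record> rows = FetchRows();

            // Partitioner.Create(from, to, rangeSize) yields contiguous
            // [Item1, Item2) index ranges, so each worker indexes straight
            // into the in-memory list instead of re-evaluating Skip/Take.
            var ranges = Partitioner.Create(0, rows.Count, 1000);

            Parallel.ForEach(ranges, range =>
            {
                var batch = rows.GetRange(range.Item1, range.Item2 - range.Item1);
                PushToService(GenerateXml(batch));
                MarkGenerated(batch);
            });
        }
    }

If 5 million rows won't fit in memory, the same pattern works over a materialized list of primary keys, fetching each batch from the database by key range instead of by Skip/Take.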
