I have a method that returns an IEnumerable<string>, implemented with yield return. I want multiple threads to process its results, without repeating any item and in a thread-safe way. How would I achieve this?
var result = GetFiles(source);
for (int i = 0; i < Environment.ProcessorCount; i++)
{
tasks.Add(Task.Factory.StartNew(() => { ProcessCopy(result); }));
}
Task.WaitAll(tasks.ToArray());
However this seems to be producing repeats:
C:\Users\esac\Pictures\2000-06\DSC_1834.JPG
C:\Users\esac\Pictures\2000-06\DSC_1835.JPG
C:\Users\esac\Pictures\2000-06\.picasa.ini
C:\Users\esac\Pictures\2000-06\DSC_1834.JPG
C:\Users\esac\Pictures\2000-06\DSC_1835.JPG
C:\Users\esac\Pictures\2000-06\.picasa.ini
C:\Users\esac\Pictures\2000-06\DSC_1834.JPG
C:\Users\esac\Pictures\2000-06\DSC_1835.JPG
C:\Users\esac\Pictures\2000-06\.picasa.ini
C:\Users\esac\Pictures\2000-06\DSC_1834.JPG
C:\Users\esac\Pictures\2000-06\DSC_1835.JPG
You can easily do this using the Parallel.ForEach method.
Write a Simple Parallel.ForEach loop
Parallel.ForEach enumerates the source once and distributes the items across worker threads, so no item is processed twice. The call returns once all iterations have completed.
var result = GetFiles(source);
Parallel.ForEach(result, current => {
ProcessCopy(current);
});
Console.WriteLine("Done");
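As a rough, self-contained sketch of the same idea: the GetFiles below is a hypothetical stand-in for the question's method (it just yields made-up names), and the ConcurrentBag exists only to check that each item was handed out once. It is an illustration under those assumptions, not the asker's actual code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelForEachSketch
{
    // Hypothetical stand-in for the question's GetFiles: a lazy iterator.
    static IEnumerable<string> GetFiles(int count)
    {
        for (int i = 0; i < count; i++)
            yield return "file" + i + ".jpg";
    }

    // Runs the loop and returns { items processed, distinct items processed }.
    public static int[] Run(int count)
    {
        // Thread-safe collection recording every item a worker received.
        var processed = new ConcurrentBag<string>();

        // The source is enumerated once; items are distributed to workers.
        Parallel.ForEach(GetFiles(count), current => processed.Add(current));

        var distinct = new HashSet<string>(processed);
        return new[] { processed.Count, distinct.Count };
    }

    public static void Main()
    {
        int[] counts = Run(100);
        Console.WriteLine(counts[0] + " processed, " + counts[1] + " distinct");
    }
}
```

Both counts should match the number of items yielded, which is what distinguishes this from handing the same IEnumerable to several tasks.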
You have to pick a range of items for each ProcessCopy() call. Right now you are passing each thread the full enumeration of files. Remember that the IEnumerable you are passing has a method called GetEnumerator(); only when that method is called (which foreach does for you under the hood) is the real enumerator returned, with which you can then enumerate the items one by one. Since you are passing the IEnumerable itself, each thread calls GetEnumerator() and hence enumerates all files.
Instead, do something like this to have each ProcessCopy() call process one file:
foreach(string file in GetFiles(source))
{
string fileToProcess = file;
tasks.Add(Task.Factory.StartNew(() => { ProcessCopy(fileToProcess); }));
}
Task.WaitAll(tasks.ToArray());
I wouldn't worry about processor count - let the TPL and the thread pool figure out how many threads to run for optimal performance.
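To make the re-enumeration point concrete, here is a small sketch. The GetFiles iterator and the bodyRuns counter are hypothetical, added only for illustration. The consumers run one after another rather than on separate threads so the count is deterministic, but the mechanism is the same: every foreach over the shared IEnumerable calls GetEnumerator() and restarts the iterator body.

```csharp
using System;
using System.Collections.Generic;

public static class ReEnumerationSketch
{
    static int bodyRuns = 0;

    // Hypothetical iterator; the counter shows how often its body starts.
    static IEnumerable<string> GetFiles()
    {
        bodyRuns++; // runs on the first MoveNext() of each new enumerator
        yield return "a.jpg";
        yield return "b.jpg";
        yield return "c.jpg";
    }

    // Enumerates one shared IEnumerable 'consumers' times and reports
    // how many times the iterator body was started.
    public static int CountBodyRuns(int consumers)
    {
        bodyRuns = 0;
        IEnumerable<string> result = GetFiles(); // nothing has executed yet

        for (int i = 0; i < consumers; i++)
        {
            // Each foreach asks for a fresh enumerator and sees every file.
            foreach (string file in result) { }
        }
        return bodyRuns;
    }

    public static void Main()
    {
        Console.WriteLine("Iterator body ran " + CountBodyRuns(4) + " times for 4 consumers");
    }
}
```

With four consumers the body starts four times, which mirrors the repeated file names in the question's output.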
Why not use a simple LINQ query to do what you want?
var tasks =
from f in GetFiles(source)
select Task.Factory.StartNew(() => { ProcessCopy(f); });
Task.WaitAll(tasks.ToArray());
Behind the scenes, the TPL handles all of the icky Environment.ProcessorCount stuff for you anyway.