I'm a big fan of LINQ for its typing, clarity and brevity. But I'm finding it very slow at searching for matching records compared to the old DataView, by a factor of some 2000!
I am writing an app to back up large sets of files - 500,000 files and 500 GB of data. I have created a manifest of files in the backup set and compare the files in the directory with those in the manifest documenting what's been backed up already. This way I know which files have changed and so need to be copied.
The slow step is this one:
var matchingMEs = from m in manifest
where m.FullName == fi.FullName
select m;
where manifest = List<ManifestEntry> and ManifestEntry is a relatively simple POCO.
Overall performance is 17-18 records per second.
When I use a dataview:
DataView vueManifest = new DataView(dt, "", "FullName", DataViewRowState.CurrentRows);
then in the loop find the matching manifest entries with a .FindRows:
matchingMEs = vueManifest.FindRows(fi.FullName);
... then I'm getting some 35,000 files per second throughput!
Is this normal? I can't believe that Linq comes at such a price. Is it the Linq or the objects that slow things down?
(btw, I tried using a Dictionary and a SortedList as well as the List<ManifestEntry> and they all gave about the same result.)
Your DataView is sorted by FullName, so FindRows can jump straight to the matching record(s) through the view's index, whereas your LINQ query has to scan the list entry by entry, and it does so again for every file you check. That will definitely be noticeable if you have 500,000 entries.
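If you want to stay with LINQ and still get FindRows-style behaviour (zero, one or several matches per name), one option is to build a lookup once and query it inside the loop. This is only a minimal sketch: the ManifestEntry class here is a hypothetical stand-in for yours (only FullName comes from your post, Length is made up), and the paths are sample data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's POCO; only FullName is from the post.
public class ManifestEntry
{
    public string FullName { get; set; }
    public long Length { get; set; }   // assumed field, for illustration only
}

public static class LookupSketch
{
    // Build the index once: a single pass over the manifest.
    public static ILookup<string, ManifestEntry> BuildIndex(IEnumerable<ManifestEntry> manifest)
    {
        return manifest.ToLookup(m => m.FullName);
    }

    public static void Main()
    {
        var manifest = new List<ManifestEntry>
        {
            new ManifestEntry { FullName = @"C:\data\a.txt", Length = 10 },
            new ManifestEntry { FullName = @"C:\data\b.txt", Length = 20 },
            new ManifestEntry { FullName = @"C:\data\a.txt", Length = 30 },
        };

        var index = BuildIndex(manifest);

        // Each query is a hash lookup instead of a scan of the whole list.
        foreach (var m in index[@"C:\data\a.txt"])
            Console.WriteLine(m.FullName + " " + m.Length);

        // A name that is not in the manifest yields an empty sequence, not an exception.
        Console.WriteLine(index[@"C:\data\missing.txt"].Any());
    }
}
```

The cost of building the lookup is paid once, so it only helps if you build it outside the per-file loop.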
Assuming fullname is unique, then when you switched to using a dictionary, I would suspect you are still iterating through it using a similar linq query, something like
var matchingME = (from m in manifest where m.Key == fi.FullName select m).Single();
whereas you should be using
var matchingME = manifest[fi.FullName];
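Note that the indexer throws a KeyNotFoundException when the key is absent, which in your case just means "new file, not backed up yet", so TryGetValue is probably a better fit inside the loop. A rough sketch, with some assumptions of mine: ManifestEntry is again a hypothetical stand-in, FullName is assumed unique, and I've picked a case-insensitive comparer on the guess that these are Windows paths (drop it if that doesn't apply).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's POCO; only FullName is from the post.
public class ManifestEntry
{
    public string FullName { get; set; }
}

public static class DictionarySketch
{
    // Build the dictionary once, keyed by FullName.
    public static Dictionary<string, ManifestEntry> BuildIndex(IEnumerable<ManifestEntry> entries)
    {
        // Assumption: paths should compare case-insensitively.
        var byName = new Dictionary<string, ManifestEntry>(StringComparer.OrdinalIgnoreCase);
        foreach (var e in entries)
            byName[e.FullName] = e;   // if a name repeats, the last entry wins
        return byName;
    }

    public static void Main()
    {
        var manifest = BuildIndex(new[]
        {
            new ManifestEntry { FullName = @"C:\data\a.txt" },
            new ManifestEntry { FullName = @"C:\data\b.txt" },
        });

        // Sample names standing in for fi.FullName in the real loop.
        foreach (var name in new[] { @"C:\data\a.txt", @"C:\data\new.txt" })
        {
            ManifestEntry found;
            if (manifest.TryGetValue(name, out found))
                Console.WriteLine("in manifest: " + found.FullName);
            else
                Console.WriteLine("not in manifest, needs copying: " + name);
        }
    }
}
```

Either way, the point is to look up by key directly instead of running a where clause over the dictionary's key/value pairs.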