I tried to make this code perform faster using Parallel.ForEach and ConcurrentBag but it's stil开发者_运维问答l running way to long (esp. when having in mind that in my scenario i may also be 1.000.000++):
List<Point> points = new List<Point>();
for(int i = 0; i<100000;i++) {
Point point = new Point {X = i-50000, Y = i+50000, CanDelete = false};
points.Add(point);
}
foreach (Point point in points) {
foreach (Point innerPoint in points) {
if (innerPoint.CanDelete == false && (point.X - innerPoint.X) < 2) {
innerPoint.Y = point.Y;
point.CanDelete = true;
}
}
}
That code will perform WORSE in parallel, due to the data access patterns.
The best way to speed it up is to recognize that you don't need to consider all O(N^2) pairs of points, but only the ones having nearby X-coordinates.
First, sort the list by X-coordinate, O(N log N), then process forward and backward in the list from each point until you leave the neighborhood. You'll need to use indexing and not foreach
.
If your sample data, the list is already sorted.
Since your distance test is symmetric, and removes matching points from consideration, you can skip looking at earlier points.
for (int j = 0; j < points.Length; ++j) {
int x1 = points[j].X;
//for (int k = j; k >= 0 && points[k].X > x1 - 2; --k ) { /* merge points */ }
for (int k = j + 1; k < points.Length && points[k].X < x1 + 2; ++k ) { /* merge points */ }
}
Not only is the complexity better, the cache behavior is far superior. And it can be split among multiple threads with far less cache contention.
Well, I don't know exactly what do you want, but let's try.
First, when creating the List, you might want to set it's desired initial size, since you know how many items it will hold. So it does not need to grow all the time.
List<Point> points = new List<Point>(100000);
Next, you could sort the list by the X property. So you would only compare each point with the points that are near it: when you find the first, forward or backward, that is too distant, you can stop comparing.
精彩评论