开发者

how to search a list for items with a distance lower than F to P without searching the entire list?

开发者 https://www.devze.com 2023-03-03 10:22 出处:网络
I have to search a list of structs for every item whose XZPos is closer to Vector2 (or PointF) P. The list is ordered by XZPos\' x and y. It\'ll look something like this:

I have to search a list of structs for every item whose XZPos is closer to Vector2 (or PointF) P. The list is ordered by XZPos' x and y. It'll look something like this:

Item 1 (XZPos: 0,0)

Item 2 (XZPos: 0,1)

Item 3 (XZPos: 0,2)

...

Item 12 (XZPos: 1,0)

Item 13 (XZPos: 1,1)

Item 14 (XZPos: 1,2)

...

2.249.984 elements later

...

Now I have a point P (4,4) and I want a list of structs in the above list of every item closer to P than 5,66f. My algorithm searches every item in the list like this:

        List<Node> res = new List<Node>();
        for (int i = 0; i < Map.Length; i++)
        {
            Vector2 xzpos = new Vector2(Map[i].X, Map[i].Z);//Map is the list containing 2.250.000 elements
            //neighbourlength = 5,66f in our example
            if ((xzpos - pos).Length() <= neighbourlength && xzpos != pos)//looking for every item except the item which is P itself, therefore checki开发者_如何学Pythonng whether "xzpos != pos"
            {
                res.Add(new Node() { XZPos = xzpos, /*Other fields*/ }); 
            }
        }
        return res.ToArray();

My problem is that it takes way too long to complete, and now I'm looking for a way to find the fields I'm looking for without searching the entire list. 22 seconds for a search is TOO LONG. If someone could help me get it to 1 or 2 seconds that would be very nice.

Thanks for your help, Alex


Your list is sorted, and you can use that to shrink the problem space. Instead of searching the whole list, search the subset of the list that spans x values within 5,66f, since anything that is farther than the maximum on one coordinate will be farther than you want no matter what the other coordinate is. Then if you store the start positions of each value in the list (i.e. in your example, "0" elements start at 1, and "1" elements start at 12), you can quickly get to the part of the list you care about. So instead of iterating through items 0 to 2 million, you could instead be iterating through items 175839 to 226835.

Example:

The List
1: (0,1)
2: (0,2)
3: (1,0)
4: (1,1)
5: (2,1)
6: (2,3)
7: (3,1)
8: (3,5)
9: (4,2)
10: (4,5)
11: (5,1)
12: (5,2)
13: (6,1)
14: (6,2)
15: (6,3)
16: (7,1) 
17: (7,2)   

Start Locations
(0,1)
(1,3)
(2,5)
(3,7)
(4,9)
(5,11)
(6,13)
(7,16)

If I have a point (3,5) and I want to search the list for points within 2 of it, I only need to iterate through the points where x is between 1 and 5. So I look at my start locations, and see that 1 starts at position 3 in the list, and 5 ends at position (13 - 1). So instead of iterating from 1 to 17, I only need to iterate from 3 to 12. If the range of values in your data is large but the distance to check is short, this will reduce the number of entries you need to iterate across greatly.


To the below solution there's a few assumptions. These assumptions are not stated in your question an the solution might need adjustment. If the assumptions hold this solution is extremely fast but requires you to keep the data in a different structure. However if you can change the structure then this will have a look of for each valid point in (almost) constant time. That is the time required to find M valid points in a set containing M point in total depends only on N.

The assumptions are:

  • Only positive integers are used for values of X and Y
  • No duplicate points in the list

`private IEnumerable> GetPointsByDistance(int xOrigin, int yOrigin, double maxDistance){

                    var maxDist = (int) Math.Floor(maxDistance);
                    //Find the lowest possible value for X
                    var minX = Math.Min(0, xOrigin - maxDist);
                    //Find the highest possible value for X
                    var maxX = Math.Min(MaxX, xOrigin + maxDist);
                    for (var x = minX; x <= maxX; ++x){
                        //Get the possible values for Y with the current X
                        var ys = points[x];
                        if (ys.Length > 0){
                            //Calculate the max delta for Y
                            var maxYDist =(int)Math.Floor(Math.Sqrt(maxDistance*maxDistance - x*x));
                            //Find the lowest possible Y for the current X
                            var minY = Math.Min(0, yOrigin - maxYDist);
                            //Find the highest possible Y for the current X
                            var maxY = Math.Min(ys.Length, yOrigin + maxYDist);
                            for (var y = minY; y <= maxY; ++y){
                                //The value in the array will be true if a point with the combination (x,y,) exists
                                if (ys[y]){
                                    yield return new KeyValuePair<int, int>(x, y);
                                }
                            }
                        }
                    }
                }`

The internal representation of the point in this code is bool[][]. The value will be true if the indices of the two arrays is a point included in the set. E.g. points[0][1] will be true for if the set was the six points you've included in the post and points[0][3] would be false.

If the assumptions are not met the same idea can still be used but will need tweeking. If you post what the assumptions should be I'll update the answer to the question.

An assumption that does not have to be met is that the points are close to sequential. If they are not this will be rather greedy when it comes to memory. Changes can be made if the points are not (close) to sequential and again post the invariants if my assumptions are wrong and I'll update.


I have no idea what your original code is doing. I see you've defined x and y and never use them and you refer to pos but it's not defined. So, instead I'm going to take the verbal description of the problem your trying to solve

Now I have a point P (4,4) and I want a list of structs in the above list of every item closer to P than 5,66f.

and solve it. This is easy:

var nearbyNeighbors = Map.Where(
    point => (point.X - 4) * (point.X - 4) +
             (point.Z - 4) * (point.Z - 4) <
             5.66f * 5.66f
    );

foreach(var nearbyNeighbor in nearbyNeighbors) {
    // do something with nearbyNeighbor
}

Here, I am assuming that you are using the standard Euclidean norm.

To utilize the fact that the collection is lexicographically sorted:

Binary search until |point.X - 4| >= 5.66. This splits the list in two contiguous sublists. Throw out the sublist with |point.X - 4| >= 5.66. From the remaining sublist, binary search until |point.Y - 4| >= 5.66 This splits the remaining sublist into two contiguous sublists. Throw out the sublist with |point.Y - 4| >= 5.66. Then search linearly through the remaining sublist.


This problem is related to this:

http://en.wikipedia.org/wiki/Nearest_neighbor_search

Take a look for sub-linear solutions.

Also take a look at this question

which data structure is appropriate to query "all points within distance d from point p"

0

精彩评论

暂无评论...
验证码 换一张
取 消