开发者

Effecient way to check if N number of (x,y)coordinates in K number of rectangles

开发者 https://www.devze.com 2023-03-12 05:02 出处:网络
Is there an efficient way to see if N number of (x,y) points are inside K number of rectangles?Right now I am doing a brute force approach and looping through all points and rectangles but its taking

Is there an efficient way to see if N number of (x,y) points are inside K number of rectangles? Right now I am doing a brute force approach and looping through all points and rectangles but its taking about 2 minutes and 30 seconds with 200,000 points and 44 rectangles.

I am working with Google maps and creating a program to check if points are close to a route on a map.开发者_运维技巧 I calculate multiple Rectangles and Circles along the path and test to see if the existing points lay within these rectangles and circles.

1.The rectangles can overlap depending on the nature of the route.

2.The point only has to be in ONE of the rectangles

3.If the point is on the edge of the rectangle I would like to make it count as inside the rectangle (but if its easier to not count then I won't count it)

4.The rectangles are dependent on what the area I want to search for off the route. Typically they will be 2 miles High (1 mile each direction from point) and the distance from point1 to point2 Wide.


In theory at the very best you'll have to iterate through all 200,000 points -- and in the worst case you'll have to check all of those points against all 44 rectangles (which is what you're doing right now).

Since you know you'll have to loop through all 200,000 points the best you can do is attempt to not have to check all 44 rectangles.

In order to do this you'll have to do some calculations on the rectangles, you find the closest two rectangles and form a larger rectangle which encloses both of them (a cluster if you will). Then you find the next closest rectangle to the rectangle you just formed form another cluster rectangle. You keep doing this until you enclose all of the rectangles (You'll end up having 43 cluster rectangles).

Now loop through the points and check against the largest rectangle, if the point falls within that rectangle, then you check the next largest rectangle, if it doesn't fall within that rectangle then you only need to check the to see if it falls within the rectangle which was used to form that rectangle. If it doesn't fall within that rectangle, then you can move onto the next point because that point doesn't fall within any rectangles (and you discovered this with only 3 checks).


Here are a few possible ideas:

  • Fuzzy match - if your points don't have to be 100% accurately marked as being within a particular rectangle, you could write an algorithm that makes a "best guess" that is more efficient, but sacrifices being 100% accurate
  • Fuzzy first, accurate after - give an approximate answer quickly, maybe by just calculating the distance between a given point, and the top-left corner of a rectangle, or the center of a circle. This will give an approximate answer which may not be 100% accurate, but then allow you to asynchronously continue the calculation, to refine it, and update the display after some time with the 100% accurate data
  • Group the points - when the point it created, have it "register" itself with a rectangle, basically pre-calculating whether or not a point is in a given rectangle, if you can.
  • Pre-calculate+cache - and then cache the list of rectangles a particular point fits into, in the point itself. Then it becomes a simple lookup instead of having to re-calculate it each time
  • Asynchronous loading - can you start displaying the answers as it's being calculated? If it takes 2.5 minutes to do the whole batch, can you show the results in 1,000 point chunks, as you calculate them? This way the user quickly begins to get some feedback while the calculation is finishing the work. At 2.5 minutes, that's 150 seconds. If you could deliver results in 1,000-point chunks (about 1/200th of the data at a time), you might be able to update the point map once every second with results as they become available.
0

精彩评论

暂无评论...
验证码 换一张
取 消