开发者

Nearest Neighbor Search on large database table - SQL and/or ArcGis

开发者 https://www.devze.com 2023-04-04 23:49 出处:网络
Sorry for posting something that\'s probably obvious, but I don\'t have much database experience. Any help would be greatly appreciated - but remember, I\'m a beginner :-)

Sorry for posting something that's probably obvious, but I don't have much database experience. Any help would be greatly appreciated - but remember, I'm a beginner :-)

I have a table like this:

Table.fruit

ID type Xcoordinate Ycoordinate Taste Fruitiness

1 Apple 3 3 Good 1,5

2 Orange 5 4 Bad 2,9

3 Apple 7 77 Medium 1,4

4 Banana 4 69 Bad 9,5

5 Pear 9 15 Medium 0,1

6 Apple 3 38 Good -5,8

7 Apple 1 4 Good 3

8 Banana 15 99 Bad 6,8

9 Pear 298 18789 Medium 10,01

… … … … … …

1000 Apple 1344 1388 Bad 5

… … … … … …

1958 Banana 759 1239 Good 1

1959 Banana 3 4 Medium 5,2

I need: A table that gives me

The n (eg.: n=5) closest points to EACH point in the original table, including distance Table.5nearest (please note that the distances are fake). So the resulting table has ID1, ID2 and distance between ID1 and ID2 (can't post images yet, unfortunately).

ID.Fruit1 ID.Fruit2 Distance 1 1959 1 1 7 2 1 2 2 1 5 30 1 14 50 2 1959 1 2 1 2 … … … 1000 1958 400 1000 Xxx Xxx … … …

How can I do this (ideally with 开发者_高级运维SQL/database management) or in ArcGis or similar? Any ideas? Unfortunately, my table contains 15000 datasets, so the resulting table will have 75000 datasets if I choose n=5. Any suggestions GREATLY appreciated.

EDIT:

Thank you very much for your comments and suggestions so far. Let me expand on it a little: The first proposed method is sort of a brute-force scan of the whole table rendering huge filesizes or, likely, crashes, correct? Now, the fruit is just a dummy, the real table contains a fix ID, nominal attributes ("fruit types" etc), X and Y spatial columns (in Gauss-Krueger) and some numeric attributes. Now, I guess there is a way to code a "bounding box" into this, so the distances calculation is done for my point in question (let's say 1) and every other point within a square with a certain edge length. I can imagine (remotely) coding or querying for that, but how do I get the script to do that for EVERY point in my ID column. The way I understand it, this should either create a "subtable" for each record/point in my "Table.Fruit" containing all points within the square around the record/point with a distance field added - or, one big new table ("Table.5nearest"). I hope this makes some kind of sense. Any ideas? THanks again


To get all the distances between all fruit is fairly straightforward. In Access SQL (although you may need to add parentheses everywhere to get it to work :P):

select   fruit1.id,
         fruit2.id,
         sqr(((fruit2.xcoordinate - fruit1.xcoordinate)^2) + ((fruit2.ycoordinate - fruit1.ycoordinate)^2)) as distance
from     fruit as fruit1
join     fruit as fruit2
on       fruit2.id <> fruit1.id
order by distance;

I don't know if Access has the necessary sophistication to limit this to the "top n" records for each fruit; so this query, on your recordset, will return 225 million records (or, more likely, crash while trying)!


Thank you for your comments so far; in the meantime, I have gone for a pre-fabricated solution, an add-in for ArcGis called Hawth's Tools. This really works like a breeze to find the n closest neighbors to any point feature with an x and y value. So I hope it can help someone with similar problems and questions. However, it leaves me with a more database-related issue now. Do you have an idea how I can get any DBMS (preferably Access), to give me a list of all my combinations? That is, if I have a point feature with 15000 fruits arranged in space, how do I get all "pure banana neighborhoods" (apple, lemon, etc.) and all other combinations? Cheers and best wishes.

0

精彩评论

暂无评论...
验证码 换一张
取 消