Does @geodist search use any sort of geospatial indexes (like R-trees) for performance?
I'm interested in the case where the anchor is constant and each document has its own latitude/longitude pair stored in radians.
I've tried to figure it out from Sphinx source code, but failed to find any mentions of any spatial index. If no indexes are used for geospatial search, then how is performance ensured?
Does Sphinx do a full scan if no keywords are provided?
Background: We have a dataset of more than 100 million short entries. Some of the newly added items will have latitude/longitude stored. Millions of entries are added each day. I predict that about 5-10% of newly added entries will have location information.
Our goal is to implement spatial search for location-enabled entries, for queries like "get all entries within a 100-meter radius of the anchor point" and "get the 100 nearest entries to the anchor point", with and without keyword search.
Some googling returned this forum thread, which suggests using an artificial grid-based index to ensure performance. Is this still the case?
No, Sphinx does not have any built-in geospatial indexing, hence the reason for the tiles (to make a rudimentary geospatial index :)
It really does just do a spherical distance calculation against every row, i.e. a full-table scan. It's reasonably quick, because attributes are all held in memory.
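To make the cost model concrete, here is a minimal Python sketch of that kind of full scan. It is not Sphinx's actual code: the haversine formula, the `EARTH_RADIUS_M` constant, and the function names `geodist`, `within_radius` and `nearest` are assumptions chosen for illustration. Rows are `(doc_id, lat, lon)` tuples with coordinates in radians, as in the question.

```python
import heapq
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption for this sketch


def geodist(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters (haversine); all inputs in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def within_radius(rows, anchor_lat, anchor_lon, radius_m):
    """Full scan: compute the distance for every row, keep those in range."""
    hits = []
    for doc_id, lat, lon in rows:
        d = geodist(anchor_lat, anchor_lon, lat, lon)
        if d <= radius_m:
            hits.append((d, doc_id))
    hits.sort()  # nearest first
    return hits


def nearest(rows, anchor_lat, anchor_lon, k):
    """Full scan again, but only the k smallest distances are retained."""
    scored = ((geodist(anchor_lat, anchor_lon, lat, lon), doc_id)
              for doc_id, lat, lon in rows)
    return heapq.nsmallest(k, scored)
```

Both queries touch every row once, so the work grows linearly with the number of location-enabled entries; keeping the attributes in memory makes each step cheap but does not change that growth.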
Check the source: http://codesearch.google.com/#vqMBzkK4ih0/src/sphinxexpr.cpp&exact_package=git://github.com/squadette/sphinxsearch.git&q=cos%20sphinxsearch&type=cs&l=1186
The most recent thread discussing this on the Sphinx forum: http://sphinxsearch.com/forum/view.html?id=8644
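The tile idea mentioned above can be sketched roughly as follows. This is an illustrative Python sketch only, not the scheme from the forum thread: the cell size `CELL_RAD`, the helper names (`cell_of`, `build_grid`, `search`), and the in-memory dictionary are all assumptions. In Sphinx itself the cell id would presumably be stored as an integer attribute and used as a filter, so that the distance calculation only runs on rows in nearby cells.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000.0     # mean Earth radius; an assumption
CELL_RAD = math.radians(0.01)    # hypothetical tile size, about 1.1 km of latitude


def cell_of(lat, lon):
    """Map a coordinate (radians) to an integer (row, column) tile id."""
    return (math.floor(lat / CELL_RAD), math.floor(lon / CELL_RAD))


def build_grid(rows):
    """Group (doc_id, lat, lon) rows by the tile they fall into."""
    grid = defaultdict(list)
    for doc_id, lat, lon in rows:
        grid[cell_of(lat, lon)].append((doc_id, lat, lon))
    return grid


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters; inputs in radians."""
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def search(grid, lat, lon, radius_m):
    """Radius query: visit only tiles overlapping the bounding box,
    then apply the exact distance test to the candidates.
    Poles and the +/-180 degree wraparound are NOT handled in this sketch."""
    dlat = radius_m / EARTH_RADIUS_M
    cos_lat = math.cos(lat)
    if cos_lat <= math.sin(dlat):
        raise ValueError("anchor too close to a pole for this sketch")
    dlon = math.asin(math.sin(dlat) / cos_lat)
    row_lo, col_lo = cell_of(lat - dlat, lon - dlon)
    row_hi, col_hi = cell_of(lat + dlat, lon + dlon)
    hits = []
    for row in range(row_lo, row_hi + 1):
        for col in range(col_lo, col_hi + 1):
            for doc_id, dlat_, dlon_ in grid.get((row, col), ()):
                d = haversine_m(lat, lon, dlat_, dlon_)
                if d <= radius_m:
                    hits.append((d, doc_id))
    hits.sort()  # nearest first
    return hits
```

The tile size is a trade-off: smaller tiles mean fewer wasted distance calculations per query but more tile ids to enumerate for a large radius.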