I am working with some spreadsheet data and I have a set of cell regions that are of arbitrary bounds. Given any cell, what is the fastest way to determine the subset of regions which contain the cell?
Currently, the best I have is to sort the regions with the primary sort field being the region's starting row index, followed by its ending row index, starting column index, and then ending column index. When I want to search based on a given cell, I binary search to the first region whose starting row index is after the cell's row index and then I check all regions before that one to see if they contain the cell, but this is too slow.
Based on some Googling, this is an example of the two-dimensional point enclosure searching problem, or the "stabbing problem". See:
http://www.cs.nthu.edu.tw/~wkhon/ds/ds10/tutorial/tutorial6.pdf
or here (starting at p.21/52):
http://www.cs.brown.edu/courses/cs252/misc/slides/orthsearch.pdf
The key data structure involved is the segment tree:
http://en.wikipedia.org/wiki/Segment_tree
For the 2-D case, it looks like you can build a segment tree containing segment trees and get O(log^2(n)) query complexity. (I think your current solution is O(n) since on average you'll just exclude half of your regions with your binary search.)
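To make the nested idea concrete, here is a rough sketch of a segment tree over rows whose nodes each hold a segment tree over columns. It is written in Python only so that it runs without a spreadsheet host. The class names `SegmentTree` and `RegionIndex`, and the representation of a region as an inclusive `(row1, col1, row2, col2)` tuple, are assumptions made for illustration and do not come from the linked slides.

```python
from bisect import bisect_left, bisect_right


class SegmentTree:
    """Static segment tree over half-open integer intervals [lo, hi)."""

    def __init__(self, intervals):
        # intervals: iterable of (lo, hi, payload) with lo < hi
        intervals = list(intervals)
        # Distinct endpoints split the axis into elementary slots.
        self.bounds = sorted({b for lo, hi, _ in intervals for b in (lo, hi)})
        self.n = max(len(self.bounds) - 1, 1)  # number of elementary slots
        self.items = {}  # node id -> payloads stored at that node
        for lo, hi, payload in intervals:
            i = bisect_left(self.bounds, lo)
            j = bisect_left(self.bounds, hi)
            self._insert(1, 0, self.n, i, j, payload)

    def _insert(self, node, lo, hi, i, j, payload):
        if i <= lo and hi <= j:  # node's slot range is fully covered
            self.items.setdefault(node, []).append(payload)
            return
        mid = (lo + hi) // 2
        if i < mid:
            self._insert(2 * node, lo, mid, i, j, payload)
        if j > mid:
            self._insert(2 * node + 1, mid, hi, i, j, payload)

    def path(self, x):
        """Node ids from the root down to the leaf whose slot contains x."""
        if not self.bounds or x < self.bounds[0] or x >= self.bounds[-1]:
            return []
        slot = bisect_right(self.bounds, x) - 1
        node, lo, hi, nodes = 1, 0, self.n, []
        while True:
            nodes.append(node)
            if hi - lo == 1:
                return nodes
            mid = (lo + hi) // 2
            if slot < mid:
                node, hi = 2 * node, mid
            else:
                node, lo = 2 * node + 1, mid


class RegionIndex:
    """Row segment tree; each row node owns a column segment tree."""

    def __init__(self, regions):
        # regions: inclusive (row1, col1, row2, col2) tuples (assumed format)
        regions = list(regions)
        self.rows = SegmentTree((reg[0], reg[2] + 1, reg) for reg in regions)
        self.cols = {
            node: SegmentTree((reg[1], reg[3] + 1, reg) for reg in regs)
            for node, regs in self.rows.items.items()
        }

    def query(self, row, col):
        hits = []
        for node in self.rows.path(row):  # O(log n) row nodes
            inner = self.cols.get(node)
            if inner is not None:
                for inner_node in inner.path(col):  # O(log n) column nodes
                    hits.extend(inner.items.get(inner_node, []))
        return hits


# Q42:Z99, A1:S100 and R45 written as (row1, col1, row2, col2)
index = RegionIndex([(42, 17, 99, 26), (1, 1, 100, 19), (45, 18, 45, 18)])
print(index.query(45, 18))
```

Each query walks one root-to-leaf path in the row tree and, at every node on that path, one root-to-leaf path in that node's column tree, which is where the log-squared factor comes from (plus the size of the output).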
However, you said "spreadsheet", which means you've probably got a relatively small area to work with. More importantly, you've got integer coordinates. And you said "fastest", which means you're probably willing to trade space and setup time for a faster query.
You didn't say which spreadsheet, but the code below is a wildly-inefficient, but dirt-simple, brute-force Excel/VBA implementation of a 2-D lookup table that, once set up, has O(1) query complexity:
    Public Sub brutishButShort()
        ' One slot per cell, sized for the Excel 97-2003 grid (65536 rows x 256 columns)
        Dim posns(1 To 65536, 1 To 256) As Collection

        ' Sample regions to index
        Dim regions As Collection
        Set regions = New Collection
        Call regions.Add([q42:z99])
        Call regions.Add([a1:s100])
        Call regions.Add([r45])

        Dim rng As Range
        Dim cell As Range
        Dim r As Long
        Dim c As Long

        ' Setup: record every region against each cell it covers
        For Each rng In regions
            For Each cell In rng
                r = cell.Row
                c = cell.Column
                If posns(r, c) Is Nothing Then
                    Set posns(r, c) = New Collection
                End If
                Call posns(r, c).Add(rng)
            Next cell
        Next rng

        ' Query: a single array lookup returns all regions containing the cell
        Dim query As Range
        Set query = [r45]
        If Not posns(query.Row, query.Column) Is Nothing Then
            Dim result As Range
            For Each result In posns(query.Row, query.Column)
                Debug.Print result.Address
            Next result
        End If
    End Sub
If you have a larger grid to worry about or regions that are large relative to the grid, you can save a ton of space and setup time by using two 1-D lookup tables instead. However, then you have two lookups, plus a need to take the intersection of the two resulting sets.
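A minimal sketch of the two-table variant, in Python rather than VBA so that it runs without Excel. The function names `build_axis_tables` and `regions_containing`, and the inclusive `(row1, col1, row2, col2)` tuple format, are hypothetical choices for illustration. One table maps each row to the regions spanning it, the other does the same for columns. Since the regions are rectangles, a region contains the cell exactly when it appears in both lookups.

```python
from collections import defaultdict


def build_axis_tables(regions):
    """Build row -> region ids and column -> region ids lookup tables."""
    rows = defaultdict(set)
    cols = defaultdict(set)
    for idx, (r1, c1, r2, c2) in enumerate(regions):
        for r in range(r1, r2 + 1):
            rows[r].add(idx)
        for c in range(c1, c2 + 1):
            cols[c].add(idx)
    return rows, cols


def regions_containing(rows, cols, r, c):
    """Ids of regions covering cell (r, c): intersect the two lookups."""
    return rows.get(r, set()) & cols.get(c, set())


# Q42:Z99, A1:S100 and R45 written as (row1, col1, row2, col2)
regions = [(42, 17, 99, 26), (1, 1, 100, 19), (45, 18, 45, 18)]
rows, cols = build_axis_tables(regions)
print(sorted(regions_containing(rows, cols, 45, 18)))
```

Setup touches only height plus width entries per region instead of height times width, at the cost of a set intersection on every query.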
I think you want to test whether the Intersect of the cell and each region is Nothing; if it is not, the region contains the cell:
    Sub RegionsContainingCell(rCell As Range, ParamArray vRegions() As Variant)
        Dim i As Long
        For i = LBound(vRegions) To UBound(vRegions)
            If TypeName(vRegions(i)) = "Range" Then
                If Not Intersect(rCell, vRegions(i)) Is Nothing Then
                    Debug.Print vRegions(i).Address
                End If
            End If
        Next i
    End Sub

    Sub test()
        RegionsContainingCell Range("B50"), Range("A1:Z100"), Range("C2:C10"), Range("B1:B70"), Range("A1:C30")
    End Sub