I have a list of many Python objects like this:
class RangeClass(object):
def __init__(self,address,size):
self.address=address
开发者_如何学编程 self.size=size
#other attributes and methods...
Then, I have a list (rangelist) of RangeClass objects.
I need to find within which range a given value is.
I can use some code like this:
for r in ragelist:
if(value>=r.address and value<(r.address+r.size)):
return r
return None
But I think there is a faster way. Ranges have arbitrary size, but we can assume that they don't overlap.
Thank you.
If you have many values to test, then you could use the bisect module to find which range the values are in more quickly.
If
m
= the number of values to test, andn
=len(rangelist)
then looping through the values and rangelist as you suggest would take O(m*n)
time.
If you use bisection, then you must first sort the starting addresses O(nlogn)
and find each value's place in rangelist O(m*logn)
.
So if
O(nlogn + m*logn) < O(m*n)
then bisection wins. For large n
, O(m*logn)
is miniscule compared to O(m*n)
.
So the inequality above would be true if
O(nlogn) < O(m*n)
or equivalently, when
C log(n) < m
for some constant C.
Thus, when n
is large and C log(n) < m
, you might try something like
import bisect
class RangeClass(object):
def __init__(self,address,size):
self.address=address
self.size=size
def __str__(self):
return '[{0},{1})'.format(self.address,self.address+self.size)
def __lt__(self,other):
return self.address<other.address
rangelist=sorted([RangeClass(i,1) for i in (1,3,4,5,7.5)])
starts=[r.address for r in rangelist]
def find_range(value):
start_idx=bisect.bisect(starts,value)-1
try:
r=rangelist[start_idx]
except IndexError:
return None
start=r.address
end=r.address+r.size
if start<=value<end:
return rangelist[start_idx]
return None
print(','.join(str(r) for r in rangelist))
for value in (0,1,1.5,2,3,4,5,6,7,8,9,10):
r=find_range(value)
if r:
print('{v} in {r}'.format(v=value,r=str(r)))
else:
print('{v} not in any range'.format(v=value))
Not really. All you can do is take advantage of Python's relational operator chaining.
if r.address <= value < (r.address + r.size):
You could also define __contains__
on RangeClass
to allow you to use in
to find it instead.
class RangeClass(object):
...
def __contains__(self, val):
return self.address <= val < (self.address + self.size)
...
if value in r:
Implement the comparison operator in Range, have the range list sorted, and use bisect to search for the range a value belongs in:
import bisect
def find_range(value):
index = bisect.bisect(rangelist, value)
if index not in (0, len(rangelist)):
index -= 1
return rangelist[index]
Thank you to everyone,
I am actually using the method proposed by unutbu.
Moreover, I add another optimization:
if(value <= first_range_value or value >= last_range_value):
return None
Where first_range_value and last_range_value have been computed before, and they are the smallest r.address value and the largest r.address+r.size value .
This is worthing in my application, but it really depends on the distribution of ranges and values.
精彩评论