I have a long Python tuple t
. I would like to grab the elements at indices i1
, i2
, ..., iN
from t
as efficiently as possible. What's the best way?
One approach is:
(1) result = [t[j] for j in (i1, i2, ..., iN)]
but this would seem to cause N separate lookups into the tuple. Is there a faster way? When Python does slices like this:
(2) result = t[1:M:3]
I assume that it does not p开发者_开发百科erform M/3 separate lookups. (Maybe it uses a bitmask and does a single copy operation?) Is there some way for me to capitalize on whatever Python does in (2) to make my arbitrary-index slice happen in a single copy?
Thanks.
If you are doing a bunch of identical lookups, it may be worth using an itemgetter
from operator import itemgetter
mygetter = itemgetter(i1, i2, ..., iN)
for tup in lots_of_tuples:
result = mygetter(tup)
For one off, the overhead of creating the itemgetter is not worthwhile
Quick test in iPython shows:
In [1]: import random
In [2]: from operator import itemgetter
In [3]: t=tuple(range(1000))
In [4]: idxs = tuple(random.randrange(1000) for i in range(20))
In [5]: timeit [t[i] for i in idxs]
100000 loops, best of 3: 2.09 us per loop
In [6]: mygetter = itemgetter(*idxs)
In [7]: timeit mygetter(t)
1000000 loops, best of 3: 596 ns per loop
Obviously the difference will depend on the length of the tuple, the number of indices, etc.
The one you've listed is the most optimal way to get the elements from a tuple. You usually don't care about the performance in such expressions – it's a premature optimisation, and even if you did, such operations are already too slow even with the optimisations, i.e. if you optimise the access the loop itself will still be slow due to reference counting of the temporary variables and etc.
If you already have a performance issue or this is already part of CPU-heavy code you can try several alternatives:
1) numpy
arrays:
>>> arr = np.array(xrange(2000))
>>> mask = np.array([True]*2000)
>>> mask = np.array([False]*2000)
>>> mask[3] = True
>>> mask[300] = True
>>> arr[mask]
array([ 3, 300])
2) You can use the C API to copy the elements using PyTuple_GET_ITEM
which accesses the internal array directly, but be warned that using the C API is not trivial and can introduce a lot of bugs.
3) You can use C arrays with the C API, using e.g. the buffer interface of array.array
to glue the data access to Python.
4) You can use Cython with C arrays and a custom Cython type for data access from Python.
5) You can use Cython and numpy
together.
Inside the list comprehension there is an implicit for
loop, and I am pretty sure it is iterating through the tuple values with reasonable efficiency. I don't think you can improve on the list comprehension for efficiency.
If you just need the values you might be able to use a generator expression and avoid building the list, for a slight savings in time or memory.
Slicing can be more efficient because it has more constraints: the index must proceed in a linear fashion by a fixed amount. The list comprehension could be completely random so no optimization is possible.
Still it's dangerous to make assumptions about efficiency. Try timing both ways and see if there's a significant difference.
1) Are you sure you need the operation to go faster?
2) Another option is operator.itemgetter
: It returns a tuple picked by its indexes:
>>> t = tuple(string.ascii_uppercase)
>>> operator.itemgetter(13,19,4,21,1)(t)
('N', 'T', 'E', 'V', 'B')
The operator
module is implemented in C, so will likely outperform a Python loop.
精彩评论