Given two normal Python lists, newlist and oldlist, and an integer index < len(oldlist), I'd like to perform the following operation:
newlist.extend(oldlist[index:])
but without creating the intermediate list oldlist[index:], or equivalently,
newlist.extend(oldlist[i] for i in xrange(index, len(oldlist)))
without the overhead of a generator. Is that possible without using C?
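For what it's worth, itertools.islice iterates over the tail lazily, without materializing an intermediate list (shown here with Python 3's range; the idea is the same under Python 2):

```python
from itertools import islice

oldlist = [0, 1, 2, 3, 4, 5]
newlist = [10, 11]
index = 3

# islice walks oldlist from index to the end; no slice copy is built
newlist.extend(islice(oldlist, index, None))
print(newlist)  # [10, 11, 3, 4, 5]
```

Whether this is actually faster than the plain slice is exactly what the benchmarks below measure.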
Edit: This question arose from looking at the C implementation of some list operations, in particular list.extend(). When the interpreter can determine the size of the tail being added, it allocates the full size in the destination list up front and copies the elements in as they are generated; in other cases it grows the list a few elements at a time (about eight, if memory serves) and copies elements in batches.
The cases where it does the full allocation seem to be Python lists and a few other types that have a __len__. As far as I can tell, there's no built-in 'list view' type that would satisfy those requirements.
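A minimal 'list view' along those lines might look like the sketch below. This is an illustration, not a CPython-verified fast path: it reports __len__ so extend() can use it as a size hint, but whether the interpreter actually takes the presizing branch for it depends on the implementation and version.

```python
class ListView:
    """Read-only view of the tail of a list, starting at a given index.

    Reports its length via __len__ so that list.extend() can use it
    as a size hint when preallocating.
    """

    def __init__(self, lst, start):
        self._lst = lst
        self._start = start

    def __len__(self):
        # Length of the tail; never negative even if start > len(lst)
        return max(0, len(self._lst) - self._start)

    def __iter__(self):
        for i in range(self._start, len(self._lst)):
            yield self._lst[i]


oldlist = list(range(10))
newlist = [0]
newlist.extend(ListView(oldlist, 7))
print(newlist)  # [0, 7, 8, 9]
```

Note that this still pays per-element Python-level overhead in __iter__, which is likely to swamp any allocation savings.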
Don't guess, measure
create = """
oldlist = range(5000)
newlist = range(5000, 10000)
index = 500
"""
tests = [
    "newlist.extend(oldlist[index:])",
    "newlist.extend(oldlist[i] for i in xrange(index, len(oldlist)))",
    "newlist.extend(islice(oldlist, index, None))",
    """\
while index < len(oldlist):
    newlist.append(oldlist[index])
    index+=1""",
]

import timeit
for test in tests:
    t = timeit.Timer(create + test, setup='from itertools import islice')
    print test, min(t.repeat(number=100000))
newlist.extend(oldlist[index:]) 17.2596559525
newlist.extend(oldlist[i] for i in xrange(index, len(oldlist))) 53.5918159485
newlist.extend(islice(oldlist, index, None)) 19.6523411274
while index < len(oldlist):
    newlist.append(oldlist[index])
    index+=1 123.556715012
The obvious solution would be:
while index < len(oldlist):
    newlist.append(oldlist[index])
    index += 1
But be wary of premature optimization: I've never run into a situation where the loss of readability in this solution would be worth it. And, of course, benchmark all the options to make sure that the solution you think is faster actually is.
appendnew = newlist.append
try:
    while 1:
        appendnew(oldlist[index])
        index += 1
except IndexError:
    pass
or, slightly less bogglingly:
appendnew = newlist.append
for i in xrange(index, len(oldlist)):
    appendnew(oldlist[i])
Some clues on better benchmarking
Measure the overhead and subtract it off.
Put the code inside a function or method (this simulates real use and avoids the distorting effects of global-variable lookups).
from itertools import islice

def f0(newlist, oldlist, index):
    pass

def f1(newlist, oldlist, index):
    newlist.extend(oldlist[index:])

def f2(newlist, oldlist, index):
    newlist.extend(oldlist[i] for i in xrange(index, len(oldlist)))

def f3(newlist, oldlist, index):
    newlist.extend(islice(oldlist, index, None))

def f4(newlist, oldlist, index):
    while index < len(oldlist):
        newlist.append(oldlist[index])
        index += 1
>python -mtimeit -s"old=range(1000);new=range(5000,10000);ix=500;import xtnd" "xtnd.f4(new,old,ix)"
If the code being benchmarked has a variable N (in this case N = len(oldlist) - index), benchmark with more than one value of N. If you expect O(N) behaviour, O(1) results should be a cause for investigation.
Also compare results across pairs of candidates against reasonable expectations; wild variations should be investigated, as they could be caused by experimental error.
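Putting those pieces together, a sketch of the advice in script form (Python 3 spelling; function names mirror the f0/f1 pattern above, absolute timings are machine-dependent):

```python
import timeit

def f0(newlist, oldlist, index):
    pass  # empty baseline: measures the call/argument overhead

def f1(newlist, oldlist, index):
    newlist.extend(oldlist[index:])

# Benchmark at more than one N and subtract the baseline overhead;
# for extend, the corrected time should grow roughly linearly with N.
for n in (1000, 10000):
    oldlist = list(range(n))
    base = min(timeit.repeat(lambda: f0([], oldlist, 0), number=200, repeat=3))
    t = min(timeit.repeat(lambda: f1([], oldlist, 0), number=200, repeat=3))
    print(n, t - base)
```

Using min() over several repeats, as in the first answer, reduces noise from other processes on the machine.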