I am reading David Beazley's Python Reference book and he makes a point:
开发者_运维技巧For example, if you were performing a lot of square root operations, it is faster to use 'from math import sqrt' and 'sqrt(x)' rather than typing 'math.sqrt(x)'.
and:
For calculations involving heavy use of methods or module lookups, it is almost always better to eliminate the attribute lookup by putting the operation you want to perform into a local variable first.
I decided to try it out:
first()
def first():
from collections import defaultdict
x = defaultdict(list)
second()
def second():
import collections
x = collections.defaultdict(list)
The results were:
2.15461492538
1.39850616455
Optimizations such as these probably don't matter to me. But I am curious as to why the opposite of what Beazley has written comes out to be true. And note that there is a difference of 1 second, which is singificant given the task is trivial.
Why is this happening?
UPDATE:
I am getting the timings like:
print timeit('first()', 'from __main__ import first');
print timeit('second()', 'from __main__ import second');
The from collections import defaultdict
and import collections
should be outside the iterated timing loops, since you won't repeat doing them.
I guess that the from
syntax has to do more work that the import
syntax.
Using this test code:
#!/usr/bin/env python
import timeit
from collections import defaultdict
import collections
def first():
from collections import defaultdict
x = defaultdict(list)
def firstwithout():
x = defaultdict(list)
def second():
import collections
x = collections.defaultdict(list)
def secondwithout():
x = collections.defaultdict(list)
print "first with import",timeit.timeit('first()', 'from __main__ import first');
print "second with import",timeit.timeit('second()', 'from __main__ import second');
print "first without import",timeit.timeit('firstwithout()', 'from __main__ import firstwithout');
print "second without import",timeit.timeit('secondwithout()', 'from __main__ import secondwithout');
I get results:
first with import 1.61359190941
second with import 1.02904295921
first without import 0.344709157944
second without import 0.449721097946
Which shows how much the repeated imports cost.
I'll get also similar ratios between first(.)
and second(.)
, only difference is that the timings are in microsecond level.
I don't think that your timings measure anything useful. Try to figure out better test cases!
Update:
FWIW, here is some tests to support David Beazley's point.
import math
from math import sqrt
def first(n= 1000):
for k in xrange(n):
x= math.sqrt(9)
def second(n= 1000):
for k in xrange(n):
x= sqrt(9)
In []: %timeit first()
1000 loops, best of 3: 266 us per loop
In [: %timeit second()
1000 loops, best of 3: 221 us per loop
In []: 266./ 221
Out[]: 1.2036199095022624
So first()
is some 20% slower than second()
.
first()
doesn't save anything, since the module must still be accessed in order to import the name.
Also, you don't give your timing methodology but given the function names it seems that first()
performs the initial import, which is always longer than subsequent imports since the module must be compiled and executed.
My guess, your test is biased and the second implementation gains from the first one already having loaded the module, or just from having it loaded recently.
How many times did you try it? Did you switch up the order, etc..
There is also the question of efficiency of reading/understanding the source code. Here's a real live example (code from a stackoverflow question)
Original:
import math
def midpoint(p1, p2):
lat1, lat2 = math.radians(p1[0]), math.radians(p2[0])
lon1, lon2 = math.radians(p1[1]), math.radians(p2[1])
dlon = lon2 - lon1
dx = math.cos(lat2) * math.cos(dlon)
dy = math.cos(lat2) * math.sin(dlon)
lat3 = math.atan2(math.sin(lat1) + math.sin(lat2), math.sqrt((math.cos(lat1) + dx) * (math.cos(lat1) + dx) + dy * dy))
lon3 = lon1 + math.atan2(dy, math.cos(lat1) + dx)
return(math.degrees(lat3), math.degrees(lon3))
Alternative:
from math import radians, degrees, sin, cos, atan2, sqrt
def midpoint(p1, p2):
lat1, lat2 = radians(p1[0]), radians(p2[0])
lon1, lon2 = radians(p1[1]), radians(p2[1])
dlon = lon2 - lon1
dx = cos(lat2) * cos(dlon)
dy = cos(lat2) * sin(dlon)
lat3 = atan2(sin(lat1) + sin(lat2), sqrt((cos(lat1) + dx) * (cos(lat1) + dx) + dy * dy))
lon3 = lon1 + atan2(dy, cos(lat1) + dx)
return(degrees(lat3), degrees(lon3))
Write your code as usual, importing a module and referencing its modules and constants as module.attribute
. Then, either prefix your functions with the decorator for binding constants or bind all modules in your program by using the bind_all_modules
function below:
def bind_all_modules():
from sys import modules
from types import ModuleType
for name, module in modules.iteritems():
if isinstance(module, ModuleType):
bind_all(module)
def bind_all(mc, builtin_only=False, stoplist=[], verbose=False):
"""Recursively apply constant binding to functions in a module or class.
Use as the last line of the module (after everything is defined, but
before test code). In modules that need modifiable globals, set
builtin_only to True.
"""
try:
d = vars(mc)
except TypeError:
return
for k, v in d.items():
if type(v) is FunctionType:
newv = _make_constants(v, builtin_only, stoplist, verbose)
try: setattr(mc, k, newv)
except AttributeError: pass
elif type(v) in (type, ClassType):
bind_all(v, builtin_only, stoplist, verbose)
精彩评论