The period of the Mersenne Twister used in the random module
is (I am told) 2**19937 - 1. As a binary number, that is 19937 '1's in a row (if I'm not mistaken). Python converts it to decimal pretty darned fast:
$ python -m timeit '2**19937'
10000000 loops, best of 3: 0.0271 usec per loop
$ python -m timeit -s 'result = 0' 'result += 2**19937'
100000 loops, best of 3: 2.09 usec per loop
I guess the second version is the one that requires conversion?
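By the way, a quick shell check confirms the "19937 ones" claim:
>>> n = 2**19937 - 1
>>> bin(n)[2:] == '1' * 19937
True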
And it's not just powers of two. This is also fast. (Rather than showing the numbers themselves, I show the length of each number converted to a decimal string):
>>> import math
>>> N = 1000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
10787
>>> N = 5000
>>> s = str((int(N*math.e))**(int(N*math.pi)))
>>> len(s)
64921
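Those lengths line up with the standard digit-count formula: the number of decimal digits of m**n is int(n*math.log10(m)) + 1 (barring exact powers of ten). For the N=1000 case:
>>> m, n = int(1000*math.e), int(1000*math.pi)
>>> int(n * math.log10(m)) + 1
10787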
Timing:
$ python -m timeit -s 'import math' -s 'N=1000' 's = str((int(N*math.e))**(int(N*math.pi)))'
10 loops, best of 3: 51.2 msec per loop
The question is: how is this actually done?
Am I just naive to be impressed? I find the sight of the Python shell generating a number of 5000 or so places in an instant truly spectacular.
Edit:
Additional timings suggested by @dalke and @truppo
$ python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 230 usec per loop
$ python -m timeit 'x=2' 'int(x**19937)'
1000 loops, best of 3: 232 usec per loop
$ python -m timeit 'x=2' 'str(x**19937)'
100 loops, best of 3: 16.6 msec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937'
1000 loops, best of 3: 237 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'int(result)'
1000 loops, best of 3: 238 usec per loop
$ python -m timeit -s 'result = 0' 'x = 2' 'result += x**19937' 'str(result)'
100 loops, best of 3: 16.6 msec per loop
So it looks to me like result = 0; result += x**19937
forces the exponentiation but not the decimal conversion; only str() pays the conversion cost.
Hate to rain on your parade, but the reason it's so fast is that the math module is not actually implemented in Python.
Python supports loading shared libraries that export Python APIs but are implemented in other languages. math.so, which provides the module you get from import math, happens to be one of those (and so is _random.so).
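You can check from the shell which file a module came from (the exact path varies by platform, and on some builds math is compiled directly into the interpreter instead):
>>> import math
>>> math.__file__
'/usr/lib/python2.7/lib-dynload/math.so'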
When compiling to byte code, constant expressions such as 2**19937 will be optimized down to a single constant:
>>> def foo(): return 2**10
...
>>> import dis
>>> dis.dis(foo)
  1           0 LOAD_CONST               3 (1024)
              3 RETURN_VALUE
>>>
Consider instead:
[~] python -m timeit 'x=2' 'x**19937'
1000 loops, best of 3: 210 usec per loop
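The bytecode shows the difference: with a runtime variable there is no constant to fold, so the exponentiation happens on every call (a Python 2.x session; exact offsets vary by version):
>>> def bar(x): return x**19937
...
>>> dis.dis(bar)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_CONST               1 (19937)
              6 BINARY_POWER
              7 RETURN_VALUE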
"Python converts it to decimal pretty darned fast."
I don't know Python, but no, it needn't do that. 2^19937 doesn't require any computation; it is simply a binary shift ("<<") by 19937, so it is very fast. Only when you print it in decimal is the actual conversion necessary, and that is much slower.
EDIT: Exponentiation is the same as shifting (i.e., moving the point) if the base of the number representation is identical to the base of the power.
10^-1 = 0.1
10^0 = 1
10^1 = 10
10^2 = 100
10^3 = 1000
10^n = 1 followed by n zeroes
You can see that raising 10 to the power n simply shifts the number. Computers mostly use base 2 internally (bits), so calculating 2^19937 just means setting a bit at position 19937 (counting bit positions from zero).
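In Python terms, the power and the shift are interchangeable here:
>>> 2**19937 == 1 << 19937
True
>>> (1 << 19937).bit_length()   # the single set bit sits at position 19937
19938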
As additional information: the conversion to decimal is normally implemented by a divide-and-conquer algorithm that successively splits the number at powers of ten. As you can see from the timings above, the conversion is far slower than the computation itself (16.6 msec for str versus about 230 usec for the exponentiation, a factor of roughly 70).
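A minimal sketch of that idea in Python (illustrative only; CPython's actual conversion routine is written in C):

def to_decimal(n):
    # Convert a non-negative int to its decimal string by splitting it
    # near the middle at a power of ten and recursing on both halves.
    if n < 10**9:                  # small enough: use the builtin conversion
        return str(n)
    p, digits = 10, 1
    while p * p <= n:              # grow p = 10**digits to near sqrt(n)
        p, digits = p * p, digits * 2
    hi, lo = divmod(n, p)
    return to_decimal(hi) + to_decimal(lo).rjust(digits, '0')

A quick check: to_decimal(2**19937) == str(2**19937) gives True.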
The second example is more interesting: when computing m^n with large integers m and n, the fastest way is repeated squaring, storing the intermediate results.
Example: 10^345, using 345 = 256 + 64 + 16 + 8 + 1:
a = 10^2
b = a*a = 10^4
c = b*b = 10^8
d = c*c = 10^16
e = d*d = 10^32
f = e*e = 10^64
g = f*f = 10^128
h = g*g = 10^256
result = h*f*d*c*10
(This can be further optimized: see Knuth, Seminumerical Algorithms.)
So you only need long multiplications, and those can be computed quite efficiently.
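A sketch of the same square-and-multiply idea in Python (CPython's builtin ** for long integers works along these lines, in C):

def power(m, n):
    # Binary exponentiation: one squaring per exponent bit,
    # plus one extra multiply for each bit that is set.
    result = 1
    while n:
        if n & 1:
            result *= m
        m *= m
        n >>= 1
    return result

For example, power(10, 345) == 10**345 gives True, using about a dozen multiplications instead of 344.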
EDIT: The exact implementation of multiplication varies: besides ordinary schoolbook multiplication, Karatsuba, Toom-Cook, and Schönhage-Strassen (FFT-based) multiplication are used.
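For illustration, a minimal Karatsuba sketch (real implementations are heavily tuned; the cutoff below which this falls back to the builtin multiply is arbitrary here):

def karatsuba(x, y):
    # Split both non-negative operands at bit position n, so that
    # x*y = a*2^(2n) + ((xh+xl)*(yh+yl) - a - b)*2^n + b,
    # which needs three sub-multiplications instead of four.
    if x < 2**64 or y < 2**64:
        return x * y
    n = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << n) - 1
    xh, xl = x >> n, x & mask
    yh, yl = y >> n, y & mask
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    c = karatsuba(xh + xl, yh + yl) - a - b
    return (a << (2 * n)) + (c << n) + b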
I know little to nothing about how this is actually implemented in Python, but given that this is basically primitive multiplication and exponentiation, I'm not terribly surprised that it's reasonably fast even on quite large numbers.
There are arbitrary-precision math libraries such as GMP that implement a wide variety of operations very efficiently, with assembly-optimized inner loops, for exactly this purpose.
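For example, with the third-party gmpy2 bindings installed (pip install gmpy2), the same kind of computation runs on GMP's integers:
>>> from gmpy2 import mpz
>>> x = mpz(2)**19937
>>> len(str(x))
6002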