Can this loop be sped up in pure Python?

Developer https://www.devze.com 2023-02-04 16:28 Source: Internet
I was trying out an experiment with Python, trying to find out how many times it could add one to an integer in one minute. Assuming two computers are the same except for the speed of their CPUs, this should give an estimate of how fast some CPU operations are on the computer in question.

The code below is an example of a test designed to fulfill the requirements given above. This version is about 20% faster than the first attempt and 150% faster than the third attempt. Can anyone suggest how to get the most additions in a one-minute time span? Higher numbers are desirable.

EDIT 1: This experiment is being written in Python 3.1 and is 15% faster than the fourth speed-up attempt.

def start(seconds):
    import time, _thread
    def stop(seconds, signal):
        time.sleep(seconds)
        signal.pop()
    total, signal = 0, [None]
    _thread.start_new_thread(stop, (seconds, signal))
    while signal:
        total += 1
    return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))

EDIT 2: Regarding using True instead of 1 in the while loop: there should be no speed difference. The following experiment proves that they are the same. First, create a file named main.py and copy the following code into it.

def test1():
    total = 0
    while 1:
        total += 1

def test2():
    total = 0
    while True:
        total += 1

if __name__ == '__main__':
    import dis, main
    dis.dis(main)

Running the code should produce the following output that shows how the code was actually compiled and what the generated Python Virtual Machine Instructions turned out to be.

Disassembly of test1:
  2           0 LOAD_CONST               1 (0) 
              3 STORE_FAST               0 (total) 

  3           6 SETUP_LOOP              13 (to 22) 

  4     >>    9 LOAD_FAST                0 (total) 
             12 LOAD_CONST               2 (1) 
             15 INPLACE_ADD          
             16 STORE_FAST               0 (total) 
             19 JUMP_ABSOLUTE            9 
        >>   22 LOAD_CONST               0 (None) 
             25 RETURN_VALUE         

Disassembly of test2:
  7           0 LOAD_CONST               1 (0) 
              3 STORE_FAST               0 (total) 

  8           6 SETUP_LOOP              13 (to 22) 

  9     >>    9 LOAD_FAST                0 (total) 
             12 LOAD_CONST               2 (1) 
             15 INPLACE_ADD          
             16 STORE_FAST               0 (total) 
             19 JUMP_ABSOLUTE            9 
        >>   22 LOAD_CONST               0 (None) 
             25 RETURN_VALUE         

The emitted PVMIs (byte codes) are exactly the same, so both loops should run without any difference in speed.
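The same comparison can be made without reading the disassembly by eye. The sketch below assumes Python 3, where True is a keyword (in Python 2 it was an ordinary name, so the result there may differ); the helper name same_bytecode is made up for this illustration. It compares the raw bytecode of the two functions, which are compiled but never called.

```python
def test1():
    total = 0
    while 1:
        total += 1

def test2():
    total = 0
    while True:
        total += 1

def same_bytecode(f, g):
    """Return True if two functions compiled to identical raw bytecode."""
    return f.__code__.co_code == g.__code__.co_code

# The functions are only compiled here, never called (they would not return).
print(same_bytecode(test1, test2))
```

Comparing co_code ignores line numbers, which is why the two functions can match even though they sit on different lines of the file.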


I see almost the same, but consistently better (~2%), results than @Amber's version on my machine with Python 3.1.2 for this code:

import signal

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def jfs_signal(seconds):
    # set signal handler
    signal.signal(signal.SIGALRM, alarm_handler)
    # raise Alarm in `seconds` seconds
    signal.alarm(seconds)

    total = 0
    try:
        while 1:
            total += 1
    except Alarm:
        pass # the alarm fired; fall through and report the count
    finally:
        signal.alarm(0) # disable the alarm
    return total
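signal.alarm() only accepts whole seconds, which makes short trial runs awkward. A possible variation, sketched here under the assumption of a Unix-like system (SIGALRM is not available on Windows) and a call from the main thread, uses signal.setitimer() so the duration can be a float. The function name count_for is made up for this sketch, and it also restores the previous handler when done.

```python
import signal

class Alarm(Exception):
    pass

def alarm_handler(signum, frame):
    raise Alarm

def count_for(seconds):
    """Count increments until a real-time interval timer fires (Unix only)."""
    previous = signal.signal(signal.SIGALRM, alarm_handler)
    # ITIMER_REAL delivers SIGALRM after `seconds`, which may be a float
    signal.setitimer(signal.ITIMER_REAL, seconds)
    total = 0
    try:
        while 1:
            total += 1
    except Alarm:
        pass
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
    return total

if __name__ == '__main__':
    print('total:', count_for(0.5))
```

The counting loop itself is unchanged, so the per-second rate should be comparable to the signal.alarm() version; only the timer setup differs.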

Here's a variant that uses the subprocess module to run an interruptible loop:

#!/usr/bin/env python
# save it as `skytower.py` file
import atexit
import os
import signal
import subprocess
import sys
import tempfile
import time

def loop():
    @atexit.register
    def print_total():
        print(total)

    total = 0
    while 1:
        total += 1

def jfs_subprocess(seconds):
    # start process, redirect stdout/stderr
    f = tempfile.TemporaryFile() 
    p = subprocess.Popen([sys.executable, "-c",
                          "from skytower import loop; loop()"],
                         stdout=f, stderr=open(os.devnull, 'wb'))
    # wait 
    time.sleep(seconds)

    # raise KeyboardInterrupt
    #NOTE: if it doesn't kill the process then `p.wait()` blocks forever
    p.send_signal(signal.SIGINT) 
    p.wait() # wait for the process to terminate otherwise the output
             # might be garbled

    # return saved output
    f.seek(0) # rewind to the beginning of the file
    d = int(f.read())
    f.close()
    return d

if __name__ == '__main__':
    print('total:', jfs_subprocess(60))

It is ~20% slower than the signal.alarm() variant on my machine.


About a 20-25% improvement, FWIW - but like others, I'd propose that Python incrementing integers probably isn't the best benchmarking tool.

def start(seconds):
    import time, _thread
    def stop(seconds):
        time.sleep(seconds)
        _thread.interrupt_main()
    total = 0
    _thread.start_new_thread(stop, (seconds,))
    try:
        while True:
            total += 1
    except KeyboardInterrupt:  # raised in the main thread by interrupt_main()
        return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))
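On the benchmarking caveat: if the goal is only a rough per-increment cost rather than a one-minute total, the standard timeit module may be a simpler tool. This is a sketch, not a replacement for the experiment above; the function name ns_per_increment and its default arguments are assumptions made for illustration, and the figure includes timeit's own loop overhead.

```python
import timeit

def ns_per_increment(loops=1_000_000, repeats=5):
    """Estimate the cost of one `total += 1` in nanoseconds."""
    timer = timeit.Timer('total += 1', setup='total = 0')
    # Take the fastest of several runs to reduce scheduling noise
    best = min(timer.repeat(repeat=repeats, number=loops))
    return best / loops * 1e9

if __name__ == '__main__':
    print('approx. ns per increment:', round(ns_per_increment(), 1))
```

Taking the minimum over several repeats is the usual way to read timeit results, since slower runs mostly reflect interference from other processes.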


This exercise on learning more about Python and computers was satisfying. This is the final program:

def start(seconds, total=0):
    import _thread, time
    def stop():
        time.sleep(seconds)
        _thread.interrupt_main()
    _thread.start_new_thread(stop, ())
    try:
        while True:
            total += 1
    except KeyboardInterrupt:
        return total

if __name__ == '__main__':
    print('Testing the CPU speed ...')
    print('Relative speed:', start(60))

Running it on Windows 7 Professional with a 2.16 GHz CPU produced the following output within IDLE:

Python 3.1.3 (r313:86834, Nov 27 2010, 18:30:53) [MSC v.1500 32 bit (Intel)] 
on win32
Type "copyright", "credits" or "license()" for more information.
>>> ================================ RESTART ================================
>>> 
Testing the CPU speed ...
Relative speed: 673991388
>>> 
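To put that number in perspective, it works out to roughly 11.2 million increments per second. The sketch below does the arithmetic, assuming the CPU ran at its nominal 2.16 GHz for the whole minute (an assumption, since actual clock speed can vary); the helper name summarize is made up for this illustration.

```python
def summarize(total, seconds, clock_hz):
    """Convert a raw count into per-second and per-cycle figures."""
    per_second = total / seconds
    cycles_per_increment = clock_hz / per_second
    return per_second, cycles_per_increment

# Figures taken from the run shown above; 2.16e9 Hz is the nominal clock speed
per_second, cycles = summarize(673991388, 60, 2.16e9)
print('increments per second: %.0f' % per_second)
print('clock cycles per increment: %.1f' % cycles)
```

Under that assumption each pass through the loop costs on the order of 190 clock cycles, which shows how much of the measurement is interpreter overhead rather than the addition itself.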

Edit: The code above only runs on one core. The following program was written to fix that problem. It needs Python 3.3 or later, where multiprocessing.Barrier was added.

#! /usr/bin/env python3

def main(seconds):
    from multiprocessing import cpu_count, Barrier, SimpleQueue, Process
    def get_all(queue):
        while not queue.empty():
            yield queue.get()
    args = seconds, Barrier(cpu_count()), SimpleQueue()
    processes = [Process(target=run, args=args) for _ in range(cpu_count())]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print('Relative speed:', sorted(get_all(args[-1]), reverse=True))

def run(seconds, barrier, queue):
    from time import sleep
    from _thread import interrupt_main, start_new_thread
    def terminate():
        sleep(seconds)
        interrupt_main()
    total = 0
    barrier.wait()
    start_new_thread(terminate, ())
    try:
        while True:
            total += 1
    except KeyboardInterrupt:
        queue.put(total)

if __name__ == '__main__':
    main(60)