
Fast-Responding Command Line Scripts

开发者 https://www.devze.com 2023-01-31 08:48 Source: Internet

I have been writing command-line Python scripts for a while, but recently I felt really frustrated with speed.

I'm not necessarily talking about processing speed, dispatching tasks or other tool-specific work (that is usually a design/implementation problem). I am talking about simply running a tool to get a help menu or to display minimal information.

As an example, Mercurial is at around 0.080 seconds and Git is at 0.030 seconds.

I have looked into Mercurial's source code (it is Python after all), but the answer to how to write a fast-responding script still eludes me.

I think imports and how you manage them are a big reason for the initial slowdown. But is there a best practice for fast-acting, fast-responding command-line scripts in Python?

A single Python script that imports os and optparse and executes main() to parse some argument options takes 0.160 seconds on my machine just to display the help menu...

This is 5 times slower than just running git!
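For anyone who wants to reproduce numbers like these, here is a minimal sketch of how the start-up cost could be measured. The helper name startup_seconds is my own invention, and the figures it prints will vary by machine and Python version.

```python
import subprocess
import sys
import time


def startup_seconds(code="pass", runs=5):
    """Return the best wall-clock time of `python -c code` over several runs."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Launch a fresh interpreter so start-up and imports are included.
        subprocess.run([sys.executable, "-c", code], check=True)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print("bare interpreter: %.3f s" % startup_seconds("pass"))
    print("os + optparse:    %.3f s" % startup_seconds("import os, optparse"))
```

The difference between the two lines is roughly what the imports alone cost. On Python 3.7 and later, running the interpreter with -X importtime gives a per-module breakdown.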

Edit:

I shouldn't have mentioned Git, as it is written in C. But the Mercurial part still stands, and no, .pyc files don't feel like a big improvement (to me at least).

Edit 2:

Although lazy imports are key to the speedup in Mercurial, the key to avoiding slowness in regular Python scripts is not using auto-generated scripts that go through pkg_resources, like:

from pkg_resources import load_entry_point

If you have manually written scripts that don't use pkg_resources, you should see at least a 2x speed increase.

However! Be warned that pkg_resources does provide a nice way of handling version dependencies, so be aware that not using it means you may run into version conflicts.
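A rough sketch of what a hand-written launcher could look like, assuming the tool exposes a main() function in some module. The helper load_main and the module name mytool.cli are hypothetical and only there for illustration.

```python
import importlib
import sys


def load_main(module_name, func_name="main"):
    """Import the module directly and return its entry function.

    Nothing here touches pkg_resources, so no installed-package metadata
    is scanned and no version requirements are checked.
    """
    module = importlib.import_module(module_name)
    return getattr(module, func_name)


# In a real script this would usually be written out directly, e.g.
# (with the hypothetical module "mytool.cli"):
#
#     from mytool.cli import main
#     sys.exit(main())
```

The trade-off is the one described above: the direct import is cheaper, but nothing verifies that the installed versions are the ones the tool expects.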


In addition to compiling the Python files, Mercurial modifies importing to be on demand which does indeed reduce the start-up time. It sets __builtin__.__import__ to its own import function in the demandimport module.

If you look at the hg script in /usr/lib/ (or wherever it is on your machine), you can see this for yourself in the following lines:

try:
    from mercurial import demandimport; demandimport.enable()
except ImportError:
    import sys
    sys.stderr.write("abort: couldn't find mercurial libraries in [%s]\n" %
                     ' '.join(sys.path))
    sys.stderr.write("(check your install and PYTHONPATH)\n")
    sys.exit(-1)

If you change the demandimport line to pass, you will find that the start-up time increases substantially. On my machine, it seems to roughly double.

I recommend studying demandimport.py to see how to apply a similar technique in your own projects.
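This is not Mercurial's implementation, but on Python 3 a similar effect can be sketched with importlib.util.LazyLoader from the standard library, which defers executing a module until one of its attributes is first used. The helper name lazy_import is my own.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs only on first attribute access."""
    if name in sys.modules:
        # Already imported elsewhere; nothing left to defer.
        return sys.modules[name]
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("no module named %r" % name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the module body
    # is executed later, when an attribute is looked up.
    loader.exec_module(module)
    return module


# The real import cost is paid here only if the function is actually called.
colorsys = lazy_import("colorsys")
```

Unlike demandimport, this does not replace the import function globally, so each module has to be opted in by hand.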

P.S. Git, as I'm sure you know, is written in C so I'm not surprised that it has a fast start-up time.


I am sorry, but it is certainly not the 0.08 seconds that is bothering you. Although you don't say it, it feels like you are running an "outer" shell (or other language) script that is calling several hundred Python scripts inside a loop. That is the only way these start-up times could make any difference. So either you are withholding this crucial information in your question, or your father is this guy.

So, assuming you have an external script that calls on the order of hundreds of Python processes: write this external script in Python, import whatever Python code you need in the same process, and run it from there. That way you cut out the interpreter start-up and module imports for each script execution.

That applies even to Mercurial, for example. You can import "mercurial" and the appropriate submodules and call functions inside it that perform the same actions as the equivalent command-line arguments.
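A minimal sketch of the idea, assuming the tool's logic is reachable as a function. Here tool_main is a hypothetical stand-in for a real tool's entry point, not anything from Mercurial.

```python
import argparse


def tool_main(argv):
    """Hypothetical stand-in for a command-line tool's entry point."""
    parser = argparse.ArgumentParser(prog="tool")
    parser.add_argument("name")
    args = parser.parse_args(argv)
    return "hello " + args.name


def run_batch(names):
    """Call the tool once per name inside a single Python process."""
    # One interpreter start-up and one round of imports serve every call,
    # instead of paying for both on each spawned process.
    return [tool_main([name]) for name in names]


if __name__ == "__main__":
    for line in run_batch(["alice", "bob"]):
        print(line)
```

With several hundred calls, the per-call cost drops to argument parsing plus the actual work.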
