Python PIL library performance

Developer https://www.devze.com 2023-02-20 04:31 Source: the web
The problem I've got is that I have a Python script that encodes thousands of images each time I run it, and it doesn't make full use of the available memory and CPU.

How could I improve the performance and avoid the I/O overhead?

The script generates 5000 thumbnails each time it is executed, and I was wondering whether it would be possible to keep the images in memory and then "flush" them to the hard disk to improve performance.

Do you have any advice for improving my script's performance?

A snippet of the code inside the loop:

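# Assumed imports at the top of the script (Python 2)
import StringIO
import urllib
from PIL import Image

# Inside the loop: download the image and decode it straight from memory
im = Image.open(StringIO.StringIO(urllib.urlopen(imagen_url).read()))
im.thumbnail((100, 50), Image.ANTIALIAS)  # resize in place, keeping the aspect ratio

# JPEG cannot store palette or alpha data, so convert to RGB first
if im.mode != "RGB":
    im = im.convert("RGB")

im.save(dir + (imagen % coche_id), "JPEG")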

Most of the time is spent in urlopen(), but I think im.save could be improved too.

I'm still new to Python, and I think you could help me improve my code.

Thanks a lot!

PS: Sorry if my English is not as good as it should be.


You could probably exploit some I/O concurrency by running several threads at once. This may help in two ways:

  • More TCP connections in flight usually mean higher total throughput (although being a good Internet citizen and not flooding the server matters too).
  • The program you currently have first reads the remote URL into memory, then processes it, then saves to disk. The CPU isn't fully used because part of the time you're waiting for data to arrive, and no processing happens during that wait.

In this case the GIL isn't a problem, since it's released during I/O operations.
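
A minimal sketch of the threaded approach, assuming Python 3 and its concurrent.futures module (the snippet in the question is Python 2, so this is not a drop-in change). The names run_in_threads, jobs and worker are hypothetical: worker stands for whatever function performs the download, thumbnail and save steps for one image.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_in_threads(jobs, worker, max_workers=8):
    """Run worker(job) for every job on a pool of threads.

    Returns (results, errors): two dicts keyed by job, so jobs must be hashable.
    """
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything up front; at most max_workers jobs run at once.
        futures = {pool.submit(worker, job): job for job in jobs}
        for future in as_completed(futures):
            job = futures[future]
            try:
                results[job] = future.result()
            except Exception as exc:
                # One failed download should not abort the whole batch.
                errors[job] = exc
    return results, errors
```

With this shape, each job could be a (coche_id, imagen_url) tuple and worker could wrap the body of the original loop. Keeping max_workers modest is one way to stay polite to the remote server.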

If you want to prevent the files from being written to disk immediately, one approach may be to turn off synchronous writes (fsync) on the device they're being written to, if that is currently enabled.
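
Another option, closer to what the question asks, is to encode each thumbnail into memory and write the whole batch later. The sketch below assumes Python 3; encode_to_bytes and flush_to_disk are hypothetical helper names, and im is expected to be a PIL image (any object with a compatible save method works).

```python
import io
import os


def encode_to_bytes(im, fmt="JPEG"):
    """Encode a PIL image into memory instead of writing a file."""
    buffer = io.BytesIO()
    # Image.save accepts a file-like object when the format is given explicitly.
    im.save(buffer, fmt)
    return buffer.getvalue()


def flush_to_disk(pending, directory):
    """Write every {filename: bytes} entry in pending to directory, then clear it."""
    os.makedirs(directory, exist_ok=True)
    for filename, data in pending.items():
        with open(os.path.join(directory, filename), "wb") as out:
            out.write(data)
    written = len(pending)
    pending.clear()
    return written
```

Inside the loop you would store pending[imagen % coche_id] = encode_to_bytes(im) and call flush_to_disk(pending, dir) once at the end. Since the operating system already buffers ordinary file writes, the gain may be small compared with the download time.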


"Most of the time is spent in urlopen(), but I think im.save could be improved too."

That is because the urlopen() call (or rather the read() that follows it) does I/O over the network, which will be slow depending on the bandwidth available to you and to the server.

So yes, there's not a lot you can do to speed that up if you want to download 5000 images over the net.
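
To check where the time actually goes before optimizing, the stages of the loop can be timed separately. This is a small sketch assuming Python 3; timed and totals are hypothetical names, not part of PIL or the original script.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per stage name.
totals = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the time spent inside the with-block to totals[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[stage] += time.perf_counter() - start
```

Wrapping the download in "with timed('download'):" and the save in "with timed('save'):" and printing totals after the loop would show how the two compare.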
