I'm working on an application, written mostly in PHP, that uses Amazon's MWS API. It lets Amazon sellers sign up and provide their Amazon seller credentials; the application then begins downloading that user's orders from Amazon and storing them in a MySQL database.
I've built the methods to sync data accurately to the database using a script with multiple functions, but I noticed that it takes way too long. The script simply loops through all users in the database and iterates over each user's orders one at a time. Right now, with only 5 test users, the time is doable, but I'm looking for a more scalable method. Think about 500 users all running synchronously, one at a time. WAY too long!
I'm pretty new to PHP, and especially to running asynchronous processes from it. The only way I've found to do this is to have a starter script that finds all users in the database, spawns a sync script for each user, and then releases it. I don't like this idea because if I had 500+ users, my nightly sync would consist of 500 spawned instances of this PHP script.
Has anyone done something similar to this before? If so, I'd love to hear how best to make this sync more efficient.
Since PHP cannot be multi-threaded [1], in practice you have only 2 choices (there are several methods to do this, but they all boil down to the following two categories):
- Have 1 process per user operation. As you say, with a large database, this could result in a lot of processes.
- Have a process that deals with more than one user. This will potentially take longer, since each process is effectively synchronous.
I think the best bet is to combine the two approaches, so you have multiple processes that deal with multiple users. So if you had 500 users, spawn 100 processes that deal with 5 users each, or 50 processes that deal with 10 users each.
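As a rough illustration of the combined approach, here is a minimal sketch that splits a list of user IDs into batches and builds one background command per batch. The function name `buildWorkerCommands`, the worker script `sync_users.php`, and the `--launch` flag are all hypothetical, and the shell syntax (`&`, `/dev/null`) assumes a Unix-like host.

```php
<?php
// Sketch only: the function name, worker script name, and worker count are
// hypothetical and not taken from the original application.

function buildWorkerCommands(array $userIds, int $workerCount, string $workerScript): array
{
    if ($workerCount < 1) {
        throw new InvalidArgumentException('workerCount must be at least 1');
    }
    if ($userIds === []) {
        return [];
    }

    // Round up so that no more than $workerCount batches are produced.
    $batchSize = (int) ceil(count($userIds) / $workerCount);

    $commands = [];
    foreach (array_chunk($userIds, $batchSize) as $batch) {
        $args = implode(' ', array_map('escapeshellarg', array_map('strval', $batch)));
        // Redirecting output and appending "&" lets the shell return immediately.
        $commands[] = 'php ' . escapeshellarg($workerScript) . ' ' . $args . ' > /dev/null 2>&1 &';
    }
    return $commands;
}

// 500 users split across 50 workers gives 10 user IDs per command.
$commands = buildWorkerCommands(range(1, 500), 50, 'sync_users.php');

// Launch only when explicitly requested, e.g. `php starter.php --launch`.
if (in_array('--launch', $_SERVER['argv'] ?? [], true)) {
    foreach ($commands as $command) {
        exec($command);
    }
}
```

The worker script would read its user IDs from `$argv` and sync them one after another, so the worker count becomes the knob that trades total run time against the number of simultaneous processes.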
Alternatively, it might be worth writing a program in a language that is better suited to the task - something that supports multi-threading, like Java or Perl - that you can start from PHP if required.
[1] Edit 03/2013: PHP can now be multi-threaded, but it's not recommended for production use because the pthreads extension is still highly unstable. It is also not recommended for the average user; only use it if you really know what you are doing.
I agree with DaveRandom. Rather than rolling your own, check out HttpRequestPool for the implementation: http://www.php.net/manual/en/class.httprequestpool.php
I ended up using passthru and passing the script to the background. Things seem to be working well so far.
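For anyone curious what that might look like, here is a minimal sketch, not the exact code used. The script name `sync_user.php`, the user ID, and both function names are hypothetical, and the shell syntax assumes a Unix-like host. Output generally has to be redirected somewhere, otherwise PHP keeps waiting for the child process to finish.

```php
<?php
// Sketch only: script name, user ID, and function names are hypothetical.

function backgroundCommand(string $script, int $userId): string
{
    // "> /dev/null 2>&1" discards output; the trailing "&" backgrounds the job.
    return sprintf('php %s %d > /dev/null 2>&1 &', escapeshellarg($script), $userId);
}

function startSyncInBackground(string $script, int $userId): void
{
    // Returns almost immediately because the shell does not wait for the job.
    passthru(backgroundCommand($script, $userId));
}

$script = 'sync_user.php';

// Only launch when the (hypothetical) worker script is actually present.
if (file_exists($script)) {
    startSyncInBackground($script, 42);
}
```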