mostly I find the answers on my q开发者_高级运维uestions on google, but now i'm stuck. I'm working on a scraper script, which first scrapes some usernames of a website, then gets every single details of the user. there are two scrapers involved, the first goes through the main page, gets the first name, then gets the details of it's profile page, then it goes forward to the next page... the first site I'm scraping has a total of 64 names, displayed on one main page, while the second one, has 4 pages with over 365 names displayed.
the first one works great, however the second one keeps getting me the 500 internal error. I've tried to limit the script, to scrape only a few names, which works like charm, so I'm more then sure that the script itself is ok! the max_execution_time in my php ini file is set to 1500, so I guess that's not the problem either, however there is something causing the error... not sure if adding a sleep command after every 10 names for example will solve my situation, but well, i'm trying that now!
so if any of you have any idea what would help solve this situation, i would appreciate your help!
thanks in advance, z
support said i can higher the memory upto 4gigabytes
Typical money gouging support answer. Save your cash & write better code because what you are doing could easily be run from the shared server of a free web hosting provider even with their draconian resource limits.
Get/update the list of users first as one job then extract the details in smaller batches as another. Use the SQL BULK Insert command to reduce connections to the database. It also runs much faster than looping through individual INSERTS.
Usernames and details is essentially a static list, so there is no rush to get all the data in realtime. Just nibble away with a cronjob fetching the details and eventually the script will catch up with new usernames being added to the incoming list and you end up with a faster,leaner more efficient system.
This is definitely a memory issue. One of your variables is growing past the memory limit you have defined in php.ini. If you do need to store a huge amount of data, I'd recommend writing your results to a file and/or DB at regular intervals (and then free up your vars) instead of storing them all in memory at run time.
- get user details
- dump to file
- clear vars
- repeat..
If you set your execution time to infinity and regularly dump the vars to file/db your php script should run fine for hours.
精彩评论