开发者

Fetching multiple URLs and updating DB with PHP script

开发者 https://www.devze.com 2023-01-25 12:14 出处:网络
I have a website that uses MySQL. I am using a table named \"People\" that each row represents, obviously, a person. When a user enters a page I would like to introduce news related to that person (al

I have a website that uses MySQL. I am using a table named "People" that each row represents, obviously, a person. When a user enters a page I would like to introduce news related to that person (along with the informati开发者_开发百科on from the MySQL table). For that purpose, I decided to use BING News Source API.

The problem with the method of calling the BING API for each page load is that I am increasing the load time of my page (round tip to BING servers). Therefore, I have decided to pre-fetch all the news and save them in my table under a coloumn named "News".

Since my table contains 5,000+ people, running a PHP script to download all news for every person and update the table at once results a Fatal error: Maximum execution time (I would not like to disable the timeout, since it is a good security measure).

What will be a good and efficient way to run such a script? I know I can run a cron job every 5 minutes that will update only a portion of rows everytime - but even in that case - what will be the best way to save the current offset? Should i save the offset in MySQL, or as a server var?


  • use cronjob for complex job
  • you should increase the timeout if you plan to run as cronjob (you are pulling things from other site, not for public)
  • consider create a master script (triggered by the cronjob) and this master script will spawn multiple sub-scripts (with certain control), so that you can pull the data from BING News Source (with this you can multi download the 5000+ profiles) without have to download one-by-one sequentially (think batch processing)

Update

Cron is a time-based job scheduler in Unix-like computer operating systems. The name cron comes from the word "chronos", Greek for "time". Cron enables users to schedule jobs (commands or shell scripts) to run periodically at certain times or dates. It is commonly used to automate system maintenance or administration, though its general-purpose nature means that it can be used for other purposes, such as connecting to the Internet and downloading email

Cron - on Wiki


Why not load the news section of the page via AJAX? This would mean that the rest of the page would load quickly, and the delay created from waiting for BING would only affect the news section, which you could allocate a loading placeholder to.

Storing the news in the DB doesnt sound like as very efficient/practical solution, the ongoing management of the records alone would potentially cause a headache in future.

0

精彩评论

暂无评论...
验证码 换一张
取 消