Importing / scraping page content from other sites?

Developer https://www.devze.com 2023-03-04 06:39 Source: Internet
I've been playing with PHP, and also with http://www.alchemyapi.com/ and embed.ly, but I was wondering whether there are other options out there to import and parse a web page, any page, whether it is a news site or a blog...

thanks


To fetch the data: curl or file_get_contents (there may be others, but those are the two common ones).
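As a rough sketch of the fetch step with cURL (the helper name `fetch_page`, the timeout, and the user-agent string are placeholders I made up for illustration, not anything from the question):

```php
<?php
// Hypothetical helper: fetch a URL with cURL.
// Returns the response body, or null on a transport error or an HTTP status >= 400.
function fetch_page(string $url, int $timeoutSeconds = 10): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,                 // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,                 // follow redirects
        CURLOPT_MAXREDIRS      => 5,
        CURLOPT_CONNECTTIMEOUT => $timeoutSeconds,
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'ExampleFetcher/0.1', // placeholder value
    ]);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status >= 400) {
        return null;
    }
    return $body;
}
```

For simple cases, `file_get_contents($url)` does the same job in one call, provided `allow_url_fopen` is enabled on the server.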

To parse the data in PHP: DOM, SimpleXML, preg_match**
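A minimal sketch of the DOM route, assuming you already have the HTML as a string and want the page title and link targets (the function name `extract_title_and_links` is just for illustration):

```php
<?php
// Hypothetical helper: pull the <title> text and every href out of an HTML string.
// Note: DOMDocument::loadHTML() rejects an empty string on PHP 8, so pass real markup.
function extract_title_and_links(string $html): array
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid; collect parser warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    $titleNode = $xpath->query('//title')->item(0);
    $title = $titleNode ? trim($titleNode->textContent) : null;

    $links = [];
    foreach ($xpath->query('//a[@href]') as $anchor) {
        $links[] = $anchor->getAttribute('href');
    }

    return ['title' => $title, 'links' => $links];
}
```

XPath queries like these tend to survive small markup changes better than a regular expression written against the raw HTML.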

Since the question was tagged PHP, I have only given information that works for PHP. There are tons of ways to do this; if you can narrow your question down to what you are trying to do, it would help. The better way to parse any site is through its RSS feed, if it has one, or through its API, assuming it offers the content you want via RSS or an API.
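If the site does offer a feed, something along these lines should work with SimpleXML. This is only a sketch that assumes a plain RSS 2.0 layout (`channel` > `item` with `title` and `link`); the function name `parse_rss_items` is made up, and Atom feeds would need different element names:

```php
<?php
// Hypothetical helper: turn an RSS 2.0 document into a list of title/link pairs.
// Returns an empty array if the XML cannot be parsed or has no <channel>.
function parse_rss_items(string $xml): array
{
    $previous = libxml_use_internal_errors(true); // keep parse errors quiet
    $feed = simplexml_load_string($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($feed === false || !isset($feed->channel)) {
        return [];
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title' => (string) $item->title,
            'link'  => (string) $item->link,
        ];
    }
    return $items;
}
```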


** preg_match is not a great alternative. It does "work", but it is better to use the DOM / SimpleXML functions if possible.


I wrote a crawler at work using cURL and preg_match.

Before I chose to do it that way, I had looked at the DOM parsers: http://php.net/manual/en/book.dom.php
