Any HTML/CSS parsing library for Ruby & PHP?

Developer https://www.devze.com 2022-12-18 04:14 Source: Internet

I am about to finish my script that parses/scrapes a website using Mechanize and Ruby.

I need to port my script to PHP in the future.

My question is:

  • is there any library available for both Ruby and PHP, or
  • can anybody recommend another approach to this?


There's no direct PHP equivalent of Ruby's Mechanize.

However, Zend Framework offers some great scraping-related components, including

  • Zend_Uri and Zend_Http_Client
  • Zend_Dom (its Zend_Dom_Query class queries a document with CSS selectors or XPath)


PHP ships with several XML parsing extensions as standard, and the DOM extension can cope with a lot of badly formed HTML.

See

http://uk3.php.net/manual/en/refs.xml.php

C.
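
To illustrate the point about the DOM extension tolerating badly formed HTML, here is a minimal sketch, not the answerer's code. The markup string is made up for the example; the calls used (libxml_use_internal_errors, DOMDocument::loadHTML, getElementsByTagName) are part of PHP's standard libxml/DOM extensions.

```php
<?php
// Deliberately sloppy markup: the <li> tags are never closed and
// there is no <html> or <body> wrapper.
$html = '<ul><li>Ruby<li>PHP</ul>';

// Collect libxml parse messages internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);

// Discard any collected messages and restore the previous error mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

// The parser repairs the tree, so each list item is its own element.
$items = [];
foreach ($doc->getElementsByTagName('li') as $li) {
    $items[] = trim($li->textContent);
}

print_r($items);
```

The parser closes each open list item when the next one starts, so the loop sees two separate li elements rather than one nested mess.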


For DOM manipulation in PHP, use the DOMDocument class.

Simple and easy :)
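
As a rough sketch of what DOM manipulation with DOMDocument can look like in a scraper, the snippet below collects the links on a page and rewrites relative ones to absolute URLs. The sample HTML and the base URL https://example.com are assumptions made up for the example; DOMXPath, getAttribute, setAttribute and saveHTML are standard DOM extension methods.

```php
<?php
// Hypothetical page content and base URL, used only for illustration.
$html = '<html><body><a href="/docs">Docs</a> <a href="/blog">Blog</a></body></html>';
$base = 'https://example.com';

$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

// XPath selects every anchor that carries an href attribute.
$xpath = new DOMXPath($doc);
$links = [];
foreach ($xpath->query('//a[@href]') as $a) {
    $href = $a->getAttribute('href');
    // Prefix site-relative links with the base URL.
    if (strpos($href, '/') === 0) {
        $a->setAttribute('href', $base . $href);
    }
    // Map link text to the (possibly rewritten) target.
    $links[$a->textContent] = $a->getAttribute('href');
}

print_r($links);

// Serialize the modified document back to HTML.
echo $doc->saveHTML();
```

The same DOMXPath queries cover most of what a Mechanize script does with page.search, though fetching pages, cookies and form submission need a separate HTTP client.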


Another DOM manipulation tool for PHP is phpQuery.
