Parse HTML Offline

Developer https://www.devze.com 2022-12-29 16:12 Source: Internet
Are there any HTML parsers that parse HTML docs offline, i.e. stored on your computer? If so, can anyone name some good ones please?

UPDATE: Hah, never mind, found the answer. Would anyone be able to provide an example of this using Jericho HTML Parser?

UPDATE2: I thought I had found the answer but I was wrong, mistook InputStream for FileInputStream :(


Here's a few you could look at:

  • For Python: BeautifulSoup
  • For .NET: HTML Agility Pack
  • For Java: TagSoup


How about HTML Parser?


Nutch has an HTML parser as a subcomponent. Javadoc here.
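

On the InputStream / FileInputStream mix-up from the second update: FileInputStream is a subclass of InputStream, so a file on disk can be handed to any method that accepts an InputStream. Below is a minimal sketch using only the JDK. The class name OfflineHtml and the helpers readAll and readLocalFile are made up for illustration, and readAll merely stands in for a parser entry point. Jericho's Source class has, as far as I recall, a constructor that takes an InputStream, but check the Javadoc for the version you use before relying on that.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class OfflineHtml {
    // Hypothetical stand-in for a parser entry point: accepts any InputStream.
    static String readAll(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    // Opens a local file and passes it along as a plain InputStream.
    // This compiles because FileInputStream extends InputStream.
    static String readLocalFile(Path path) throws IOException {
        try (InputStream in = new FileInputStream(path.toFile())) {
            return readAll(in);
        }
    }

    public static void main(String[] args) throws IOException {
        // Create a small HTML file on disk so the example needs no network access.
        Path page = Files.createTempFile("page", ".html");
        Files.writeString(page, "<html><body><a href=\"a.html\">A</a></body></html>");
        System.out.println(readLocalFile(page));
        Files.delete(page);
    }
}
```

With a real parser you would replace the call to readAll with whatever the library exposes for InputStream input; the file-opening part stays the same.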
