Getting started with a parser in Java code

Developers (https://www.devze.com), 2022-12-27 16:22. Source: the web

I am new to parsers. I would like to fetch specific data from a website, and I understand that I need a parser for that. How do I get started with parsers? What do I need to download? What would the Java code look like to fetch data from a website using a parser?


My advice would be to use an open source HTML parser such as HtmlCleaner - http://htmlcleaner.sourceforge.net/

You can use HtmlCleaner (or a similar library) to build a DOM representation of the web page, and then query that DOM to extract whatever information you want from the page.

The process looks something like this:

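import java.net.URL;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

URL url = new URL("website you want to load");
HtmlCleaner cleaner = new HtmlCleaner();
TagNode htmlNode = cleaner.clean(url.openStream()); // parse the page into a tree of TagNode objects
// perform queries on the DOM to extract information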
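
To make the "extract information" step more concrete, here is a minimal, self-contained sketch that collects the href of every link in a page. Note that it deliberately swaps in the HTML parser that ships with the JDK (javax.swing.text.html) instead of HtmlCleaner, so it runs without downloading anything. The class name LinkExtractor, the method extractLinks, and the sample HTML are made up for illustration, and the page is read from a String rather than a live URL.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper class, not part of HtmlCleaner or the JDK.
public class LinkExtractor {

    // Returns the href attribute of every <a> tag found in the given HTML.
    public static List<String> extractLinks(Reader html) throws IOException {
        List<String> links = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) { // anchors without an href are skipped
                        links.add(href.toString());
                    }
                }
            }
        };
        // The last argument tells the parser to ignore any charset declared
        // in the document, since the Reader has already decoded the text.
        new ParserDelegator().parse(html, callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        // Sample markup standing in for a downloaded page (an assumption).
        String page = "<html><body><a href=\"https://example.com/a\">A</a>"
                + "<p>text</p><a href=\"/b\">B</a></body></html>";
        for (String link : extractLinks(new StringReader(page))) {
            System.out.println(link);
        }
    }
}
```

For a real page you could pass new InputStreamReader(url.openStream()) in place of the StringReader. With HtmlCleaner the idea is the same, except that you would query the TagNode tree returned by clean() (for example with getElementsByName) rather than receive callbacks.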