
HTML type string parsing question!

Developer https://www.devze.com 2023-02-07 15:18 Source: web
<a href="http://www.google.com/map" class="more-link">look at the Google map</a> 

Is there any parser to get the link (www.google.com/map) from the <a> tag?

Or is the best way just to write a custom one?


jQuery, for instance:

var href = $('a.more-link').attr('href');


There are many third-party solutions, but I am not sure which exist for Java; maybe HTML Agility Pack exists in a version for Java.

Another solution would be to use a regex:

/<a\s+[^<]*?href\s*=\s*(?:(['"])(.+?)\1.*?|(.+?))>/

Fixed the regex to handle problems suggested in comments.
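
Since the question is about Java, here is a minimal sketch of how that regex could be used with java.util.regex. The class name HrefRegex and the method extractHref are made up for illustration. The pattern is the one above, only escaped for a Java string literal. Group 2 holds a quoted href value and group 3 an unquoted one.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class HrefRegex {
    // Same pattern as in the answer, escaped for a Java string literal.
    private static final Pattern A_HREF = Pattern.compile(
            "<a\\s+[^<]*?href\\s*=\\s*(?:(['\"])(.+?)\\1.*?|(.+?))>");

    // Returns the href of the first <a> tag found, if any.
    static Optional<String> extractHref(String html) {
        Matcher m = A_HREF.matcher(html);
        if (!m.find()) {
            return Optional.empty();
        }
        // Group 2 is set for quoted values, group 3 for unquoted ones.
        String quoted = m.group(2);
        return Optional.of(quoted != null ? quoted : m.group(3));
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.google.com/map\" class=\"more-link\">"
                + "look at the Google map</a>";
        // Prints http://www.google.com/map
        System.out.println(extractHref(html).orElse("(no link found)"));
    }
}
```

Note that for an unquoted href followed by other attributes, group 3 captures everything up to the closing >, so this is only reliable for simple, well-formed tags like the one in the question.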

I looked up some real HTML parsers for Java, in case you find you need more than the regex approach:

http://htmlparser.sourceforge.net/

http://jericho.htmlparser.net/docs/index.html

http://jsoup.org/
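
If you would rather not add a dependency, a rough sketch with the HTML parser that ships with the JDK (javax.swing.text.html) might look like this. It does not use any of the three libraries listed above. The class name AnchorHrefParser and the method hrefs are made up for illustration, and the Swing parser is lenient and dated, so treat this as a starting point only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical helper, not part of any library.
class AnchorHrefParser {
    // Collects the href attribute of every <a> start tag in the input.
    static List<String> hrefs(String html) {
        List<String> out = new ArrayList<>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        out.add(href.toString());
                    }
                }
            }
        };
        try {
            // The last argument tells the parser to ignore any charset declaration.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String html = "<a href=\"http://www.google.com/map\" class=\"more-link\">"
                + "look at the Google map</a>";
        System.out.println(hrefs(html));
    }
}
```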
