According to this answer:
HTML 4.01 specifies that <a> elements may only contain inline elements. A <div> is a block element, so it may not appear inside an <a>.
But...
HTML5 allows <a> elements to contain blocks.
Well, I just tried selecting a <div class="m"> within an <a> block, using:
Elements elems = a.select("m");
and elems comes back empty, despite the div being there.
So I am thinking: Either I am not using the correct syntax for selecting a div within an a or... Jsoup doesn't support this HTML5-only feature?
What is the right Jsoup syntax for selecting a div within an a?
Update: I just tried
Elements elems = a.getElementsByClass("m");
And Jsoup had no problems with it (i.e. it returns the correct number of such divs within a).
So my question now is: Why?
Why does a.getElementsByClass("m") work whereas a.select("m") doesn't?
Update: I just tried, per @Delan Azabani's suggestion:
Elements elems = a.select(".m");
and it worked. So basically a.select() works, but I was missing the . in front of the class name.
The select function takes a CSS selector. If you pass 'm' as the argument, it is treated as a tag name, so it will try to find <m> elements that are descendants of the a element. You need to pass '.m' as the argument, which will find elements with the m class under the a element.
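To make the difference concrete, here is a minimal sketch comparing the three calls from the question against the same markup. The class name SelectorComparison and the helpers countInAnchor and countByClassInAnchor are hypothetical names used only for illustration, and the sketch assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorComparison {

    // Hypothetical helper: parses the HTML, takes the first <a>,
    // and counts the elements matched by a CSS selector below it.
    static int countInAnchor(String html, String selector) {
        Document doc = Jsoup.parse(html);
        Element a = doc.select("a").first();
        return a.select(selector).size();
    }

    // Hypothetical helper: the same lookup, but by class name
    // (no leading dot, because the argument is not a selector).
    static int countByClassInAnchor(String html, String className) {
        Document doc = Jsoup.parse(html);
        Element a = doc.select("a").first();
        return a.getElementsByClass(className).size();
    }

    public static void main(String[] args) {
        String html = "<a href='./'><div class=m>Check</div></a>";

        // "m" is a type selector: it looks for <m> tags, and there are none.
        System.out.println("select(\"m\"):  " + countInAnchor(html, "m"));

        // ".m" is a class selector: it matches the div with class="m".
        System.out.println("select(\".m\"): " + countInAnchor(html, ".m"));

        // getElementsByClass takes the bare class name.
        System.out.println("getElementsByClass(\"m\"): " + countByClassInAnchor(html, "m"));
    }
}
```

With the markup above, the first call should find nothing and the other two should each find the single div, which matches the behaviour described in the question.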
The current version of jsoup (1.5.2) does support div tags nested within a tags.
In situations like this I suggest printing out the parse tree, to confirm that jsoup has parsed the HTML as you expect or, if it hasn't, to work out the correct selector to use.
E.g.:
Document doc = Jsoup.parse("<a href='./'><div class=m>Check</div></a>");
System.out.println("Parse tree:\n" + doc);
Elements divs = doc.select("a .m");
System.out.println("\nDiv in A:\n" + divs);
Gives:
Parse tree:
<html>
<head></head>
<body>
<a href="./">
<div class="m">
Check
</div></a>
</body>
</html>
Div in A:
<div class="m">
Check
</div>