
Jsoup baseUri gone after select

Developer https://www.devze.com 2023-03-22 06:20 Source: web

I just discovered that setting the baseUri is necessary for each Element you get by doing a select. It would be a lot better if the baseUri of the Document is applied to each Element.

Document d = Jsoup.parse(myString);
d.setBaseUri("http://www.google.de");

If I execute

Element e = d.select(....).get(0);

The baseUri of e is empty.

Is this a bug or is it intended?


The base URI is stored per element, because there are cases in HTML where it can change partway through the parse. Currently, setting it on the document after parsing does not propagate it down to the child nodes.

Just specify it when you parse the HTML string, e.g.:

Document doc = Jsoup.parse(myString, "http://www.google.de");

If you fetch the HTML from a URL and parse that (with Jsoup.connect), the base URI is automatically set.
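
To see why an element with an empty base URI is a problem, here is a rough sketch using only java.net.URI rather than jsoup itself. The class BaseUriSketch and its resolve helper are made up for illustration and are not part of jsoup. Returning an empty string when no absolute URL can be built is a choice made here to resemble what jsoup's absUrl does when it cannot build a URL.

```java
import java.net.URI;

public class BaseUriSketch {

    // Hypothetical helper (not a jsoup API): resolve an href against a base URI.
    // Returns "" when the result is not an absolute URL, e.g. when the base is empty.
    static String resolve(String baseUri, String href) {
        URI resolved = URI.create(baseUri).resolve(href);
        return resolved.isAbsolute() ? resolved.toString() : "";
    }

    public static void main(String[] args) {
        // With a base URI, relative links can be turned into absolute ones.
        System.out.println(resolve("http://www.google.de/", "/search"));
        System.out.println(resolve("http://www.google.de/", "images/logo.png"));

        // With an empty base URI there is nothing to resolve against.
        System.out.println(resolve("", "/search").isEmpty());
    }
}
```

In jsoup terms, this is roughly the situation of an element whose baseUri is empty: a relative href on it has nothing to be resolved against, which is why passing the base URI to Jsoup.parse up front matters.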
