I find it very difficult to create new HTML content on the fly with HtmlUnit, the way we can with jQuery.
For example given a text node:
I am text
I want to change that text node into the following (any word longer than 3 characters is wrapped in a span):
I am <span>text</span>
After this I want to replace the original text node (I am text) with
I am <span>text</span>
in the html document wherever it occurred.
So how can I achieve this using HtmlUnit? Is there a better alternative to HtmlUnit for Java applications that do screen scraping or modify the DOM on the fly?
In HtmlUnit I could not even find out how to construct a new element, as the constructors are mostly missing or declared protected.
It's not clear what you want to do exactly, but HtmlUnit is a programmatic browser. Its API allows doing in Java what a user would do with his keyboard and mouse in a standard browser. And modifying the DOM of a web page is not what a user does with his browser.
Its API allows accessing the DOM tree anyway (though not via the W3C DOM interfaces), and you should thus be able to do in Java what you would do in JavaScript with the DOM. HtmlElement instances can be created through the createElement method of HtmlPage. But of course, there is no "jQuery in Java for HtmlUnit".
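To make the "do in Java what you would do in JavaScript with the DOM" idea concrete, here is a minimal sketch of the transformation from the question. It deliberately uses the JDK's built-in org.w3c.dom classes rather than HtmlUnit, so that it runs without extra dependencies. The class name SpanWrapper and its methods are hypothetical, and a "word" is assumed to be any run of non-whitespace characters (so trailing punctuation counts toward the length). With HtmlUnit the same steps should apply: create the span via HtmlPage's createElement, insert the new nodes next to the text node, and remove the original. Check the exact method names against the DomNode Javadoc of your version.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Hypothetical helper; uses the JDK's W3C DOM, not HtmlUnit's own classes.
final class SpanWrapper {

    /** Replaces one text node with text nodes plus <span> elements for long words. */
    static void wrapLongWords(Text textNode, int minLength) {
        Document doc = textNode.getOwnerDocument();
        Node parent = textNode.getParentNode();
        // Split at every boundary between whitespace and non-whitespace,
        // so that whitespace runs survive as separate tokens.
        String[] tokens = textNode.getData().split("(?<=\\s)(?=\\S)|(?<=\\S)(?=\\s)");
        for (String token : tokens) {
            Node replacement;
            if (!token.trim().isEmpty() && token.length() > minLength) {
                Element span = doc.createElement("span");
                span.setTextContent(token);
                replacement = span;
            } else {
                replacement = doc.createTextNode(token);
            }
            parent.insertBefore(replacement, textNode);
        }
        parent.removeChild(textNode);
        parent.normalize(); // merge the adjacent text nodes created above
    }

    /** Collects text nodes first, so the tree is not modified while it is walked. */
    static void collectTextNodes(Node node, List<Text> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.add((Text) node);
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            collectTextNodes(children.item(i), out);
        }
    }

    /** Parses well-formed markup and wraps long words in every text node. */
    static Document wrapAll(String xml, int minLength) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        List<Text> textNodes = new ArrayList<>();
        collectTextNodes(doc.getDocumentElement(), textNodes);
        for (Text t : textNodes) {
            wrapLongWords(t, minLength);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = wrapAll("<p>I am text</p>", 3);
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        System.out.println(out);
    }
}
```

Collecting the text nodes before touching the tree avoids re-processing the text inside freshly created spans, which is the usual pitfall with this kind of in-place rewrite.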
HtmlUnit also lets you run JavaScript in the context of a page, for example:
String query = "/* your JavaScript here */";
HtmlPage page = webClient.getPage(url);
ScriptResult sr = page.executeJavaScript(query);
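// Assumption: ScriptResult.getNewPage() as found in HtmlUnit 2.x; check the ScriptResult API of your version.
HtmlPage newPage = (HtmlPage) sr.getNewPage();
newPage is the page as it stands after your script has run. If the script does not navigate away, this is normally the same page, with its DOM modified in place.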
HtmlUnit lets you interact with a page via Java roughly the same way a human would interact with the page via a browser.
How would you modify the DOM in a browser?
You don't, not directly: instead you click or type to trigger JavaScript in the page, which in turn modifies the DOM. Likewise, with HtmlUnit your Java code triggers JavaScript in the page, which in turn modifies the DOM.