Do you know of a good lightweight Java library for producing a safe, well-formed HTML representation of user input? It seems like a very common task. For example: a user leaves a comment on a blog, and my task is to convert that comment into safe, nicely formatted HTML content.
Use the jsoup HTML Cleaner with a configuration specified by a Whitelist (newer jsoup releases rename this class to Safelist).
String unsafe =
"<p><a href='http://example.com/' onclick='stealCookies()'>Link</a></p>";
String safe = Jsoup.clean(unsafe, Whitelist.basic());
// now: <p><a href="http://example.com/" rel="nofollow">Link</a></p>
Excerpt from the Jsoup Cookbook.
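If the comment should not be allowed to contain any markup at all, plain escaping may be enough and no library is needed. The following is only a minimal sketch of that alternative, not part of jsoup or the Cookbook: the class name CommentEscaper and the method toSafeHtml are hypothetical, and it assumes the result is placed in HTML element content or a quoted attribute value, not inside a script or style block.

```java
// Hypothetical helper: escapes a plain-text comment for embedding in HTML.
// This is plain escaping, not sanitizing; every tag the user typed is shown literally.
final class CommentEscaper {
    private CommentEscaper() {}

    static String toSafeHtml(String comment) {
        StringBuilder out = new StringBuilder(comment.length() + 16);
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                case '\n': out.append("<br>");   break; // keep the user's line breaks
                case '\r': break;                       // dropped, so CRLF yields a single <br>
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String unsafe = "<script>alert('x')</script>\nNice post & thanks!";
        // prints: &lt;script&gt;alert(&#39;x&#39;)&lt;/script&gt;<br>Nice post &amp; thanks!
        System.out.println(toSafeHtml(unsafe));
    }
}
```

Escaping like this suits comments that are meant to be plain text. When users should be able to keep some formatting such as links or emphasis, an allow-list cleaner like the jsoup one above is the better fit.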
HTML Parser is a Java library used to parse HTML in either a linear or nested fashion. Primarily used for transformation or extraction, it features filters, visitors, custom tags, and easy-to-use JavaBeans.
HTML Parser
Open Source HTML Parsers in Java