
How can I transform an HTML fragment into XHTML using Groovy?

https://www.devze.com 2023-03-18 00:47 Source: web
I have an input String containing an HTML fragment like the following example:

I would have enever thought that <b>those infamous tags</b>, 
born in the <abbr title="Don't like that acronym">SGML</abbr> realm,
would make their way into the web of objects that we now experience.

Obviously, the real input is far more complex (including links, images, divs, and so on), and I would like to write a method with the following prototype:

String toXHTML(String html) {
     // What do I have to write here ?
}


Without a description of the input format, I assume the input is loosely HTML-like markup. Parsing such a mess by hand gets ugly quickly, but it looks like someone else has already done a good job of it:

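#!/usr/bin/env groovy
// Fetch JTidy (a Java port of HTML Tidy) at run time.
@Grapes(
    @Grab(group='jtidy', module='jtidy', version='4aug2000r7-dev')
)
import org.w3c.tidy.*
def tidy = new Tidy()
// Ask for XHTML output; by default Tidy emits cleaned-up HTML.
tidy.setXHTML(true)
// Read HTML from standard input and write the result to standard output.
tidy.parse(System.in, System.out)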

Use the force, Riduidel.
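
To fit the toXHTML(String html) prototype from the question, the same idea can be wrapped in a method that works on Strings instead of stdin/stdout. This is only a minimal sketch under a few assumptions that are not in the answer above: it uses the newer net.sf.jtidy:jtidy:r938 artifact (for setPrintBodyOnly, so that only the fragment is returned rather than a full document with head and body), and it assumes the input is UTF-8.

```groovy
// Assumption: the r938 build of JTidy from Maven Central, not the
// 4aug2000r7-dev build used in the script above.
@Grab(group='net.sf.jtidy', module='jtidy', version='r938')
import org.w3c.tidy.Tidy

String toXHTML(String html) {
    def tidy = new Tidy()
    tidy.setXHTML(true)           // emit XHTML instead of plain HTML
    tidy.setPrintBodyOnly(true)   // return only the body content (the fragment)
    tidy.setQuiet(true)           // suppress the summary banner
    tidy.setShowWarnings(false)   // suppress per-tag warnings
    tidy.setInputEncoding('UTF-8')   // assumption: input is UTF-8
    tidy.setOutputEncoding('UTF-8')

    def out = new ByteArrayOutputStream()
    tidy.parse(new ByteArrayInputStream(html.getBytes('UTF-8')), out)
    return out.toString('UTF-8')
}

// Example call with an unclosed <p> and a non-XHTML <br>.
println toXHTML('<p>Hello<br>world')
```

Note that Tidy may also re-wrap long lines and re-indent the markup, so the result is equivalent XHTML rather than a character-for-character copy of the input with the tags closed.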


Check this out: http://blog.foosion.org/2008/06/09/parse-html-the-groovy-way/ It might be what you are looking for.
