My XML is malformed for the <img> tag. Specifically, I want every <img> tag that does not end with /> to be corrected. How do I match such a pattern and use replaceAll to do that?
Pattern r = "<img.*?[^/]>" // sth like that?
You forgot a semicolon :)
No, seriously: use an (X)HTML parser/cleanup API that can convert tag soup (HTML) to XHTML. Among others, JTidy can do that in a single call:
new Tidy().parseDOM(inputStream, outputStream);
Regex is simply not well suited for this job.
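That said, if pulling in a parser is not an option and the input is known to be simple, the narrow case from the question can be approximated with replaceAll. The following is only a sketch under assumptions: the class and method names (ImgTagFixer, closeImgTags) are made up for illustration, and it assumes that no attribute value contains a literal > character and that the tags are not inside comments or CDATA sections. Any of those would break it, which is exactly why a real parser is the safer choice.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
class ImgTagFixer {
    // <img, a word boundary, then the attributes (group 1), optional trailing
    // whitespace, and a closing '>' that is NOT directly preceded by '/'.
    // Assumption: attribute values never contain '>'.
    private static final Pattern UNCLOSED_IMG =
            Pattern.compile("<img\\b([^>]*?)\\s*(?<!/)>", Pattern.CASE_INSENSITIVE);

    static String closeImgTags(String xml) {
        // $1 re-inserts the captured attributes; the tag name is written back
        // in lowercase and the tag is closed with " />".
        return UNCLOSED_IMG.matcher(xml).replaceAll("<img$1 />");
    }

    public static void main(String[] args) {
        String input = "<p><img src=\"a.png\"><img src=\"b.png\" /></p>";
        // prints: <p><img src="a.png" /><img src="b.png" /></p>
        System.out.println(closeImgTags(input));
    }
}
```

The negative lookbehind (?<!/) is what leaves already self-closed tags untouched; the pattern from the question, [^/]>, would instead consume the last attribute character and also match across tag boundaries because of the unrestricted .*?.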