
Regex subbing out 'section character' in Java

Developer https://www.devze.com 2023-01-12 02:15 Source: Internet
I'm running a series of regex substitutions (i.e. String.replaceAll calls) to convert all the special characters in a text file to XML-parseable special characters. For example:

string_out = string_out.replaceAll("&", "&amp;");

I've hit a stumbling block replacing the 'section character', that is, this little squiggle: §

For starters, I'm doing my editing in vi, so I can't even paste the character in there, since it isn't a member of standard or extended ASCII. I can't see specifying it by hex code in the regex working either, for the same reason.

How would you specify this character for a regex substitute? Or if you just want to drop in and tell me there's already a function tucked away somewhere to do the character conversion I'm doing by hand, that's cool, too.


Unicode: U+00A7
Hex:     0xA7
html:    &sect;
name:    section sign

You can find it in the Latin-1 Supplement block.
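
One caveat about the values above: `&sect;` is an HTML named entity, and XML predefines only `&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;`, so a numeric character reference such as `&#xA7;` is the safer target for XML output. As a rough sketch of the "function to do the conversion" the question asks about, something along these lines could replace the chain of replaceAll calls. The class and method names (`XmlEscapeSketch`, `escapeForXml`) are made up for illustration, and the choice to turn every character above 0x7E into a numeric reference is an assumption, not something from the original thread.

```java
public class XmlEscapeSketch {

    // Hypothetical helper: escapes XML markup characters and rewrites every
    // code point above 0x7E as a hexadecimal numeric character reference.
    static String escapeForXml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);   // handles surrogate pairs too
            i += Character.charCount(cp);
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;");  break;
                case '>': out.append("&gt;");  break;
                default:
                    if (cp > 0x7E) {
                        // e.g. the section sign (0xA7) becomes &#xA7;
                        out.append("&#x")
                           .append(Integer.toHexString(cp).toUpperCase())
                           .append(';');
                    } else {
                        // plain ASCII (including control characters) is copied as-is
                        out.appendCodePoint(cp);
                    }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The section sign is written as a Unicode escape so this file stays ASCII.
        System.out.println(escapeForXml("A & B, see \u00A7 1"));
    }
}
```

Because `&` is handled in the same single pass as everything else, there is no risk of double-escaping the `&` that the numeric references themselves contain, which is something to watch for when chaining separate replaceAll calls.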


Can't you simply use the Unicode code point?
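
A minimal sketch of what that might look like, keeping the asker's replaceAll approach: Java source accepts `\u00A7` as a Unicode escape, so the character never has to be typed or pasted in vi. The class and method names here are invented for the example, and the `&#167;` replacement target is an assumption (167 is the decimal form of 0xA7).

```java
public class SectionSignReplace {

    // Hypothetical helper showing the Unicode-escape approach.
    static String escapeSectionSign(String input) {
        // The compiler turns \u00A7 into the section sign before the regex
        // engine sees it; the regex then matches that one literal character.
        // The replacement contains no '$' or '\', so it is inserted verbatim.
        return input.replaceAll("\u00A7", "&#167;");
    }

    public static void main(String[] args) {
        System.out.println(escapeSectionSign("See \u00A7 4.2"));
    }
}
```

Since no regex features are actually needed here, `String.replace("\u00A7", "&#167;")` would do the same job as a plain literal replacement.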
