I am trying to strip and replace a text string that looks as follows in the most elegant way possible:
element {"item"} {text {
} {$i/child::itemno}
To look like:
<item> {$i/child::itemno}
That is, remove the element keyword, turn its braces and quoted name into an angle-bracket tag, and remove text together with its accompanying braces. These patterns may be encountered several times. Am I better off using Java's java.util.regex.Pattern, the simple replaceAll, or org.apache.commons.lang.StringUtils?
Thanks for the responses.
I now have the following but I am unsure as to the number of backslashes and also how to complete the final substitution which makes use of my group(1) and replaces it with < at its start and > at its end:
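// Java patterns take no /.../ delimiters; every literal { and } is escaped,
// and each regex backslash is doubled inside the string literal
Pattern p = Pattern.compile("element\\s*\\{\"([^\"]+)\"\\}\\s*\\{text\\s*\\{\\s*\\}\\s*(\\{[^}]*\\})");
// Match the input against the pattern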
Matcher m = p.matcher("element {\"item\"} {text {\n" +
" } {$i/child::itemno} text { \n" +
" } {$i/child::description} text {\n" +
" } element {\"high_bid\"} {{max($b/child::bid)}} text {\n" +
" }} ");
// For each instance of group 1, replace it with < > at the start and end
Find:
/element\s*\{"([^"]+)"\}\s*{text\s*{\s*}\s*({[^}]*})/
Replace:
"<$1> $2"
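For Java specifically, the find/replace pair above could be applied roughly as in the sketch below. The class and method names (ElementRewriteSketch, rewrite) are made up for illustration. The sketch assumes the pattern is written as a Java string literal, so the /.../ delimiters are dropped, every literal brace is escaped (java.util.regex rejects a bare { as an illegal repetition), and each regex backslash is doubled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class ElementRewriteSketch {

    // Group 1: the quoted element name. Group 2: the braced expression after "text { }".
    private static final Pattern ELEMENT_TEXT = Pattern.compile(
        "element\\s*\\{\"([^\"]+)\"\\}\\s*\\{text\\s*\\{\\s*\\}\\s*(\\{[^}]*\\})");

    static String rewrite(String input) {
        Matcher m = ELEMENT_TEXT.matcher(input);
        // $1 and $2 in the replacement string refer to the captured groups.
        return m.replaceAll("<$1> $2");
    }

    public static void main(String[] args) {
        String input = "element {\"item\"} {text {\n } {$i/child::itemno}";
        System.out.println(rewrite(input)); // prints: <item> {$i/child::itemno}
    }
}
```

Matcher.replaceAll replaces every non-overlapping match, so repeated occurrences of the same shape are handled in one call. Fragments that do not follow the exact element {"name"} {text { } {...} shape are left untouched.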
I think a simple string replacement will do. Here is a Python version (can be turned into a one-liner):
>>> a = """element {"item"} {text {
} {$i/child::itemno}"""
>>>
>>> a
'element {"item"} {text {\n } {$i/child::itemno}'
>>> a=a.replace(' ', '').replace('\n', '')
>>> a
'element{"item"}{text{}{$i/child::itemno}'
>>> a = a.replace('element {"', '<')
>>> a
'element{"item"}{text{}{$i/child::itemno}'
>>> a = a.replace('element{"', '<')
>>> a
'<item"}{text{}{$i/child::itemno}'
>>> a = a.replace('"}{text{}', '> ')
>>> a
'<item> {$i/child::itemno}'
>>>
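Since the question is about Java, the same plain-replacement idea could be written there as one chained expression. This is only a sketch mirroring the Python session above; the class and method names (PlainReplaceSketch, rewrite) are invented for illustration. It uses String.replace, which treats both arguments as literal text rather than as a regex.

```java
// Hypothetical helper class mirroring the Python steps above.
class PlainReplaceSketch {

    static String rewrite(String input) {
        return input
            .replace(" ", "")              // drop every space
            .replace("\n", "")             // drop every newline
            .replace("element{\"", "<")    // opening of the tag
            .replace("\"}{text{}", "> ");  // closing of the tag, then one space
    }

    public static void main(String[] args) {
        String input = "element {\"item\"} {text {\n } {$i/child::itemno}";
        System.out.println(rewrite(input)); // prints: <item> {$i/child::itemno}
    }
}
```

Like the Python version, this strips all spaces and newlines from the input, including any inside the braced expressions, so it only suits input where that whitespace does not matter.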