Regular Expression - Help needed

Developer https://www.devze.com 2023-01-18 07:58 Source: Internet
I have a String template from which I need to extract the list of #elseif blocks. For example, the first #elseif block is

#elseif ( $variable2 )Some sample text after 1st ElseIf.

the second #elseif block is

#elseif($variable2)This text can be repeated many times until do while is called. SECOND ELSEIF

and so on. I'm using the following regex for this.

String regexElseIf="\\#elseif\\s*\\((.*?)\\)(.*?)(?:#elseif|#else|#endif)"; 

But it returns just one match, i.e. the first #elseif block and not the second. I need to get the second #elseif block as well. Could you please help me do that? The string template is below.

  String template =
        "This is a sample document."
            + "#if ( $variable1 )"
            + "FIRST This text can be repeated many times until do while is called."
            + "#elseif ( $variable2 )"
            + "Some sample text after 1st ElseIf."
            + "#elseif($variable2)"
            + "This text can be repeated many times until do while is called. SECOND ELSEIF"
            + "#else "
            + "sample else condition  "
            + "#endif "
            + "Some sample text."
            + "This is the second sample document."
            + "#if ( $variable1 )"
            + "SECOND FIRST This text can be repeated many times until do while is called."
            + "#elseif ( $variable2 )"
            + "SECOND Some sample text after 1st ElseIf."
            + "#elseif($variable2)"
            + "SECOND This text can be repeated many times until do while is called. SECOND ELSEIF"
            + "#else " + "SECOND sample else condition  " + "#endif "
            + "SECOND Some sample text.";


This code

Pattern regexp = Pattern.compile("#elseif\\b(.*?)(?=#(elseif|else|endif))");
Matcher matcher = regexp.matcher(template);
while (matcher.find())
    System.out.println(matcher.group());

will produce

#elseif ( $variable2 )Some sample text after 1st ElseIf.
#elseif($variable2)This text can be repeated many times until do while is called. SECOND ELSEIF
#elseif ( $variable2 )SECOND Some sample text after 1st ElseIf.
#elseif($variable2)SECOND This text can be repeated many times until do while is called. SECOND ELSEIF

The secret lies in the positive lookahead (?=#(elseif|else|endif)): the following #elseif, #else or #endif must be present, but its characters are not consumed. This way the next #elseif can still be found by the next iteration.
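To make the difference visible, here is a minimal sketch that runs a consuming terminator and a lookahead terminator over the same input. The class name LookaheadDemo, the helper findAll and the shortened sample string are made up for illustration; only the two patterns are modelled on the ones discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookaheadDemo {
    // The terminator is part of the match, so its characters are consumed.
    static final Pattern CONSUMING =
        Pattern.compile("#elseif\\b(.*?)(?:#elseif|#else|#endif)");

    // The terminator is only asserted, so the next find() can start on it.
    static final Pattern LOOKAHEAD =
        Pattern.compile("#elseif\\b(.*?)(?=#(elseif|else|endif))");

    // Collects every full match of the pattern in the input.
    static List<String> findAll(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened stand-in for the template in the question.
        String sample = "#if ( $a )A#elseif ( $b )B#elseif($c)C#else D#endif ";
        System.out.println(findAll(CONSUMING, sample)); // one match, swallows the 2nd "#elseif"
        System.out.println(findAll(LOOKAHEAD, sample)); // two matches, one per #elseif block
    }
}
```

With the consuming variant the second #elseif token ends up inside the first match, so the scan resumes after it and finds nothing more.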


#elseif\b(?:(?!#else\b|#endif\b).)*

will match everything from the first #elseif in a block up to (but not including) the nearest #else or #endif.

Pattern regex = Pattern.compile("#elseif\\b(?:(?!#else\\b|#endif\\b).)*", Pattern.DOTALL);
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // matched text: regexMatcher.group()
    // match start: regexMatcher.start()
    // match end: regexMatcher.end()
} 

If you then need to extract the individual #elseif blocks from that match, use

#elseif\b(?:(?!#elseif\b).)*

on the results from the first regex match above. In Java:

Pattern regex = Pattern.compile("#elseif\\b(?:(?!#elseif\\b).)*", Pattern.DOTALL);

etc.
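Put together, the two-stage approach might look like the sketch below. The class name TwoStageElseIf, the method elseIfBlocks and the sample string are assumptions for illustration; the two patterns are the ones quoted above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TwoStageElseIf {
    // Stage 1: a whole run of #elseif blocks, up to the nearest #else or #endif.
    static final Pattern CHAIN = Pattern.compile(
        "#elseif\\b(?:(?!#else\\b|#endif\\b).)*", Pattern.DOTALL);

    // Stage 2: one #elseif block, up to the next #elseif or the end of the run.
    static final Pattern SINGLE = Pattern.compile(
        "#elseif\\b(?:(?!#elseif\\b).)*", Pattern.DOTALL);

    static List<String> elseIfBlocks(String template) {
        List<String> blocks = new ArrayList<>();
        Matcher chain = CHAIN.matcher(template);
        while (chain.find()) {
            // Apply the second regex only to the text matched by the first one.
            Matcher single = SINGLE.matcher(chain.group());
            while (single.find()) {
                blocks.add(single.group());
            }
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened stand-in for the template in the question.
        String sample = "#if ( $a )A#elseif ( $b )B#elseif($c)C#else D#endif "
            + "tail#if ( $a )E#elseif ( $d )F#endif ";
        for (String block : elseIfBlocks(sample)) {
            System.out.println(block);
        }
    }
}
```

Note that #else\b does not fire on #elseif, because there is no word boundary between "e" and "i"; that is what lets the first stage run across several #elseif blocks.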


The big problem here is that you need #elseif both as a start and as a stop marker in your regular expression. The first match is the substring

#elseif ( $variable2 )Some sample text after 1st ElseIf.#elseif

and the search for the next match starts after that sequence. So it misses the second #elseif of the first #if expression, because that #elseif token was already consumed as part of the previous match.

I'd try to split the string on the pattern "\\#elseif\\s*\\((.*?)\\)":

String[] temp = template.split("\\#elseif\\s*\\((.*?)\\)");

Now every temp entry from temp[1] onward starts with the body of an #elseif block (temp[0] is the text before the first #elseif). Another split on (?:#else|#endif) should give you strings containing nothing but the plain texts:

for (int i = 1; i < temp.length; i++)
  System.out.println(temp[i].split("(?:#else|#endif)")[0]);

(I wasn't able to test the second split; if it doesn't work, treat it as advice on the strategy only ;))
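For what it's worth, a self-contained sketch of the split strategy could look like this. The names SplitElseIf and elseIfBodies and the sample string are hypothetical. The loop starts at index 1 to skip the text before the first #elseif, and the second split uses a limit of 2 so that index 0 always exists, even when a body is empty.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitElseIf {
    // Returns the plain text of every #elseif block, without the condition.
    static List<String> elseIfBodies(String template) {
        // Cut at every "#elseif(...)"; parts[0] is whatever precedes the first one.
        String[] parts = template.split("#elseif\\s*\\((.*?)\\)");
        List<String> bodies = new ArrayList<>();
        for (int i = 1; i < parts.length; i++) {
            // Keep only the text before the first #else or #endif, if there is one.
            bodies.add(parts[i].split("#else|#endif", 2)[0]);
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened stand-in for the template in the question.
        String sample = "#if ( $a )A#elseif ( $b )B#elseif($c)C#else D#endif ";
        System.out.println(elseIfBodies(sample)); // prints [B, C]
    }
}
```

Unlike the regex answers, this yields only the block text; the #elseif marker and its condition are thrown away by the first split.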


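// group(1) is the condition inside the parentheses, group(2) is the block text;
// the lookahead leaves the following #elseif, #else or #endif unconsumed.
private static final Pattern REGEX = Pattern.compile(
    "#elseif\\s*\\(([^()]*)\\)(.*?)(?=#elseif|#else|#endif)");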

public static void main(String[] args) {
    Matcher matcher = REGEX.matcher(template);
    while (matcher.find()) {
        System.out.println(matcher.group(2));
    }
}
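As a complete class, the same pattern can return both capture groups at once. This is only a sketch: the class name ElseIfGroups, the method conditionsAndBodies and the sample string are invented for illustration, and trimming the condition is an extra step that the snippet above does not do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ElseIfGroups {
    static final Pattern REGEX = Pattern.compile(
        "#elseif\\s*\\(([^()]*)\\)(.*?)(?=#elseif|#else|#endif)");

    // Each entry is { condition, body } for one #elseif block.
    static List<String[]> conditionsAndBodies(String template) {
        List<String[]> result = new ArrayList<>();
        Matcher matcher = REGEX.matcher(template);
        while (matcher.find()) {
            // group(1): text inside the parentheses; group(2): text of the block.
            result.add(new String[] { matcher.group(1).trim(), matcher.group(2) });
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical, shortened stand-in for the template in the question.
        String sample = "#if ( $a )A#elseif ( $b )B#elseif($c)C#else D#endif ";
        for (String[] entry : conditionsAndBodies(sample)) {
            System.out.println(entry[0] + " -> " + entry[1]);
        }
    }
}
```

Because the condition group is [^()]*, a condition that itself contains parentheses will not match this pattern.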