I want to extract all words enclosed in curly braces from strings like this:
foo {bar} moo {mar}
The string to match may have any number of these words, but I'm starting to think I'm approaching this problem in the wrong way.
My attempt
I've tried to capture the words inside the braces in groups so that I can use every single match. So I made a regex:
String rx = ".*\\{({GROUP}\\w+)\\}.*";
Note: I'm using JRegex syntax so I need to escape some of the curlies.
Result
The result I get is a single (albeit correct) match, bar, whilst I expect two matches, bar and mar. What have I misunderstood, and how do I correct it? I probably need to use some other quantifier for the . part, but I hope you can help me with that too.
Many thanks in advance!
Your regex .*\{({GROUP}\w+)\}.* doesn't work because it matches your whole input string in a single match, so nothing is left over for a second match:
.* matches foo
\{({GROUP}\w+)\} matches {bar}
.* matches moo {mar}
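To see the effect, here is a small sketch that counts how often find() succeeds for each pattern. It uses java.util.regex rather than JRegex, so the named group ({GROUP}...) is written as a plain group; the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchCountDemo {
    // Counts how many non-overlapping matches find() reports for a pattern.
    static int countMatches(String pattern, String input) {
        Matcher m = Pattern.compile(pattern).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "foo {bar} moo {mar}";
        // The leading and trailing .* let one match swallow the whole string.
        System.out.println(countMatches(".*\\{(\\w+)\\}.*", input));
        // Without them, each {...} pair is a separate match.
        System.out.println(countMatches("\\{(\\w+)\\}", input));
    }
}
```

Run as written, this should print 1 and then 2: the first pattern consumes the entire input in one go, while the second leaves the rest of the string available for the next find() call.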
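You should use something like this:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the input, e.g. "foo {bar} moo {mar}"
List<String> matchList = new ArrayList<String>();
// Match one {...} pair at a time; group 1 is the text between the braces
Pattern regex = Pattern.compile("\\{([^}]*)\\}");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // group(1) is the word alone; group() would also include the braces
    matchList.add(regexMatcher.group(1));
}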
Nested curly braces aren't handled by this regex.
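For reference, a self-contained sketch of that loop, wrapped in a hypothetical BraceWords class so it can be run directly. It reads group(1) so that only the text between the braces is kept, and the second call shows what the nested-brace limitation looks like in practice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BraceWords {
    // One {...} pair; group 1 captures everything up to the first closing brace.
    private static final Pattern BRACED = Pattern.compile("\\{([^}]*)\\}");

    // Returns the text found between each pair of braces, in order.
    static List<String> extract(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = BRACED.matcher(input);
        while (m.find()) {
            words.add(m.group(1));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(extract("foo {bar} moo {mar}")); // [bar, mar]
        // Nested braces: the match stops at the first '}', so the result is cut short.
        System.out.println(extract("{a {b} c}"));           // [a {b]
    }
}
```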
A variant uses the reluctant quantifier ".*?" in the regex. You can find additional information about the search strategies of a regex (greedy, reluctant, possessive) here: http://javascript.about.com/library/blre09.htm
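List<String> matchList = new ArrayList<String>();
// .*? expands only as far as needed to reach the next closing brace
Pattern regex = Pattern.compile("\\{(.*?)\\}");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    // group(1) is the text between the braces; group() would include them
    matchList.add(regexMatcher.group(1));
}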
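The syntax choice is yours. For input like yours, this regex behaves the same as @madgnome's one; the two differ only when the text between the braces contains a line break, which . does not match by default. Personally, I prefer a reluctant search to a character exclusion...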
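A quick way to compare the two patterns is to run them side by side. The sketch below is only an illustration (the class and method names are invented): it shows that both give the same result on the sample input, and that they diverge when a line break sits between the braces unless Pattern.DOTALL is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ExclusionVsReluctant {
    // Collects group 1 of every match of the given pattern.
    static List<String> extract(Pattern pattern, String input) {
        List<String> found = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        Pattern exclusion = Pattern.compile("\\{([^}]*)\\}");
        Pattern reluctant = Pattern.compile("\\{(.*?)\\}");
        // DOTALL lets '.' match line terminators as well.
        Pattern reluctantDotAll = Pattern.compile("\\{(.*?)\\}", Pattern.DOTALL);

        String sample = "foo {bar} moo {mar}";
        System.out.println(extract(exclusion, sample)); // [bar, mar]
        System.out.println(extract(reluctant, sample)); // [bar, mar]

        String multiLine = "{a\nb} {c}";
        // [^}] matches the line break, so the first pair is found.
        System.out.println(extract(exclusion, multiLine).size());       // 2
        // '.' stops at the line break, so only {c} is found.
        System.out.println(extract(reluctant, multiLine).size());       // 1
        System.out.println(extract(reluctantDotAll, multiLine).size()); // 2
    }
}
```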