I'm trying to create a regex to tokenize a string. An example string would be:
John Mary, "Name=blah;Name=blahAgain" "Hand=1,2"
I'm trying to get back:
- John
- Mary
- Name=blah;Name=blahAgain
- Hand=1,2
This was easy:
([^ ])+
For that specific example, I would do:
([^\s]*)\s+([^,\s]*)\s*,\s*"([^"]*)"\s+"([^"]*)"
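As a rough sketch of how that fixed-shape pattern could be used from Java: the regex is the one above, escaped for a Java string literal, and each capture group becomes one token. The class name FixedShapeTokenizer and the helper tokenize are made up for this illustration, and returning an empty list for input of a different shape is an assumption, not something the answer specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FixedShapeTokenizer {
    // The pattern from the answer, with backslashes and quotes escaped for Java.
    private static final Pattern SHAPE = Pattern.compile(
        "([^\\s]*)\\s+([^,\\s]*)\\s*,\\s*\"([^\"]*)\"\\s+\"([^\"]*)\"");

    // Returns the four captured tokens, or an empty list (an assumption of this
    // sketch) when the whole input does not have the expected shape.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SHAPE.matcher(input);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                tokens.add(m.group(i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

Because matches() requires the entire string to fit, any input with a different number of names or quoted parts yields no tokens at all.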
update: modified to split Mary and John
Since you're using Java, why not use StringTokenizer? E.g.:
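import java.util.StringTokenizer;

// Split on spaces; each call to nextToken() returns the next piece.
StringTokenizer st = new StringTokenizer("String to tokenize", " ");
while (st.hasMoreTokens())
{
    // get next token
    String someVariable = st.nextToken();
}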
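Applied to the example string, splitting on spaces alone leaves the comma on "Mary," and the quotes around the last two tokens, so some cleanup is still needed. Here is a minimal sketch of that; the class name StringTokenizerExample and the cleanup rules are assumptions for illustration, and it only works while the quoted values contain no spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class StringTokenizerExample {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, " ");
        while (st.hasMoreTokens()) {
            String token = st.nextToken();
            // Drop a trailing comma, e.g. "Mary," becomes "Mary".
            if (token.endsWith(",")) {
                token = token.substring(0, token.length() - 1);
            }
            // Drop one pair of surrounding double quotes, if present.
            if (token.length() >= 2 && token.startsWith("\"") && token.endsWith("\"")) {
                token = token.substring(1, token.length() - 1);
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        System.out.println(tokenize(input));
    }
}
```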
This works for your example:
(\w+) (\w+), "([^"]+)" "([^"]+)"
Do all your strings have exactly the same pattern?
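If they do not, one alternative (not from the answers here, just a sketch) is to match one token at a time with find() instead of describing the whole line. The pattern below is an assumption: a token is either the text between a pair of double quotes, or a run of characters that are not whitespace, commas, or quotes. QuoteAwareTokenizer is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class QuoteAwareTokenizer {
    // Group 1: content of a double-quoted run.
    // Group 2: a bare run with no whitespace, commas, or quotes.
    private static final Pattern TOKEN =
        Pattern.compile("\"([^\"]*)\"|([^\\s,\"]+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            // Exactly one of the two groups participates in each match.
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        System.out.println(tokenize(input));
    }
}
```

Unlike the fixed-shape patterns, this keeps spaces and commas inside quotes intact and does not care how many names or quoted parts there are.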
One possible way: split at a comma followed by a whitespace character, or at a single whitespace character or quotation mark:
"John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"".split(",\\s|[\\s\"]")
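One thing to watch with the split call above: wherever two delimiters sit next to each other (for example the ", " followed by an opening quote), String.split returns an empty string between them. A small sketch that filters those out; the class name SplitTokenizer is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitTokenizer {
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        // Same regex as in the answer: comma+whitespace, or one whitespace/quote.
        for (String piece : input.split(",\\s|[\\s\"]")) {
            // Adjacent delimiters produce empty pieces; skip them.
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        System.out.println(tokenize(input));
    }
}
```

The comma in Hand=1,2 survives because it is not followed by whitespace, but a quoted value that contains a space would still be split apart.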