RegEx to Tokenize String

Developer https://www.devze.com 2022-12-17 23:30 Source: Internet
I'm trying to create a regex to tokenize a string. An example string would be:

John Mary, "Name=blah;Name=blahAgain" "Hand=1,2"

I'm trying to get back:

  • John
  • Mary
  • Name=blah;Name=blahAgain
  • Hand=1,2


This was easy:

([^ ])+


For that specific example, I would do:

([^\s]*)\s+([^,\s]*)\s*,\s*"([^"]*)"\s+"([^"]*)"

update: modified to split Mary and John
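For anyone who wants to try this from Java, here is a minimal sketch of how the pattern above could be applied with java.util.regex. The class name FixedShapeTokenizer and the method tokenize are made up for illustration; only the regex comes from the answer. It assumes the input always has exactly the shape of the example (two names, a comma, two quoted fields).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the thread.
class FixedShapeTokenizer {
    // The regex from the answer, escaped for a Java string literal.
    static final Pattern SHAPE = Pattern.compile(
        "([^\\s]*)\\s+([^,\\s]*)\\s*,\\s*\"([^\"]*)\"\\s+\"([^\"]*)\"");

    // Returns the four captured tokens, or an empty list if the whole
    // input does not have the expected shape.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = SHAPE.matcher(input);
        if (m.matches()) {
            for (int i = 1; i <= m.groupCount(); i++) {
                tokens.add(m.group(i));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        // prints [John, Mary, Name=blah;Name=blahAgain, Hand=1,2]
        System.out.println(tokenize(input));
    }
}
```

Because matches() requires the entire string to fit, any input with a different number of fields yields an empty list rather than a partial result.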


Since you're using Java, why not use StringTokenizer? E.g.:

import java.util.StringTokenizer;

// Split on spaces only; quotes get no special treatment.
StringTokenizer st = new StringTokenizer("String to tokenize", " ");
while (st.hasMoreTokens())
{
   // get next token
   String someVariable = st.nextToken();
}


This works for your example:

(\w+) (\w+), "([^"]+)" "([^"]+)"

Do all your strings have exactly the same pattern?
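If the strings do not all share one fixed shape, a different approach (not taken from any answer here, just a sketch) is to match one token at a time: either a double-quoted field or a run of characters that are not whitespace, commas, or quotes. The class name QuoteAwareTokenizer and its pattern are assumptions for illustration, and the sketch does not handle escaped quotes inside a quoted field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and pattern are assumptions.
class QuoteAwareTokenizer {
    // Alternative 1: a quoted field, content captured in group 1.
    // Alternative 2: a bare word with no whitespace, comma, or quote (group 2).
    static final Pattern TOKEN = Pattern.compile("\"([^\"]*)\"|([^\\s,\"]+)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        // find() skips over separators (spaces, commas) between tokens.
        while (m.find()) {
            tokens.add(m.group(1) != null ? m.group(1) : m.group(2));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        // prints [John, Mary, Name=blah;Name=blahAgain, Hand=1,2]
        System.out.println(tokenize(input));
    }
}
```

Commas and spaces inside quotes are preserved, so "Hand=1,2" stays in one piece. An unquoted token containing a comma would be split at the comma.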


One possible way: split at a comma followed by a whitespace character, or at any single whitespace character or quotation mark:

"John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"".split(",\\s|[\\s\"]")
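One thing to watch with the split approach: adjacent delimiters (for example the space followed by a quotation mark) produce empty strings in the result. For the example string, split returns seven elements, three of them empty. A minimal sketch of filtering those out follows; the class name SplitTokenizer is an assumption for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the name is an assumption, not from the thread.
class SplitTokenizer {
    // Same split regex as above, then drop the empty strings that appear
    // wherever two delimiters sit next to each other.
    static List<String> tokenize(String input) {
        return Arrays.stream(input.split(",\\s|[\\s\"]"))
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String input = "John Mary, \"Name=blah;Name=blahAgain\" \"Hand=1,2\"";
        // prints [John, Mary, Name=blah;Name=blahAgain, Hand=1,2]
        System.out.println(tokenize(input));
    }
}
```

Note that this treats every space as a delimiter, so a quoted field containing a space would be broken apart.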
