I have a string similar to this one:
The boy said to his mother, "Can I have some candy?"
If I do a normal String.Split on it, I get:
{ 'The', 'boy', 'said', 'to', 'his', 'mother', '"Can', 'I', 'have', 'some', 'candy?"' }
I want an array like so:
{ 'The', 'boy', 'said', 'to', 'his', 'mother', 'Can I have some candy?' }
Obviously, I could just loop through character by character and keep track of whether I'm inside a quoted string or not and all that... but is there a better way? With regexes, perhaps?
How about finding all the matches of this regex:
"[^"]*"|\S+
It depends a bit on your requirements. For example, do you need to treat AAA"BBB (no spaces) as a single word or as two words? If AAA"BBB is a single word, and " only starts a quoted field after a delimiter, this looks like a CSV parser. Of course, CSV has other rules, such as doubled quotes meaning a literal quote, but you would need to define similar rules too.
So you can adapt any open source CSV parser, or see if e.g. Microsoft.VisualBasic.FileIO.TextFieldParser works for you:
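using System;
using System.Linq;
using System.Text;
using Microsoft.VisualBasic.FileIO; // TextFieldParser lives in the Microsoft.VisualBasic assembly
string msg = "The boy said to his mother, \"Can I have some candy?\"";
System.IO.MemoryStream s = new System.IO.MemoryStream(Encoding.Unicode.GetBytes(msg));
TextFieldParser p = new TextFieldParser(s, Encoding.Unicode);
p.Delimiters = new string[] { " ", "," };
// ", " yields an empty field between the comma and the space, so filter empties out.
foreach (var f in p.ReadFields().Where(f => f != ""))
    Console.WriteLine(f);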
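If you would rather not pull in a parser, the rules described above can also be written out by hand. This is only a sketch under assumed rules, not what TextFieldParser does internally: space and comma are delimiters, a " opens a quoted field only at the start of a token, a doubled "" inside a quoted field is a literal quote, and empty fields are dropped. The name `CsvLikeSplitter` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class CsvLikeSplitter
{
    public static List<string> Split(string input)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false; // currently inside a quoted field
        bool hasToken = false; // a token has already started

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (inQuotes)
            {
                if (c != '"')
                {
                    current.Append(c);
                }
                else if (i + 1 < input.Length && input[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote = literal quote
                    i++;
                }
                else
                {
                    inQuotes = false; // closing quote
                }
            }
            else if (c == '"' && !hasToken)
            {
                // A quote opens a quoted field only right after a delimiter.
                inQuotes = true;
                hasToken = true;
            }
            else if (c == ' ' || c == ',')
            {
                // Delimiter: emit the current field, skipping empty ones.
                if (current.Length > 0) fields.Add(current.ToString());
                current.Clear();
                hasToken = false;
            }
            else
            {
                // Ordinary character, including a quote in mid-word (AAA"BBB).
                current.Append(c);
                hasToken = true;
            }
        }
        if (current.Length > 0) fields.Add(current.ToString());
        return fields;
    }
}
```

With these rules, AAA"BBB stays a single word, and an unclosed quote swallows the rest of the input into one field. Whether that is the behaviour you want is exactly the kind of requirement you would need to pin down first.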