C# advanced String.Split

Developer https://www.devze.com 2023-03-11 14:02 Source: Internet
I have a string similar to this one:

The boy said to his mother, "Can I have some candy?"

If I do a normal String.Split on it, I get:

{ 'The', 'boy', 'said', 'to', 'his', 'mother', '"Can', 'I', 'have', 'some', 'candy?"' }

I want an array like so:

{ 'The', 'boy', 'said', 'to', 'his', 'mother', 'Can I have some candy?' }

Obviously, I could just loop through character by character and keep track of whether I'm inside a quoted string or not and all that... but is there a better way? With regexes, perhaps?


How about finding all the matches of this regex:

"[^"]*"|\S+

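In case it helps, here is a minimal sketch of how that pattern could be applied with Regex.Matches. The Tokenize helper and the Trim call are my own assumptions, not part of the answer: the pattern on its own keeps the surrounding quotes and the comma after "mother", so they are trimmed here to get the array the question asks for. It is written as top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: collects every match of the suggested pattern.
static string[] Tokenize(string input)
{
    // Either a double-quoted run (quotes included in the match)
    // or a run of non-whitespace characters.
    return Regex.Matches(input, @"""[^""]*""|\S+")
        .Cast<Match>()
        // Assumption: strip the quotes and any comma left on the token edges.
        .Select(m => m.Value.Trim('"', ','))
        .ToArray();
}

string msg = "The boy said to his mother, \"Can I have some candy?\"";
foreach (string t in Tokenize(msg))
    Console.WriteLine(t);
```

For the sample sentence this should print seven tokens, the last one being Can I have some candy?. Note that trimming can leave an empty token for inputs such as a lone comma or "", so filter those out if they matter.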

It depends a bit on your requirements. For example, should AAA"BBB (no spaces) be treated as a single word or as two words? If AAA"BBB is a single word, and a " only starts a quoted field when it comes right after a delimiter, then this looks like a job for a CSV parser. Of course, CSV has other rules, such as a doubled quote meaning a literal quote, but you would need to define some similar rules anyway.

So you can adapt any open source CSV parser, or see whether e.g. Microsoft.VisualBasic.FileIO.TextFieldParser works for you:

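        // Requires: using System; using System.IO; using System.Linq;
        //           using Microsoft.VisualBasic.FileIO;
        string msg = "The boy said to his mother, \"Can I have some candy?\"";
        // TextFieldParser accepts any TextReader, so no byte stream is needed.
        using (TextFieldParser p = new TextFieldParser(new StringReader(msg)))
        {
            p.Delimiters = new string[] { " ", "," };
            // ", " produces an empty field between the two delimiters; skip it.
            foreach (var f in p.ReadFields().Where(f => f != ""))
                Console.WriteLine(f);
        }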
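
For comparison, the CSV-like rules mentioned above (a quote opens a field only right after a delimiter, and a doubled quote inside a quoted field is a literal quote) could be hand-rolled roughly as below. This is only a sketch under those assumptions: SplitQuoted and its delimiters parameter are hypothetical names, empty fields are dropped, and an unterminated quote simply runs to the end of the input. It is written as top-level statements, so it assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: splits on any character in `delimiters`,
// keeping quoted runs together.
static List<string> SplitQuoted(string input, string delimiters = " ,")
{
    var fields = new List<string>();
    var current = new StringBuilder();
    bool inQuotes = false;
    bool atFieldStart = true; // true at the start and right after a delimiter

    for (int i = 0; i < input.Length; i++)
    {
        char c = input[i];
        if (inQuotes)
        {
            if (c == '"' && i + 1 < input.Length && input[i + 1] == '"')
            {
                current.Append('"'); // doubled quote means a literal quote
                i++;
            }
            else if (c == '"')
            {
                inQuotes = false;    // closing quote
            }
            else
            {
                current.Append(c);
            }
        }
        else if (c == '"' && atFieldStart)
        {
            inQuotes = true;         // a quote opens a field only after a delimiter
            atFieldStart = false;
        }
        else if (delimiters.IndexOf(c) >= 0)
        {
            if (current.Length > 0) fields.Add(current.ToString());
            current.Clear();
            atFieldStart = true;
        }
        else
        {
            current.Append(c);       // includes a quote in mid-word, as in AAA"BBB
            atFieldStart = false;
        }
    }
    if (current.Length > 0) fields.Add(current.ToString());
    return fields;
}

foreach (string f in SplitQuoted("The boy said to his mother, \"Can I have some candy?\""))
    Console.WriteLine(f);
```

With these rules AAA"BBB stays a single field, which is the first of the two interpretations raised above. If you want it to be two words instead, the atFieldStart check is the part to change.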