What is the most efficient way to turn a string into a list of words in C#?
For example:
Hello... world 1, this is amazing3,really , amazing! *bla*
should turn into the following list of strings:
["Hello", "world", "1", "this", "is", "amazing3", "really", "amazing", "bla"]
Note that it should support languages other than English.
I need this because I want to collect a list of keywords from a given text.
Thanks.
How about using regular expressions? You could make the expression arbitrarily complex, but what I have here should work for most inputs.
new Regex(@"\b\w+\b").Matches(text);
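A fuller sketch of the regular-expression approach, in case it helps. The class name `WordExtractor` and method name `GetWords` are made up for illustration. Instead of `\w`, this version uses explicit Unicode categories (letters, combining marks, numbers), on the assumption that underscores should not count as part of a word; adjust the pattern if that assumption does not fit your text.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class WordExtractor
{
    // \p{L} = any Unicode letter, \p{M} = combining marks, \p{N} = any Unicode number.
    // Compiled once and reused, since the pattern never changes.
    private static readonly Regex WordPattern =
        new Regex(@"[\p{L}\p{M}\p{N}]+", RegexOptions.Compiled);

    public static List<string> GetWords(string text)
    {
        var words = new List<string>();
        foreach (Match match in WordPattern.Matches(text))
        {
            words.Add(match.Value);
        }
        return words;
    }
}
```

Calling `WordExtractor.GetWords("Hello... world 1, this is amazing3,really , amazing! *bla*")` should give the nine words from the question. Whether this is the "most efficient" option depends on input size, so it is worth measuring against the alternatives.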
char[] separators = new char[]{' ', ',', '!', '*', '.'}; // add more if needed
string str = "Hello... world 1, this is amazing3,really , amazing! *bla*";
string[] words = str.Split(separators, StringSplitOptions.RemoveEmptyEntries);
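A hard-coded separator list is hard to keep complete once the text contains punctuation from other languages. One possible variation, sketched here under the assumption that anything that is not a letter or digit should act as a separator, is to derive the separators from the input itself. `SplitTokenizer` and `GetWords` are hypothetical names.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of any library.
public static class SplitTokenizer
{
    public static string[] GetWords(string text)
    {
        // Collect every distinct character that is neither a letter nor a digit.
        // Note: char.IsLetterOrDigit(char) returns false for surrogate halves, so
        // letters outside the Basic Multilingual Plane would be treated as separators.
        char[] separators = text
            .Where(c => !char.IsLetterOrDigit(c))
            .Distinct()
            .ToArray();

        // Same Split call as above, just with the computed separators.
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

This walks the string twice (once to find separators, once to split), so it trades some speed for not having to maintain the list by hand.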
You need a lexer.
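For this particular task the lexer can be very small. Here is a minimal hand-written sketch, assuming a "word" is simply a maximal run of letters and digits; `SimpleLexer` and `Tokenize` are illustrative names, and a real lexer for keyword extraction would likely need more rules (apostrophes, hyphens, and so on).

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not part of any library.
public static class SimpleLexer
{
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();

        // Single pass: grow the current token while we see letters or digits,
        // and emit it as soon as any other character is encountered.
        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                current.Append(c);
            }
            else if (current.Length > 0)
            {
                tokens.Add(current.ToString());
                current.Clear();
            }
        }

        // Emit the trailing token if the text ended in the middle of a word.
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }

        return tokens;
    }
}
```

Since `char.IsLetterOrDigit` is Unicode-aware, this handles non-English text in the Basic Multilingual Plane without any extra configuration.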