Splitting sentences using regex

开发者 (Developer) https://www.devze.com 2022-12-14 13:07 Source: web
I would like to split a text (using a regex) at every dot followed by a whitespace, or a dot followed by a new line (\n).

I'm working with C# on .NET.

Appreciate your answers!


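using System.Text.RegularExpressions;

// Verbatim string (@"...") so the backslashes reach the regex engine unchanged;
// a regular "\." literal is an unrecognized escape sequence and will not compile.
string[] parts = Regex.Split(mytext, @"\.\n|\. ");
// Or use @"\.\s" if you're not picky about it also matching tabs, etc.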


The regular expression

/\.\s/

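will match a literal . followed by a single whitespace character (space, tab, newline, and so on). In C#, drop the surrounding slashes and pass the pattern as a verbatim string: @"\.\s".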
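
To make the behaviour of that pattern concrete, here is a minimal sketch. The class and method names (SentenceSplitDemo, SplitDroppingDot, SplitKeepingDot) are made up for illustration and do not come from the thread. One thing to be aware of: Regex.Split consumes whatever the pattern matches, so the period disappears from each piece. The second method is an optional variant, not something the answers above proposed: it uses a lookbehind so only the whitespace is consumed and the period stays attached to its sentence.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class SentenceSplitDemo
{
    // Splits on a period followed by one whitespace character.
    // Both the period and the whitespace are consumed by the split.
    public static string[] SplitDroppingDot(string text)
    {
        return Regex.Split(text, @"\.\s");
    }

    // Variant: splits on a whitespace character that is preceded by a period.
    // The lookbehind (?<=\.) matches without consuming, so the period is kept.
    public static string[] SplitKeepingDot(string text)
    {
        return Regex.Split(text, @"(?<=\.)\s");
    }
}
```

For the input "First one. Second one.\nThird one.", SplitDroppingDot gives "First one", "Second one", "Third one." (the last period has no whitespace after it, so it survives), while SplitKeepingDot gives "First one.", "Second one.", "Third one.".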


You don't need a regular expression for that. Just use the overload of string.Split that takes an array of strings:

string[] splitters = new string[] { ". ", ".\t", "." + Environment.NewLine };
string[] sentences = aText.Split(splitters, StringSplitOptions.None);
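
One caveat with the snippet above: Environment.NewLine is "\r\n" on Windows and "\n" on Unix-like systems, so on Windows a text that uses a bare "\n" (as in the question) would not be split at those points. A small sketch that lists both newline conventions explicitly, assuming you want the same result regardless of platform; PlainSplitDemo and SplitSentences are hypothetical names used only for this example:

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class PlainSplitDemo
{
    // Separators: period + space, period + tab, and period + either newline style.
    private static readonly string[] Splitters = { ". ", ".\t", ".\r\n", ".\n" };

    // Splits without any regex; each matched separator (period included) is removed.
    public static string[] SplitSentences(string text)
    {
        return text.Split(Splitters, StringSplitOptions.None);
    }
}
```

With this, "A. B.\r\nC.\nD" splits into "A", "B", "C", "D" on any platform.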