I'm using this method to split some text:
String[] parts = sentence.split("[,\\s\\-:\\?\\!\\«\\»\\'\\´\\`\\\"\\.\\\\\\/]");
This splits the text on the specified symbols. One of the symbols is "-", because my text contains odd things like this: "-------------- words --- words2 --words3--words4". Including "-" suits my needs, because without it a chunk such as "---words3---words4" would be treated as a single word.
But there is a catch. I want to allow words like "aaa-bbb", which are matched by this pattern:
Pattern pattern = Pattern.compile("(?<![A-Za-z-])[A-Za-z]+-[A-Za-z]+(?![A-Za-z-])");
Allowed: aaa-bb, aaa-bbbbbbb. Not allowed: aaa--bb, aa--bbb-cc.
So my question is: is it possible to split my text with the split above, while still keeping words that match this pattern (like aaa-bbb) intact?
Thanks in advance, Richard
From what I gather you are after the following:
String[] parts = sentence.split("-{2,}");
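If you also need the other separators from the question, one possible approach is to do it in two passes: split on everything except the hyphen first, then keep a chunk whole only when it is letters, one hyphen, letters, and break every other chunk on runs of hyphens. This is only a sketch under those assumptions; the class and method names (HyphenAwareSplitter, tokenize) are made up for illustration, and unlike the original split it drops empty strings instead of returning them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answer.
class HyphenAwareSplitter {
    // The separators from the question, minus the hyphen.
    // \u00AB, \u00BB and \u00B4 are the characters «, » and ´.
    private static final Pattern SEPARATORS =
            Pattern.compile("[,\\s:?!\u00AB\u00BB'\u00B4`\".\\\\/]+");

    // Exactly one hyphen between two runs of ASCII letters, e.g. aaa-bbb.
    private static final Pattern HYPHENATED_WORD =
            Pattern.compile("[A-Za-z]+-[A-Za-z]+");

    static List<String> tokenize(String sentence) {
        List<String> words = new ArrayList<>();
        for (String chunk : SEPARATORS.split(sentence)) {
            if (HYPHENATED_WORD.matcher(chunk).matches()) {
                // The whole chunk is a word like aaa-bbb: keep it intact.
                words.add(chunk);
                continue;
            }
            // Otherwise every run of hyphens acts as a separator.
            for (String piece : chunk.split("-+")) {
                if (!piece.isEmpty()) {
                    words.add(piece);
                }
            }
        }
        return words;
    }
}
```

Calling HyphenAwareSplitter.tokenize(sentence) on the sample text from the question should then return words, words2, words3 and words4 as separate entries, keep aaa-bbb as one word, and break aa--bbb-cc into aa, bbb and cc.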