I'm trying to split a string formatted like Bananas|,|Bananas|||Bananas|Oranges|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Green Apples|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Red Apples|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Pears
with a regex, on the |||
or |,|
delimiters. I'm using [a-zA-Z |]+\|[,|\0]\|
, but I have a small issue: the triple-pipe delimiter is captured by the [a-zA-Z |]
character class.
Is there a way to change the [a-zA-Z |]
character class to only accept one pipe character in a row, while allowing any number of the other ones? (I.e. it should accept accessories|batteries
but not accessories||batteries
.)
Another example: out of the original string, the regex should accept Bananas|Oranges|,|
or Bananas|||
, not Bananas|||Bananas|Oranges|,|
, with any number of single-pipe delimited names before the |[,|]|
.
I think you want a group that matches a name, [a-zA-Z ]+, always followed by a single \|. The group can repeat many times, and the whole run is always terminated by ,| or || (coming after the last name's trailing |), hence (,|\|)\|.
Altogether: ([a-zA-Z ]+\|)+(,|\|)\|
Since you said you're using Java, an alternate approach would be to compute:
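s.replace("|||", "|,|").split("\\|,\\|");
where s is your starting string. (replace takes a literal string, while split takes a regex, so the pipes have to be escaped there.)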
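One thing to watch with this approach: String.replaceAll and String.split both treat their first argument as a regex, where an unescaped | means alternation. A minimal sketch that sidesteps this, using String.replace (literal replacement) and Pattern.quote, might look like the following. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
class NormalizeThenSplit {
    // Rewrites every "|||" as "|,|", then splits on the literal "|,|".
    static String[] splitRecords(String s) {
        return s.replace("|||", "|,|").split(Pattern.quote("|,|"));
    }

    public static void main(String[] args) {
        String s = "Bananas|,|Bananas|||Bananas|Oranges";
        for (String record : splitRecords(s)) {
            System.out.println(record);
        }
    }
}
```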
Why not use a non-greedy quantifier on your regular expression? That way it will stop at the first |||
or |,|
that it finds.
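A sketch of the non-greedy idea in Java, assuming the delimiter part of the pattern is written as \|[,|]\| (the class and method names are made up for illustration): the lazy +? grows the match one character at a time and stops as soon as a delimiter follows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class LazyQuantifierDemo {
    // "+?" is the reluctant (non-greedy) form of "+".
    static final Pattern CHUNK = Pattern.compile("[a-zA-Z |]+?\\|[,|]\\|");

    // Collects each chunk up to and including the nearest delimiter.
    static List<String> chunks(String s) {
        List<String> out = new ArrayList<>();
        Matcher m = CHUNK.matcher(s);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "Bananas|||Bananas|Oranges|,|Bananas|Oranges|Pears";
        for (String chunk : chunks(s)) {
            System.out.println(chunk);
        }
    }
}
```

As with any find()-based approach, a final record without a trailing delimiter is not matched.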
Maybe I'm missing something, but why can't you do a straight split using the regex \|\|\||\|,\|
? Here is a tested script that works for me:
import java.util.regex.Pattern;

public class TEST {
    public static void main(String[] args) {
        String subjectString = "Bananas|,|Bananas|||Bananas|Ora" +
            "nges|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Gre" +
            "en Apples|,|Bananas|||Bananas|Oranges|||Bananas|Orange" +
            "s|Red Apples|,|Bananas|||Bananas|Oranges|||Bananas|Ora" +
            "nges|Pears";
        // Split on either delimiter, "|||" or "|,|". Every literal pipe is
        // escaped because an unescaped | means alternation in a regex.
        Pattern regex = Pattern.compile("\\|\\|\\||\\|,\\|");
        String[] splitArray = regex.split(subjectString);
        // Print one record per line.
        for (String part : splitArray) {
            System.out.println(part);
        }
    }
}
Here is the output:
Bananas
Bananas
Bananas|Oranges
Bananas
Bananas|Oranges
Bananas|Oranges|Green Apples
Bananas
Bananas|Oranges
Bananas|Oranges|Red Apples
Bananas
Bananas|Oranges
Bananas|Oranges|Pears