
Accepting only a single character in a row in a regular expression

开发者 (Developer) https://www.devze.com 2023-02-22 05:13 Source: Internet

I'm trying to split a string formatted like Bananas|,|Bananas|||Bananas|Oranges|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Green Apples|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Red Apples|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Pears with a regex, on the ||| or |,| delimiters. I'm using [a-zA-Z |]+\|[,|\0]\|, but I have a small issue: the triple-pipe delimiter is captured by the [a-zA-Z |] character class.

Is there a way to change the [a-zA-Z |] character class to only accept one pipe character in a row, while allowing any number of the other ones? (I.e. it should accept accessories|batteries but not accessories||batteries.)

Another example: from the original string, the regex should accept Bananas|Oranges|,| or Bananas|||, but not Bananas|||Bananas|Oranges|,|, with any number of single-pipe-delimited names before the |[,|]|.


I think you want a group that matches [a-zA-Z ]+ followed by a \|. The group can repeat any number of times and is always terminated by ,| or || (coming after the group's own trailing |), which is (,|\|)\|

Altogether: ([a-zA-Z ]+\|)+(,|\|)\|
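To try the combined pattern against data shaped like the question's, a minimal sketch with java.util.regex might look like this. The class name GroupedDelimiterDemo and the helper findChunks are made up for illustration, and the input is a shortened version of the original string. Note that the last item (Bananas|Oranges|Pears) has no trailing delimiter, so this pattern does not pick it up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class GroupedDelimiterDemo {
    // One or more "name|" groups, then ",|" or "||" closing the chunk.
    private static final Pattern CHUNK =
            Pattern.compile("([a-zA-Z ]+\\|)+(,|\\|)\\|");

    // Returns every chunk, including its trailing delimiter, found in s.
    static List<String> findChunks(String s) {
        List<String> chunks = new ArrayList<>();
        Matcher m = CHUNK.matcher(s);
        while (m.find()) {
            chunks.add(m.group());
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Shortened sample in the same format as the question's string.
        String s = "Bananas|,|Bananas|||Bananas|Oranges|,|Bananas|Oranges|Pears";
        for (String chunk : findChunks(s)) {
            System.out.println(chunk);
        }
    }
}
```

Because each repetition of the group must end in exactly one pipe, two pipes in a row can only be consumed by the terminator, which is what keeps a match from running across a ||| delimiter.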


Since you said you're using Java, an alternate approach would be to compute:

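s.replace("|||", "|,|").split("\\|,\\|");

where s is your starting string. (String.replace treats both arguments literally, while split takes a regex, so the pipes have to be escaped there.)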
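A small runnable sketch of this normalize-then-split idea follows. It assumes the delimiters are exactly ||| and |,| and uses Pattern.quote so that the pipes passed to split are taken literally rather than as regex alternation. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class NormalizeThenSplitDemo {
    static List<String> splitOnDelimiters(String s) {
        // String.replace works on literal text, so "|||" needs no escaping.
        String normalized = s.replace("|||", "|,|");
        // String.split expects a regex; Pattern.quote makes "|,|" literal.
        return Arrays.asList(normalized.split(Pattern.quote("|,|")));
    }

    public static void main(String[] args) {
        // Shortened sample in the same format as the question's string.
        String s = "Bananas|,|Bananas|||Bananas|Oranges";
        for (String item : splitOnDelimiters(s)) {
            System.out.println(item);
        }
    }
}
```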


Why not use a non-greedy quantifier on your regular expression? That way it will stop at the first ||| or |,| that it finds.
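As a rough illustration, the sketch below makes the question's character class lazy with +? and captures the part before the delimiter. It assumes the delimiter class is simplified to [,|] (dropping the \0 from the original pattern), and the class and method names are invented for the example. The last item in the input has no trailing delimiter, so it is not matched and would need separate handling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
final class LazyQuantifierDemo {
    // "+?" is the lazy (reluctant) form: it grows only as far as needed
    // for the rest of the pattern, "|,|" or "|||", to match.
    private static final Pattern ITEM =
            Pattern.compile("([a-zA-Z |]+?)\\|[,|]\\|");

    // Returns the text captured before each delimiter in s.
    static List<String> findItemsLazily(String s) {
        List<String> items = new ArrayList<>();
        Matcher m = ITEM.matcher(s);
        while (m.find()) {
            items.add(m.group(1));
        }
        return items;
    }

    public static void main(String[] args) {
        // Shortened sample in the same format as the question's string.
        String s = "Bananas|,|Bananas|||Bananas|Oranges|,|Bananas|Oranges|Pears";
        for (String item : findItemsLazily(s)) {
            System.out.println(item);
        }
    }
}
```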


Am I missing something, or could you just do a straight split using the regex \|\|\||\|,\|? Here is a tested program that works for me:

import java.util.regex.Pattern;

public class TEST {
    public static void main(String[] args) {
        String subjectString = "Bananas|,|Bananas|||Bananas|Ora" +
        "nges|,|Bananas|||Bananas|Oranges|||Bananas|Oranges|Gre" +
        "en Apples|,|Bananas|||Bananas|Oranges|||Bananas|Orange" +
        "s|Red Apples|,|Bananas|||Bananas|Oranges|||Bananas|Ora" +
        "nges|Pears";
        // Split on either delimiter: a literal "|||" or a literal "|,|".
        Pattern regex = Pattern.compile("\\|\\|\\||\\|,\\|");
        String[] splitArray = regex.split(subjectString);
        // Print each piece on its own line.
        for (String piece : splitArray) {
            System.out.println(piece);
        }
    }
}

Here is the output:

Bananas
Bananas
Bananas|Oranges
Bananas
Bananas|Oranges
Bananas|Oranges|Green Apples
Bananas
Bananas|Oranges
Bananas|Oranges|Red Apples
Bananas
Bananas|Oranges
Bananas|Oranges|Pears
