problem with java split()

Developer https://www.devze.com 2023-01-10 13:31 Source: the web
I have a string:

strArray= "-------9---------------";

I want to find 9 from the string. The string may be like this:

strArray= "---4-5-5-7-9---------------";

Now I want to extract only the digits from the string. I need values such as 9 and 4, and I want to ignore the '-' characters. I tried the following:

strArray= strignId.split("-");

but I get an error, since there are multiple '-' characters, and I don't get my output. So which Java function should be used?

My input and output should be as follows:

input="-------9---------------";
    output="9";
input="---4-5-5-7-9---------------";
    output="45579";

What should I do?


The + is the regex metacharacter for "one-or-more" repetition, so the pattern -+ means "one or more dashes". This would allow you to use str.split("-+") instead, but you may get an empty string as the first element if the input starts with a dash.

If you just want to remove all -, then you can do str = str.replace("-", ""). This uses replace(CharSequence, CharSequence) method, which performs literal String replacement, i.e. not regex patterns.

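If you want a String[] with each digit in its own element, then it's easiest to do in two steps: first remove all non-digits, then split on the zero-width assertion (?!^), which matches everywhere except at the beginning of the string. The assertion prevents an empty string as the first element on Java 7 and earlier; Java 8 and later never produce a leading empty string for a zero-width match at the start, so there it is harmless but not strictly needed. If you want a char[], then you can just call String.toCharArray().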

Lastly, if the string can be very long, it's better to use a java.util.regex.Matcher in a find() loop looking for a digit \d, or a java.util.Scanner with a delimiter \D*, i.e. a sequence (possibly empty) of non-digits. This will not give you an array, but you can use the loop to populate a List (see Effective Java 2nd Edition, Item 25: Prefer lists to arrays).

References

  • regular-expressions.info/Repetition with Star and Plus, Character Class, Lookaround

Snippets

Here are some examples to illustrate the above ideas:

    System.out.println(java.util.Arrays.toString(
        "---4--5-67--8-9---".split("-+")
    ));
    // [, 4, 5, 67, 8, 9]
    // note the empty string as first element

    System.out.println(
        "---4--5-67--8-9---".replace("-", "")
    );
    // 456789

    System.out.println(java.util.Arrays.toString(
        "abcdefg".toCharArray()
    ));
    // [a, b, c, d, e, f, g]

The next example first deletes all non-digits \D, then splits everywhere except at the beginning of the string (?!^), to get a String[] in which each element contains one digit:

    System.out.println(java.util.Arrays.toString(
        "@*#^$4@!#5ajs67>?<{8_(9SKJDH"
            .replaceAll("\\D", "")
            .split("(?!^)")
    ));
    // [4, 5, 6, 7, 8, 9]

This uses a Scanner, with \D* as delimiter, to get each digit as its own token, using it to populate a List<String>:

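    import java.util.ArrayList;
    import java.util.List;
    import java.util.Scanner;

    List<String> digits = new ArrayList<String>();
    String text = "(&*!@#123ask45{P:L6";
    // \D* matches a (possibly empty) run of non-digits, so every digit is its own token
    Scanner sc = new Scanner(text).useDelimiter("\\D*");
    while (sc.hasNext()) {
        digits.add(sc.next());
    }
    System.out.println(digits);
    // [1, 2, 3, 4, 5, 6]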
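
The Matcher-based alternative mentioned earlier has no snippet of its own, so here is a minimal sketch of a find() loop that collects each digit into a List<String>. The class name DigitFinder and the method findDigits are hypothetical names chosen for illustration; only Pattern, Matcher, and find() come from the answer itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class DigitFinder {
    // Compiled once and reused; \d matches a single digit.
    private static final Pattern DIGIT = Pattern.compile("\\d");

    // Collects every digit in the text, in order of appearance.
    static List<String> findDigits(String text) {
        List<String> digits = new ArrayList<>();
        Matcher m = DIGIT.matcher(text);
        while (m.find()) {
            digits.add(m.group());
        }
        return digits;
    }

    public static void main(String[] args) {
        System.out.println(findDigits("---4-5-5-7-9---------------"));
        // [4, 5, 5, 7, 9]
        System.out.println(findDigits("-------9---------------"));
        // [9]
    }
}
```

Unlike the split-based approaches, this never builds an intermediate array of all the pieces, which is why it may suit very long inputs better.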

Common problems with split()

Here are some common beginner problems when dealing with String.split:

Lesson #1: split takes a regular expression pattern

This is probably the most common beginner mistake:

    System.out.println(java.util.Arrays.toString(
        "one|two|three".split("|")
    ));
    // Java 7 and earlier: [, o, n, e, |, t, w, o, |, t, h, r, e, e]
    // Java 8 and later:   [o, n, e, |, t, w, o, |, t, h, r, e, e]

    System.out.println(java.util.Arrays.toString(
        "not.like.this".split(".")
    ));
    // []

The problem here is that | and . are regex metacharacters, and since they are intended to be matched literally, they need to be escaped by preceding with a backslash, which as a Java string literal is "\\".

    System.out.println(java.util.Arrays.toString(
        "one|two|three".split("\\|")
    ));
    // [one, two, three]

    System.out.println(java.util.Arrays.toString(
        "not.like.this".split("\\.")
    ));
    // [not, like, this]
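
As an alternative to escaping by hand, a delimiter can be wrapped with java.util.regex.Pattern.quote, which makes the whole string match literally. This is not part of the original answer; it is a small sketch, and the class LiteralSplit and method splitLiteral are hypothetical names.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original answer.
public class LiteralSplit {
    // Splits on the delimiter taken literally, whatever metacharacters it contains.
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitLiteral("one|two|three", "|")));
        // [one, two, three]
        System.out.println(Arrays.toString(splitLiteral("not.like.this", ".")));
        // [not, like, this]
    }
}
```

This can be handy when the delimiter comes from user input or configuration and you cannot know in advance which characters need escaping.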

Lesson #2: split discards trailing empty strings by default

Sometimes you want to keep trailing empty strings, which split discards by default:

    System.out.println(java.util.Arrays.toString(
        "a;b;;d;;;g;;".split(";")
    ));
    // [a, b, , d, , , g]

Note that there are slots for the "missing" values for c, e, f, but not for h and i. To fix this, you can use a negative limit argument to String.split(String regex, int limit).

    System.out.println(java.util.Arrays.toString(
        "a;b;;d;;;g;;".split(";", -1)
    ));
    // [a, b, , d, , , g, , ]

You can also use a positive limit of n to apply the pattern at most n - 1 times (i.e. resulting in no more than n elements in the array).
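
To illustrate the positive limit, here is a short sketch that reuses the same input string. The class name LimitDemo and the method splitAtMost are hypothetical; the behavior shown is that of String.split(String regex, int limit).

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class LimitDemo {
    // Applies the pattern at most (limit - 1) times, so at most `limit` elements result.
    static String[] splitAtMost(String text, String regex, int limit) {
        return text.split(regex, limit);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAtMost("a;b;;d;;;g;;", ";", 3)));
        // [a, b, ;d;;;g;;]
    }
}
```

Note that the last element holds the entire unsplit remainder, including any delimiters and trailing empty fields.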


Zero-width matching split examples

Here are more examples of splitting on zero-width matching constructs; this can be used to split a string but also keep "delimiters".

Simple sentence splitting, keeping punctuation marks:

    String str = "Really?Wow!This.Is.Awesome!";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<=[.!?])")
    )); // prints "[Really?, Wow!, This., Is., Awesome!]"

Splitting a long string into fixed-length parts, using \G:

    String str = "012345678901234567890";
    System.out.println(java.util.Arrays.toString(
        str.split("(?<=\\G.{4})")
    )); // prints "[0123, 4567, 8901, 2345, 6789, 0]"

Splitting before capital letters (except the first!):

    System.out.println(java.util.Arrays.toString(
        "OhMyGod".split("(?=(?!^)[A-Z])")
    )); // prints "[Oh, My, God]"
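
Combining a lookahead with a lookbehind lets you keep delimiters as separate elements. The sketch below targets the "abc<def>ghi<x><x>" case listed in the related questions; the pattern is one possible way to get that result and is not necessarily the one used in the linked answer. The class name KeepDelimiters and the method splitAroundTags are hypothetical.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class KeepDelimiters {
    // Splits before every '<' and after every '>', both zero-width,
    // so no character is consumed by the split.
    static String[] splitAroundTags(String text) {
        return text.split("(?=<)|(?<=>)");
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitAroundTags("abc<def>ghi<x><x>")));
        // [abc, <def>, ghi, <x>, <x>]
    }
}
```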

A variety of examples is provided in related questions below.

References

  • regular-expressions.info/Lookarounds

Related questions

  • Can you use zero-width matching regex in String split?
    • "abc<def>ghi<x><x>" -> "abc", "<def>", "ghi", "<x>", "<x>"
  • How do I convert CamelCase into human-readable names in Java?
    • "AnXMLAndXSLT2.0Tool" -> "An XML And XSLT 2.0 Tool"
    • C# version: is there a elegant way to parse a word and add spaces before capital letters
  • Java split is eating my characters
  • Is there a way to split strings with String.split() and include the delimiters?
  • Regex split string but keep separators


You don't use split!

split is for getting the things BETWEEN the separators.

Here you want to eliminate the unwanted character, '-', instead.

The solution is simple:

    out = in.replaceAll("-", "");


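Use something like this to split the string into its individual values. I'd rather eliminate the unwanted characters first, to avoid getting empty strings in the result array.

    import java.util.Vector;

    // The method name and signature are assumed; the original snippet omitted them.
    // The separator must be non-empty, otherwise the loop never terminates.
    static String[] split(String original, String separator) {
      final Vector<String> nodes = new Vector<String>();
      int index = original.indexOf(separator);
      while (index >= 0) {
        // Take the text before the separator, then continue with the rest.
        nodes.addElement(original.substring(0, index));
        original = original.substring(index + separator.length());
        index = original.indexOf(separator);
      }
      // Whatever remains after the last separator is the final element.
      nodes.addElement(original);
      final String[] result = new String[nodes.size()];
      for (int loop = 0; loop < nodes.size(); loop++) {
        result[loop] = nodes.elementAt(loop);
      }
      return result;
    }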