Optimal regex to extract a single decimal number in an arbitrary string

Developer https://www.devze.com 2023-04-03 15:43 Source: Internet
Assuming a target string that is otherwise arbitrary but guaranteed to contain a single decimal number (one or more digits), I came up with the following regex pattern:

.*?(\d+).*?

So, if the target string is "(this is number 200)", for example, Matcher.group(1) will contain the number.
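
For reference, here is a minimal sketch of how the pattern above might be applied. The class name QuestionPatternDemo and the helper firstNumber are made up for illustration and are not part of the original question. Because the pattern begins and ends with `.*?`, the sketch assumes it is used with Matcher.matches(), which requires the whole string to match; group(1) is only readable after a successful match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuestionPatternDemo {
    // The pattern from the question, compiled once and reused.
    static final Pattern WHOLE = Pattern.compile(".*?(\\d+).*?");

    // Hypothetical helper: returns the digits as a String, or null if the text has none.
    static String firstNumber(String text) {
        Matcher m = WHOLE.matcher(text);
        // matches() must succeed before group(1) can be read.
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("(this is number 200)")); // prints 200
    }
}
```

Note that `.` does not match line terminators by default, so this form would not match a multi-line string unless Pattern.DOTALL is set.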

Is there a more optimal regex pattern (or non-regex method) to extract this number?

By "optimal" I mean fastest (possibly with the least amount of CPU cycles). Java only.


Just (\d+) is more than enough.
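
A minimal sketch of this suggestion, assuming the pattern is compiled once and reused. Matcher.find() scans forward to the first match on its own, so the surrounding `.*?` is not needed, and because the whole match is the number, group() can be used without a capture group. The names FindDigitsDemo and firstInt, and the choice of -1 for "no number", are assumptions made for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FindDigitsDemo {
    // No capture group: the entire match is the number.
    static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits as an int, or -1 when there is none.
    // Integer.parseInt throws NumberFormatException if the run does not fit in an int.
    static int firstInt(String text) {
        Matcher m = DIGITS.matcher(text);
        return m.find() ? Integer.parseInt(m.group()) : -1;
    }

    public static void main(String[] args) {
        System.out.println(firstInt("(this is number 200)")); // prints 200
    }
}
```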


I am sure a regex plus parseInt will perform well enough for you. However, for your interest, I have compared it with a simple loop.

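import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Class name chosen for illustration; the original snippet omitted the wrapper.
public class DigitsBenchmark {
  public static final Pattern DIGITS = Pattern.compile("(\\d+)");

  public static void main(String[] args) {
    String text = "Some text before a number 123456 and some after";
    // Repeat a few times so that later rounds run after JIT warm-up.
    for (int i = 0; i < 5; i++) {
      timeRegex(text);
      timeLooping(text);
    }
  }

  private static int timeLooping(String text) {
    int ret = 0;
    final int runs = 1000;
    long start = System.nanoTime();
    for (int r = 0; r < runs; r++) {
      // Reset per run; otherwise every run after the first breaks at the first character.
      ret = 0;
      boolean found = false;
      for (int i = 0; i < text.length(); i++) {
        char ch = text.charAt(i);
        if (ch <= '9' && ch >= '0') {
          ret = ret * 10 + ch - '0';
          found = true;
        } else if (found) {
          break; // end of the first run of digits
        }
      }
    }
    long time = System.nanoTime() - start;
    System.out.printf("Took %,d ns to use a loop on average%n", time / runs);
    return ret;
  }

  private static int timeRegex(String text) {
    int ret = 0;
    final int runs = 1000;
    long start = System.nanoTime();
    for (int r = 0; r < runs; r++) {
      Matcher m = DIGITS.matcher(text);
      if (m.find())
        ret = Integer.parseInt(m.group());
    }
    long time = System.nanoTime() - start;
    System.out.printf("Took %,d ns to use a matcher on average%n", time / runs);
    return ret;
  }
}

The figures below come from the code as originally posted, in which ret was not reset between runs, so every run after the first stopped at the first character. The loop timings therefore understate the cost of a full scan. It printed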

Took 19,803 ns to use a matcher on average
Took 85 ns to use a loop on average
Took 12,411 ns to use a matcher on average
Took 83 ns to use a loop on average
Took 8,199 ns to use a matcher on average
Took 79 ns to use a loop on average
Took 11,156 ns to use a matcher on average
Took 104 ns to use a loop on average
Took 4,527 ns to use a matcher on average
Took 94 ns to use a loop on average
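
Outside the timing harness, the loop approach could be packaged as a small helper. This is only a sketch under assumptions not stated in the original: the names LoopDigitsDemo and firstInt are made up, the number is assumed to be non-negative and to fit in an int (there is no overflow check), and -1 is an arbitrary signal for "no digit found". It tracks whether a digit has been seen with a flag, so a number such as 0 or one with leading zeros still ends the scan.

```java
final class LoopDigitsDemo {
    // Returns the first run of ASCII digits as an int, or -1 if there is none.
    // Assumes the value fits in an int; overflow is not detected.
    static int firstInt(String text) {
        int value = 0;
        boolean inNumber = false;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch >= '0' && ch <= '9') {
                value = value * 10 + (ch - '0');
                inNumber = true;
            } else if (inNumber) {
                break; // the first run of digits has ended
            }
        }
        return inNumber ? value : -1;
    }

    public static void main(String[] args) {
        String text = "Some text before a number 123456 and some after";
        System.out.println(firstInt(text)); // prints 123456
    }
}
```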