How do I make a regex match for measurement units?

Developer https://www.devze.com 2022-12-17 07:08 Source: Internet
I'm building a small Java library which has to match units in strings. For example, if I have "300000000 m/s^2", I want it to match against "m" and "s^2".

So far, I have tried most of the configurations I could think of, all resembling this one (I hope it's a good start):

"[[a-zA-Z]+[\\^[\\-]?[0-9]+]?]+"

To clarify, I need something that will match letters[^[-]numbers] (where [ ] denotes non obligatory parts). That means: letters, possibly followed by an exponent which is possibly negative.

I have studied regex a little bit, but I'm really not fluent, so any help will be greatly appreciated!

Thank you very much,

EDIT: I have just tried the first 3 replies

String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";

and none of them work. I know the code that tests the patterns works, because if I try something simple, like matching "[0-9]+" against "12345", it matches the whole string. So I don't see what's still wrong. At the moment I'm trying to change my brackets to parentheses where needed...

CODE USED TO TEST:

public static void main(String[] args) {
    String input = "300000000 m/s^2";

//    String input = "35345";

    String regex1 = "([a-zA-Z]+)(?:\\^(-?\\d+))?";
    String regex2 = "[a-zA-Z]+(\\^-?[0-9]+)?";
    String regex3 = "[a-zA-Z]+(?:\\^-?[0-9]+)?";
    String regex10 = "[0-9]+";
    String regex = "([a-zA-Z]+)(?:\\^\\-?[0-9]+)?";
    Pattern pattern = Pattern.compile(regex3);
    Matcher matcher = pattern.matcher(input);

    if (matcher.matches()) {
        System.out.println("MATCHES");
        do {
            int start = matcher.start();
            int end = matcher.end();
//            System.out.println(start + " " + end);
            System.out.println(input.substring(start, end));
        } while (matcher.find());
    }

}


([a-zA-Z]+)(?:\^(-?\d+))?

You don't need a character class [...] when you're matching a single character. (...) here is a capturing group, so you can extract the unit and the exponent later. (?:...) is a non-capturing group.
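To illustrate the capturing groups, here is a minimal sketch of how the unit and exponent might be pulled out. The class name `UnitExtractor`, the method `extract`, and the choice to report a missing exponent as 1 are assumptions made for the example, not part of the question. It does not treat "/" specially, so "s^2" is reported with exponent 2 rather than -2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnitExtractor {
    // Group 1: the unit letters. Group 2: the optional, possibly negative exponent.
    private static final Pattern UNIT =
            Pattern.compile("([a-zA-Z]+)(?:\\^(-?\\d+))?");

    // Hypothetical helper: returns each unit as "name^exponent",
    // assuming an exponent of 1 when none is written.
    static List<String> extract(String input) {
        List<String> units = new ArrayList<>();
        Matcher m = UNIT.matcher(input);
        // find() scans forward for the next substring that matches the pattern.
        while (m.find()) {
            String name = m.group(1);
            String exp = m.group(2); // null when the optional part did not match
            int exponent = (exp == null) ? 1 : Integer.parseInt(exp);
            units.add(name + "^" + exponent);
        }
        return units;
    }

    public static void main(String[] args) {
        System.out.println(extract("300000000 m/s^2")); // prints [m^1, s^2]
    }
}
```

Because the exponent sits in its own group inside the non-capturing group, `group(2)` holds only the digits and optional minus sign, without the "^".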


You're mixing up square brackets, which denote character classes, with parentheses, which group. Try this instead:

[a-zA-Z]+(\^-?[0-9]+)?

In many regular expression dialects you can use \d to mean any digit instead of [0-9].


Try

"[a-zA-Z]+(?:\\^-?[0-9]+)?"
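Regarding the edit in the question: all three patterns look fine, so the test code is the likely culprit. `Matcher.matches()` succeeds only if the entire input matches the pattern, which is why "[0-9]+" works on "12345" but nothing works on "300000000 m/s^2". `Matcher.find()` looks for the next matching substring instead. A minimal sketch of the difference (the class and method names are made up for the example):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindVersusMatches {
    static final Pattern UNIT = Pattern.compile("[a-zA-Z]+(?:\\^-?[0-9]+)?");

    // True only when the whole input is a single unit such as "s^2".
    static boolean wholeInputMatches(String input) {
        return UNIT.matcher(input).matches();
    }

    // Collects every substring that matches, separated by ", ".
    static String joinMatches(String input) {
        StringBuilder sb = new StringBuilder();
        Matcher m = UNIT.matcher(input);
        while (m.find()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(m.group()); // the text of the current match
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "300000000 m/s^2";
        System.out.println(wholeInputMatches(input)); // prints false
        System.out.println(wholeInputMatches("s^2")); // prints true
        System.out.println(joinMatches(input));       // prints m, s^2
    }
}
```

So in the test code, replacing the `if (matcher.matches())` and do/while construct with a plain `while (matcher.find())` loop should print "m" and "s^2".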
