I have the following regex:
\b((?:12345)(\d{1,4})1)\b
What I'm trying to do is make sure the number always ends in 1, regardless of the range or choice of digits in between. What I'm having a difficult time figuring out is how to handle an example like 12345 111 1 (note: I've separated the grouped numbers to avoid confusion). How can the regex tell the numbers in the range group apart from the '1' that follows the group? Thank you for your help.
UPDATE:
No specific programming language. The regex is stored in a database, then pulled and used as a reference check by JavaScript.
UPDATE:
Some Examples to clarify:
- A user enters 1234511. How would the regex engine know whether that's valid? I.e., how does the engine know if it's the valid 12345 1 1 (a single '1' with the required ending '1') or the invalid 12345 11 (the group of 1's is part of the (\d{1,4}) portion of the regex, but the string doesn't include the '1' at the end)?
- A user enters 1234510111. This would be valid: 12345 1011 1 (the '1011' is part of the group (\d{1,4}), and the string includes the 1 at the end).
Summary:
The regex must recognize any group of 1 to 4 digits, but the string must always end with a 1.
The regex already captures those numbers into a capturing group:
var str = "123459991";
var match = str.match(/\b((?:12345)(\d{1,4})1)\b/);
alert(match[2]); // 999
Capturing groups are the parentheses you have in your regex.
- Group 0 is the whole match by default (always match[0], for every regular expression).
- Group 1 is ((?:12345)(\d{1,4})1), which is the whole match again (not too useful).
- Group 2 is (\d{1,4}), which is the number you're looking for.
12345 is not captured to a group: (?:...) is a non-capturing group, and isn't included as a token in the results. Again, this doesn't add much; you could simply write 12345.
You can simplify the regex to /\b12345(\d{1,4})1\b/ and not lose any information.
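As a quick check, here is a minimal sketch that runs the simplified pattern against the inputs from the question. The helper name middleDigits is made up for this illustration; note that with the outer group removed, the digits of interest move from group 2 to group 1.

```javascript
// Simplified pattern: literal 12345, then 1 to 4 digits, then a final 1.
var simplified = /\b12345(\d{1,4})1\b/;

// Hypothetical helper (not part of the original post): returns the captured
// middle digits, or null when the input does not match.
function middleDigits(input) {
  var m = input.match(simplified);
  return m ? m[1] : null;
}

console.log(middleDigits("1234511"));    // "1"    (12345 1 1)
console.log(middleDigits("1234510111")); // "1011" (12345 1011 1)
console.log(middleDigits("123459991"));  // "999"
console.log(middleDigits("123451"));     // null   (no digit left for the group)
```

So 1234511 is accepted: the engine finds the reading 12345 1 1 on its own, because the other reading (12345 11) leaves nothing for the final 1.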
How is it matched:
Generally, it's important to remember the regex engine tries really hard to match your input. Specifically, it tries all combinations (and all possible starting positions) until it can match the pattern to the text.
For example, if the text is 1234599991, matching is easy:
- match 12345 (character by character, of course)
- match \d{4} against 9999
- match 1
Next example: 123459991
- match 12345
- match \d against 9
- match \d against 9
- match \d against 9
- match \d against 1
- try to match 1, but fail (it was already consumed by \d{1,4})
- backtrack the last matched character:
- now \d{1,4} matches 999
- 1 can be matched.
- done.
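The steps above can be imitated by hand. This is only a rough sketch of the idea, not how a real engine is implemented: traceMatch is a hypothetical helper, and it assumes the input is the whole number (so the final 1 must be the last character, standing in for the closing \b).

```javascript
// Hypothetical helper: imitates greedy matching with backtracking for
// 12345(\d{1,4})1, assuming the input is the entire number.
function traceMatch(input) {
  var prefix = "12345";
  var steps = [];
  if (input.indexOf(prefix) !== 0) {
    return { matched: false, group: null, steps: steps };
  }
  var rest = input.slice(prefix.length);

  // Greedy phase: \d{1,4} grabs as many digits as it can, up to 4.
  var max = 0;
  while (max < 4 && max < rest.length && /\d/.test(rest.charAt(max))) {
    max++;
  }

  // Try the longest group first, then give back one digit at a time.
  for (var n = max; n >= 1; n--) {
    var group = rest.slice(0, n);
    var tail = rest.slice(n);
    if (tail === "1") {
      steps.push("\\d{1,4} takes " + group + ", final 1 matches");
      return { matched: true, group: group, steps: steps };
    }
    steps.push("\\d{1,4} takes " + group + ", final 1 fails, backtrack");
  }
  return { matched: false, group: null, steps: steps };
}

console.log(traceMatch("123459991").steps);
console.log(traceMatch("1234511").group);
```

For 123459991 the trace has two entries: the first attempt takes 9991 and fails, the second takes 999 and succeeds.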
See also: Backtracking, Greediness and Laziness
Have a look at this and see if it achieves what you want.
var regex = /^(([0-9]{5})([0-9]{1,4})(1))$/i;
var match1 = "1234511111".match(regex);
var match2 = "123451111".match(regex);
var match3 = "12345111".match(regex);
var match4 = "1234511".match(regex);
var match5 = "123451".match(regex);
console.log(match1);
console.log(match2);
console.log(match3);
console.log(match4);
console.log(match5);
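One difference worth seeing side by side: this pattern uses ^ and $ where the question uses \b. The sketch below compares the two, assuming the literal 12345 prefix from the question rather than any five digits; the variable names and the third sample string are invented for the comparison.

```javascript
// Assumption: keep the literal 12345 prefix from the question.
var anchored = /^12345([0-9]{1,4})1$/;  // the whole string must be the number
var bounded = /\b12345(\d{1,4})1\b/;    // the number may sit inside other text

var samples = ["1234511111", "123451", "id 1234511 ok"];
var results = samples.map(function (s) {
  return { input: s, anchored: anchored.test(s), bounded: bounded.test(s) };
});

console.log(results);
```

Both patterns accept 1234511111 and reject 123451 (too short to leave a digit for the group). Only the \b version accepts the number when it is surrounded by other text, so the anchored form is probably the safer choice when validating a single user-entered value.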