I am trying to validate an input in PHP with REGEX. I want to check whether the input has the %s
character group inside it and that it appears only once. Otherwise, the rule should fail.
Here's what I've tried:
preg_match('|^[0-9a-zA-Z_-\s:;,\.\?!\(\)\p{L}(%s){1}]*$|u', $value);
(there are also some other rules besides this; I've tried the (%s){1} part and it doesn't work).
I believe there is a very easy solution to this, but I'm not really into regexes... Thank you for your help!
If I understand your question, you need a positive lookahead. The lookahead lets the expression match only if the string contains exactly one %s.
preg_match('|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9a-zA-Z_\s:;,\.\?!\(\)\p{L}(%s){1}-]*$|u', $value);
I'll explain how each part works.
^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)
is a zero-width assertion of the form (?=regex), a positive lookahead: it must match, but it does not "eat" any characters. Each (?:(?!%s).)* steps over one character at a time, as long as that character does not start another %s, so the lookahead as a whole means that the line can have only one %s.
[0-9a-zA-Z_\s:;,\.\?!\(\)\p{L}(%s){1}-]*$
The remaining part of the regex also looks at the entire string and ensures that it is composed only of the characters in the character class (like your original regex, with the hyphen moved to the end so that it cannot be read as a range).
I managed to do this with PHP's substr_count() function, following Johnsyweb's suggestion to validate it another way, because the suggested regexes seem pretty complicated.
Thank you again!
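For anyone landing here later, a minimal sketch of that substr_count() approach might look like the following. The function name isValidTemplate() is made up for illustration, and the character class is a simplified reading of the one in the question (hyphen moved to the end, % allowed explicitly), so treat both as assumptions and adjust them to your own rules.

```php
<?php
// Hypothetical helper: the name and the exact character class are assumptions.
function isValidTemplate(string $value): bool
{
    // Rule 1: only allowed characters (letters, digits, whitespace,
    // some punctuation and the percent sign).
    $allowed = '|^[0-9_\s:;,.?!()%\p{L}-]*$|u';
    if (preg_match($allowed, $value) !== 1) {
        return false;
    }

    // Rule 2: the placeholder must occur exactly once.
    return substr_count($value, '%s') === 1;
}

var_dump(isValidTemplate('Hello %s!')); // bool(true)
var_dump(isValidTemplate('%s and %s')); // bool(false)
```

Keeping the two rules separate means the regex only has to describe the allowed characters, and the counting is done by a plain string function.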
Alternatively, you can use preg_match_all with a pattern that matches just %s and check the number of matches. Your anchored pattern matches the whole string at most once, so it cannot be used for counting. If the count is 1, then you're OK, something like this:
$result = (preg_match_all('|%s|', $value) == 1);
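Try this:
'|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9_\s:;,.?!()%\p{L}-]+$|u'
The character class has to include a literal %, otherwise no string containing %s could get past the main part of the pattern.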
The (%s){1} sequence inside the square brackets probably doesn't do what you think it does: within a character class it just adds the literal characters (, %, s, ), {, 1 and } to the set. But never mind, the solution is more complex. In fact, {1} should never appear anywhere in a regex. It doesn't ensure that there's only one of something, as many people assume. As a matter of fact, it doesn't do anything; it's pure clutter.
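A quick way to convince yourself of both points is a throwaway script like the one below. The variable names and test strings are invented for the demonstration; only preg_match() itself is real.

```php
<?php
// Inside [...] the parentheses, braces and the digit are ordinary characters,
// so this class accepts any mix of ( % s ) { 1 }.
$classOnly = preg_match('/^[(%s){1}]+$/', '1{s}%()');

// Outside a class, {1} repeats the group exactly once, which is what an
// unquantified group does anyway. Neither pattern cares that "%s" occurs
// twice in the subject: both simply find the first occurrence.
$withQuantifier    = preg_match('/(%s){1}/', '%s%s');
$withoutQuantifier = preg_match('/(%s)/', '%s%s');

var_dump($classOnly, $withQuantifier, $withoutQuantifier); // int(1) three times
```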
EDIT (in answer to the comment): To ensure that only one of a particular sequence is present in a string, you have to actively examine every single character, classifying it as either part-of-%s or not part-of-%s. To that end, (?:(?!%s).)* consumes one character at a time, after the negative lookahead has confirmed that the character is not the start of %s.
When that part of the lookahead expression quits matching, the next thing in the string has to be %s. Then the second (?:(?!%s).)*$ kicks in to confirm that there are no more %s sequences until the end of the string.
And don't forget that the lookahead expression must be anchored at both ends. Because the lookahead is the first thing after the main regex's start anchor, you don't need to add another ^. But the lookahead must end with its own $ anchor.
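To see the lookahead at work, here is a small sketch that wraps the pattern in a function. The name hasSinglePlaceholder() is hypothetical, and the character class in this sketch includes a literal %, on the assumption that a stray % is acceptable as long as %s itself appears only once.

```php
<?php
// Hypothetical wrapper around the lookahead pattern; the name is an assumption.
function hasSinglePlaceholder(string $value): bool
{
    // Lookahead: non-%s characters, one %s, non-%s characters, end of string.
    // Main part: every character must come from the allowed class.
    $pattern = '|^(?=(?:(?!%s).)*%s(?:(?!%s).)*$)[0-9_\s:;,.?!()%\p{L}-]+$|u';

    // preg_match() returns 1 on a match, 0 on no match and false on error.
    return preg_match($pattern, $value) === 1;
}

var_dump(hasSinglePlaceholder('Hello %s!'));   // bool(true)
var_dump(hasSinglePlaceholder('Hello world')); // bool(false): no %s at all
var_dump(hasSinglePlaceholder('%s and %s'));   // bool(false): second %s blocks the lookahead
```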
If you're not "into" regular expressions, why not solve this with PHP?
One call to the builtin strpos() will tell you if the string has a match. A second call will tell you if it appears more than once.
This will be easier for you to read and for others to maintain.
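The two strpos() calls could be wrapped up roughly like this. The helper name containsExactlyOnce() is invented for this sketch; the second call simply starts searching just past the first hit.

```php
<?php
// Hypothetical helper: true when $needle occurs in $haystack exactly once.
function containsExactlyOnce(string $haystack, string $needle): bool
{
    // First call: is the needle there at all?
    $first = strpos($haystack, $needle);
    if ($first === false) {
        return false;
    }

    // Second call: look again, starting right after the first occurrence.
    $second = strpos($haystack, $needle, $first + strlen($needle));

    return $second === false;
}

var_dump(containsExactlyOnce('Hello %s', '%s')); // bool(true)
var_dump(containsExactlyOnce('%s %s', '%s'));    // bool(false)
```

Note the strict === false comparisons: strpos() returns 0 when the match is at the very start of the string, and a loose comparison would mistake that for "not found".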