validate comma separated list using regex

Developer https://www.devze.com 2023-03-23 11:55 Source: Internet
I want to validate, using a regex, that the user has entered a comma-separated list of words only. Will this work, or is there a better way?

 $match = "#^([^0-9 A-z])(,\s|$))+$#";

This is not for parsing, as I will use explode for that; it is merely to validate that the user has correctly understood that values should be separated with commas.


I don't know what the separate values should look like, but perhaps this is something that can help you:

$value = '[0-9 A-Z]+';
$match = "~^$value(,$value)*$~i";

When you know what your values should look like you can change $value.


Update

Taking this comment from the question:

I was trying to disallow 0-9 but allow a-Z, because, as I said it should be a list of words only, no spaces, just a list of single words.

Just change $value

$value = '[A-Z]+';

Note that I use the i modifier in $match, so lowercase letters are matched too.
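To make the idea concrete, here is a minimal sketch that wraps the pattern in a helper and tries it on a few inputs. The function name isWordList is made up for illustration; the pattern itself is the one built from $value above.

```php
<?php
// Hypothetical helper (name is an assumption): true when $input is a
// comma-separated list of letter-only words, with no spaces allowed.
function isWordList(string $input): bool
{
    $value = '[A-Z]+';                 // one word: ASCII letters only
    $match = "~^$value(,$value)*$~i";  // a word, then zero or more ",word"
    return preg_match($match, $input) === 1;
}

var_dump(isWordList('foo,bar,baz'));  // bool(true)
var_dump(isWordList('foo, bar'));     // bool(false): space after the comma
var_dump(isWordList('foo,,bar'));     // bool(false): empty value
var_dump(isWordList('foo,bar1'));     // bool(false): digit in a word
```

One thing to keep in mind: without the D modifier, $ also matches just before a final newline, so trim the input first or end the pattern with \z if that matters to you.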


However, "real" CSV allows a little more than this; for example, you can put quotes around every value. I would recommend that you parse the line and then test every single value:

$values = str_getcsv($line);
$valid = true;
foreach ($values as $value) {
  $valid = $valid && isValid($value);
}

str_getcsv()
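The snippet above leaves isValid() undefined, so here is one possible way to fill it in. The rule used (a value is a non-empty run of ASCII letters) and the wrapper name isValidLine are assumptions for the sake of the example; swap in whatever rule fits your data.

```php
<?php
// Hypothetical rule: a value is valid when it is a non-empty run of ASCII letters.
function isValid(string $value): bool
{
    return ctype_alpha($value);  // false for '' and for any non-letter
}

// Hypothetical wrapper: parse the line as CSV, then require every field to pass isValid().
function isValidLine(string $line): bool
{
    // Delimiter, enclosure and escape character are passed explicitly.
    $values = str_getcsv($line, ',', '"', '\\');
    foreach ($values as $value) {
        if (!isValid((string) $value)) {
            return false;  // stop at the first bad field
        }
    }
    return true;
}

var_dump(isValidLine('foo,bar,baz'));  // bool(true)
var_dump(isValidLine('"foo","bar"'));  // bool(true): the parser strips the quotes
var_dump(isValidLine('foo,,bar'));     // bool(false): empty field
var_dump(isValidLine('foo,b4r'));      // bool(false): digit in a field
```

Returning early instead of and-ing every result gives the same answer as the loop above, it just skips the remaining fields once one has failed.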


Your $match gives an error. This

$str = 'sdfbdf,wefwef,323r,dfvdfv';
$match = "/[\S\,]+\S/";
preg_match($match,$str,$m);

can work, but why don't you use explode?
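A caveat worth checking before relying on that pattern: it is unanchored, so preg_match only needs to find two consecutive non-space characters somewhere in the string. The sketch below shows that, followed by one way the explode route could look. The helper name isCommaSeparatedWords and the letters-only rule are assumptions, not something from the question.

```php
<?php
$match = '/[\S\,]+\S/';  // the pattern from above, with no ^ or $ anchors

var_dump(preg_match($match, 'sdfbdf,wefwef,323r,dfvdfv'));  // int(1)
var_dump(preg_match($match, 'no commas here!'));            // int(1): "no" is enough to match

// Hypothetical helper: split on commas and require every piece to be letters only.
function isCommaSeparatedWords(string $input): bool
{
    foreach (explode(',', $input) as $word) {
        // trim() tolerates spaces around commas; ctype_alpha() is false for ''.
        if (!ctype_alpha(trim($word))) {
            return false;
        }
    }
    return true;
}

var_dump(isCommaSeparatedWords('foo, bar,baz'));  // bool(true)
var_dump(isCommaSeparatedWords('foo bar baz'));   // bool(false): spaces instead of commas
var_dump(isCommaSeparatedWords('foo,,bar'));      // bool(false): empty value
```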


I think this is what you're looking for:

'#\G(?:[A-Za-z]+(?:\s*,\s*|$))+$#'

\G anchors the match either to the beginning of the string or the position where the previous match ended. That ensures that the regex doesn't skip over any invalid characters as it tries to match each word. For example, given this string:

'foo,&&bar'

It will report failure because it can't start the second match immediately after the comma.
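Here is a small sketch of that behaviour, comparing the pattern with a copy that has the \G removed. The variable names are only for illustration.

```php
<?php
$pattern    = '#\G(?:[A-Za-z]+(?:\s*,\s*|$))+$#';  // the pattern from above
$unanchored = '#(?:[A-Za-z]+(?:\s*,\s*|$))+$#';    // same pattern without \G

var_dump(preg_match($pattern, 'foo,bar'));       // int(1)
var_dump(preg_match($pattern, 'foo , bar'));     // int(1): whitespace around the comma is allowed
var_dump(preg_match($pattern, 'foo,&&bar'));     // int(0): the match has to start at offset 0
var_dump(preg_match($unanchored, 'foo,&&bar'));  // int(1): skips ahead and matches just "bar"
var_dump(preg_match($pattern, 'foo,'));          // int(1): a trailing comma is accepted
```

As the last line shows, the pattern as written lets a trailing comma through, so add a separate check if that should count as invalid.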

Also, notice the character class: [A-Za-z]. You used [A-z] in your regex and [a-Z] in a comment, both of which are invalid (for this purpose, anyway). They may have been mere typos, but watch out for them nonetheless. Those typos could end up causing subtle and/or serious bugs.
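To see why those ranges matter: in ASCII, the characters [ \ ] ^ _ and ` sit between Z and a, so [A-z] silently accepts them, while [a-Z] runs backwards and does not compile at all. A quick sketch:

```php
<?php
var_dump(preg_match('/^[A-z]+$/', 'foo_bar'));     // int(1): "_" falls inside A-z
var_dump(preg_match('/^[A-z]+$/', 'a^b'));         // int(1): so does "^"
var_dump(preg_match('/^[A-Za-z]+$/', 'foo_bar'));  // int(0): letters only

// [a-Z] is a range out of order; PCRE rejects it, so preg_match returns false.
// The @ only hides the compilation warning for this demonstration.
var_dump(@preg_match('/^[a-Z]+$/', 'foo'));        // bool(false)
```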

EDIT: \G isn't universally supported, so check before using it in any other regex flavor.


If you want to test regular expressions, I suggest using Kiki.

About your regex: it won't work because of unmatched parentheses. Some more things to take into account:

  • You currently disallow numbers, a space and letters A-z. I think [0-9a-zA-Z ] was what you had in mind, although that still filters out letters like ö.
  • You are currently forcing a space after each comma
  • The dollar sign inside the group in (,\s|$) is an end-of-string anchor, not a literal character, so that branch can only match at the very end of the input
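On the point about letters like ö: one option, if words in other alphabets should count, is the Unicode letter class \p{L} together with the u modifier. This is only a sketch under that assumption; the helper name isUnicodeWordList is made up, and the input is assumed to be UTF-8.

```php
<?php
// "schön", written with an escape so the source file encoding does not matter.
$word = "sch\u{00F6}n";

var_dump(preg_match('/^[0-9a-zA-Z ]+$/', $word));  // int(0): ö is outside the ASCII ranges
var_dump(preg_match('/^\p{L}+$/u', $word));        // int(1): \p{L} matches any Unicode letter

// Hypothetical helper: comma-separated list of words made of Unicode letters only.
function isUnicodeWordList(string $input): bool
{
    return preg_match('~^\p{L}+(?:,\p{L}+)*$~u', $input) === 1;
}

var_dump(isUnicodeWordList("sch\u{00F6}n,caf\u{00E9}"));  // bool(true)
var_dump(isUnicodeWordList('foo,b4r'));                   // bool(false): digit in a word
var_dump(isUnicodeWordList('foo, bar'));                  // bool(false): space after the comma
```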

Besides, what's the point of validating whether or not this is a comma-separated list?
