
How do I write a Perl regular expression that will match a string with only these characters?

Developer https://www.devze.com 2022-12-20 06:45 Source: Internet
I am pretty new to regular expressions. I want to write a regular expression which validates whether the given string has only certain characters. If the string has any characters other than these, it should not be matched.

The characters I want are:

 & ' : , / - ( ) . # " ; A-Z a-z 0-9


Try this:

$val =~ m/^[&':,\/\-().#";A-Za-z0-9]+$/;

$val will match if it has at least one character and consists entirely of characters in that character set. An empty string will not be matched (if you want an empty string to match, change the last + to a *).
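
As a minimal sketch of the + versus * distinction, the check can be wrapped in a small helper. The name is_allowed and the $allow_empty flag are illustrative assumptions, not part of the answer above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $val consists only of the allowed characters.
# With $allow_empty set, the quantifier is * instead of +, so '' also passes.
sub is_allowed {
    my ($val, $allow_empty) = @_;
    return $allow_empty
        ? scalar($val =~ m/^[&':,\/\-().#";A-Za-z0-9]*$/)
        : scalar($val =~ m/^[&':,\/\-().#";A-Za-z0-9]+$/);
}

print is_allowed('a(bc&')  ? "match\n" : "no match\n";   # allowed characters only
print is_allowed('abbb%c') ? "match\n" : "no match\n";   # % is not in the set
print is_allowed('')       ? "match\n" : "no match\n";   # empty: + needs one character
print is_allowed('', 1)    ? "match\n" : "no match\n";   # empty: * accepts it
```

Only the quantifier differs between the two branches, so the choice of whether an empty value is acceptable stays explicit at the call site.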

You can test it out yourself:

# Here's the file contents. $ARGV[0] is the first command-line parameter.
# We print out the matched text if we have a match, or nothing if we don't.
[/tmp]> cat regex.pl
$val = $ARGV[0];
print ($val =~ m/^[&':,\/\-().#";A-Za-z0-9]+$/g);
print "\n";

Some examples:

# Have to escape ( and & in the shell, since they have meaning.
[/tmp]> perl regex.pl a\(bc\&
a(bc&

[/tmp]> perl regex.pl abbb%c


[/tmp]> perl regex.pl abcx
abcx

[/tmp]> perl regex.pl 52
52

[/tmp]> perl regex.pl 5%2


/\A[A-Za-z0-9&':,\/().#";-]+\z/

Those so-called special characters are not special inside a character class, so they need no escaping there. A hyphen placed last in the class is also literal.
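
To illustrate the point, here is a small sketch (the variable name $allowed and the sample strings are made up for the example). It uses m{...}-style delimiters so that even the / needs no backslash. Note that \z, unlike $, does not tolerate a trailing newline.

```perl
use strict;
use warnings;

# Inside [...], characters such as ( ) . # & are literal.
# Brace delimiters avoid escaping /, and a trailing - is a literal hyphen.
my $allowed = qr{\A[A-Za-z0-9&':,/().#";-]+\z};

for my $s ('a.b', '(x)', 'a/b-c', 'a*b', "abc\n") {
    (my $shown = $s) =~ s/\n/\\n/g;    # make the newline visible in the output
    printf "%-8s %s\n", $shown, ($s =~ $allowed ? 'match' : 'no match');
}
```

The first three strings match; a*b fails because * is not in the set, and the last one fails because \z rejects the trailing newline.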


There are two main ways to construct a regular expression for this purpose. The first checks that every character is in the allowed set. The second checks that no character outside the allowed set is present. You can also use the transliteration operator instead. Here's a benchmark:

use Benchmark 'cmpthese';

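my @letters = ('A' .. 'Z', 'a' .. 'z');
my @digits  = ('0' .. '9');
# join is needed: map in scalar context returns the element count, not a string.
# Letters followed by digits keep ++ in Perl's magic string-increment mode.
my $randstr = join '', (map $letters[rand @letters], 1 .. 8), (map $digits[rand @digits], 1 .. 8);
sub nextstr() { return $randstr++ }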

cmpthese 1000000, {
    regex1 => sub { nextstr =~ /\A["#&'(),\-.\/0-9:;A-Za-z]*\z/ },
    regex2 => sub { nextstr !~ /[^"#&'(),\-.\/0-9:;A-Za-z]/ },
    tr     => sub { (my $dummy = nextstr) !~ y/"#&'(),\-.\/0-9:;A-Za-z/"#&'(),\-.\/0-9:;A-Za-z/c },
};

Results:

           Rate regex1 regex2     tr
regex1 137552/s     --   -41%   -60%
regex2 231481/s    68%     --   -32%
tr     341297/s   148%    47%     --
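
For readers who want to check that the three approaches agree before timing them, here is a minimal sketch. The sub names and sample strings are assumptions made for the illustration, and the tr variant uses the counting form with an empty replacement list instead of the identical-list form in the benchmark.

```perl
use strict;
use warnings;

# Approach 1: every character must be in the allowed set.
sub all_allowed     { return scalar($_[0] =~ /\A["#&'(),\-.\/0-9:;A-Za-z]*\z/) }

# Approach 2: no character may be outside the allowed set.
sub none_disallowed { return scalar($_[0] !~ /[^"#&'(),\-.\/0-9:;A-Za-z]/) }

# Approach 3: tr with /c and an empty replacement list only counts the
# characters outside the set; the string itself is left unchanged.
sub tr_count_zero {
    my ($s) = @_;
    return ($s =~ tr{"#&'(),\-./0-9:;A-Za-z}{}c) == 0;
}

my @checks = (\&all_allowed, \&none_disallowed, \&tr_count_zero);
for my $s ('abc123', 'a(b)&c', '5%2', '') {
    my @r = map { $_->($s) ? 1 : 0 } @checks;
    printf "[%s]: %s\n", $s, "@r";
}
```

All three accept the empty string here, because the first regex uses * and the other two only look for a disallowed character.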


/^[&':,\/\-().#";A-Za-z0-9]*$/
