Which characters can be used as delimiters for a Perl regular expression? m/re/
, m(re)
and måreå
all seem to work, but I'd like to know al开发者_如何学Cl possibilities.
From perlop
:
With the m you can use any pair of non-whitespace characters as delimiters.
So anything goes, except whitespace. The full paragraph for this is:
If "/" is the delimiter then the initial m is optional. With the m you can use any pair of non-whitespace characters as delimiters. This is particularly useful for matching path names that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is the delimiter, then the match-only-once rule of ?PATTERN? applies. If "'" is the delimiter, no interpolation is performed on the PATTERN. When using a character valid in an identifier, whitespace is required after the m.
As is often the case, I wonder "can I write a Perl program to answer that question?".
Here is a pretty good first approximation of trying all of the printable ASCII chars:
#!/usr/bin/perl
use warnings;
use strict;
$_ = 'foo bar'; # something to match against
foreach my $ascii (32 .. 126) {
my $delim = chr $ascii;
next if $delim eq '?'; # avoid fatal error
foreach my $m ('m', 'm ') { # with and without space after "m"
my $code = $m . $delim . '(\w+)' . $delim . ';';
# print "$code\n";
my $match;
{
no warnings 'syntax';
($match) = eval $code;
}
print "[$delim] didn't compile with $m$delim$delim\n" if $@;
if (defined $match and $match ne 'foo') {
print "[$delim] didn't match correctly ($match)\n";
}
}
}
Just about any non-whitespace character can be used, though identifier characters have to be separated from the initial m by whitespace. Though when you use a single quote as the delimiter, it disables interpolation and most backslash escaping.
There is currently a bug in the lexer that sometimes prevents UTF-8 characters from being used as a delimiter, even though you can sneak Latin1 by it if you aren't in full Unicode mode.
精彩评论