
Perl split pattern

开发者 (Developer) https://www.devze.com 2023-02-03 16:19 Source: the web

According to the perldoc, the syntax for split is:

split /PATTERN/,EXPR,LIMIT

But the PATTERN can also be a single- or double-quoted string: split "PATTERN", EXPR. What difference does it make?

Edit: A difference I'm aware of is splitting on backslashes: split /\\/ vs split '\\'. The second form doesn't work.


It looks like it uses that as "an expression to specify patterns":

The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o .)
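
As a rough sketch of what "an expression to specify patterns" can look like in practice, the snippet below passes the pattern once as a plain string held in a variable and once as a precompiled qr// object. The variable names ($input, $delims, $re) are made up for illustration and do not come from the perldoc.

```perl
use strict;
use warnings;

my $input = 'a:b:c,d,e';

# Hypothetical delimiter set chosen at runtime; the string is compiled as a regex.
my $delims = '[:,]';
my @from_string = split $delims, $input;

# The same pattern, precompiled once with qr//.
my $re = qr/[:,]/;
my @from_qr = split $re, $input;

print join(' ', @from_string), "\n";   # a b c d e
print join(' ', @from_qr), "\n";       # a b c d e
```

Both calls split on either a colon or a comma, which suggests the string is treated as a character class and not as the literal text "[:,]".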

edit: I tested it with this:

my $foo = 'a:b:c,d,e';
print join(' ', split("[:,]", $foo)), "\n";
print join(' ', split(/[:,]/, $foo)), "\n";
print join(' ', split(/\Q[:,]\E/, $foo)), "\n";

Except for the ' ' special case, it looks just like a regular expression.


PATTERN is always interpreted as... well, a pattern -- never as a literal value. It can be either a regex [1] or a string. Strings are compiled to regexes. For the most part the behavior is the same, but there can be subtle differences caused by the double interpretation.

The string '\\' only contains a single backslash. When interpreted as a pattern, it's as if you had written /\/, which is invalid:

C:\>perl -e "print join ':', split '\\', 'a\b\c'"
Trailing \ in regex m/\/ at -e line 1.

Oops!
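
For comparison, here is a small sketch of ways to split on a literal backslash that should work. The string form needs two backslashes inside the string so that the regex engine still sees an escaped backslash. The variable names ($path, $sep) are only illustrative.

```perl
use strict;
use warnings;

my $path = 'a\b\c';   # single-quoted: a, backslash, b, backslash, c

# Regex literal: /\\/ matches one literal backslash.
my @by_regex = split /\\/, $path;

# String pattern: '\\\\' holds two backslashes, which the regex
# engine then reads as one escaped backslash.
my @by_string = split '\\\\', $path;

# quotemeta escapes a literal delimiter before it is used as a pattern.
my $sep = '\\';       # a single backslash
my @by_quotemeta = split quotemeta($sep), $path;

print join(':', @by_regex), "\n";       # a:b:c
print join(':', @by_string), "\n";      # a:b:c
print join(':', @by_quotemeta), "\n";   # a:b:c
```

quotemeta is probably the least error-prone choice when the delimiter comes from data and is meant literally.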

Additionally, there are two special cases:

  • The empty pattern //, which matches the empty string and therefore splits between every character.
  • A single space ' ', which splits on whitespace after first trimming any leading or trailing whitespace.
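
A quick sketch of the two special cases, using made-up sample strings:

```perl
use strict;
use warnings;

# Empty pattern: matches between every character.
my @chars = split //, 'abc';

# Single-space string: leading whitespace is skipped and each run of
# whitespace (spaces, tabs, ...) acts as one separator.
my @words = split ' ', "  foo   bar\tbaz  ";

print join('|', @chars), "\n";   # a|b|c
print join('|', @words), "\n";   # foo|bar|baz
```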

[1] Regexes can be supplied either inline as /.../ or as a precompiled qr// object.


I believe there's no difference. A string pattern is also interpreted as a regular expression.


perl -e 'print join("-",split("[a-e]","regular"))';
r-gul-r

As you see, the delimiter is interpreted as a regular expression, not a string literal.

So, it's mostly the same - with one important exception: split(" ",... ) and split(/ /,... ) are different.

I prefer to use /PATTERN/ to avoid confusion; otherwise it's easy to forget that it's a regexp.


Two observable rules:

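  • the special case split(" ") is almost equivalent to split(/\s+/); the difference is that split(" ") also skips leading whitespace, so it never produces an empty leading field.
  • for everything else (it seems; don't nail me), split("something") is equal to split(/something/).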
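
The first rule can be checked on a string with leading whitespace, which is where split(" "), split(/\s+/) and split(/ /) start to disagree. This is only a sketch; the sample string and variable names are invented for the comparison, and empty fields are printed as <> so they stay visible.

```perl
use strict;
use warnings;

my $line = '  a  b';   # two leading spaces, two spaces between a and b

my @awk_mode  = split ' ',   $line;   # special case
my @ws_regex  = split /\s+/, $line;   # ordinary regex, runs of whitespace
my @one_space = split / /,   $line;   # ordinary regex, exactly one space

for my $fields (\@awk_mode, \@ws_regex, \@one_space) {
    print join(',', map { $_ eq '' ? '<>' : $_ } @$fields), "\n";
}
# a,b
# <>,a,b
# <>,<>,a,<>,b
```

So /\s+/ keeps an empty leading field that " " drops, and / / produces an empty field for every extra space.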