What does this line in DBI.pm do?

$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
    or '' =~ /()/; # ensure $1 etc are empty if match fails

I don't understand what $dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i is for, and I'm even more puzzled by '' =~ /()/, which seems useless to me.

The first part is extracting two parts of the dsn string in the form:

dbi: first match ( optional second match ) :

These matches are placed into $1 and $2 for use in later code. The second part runs only if the substitution fails to match. This works because or short-circuits: the second expression is not executed when the first one succeeds.

As the comment says quite succinctly, it ensures that $1, $2, etc. are empty. Presumably so later code can check them and produce an appropriate error if they were not set (i.e. could not be extracted from the dsn string).
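As a rough sketch of what the two captures look like in practice, the snippet below wraps the same substitution in a small helper. The sub name split_dsn and the sample DSN are assumptions made for illustration; they are not part of DBI.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI): applies the same regex and
# returns the driver, the optional parenthesised part, and what is left.
sub split_dsn {
    my ($dsn) = @_;
    $dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
        or '' =~ /()/;    # on failure, force a match that captures ''
    my ($driver, $attrs) = ($1, $2);    # copy the captures right away
    return ($driver, $attrs, $dsn);
}

my ($driver, $attrs, $rest) = split_dsn('dbi:mysql(RaiseError=>1):database=test');
print "driver=$driver attrs=$attrs rest=$rest\n";
# prints: driver=mysql attrs=RaiseError=>1 rest=database=test
```

For a string without the dbi: prefix, the helper returns an empty driver, an undefined second value, and the input unchanged.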

Equals-tilde, or =~, is the binding operator: it applies the match (m//) or substitution (s///) on its right to the string on its left.

Try the following code -- put it in a file, make executable with chmod +x, and run it:

#!/usr/bin/perl

$mystring = "Perl rocks.";

if ($mystring =~ /rocks/) {
  print("Matches");
} else {
  print("No match");
}

It will output Matches.

As for your example, it checks that the connection string starts with a dbi:driver: prefix, captures the driver name in $1, and strips that prefix from $dsn:

$dsn = "dbi:SQLPlatform:database_name:host_name:port";

print("$dsn\n");

$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
    or '' =~ /()/; # ensure $1 etc are empty if match fails

print("$dsn\n");

The first print shows the full string; the second outputs database_name:host_name:port.

It's clear from the comments in the code:

# extract dbi:driver prefix from $dsn into $1
$dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
    or '' =~ /()/; # ensure $1 etc are empty if match fails

If you have problems understanding how s// and m// work see perlop and perlre.

If a capturing match fails, $1 may still contain a value: the capture from the last successful match in the same dynamic scope, possibly from some unrelated earlier regexp. It appears the author didn't want a failed match at this point to leave a value in $1 from a previous regexp. To prevent this, he forced a "will always succeed" capturing match with nothing inside the capturing parens. That match always succeeds and captures the empty string, so $1 is now empty rather than holding the value from some previous successful match.
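A small sketch of that stale-capture behaviour follows. The sub name stale_capture_demo and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical demo sub: reports what $1 holds after a failed match,
# first without and then with the always-succeeding fallback match.
sub stale_capture_demo {
    'previous=42' =~ /(\d+)/;       # earlier, unrelated match sets $1 to '42'

    'no digits here' =~ /(\d+)/;    # fails, so $1 is left untouched
    my $without_reset = $1;         # still '42'

    'no digits here' =~ /(\d+)/
        or '' =~ /()/;              # fails, so the fallback match runs
    my $with_reset = $1;            # now the empty string

    return ($without_reset, $with_reset);
}

my ($stale, $reset) = stale_capture_demo();
print "without reset: '$stale'\n";    # prints: without reset: '42'
print "with reset: '$reset'\n";       # prints: with reset: ''
```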

A more common idiom is simply to test for match success before executing whatever code will rely on $1's value, as in:

if( /(match)/ ) {
    say $1;
}

While that's often the simplest approach, unfortunately code sometimes is not simple, and forcing that test into some complex code may make a tricky section even harder to deal with. That being the case, it may just be easier to ensure that $1 contains nothing after a failed match, rather than what it contained before the failed match.

I actually think that's a good question. Finding documentation of the behavior of $1 after a failed match isn't easy within the Perl POD. I believe a more thorough explanation is found in either the Camel book or the Llama book, but I don't have them at my fingertips right now to check.

What is left out of the answers so far is the reason for that mysterious or '' =~ /()/. Without that bit of trickiness, $1 will be undefined if the match fails and no earlier match in the same scope has set it. The code is probably using $1 in a concatenation or a string shortly after this match. Doing this with $1 undefined will result in a "Use of uninitialized value $1 in concatenation (.) or string" warning if use warnings is in effect. With that or '' =~ /()/ trickiness in play, $1 will be defined (but empty) should the regular expression fail to match. This keeps the code that uses $1 from spewing warnings.

The comment # ensure $1 etc are empty if match fails is incorrect. Get rid of that 'etc' and the comment is correct. This action sets $1, and $1 only. This code does not set $2. $2 will be undefined if the regular expression does not match.
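The difference between $1 and $2 after the fallback can be checked with a short sketch like the one below. The sub name captures_after_failed_dsn and the input string are assumptions for illustration only.

```perl
use strict;
use warnings;

# Hypothetical demo sub: reports the state of $1 and $2 after the DSN
# substitution fails and the fallback match '' =~ /()/ has run.
sub captures_after_failed_dsn {
    my ($dsn) = @_;

    $dsn =~ s/^dbi:(\w*?)(?:\((.*?)\))?://i
        or '' =~ /()/;    # the fallback pattern has only one capture group

    return {
        one_defined => defined($1) ? 1 : 0,
        one_value   => $1,
        two_defined => defined($2) ? 1 : 0,
    };
}

my $state = captures_after_failed_dsn('mysql:test');    # no dbi: prefix, so no match
print "\$1 defined: $state->{one_defined}\n";    # prints: $1 defined: 1
print "\$1 value: '$state->{one_value}'\n";      # prints: $1 value: ''
print "\$2 defined: $state->{two_defined}\n";    # prints: $2 defined: 0
```

So $1 ends up as a defined empty string, while $2 is undefined because the fallback pattern has no second group.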