Perl regexp matching for strings with special characters

Developer https://www.devze.com 2023-02-19 13:17 Source: Internet

I have a list of substrings that I need to match within a list of URL strings. The substrings contain special characters like '|', '*', '-', '+', etc. If a URL string contains one of the substrings, I need to perform some operation. For now, let's just say I will print "TRUE" to the console.

I did this by first reading the list of substrings into a hash. I then tried a simple regexp match of each entry in the list against each URL until a match is found. The code is something like this:

open my $ADS, '<', $ad_file or die "can't open $ad_file";

while(<$ADS>) {
        chomp;

        $ads_list_hash{$lines} = $_;
        $lines ++;
 }  

close $ADS;

open my $IN, '<', $inputfile or die "can't open $inputfile";      
my $first_line = <$IN>;

while(<$IN>) {      
       chomp;       

       my @hhfile = split /,/;       
       for my $count (0 .. $lines - 1) {

            if($hhfile[9] =~ /$ads_list_hash{$count}/) {
                print "$hhfile[9]\t$ads_list_hash{$count}\n";

                print "TRUE !\n";
                last;
            }
       }

 }

 close $IN;

The problem is that the substrings contain a lot of special characters, which cause errors in the match $hhfile[9] =~ /$ads_list_hash{$count}/. A few examples:

+adverts/
.to/ad.php|
/addyn|*|adtech;

I get an error on lines like these, which basically says "Quantifier follows nothing in regexp". Do I need to change something in the regexp matching syntax to avoid this?


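You need to escape the special characters in the string. Otherwise a substring such as +adverts/ is compiled as a pattern, and its leading + is read as a quantifier with nothing before it.

Enclosing the interpolated variable between \Q and \E will do the job, because everything in between is matched literally: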

if($hhfile[9] =~ /\Q$ads_list_hash{$count}\E/) {
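
To see the fix in isolation, here is a minimal sketch. It assumes the substrings are already in an array (the three samples are the ones from the question); the URLs and the helper name first_literal_match are made up for illustration and are not part of the original code.

```perl
use strict;
use warnings;

# Sample substrings from the question; the URLs below are hypothetical.
my @ads  = ('+adverts/', '.to/ad.php|', '/addyn|*|adtech;');
my @urls = (
    'http://example.com/+adverts/banner.gif',
    'http://example.com/index.html',
);

# Hypothetical helper: returns the first substring that occurs literally
# in $url, or undef (in scalar context) if none of them does.
sub first_literal_match {
    my ($url, @substrings) = @_;
    for my $ad (@substrings) {
        # \Q...\E backslash-escapes the regex metacharacters in $ad,
        # so '+', '|' and '*' are matched as plain characters.
        return $ad if $url =~ /\Q$ad\E/;
    }
    return;
}

for my $url (@urls) {
    my $hit = first_literal_match($url, @ads);
    print defined $hit ? "$url\t$hit\nTRUE !\n" : "$url\tno match\n";
}
```

Since these are plain substrings rather than patterns, index($url, $ad) >= 0 would also work and skips the regex engine entirely. quotemeta($ad) gives the same escaping as \Q...\E if you prefer to escape each string once while reading the file.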