Monster perl regex

Developer (https://www.devze.com), 2023-01-09 20:12, source: Internet
I'm trying to change strings like this:

<a href='../Example/case23.html'><img src='Blablabla.jpg'

To this:

<a href='../Example/case23.html'><img src='<?php imgname('case23'); ?>'

And I've got this monster of a regular expression:

find . -type f | xargs perl -pi -e \
  's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/'

But it isn't working. In fact, I think it's a problem with Bash, which could probably be pointed out rather quickly.

r: line 4: syntax error near unexpected token `('
r: line 4: `  's/<a href=\'(.\.\.\/Example\/)(case\d\d)(.\.html\'><img src=\')*\'/\1\2\3<\?php imgname\(\'\2\'); \?>\'/''

But if you want to help me with the regular expression that'd be cool, too!


Teaching you how to fish:

s/…/…/

Use a separator other than / for the s operator because / already occurs in the expression.

s{…}{…}

Cut down on backslash quoting: prefer [.] over \. because we'll shell-quote the program later. Keep backslashes only where they are necessary, which here means the \d digit character class.

s{<a href='[.][.]/Example/case(\d\d)[.]html'>…

Capture only the variable part. There is no need to reassemble the string from several captures when most of it is static.

s{<a href='[.][.]/Example/case(\d\d)[.]html'><img src='[^']*'}{<a href='../Example/case$1.html'><img src='<?php imgname('case$1'); ?>'}

Use $1 instead of \1 to refer to the captured group in the replacement. [^']* matches everything up to the next '.

To serve as the argument to Perl's -e option, this program needs to be shell-quoted. Use the following helper program; an alias or shell function would work just as well:

> cat `which shellquote`
#!/usr/bin/env perl
use String::ShellQuote qw(shell_quote); undef $/; print shell_quote <>

Run it, paste the program body, and terminate the input with Ctrl+D. You receive:

's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'
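If String::ShellQuote is not installed, the same kind of quoting can be approximated in plain shell. This is a rough sketch rather than a replacement for the module: the function name shell_quote is made up here, it handles a single argument, and trailing newlines in the argument are lost because of how command substitution works.

```shell
shell_quote() {
    # Turn every ' into '\'' and wrap the whole string in single quotes.
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Example: quote a string that itself contains single quotes.
quoted=$(shell_quote "src='[^']*'")
printf '%s\n' "$quoted"
```

The output can be pasted after perl -e in the same way as the output of the helper program.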

Put the quoted program together with the shell pipeline:

find . -type f | xargs perl -pi -e 's{<a href='\''[.][.]/Example/case(\d\d)[.]html'\''><img src='\''[^'\'']*'\''}{<a href='\''../Example/case$1.html'\''><img src='\''<?php imgname('\''case$1'\''); ?>'\''}'


Bash single-quotes do not permit any escapes.

Try this at a bash prompt and you'll see what I mean:

FOO='\'foo'

will make Bash show a continuation prompt and wait for a fourth single quote, because the backslash did not escape the second one. If you supply it, you'll find that FOO's value is

\foo

You'll need to use double-quotes around your expression. Although in truth, your HTML should be using double-quotes in the first place.
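The point about backslashes can be checked without the dangling quote. A minimal sketch, using the same variable name FOO purely for illustration:

```shell
# Inside single quotes the backslash is just another character.
FOO='\foo'
printf '%s\n' "$FOO"        # prints \foo
printf '%s\n' "${#FOO}"     # prints 4: backslash, f, o, o
```

Because the backslash has no special meaning there, \' inside a single-quoted string ends the string instead of inserting a quote, which is what trips up the original one-liner.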


Single quotes within single quotes in Bash:

set -xv
echo ''"'"''
echo $'\''
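To see that these spellings really produce the same string, here is a small sketch with made-up variable names a, b and c. The $'\'' form is left out because it relies on ANSI-C quoting, which Bash supports but some minimal sh implementations may not.

```shell
# 1. Close the quote, add a backslash-escaped quote, reopen: '\''
a='<img src='\''x.jpg'\''>'

# 2. Close the quote, add a double-quoted quote, reopen: '"'"'
b='<img src='"'"'x.jpg'"'"'>'

# 3. Double-quote the whole string; only safe when it contains no $, `, \ or "
c="<img src='x.jpg'>"

printf '%s\n' "$a" "$b" "$c"   # prints <img src='x.jpg'> three times
```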


I wouldn't use a one-liner. Put your Perl code in a script, which makes it much easier to get the regex right without wondering about escaping quotes and such.

I'd use a script like this:

#!/usr/bin/perl -pi

use strict;
use warnings;

s{
    ( <a \b [^>]* \b href=['"] [^'"]*/case(\d+)\.html ['"] [^>]* > \s*
      <img \b [^>]* \b src=['"] ) [^'"<] [^'"]*
}{$1<?php imgname('case$2'); ?>}gix;

Save it as fiximgs, make it executable, put it somewhere on your PATH, and then do something like:

find . -type f | xargs fiximgs

– Michael


If you install the mysql package, it comes with a command called replace.

With the replace command you can do this:

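while IFS= read -r line
do
  # Strip the fixed prefix and suffix so that only the case name (e.g. case23) is left
  X=$(echo "$line" | replace "<a href='../Example/" "" | replace ".html'><" " " | awk '{print $1}')
  echo "<a href='../Example/$X.html'><img src='<?php imgname('$X'); ?>'"
# Redirect the whole loop once; > NewFile inside the loop would keep only the last line
done < myfile > NewFile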

The same can be done with sed, e.g. sed s/'my string'/'replace string'/g. The replace command is just easier to work with when the strings contain special characters.
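The sed route could look like the following sketch. It assumes a POSIX sed with basic regular expressions, uses | as the delimiter so the slashes in the path need no escaping, and wraps the sed script in double quotes so the single quotes in the HTML can stay as they are. The function name fix_line and the sample line are illustrative only.

```shell
fix_line() {
    # \(case[0-9][0-9]\) captures the case name; \1 reuses it twice in the replacement.
    sed "s|<a href='\.\./Example/\(case[0-9][0-9]\)\.html'><img src='[^']*'|<a href='../Example/\1.html'><img src='<?php imgname('\1'); ?>'|"
}

sample="<a href='../Example/case23.html'><img src='Blablabla.jpg'"
result=$(printf '%s\n' "$sample" | fix_line)
printf '%s\n' "$result"
```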
