I have a Perl substitution which converts hyperlinks to lowercase:
's/(?<=<a href=")([^"]+)(?=")/\L$1/g'
I want the substitution to ignore any links that begin with a hash. For example, it should change the path in <a href="FooBar/Foo.bar">Foo Bar</a>
to lowercase but skip <a href="#Bar">Bar</a>
Nesting lookaheads to instruct it to skip these links isn't working correctly for me. This is the one-liner I've written:
perl -pi -e 's/(?<=<a href=" (?! (?<=<a href="#) ) )([^"]+)(?=")/\L$1/g' *;
Could anyone hint to me where I have gone wrong with this substitution? It executes just fine, but does not do anything.
As near as I can tell, your initial regex will work just fine, if you add the condition that the first character in the link may not be a hash #
or a double quote, e.g. [^#"]
s/(?<=<a href=")([^#"][^"]*)(?=")/\L$1/gi;
In case you have links that contain a hash but do not start with one, e.g. <a href="FooBar/Foo.bar#BarBar">Foo Bar</a>
, it becomes slightly more complicated:
s{(?<=<a href=")([^#"]+)(#[^"]+)*(?=")}{ lc($1) . ($2 // "") }gei;
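We now have to evaluate the replacement as code (the /e flag) and fall back to an empty string with // (defined-or, available since Perl 5.10), since otherwise we get undefined variable warnings when the optional anchor reference is not present.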
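To see how the evaluated substitution behaves on the three kinds of link discussed here, a minimal sketch follows. The helper name lc_href_paths is hypothetical and the sample links are the ones from the question; the fragment group is written as an optional `?` group, and the code assumes Perl 5.10 or later for the // operator.

```perl
use strict;
use warnings;
use 5.010;    # needed for // (defined-or) and say

# Hypothetical helper: lowercase the path part of each href and
# leave any trailing #fragment exactly as it was written.
sub lc_href_paths {
    my ($html) = @_;
    # $1 = path (no '#' or '"'), $2 = optional '#fragment'
    $html =~ s{(?<=<a href=")([^#"]+)(#[^"]+)?(?=")}{ lc($1) . ($2 // "") }gei;
    return $html;
}

for my $link (
    '<a href="FooBar/Foo.bar">Foo Bar</a>',
    '<a href="FooBar/Foo.bar#BarBar">Foo Bar</a>',
    '<a href="#Bar">Bar</a>',
) {
    say lc_href_paths($link);
}
```

The first two links should come out with the path lowercased (the second keeping #BarBar untouched), while the link that starts with a hash is left alone because ([^#"]+) cannot match at its first character.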
From what I can see, you don't need look-arounds:
use 5.010;
s/<a \s+ href \s* = \s* "\K([^#"][^"]*)"/\L$1"/gx;
\K means "keep" everything before it: whatever was matched before \K is left out of the text that gets replaced. It amounts to a variable-length look-behind.
For various reasons \K may be significantly more efficient than the equivalent (?<=...)
construct, and it is especially useful in situations where you want to efficiently remove something following something else in a string.
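As a rough illustration of the \K version, the sketch below wraps the substitution in a hypothetical helper (lc_href_keep is a name made up for this example) and runs it on a link with irregular spacing and on a hash-only link. It assumes Perl 5.10 or later, since that is where \K became available.

```perl
use strict;
use warnings;
use 5.010;    # \K and say need Perl 5.10 or later

# Hypothetical helper: lowercase hrefs that do not start with '#',
# using \K so the '<a href="' prefix is matched but not replaced.
sub lc_href_keep {
    my ($html) = @_;
    # /x lets the pattern contain whitespace for readability;
    # \s+ and \s* tolerate variable spacing, which a fixed-length
    # look-behind could not do.
    $html =~ s/<a \s+ href \s* = \s* "\K([^#"][^"]*)"/\L$1"/gx;
    return $html;
}

say lc_href_keep('<a  href = "FooBar/Foo.bar">Foo Bar</a>');
say lc_href_keep('<a href="#Bar">Bar</a>');
```

Because everything before \K is kept as matched, the extra spaces around href and = survive unchanged and only the quoted path is rewritten.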