"eager" regexp matching

Developer (https://www.devze.com), 2023-02-19 23:18. Source: Internet
I have to remove the string between two delimiters, i.e., from "123XabcX321" I want "123321". For a simple case, I'm fine with:

$_=<>;
s/X(.*)X//;
print;

But if the input is ambiguous, like "123XabcXasdfjXasdX321", the pattern matches from the first X to the last X and I get "123321", whereas I want "123asdfj321". Is there a way to specify an "eager" match that stops at the first valid closing delimiter and not the last?


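It's normally called "ungreedy" (or "non-greedy"): you put a ? after the quantifier, as in s/X(.*?)X//; To remove every delimited span in the line, as in your "123asdfj321" example, also add the /g flag: s/X(.*?)X//g;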


Avoid the non-greedy modifier as anything but a performance hint if you can. Using it can lead to "unexpected" results, because adding ? only makes .* prefer shorter matches; it doesn't stop .* from matching X when that is the only way for the rest of the pattern to match. For example,

$ perl -le'print for "XaXbXY" =~ /X(.*?)XY/;'
aXb

To avoid matching X, you can use the following:

s/X[^X]*X//g;

If X is really something larger than one character, you can use the following:

s/X(?:(?!X).)*X//g;