How to reference captures in bash regex replacement

Developer https://www.devze.com 2023-02-24 09:47 Source: Internet

How can I include the regex match in the replacement expression in BASH?

Non-working example:

#!/bin/bash
name=joshua
echo ${name//[oa]/X\1}

I expect this to output jXoshuXa, with \1 replaced by the matched character.

It doesn't actually work, though: it outputs jX1shuX1 instead.


This is perhaps less intuitive than sed and arguably quite obscure, but in the spirit of completeness: BASH will probably never support capture variables in the replacement (at least not in the usual fashion, since parentheses are already used for extended pattern matching). It is still possible, however, to capture a pattern when testing with the binary operator =~, which fills an array of matches called BASH_REMATCH.

This makes the following example possible:

#!/bin/bash
name='joshua'
[[ $name =~ ([ao].*)([oa]) ]] && \
    echo ${name/$BASH_REMATCH/X${BASH_REMATCH[1]}X${BASH_REMATCH[2]}}

The conditional match of the regular expression ([ao].*)([oa]) captures the following values to $BASH_REMATCH:

$ echo ${BASH_REMATCH[*]}
oshua oshu a

If a match is found, the ${parameter/pattern/string} expansion searches for the pattern oshua in the parameter, whose value is joshua, and replaces it with the concatenation of Xoshu and Xa. However, this only works for the example string because we know in advance what it contains.

For something closer to a global (match-all) regex replacement, the following example greedily matches any o or a that has not been changed yet and inserts the X, working from the back of the string to the front.

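#!/bin/bash
name='joshua'
# The greedy .* finds the last o or a that is not already preceded by X.
while [[ $name =~ .*[^X]([oa]) ]]; do
    # Keep the match minus its last character, then add X and the capture.
    # The negative length in ${BASH_REMATCH:0:-1} needs bash 4.2 or later.
    name=${name/$BASH_REMATCH/${BASH_REMATCH:0:-1}X${BASH_REMATCH[1]}}
done
echo "$name"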

The first iteration changes $name to joshuXa and the second to jXoshuXa, after which the condition fails and the loop terminates. This works similarly to the lookbehind expression /(?<!X)([oa])/X\1/, which only touches the o or a characters that are not already prefixed with an X. One difference: [^X] has to consume a character, so an o or a at the very start of the string is left unchanged.

The output for both examples:

jXoshuXa
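
The loop requires some character before each o or a (the [^X] part), so a leading o or a would be skipped. Below is a minimal sketch that also covers that case by allowing the start of the string as an alternative. The function name prefix_matches is made up for illustration, and the regex is kept in a variable so that [[ ]] does not have to parse it inline.

```shell
#!/bin/bash
# prefix_matches is a hypothetical helper, not part of the original answer.
# It inserts X before every o or a that is not already preceded by X.
prefix_matches() {
    local s=$1
    # (^|.*[^X]) accepts either the start of the string or any prefix
    # that ends in a character other than X.
    local re='(^|.*[^X])([oa])'
    while [[ $s =~ $re ]]; do
        # The match always begins at position 0, so anchor the pattern
        # with # and rebuild the matched prefix with an X before the capture.
        s=${s/#"$BASH_REMATCH"/"${BASH_REMATCH[1]}X${BASH_REMATCH[2]}"}
    done
    printf '%s\n' "$s"
}

prefix_matches joshua   # prints jXoshuXa
prefix_matches apple    # prints Xapple
```

Capturing the prefix in its own group also avoids the negative-length substring, so this variant should not depend on bash 4.2.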

nJoy!


bash> name=joshua  
bash> echo $name | sed 's/\([oa]\)/X\1/g'  
jXoshuXa
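
As a side note, sed also accepts & in the replacement to stand for the whole match, so the capture group is not strictly needed here. A minimal variant, assuming a bash with here-strings (it still starts one sed process per call):

```shell
#!/bin/bash
name=joshua
# & in the sed replacement is the text matched by the pattern.
result=$(sed 's/[oa]/X&/g' <<< "$name")
echo "$result"   # prints jXoshuXa
```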


The question "bash string substitution: reference matched subexpressions" was marked as a duplicate of this one, in spite of its requirement that

The code runs in a long loop, it should be a one-liner that does not launch sub-processes.

So the answer is:

If you really cannot afford to launch sed in a subprocess, do not use bash! Use perl instead: its read-update-output loop will be several times faster, and the difference in syntax is small. (Well, you must not forget the semicolons.)

I switched to perl, and there was only one gotcha: Unicode support was not available on one of the computers, so I had to reinstall packages.
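
For the no-subprocess requirement specifically, it may be worth knowing that bash 5.2 added the patsub_replacement shell option: when it is on, an unquoted & in the replacement of ${parameter//pattern/string} stands for the matched text. It only gives the whole match, not numbered groups. The sketch below assumes nothing about the installed version; on an older bash it falls back to a plain character loop, which is my own stand-in and not taken from any of the answers.

```shell
#!/bin/bash
name=joshua
if shopt -s patsub_replacement 2>/dev/null; then
    # bash 5.2 or later: & expands to the portion matched by [oa].
    result=${name//[oa]/X&}
else
    # Older bash: & would be literal, so walk the string instead.
    result=
    for (( i = 0; i < ${#name}; i++ )); do
        c=${name:i:1}
        [[ $c == [oa] ]] && result+=X
        result+=$c
    done
fi
echo "$result"   # prints jXoshuXa
```

Neither branch launches a subprocess, so either one should be usable inside a long loop.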
