Number of matches in regex substitution

Developer https://www.devze.com 2023-01-19 19:56 Source: Internet
I am looking for a Pythonic way to simplify this code:

fix = re.compile(r'((?<=>\n)(\t){2}(?=<))')
fixed_output = re.sub(fix, 1*2*' ', fixed_output)
fix = re.compile(r'((?<=>\n)(\t){3}(?=<))')
fixed_output = re.sub(fix, 2*2*' ', fixed_output)
# and so on...

That is: if there are n tab characters between ">\n" and "<", they are replaced by (n-1)*2 space characters. Can this be generalized to a single regular expression? In other words, is it possible to write a substitution that uses the number of matched characters to determine the replacement string?


You can pass a function to re.sub instead of a fixed replacement string, and use the number of matched tab characters to build the replacement, for example:

re.sub(r'((?<=>\n)\t{2,}(?=<))', lambda m: (len(m.group(0))-1)*2*" ", string)

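Here the lambda expression lambda m: (len(m.group(0))-1)*2*" " receives the match object m and replaces a run of n tab characters with (n-1)*2 spaces. The quantifier {2,} matches two or more tabs, so a single tab is left unchanged.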
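As a quick check of the approach, here is a minimal self-contained sketch. The helper name tabs_to_spaces, the constant TAB_RUN, and the sample input are made up for illustration and are not part of the original question; the outer capturing group is dropped because only the whole match is used.

```python
import re

# Runs of two or more tabs, matched only between ">\n" and "<".
TAB_RUN = re.compile(r'(?<=>\n)\t{2,}(?=<)')

def tabs_to_spaces(text):
    """Replace each run of n >= 2 tabs with (n - 1) * 2 spaces."""
    # m.group(0) is the whole matched run of tabs, so its length is n.
    return TAB_RUN.sub(lambda m: ' ' * ((len(m.group(0)) - 1) * 2), text)

# Hypothetical sample: two tabs before <b>, three tabs before <c>.
sample = "<a>\n\t\t<b>\n\t\t\t<c>\n"
result = tabs_to_spaces(sample)
print(repr(result))
```

With this sample, the two tabs become two spaces and the three tabs become four spaces. Precompiling the pattern is optional; passing the pattern string directly to re.sub, as in the answer above, behaves the same way.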
