Regex match for a non-English language in Python

Developer https://www.devze.com 2023-02-06 10:18 Source: Internet
I'm trying to capture and match Russian-language characters in a Python script. Since Russian characters don't fall in the [a-zA-Z] range, what regex should I use to match them? I can't use (.*) because it would match everything.

import re
linkpat = re.compile(r'name=[a-zA-Z]+;size=[0-9]+')


Use the Unicode flag:

re.compile(r'name=\w+;size=\d+', re.U)

This also matches any letter or digit in any language (plus underscore), not just Russian, though. In Python 3, str patterns are Unicode-aware by default, so re.U is redundant there.
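If only Russian letters should match, one option is an explicit Cyrillic character class instead of \w. The following is a minimal sketch, not a definitive answer: it assumes Python 3, assumes that "Russian" means the letters а-я and А-Я plus ё/Ё, and uses sample strings made up for illustration.

```python
import re

# Hypothetical sample strings, made up for illustration.
samples = ["name=Привет;size=42", "name=hello;size=7", "name=日本語;size=3"]

# On a str pattern in Python 3, \w and \d are Unicode-aware by default,
# so this accepts word characters from any script.
any_word = re.compile(r"name=\w+;size=\d+")

# Assumption: the Russian alphabet is covered by а-я and А-Я plus ё/Ё,
# which sit outside those two contiguous ranges.
russian_only = re.compile(r"name=[а-яА-ЯёЁ]+;size=[0-9]+")

any_hits = [s for s in samples if any_word.fullmatch(s)]
russian_hits = [s for s in samples if russian_only.fullmatch(s)]
```

Here any_hits keeps all three samples, while russian_hits keeps only the Cyrillic one. The explicit class would need widening if other Cyrillic-script letters (for example Ukrainian і or ї) should also be accepted.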


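You can try \w with the re.LOCALE flag, once the correct locale has been set. Note that in Python 3, re.LOCALE can only be used with bytes patterns.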


Use character classes such as \w, which are locale-dependent when the re.LOCALE flag is set.
