
Problems matching caret in Python regex

Developer https://www.devze.com 2023-01-12 02:25 Source: web

I have the following regular expression, which I think should match any character that is not alphanumeric, '!', '?', or '.'

a = re.compile('[^A-z ?!.]')

However, I get the following weird result in IPython:

In [21]: re.sub(a, ' ', 'Hey !$%^&*.#$%^&.')
Out[21]: 'Hey !  ^  .   ^ .'

The result is the same when I escape the '.' in the regular expression.

How do I match the caret so that it is removed from the string as well?


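You have an error in your regular expression. Note that the case of the A and the z matters. The range A-z includes every character from ASCII code 65 (A) through 122 (z), and that span contains the caret (ASCII code 94) along with the other punctuation that sits between Z (90) and a (97). Because the caret is inside the excluded set, it is left in place.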

Try this instead:

re.compile('[^A-Za-z ?!.]')

Example:

import re

# Spell out both ranges so the punctuation between 'Z' and 'a' is not included
regex = re.compile('[^A-Za-z ?!.]')
result = regex.sub(' ', 'Hey !$%^&*.#$%^&.')
print(result)

Result:

Hey !     .     .
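
To see why A-z lets the caret through, here is a small sketch that lists the characters lying between 'Z' and 'a' in ASCII and checks them against both patterns. The variable names (between, broad, strict, leaked) are illustrative only and do not come from the original post.

```python
import re

# Characters whose code points lie strictly between 'Z' (90) and 'a' (97).
between = [chr(c) for c in range(ord('Z') + 1, ord('a'))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

broad = re.compile('[^A-z ?!.]')       # the pattern from the question
strict = re.compile('[^A-Za-z ?!.]')   # the corrected pattern

# A character "leaks" if the negated class fails to match it,
# meaning re.sub would leave it untouched.
leaked = [ch for ch in between if not broad.search(ch)]
print(leaked == between)                          # True: A-z keeps all six
print(all(strict.search(ch) for ch in between))   # True: A-Za-z removes them
```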


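The caret falls between the uppercase and lowercase letters in ASCII, so the range A-z covers it. You need [^a-zA-Z ?!\.] (the backslash before the dot is optional, since a dot inside a character class is already literal).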
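
As a quick check, the sketch below compares the pattern with and without the escaped dot on the string from the question. The names text, plain and escaped are made up for this illustration.

```python
import re

text = 'Hey !$%^&*.#$%^&.'

plain = re.compile(r'[^a-zA-Z ?!.]')     # dot left unescaped inside the class
escaped = re.compile(r'[^a-zA-Z ?!\.]')  # dot escaped, as in the answer above

result_plain = plain.sub(' ', text)
result_escaped = escaped.sub(' ', text)

print(result_plain == result_escaped)  # True: both forms behave the same
print('^' in result_plain)             # False: the caret is replaced
```

If digits should also be kept, as the word "alphanumeric" in the question suggests, adding 0-9 to the class is one likely adjustment.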
