
Regular Expression Get US Zip Code

开发者 https://www.devze.com 2023-04-05 04:10 Source: Internet
How do I "extract" the zip code (US) from the following string?

import re
address = "Moab, UT 84532"
postal_code = re.match('^\d{5}(-\d{4})?$', address)
print(postal_code)


Firstly, you are using match, which will match only from the beginning of the string: see http://docs.python.org/library/re.html#matching-vs-searching

Also, even if you were using search, you are not grabbing the group that includes the 5 digits that are guaranteed to be there.

Lastly, even if you were using search, starting your regex with a caret ^ anchors it to the beginning of the string, which won't work in your case because the zip code is at the end.

>>> postal_code = re.search(r'.*(\d{5}(\-\d{4})?)$', address)
>>> postal_code.groups()
('84532', None)
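
To make the match vs. search difference concrete, here is a small sketch. It assumes Python 3 print syntax, and the variable names anchored and found are just for illustration:

```python
import re

address = "Moab, UT 84532"
pattern = r'\d{5}(-\d{4})?$'

# match only tries at position 0, where the string starts with "Moab"
anchored = re.match(pattern, address)

# search scans forward until the pattern fits somewhere in the string
found = re.search(pattern, address)

print(anchored)        # None
print(found.group(0))  # 84532
```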


Assuming the zip code is always 5 digits (that is the case in the US, is it not?)

re.match('\d{5}$', address)

will do.

The comment is right about match vs. search: the line above needs search, because match only tries at the start of the string. And if I want to include the optional extra 4 digits:

re.search('\d{5}(-\d{4})?$', address)

should do it.
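
If you want this wrapped up so a missing zip code does not blow up with an AttributeError, something along these lines might work. This is only a sketch: extract_zip and ZIP_AT_END are names I made up, and the \b and trailing \s* are my own additions to reject longer digit runs and tolerate trailing whitespace.

```python
import re

# Hypothetical pattern: 5 digits, optional -4 digits, at the end of the string.
# \b stops the match from starting in the middle of a longer number.
ZIP_AT_END = re.compile(r'\b(\d{5})(?:-(\d{4}))?\s*$')

def extract_zip(address):
    """Return (zip5, plus4_or_None), or None if no zip code ends the string."""
    m = ZIP_AT_END.search(address)
    if m is None:
        return None
    return m.group(1), m.group(2)

print(extract_zip("Moab, UT 84532"))       # ('84532', None)
print(extract_zip("Moab, UT 84532-1234"))  # ('84532', '1234')
print(extract_zip("Moab, UT"))             # None
```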


You can use:

postal_code = re.match(r'^.*?(\d+)$', address)
if postal_code is not None:
    print(postal_code.group(1))
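
One thing to be aware of with \d+ here: it accepts a trailing run of digits of any length, not just five. A quick sketch showing what comes back for a few inputs (last_digits is a made-up helper name, and the sample strings are mine):

```python
import re

def last_digits(text):
    # Same pattern as above: lazy prefix, then a digit run at the very end
    m = re.match(r'^.*?(\d+)$', text)
    return m.group(1) if m else None

print(last_digits("Moab, UT 84532"))       # 84532
print(last_digits("Apartment 12"))         # 12
print(last_digits("Moab, UT 84532-1234"))  # 1234
```

So this is fine if every address is known to end in a plain 5-digit zip code, but it will not reject other trailing numbers.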


This one handles all of these formats:

99999-9999

99999 9999

99999

address = '123 Main St, 12345-5678 USA'
re.search('(\d{5})([- ])?(\d{4})?', address).groups()

The result is: ('12345', '-', '5678')

To get the whole match, use:

re.search('(\d{5})([- ])?(\d{4})?', address).group(0)

group(1) and group(3) contain the two portions of the zip code. In my own code I use match instead of search, since I am applying this to a field that contains only a zip code. I also added ^ and $ at the beginning and end respectively for this case.

zip_code = '12345-6655'
re.match('^(\d{5})([- ])?(\d{4})?$', zip_code).group(0)
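
For a field that should hold nothing but the zip code, re.fullmatch (available since Python 3.4) could stand in for the explicit ^ and $. A rough sketch follows. normalize_zip and ZIP_ONLY are names I made up, and returning the ZIP+4 form with a hyphen is my own choice. Unlike the pattern above, this version rejects a dangling separator such as '12345-'.

```python
import re

# 5 digits, optionally followed by a hyphen or space (or nothing) and 4 digits
ZIP_ONLY = re.compile(r'(\d{5})(?:[- ]?(\d{4}))?')

def normalize_zip(value):
    """Return '12345' or '12345-6789', or None if value is not a zip code."""
    m = ZIP_ONLY.fullmatch(value.strip())
    if m is None:
        return None
    base, plus4 = m.groups()
    return base if plus4 is None else base + '-' + plus4

print(normalize_zip('12345-6655'))  # 12345-6655
print(normalize_zip('12345 6655'))  # 12345-6655
print(normalize_zip('12345'))       # 12345
print(normalize_zip('1234'))        # None
```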