How do I "extract" a zip code (US) from the following string?
import re
address = "Moab, UT 84532"
postal_code = re.match('^\d{5}(-\d{4})?$', address)
print postal_code
Firstly, you are using match, which will match only from the beginning of the string: see http://docs.python.org/library/re.html#matching-vs-searching
Also, even if you were using search, you are not grabbing the group that includes the 5 digits that are guaranteed to be there.
Lastly, even if you were using search, starting your regex with a caret ^ anchors it to the beginning of the string, which won't work in your case.
>>> postal_code = re.search(r'.*(\d{5}(\-\d{4})?)$', address)
>>> postal_code.groups()
('84532', None)
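To make the match vs. search difference concrete, here is a minimal sketch that runs the same pattern both ways on the string from the question. It is written for Python 3 (print as a function), and the variable names are only illustrative.

```python
import re

address = "Moab, UT 84532"
pattern = r"\d{5}(-\d{4})?$"  # no leading caret, so search is free to scan

# re.match only tries at position 0, and the string starts with "Moab".
match_result = re.match(pattern, address)

# re.search scans forward until the pattern fits somewhere in the string.
search_result = re.search(pattern, address)

print(match_result)            # None
print(search_result.group(0))  # 84532
```

group(0) is the whole matched text. group(1) is the optional +4 part, which is None here because the address has no extension.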
Assuming the zip code is always 5 digits (that is the case in the US, is it not?),
re.match('\d{5}$', address)
will do.
The comment is right about match vs. search. And if I want to include the extra 4 digits:
re.search('\d{5}(-\d{4})?$', address)
should do it.
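A small sketch of how that search-based pattern could be wrapped so it returns the zip code as a string. The helper name extract_zip is hypothetical, and the code assumes Python 3 and that the zip code sits at the very end of the address.

```python
import re

# Same pattern as above, compiled once. The name is illustrative.
ZIP_AT_END = re.compile(r"\d{5}(-\d{4})?$")

def extract_zip(address):
    """Return the trailing ZIP or ZIP+4 as a string, or None if absent."""
    found = ZIP_AT_END.search(address)
    return found.group(0) if found else None

print(extract_zip("Moab, UT 84532"))       # 84532
print(extract_zip("Moab, UT 84532-1234"))  # 84532-1234
print(extract_zip("Moab, UT"))             # None
```

Checking for None before calling group() avoids an AttributeError on addresses that have no zip code at the end.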
You can use:
postal_code = re.match('^.*?(\d+)$', address)
if postal_code is not None:
    print postal_code.group(1)
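One thing worth checking with this pattern: \d+ captures whatever run of digits ends the string, not specifically five. A quick sketch (Python 3, sample addresses made up for illustration) shows what it returns in a few cases.

```python
import re

# Pattern from the answer above: lazily skip text, then capture trailing digits.
TRAILING_DIGITS = re.compile(r"^.*?(\d+)$")

results = {}
for address in ("Moab, UT 84532", "Moab, UT 84532-1234", "Unit 7"):
    found = TRAILING_DIGITS.match(address)
    results[address] = found.group(1) if found else None
    print(address, "->", results[address])
```

For a ZIP+4 address it captures only the last four digits, and for a string ending in any number it captures that number, so it fits best when the input is known to end in a plain 5-digit zip code.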
This one works perfectly for all these formats:
99999-9999
99999 9999
99999
address = '123 Main St, 12345-5678 USA'
re.search('(\d{5})([- ])?(\d{4})?', address).groups()
The result is: ('12345', '-', '5678')
To get the whole match, use:
re.search('(\d{5})([- ])?(\d{4})?', address).group(0)
group(1) and group(3) contain the two portions of the zip code. When the field contains only a zip code, I use match instead and add ^ and $ at the beginning and end, respectively:
zip_code = '12345-6655'
re.match('^(\d{5})([- ])?(\d{4})?$', zip_code).group(0)
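Because the separator and the 4-digit part are each optional on their own, the anchored pattern above also accepts inputs such as "12345-" or "123456789". If that matters, a stricter variant could tie the two together. This is a sketch under that assumption, not the pattern from the answer. The name split_zip is hypothetical, and re.fullmatch needs Python 3.4 or later.

```python
import re

# Stricter variant (an assumption, not the answer's pattern): the separator
# and the 4-digit extension must appear together or not at all.
ZIP_FIELD = re.compile(r"(\d{5})(?:[- ](\d{4}))?")

def split_zip(field):
    """Return (zip5, plus4) for a field holding only a zip code, else None."""
    found = ZIP_FIELD.fullmatch(field)  # fullmatch replaces explicit ^ and $
    return found.groups() if found else None

print(split_zip("12345-6655"))  # ('12345', '6655')
print(split_zip("12345 6655"))  # ('12345', '6655')
print(split_zip("12345"))       # ('12345', None)
print(split_zip("12345-"))      # None
```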