I have another regex question. I want to capture info1, info2 and info3:
>>> a
'|123|blabla bloblo|90'
>>> b
'|123|blabla[[blibli|bloblo]]|90'
>>> re.search('\|(?P<info1>\d+)\|(?P<info2>[^\|]*)\|(?P<info3>\d+)',a).groupdict()
{'info1': '123', 'info3': '90', 'info2': 'blabla bloblo'}
>>> re.search('\|(?P<info1>\d+)\|(?P<info2>[^\|]*)\|(?P<info3>\d+)',b).groupdict()
AttributeError: 'NoneType' object has no attribute 'groupdict'
I want to use | as a separator, except when it is surrounded by [[ ]] or {{ }}. For b, I want:
{'info1': '123', 'info3': '90', 'info2': 'blabla[[blibli|bloblo]]'}
Thanks,
Just to give you an alternative: assuming that your data doesn't contain the quote character "
(or that you can replace it with some other character), here is a way to use the csv module:
import csv
import io

data = '|123|blabla|[[blibli|bloblo]]|90'
# Assuming data has no quotes, convert [[ and ]] to quotes and let csv parse it
data = data.replace('[[', '"').replace(']]', '"')
print(data)
for row in csv.reader(io.StringIO(data), delimiter='|', quotechar='"'):
    print(row)
output:
|123|blabla|"blibli|bloblo"|90
['', '123', 'blabla', 'blibli|bloblo', '90']
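Since the string b in the question has [[ in the middle of a field rather than at its start, a regex-only variant may be a closer fit. The following is a minimal sketch, not a definitive solution: the function name parse_line is made up for illustration, and the pattern assumes that [[ ]] and {{ }} blocks are never nested. The idea is to let info2 consume either a whole bracketed block or any single character that is not |.

```python
import re

# info2 is a repetition of: a whole [[...]] block, a whole {{...}} block,
# or any single character other than the separator |.
# Assumption: bracketed blocks are not nested.
PATTERN = re.compile(
    r'\|(?P<info1>\d+)'
    r'\|(?P<info2>(?:\[\[.*?\]\]|\{\{.*?\}\}|[^|])*)'
    r'\|(?P<info3>\d+)'
)

def parse_line(line):
    """Return the three named groups as a dict, or None if nothing matches."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None

a = '|123|blabla bloblo|90'
b = '|123|blabla[[blibli|bloblo]]|90'
print(parse_line(a))
print(parse_line(b))
```

Checking for None before calling groupdict() also avoids the AttributeError shown in the question when a line does not match.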