This is my code:
import re
string = '''
{% emoji 'MONEY_BAG' %}<span style="color:#7F6C41;"><a href="{% mobile_url '/inventory/view_item/?category=weapon&inventory_id=%s' inventory_id %}">{{ item.name }}</a>を入手した!</span></span>
'''
a = r'''
{%\s+mobile_url\s+['"]{1}(/inventory/view_item/\?)[^'"]*['"]{1}\s+([^%}]+)\s+%}
'''
def aa(x):
    # Callback for re.sub: print both captured groups, then drop the match.
    print(x.group(1))
    print(x.group(2))
    return ''
string = re.sub(a, aa, string)
print(string)
and it shows:
{% emoji 'MONEY_BAG' %}<span style="color:#7F6C41;"><a href="{% mobile_url '/inventory/view_item/?category=weapon&inventory_id=%s' inventory_id %}">{{ item.name }}</a>を入手した!</span></span>
I want to print x.group(1) and x.group(2). What can I do?
Thanks.
It's a bad idea to use regex to extract information from HTML. It's much easier with an HTML parser: http://docs.python.org/library/htmlparser.html
Or, if you want to crawl a webpage for more information, you might want to use Scrapy, which is a truly great web crawler framework.
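A minimal sketch of the parser approach, assuming Python 3, where the module from the link above is called html.parser. HrefCollector is a hypothetical helper name, not something from the question. Note that the input looks like a Django-style template rather than plain HTML, so the parser only isolates the href attribute; the {% mobile_url ... %} tag inside it still has to be taken apart separately.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag being opened.
        if tag == 'a':
            for name, value in attrs:
                if name == 'href' and value is not None:
                    self.hrefs.append(value)


markup = '''
{% emoji 'MONEY_BAG' %}<span style="color:#7F6C41;"><a href="{% mobile_url '/inventory/view_item/?category=weapon&inventory_id=%s' inventory_id %}">{{ item.name }}</a>を入手した!</span></span>
'''

collector = HrefCollector()
collector.feed(markup)
collector.close()

for href in collector.hrefs:
    print(href)
```

This keeps the tag-matching logic out of the regex, at the cost of a second step to read the template tag itself.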
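The extra newline characters at the start and end of a are causing the regex to never match. A triple-quoted string keeps the line break right after the opening ''' and the one right before the closing ''', so the pattern demands a newline immediately before {% mobile_url and immediately after %}, and the input has neither. Because nothing matches, re.sub never calls aa and returns the string unchanged. Put the pattern on a single line: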
a = r'''{%\s+mobile_url\s+['"]{1}(/inventory/view_item/\?)[^'"]*['"]{1}\s+([^%}]+)\s+%}'''
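As a quick check of the single-line pattern, here is a small sketch written for Python 3 (print as a function). The names template, broken, pattern, captured, and show_groups are mine, chosen for illustration; the pattern and input are the ones from the question.

```python
import re

template = '''
{% emoji 'MONEY_BAG' %}<span style="color:#7F6C41;"><a href="{% mobile_url '/inventory/view_item/?category=weapon&inventory_id=%s' inventory_id %}">{{ item.name }}</a>を入手した!</span></span>
'''

# Pattern as written in the question: the line breaks inside the
# triple quotes become literal "\n" characters in the regex.
broken = r'''
{%\s+mobile_url\s+['"]{1}(/inventory/view_item/\?)[^'"]*['"]{1}\s+([^%}]+)\s+%}
'''

# Same pattern on a single line, with no stray newlines at either end.
pattern = r'''{%\s+mobile_url\s+['"]{1}(/inventory/view_item/\?)[^'"]*['"]{1}\s+([^%}]+)\s+%}'''

captured = []


def show_groups(match):
    # Record and print both groups, then replace the match with nothing.
    captured.append((match.group(1), match.group(2)))
    print(match.group(1))
    print(match.group(2))
    return ''


print(re.search(broken, template))   # None: the broken pattern never matches
result = re.sub(pattern, show_groups, template)
print(result)
```

With the corrected pattern the callback runs once, and the mobile_url tag is removed from the href in the returned string.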