Please help me with the following problem. I need to find links that contain some key (a string), and I used the following code:
import urllib2, re
from BeautifulSoup import BeautifulSoup

url = 'http://5pd.ru'
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)
print soup.findAll('a')
for link in soup.findAll('a'):
    if '5' in link:
        print link
It doesn't return anything.
But this example works:
site_list = ['http://extra1.ru/', 'http://5pd.ru/', 'http://google.ru/', 'http://fun.ru/']
for i in site_list:
    if '5' in i:
        print i
It returns the correct link.
I just want to understand the correct way to check that a link contains my string. Maybe I should do something with soup.findAll('a')?
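link is not a string; it is a Tag object. Use link['href'] instead of link inside the for loop, or force a conversion to a string with str(link). Note that str(link) gives the tag's full markup, so the test would then also match the link text.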
Use findAll() with a regular expression:
for link in soup.findAll('a', href=re.compile('5')):
    print link['href']
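To compare the two suggestions side by side, here is a minimal sketch. It makes some assumptions that are not in the question: it uses the newer bs4 package on Python 3 (where findAll is spelled find_all) rather than BeautifulSoup 3, and it parses a small made-up HTML snippet instead of fetching http://5pd.ru.

```python
import re
from bs4 import BeautifulSoup  # assumption: bs4 instead of the old BeautifulSoup 3 module

# Made-up sample markup (hypothetical, stands in for the downloaded page)
html = """
<a href="http://extra1.ru/">extra</a>
<a href="http://5pd.ru/">five</a>
<a name="top">anchor without href</a>
<a href="http://google.ru/">google</a>
"""
soup = BeautifulSoup(html, "html.parser")

# '5' in tag tests the tag's children, not its URL, so nothing matches here
on_tag = [a for a in soup.find_all("a") if "5" in a]

# Substring test on the attribute value; href=True skips <a> tags with no href
by_attr = [a["href"] for a in soup.find_all("a", href=True) if "5" in a["href"]]

# Let find_all do the filtering with a compiled regular expression
by_regex = [a["href"] for a in soup.find_all("a", href=re.compile("5"))]

print(on_tag)
print(by_attr)
print(by_regex)
```

Both href-based variants pick out only the 5pd.ru link. The href=True filter matters in the first variant because a["href"] raises KeyError on an anchor that has no href attribute.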