I am trying to capture links inside a div. In the attached screenshot, I want to capture all the divs inside the "page-size mainclips" container.
When I search with soup.findall("div", class="page-size mainclips"), I am not able to find anything.
What should I search with to get the list of all the data-cliphref values highlighted in the screenshot?
How do I find the divs inside a particular div?
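For reference, two likely issues with that call: the method is spelled find_all (there is no findall), and class is a reserved word in Python, so BeautifulSoup expects the class_ keyword argument instead. A minimal sketch below, using hypothetical markup standing in for the page structure in the screenshot:

```python
from bs4 import BeautifulSoup

# Hypothetical markup mimicking the structure shown in the screenshot.
html = """
<div class="page-size mainclips">
  <div class="clip-box clippageview" data-cliphref="/clip/1"></div>
  <div class="clip-box clippageview" data-cliphref="/clip/2"></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find_all / find (not findall); class_ (not class), since class is reserved.
container = soup.find("div", class_="page-size mainclips")

# Search within that div only, keying on the data-cliphref attribute;
# attrs={"data-cliphref": True} matches any div that has the attribute.
clips = container.find_all("div", attrs={"data-cliphref": True})
hrefs = [d["data-cliphref"] for d in clips]
print(hrefs)  # → ['/clip/1', '/clip/2']
```

Calling find_all on the container tag (rather than on soup) restricts the search to that div's descendants, which is how you find divs inside a particular div.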
import re
import requests
from bs4 import BeautifulSoup

def getHTMLdocument(url):
    # fetch the raw page source (this helper was not shown in the post)
    return requests.get(url).text

# create document
url_to_scrape = "https://epaper.dishadaily.com"
html_document = getHTMLdocument(url_to_scrape)

# create soup object
soup = BeautifulSoup(html_document, 'html.parser')
linklist = []

# find all the anchor tags with an "href"
# attribute starting with "https://"
for link in soup.find_all('a', attrs={'href': re.compile("^https://")}):
    linklist.append(link.get('href'))

print('----------')
print(len(linklist))

substring = "latest?s="
for url_to_scrape1 in linklist:
    if substring in url_to_scrape1:
        print(url_to_scrape1)
        html_document1 = getHTMLdocument(url_to_scrape1)
        soup = BeautifulSoup(html_document1, 'html.parser')
        for each_div in soup.find_all("div", attrs={"class": 'clip-box clippageview'}):
            print(each_div)
I am using BeautifulSoup and Python.
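As an alternative to nested find_all calls, BeautifulSoup's select() method takes a CSS selector, which can express "divs with a data-cliphref attribute inside the mainclips container" in one step. A sketch against hypothetical markup (the real attribute values would come from the page):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one matching container and one unrelated div.
html = """
<div class="page-size mainclips">
  <div class="clip-box clippageview" data-cliphref="/a"></div>
</div>
<div class="other">
  <div class="clip-box clippageview" data-cliphref="/b"></div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS selector: only divs carrying data-cliphref that are descendants
# of a div with both classes page-size and mainclips.
hrefs = [d["data-cliphref"]
         for d in soup.select("div.page-size.mainclips div[data-cliphref]")]
print(hrefs)  # → ['/a']
```

Note that the selector div.page-size.mainclips matches the two classes regardless of their order in the class attribute, which makes it more robust than matching the exact string "page-size mainclips".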