Capture custom divs and div details using BeautifulSoup [closed]

Developer https://www.devze.com 2022-12-07 20:10 Source: Internet
Closed. This question needs details or clarity. It is not currently accepting answers. Want to improve this question? Add details and clarify the problem by editing this post.
Closed 3 hours ago.

I am trying to capture links inside a div. In the attached screenshot, I want to capture all the divs inside the "page-size mainclips" div.

When I search with soup.find_all("div", class_="page-size mainclips"), I am not able to find anything.
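For reference, here is a minimal sketch of how BeautifulSoup treats a class value made of two names. The markup is hypothetical (the real page structure is only visible in the screenshot). Only the class names "page-size" and "mainclips" come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page structure is only shown in the screenshot.
html = """
<div class="page-size mainclips">first</div>
<div class="mainclips page-size">second</div>
<div class="page-size">third</div>
"""
soup = BeautifulSoup(html, "html.parser")

# A multi-word string passed to class_ is compared with the exact attribute value,
# so the order of the class names matters.
exact = soup.find_all("div", class_="page-size mainclips")

# A CSS selector requires both classes to be present, in any order.
both = soup.select("div.page-size.mainclips")

exact_texts = [d.get_text() for d in exact]
both_texts = [d.get_text() for d in both]
print(exact_texts)
print(both_texts)
```

In this sketch the exact-string search returns only the first div, while the selector also returns the second one, whose class names appear in the opposite order.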

What should I search with to get the full list of data-cliphref values highlighted in the screenshot?

How should I find the divs inside a particular div?
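To make the question concrete, this is a rough sketch of the lookup I am after: find one container div, then collect the data-cliphref values of the divs nested inside it. The markup and the "/clip/..." values are made up for illustration. Only the class names and the data-cliphref attribute name come from the page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; attribute values are invented for illustration.
html = """
<div class="page-size mainclips">
  <div class="clip-box clippageview" data-cliphref="/clip/1">A</div>
  <div class="clip-box clippageview" data-cliphref="/clip/2">B</div>
  <div class="clip-box">no link</div>
</div>
<div class="clip-box" data-cliphref="/clip/outside">C</div>
"""
soup = BeautifulSoup(html, "html.parser")

# Locate the container that carries both classes.
container = soup.select_one("div.page-size.mainclips")

clip_hrefs = []
if container is not None:
    # Search only inside the container; True matches any div that has the attribute.
    for div in container.find_all("div", attrs={"data-cliphref": True}):
        clip_hrefs.append(div["data-cliphref"])

print(clip_hrefs)
```

If container comes back as None on the real page, one thing worth checking is whether that div is present at all in the HTML that was downloaded, since the screenshot shows the page as rendered in the browser.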

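import re
from bs4 import BeautifulSoup

# getHTMLdocument() is a helper not shown in this post; assumed to return the page HTML as text
# create document
url_to_scrape = "https://epaper.dishadaily.com"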
html_document = getHTMLdocument(url_to_scrape)

# create soup object
soup = BeautifulSoup(html_document, 'html.parser')
linklist = []


# find all the anchor tags with an "href"
# attribute starting with "https://"
for link in soup.find_all('a', attrs={'href': re.compile("^https://")}):
    # collect the actual URLs
    linklist.append(link.get('href'))

print('----------')
print(len(linklist))
substring = "latest?s="
for url_to_scrape1 in linklist:
    # only follow links that contain the substring
    if substring in url_to_scrape1:
        print(url_to_scrape1)
        html_document1 = getHTMLdocument(url_to_scrape1)
        soup = BeautifulSoup(html_document1, 'html.parser')
        for each_div in soup.find_all("div", attrs={"class": 'clip-box clippageview'}):
            print(each_div)

I am using BeautifulSoup and Python.
