How to remove duplicates in links generated using mechanize in Python?

Developer https://www.devze.com 2023-01-10 06:08 Source: Internet
Here is my Python code, which generates a list of link objects. I want to remove the duplicates from it.

cb = list()
for link in br.links(url_regex="inquiry-results.jsp"):
        cb.append(link)
print set(cb)

But it raises the error "unhashable instance". A link looks something like this:

Link(
    base_url='http://casesearch.courts.state.md.us/inquiry/inquirySearch.jis',
    url='/inquiry/inquiry-results.jsp?action=..........',
    text='12',
    tag='a',
    attrs=[('href', '/inquiry/inquiry-results.jsp?action=.......'),
    ('title', 'Go to page 12')]
    ),

[Added newlines and dots just for convenience]

How can I remove duplicates?


You can build a dictionary keyed by URL and then take its values. Links that share a URL overwrite one another, so only one link per URL survives:

cb = {}
for link in br.links(url_regex="inquiry-results.jsp"):
    cb[link.url] = link
print cb.values()
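
If the order of the links matters, the same idea can be sketched with an OrderedDict so that the first link seen for each URL is kept. This works because the keys are URL strings, which are hashable even though the link objects are not. Note that FakeLink, dedupe_by_url and the sample URLs below are made up for illustration; with mechanize you would pass in the links from br.links(...) instead, assuming each one exposes a .url attribute as in the output shown in the question.

```python
from collections import OrderedDict


class FakeLink(object):
    """Hypothetical stand-in for a mechanize link; only .url and .text are modelled."""

    def __init__(self, url, text):
        self.url = url
        self.text = text


def dedupe_by_url(links):
    """Return links with unique .url values, keeping the first occurrence of each."""
    unique = OrderedDict()
    for link in links:
        # setdefault leaves an existing entry untouched, so the first link wins
        unique.setdefault(link.url, link)
    return list(unique.values())


# Made-up sample data: two of the three links point at the same URL
links = [
    FakeLink("/inquiry/inquiry-results.jsp?page=2", "2"),
    FakeLink("/inquiry/inquiry-results.jsp?page=3", "3"),
    FakeLink("/inquiry/inquiry-results.jsp?page=2", "Next"),
]
deduped = dedupe_by_url(links)
print([link.text for link in deduped])  # ['2', '3']
```

Unlike the plain dict version, this keeps the earliest link for a given URL rather than the latest, which may or may not be what you want.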