for link in hxs.select("//a[contains(@href,'/women-')]"):
    if ('.a[notcontains(@href,"/women-shoes")]'):
        self.log("LINKS2 :: %s" % attribute::href())
The first statement selects all hyperlinks that contain /women- in their URL. What I actually want is to select all links that have /women- in their URL but not /women-shoes.
- How can I put that condition in the for loop itself? I am looking for the correct usage of the not operator in the loop condition.
- Alternatively, if I select all links with /women- in their URL and then, in the if condition, check that the link doesn't have /women-shoes in its URL, how do I do that?
I think it would be more efficient to check first that the URL does not contain /women-shoes and only then check whether it contains /women-:
queryStr = "//a[not(contains(@href,'/women-shoes')) and contains(@href,'/women-')]"
Why not filter within your query?
queryStr = "//a[contains(@href,'/women-') and not(contains(@href,'/women-shoes'))]"
for link in hxs.select(queryStr):
    # Extract the href value of the matched <a> element before logging it
    self.log("LINKS2 :: %s" % link.select('@href').extract()[0])
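For the second part of the question (select everything with /women- first, then test for /women-shoes in an if), the check can be done on the href string itself with Python's in / not in operators. Here is a minimal sketch of that idea, assuming plain Python 3 and the standard-library html.parser instead of Scrapy; HrefCollector, women_links and the sample markup are hypothetical names made up for illustration.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def women_links(html):
    """Return hrefs that contain '/women-' but not '/women-shoes'."""
    collector = HrefCollector()
    collector.feed(html)
    selected = []
    for href in collector.hrefs:
        # First condition: the link belongs to a /women- section
        if "/women-" in href:
            # Second condition, checked in a separate if: skip the shoes section
            if "/women-shoes" not in href:
                selected.append(href)
    return selected


# Made-up sample markup, only for demonstration
sample = """
<a href="/women-dresses">Dresses</a>
<a href="/women-shoes/boots">Boots</a>
<a href="/men-shirts">Shirts</a>
<a href="/women-bags">Bags</a>
"""

print(women_links(sample))  # ['/women-dresses', '/women-bags']
```

In a spider the same two string checks would presumably be applied to the href value extracted from each selected link. Filtering inside the XPath query is usually tidier, but the if version is handy when the exclusion rule is easier to write in Python.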