I'm trying to scrape the price field from this website using the HTML Agility Pack.
My code is as follows:
var web = new HtmlWeb();
var doc = web.Load(String.Format(overClockersURL, componentID));
var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
I obtained the XPath query by using Firebug's "Copy as XPath" feature.
The problem I'm having is that SelectSingleNode is returning null; it doesn't seem to find the element specified by the query. I'm a bit stumped as to why, but I don't have much experience with XPath, so I would appreciate some pointers as to what I've done wrong.
When that happens, you should check whether the page is being loaded correctly (you said you're behind an HTTP proxy?).
Try writing the content of doc.DocumentNode.OuterHtml to a text file so you can see if the page is being loaded correctly. Maybe you're getting an error page instead of the original page.
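A minimal sketch of that check, assuming the product URL quoted elsewhere in this thread; the file name page_dump.html is just a placeholder, and any proxy settings your environment needs are not shown:

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

class DumpPage
{
    static void Main()
    {
        var web = new HtmlWeb();
        var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");

        // Save exactly what HtmlAgilityPack received so it can be opened and inspected.
        // "page_dump.html" is a placeholder file name.
        File.WriteAllText("page_dump.html", doc.DocumentNode.OuterHtml);

        // HtmlWeb records the HTTP status of the last request; anything other
        // than OK suggests an error or proxy page was returned instead.
        Console.WriteLine("HTTP status: " + web.StatusCode);
    }
}
```

If the dumped file does not contain an element with id="prodprice", the XPath is fine and the problem lies in how the page is being fetched.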
If I run this code:
var web = new HtmlWeb();
var doc = web.Load("http://www.overclockers.co.uk/showproduct.php?prodid=GX-033-HS");
var priceContent = doc.DocumentNode.SelectSingleNode("//*[@id=\"prodprice\"]");
Console.WriteLine("price=" + priceContent.InnerHtml);
It outputs:
price=529.99
So it seems to be working. You can also use "//span[@id=\"prodprice\"]", which is more specific because it only matches SPAN elements and skips every other tag.
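As a rough illustration of the span-only query, with a null check so a missing element does not throw, here is a sketch run against a small inline HTML string. The markup is made up for the example and is not the real product page:

```csharp
using System;
using HtmlAgilityPack;

class PriceLookup
{
    static void Main()
    {
        // Hypothetical markup standing in for the real product page.
        var html = "<div id=\"other\">x</div><span id=\"prodprice\">529.99</span>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Only span elements are considered, so a different tag that happened
        // to carry the same id would be ignored.
        var price = doc.DocumentNode.SelectSingleNode("//span[@id=\"prodprice\"]");

        // SelectSingleNode returns null when nothing matches, so test before use.
        if (price == null)
            Console.WriteLine("prodprice not found");
        else
            Console.WriteLine("price=" + price.InnerHtml);
    }
}
```

Checking for null before reading InnerHtml turns the original symptom into a clear message instead of a NullReferenceException.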