I am somewhat new to XPath and understand most of the basics, but I am having some trouble with a particular query.
I am attempting to parse a Motley Fool page and return the source of the image for the caps score of a stock.
For example, in the source of the page http://caps.fool.com/Ticker/SLT.aspx, I want the image source http://g.foolcdn.com/art/ratings/stars/trans/5stars-trans-lg.png
I only want what follows the src= if possible.
I am currently working with:
xpath = "//div[@class='subtle marginT']"
This, however, returns nothing. I know it might be asking a lot, but if you feel like answering, I would also greatly appreciate a brief explanation of the answer, as I want to learn XPath, not just get this query to work.
Based on your URL this worked for me:
var imageNode = doc.DocumentNode.SelectSingleNode("//table[@id='tickerStats']/tbody/tr/td/img");
string imageText = imageNode.Attributes["src"].Value;
Basically just grabbing the closest element that has an id, then walking the tree down to where you want to be.
Alternatively this would work too and seems a little cleaner (since you don't really care about the DOM structure in the table itself as long as there is just one image):
var statsNode = doc.DocumentNode.SelectSingleNode("//table[@id='tickerStats']");
var imageNode = statsNode.SelectSingleNode(".//img");
string imageText = imageNode.Attributes["src"].Value;
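Both snippets above assume that doc is an already-loaded HtmlAgilityPack HtmlDocument and that the query finds a match. SelectSingleNode returns null when nothing matches, so dereferencing the result directly can throw. Here is a minimal sketch of the second variant with null guards. The HTML fragment is a made-up stand-in for the real page, and CapsScraper and GetStarImageSrc are hypothetical names used only for illustration:

```csharp
using System;
using HtmlAgilityPack; // assumes the HtmlAgilityPack NuGet package, as in the answer above

public static class CapsScraper
{
    // Hypothetical helper: returns the src of the first <img> inside the
    // table with id="tickerStats", or null if the table or image is missing.
    // If the <img> exists but has no src attribute, returns an empty string.
    public static string GetStarImageSrc(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Anchor on the nearest element that carries an id.
        var statsNode = doc.DocumentNode.SelectSingleNode("//table[@id='tickerStats']");
        if (statsNode == null) return null;

        // ".//img" searches only below statsNode, whatever the nesting.
        var imageNode = statsNode.SelectSingleNode(".//img");
        if (imageNode == null) return null;

        return imageNode.GetAttributeValue("src", string.Empty);
    }

    public static void Main()
    {
        // Made-up fragment standing in for the real page.
        const string html =
            "<html><body><table id='tickerStats'><tbody><tr><td>" +
            "<img src='http://g.foolcdn.com/art/ratings/stars/trans/5stars-trans-lg.png'>" +
            "</td></tr></tbody></table></body></html>";

        Console.WriteLine(GetStarImageSrc(html) ?? "(no image found)");
    }
}
```

Because the relative query does not spell out tbody/tr/td, this version should keep working if the nesting inside the table changes, as long as the table still contains exactly one image.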
Use:
//table[@id='tickerStats']/tbody/tr/td/img/@src
This selects any attribute named src of any element named img that is a child of a td, that is a child of a tr, that is a child of a tbody, that is a child of any table in the document that has an id attribute with the value 'tickerStats'.
If you need just the string value of this attribute (assuming the above XPath expression selects a single attribute node), use:
string(//table[@id='tickerStats']/tbody/tr/td/img/@src)
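To show the difference between selecting the attribute node and taking its string value, here is a small sketch using the XPath 1.0 engine built into .NET (System.Xml and System.Xml.XPath). It runs against a made-up, well-formed fragment rather than the live page, which may not parse as XML. The class and method names are illustrative only:

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public static class XPathSrcDemo
{
    // Made-up, well-formed stand-in for the relevant part of the page.
    public const string SampleXml =
        "<html><body><table id='tickerStats'><tbody><tr><td>" +
        "<img src='http://g.foolcdn.com/art/ratings/stars/trans/5stars-trans-lg.png'/>" +
        "</td></tr></tbody></table></body></html>";

    private const string SrcPath = "//table[@id='tickerStats']/tbody/tr/td/img/@src";

    // Selects the attribute node itself; returns null when nothing matches.
    public static XmlNode SrcAttributeNode(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectSingleNode(SrcPath);
    }

    // Wraps the path in string(...); an empty node-set converts to "".
    public static string SrcAsString(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        XPathNavigator nav = doc.CreateNavigator();
        return (string)nav.Evaluate("string(" + SrcPath + ")");
    }

    public static void Main()
    {
        Console.WriteLine(SrcAttributeNode(SampleXml).Value); // value read from the attribute node
        Console.WriteLine(SrcAsString(SampleXml));            // same value, returned directly as a string
    }
}
```

The string(...) form is convenient when the calling API can return a plain string, since a missing match yields an empty string instead of a null node.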