How 开发者_Python百科can I loop through table and row that have an attribute id or name to get inner text in deep down in each td cell? I work on asp.net, c#, and the newest html agility package. Please guide. Thank you.
An html file have several tables. One of them has an attribute id=main-part. In that identified table, there are many rows. Some of those rows have same attribute name=display. In those named rows, there are many columns which I have to extract text from. Something like this:
<body>
<table>
...
</table>
<table>
...
</table>
<table id="main-part">
<tr>
<td></td>
...
</tr>
<tr>
<td></td>
...
</tr>
<tr name="display">
<td>Jan</td>
<td>Feb</td>
<td>Mar</td>
...
</tr>
<tr name="display">
<td>Apr</td>
<td>May</td>
<td>June</td>
...
</tr>
<tr name="display">
<td>Jul</td>
<td>Aug</td>
<td>Sep</td>
...
</tr>
<tr>
<td></td>
...
</tr>
<tr name="display">
<td>Oct</td>
<td>Nov</td>
<td>Dec</td>
...
</tr>
<tr>
<td></td>
...
</tr>
</table>
<table>
...
</table>
</body>
You need to select these nodes using xpath:
foreach(HtmlNode cell in doc.DocumentElement.SelectNodes("//tr[@name='display']/td")
{
// get cell data
}
It worked! Thank you very much Oded.
HtmlDocument doc = new HtmlDocument();
doc.Load(@"C:/samplefolder/sample.htm");
foreach(HtmlNode cell in doc.DocumentNode.SelectNodes("//tr[@name='display']/td"))
{
string test = cell.InnerText;
Response.Write(test);
}
It showed result like JanFebMarAprMayJuneJulAugSepOctNovDec. How can I sort them out, separate by a space or a tab? Thank you.
精彩评论