Being new to programming, I read a lot of sample code and try to hack things together in an attempt to learn what works. I am working with HTML Agility Pack, trying to scrape a news webpage.
Problem: One of the nodes that I am testing does not contain a static value; it contains the time of viewing. How can I handle this in a switch/case statement? I am also open to suggestions if I am way off base with this entire approach.
Note also: I don't need to capture this node, so if there is a way to skip it, that works for me.
I decided to start from an example that uses a switch:
var rows = doc.DocumentNode.SelectNodes(".//*[@id='weekdays']/tr");
foreach (var row in rows)
{
var cells = row.SelectNodes("./td");
string title = cells[0].InnerText;
var valueRow = cells[2];
switch (title)
{
case "Date":
HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
Console.WriteLine("UPC=A:\t" + date.InnerText);
break;
case "":
string Time = valueRow.InnerText;
Console.WriteLine("Time:\t" + Time);
break;
case "News":
string News = valueRow.InnerText;
Console.WriteLine("News:\t" + News);
break;
}
}
Excerpt of the HTML:
<table id="weekdays" cellpadding="6" cellspacing="0" border="0" width="100%">
<tr>
<td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; text-align: center; width:8%; padding-left: 6px;">Date</td>
<td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; width:8%; text-align: center; white-space:nowrap"><a href="guestcp.php?do=customoptions" title="Time & Date Options"><img style="position:relative; vertical-align: bottom;" src="images/misc/clock_small.gif" title="Time & Date Options" alt="Time & Date Options" border="0" /></a><a href="guestcp.php?do=customoptions" title="Time & Date Options"><span id="ff_nowtime_clock">3:20pm</span></a></td>
<td class="thead" style="border-bottom: 1px solid #d1d1e1;font-weight:normal; text-align: center; width:8%;">News</td>
.........
<tr id="detail_row_seek_37876">
<td id="toprow_9" class="alt1 espace" rowspan="3" style="vertical-align: top; text-align: center;" nowrap="nowrap">
<span class="smallfont">
<div>Sat</div>
Apr 9
</span>
</td>
<td class="alt1 espace" style="text-align: center;" nowrap="nowrap">
<span class="smallfont">Day 3</span>
</td>
<td class="alt1 espace" style="text-align: center;"><span class="smallfont">EUR</span></td>
<td class="alt1 espace" style="padding-top: 2px" align="center">
<a name="chart=37876" style="position:absolute; margin-top: -10px;"></a><a name="details=37876" style="position:absolute; margin-top: -10px;"></a>
<div class="cal_imp_medium" title="Medium Impact Expected"></div></td>
<td class="alt1 espace">
<div class="smallfont" id="title_37876" style="padding-left: 11px;">ECOFIN Meetings</div>
</td>
The problem is: the so-called time column header is not static; it actually holds a time value. Is there a way to use a wildcard in the case label, or a way to do a "contains" check, to get around this?
You must use constant values in each case of the switch statement.
The only way I can think of to do what you are looking for is to use the default: case. Within this default case you can test the value you are looking for with a Contains, Parse, or Regex test inside an if.
I couldn't quite follow your HTML sample code (sorry!) - but the modified C# might look something like:
switch (title)
{
case "Date":
HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
Console.WriteLine("UPC=A:\t" + date.InnerText);
break;
case "News":
string News = valueRow.InnerText;
Console.WriteLine("News:\t" + News);
break;
default:
// regexTime is a System.Text.RegularExpressions.Regex that you define to match the time text
if (regexTime.IsMatch(title))
{
string Time = valueRow.InnerText;
Console.WriteLine("Time:\t" + Time);
}
break;
}
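To make the default-case idea concrete, here is a minimal self-contained sketch. The class name HeaderClassifier, the Classify helper, and the regex pattern are all assumptions made for illustration; the pattern only assumes the header text looks like "3:20pm", as in the HTML excerpt, and may need adjusting for the real page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderClassifier
{
    // Assumed pattern: clock text such as "3:20pm" or "11:05 AM",
    // with optional surrounding whitespace.
    private static readonly Regex RegexTime =
        new Regex(@"^\s*\d{1,2}:\d{2}\s*(am|pm)\s*$", RegexOptions.IgnoreCase);

    // Hypothetical helper: maps a header cell's text to a column name.
    public static string Classify(string title)
    {
        switch (title)
        {
            case "Date":
                return "Date";
            case "News":
                return "News";
            default:
                // Case labels must be constants, so the non-constant
                // test goes inside the default section.
                return RegexTime.IsMatch(title) ? "Time" : "Unknown";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Classify("3:20pm")); // Time
        Console.WriteLine(Classify("News"));   // News
        Console.WriteLine(Classify("Day 3"));  // Unknown
    }
}
```

Returning "Unknown" for anything else also gives you a simple way to skip cells you don't care about.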
You could use the "default:" case and put a condition inside it that does the check.
switch (title) {
case "Date":
HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
Console.WriteLine("UPC=A:\t" + date.InnerText);
break;
case "News":
string News = valueRow.InnerText;
Console.WriteLine("News:\t" + News);
break;
default:
if (whatever you need) {
...
}
break;
}
The case label has to be a constant expression; see MSDN. If you can switch to using if-else, you'll have more freedom.
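As a rough illustration of the if-else route, the sketch below does the same classification without a switch. The names IfElseClassifier and Classify are hypothetical, and treating "ends with am or pm" as the sign of a time header is an assumption based on the "3:20pm" value in the question's HTML.

```csharp
using System;

public static class IfElseClassifier
{
    // Hypothetical helper: conditions in an if-else chain can be any
    // boolean expression, not just comparisons against constants.
    public static string Classify(string title)
    {
        string text = title.Trim();

        if (text == "Date")
        {
            return "Date";
        }
        else if (text == "News")
        {
            return "News";
        }
        else if (text.EndsWith("am", StringComparison.OrdinalIgnoreCase) ||
                 text.EndsWith("pm", StringComparison.OrdinalIgnoreCase))
        {
            // Assumption: the time header always ends in "am" or "pm".
            return "Time";
        }

        return "Unknown";
    }

    public static void Main()
    {
        Console.WriteLine(Classify(" 3:20pm ")); // Time
        Console.WriteLine(Classify("Date"));     // Date
    }
}
```

If you are on C# 7.0 or later, a case guard such as `case string s when s.EndsWith("pm"):` is another option that keeps the switch.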
Add a default implementation, and put your extra condition checks there. Or you may be better off with just if statements.
switch (title)
{
case "Date":
HtmlNode date = valueRow.SelectSingleNode("//*[starts-with(@id, 'detail_row_seek')]/td");
Console.WriteLine("UPC=A:\t" + date.InnerText);
break;
case "":
string Time = valueRow.InnerText;
Console.WriteLine("Time:\t" + Time);
break;
case "News":
string News = valueRow.InnerText;
Console.WriteLine("News:\t" + News);
break;
default:
// put special time condition check logic here.
break;
}