Parsing HTML - How to get a number from a tag?

Developer https://www.devze.com 2023-02-22 14:52 Source: Internet
I am developing a Windows Forms application which is interacting with a web site.

Using a WebBrowser control I am controlling the web site and I can iterate through the tags using:

HtmlDocument webDoc1 = this.webBrowser1.Document;
HtmlElementCollection aTags = webDoc1.GetElementsByTagName("a");

Now I want to extract a particular piece of text from the tag below:

<a href="issue?status=-1,1,2,3,4,5,6,7&amp;@sort=-activity&amp;@search_text=&amp;@dispname=Show Assigned&amp;@filter=status,assignedto&amp;@group=priority&amp;@columns=id,activity,title,creator,status&amp;assignedto=244&amp;@pagesize=50&amp;@startwith=0">Show Assigned</a><br>

Here I want to get the number 244, which is the value of assignedto in the tag above, and save it in a variable for later use.

How can I do this?


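You can try splitting the string on the "&amp;" separators, and then splitting each piece on '=', like this:

string aTag = ...;
// Split on the whole "&amp;" token. Splitting on ';' alone would leave a
// trailing "&amp" on every value, and int.Parse("244&amp") would throw.
foreach (var part in aTag.Split(new[] { "&amp;" }, StringSplitOptions.RemoveEmptyEntries))
{
    // Split into at most two pieces: the name and the value.
    var pair = part.Split(new[] { '=' }, 2);
    if (pair.Length == 2 && pair[0] == "assignedto")
    {
        MessageBox.Show(pair[1]); // It should be 244
        // Or...
        int num = int.Parse(pair[1]);
    }
}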

Another option is to use a regular expression, which you can test here: www.regextester.com. There is more information on the Regex class here: http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx
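
As a rough illustration of the regular expression route, here is a minimal sketch. The class name AssignedToRegex and the method name Extract are made up for this example, and the pattern assumes the value is always a run of digits, as in the sample tag.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any library.
public static class AssignedToRegex
{
    // Matches "assignedto=<digits>" only when it starts a parameter, i.e. when
    // it is preceded by '?', '&' or ';' (the ';' covers the "&amp;" form found
    // in raw markup). The lookbehind keeps a longer parameter name that merely
    // ends in "assignedto" from matching.
    private static readonly Regex Pattern =
        new Regex(@"(?<=[?&;])assignedto=([0-9]+)");

    // Returns the number, or null when the parameter is absent or too large for an int.
    public static int? Extract(string text)
    {
        Match match = Pattern.Match(text);
        int value;
        if (match.Success && int.TryParse(match.Groups[1].Value, out value))
        {
            return value;
        }
        return null;
    }
}
```

Calling AssignedToRegex.Extract on the tag text from the question should give 244. The pattern works on both the encoded ("&amp;") and the decoded ("&") form of the link, so it does not matter which one you pass in.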

Hope it helps!


If all cases are similar to this one and you don't mind a reference to System.Web in your Windows Forms application, you can do something like this:

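using System;
using System.Web; // requires a reference to System.Web

public class Program
{
    static void Main()
    {
        // The href as it appears in the page markup (still HTML-encoded).
        string href = "issue?status=-1,1,2,3,4,5,6,7&amp;"
            + "@sort=-activity&amp;@search_text=&amp;@dispname=Show Assigned&amp;"
            + "@filter=status,assignedto&amp;@group=priority&amp;"
            + "@columns=id,activity,title,creator,status&amp;assignedto=244&amp;"
            + "@pagesize=50&amp;@startwith=0";

        // Turn "&amp;" back into "&".
        href = HttpUtility.HtmlDecode(href);

        // Parse only the part after '?', so the first key is "status" and not "issue?status".
        var querystring = HttpUtility.ParseQueryString(href.Substring(href.IndexOf('?') + 1));

        Console.WriteLine(querystring["assignedto"]); // prints 244
    }
}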

This is a simplified example: first you need to extract the href attribute text, but that should not be complex. Once you have the href text, you can take advantage of the fact that it is basically a query string and reuse the code in .NET that already parses query strings.

To complete the example, to obtain the href attribute text you could do:

HtmlElementCollection aTags = webBrowser.Document.GetElementsByTagName("a");

foreach (HtmlElement element in aTags)
{
    string href = element.GetAttribute("href");
}
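
Putting the two pieces together, a minimal sketch could look like the following. The class name HrefQuery and the method name GetAssignedTo are made up for this example. It assumes the href may or may not still contain "&amp;", and that a missing or non-numeric assignedto should yield null rather than an exception.

```csharp
using System;
using System.Web; // requires a reference to System.Web

// Hypothetical helper, not part of any library.
public static class HrefQuery
{
    // Returns the assignedto value from an href, or null when the
    // parameter is missing or is not a valid integer.
    public static int? GetAssignedTo(string href)
    {
        if (string.IsNullOrEmpty(href))
        {
            return null;
        }

        // Assumption: the text may still be HTML-encoded, as in the sample tag.
        string decoded = HttpUtility.HtmlDecode(href);

        // Keep only the text after '?', so the path does not end up in the first key.
        int questionMark = decoded.IndexOf('?');
        string query = questionMark >= 0 ? decoded.Substring(questionMark + 1) : decoded;

        string raw = HttpUtility.ParseQueryString(query)["assignedto"];

        int value;
        if (int.TryParse(raw, out value))
        {
            return value;
        }
        return null;
    }
}
```

Inside the foreach loop above you would then call HrefQuery.GetAssignedTo(element.GetAttribute("href")) and keep the result when it is not null. Links that have no assignedto parameter simply return null and can be skipped.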