Get href tags in html data in c#

Developer https://www.devze.com 2022-12-25 16:52 Source: Internet

I am using the WebClient class to download HTML data from a web page. Now I want to get the complete href tags and their titles from the HTML data. Initially I used loops. Feeling that was inefficient, I switc

Here is the initial code:

for (int i = 0; i < htmldata.Length - 5; i++)
{
  if (htmldata.Substring(i, 5) == "href=")
  {
    n1 = htmldata.Substring(i + 6, htmldata.Length - (i + 6)).IndexOf("\"");
    Sublink = htmldata.Substring(i + 6, n1);
    var absoluteUri = new Uri(baseUri, temp);
    n2 = htmldata.Substring(i + n1 + 1, htmldata.Length - (i + n1 + 1)).IndexOf("<");
    subtitle = htmldata.Substring(i + 6 + n1 + 2, n2 - 7); 
  }
}

This code returns some of the links like this:

/l.href.replace(new RegExp(

/advanced_search?hl=en&q=&hl=en&

and titles like this:

onclick=gbar.qs(this) class=gb2>Photos

")+"q="+encodeURIComponent(b)})}i.qs=n;function o(a,b,d,c,f,e){var g=document.getElementById(a);if(g){var

These are clearly invalid. Please suggest the correct code for getting valid relative href links and titles.

Use the HTML Agility Pack to parse the HTML for you. You can then use XPath expressions to select all links in the page and their associated data.

Trying to parse HTML by yourself is error-prone and brittle, as you have already discovered.
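To make the suggestion concrete, here is a minimal sketch of how this could look with the HtmlAgilityPack NuGet package. The class name `LinkExtractor`, the method `ExtractLinks`, and the choice to keep only http/https links are my own assumptions for illustration, not something from the question. The link text is used as the "title", which also may not be exactly what you need.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

public static class LinkExtractor // hypothetical helper, named for this example only
{
    // Returns (absolute URL, link text) pairs for every <a href="..."> in the HTML.
    public static List<(Uri Url, string Title)> ExtractLinks(string html, Uri baseUri)
    {
        var results = new List<(Uri Url, string Title)>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // XPath: every <a> element that has an href attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return results;
        }

        foreach (var anchor in anchors)
        {
            // Decode entities such as &amp; in the attribute value.
            string href = WebUtility.HtmlDecode(anchor.GetAttributeValue("href", string.Empty));

            // Resolve relative links against the page URL; skip anything unparsable.
            if (!Uri.TryCreate(baseUri, href, out Uri absolute))
            {
                continue;
            }

            // Assumption: only http/https links are wanted (drops javascript:, mailto:, ...).
            if (absolute.Scheme != Uri.UriSchemeHttp && absolute.Scheme != Uri.UriSchemeHttps)
            {
                continue;
            }

            string title = WebUtility.HtmlDecode(anchor.InnerText).Trim();
            results.Add((absolute, title));
        }

        return results;
    }
}
```

You would call it as `LinkExtractor.ExtractLinks(htmldata, baseUri)`. Because the parser treats the contents of `<script>` blocks as text rather than elements, strings like `href=` inside JavaScript should no longer show up as links, which is what was going wrong with the substring loop.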

RegEx match open tags except XHTML self-contained tags
