Select all <p>'s from a Node's children using HTMLAgilityPack

Developer https://www.devze.com 2022-12-17 21:53 Source: web
I've got the following code that I'm using to fetch an HTML page, make its URLs absolute, and then add rel="nofollow" to the links and make them open in a new window/tab. My issue is with adding the attributes to the <a> elements.

        string url = "http://www.mysite.com/";
        string strResult = "";            

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();

        if ((request.HaveResponse) && (response.StatusCode == HttpStatusCode.OK)) {
            using (StreamReader sr = new StreamReader(response.GetResponseStream())) {
                strResult = sr.ReadToEnd();
                sr.Close();
            }
        }

        HtmlDocument ContentHTML = new HtmlDocument();
        ContentHTML.LoadHtml(strResult);
        HtmlNode ContentNode = ContentHTML.GetElementbyId("content");

        foreach (HtmlNode node in ContentNode.SelectNodes("/a")) {
            node.Attributes.Append("rel", "nofollow");
            node.Attributes.Append("target", "_blank");
        }

        return ContentNode.WriteTo();

Can anyone see what I'm doing wrong? I've been trying for a while with no luck. The code fails with an error saying that ContentNode.SelectNodes("/a") isn't set to an instance of an object. I thought about trying to set the stream position to 0?

Cheers, Denis


Is ContentNode null? You might need to use SelectSingleNode with the query "//*[@id='content']".

For info, "/a" means all anchors at the root. Does "descendant::a" work? There is also HtmlElement.GetElementsByTagName, which might be easier, i.e. yourElement.GetElementsByTagName("a").
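
To make the suggestions above concrete, here is a minimal sketch rather than a definitive fix. It assumes the page contains an element with id "content", as in the question. The class name LinkRewriter and the method name AddLinkAttributes are hypothetical, introduced only for illustration. It looks up the container with SelectSingleNode, selects anchors relative to that container with "descendant::a", and guards against SelectNodes returning null, which Html Agility Pack does when nothing matches. It also uses SetAttributeValue instead of Attributes.Append so that an existing rel or target attribute is overwritten rather than duplicated. In Html Agility Pack itself, Descendants("a") on an HtmlNode is a comparable non-XPath alternative.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical helper class, named here only for illustration.
public static class LinkRewriter
{
    // Returns the markup of the element with id "content", with
    // rel="nofollow" and target="_blank" set on every <a> inside it.
    public static string AddLinkAttributes(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Look the container up by XPath, as suggested in the answer.
        HtmlNode container = doc.DocumentNode.SelectSingleNode("//*[@id='content']");
        if (container == null)
        {
            throw new InvalidOperationException("No element with id 'content' was found.");
        }

        // "descendant::a" is evaluated relative to the container, so it matches
        // anchors at any depth below it (".//a" is the usual shorthand).
        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection anchors = container.SelectNodes("descendant::a");
        if (anchors != null)
        {
            foreach (HtmlNode anchor in anchors)
            {
                // SetAttributeValue adds the attribute or overwrites an existing one.
                anchor.SetAttributeValue("rel", "nofollow");
                anchor.SetAttributeValue("target", "_blank");
            }
        }

        return container.WriteTo();
    }
}
```

A call such as LinkRewriter.AddLinkAttributes("<div id='content'><p><a href='http://www.mysite.com/a'>A</a></p></div>") should return the div with both attributes present on the anchor. If the original null reference came from ContentNode itself being null, the explicit exception above makes that case visible instead of failing inside the loop.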
