How to get raw page source (not generated source) from C#

Developer https://www.devze.com 2023-04-06 18:08 Source: Internet
The goal is to get the raw source of the page, meaning no scripts are run and the browser does not reformat the page at all. For example, suppose the source in the response is <table><tr></table>; I don't want to get <table><tbody><tr></tr></tbody></table>. How can I do this via C# code?

More info: for example, typing "view-source:http://feeds.gawker.com/kotaku/full" in the browser's address bar gives you an XML file, but if you just open "http://feeds.gawker.com/kotaku/full" the browser renders an HTML page. What I want is the XML file. I hope this is clear.


Here's one way, but it's not really clear what you actually want.

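using System.Net;

// WebClient returns the response body as text; no scripts run and no DOM is built.
using (var wc = new WebClient())
{
    var source = wc.DownloadString("http://google.com");
}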
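If even the text decoding done by DownloadString is unwanted, the body can be read as bytes instead. The following is a minimal sketch, assuming HttpClient is available in your target framework; the class name RawSource and the method FetchAsync are hypothetical names chosen for illustration, not part of any library.

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public static class RawSource
{
    // Hypothetical helper: returns the response body bytes as received by
    // HttpClient. Nothing is parsed, rendered, or executed.
    public static async Task<byte[]> FetchAsync(HttpClient client, string url)
    {
        using (HttpResponseMessage response = await client.GetAsync(url))
        {
            // Throws HttpRequestException for non-success status codes.
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsByteArrayAsync();
        }
    }
}
```

A call such as `byte[] raw = await RawSource.FetchAsync(new HttpClient(), "http://feeds.gawker.com/kotaku/full");` would then hold whatever the server returned for the feed URL, which you can decode yourself with the encoding of your choice.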


If you mean when rendering your own page, you can get access to the raw page content using a ResponseFilter, or by overriding the page's Render method. I would question your motives for doing this, though.
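A response filter of this kind is just a Stream that sits in front of the real output stream. The sketch below is one possible shape, not the answerer's own code: CapturingFilter and CapturedText are hypothetical names, and it assumes a compiler that supports C# 7 expression-bodied members.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical pass-through stream: everything written to it is copied
// into an in-memory buffer and then forwarded unchanged to the inner stream.
public sealed class CapturingFilter : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _copy = new MemoryStream();

    public CapturingFilter(Stream inner) { _inner = inner; }

    // Decodes the bytes captured so far using the given encoding.
    public string CapturedText(Encoding encoding) => encoding.GetString(_copy.ToArray());

    public override void Write(byte[] buffer, int offset, int count)
    {
        _copy.Write(buffer, offset, count);   // keep a copy of the raw output
        _inner.Write(buffer, offset, count);  // pass the output through untouched
    }

    public override void Flush() => _inner.Flush();

    // The filter is write-only, so the read and seek members are unsupported.
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}
```

In a classic ASP.NET page this would typically be installed with `Response.Filter = new CapturingFilter(Response.Filter);` before any output is written, and the captured text read back once rendering has finished.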

Scripts run client-side, so they have no bearing on any C# code.


You can use a tool such as Fiddler to see what is actually being sent over the wire.

Disclaimer: I think Fiddler is amazing.
