HttpWebRequest and Unicode characters

Developer https://www.devze.com 2023-04-07 08:44 Source: Internet

I am using this code:

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);
string result = null;
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
{
   StreamReader reader = new StreamReader(resp.GetResponseStream());
   result = reader.ReadToEnd();
   reader.Close();
}

In result I get text like 003cbr /003e003cbr /003e (I think this should be two line breaks instead). I tried the two- and three-parameter StreamReader constructors, but the string was the same. (The request returns a JSON string.)

Why am I getting those characters, and how can I avoid them?

It's not really clear what that text is, but you're not specifying an encoding at the moment. What content encoding is the server using? StreamReader will default to UTF-8.
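To make the encoding explicit rather than relying on the default, something like the following sketch could be used. The helper name ReadAll and the fallback to UTF-8 are assumptions for illustration, not part of the original code; the charset argument would typically come from HttpWebResponse.CharacterSet.

```csharp
using System;
using System.IO;
using System.Text;

static class ResponseText
{
    // Hypothetical helper: reads the whole stream as text using the named
    // charset, falling back to UTF-8 when the name is missing or unknown.
    public static string ReadAll(Stream body, string charset)
    {
        Encoding encoding;
        try
        {
            encoding = string.IsNullOrEmpty(charset)
                ? Encoding.UTF8
                : Encoding.GetEncoding(charset);
        }
        catch (ArgumentException)
        {
            // Encoding.GetEncoding throws for names it does not recognise.
            encoding = Encoding.UTF8;
        }

        // Disposing the reader also disposes the underlying stream.
        using (var reader = new StreamReader(body, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With the code from the question, the call would look like ResponseText.ReadAll(resp.GetResponseStream(), resp.CharacterSet). Note that this only affects how bytes are turned into characters; it does not change any escape sequences contained in the text itself.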

It sounds like actually you're getting some sort of oddly-encoded HTML, as U+003C is < and U+003E is >, giving <br /><br /> as the content. That's not JSON...

Two tests:

- Use WebClient.DownloadString, which will detect the right encoding to use
- See what gets shown using the same URL in a browser

EDIT: Okay, now that I've seen the text, it's actually got:

\u003cbr /\u003e

The \u part is important here: it's JSON escape syntax, stating that the next four characters form the hex representation of a UTF-16 code unit.
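As a small worked example of what that escape means, the sketch below decodes a single \uXXXX sequence by hand. The class and method names are made up for illustration, and a real JSON parser does this for you, so this is only meant to show the mechanics.

```csharp
using System;
using System.Globalization;

static class JsonEscapeDemo
{
    // Hypothetical helper: turns one escape such as \u003c into the
    // UTF-16 code unit it represents ('<' in that case).
    public static char DecodeEscape(string escape)
    {
        if (escape == null || escape.Length != 6 ||
            !escape.StartsWith("\\u", StringComparison.Ordinal))
        {
            throw new FormatException("Expected an escape of the form \\uXXXX.");
        }

        // The four characters after \u are hexadecimal digits.
        ushort codeUnit = ushort.Parse(
            escape.Substring(2),
            NumberStyles.AllowHexSpecifier,
            CultureInfo.InvariantCulture);

        return (char)codeUnit;
    }
}
```

Here DecodeEscape("\\u003c") gives '<' (0x3C is 60) and DecodeEscape("\\u003e") gives '>' (0x3E is 62), which is why the text in the question reads as a pair of br tags.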

Any JSON API used to parse that text should perform the unescaping for you.
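For instance, assuming System.Text.Json is available (it ships with .NET Core 3.0 and later, and as a NuGet package for older targets), a minimal sketch might look like this. The property name "text" is an assumption, since the question does not show the shape of the actual JSON.

```csharp
using System;
using System.Text.Json;

static class JsonUnescapeDemo
{
    // Hypothetical helper: parses a JSON object and returns the value of
    // its "text" property. The parser resolves \uXXXX escapes itself.
    public static string ReadText(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.GetProperty("text").GetString();
        }
    }
}
```

Calling JsonUnescapeDemo.ReadText(@"{""text"":""\u003cbr /\u003e\u003cbr /\u003e""}") returns the string <br /><br />. The verbatim string literal keeps the backslashes as they would arrive over the wire, so the unescaping is done by the parser rather than by the C# compiler.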