XmlDocument Load vs. LoadXml with Right Quote

Developer https://www.devze.com 2023-03-09 07:58 Source: web
I am seeing a difference between the way XmlDocument.Load() and XmlDocument.LoadXml() work in .NET 2.0. Given the XML document below (note the right single quotation mark, character code 146 in Windows-1252, which is outside the ASCII range):

<?xml version="1.0" encoding="utf-8"?>
<nodes>
  <node>Some Data ’</node>
</nodes>

Why does it load fine with LoadXml() when passed in as a string, but fail when passed to Load() as a document? Internet Explorer and other XML editors also will not load and display this file.

Simplified Code Example:

[WebMethod]
public bool SubmitData(string xmlDoc)
{
   try
   {
      XmlDocument doc = new XmlDocument();
      doc.LoadXml(xmlDoc);
   }
   catch
   {
      return false;
   }

   return true;
}

I know the code is poor, but it is only meant to demonstrate the problem. If the string "xmlDoc" is not a legitimate XML document, I want this method to fail.

I can't control the content of the XML sent to me. I just receive it and work with it through the web service. Apparently, the people calling it are copying and pasting data from a Word Doc. I didn't design this either but I am stuck maintaining it. :)


The difference is the encoding. When loading from a file, the bytes are decoded as UTF-8 (as the XML declaration says), and a lone byte 146 (0x92) is not a valid UTF-8 sequence, which is probably what happens in your case. LoadXml ignores the encoding declaration because .NET strings are already decoded and do not need to be decoded again. Therefore your special character is a valid character in the string and everything is fine.
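To make the difference concrete, here is a minimal sketch that feeds the sample document to both methods. It assumes the failing file was saved in a single-byte code page, so that the quote is stored as the lone byte 0x92; the class and method names are made up for illustration, and ISO-8859-1 is used only as a convenient way to produce that byte.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class LoadVersusLoadXml
{
    // The sample document from the question, with the quote written as U+2019.
    public const string Xml =
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n" +
        "<nodes>\n  <node>Some Data \u2019</node>\n</nodes>";

    // Correctly encoded file content: U+2019 becomes the bytes E2 80 99.
    public static byte[] Utf8Bytes()
    {
        return Encoding.UTF8.GetBytes(Xml);
    }

    // Assumed broken file content: the quote is the single byte 0x92,
    // while the declaration still claims UTF-8.
    public static byte[] MisEncodedBytes()
    {
        string withC1Char = Xml.Replace('\u2019', '\u0092');
        return Encoding.GetEncoding("iso-8859-1").GetBytes(withC1Char);
    }

    // LoadXml parses an already-decoded string; no byte decoding happens.
    public static bool LoadsFromString(string xml)
    {
        try { new XmlDocument().LoadXml(xml); return true; }
        catch (XmlException) { return false; }
    }

    // Load(Stream) has to decode the raw bytes before parsing them.
    public static bool LoadsFromBytes(byte[] bytes)
    {
        try
        {
            using (MemoryStream stream = new MemoryStream(bytes))
            {
                new XmlDocument().Load(stream);
            }
            return true;
        }
        catch (XmlException) { return false; }
    }

    public static void Main()
    {
        Console.WriteLine("LoadXml(string):        " + LoadsFromString(Xml));
        Console.WriteLine("Load(valid UTF-8):      " + LoadsFromBytes(Utf8Bytes()));
        Console.WriteLine("Load(lone byte 0x92):   " + LoadsFromBytes(MisEncodedBytes()));
    }
}
```

Under these assumptions the first two calls should succeed and only the third should fail, which suggests the file itself is mis-encoded rather than the string handed to the web service.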


In case anyone is interested, we found another way to "validate" the string to at least check for proper characters. We had tried similar things before but didn't use the correct constructor on UTF8Encoding and hence, didn't get the results we hoped for. This won't necessarily check for XML correctness, but it will at least validate the characters being sent to us.

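string xmlDoc = ""; // whatever has been passed in.
try
{
   // Second argument (throwOnInvalidBytes) is true: invalid byte sequences throw
   // instead of being silently replaced.
   System.Text.UTF8Encoding utf8 = new System.Text.UTF8Encoding(false, true);
   // Note: these are the string's UTF-16 bytes, so this is only a rough character
   // check, not a full UTF-8 or XML validation.
   byte[] bytes = new System.Text.UnicodeEncoding().GetBytes(xmlDoc);
   utf8.GetChars(bytes);
}
catch (Exception ex)
{
   Console.WriteLine(ex.Message);
   return false;
}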
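The constructor choice mentioned above can be illustrated in isolation. This is a sketch, not the poster's code: it assumes the raw bytes are available (for example, read from the uploaded file), and the names StrictUtf8Check and IsValidUtf8 are hypothetical. It compares a UTF8Encoding built with throwOnInvalidBytes set to true against the default Encoding.UTF8.

```csharp
using System;
using System.Text;

public static class StrictUtf8Check
{
    // Returns true when the bytes are well-formed UTF-8.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // Arguments: encoderShouldEmitUTF8Identifier = false, throwOnInvalidBytes = true.
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    public static void Main()
    {
        byte[] good = { 0xE2, 0x80, 0x99 }; // U+2019 encoded as UTF-8
        byte[] bad = { 0x92 };              // the same quote as a lone Windows-1252 byte

        Console.WriteLine(IsValidUtf8(good));
        Console.WriteLine(IsValidUtf8(bad));

        // The default Encoding.UTF8 does not throw on bad input;
        // it substitutes the replacement character U+FFFD.
        Console.WriteLine(Encoding.UTF8.GetString(bad) == "\uFFFD");
    }
}
```

The point of the comparison is that a decoder created without throwOnInvalidBytes quietly hides the bad byte, which may be why earlier attempts with UTF8Encoding did not report anything.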