I'm opening an existing XML file with C#, and I replace some nodes in there. All works fine. Just after I save it, I get the following characters at the beginning of the file:
 (EF BB BF in HEX)
The whole first line:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
The rest of the file looks like a normal XML file. The simplified code is here:
XmlDocument doc = new XmlDocument();
开发者_StackOverflowdoc.Load(xmlSourceFile);
XmlNode translation = doc.SelectSingleNode("//trans-unit[@id='127']");
translation.InnerText = "testing";
doc.Save(xmlTranslatedFile);
I'm using a C# Windows Forms application with .NET 4.0.
Any ideas? Why would it do that? Can we disable that somehow? It's for Adobe InCopy, and it does not open it like this.
UPDATE: Alternative Solution:
Saving it with the XmlTextWriter works too:
XmlTextWriter writer = new XmlTextWriter(inCopyFilename, null);
doc.Save(writer);
It is the UTF-8 BOM, which is actually discouraged by the Unicode standard:
http://www.unicode.org/versions/Unicode5.0.0/ch02.pdf
Use of a BOM is neither required nor recommended for UTF-8, but may be encountered in contexts where UTF-8 data is converted from other encoding forms that use a BOM or where the BOM is used as a UTF-8 signature
You may disable it using:
var sw = new IO.StreamWriter(path, new System.Text.UTF8Encoding(false));
doc.Save(sw);
sw.Close();
It's a UTF-8 Byte Order Mark (BOM) and is to be expected.
You can try to change the encoding of the XmlDocument. Below is the example copied from MSDN
using System; using System.IO; using System.Xml;
public class Sample {
public static void Main() {
// Create and load the XML document.
XmlDocument doc = new XmlDocument();
string xmlString = "<book><title>Oberon's Legacy</title></book>";
doc.Load(new StringReader(xmlString));
// Create an XML declaration.
XmlDeclaration xmldecl;
xmldecl = doc.CreateXmlDeclaration("1.0",null,null);
xmldecl.Encoding="UTF-16";
xmldecl.Standalone="yes";
// Add the new node to the document.
XmlElement root = doc.DocumentElement;
doc.InsertBefore(xmldecl, root);
// Display the modified XML document
Console.WriteLine(doc.OuterXml);
}
}
As everybody else mentioned, it's Unicode issue.
I advise you to try LINQ To XML. Although not really related, I mention it as it's super easy compared to old ways and, more importantly, I assume it might have automatic resolutions to issues like these without extra coding from you.
精彩评论