开发者

fast way to deserialize XML with special characters

开发者 https://www.devze.com 2023-02-08 19:41 出处:网络
I am looking for fast way to deserialize xml, 开发者_如何学Pythonthat has special characters in it like ö.

I am looking for fast way to deserialize xml, 开发者_如何学Pythonthat has special characters in it like ö.

I was using XMLReader and it fails to deserialze such characters.

Any suggestion?

EDIT: I am using C#. Code is as follows:

XElement element =.. //has the xml
XmlSerializer serializer =   new XmlSerializer(typeof(MyType));
XmlReader reader = element.CreateReader();
Object o= serializer.Deserialize(reader);


I'd guess you're having an encoding issue, not in the XMLReader but with the XmlSerializer.

You could use the XmlTextWriter and UTF8 encoding with the XmlSerializer like in the following snippet (see the generic methods below for a way nicer implementation of it). Works just fine with umlauts (äöü) and other special characters.

class Program
{
    static void Main(string[] args)
    {
        SpecialCharacters specialCharacters = new SpecialCharacters { Umlaute = "äüö" };

        // serialize object to xml

        MemoryStream memoryStreamSerialize = new MemoryStream();
        XmlSerializer xmlSerializerSerialize = new XmlSerializer(typeof(SpecialCharacters));
        XmlTextWriter xmlTextWriterSerialize = new XmlTextWriter(memoryStreamSerialize, Encoding.UTF8);

        xmlSerializerSerialize.Serialize(xmlTextWriterSerialize, specialCharacters);
        memoryStreamSerialize = (MemoryStream)xmlTextWriterSerialize.BaseStream;

        // converts a byte array of unicode values (UTF-8 enabled) to a string
        UTF8Encoding encodingSerialize = new UTF8Encoding();
        string serializedXml = encodingSerialize.GetString(memoryStreamSerialize.ToArray());

        xmlTextWriterSerialize.Close();
        memoryStreamSerialize.Close();
        memoryStreamSerialize.Dispose();

        // deserialize xml to object

        // converts a string to a UTF-8 byte array.
        UTF8Encoding encodingDeserialize = new UTF8Encoding();
        byte[] byteArray = encodingDeserialize.GetBytes(serializedXml);

        using (MemoryStream memoryStreamDeserialize = new MemoryStream(byteArray))
        {
            XmlSerializer xmlSerializerDeserialize = new XmlSerializer(typeof(SpecialCharacters));
            XmlTextWriter xmlTextWriterDeserialize = new XmlTextWriter(memoryStreamDeserialize, Encoding.UTF8);

            SpecialCharacters deserializedObject = (SpecialCharacters)xmlSerializerDeserialize.Deserialize(xmlTextWriterDeserialize.BaseStream);
        }
    }
}

[Serializable]
public class SpecialCharacters
{
    public string Umlaute { get; set; }
}

I personally use the follwing generic methods to serialize and deserialize XML and objects and haven't had any performance or encoding issues yet.

public static string SerializeObjectToXml<T>(T obj)
{
    MemoryStream memoryStream = new MemoryStream();
    XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
    XmlTextWriter xmlTextWriter = new XmlTextWriter(memoryStream, Encoding.UTF8);

    xmlSerializer.Serialize(xmlTextWriter, obj);
    memoryStream = (MemoryStream)xmlTextWriter.BaseStream;

    string xmlString = ByteArrayToStringUtf8(memoryStream.ToArray());

    xmlTextWriter.Close();
    memoryStream.Close();
    memoryStream.Dispose();

    return xmlString;
}

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(StringToByteArrayUtf8(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));

        using (StreamReader xmlStreamReader = new StreamReader(memoryStream, Encoding.UTF8))
        {
            return (T)xmlSerializer.Deserialize(xmlStreamReader);
        }
    }
}

public static string ByteArrayToStringUtf8(byte[] value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetString(value);
}

public static byte[] StringToByteArrayUtf8(string value)
{
    UTF8Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(value);
}


What works for me is similar to what @martin-buberl suggested:

public static T DeserializeXmlToObject<T>(string xml)
{
    using (MemoryStream memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
    {
        XmlSerializer xmlSerializer = new XmlSerializer(typeof(T));
        StreamReader reader = new StreamReader(memoryStream, Encoding.UTF8);
        return (T)xmlSerializer.Deserialize(reader);
    }
}
0

精彩评论

暂无评论...
验证码 换一张
取 消