XmlReader square brackets causing reader to go to errored state

Developer https://www.devze.com 2022-12-10 12:16 Source: Internet
I have an XmlReader that is trying to read text into a list of elements. I am having trouble getting it to read the text "a [ z ]". If I try with the text "a [ z ]  " (the same but with two trailing spaces), it works fine. Below is an example:

TextReader tr = new StringReader("a [ z ]");
XmlReaderSettings settings = new XmlReaderSettings
{
    ConformanceLevel = ConformanceLevel.Fragment,
    ProhibitDtd = false,
    ValidationType = ValidationType.None,
    XmlResolver = null,
    CheckCharacters = false,
    IgnoreProcessingInstructions = true,
};
XmlReader reader = XmlReader.Create(tr, settings);
reader.Read();

StringBuilder sb = new StringBuilder();

while (!reader.EOF)
{
    if (reader.NodeType == XmlNodeType.Text || reader.NodeType == XmlNodeType.Whitespace)
    {
        sb.Append(reader.Value);
    }
    // Advance on every iteration so that a non-text node cannot stall the loop.
    reader.Read();
}

// sb.ToString() should be "a [ z ]"

When run, it fails with the message "System.Xml.XmlException : Unexpected end of file has occurred. Line 1, position 7." and this stack trace:

at System.Xml.XmlTextReaderImpl.Throw(Exception e) 
at System.Xml.XmlTextReaderImpl.ParseText(Int32& startPos, Int32& endPos, Int32& outOrChars)
at System.Xml.XmlTextReaderImpl.FinishPartialValue()
at System.Xml.XmlTextReaderImpl.get_Value()
at LocalisationFormats.Tests.Shared.InlineElements.InlineElementHelperTest.Test()

When I debug it, the reader has a ReadState of "Error" and reader.Value is "a [ z"; continuing from there breaks the reader and I get an OutOfMemoryException.

Anyone any suggestions?

EDIT: removed extra if block from code snippet on suggestion from Gregoire.


I believe the problem is that you are loading a string that is not XML-formatted into an XmlReader object.

"XmlReader provides forward-only, read-only access to a stream of XML data. The XmlReader class conforms to the W3C Extensible Markup Language (XML) 1.0 and the Namespaces in XML recommendations." & "XmlReader throws an XmlException on XML parse errors." - MSDN XmlReader Class Article http://msdn.microsoft.com/en-us/library/system.xml.xmlreader.aspx

Try loading and reading actual Xml data instead by changing:

TextReader tr = new StringReader("a [ z ]");

to:

TextReader tr = new StringReader("<node>a [ z ]</node>");

or alternatively, if you need each piece in its own node:

TextReader tr = new StringReader("<node>a</node><node> </node><node>[</node><node> </node><node>z</node><node> </node><node>]</node>");

I'm providing complete source for the latter example, because I THINK that's what you're aiming at here.

TextReader tr = new StringReader("<node>a</node><node> </node><node>[</node><node> </node><node>z</node><node> </node><node>]</node>");
XmlReaderSettings settings = new XmlReaderSettings
{
    ConformanceLevel = ConformanceLevel.Fragment,
    ProhibitDtd = false,
    ValidationType = ValidationType.None,
    XmlResolver = null,
    CheckCharacters = false,
    IgnoreProcessingInstructions = true,
};
XmlReader reader = XmlReader.Create(tr, settings);
reader.Read();

StringBuilder sb = new StringBuilder();

while (!reader.EOF)
{
    string s = reader.ReadElementString();

    if (s != " ")
    {
        sb.Append(s);
    }
}

This will allow you to iterate through the nodes, getting the full string values with no exceptions.
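For the simpler single-wrapper variant mentioned earlier in this answer, a minimal sketch might look like the following. The class name WrappedFragmentReader and the method ReadText are hypothetical, introduced only for illustration, and "node" is an arbitrary wrapper element name. It assumes the fragment is well-formed once wrapped, and it has not been verified against the affected .NET 3.5 runtime.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of any library.
static class WrappedFragmentReader
{
    // Wraps the fragment in a throwaway <node> element so the text never
    // sits at the very end of the input, then collects all text content.
    public static string ReadText(string fragment)
    {
        string wrapped = "<node>" + fragment + "</node>";
        StringBuilder sb = new StringBuilder();

        using (XmlReader reader = XmlReader.Create(new StringReader(wrapped)))
        {
            // Read() returns false at end of input, so the loop always terminates.
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Text:
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        sb.Append(reader.Value);
                        break;
                }
            }
        }

        return sb.ToString();
    }
}
```

Calling WrappedFragmentReader.ReadText("a [ z ]") should give back "a [ z ]" with the spaces intact, because nothing is filtered out the way the per-node version above drops " ".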

~md5sum~


I've checked: this has been fixed in .NET 4, but it is still broken in .NET 3.5 as of this post.


Sorry for dredging up a three-year-old issue, but I just had the same problem. For any googlers from the future:

It looks like OP raised this with the guys at Microsoft - connect.microsoft.com/VisualStudio/feedback:

Thanks for reporting this issue. We have fixed this in .NET 4.0. We do not plan to fix it in previous versions of .NET. Upgrading to .NET 4.0 will fix this issue.

Thanks, Arun Chandrasekhar, Senior Program Manager, XML Team

For those of us still stuck on .NET < 4.0 (2.0 in my case), I worked around it with this horrendous hack:

const string openSquareBracketReplacement = "##OSB##";
const string closeSquareBracketReplacement = "##CSB##";

// Mask the square brackets before the string reaches the XmlReader.
xml = xml
    .Replace("[", openSquareBracketReplacement)
    .Replace("]", closeSquareBracketReplacement);

// Build an XmlReader and use it.

// Restore the square brackets in the result.
return xml
    .Replace(openSquareBracketReplacement, "[")
    .Replace(closeSquareBracketReplacement, "]");

Obviously this will break CDATA handling completely, because the <![CDATA[ and ]]> delimiters themselves contain square brackets, but this is OK for my purposes.
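To make the elided middle step concrete, here is one way the whole round trip could be put together. This is only a sketch: the class name BracketSafeReader and the method ReadFragmentText are hypothetical, and it assumes the placeholder tokens never occur in real input and that only text content is wanted back. The same CDATA caveat applies.

```csharp
using System.IO;
using System.Text;
using System.Xml;

// Hypothetical helper, not part of any library.
static class BracketSafeReader
{
    // Placeholder tokens, assumed never to appear in genuine input.
    const string OpenSquareBracketReplacement = "##OSB##";
    const string CloseSquareBracketReplacement = "##CSB##";

    public static string ReadFragmentText(string xml)
    {
        // Mask the brackets so the parser never sees "[" or "]".
        string masked = xml
            .Replace("[", OpenSquareBracketReplacement)
            .Replace("]", CloseSquareBracketReplacement);

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ConformanceLevel = ConformanceLevel.Fragment;

        StringBuilder sb = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(new StringReader(masked), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Text
                    || reader.NodeType == XmlNodeType.Whitespace
                    || reader.NodeType == XmlNodeType.SignificantWhitespace)
                {
                    sb.Append(reader.Value);
                }
            }
        }

        // Put the brackets back in the collected text.
        return sb.ToString()
            .Replace(OpenSquareBracketReplacement, "[")
            .Replace(CloseSquareBracketReplacement, "]");
    }
}
```

On a runtime without the bug, BracketSafeReader.ReadFragmentText("a [ z ]") returns "a [ z ]", so the helper is harmless to leave in place after upgrading.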
