I am trying to figure out a way to parse an XML element whose content is wrapped in CDATA tags for some input, but not for all.
For example, the following is sample content I would receive when the data contains CDATA tags. In other scenarios the CDATA tags are omitted.
    <Data><![CDATA[ <h1>CHAPTER 2<br/> EDUCATION</h1>
    <P> Analysis paragraph </P> ]]></Data>
Is there an elegant way to detect which form was sent, and to implement a ReadXml method that can parse both types of input (with or without CDATA)? So far my ReadXml() implementation is as follows, but I am getting parsing errors when the CDATA tags are omitted.
    public void ReadXml(XmlReader reader)
    {
        bool isEmpty = reader.IsEmptyElement;
        reader.ReadStartElement();
        if (isEmpty)
        {
            _data = string.Empty;
        }
        else
        {
            switch (reader.MoveToContent())
            {
                case XmlNodeType.Text:
                case XmlNodeType.CDATA:
                    _data = reader.ReadContentAsString();
                    break;
                default:
                    _data = string.Empty;
                    break;
            }
            reader.ReadEndElement();
        }
    }
The code below was tested on the following samples:
    <Data><h1>CHAPTER 2<br/> EDUCATION</h1><P> Analysis paragraph </P></Data>
    <Data>test<h1>CHAPTER 2<br/> EDUCATION</h1><P> Analysis paragraph </P></Data>
    <Data><![CDATA[ <h1>CHAPTER 2<br/> EDUCATION</h1><P> Analysis paragraph </P> ]]></Data>
    <Data></Data>
I use an XPathNavigator instead as it allows backtracking.
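    // Requires: using System.Xml; using System.Xml.XPath;
    public void ReadXml(XmlReader reader)
    {
        // Load the content into a DOM so it can be inspected more than once.
        XmlDocument doc = new XmlDocument { PreserveWhitespace = false };
        doc.Load(reader);
        var navigator = doc.CreateNavigator();
        navigator.MoveToChild(XPathNodeType.Element);
        // If the inner XML starts with markup (a CDATA section or a child element),
        // take the node's text value; otherwise keep the inner XML as-is.
        _data = navigator.InnerXml.Trim().StartsWith("<") ? navigator.Value : navigator.InnerXml;
    }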
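Note that with the check above, input that starts with a child element and has no CDATA (the first sample) yields only the text value, with the tags dropped. If you would rather keep the markup in that case, here is a rough sketch of an alternative using LINQ to XML. The class name DataContentReader and its Read method are made up for illustration and are not part of the code above; it assumes the reader is positioned on (or just before) the <Data> element.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, not part of the original code.
public static class DataContentReader
{
    // Returns the text/CDATA content when <Data> has no child elements,
    // or the re-serialized child nodes when it contains markup.
    public static string Read(XmlReader reader)
    {
        // Skip any prolog or whitespace and stop on the element itself.
        reader.MoveToContent();

        // XNode.ReadFrom loads the current element and advances the reader past its end tag.
        var element = (XElement)XNode.ReadFrom(reader);

        if (!element.HasElements)
        {
            // Only text and/or CDATA: Value returns the content without the CDATA markers.
            return element.Value;
        }

        // Child elements present: serialize each child node back to markup.
        return string.Concat(
            element.Nodes().Select(n => n.ToString(SaveOptions.DisableFormatting)));
    }
}
```

Inside ReadXml this would be called as `_data = DataContentReader.Read(reader);`. Because the markup is re-serialized, it may not be byte-for-byte identical to the input (empty elements such as <br/> can come back formatted slightly differently).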