I am having difficulty preserving certain nodes (in this case <b>) when parsing XML with LINQ to XML. I first grab a node with the following LINQ query...
IEnumerable<XElement> node = from el in _theData.Descendants("msDict") select el;
Which returns the following XML (as the first XElement)...
<msDict lexid="m_en_us0000002.001" type="core">
<df>(preceding a numeral) <b>pound</b> or <b>pounds</b> (of money)
<genpunc tag="df">.</genpunc></df>
</msDict>
I then collect the content with the following code...
StringBuilder output = new StringBuilder();
foreach (XElement elem in node)
{
output.Append(elem.Value);
}
Here's the breaking point. All of the XML nodes are stripped, but I want to preserve all instances of <b>. I am expecting to get the following as output...
(preceding a numeral) <b>pound</b> or <b>pounds</b> (of money).
Note: I know that this is a simple operation in XSLT, but I would like to know if there is an easy way to do this using LINQ to XML.
In the category of "it works but it's messy and I can't believe I have to resort to this":
StringBuilder output = new StringBuilder();
foreach (XElement elem in node)
{
output.Append(string.Join("", elem.Nodes().Select(n => n.ToString()).ToArray()));
}
Personally, I think this cries out for an extension method on XElement...
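For what it's worth, such an extension method might look roughly like the sketch below. The names XElementExtensions and InnerXml are made up for illustration; LINQ to XML does not ship a method by that name. It just wraps the string.Join call from above.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class XElementExtensions
{
    // Hypothetical helper, not part of LINQ to XML: returns the element's
    // content (child nodes serialized back to markup) without its own tags.
    public static string InnerXml(this XElement element)
    {
        return string.Join("", element.Nodes().Select(n => n.ToString()).ToArray());
    }
}
```

With that in scope, the loop body becomes output.Append(elem.InnerXml());. If the default indentation of nested elements gets in the way, passing SaveOptions.DisableFormatting to n.ToString() is probably worth trying.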
UPDATE: If you want to exclude all element tags except <b> then you'll need to use a recursive method to return node values.
Here's your main method body:
StringBuilder output = new StringBuilder();
foreach (XElement elem in node)
{
output.Append(stripTags(elem));
}
And here's stripTags:
private static string stripTags(XNode node)
{
if (node is XElement && !((XElement)node).Name.ToString().Equals("b", StringComparison.InvariantCultureIgnoreCase))
{
return string.Join(string.Empty, ((XElement)node).Nodes().Select(n => stripTags(n)).ToArray());
}
else
{
return node.ToString();
}
}
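As a quick check, here is a small self-contained sketch that runs the same logic against the sample from the question. The class name StripTagsDemo is invented for the demo, and the XML is written on a single line on purpose: in the sample as posted there is a line break before <genpunc>, and that break would come through as text in the output.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class StripTagsDemo
{
    // Same logic as stripTags above: unwrap every element except <b>.
    public static string stripTags(XNode node)
    {
        XElement element = node as XElement;
        if (element != null && !element.Name.ToString().Equals("b", StringComparison.InvariantCultureIgnoreCase))
        {
            // Not a <b>: drop the tags and recurse into the child nodes.
            return string.Join(string.Empty, element.Nodes().Select(n => stripTags(n)).ToArray());
        }
        // Text nodes and <b> elements are emitted as-is.
        return node.ToString();
    }

    public static void Main()
    {
        // Sample element from the question, collapsed onto one line.
        XElement msDict = XElement.Parse(
            "<msDict lexid=\"m_en_us0000002.001\" type=\"core\">" +
            "<df>(preceding a numeral) <b>pound</b> or <b>pounds</b> (of money)" +
            "<genpunc tag=\"df\">.</genpunc></df></msDict>");

        Console.WriteLine(stripTags(msDict));
        // prints: (preceding a numeral) <b>pound</b> or <b>pounds</b> (of money).
    }
}
```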
So the real answer is that no, there isn't an easy way to do this using LINQ to XML, but there's a way...