I have an XML structure that has many doc nodes, and each node may have zero or more extract paragraphs (paras).
<doc>
<docitem>3</docitem>
<docid>129826</docid>
<doctitle>sample title</doctitle>
<docdatetime>2009-07-03T16:59:00</docdatetime>
<collectdatetime>2009-07-03T16:59:23</collectdatetime>
<summary>
<summarytext>sample summary</summarytext>
</summary>
<paras>
<paraitemcount>2</paraitemcount>
<para>
<paraitem>1</paraitem>
<paratext>sample text 1</paratext>
</para>
<para>
<paraitem>2</paraitem>
<paratext>sample text 2</paratext>
</para>
</paras>
</doc>
<doc>
...
</doc>
I also have some LINQ code to populate some Document objects:
List<Document> documentsList = (from doc in xmlDocument.Descendants("doc")
select new Document
{
DocId = doc.Element("docid").Value,
DocTitle = doc.Element("doctitle").Value,
DocDateTime = DateTime.Parse(doc.Element("docdatetime").Value),
DocSummary = doc.Element("summary").Value,
DocParas = "" ///missing code to populate List<string>
}
).ToList<Document>();
Is it possible to add all the para nodes to the Document.DocParas List<string> using LINQ and XPath, or should I do this task in a different way?
Note: I'm using C# on .NET 3.5.
You could use something like this:
DocParas = doc.XPathSelectElements("paras/para/paratext").Select(xElement => xElement.Value).ToList();
Note that XPathSelectElements is an extension method declared in the System.Xml.XPath namespace, so that namespace needs a using directive.
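For context, here is a minimal sketch of how that line could fit into the full query. The Document class shape, the DocumentLoader helper, and the idea that the doc nodes sit under some wrapper root are assumptions for illustration; they are not shown in the question. It also reads docdatetime, the element name that appears in the sample XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath; // brings the XPathSelectElements extension method into scope

// Assumed shape of the asker's Document class (not shown in the question).
public class Document
{
    public string DocId { get; set; }
    public string DocTitle { get; set; }
    public DateTime DocDateTime { get; set; }
    public string DocSummary { get; set; }
    public List<string> DocParas { get; set; }
}

// Hypothetical helper that wraps the query from the question.
public static class DocumentLoader
{
    public static List<Document> Load(string xml)
    {
        XDocument xmlDocument = XDocument.Parse(xml);

        return (from doc in xmlDocument.Descendants("doc")
                select new Document
                {
                    DocId = doc.Element("docid").Value,
                    DocTitle = doc.Element("doctitle").Value,
                    DocDateTime = DateTime.Parse(doc.Element("docdatetime").Value),
                    DocSummary = doc.Element("summary").Value,
                    // The XPath is evaluated relative to the current <doc> element.
                    // A doc without <paras> simply yields an empty list.
                    DocParas = doc.XPathSelectElements("paras/para/paratext")
                                  .Select(p => p.Value)
                                  .ToList()
                }).ToList();
    }
}
```

Selecting paratext rather than para keeps the paraitem numbers out of the resulting strings.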
I would use XML serialization in this case. Since you're parsing the whole document (or at least a big part of it) into a model, and your code is beginning to struggle with the nesting levels in the XML, I think it's simpler to let the serialization framework do its job.
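A rough sketch of what the serialization approach might look like with XmlSerializer. All class names here (DocsDto, DocDto, SummaryDto, ParasDto, ParaDto, DocSerializer) are made up for illustration, and the root element name "docs" is an assumption, since the question does not show the element that wraps the doc nodes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

[XmlRoot("docs")] // assumption: the real root element name is not shown in the question
public class DocsDto
{
    [XmlElement("doc")] public List<DocDto> Docs { get; set; }
}

public class DocDto
{
    [XmlElement("docid")] public string DocId { get; set; }
    [XmlElement("doctitle")] public string DocTitle { get; set; }
    [XmlElement("docdatetime")] public DateTime DocDateTime { get; set; }
    [XmlElement("summary")] public SummaryDto Summary { get; set; }
    [XmlElement("paras")] public ParasDto Paras { get; set; }

    // Flattens the nested para elements into plain strings,
    // tolerating docs that have no <paras> element at all.
    public List<string> GetParaTexts()
    {
        var texts = new List<string>();
        if (Paras != null && Paras.Items != null)
        {
            foreach (ParaDto para in Paras.Items)
            {
                texts.Add(para.ParaText);
            }
        }
        return texts;
    }
}

public class SummaryDto
{
    [XmlElement("summarytext")] public string SummaryText { get; set; }
}

public class ParasDto
{
    [XmlElement("paraitemcount")] public int ParaItemCount { get; set; }
    // XmlElement on a list maps repeated <para> children directly.
    [XmlElement("para")] public List<ParaDto> Items { get; set; }
}

public class ParaDto
{
    [XmlElement("paraitem")] public int ParaItem { get; set; }
    [XmlElement("paratext")] public string ParaText { get; set; }
}

public static class DocSerializer
{
    public static DocsDto Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(DocsDto));
        using (var reader = new StringReader(xml))
        {
            return (DocsDto)serializer.Deserialize(reader);
        }
    }
}
```

Elements that are not mapped to a property (docitem, collectdatetime) are skipped during deserialization, so the classes only need to cover the parts of the XML you care about.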
One way to get the paras:
XElement xElement2 = XElement.Parse(@"
<doc>
<docitem>3</docitem>
<docid>129826</docid>
<doctitle>sample title</doctitle>
<docdatetime>2009-07-03T16:59:00</docdatetime>
<collectdatetime>2009-07-03T16:59:23</collectdatetime>
<summary>
<summarytext>sample summary</summarytext>
</summary>
<paras>
<paraitemcount>2</paraitemcount>
<para>
<paraitem>1</paraitem>
<paratext>sample text 1</paratext>
</para>
<para>
<paraitem>2</paraitem>
<paratext>sample text 2</paratext>
</para>
</paras>
</doc>");
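// Select the paratext child: para.Value would also include the paraitem number ("1sample text 1").
List<string> docs = xElement2.Descendants().Where(x => x.Parent.Name == "paras" && x.Name == "para").Select(x => x.Element("paratext").Value).ToList();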
So in your code I think it becomes:
...//
...//
DocParas = doc.Descendants().Where(x => x.Parent.Name == "paras" && x.Name == "para").Select(x => x.Element("paratext").Value).ToList()
}