Finding specific attributes in a large XML document

Developer https://www.devze.com 2023-01-30 12:24 Source: Internet
I have a large XML document of around 100 MB. I need to find attributes for two tags in this document. I can do this with code similar to the following:

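XmlDocument xmlDocument = new XmlDocument ( );
xmlDocument.Load ( "C:\\myxml.xml" );

string result = null;
XmlNode node1 = xmlDocument.SelectSingleNode ( "/data/objects[@type='data type 1']" );
if ( null != node1 && null != node1.Attributes [ "Version" ] )
{
   // Version is an attribute, so read it from the Attributes collection
   result = node1.Attributes [ "Version" ].Value;
}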

But doing so loads the entire XML document into memory, which seems to take around 200 MB. Is there any way I can make this more efficient?

Edit: Lots of nice answers using the XmlTextReader, which I have now rewritten my code to use. (It will be more memory efficient, but ugly :)


For performance, a streaming (SAX-style) parser is much better than DOM, since you actually need only one value. The .NET Framework has no true SAX parser; its streaming equivalent is XmlTextReader, a forward-only pull reader.


You should try to use an XmlReader.

From MSDN:

Like the SAX reader, the XmlReader is a forward-only, read-only cursor. It provides fast, non-cached stream access to the input. It can read a stream or a document. It allows the user to pull data, and skip records of no interest to the application. The big difference lies in the fact that the SAX model is a "push" model, where the parser pushes events to the application, notifying the application every time a new node has been read, while applications using XmlReader can pull nodes from the reader at will.

An example here.
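
To make the "pull" idea from the quote concrete, here is a minimal sketch in which the application decides when to advance and uses Skip() to jump over subtrees it has no interest in. The class name PullDemo, the method CountObjects, and the element name "objects" (borrowed from the question) are illustrative assumptions, not part of any answer here.

```csharp
using System.IO;
using System.Xml;

static class PullDemo
{
    // Counts the <objects> elements that are direct children of the root,
    // skipping the contents of every child element instead of reading them.
    public static int CountObjects(TextReader input)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();   // position on the root element
            reader.Read();            // step inside the root
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    if (reader.LocalName == "objects")
                    {
                        count++;
                    }
                    reader.Skip();    // jump past this element and its whole subtree
                }
                else
                {
                    reader.Read();    // whitespace, comments, end tags: just move on
                }
            }
        }
        return count;
    }
}
```

Because every child element is skipped rather than entered, nested elements are never visited, which is where the savings over a DOM load come from.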


You can use the XmlReader class to do this. A simple but working example that does the same as your code above looks like this:

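string result = null;

// XmlReader streams the file instead of loading the whole document into memory
using (var reader = XmlReader.Create(@"C:\myxml.xml"))
{
    while (reader.Read())
    {
        // Match <objects type="data type 1"> directly under the root element
        if (reader.NodeType == XmlNodeType.Element
            && reader.Depth == 1
            && reader.LocalName == "objects"
            && reader.GetAttribute("type") == "data type 1")
        {
            result = reader.GetAttribute("Version");
            break; // stop reading as soon as the value is found
        }
    }
}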
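
Since the question asks for attributes from two tags, the same loop can be extended to collect both in a single pass. The following is only a sketch under assumptions: the helper VersionFinder.FindVersions is hypothetical, and so is the second type value ("data type 2") mentioned below, because the question only names 'data type 1'.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class VersionFinder
{
    // Returns the Version attribute for each requested type value, reading
    // forward only and stopping as soon as every requested type was seen.
    public static Dictionary<string, string> FindVersions(
        TextReader input, ISet<string> wantedTypes)
    {
        var found = new Dictionary<string, string>();
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (found.Count < wantedTypes.Count && reader.Read())
            {
                // Only consider <objects> elements directly under the root
                if (reader.NodeType != XmlNodeType.Element
                    || reader.Depth != 1
                    || reader.LocalName != "objects")
                {
                    continue;
                }

                string type = reader.GetAttribute("type");
                if (type != null && wantedTypes.Contains(type) && !found.ContainsKey(type))
                {
                    // GetAttribute returns null when Version is missing
                    found[type] = reader.GetAttribute("Version");
                }
            }
        }
        return found;
    }
}
```

It could be called with File.OpenText(@"C:\myxml.xml") wrapped in a using block and a HashSet<string> containing "data type 1" and "data type 2". The early exit matters for a 100 MB file: if both elements appear near the top, the rest of the document is never read.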