开发者

Merging of xml documents

开发者 https://www.devze.com 2023-04-12 10:41 出处:网络
All of the solutions I have come across regarding merging XML documents do not accomplish what I desire.Let me explain:

All of the solutions I have come across regarding merging XML documents do not accomplish what I desire. Let me explain:

XML Document 1:

<?xml version="1.0" encoding="utf-8" ?>
<a>
    <b title="Original Section">
        <b title="Original Child Section"></b>
        <b title="Original Child Section 2"></b>
    </b>
</a>

XML Document 2:

<?xml version="1.0" encoding="utf-8" ?>
<a>
    <b title="New Section">
        <b title="New Child Section"></b>
    </b>
    <b title="Original Section">
        <b title="Original Child Section">
            <b title="New Child For Old Section"></b>
        </b>
    </b>    
</a>

Into a final doc like this:

<?xml version="1.0" encoding="utf-8" ?>
<a>
    <b title="Original Section">
        <b title="Original Child Section">
            <b title="New Child For Old Section"></b>
        </b>
        <b title="Original Child Section 2"></b>
    </b>    
    <b title="New Section">
        <b title="New Child Section"></b>
    </b>
</a>

The documents are similar in content, but can have an arbitrary number of child nodes. I also would like to eliminate duplicates. I consider duplicates being elements with the same attributes (based on attribute name and value). Has anyone seen a working example of this implementation? I can envision how I would write it using some loops and a bi开发者_运维技巧t of recursion, but to me, that just doesn't seem like the best way to accomplish what I want :)

Cheers and thanks in advance!

* EDIT *

Since the consensus is that loops and recursion are a must, what would be the most elegant and efficient way to accomplish this? I suppose another fundamental question to this problem is what is the best way to compare the nodes as you iterate?


Eventually any solution to this problem will boil down to loops and/or recursion. You're talking basic set theory, and linq may be useful for distilling the process, but it will ultimately be iterating over both sets and merging the results.


I'd write an IEqualityComparer that specifies when two nodes are a 'match' - i.e. sets the title matching rule.

class XElementComparer : IEqualityComparer<XElement>
{
    public bool Equals(XElement x, XElement y)
    {
        var xTitle = x.Attribute("title");
        var yTitle = y.Attribute("title");

        if (xTitle == null || yTitle == null) return false;

        return xTitle.Value == yTitle.Value;
    }

    public int GetHashCode(XElement obj)
    {
        return base.GetHashCode();
    }
}

And then write a recursive method to trawl through your XML, merging nodes that match according to the comparer.

private XElement Merge(XElement node1, XElement node2)
{
    // trivial cases
    if (node1 == null) return node2;
    if (node2 == null) return node1;

    var elements1 = node1.Elements();
    var elements2 = node2.Elements();

    // create a merged root
    var result = new XElement(node1.Name, node1.Attribute("title")); 

    var comparer = new XElementComparer();
    var mergedNodes = elements1.Union(elements2, comparer).ToList();

    // for the union of the elements, insert their merge values
    foreach (var title in mergedNodes)
    {
        var child1 = elements1.SingleOrDefault(e => comparer.Equals(e, title));
        var child2 = elements2.SingleOrDefault(e => comparer.Equals(e, title));

        result.Add(Merge(child1, child2));
    }

    return result;
}
0

精彩评论

暂无评论...
验证码 换一张
取 消