What is the fastest/shortest/one-liner (not possible :p) way to build a unique tree of elements from a tree where many of the elements are duplicated/missing in some nodes, given the tree has a defined set of nodes (which we'd use this algorithm to figure out so we don't have to manually do it).
It could be XML/JSON(hash), or whatever. So something like this:
root {
nodes {
nodeA {}
nodeB {
subNodeA {}
}
}
nodes {
nodeA {
subNodeA {}
}
nodeB {
subNodeX {}
}
}
}
...converted to this:
root {
nodes {
nodeA {
subNodeA {}
}
nodeB {
subNodeA {}
subNodeX {}
}
}
}
Same with xml:
<root>
<nodes>
<nodeA/>
<nodeB>
<subNodeA/>
</nodeB>
</nodes>
<nod开发者_运维知识库es>
<nodeA>
<subNodeA/>
</nodeA>
<nodeB>
<subNodeX/>
</nodeB>
</nodes>
</root>
<root>
<nodes>
<nodeA>
<subNodeA/>
</nodeA>
<nodeB>
<subNodeA/>
<subNodeX/>
</nodeB>
</nodes>
</root>
The xml/json files could be decently large (1MB+), so having to iterate over every element depth-first or something seems like it would take a while. It could also be as small as the example above.
This'll get you a set of unique paths:
require 'nokogiri'
require 'set'
xml = Nokogirl::XML.parse(your_data)
paths = Set.new
xml.traverse {|node| next if node.text?; paths << node.path.gsub(/\[\d+\]/,"").sub(/\/$/,"")}
Does that get you started?
[response to question in comment]
Adding attibute-paths is also easy, but let's go at least a little bit multi-line:
xml.traverse do |node|
next if node.text?
paths << (npath = node.path.gsub(/\[\d+\]/,"").sub(/\/$/,""))
paths += node.attributes.map {|k,v| "#{npath}@#{k}"}
end
精彩评论