Java: How do I efficiently merge multiple xml files to create a new xml?

Developer https://www.devze.com 2022-12-17 05:49 Source: Internet
In Java:

Suppose I have 3 XML files

<student>lin</student> --  file1.xml

<student>Eric</student> --  file2.xml

<student>joe</student> --  file3.xml

How can I merge these XML files (considering that they don't have a DTD or namespace declaration) to create

<class><student>lin</student> <student>Eric</student>
<student>joe</student> </class> -- file4.xml

class being the wrapping node I supply manually

PS: I used XStream to create the XML files.


If your files are large, I would use a SAXParser whose ContentHandler echoes the tags and the content.

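Something like this (a rough sketch; pass the input file names as arguments):

import java.io.File;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class SaxMerge {
    public static void main(String[] files) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.print("<class>");
        for (String file : files) {
            factory.newSAXParser().parse(new File(file), new DefaultHandler() {
                // Text collected since the last end tag.
                private final StringBuilder content = new StringBuilder();

                @Override
                public void endElement(String uri, String localName, String qName) {
                    if (qName.equals("student")) {
                        System.out.print("<student>" + escapeXml(content.toString()) + "</student>");
                    }
                    content.setLength(0);
                }

                @Override
                public void characters(char[] ch, int start, int length) {
                    // May be called several times per text node, so append.
                    content.append(ch, start, length);
                }
            });
        }
        System.out.print("</class>");
    }

    // Minimal escaping for element text.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}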

If your files are small enough to fit in memory: load the DOM of each document using a DocumentBuilder and call importNode to merge the documents into one.
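As a rough illustration of the DocumentBuilder/importNode approach, here is a minimal sketch. The class name DomImportMerge and the method merge are made up for this example, and it assumes the files are small, well-formed, and have no DTD or namespaces, as in the question.

```java
import java.io.File;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper class, not part of any library.
class DomImportMerge {

    /** Copies the root element of every file under a new wrapper element. */
    static String merge(String wrapperName, List<File> files) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();

        // New, empty document that holds the wrapping node supplied by the caller.
        Document merged = builder.newDocument();
        Element wrapper = merged.createElement(wrapperName);
        merged.appendChild(wrapper);

        for (File file : files) {
            Document source = builder.parse(file);
            // importNode makes a deep copy owned by 'merged'; 'source' is left unchanged.
            Node copy = merged.importNode(source.getDocumentElement(), true);
            wrapper.appendChild(copy);
        }

        // Serialize the merged DOM without an XML declaration.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(merged), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        List<File> files = new ArrayList<>();
        for (String name : args) {
            files.add(new File(name));
        }
        System.out.println(merge("class", files));
    }
}
```

Run with file1.xml file2.xml file3.xml as arguments, this should print the three student elements wrapped in a single class element. To write file4.xml directly, the StreamResult could wrap a File instead of a StringWriter.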


If you know the files are each well-formed you could concatenate them together (after removing any prolog entry from them), along with a header and footer containing the root element start and end tags.

// Needs java.io.* and Apache Commons IO on the classpath.
// header.xml holds the opening tag (<class>), footer.xml the closing tag (</class>).
String[] filenames = {"header.xml", "file1.xml", "file2.xml", "file3.xml", "footer.xml"};
try (OutputStream outputStream = new BufferedOutputStream(new FileOutputStream("merged.xml"))) {
    for (String filename : filenames) {
        try (InputStream inputStream = new BufferedInputStream(new FileInputStream(filename))) {
            // Copies all bytes from the input stream to the output stream.
            org.apache.commons.io.IOUtils.copy(inputStream, outputStream);
        }
    }
}
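The "removing any prolog entry" step is not shown above. One possible way to do it with only the JDK is sketched below. The names ConcatMerge, stripDeclaration and merge are invented for this example. It assumes UTF-8 input, files small enough to read whole, and no DOCTYPE (only an optional leading `<?xml ...?>` declaration is handled).

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
class ConcatMerge {

    // Matches a leading XML declaration such as <?xml version="1.0" encoding="UTF-8"?>
    private static final Pattern XML_DECLARATION =
            Pattern.compile("^\\s*<\\?xml\\s[^>]*\\?>\\s*");

    /** Removes a leading XML declaration, if there is one. */
    static String stripDeclaration(String xml) {
        return XML_DECLARATION.matcher(xml).replaceFirst("");
    }

    /** Writes the wrapper start tag, each input without its declaration, then the end tag. */
    static void merge(String wrapperName, List<Path> inputs, Path output) throws IOException {
        try (Writer out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            out.write("<" + wrapperName + ">\n");
            for (Path input : inputs) {
                String xml = new String(Files.readAllBytes(input), StandardCharsets.UTF_8);
                out.write(stripDeclaration(xml).trim());
                out.write("\n");
            }
            out.write("</" + wrapperName + ">\n");
        }
    }
}
```

Here the root element start and end tags are written directly, so no header.xml or footer.xml is needed. Like any text-level merge, this does not check that the inputs are well-formed.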


I think the correct way to do it is to load the three files into DOM documents and then have one of them adopt the nodes from the other two. That way everything is handled by the DOM API, which should be foolproof compared with text manipulation.

You can achieve this by looking at the org.w3c.dom.Document javadoc (see adoptNode and importNode).
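A small sketch of the adopt variant, assuming the documents are already parsed. DomAdoptMerge, parse and merge are names invented for this example. Since a document can only have one root element, the sketch adopts the nodes into a new document that holds the wrapping node.

```java
import java.io.StringReader;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of any library.
class DomAdoptMerge {

    /** Parses an XML string into a DOM document. */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Moves the root element of every source document under a new wrapper element. */
    static Document merge(String wrapperName, List<Document> sources) throws Exception {
        Document merged = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element wrapper = merged.createElement(wrapperName);
        merged.appendChild(wrapper);

        for (Document source : sources) {
            // adoptNode moves the node: it is detached from 'source' and now belongs to 'merged'.
            Node adopted = merged.adoptNode(source.getDocumentElement());
            if (adopted == null) {
                // The DOM API allows adoptNode to return null when adoption is not supported.
                throw new IllegalStateException("adoptNode is not supported for this node");
            }
            wrapper.appendChild(adopted);
        }
        return merged;
    }
}
```

Unlike importNode, adoptNode does not copy, so the source documents lose their root elements. That avoids duplicating the tree in memory but makes the sources unusable afterwards.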
