How does a DTD inside an XML document make it easier to parse using Java DOM?


Developer https://www.devze.com 2023-02-19 19:59 Source: Internet


Related topics: dom dtd xml

Hey friends, I have been assigned a query planning project. In this project I enter an SQL-like query, which I have to slice and turn into XML. I did this part; however, I am required to add a DTD for this XML because, as the project description mentions, it helps to parse this query (using Java DOM) and to easily find the selection, duplicates and joins specified in the query.

I don't understand: how does a DTD help while using DOM to parse the XML and find its different parts?

I could use DOM to find the different parts of the XML without a DTD... Can anybody give me an example of the difference?

Thanks

A DTD tells the parser which tags are allowed, and where in the document they should be expected. Without a DTD, the parser will read the tags but it won't know if the tag was an expected one, or if it was in the right place.

Whether you parse your XML with SAX or DOM doesn't matter: neither parser will know if your tags are expected or unexpected without a DTD (or one of its more recent replacements like XSD, RELAX NG, etc.).
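To make the difference concrete, here is a minimal sketch of a DOM parse with DTD validation switched on. The element names (`query`, `selection`, `join`) and the class name are made up for illustration, since the real query format isn't shown in the question. It relies only on the standard JAXP classes and collects whatever the validator reports.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {

    // Hypothetical DTD embedded in the document (internal subset).
    static final String DOCTYPE =
        "<!DOCTYPE query [\n"
      + "  <!ELEMENT query (selection, join*)>\n"
      + "  <!ELEMENT selection (#PCDATA)>\n"
      + "  <!ELEMENT join (#PCDATA)>\n"
      + "]>\n";

    static final String VALID = DOCTYPE
      + "<query><selection>name</selection><join>orders</join></query>";

    // A second <selection> violates the content model (selection, join*).
    static final String INVALID = DOCTYPE
      + "<query><selection>name</selection><selection>age</selection></query>";

    /** Parses the XML with DTD validation on and returns the reported validity errors. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation is off by default
        DocumentBuilder builder = factory.newDocumentBuilder();

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("valid document errors:   " + validate(VALID));
        System.out.println("invalid document errors: " + validate(INVALID));
    }
}
```

Without `setValidating(true)` both documents would parse into a DOM tree without complaint, which is the situation described above: the parser reads the tags but has no idea whether they belong there.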

A DTD does not help to parse the XML, but it provides rudimentary validation by defining certain rules about the document. If the document violates the validation rules, the parsing should fail (or produce a warning message, depending on the validator's configuration). They may call it "helpful" because your DOM navigation code can make stronger assumptions about the structure of the document, without fear of ungraceful failure.
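As one illustration of "better assumptions", here is a rough sketch that parses the same pretty-printed document twice: once plainly, and once with validation plus `setIgnoringElementContentWhitespace(true)`, a JAXP setting that needs the DTD's content model to know which whitespace is ignorable. The document format and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DomNavigationSketch {

    // Hypothetical query document with an embedded DTD and indentation.
    static final String XML =
        "<!DOCTYPE query [\n"
      + "  <!ELEMENT query (selection, join*)>\n"
      + "  <!ELEMENT selection (#PCDATA)>\n"
      + "  <!ELEMENT join (#PCDATA)>\n"
      + "]>\n"
      + "<query>\n"
      + "  <selection>name</selection>\n"
      + "  <join>orders</join>\n"
      + "</query>";

    /** Parses the XML; in strict mode it validates against the DTD and drops ignorable whitespace. */
    static Document parse(String xml, boolean strict) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(strict);
        factory.setIgnoringElementContentWhitespace(strict);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Reject invalid documents outright so navigation code never sees them.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) throws SAXParseException { throw e; }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    /** Name of the first child node of the root element. */
    static String firstChildName(Document doc) {
        return doc.getDocumentElement().getFirstChild().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("plain:  " + firstChildName(parse(XML, false)));
        System.out.println("strict: " + firstChildName(parse(XML, true)));
    }
}
```

In the plain parse the first child of `<query>` is a whitespace text node, so navigation code has to skip over text nodes and check element names defensively. In the strict parse the DTD guarantees that the first child is `<selection>` and that only `<join>` elements follow, so the code can rely on that order.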

They probably want you to include the DTD in the document because otherwise the DTD has to be resolved from a PUBLIC/SYSTEM identifier and be hosted somewhere. Alternatively, they would need a predefined DTD in the source code (also a version of this "hosting somewhere"), which may not be an option if the same code has to process different documents without prior knowledge of their structure.
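For comparison, the "predefined DTD in the source code" variant could look roughly like the sketch below. It uses the standard `EntityResolver` hook on `DocumentBuilder` to hand the parser a DTD bundled with the program whenever a document refers to it by SYSTEM identifier. The identifier `query.dtd`, the DTD content and the class name are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class BundledDtdSketch {

    // Hypothetical DTD kept in the source code instead of inside the document.
    static final String BUNDLED_DTD =
        "<!ELEMENT query (selection, join*)>\n"
      + "<!ELEMENT selection (#PCDATA)>\n"
      + "<!ELEMENT join (#PCDATA)>\n";

    // The document only points at the DTD; it does not contain it.
    static final String XML =
        "<!DOCTYPE query SYSTEM \"query.dtd\">\n"
      + "<query><selection>name</selection></query>";

    /** Validates the XML against the bundled DTD and returns the reported validity errors. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Serve the bundled DTD when the document asks for "query.dtd".
        builder.setEntityResolver((publicId, systemId) -> {
            if (systemId != null && systemId.endsWith("query.dtd")) {
                return new InputSource(new StringReader(BUNDLED_DTD));
            }
            return null; // fall back to the parser's default resolution
        });

        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { /* ignored in this sketch */ }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("errors: " + validate(XML));
    }
}
```

The trade-off is the one described above: this only works when the program knows in advance which DTD each document expects, whereas a DTD embedded in the document travels with it.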

The statement about finding duplicates more easily may be because they plan to reject the document if it contains a duplicate. I'm not sure how helpful it is for the joins without knowing the details of the slicing and turning.