Possible Duplicate:
Should I use Elements or Attributes in XML?
I have never been able to figure out when to use XML attributes; I always use elements. I just read this w3schools article, which states that using attributes is bad practice because:
- attributes cannot contain multiple values (child elements can)
- attributes are not easily expandable (for future changes)
- attributes cannot describe structures (child elements can)
- attributes are more difficult to manipulate by program code
- attribute values are not easy to test against a DTD
The only exception that it states is when you are assigning an id to a tag.
Is this correct? Why do attributes even exist then? Was it a design error in XML? Is there something I am missing here?
The only reason I could think of for using attributes would be one-to-one relationships, e.g. a name. But it would have to be a one-to-one relationship to something that is a primitive (or a string), because you would need to be sure that you will never want to break the value up into several different parts in the future. For example, going from:
<date> May 23, 2001 </date>
to:
<date>
<month> May </month>
<d> 23 </d>
<yr> 2001 </yr>
</date>
Because this would not be possible with an attribute.
Bonus question: In the date example would it be possible to do something like this:
<date>
<default> May 23, 2001 </default>
<month> May </month>
<d> 23 </d>
<yr> 2001 </yr>
</date>
To give future applications more (or different) information while still offering existing apps the same format? Or would you have to do this:
<date> May 23, 2001 </date>
<NEWdate>
<month> May </month>
<d> 23 </d>
<yr> 2001 </yr>
</NEWdate>
Attributes are good when you want to attach information to other information, perhaps to describe how the information should be interpreted. For example:
<speed unit="mph">65</speed>
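As a rough sketch of what "describing how the information should be interpreted" can look like on the consuming side, here is a small Python reader that uses the unit attribute to decide how to convert the element's value. The function name and the table of supported units are my own assumptions, not part of any standard.

```python
import xml.etree.ElementTree as ET

# Conversion factors to km/h. The set of supported units is an assumption
# made for this illustration.
TO_KMH = {"mph": 1.609344, "km/h": 1.0}

def speed_in_kmh(xml_text):
    """Read a <speed> element and normalise its value to km/h."""
    elem = ET.fromstring(xml_text)
    unit = elem.get("unit")    # metadata: how to interpret the value
    value = float(elem.text)   # the data itself
    return value * TO_KMH[unit]

kmh = speed_in_kmh('<speed unit="mph">65</speed>')
print(round(kmh, 1))
```

The value stays the element's content, and the attribute only qualifies it, so a reader that ignores units can still pick up the number.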
The points you list about elements are correct, and I would add the following:
- elements generally make prettier (more readable) diffs when you need to compare revisions of a file
But sometimes using an element to model a data point is overkill -- particularly when you have a lot of small, heterogeneous data points within a single parent element. Using attributes for simple things can improve readability. Some will probably argue that XML isn't readable or meant to be read/edited by humans... but I do it all the time.
Consider this example (basic hyperlink):
<a href="http://www.htmlhelp.com/" title="Help Information" target="_top">Web Design Group</a>
Would you like it if you had to write or read it this way instead?
<a>
<href>http://www.htmlhelp.com/</href>
<title>Help Information</title>
<target>_top</target>
<text>Web Design Group</text>
</a>
To me that looks like a lot of noise.
Don't forget that attributes are parsed as part of the start tag. This means while you're parsing, you get those values right away, you don't have to wait for the close tag. Plus, you don't invoke all the parsing events (if you're doing stream parsing) for all the element tags.
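To make the stream-parsing point concrete, here is a minimal sketch using Python's standard xml.sax module. The EventLog handler and the sax_events helper are hypothetical names for this illustration; they just record every callback the parser fires so the two encodings of the same date can be compared.

```python
import xml.sax

class EventLog(xml.sax.ContentHandler):
    """Records every SAX callback so two encodings can be compared."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # Attribute values arrive together with the start tag.
        self.events.append(("start", name, dict(attrs)))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # Ignore whitespace-only chunks; text may arrive in several pieces.
        if content.strip():
            self.events.append(("text", content.strip()))

def sax_events(xml_text):
    handler = EventLog()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events

attr_events = sax_events('<date month="12" day="31" year="2009"/>')
elem_events = sax_events(
    "<date><month>12</month><day>31</day><year>2009</year></date>"
)
print(len(attr_events), len(elem_events))
```

In the attribute version all three values are already in hand at the first callback, while the element version needs a start, text, and end event per field before the date is complete.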
I prefer to use attributes for metadata about the contained element. For example, I like to express dates as <date format="dd-MMM-yyyy">20-Jan-2010</date>. If you've got unambiguous data elements, go ahead and just make them attributes. <name first="Tom" last="Jones"/> works for many cases.
Attributes are just that: attributes of the element. If you need to nest multiple elements, then you use elements. In your date example I usually just use attributes, because it is smaller.
<date month="12" day="31" year="2009"/>
is much easier to deal with, smaller to store and send over the wire, and arguably easier for a human to read. A date will never have multiple days, months or years, so there is no reason to make them elements.
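For what it's worth, reading the attribute form back is short too. This is only a sketch with Python's standard library; parse_date is a hypothetical helper name, and it assumes all three attributes are present and numeric.

```python
import datetime
import xml.etree.ElementTree as ET

def parse_date(xml_text):
    """Turn <date month=".." day=".." year=".."/> into a datetime.date."""
    elem = ET.fromstring(xml_text)
    # Each attribute occurs at most once, so a plain lookup is enough.
    return datetime.date(
        int(elem.get("year")),
        int(elem.get("month")),
        int(elem.get("day")),
    )

parsed = parse_date('<date month="12" day="31" year="2009"/>')
print(parsed.isoformat())
```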
Think of a block of contact information...
<!-- attribute version -->
<person name="Matt" age="27">
<phone type="mobile" value="1234567890" />
<phone type="work" value="1234560987" />
<address type="home"
city="NoWhere"
state="OH"
street="123 Lost Ave."
zipcode="12345" />
</person>
<!-- element version -->
<person>
<name>Matt</name>
<age>27</age>
<phone>
<type>mobile</type>
<value>1234567890</value>
</phone>
<phone>
<type>work</type>
<value>1234560987</value>
</phone>
<address>
<type>home</type>
<city>NoWhere</city>
<state>OH</state>
<street>123 Lost Ave.</street>
<zipcode>12345</zipcode>
</address>
</person>
... you could expand these out into elements. However, if you are processing hundreds, or possibly millions, of records, the extra overhead from the end tags can bloat the files. This could cause problems on memory- or processor-constrained systems and on slow data links. Littering your XML with elements can also make it much more difficult to read and understand visually. While the visual experience of data may not matter for transfer and storage, it can be very important for configuration and maintenance.
Another problem that can come out of using elements for everything shows up when you try to use data from outside of your code base: you have a much more difficult time knowing whether the elements can repeat or whether they should only contain a simple piece of information. Yes, you can constrain this with XSD and DTD, but that is typically more difficult than just making the XML easy to understand.
As for your bonus question... Versioning of XML schemas would depend on the platform you are developing against and how strict your code and platform are about the schema. XML (and binary files) can be very flexible... that is really why XML is eXtensible.
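If you want to see the size difference rather than take it on faith, a quick sketch like the following serialises one record both ways with Python's ElementTree and compares the byte counts. The two builder functions are hypothetical and only cover the name, age and phone fields of the contact block; real numbers will depend on your data and on whether you compress the stream.

```python
import xml.etree.ElementTree as ET

def person_as_attributes(name, age, phones):
    """Serialise a person with all simple fields as attributes."""
    person = ET.Element("person", name=name, age=str(age))
    for kind, number in phones:
        ET.SubElement(person, "phone", type=kind, value=number)
    return ET.tostring(person)

def person_as_elements(name, age, phones):
    """Serialise the same person with one element per field."""
    person = ET.Element("person")
    ET.SubElement(person, "name").text = name
    ET.SubElement(person, "age").text = str(age)
    for kind, number in phones:
        phone = ET.SubElement(person, "phone")
        ET.SubElement(phone, "type").text = kind
        ET.SubElement(phone, "value").text = number
    return ET.tostring(person)

phones = [("mobile", "1234567890"), ("work", "1234560987")]
compact = person_as_attributes("Matt", 27, phones)
verbose = person_as_elements("Matt", 27, phones)
print(len(compact), len(verbose))
```

Multiply the per-record difference by the number of records to estimate what it costs you on the wire.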
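To illustrate why the answer depends on how strict the consuming code is, here is a hedged sketch in Python. Both reader functions are hypothetical stand-ins for an "existing app" and a more forgiving one. The first expects the date as plain text inside <date>, the second also looks for a <default> child.

```python
import xml.etree.ElementTree as ET

OLD = "<date> May 23, 2001 </date>"
NEW = (
    "<date><default> May 23, 2001 </default>"
    "<month> May </month><d> 23 </d><yr> 2001 </yr></date>"
)

def read_date_v1(xml_text):
    """Hypothetical existing app: expects plain text inside <date>."""
    return (ET.fromstring(xml_text).text or "").strip()

def read_date_tolerant(xml_text):
    """Hypothetical reader that accepts both the old and the new layout."""
    root = ET.fromstring(xml_text)
    default = root.find("default")
    node = default if default is not None else root
    return (node.text or "").strip()

print(repr(read_date_v1(OLD)), repr(read_date_v1(NEW)))
print(repr(read_date_tolerant(OLD)), repr(read_date_tolerant(NEW)))
```

With this kind of reader, the old application gets an empty string from the new layout, so nesting a <default> inside <date> only works if existing consumers were written to tolerate it. Otherwise a separate element next to the original one is the safer route.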
All those points from the w3schools article are absolutely valid and correct. I agree - I hardly ever use attributes in my XML documents.
The only time I would use them might be when I need to identify an entity, e.g.
<Customer Id="123123">
....
</Customer>
But even here, it's a toss-up. You could just as easily put that ID into an <ID>123123</ID> element.
Furthermore, in my case, since the WCF DataContractSerializer doesn't support XML attributes (for performance reasons), that's one more reason not to use them (much).
"Why do attributes even exist then?"
To allow for more concise XML code, just to save you typing. And, of course, any XML file containing attributes
<element attr1="val1" attr2="val2" ... attrN="valN">
<nestedElement>
...
</nestedElement>
</element>
can be easily converted to an "attributeless" one:
<element>
<attributes>
<attr1>val1</attr1>
<attr2>val2</attr2>
...
<attrN>valN</attrN>
</attributes>
<nestedElement>
...
</nestedElement>
</element>
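The conversion is mechanical enough to sketch in a few lines of Python. strip_attributes is a hypothetical helper, and it assumes plain, un-namespaced attribute names that are also valid element names.

```python
import xml.etree.ElementTree as ET

def strip_attributes(elem):
    """Move every attribute of elem (recursively) into an <attributes> child."""
    # Process the original children first, before the new holder is inserted.
    for child in list(elem):
        strip_attributes(child)
    if elem.attrib:
        holder = ET.Element("attributes")
        for key, value in elem.attrib.items():
            ET.SubElement(holder, key).text = value
        elem.attrib.clear()
        elem.insert(0, holder)
    return elem

root = ET.fromstring(
    '<element attr1="val1" attr2="val2"><nestedElement/></element>'
)
strip_attributes(root)
print(ET.tostring(root).decode())
```

Going the other way is not always possible, since an element can repeat or contain structure while an attribute cannot.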
This question has made me scratch my head too. For me, it's a matter of semantics. It seems more natural to me to write
<page size="a4">
than
<page>
<size>a4</size>
</page>
I generally use attributes for the minimum set of fields that make a node unique. In other words, they represent the primary key. This makes some things easier if you need to correlate XML with a relational database.
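As a small illustration of the primary-key idea, a key held in an attribute can be looked up directly with the XPath subset in Python's ElementTree. The document, the customer names and the find_customer helper below are made up for the example.

```python
import xml.etree.ElementTree as ET

# Sample data invented for this illustration.
DOC = """
<Customers>
  <Customer Id="123123"><Name>Alice</Name></Customer>
  <Customer Id="456456"><Name>Bob</Name></Customer>
</Customers>
"""

root = ET.fromstring(DOC)

def find_customer(root, customer_id):
    """Look a customer up by its key attribute, like a primary-key query."""
    return root.find(f"./Customer[@Id='{customer_id}']")

match = find_customer(root, "456456")
print(match.find("Name").text)
```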