I have been experimenting with Scala and XML, and I found a strange difference in behavior between an XML tag created with XML.load (or loadString) and one written as a literal. Here is the code:
import scala.xml._
// creating a classical link HTML tag
val in_xml = <link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>
// The same as a String
val in_str = """<link type="text/css" href="/css/main.css" rel="stylesheet" xmlns="http://www.w3.org/1999/xhtml"></link>"""
// Convert the String into XML
val from_str = XML.loadString(in_str)
println("in_xml : " + in_xml)
println("from_str: "+ from_str)
println("val_xml == from_str: "+ (in_xml == from_str))
println("in_xml.getClass() == from_str.getClass(): " +
(in_xml.getClass() == from_str.getClass()))
And here is the output:
in_xml : <link href="/css/main.css" rel="stylesheet" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
from_str: <link rel="stylesheet" href="/css/main.css" type="text/css" xmlns="http://www.w3.org/1999/xhtml"></link>
val_xml == from_str: false
in_xml.getClass() == from_str.getClass(): true
The types are the same, but the two values are not equal. The order of the attributes changes, and it is never the same as in the original source. The attributes of the literal come out alphabetically sorted (is that only a coincidence?).
This would not be a problem if the two versions did not behave differently when I try to transform them. I picked up some interesting code from Daniel C. Sobral at How to change attribute on Scala XML Element and wrote my own rule to remove the leading slash of the "href" attribute. The RuleTransformer works well with in_xml, but has no effect on from_str!
Unfortunately, most of my programs have to read their XML via XML.load(...), so I'm stuck. Does anyone know about this topic?
Best regards,
Henri
From what I can see, in_xml and from_str are not equal because the order of the attributes is different. This is unfortunate and is due to the way the XML is created by the compiler. That causes the attributes to be different:
scala> in_xml.attributes == from_str.attributes
res30: Boolean = false
You can see that if you replace the attributes, the comparison works:
scala> in_xml.copy(attributes=from_str.attributes) == from_str
res32: Boolean = true
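If attribute order is the only thing that differs, an order-insensitive check can be sketched by comparing the attributes as a map. This is only a sketch: AttrCompare and sameAttributes are hypothetical names, and the check looks at attributes only (as flattened text), not at the label, namespace, or children.

```scala
import scala.xml._

object AttrCompare {
  // Compare attributes as a key -> value map, so their order is ignored.
  // asAttrMap flattens each attribute value to its text.
  def sameAttributes(a: Node, b: Node): Boolean =
    a.attributes.asAttrMap == b.attributes.asAttrMap

  def main(args: Array[String]): Unit = {
    // Same attributes, written in a different order.
    val a = XML.loadString(
      """<link type="text/css" href="/css/main.css" rel="stylesheet"></link>""")
    val b = XML.loadString(
      """<link rel="stylesheet" type="text/css" href="/css/main.css"></link>""")
    println(sameAttributes(a, b))
  }
}
```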
With that said, I'm not clear why that would cause different behavior in the code that replaces the href attribute. In fact, I suspect that something is wrong with the way attribute mapping works. For instance, if I replace in_str with:
val in_str = """<link type="text/css" rel="stylesheet" href="/css/main.css"
xmlns="http://www.w3.org/1999/xhtml"></link>"""
It works fine. Could it be that the attribute code from Daniel only works if the attribute is in the head position of MetaData?
Side note: unless in_xml is null, equals and == return the same value. The == version checks whether the first operand is null before calling equals.
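To illustrate the side note, here is a minimal sketch using plain strings rather than XML (NullSafeEquals and viaOperator are hypothetical names):

```scala
object NullSafeEquals {
  // == checks the left operand for null before delegating to equals.
  def viaOperator(a: String, b: String): Boolean = a == b

  def main(args: Array[String]): Unit = {
    val left: String = null
    // No exception here: the null check happens first.
    println(viaOperator(left, "x"))
    // Calling left.equals("x") directly would throw a NullPointerException.
  }
}
```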
Some further testing: maybe my initial equality test is not appropriate:
in_xml == from_str
and if I test:
in_xml.equals(in_xml)
I also get false. Maybe I should use another testing method (like corresponds, but I could not work out what predicate to use as the second parameter...)
That said, if I test the following in the REPL
<body id="1234"></body> == XML.loadString("<body id=\"1234\"></body>")
I get true, even without calling the equals method...
Back to my initial example: I defined a rewrite rule
def unSlash(s: String) = if (s.head == '/') s.tail else s
val changeCSS = new RewriteRule {
override def transform(n: Node): NodeSeq = n match {
case e: Elem if (n \ "@rel").text == "stylesheet" =>
e.copy(attributes = mapMetaData(e.attributes) {
case g @ GenAttr(_, key, Text(v), _) if key == "href" =>
g.copy(value = Text(unSlash(v)))
case other => other
})
case n => n
}
}
It uses the helper classes/methods defined by Daniel C. Sobral at How to change attribute on Scala XML Element. If I apply:
new RuleTransformer(changeCSS).transform(in_xml)
new RuleTransformer(changeCSS).transform(from_str)
I get the expected result with in_xml, but no modification with from_str...
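For comparison, here is a sketch of the same rewrite that does not go through the mapMetaData/GenAttr helpers, so it should not depend on where href sits in the attribute chain. It relies only on Elem.attribute, the % operator and UnprefixedAttribute from scala.xml. UnslashHref is a hypothetical name, and this is an alternative under those assumptions, not an explanation of why the rule above skips from_str.

```scala
import scala.xml._
import scala.xml.transform.{RewriteRule, RuleTransformer}

object UnslashHref {
  // Drop a single leading slash; safe on empty strings.
  def unSlash(s: String): String = if (s.startsWith("/")) s.drop(1) else s

  val rule: RewriteRule = new RewriteRule {
    override def transform(n: Node): NodeSeq = n match {
      case e: Elem if (e \ "@rel").text == "stylesheet" =>
        e.attribute("href") match {
          // % replaces the existing href, wherever it is in the chain.
          case Some(v) => e % new UnprefixedAttribute("href", unSlash(v.text), Null)
          case None    => e
        }
      case other => other
    }
  }

  def apply(n: Node): Seq[Node] = new RuleTransformer(rule).transform(n)
}
```

Usage would be UnslashHref(from_str) or UnslashHref(in_xml); the href is looked up by key, so the attribute order in the source should not matter.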