I have a large XML file which contains a lot of self-closed tags. How can I remove all of them using XSLT?
E.g.:
<?xml version="1.0" encoding="utf-8" ?>
<Persons>
<Person>
<Name>user1</Name>
<Tel />
<Mobile>123</Mobile>
</Person>
<Person>
<Name>user2</Name>
<Tel>456</Tel>
<Mobile />
</Person>
<Person>
<Name />
<Tel>123</Tel>
<Mobile />
</Person>
<Person>
<Name>user4</Name>
<Tel />
<Mobile />
</Person>
</Persons>
I'm expecting the result:
<?xml version="1.0" encoding="utf-8" ?>
<Persons>
<Person>
<Name>user1</Name>
<Mobile>123</Mobile>
</Person>
<Person>
<Name>user2</Name>
<Tel>456</Tel>
</Person>
<Person>
<Tel>123</Tel>
</Person>
<Person>
<Name>user4</Name>
</Person>
</Persons>
Note: there are thousands of different elements, so how can I programmatically remove all the self-closed tags? Another question is how to remove empty elements such as <name></name> as well.
Can anyone help me with this? Many thanks.
Self-closed tags are equivalent to empty tags. You can remove all empty tags, but you have no way of knowing whether they were self-closed in the input XML or not (<tag/> and <tag></tag> are indistinguishable).
<!-- the identity template copies everything that has no special handler -->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*" />
</xsl:copy>
</xsl:template>
<!-- special handler for elements that have no child nodes:
they are removed by this empty template -->
<xsl:template match="*[not(node())]" />
If elements that contain whitespace only are "empty" by your definition as well, then replace the second template with:
<xsl:template match="*[normalize-space() = '']" />
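For reference, the two templates above can be assembled into a complete stylesheet. This is only a sketch under a few assumptions that are not part of the answer: the xsl:output and xsl:strip-space settings are added for tidier output, and the predicate is extended with not(descendant-or-self::*/@*) so that elements carrying attributes (on themselves or on nested elements) are kept. Drop that condition if such elements should be removed too.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- assumption: drop whitespace-only text nodes so that removed
       elements do not leave blank lines behind -->
  <xsl:strip-space elements="*"/>

  <!-- identity template: copies everything that has no special handler -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- empty template: removes elements whose text content is empty or
       whitespace only, unless they or their descendants carry attributes
       (the attribute check is an assumption, not part of the answer) -->
  <xsl:template match="*[not(descendant-or-self::*/@*) and normalize-space() = '']"/>
</xsl:stylesheet>
```

Applied to the sample input from the question, this should keep every Person element and drop the empty Name, Tel and Mobile elements. The exact indentation of the output depends on the XSLT processor.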
From the XML point of view, there is no difference between a "self-closed" element like <tag/> and an empty element like <tag></tag> (see spec).
Here is a transformation to strip all empty elements:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes" encoding="utf-8" />
<xsl:strip-space elements="*" />
<xsl:template match="@*|node()">
<xsl:if test=".!=''">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:if>
</xsl:template>
</xsl:stylesheet>
You might want to check whether they are required. If they are, the declaration should look something like use="required". Also check whether they are declared as type="nonEmptyString".
You can remove all empty elements, that is, the ones that have no text content and no attributes with a value, either on themselves or on nested elements. If this solution works for you, you can do the following:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="*">
<xsl:if test="string(.) != '' or descendant-or-self::*/@*[string(.)]">
<xsl:element name="{name()}" >
<xsl:copy-of select="@*[string(.)]"/>
<xsl:apply-templates select="* | text()" />
</xsl:element>
</xsl:if>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="."/>
</xsl:template>
</xsl:stylesheet>
The reason for posting this answer is that you haven't accepted any of the existing answers yet.
Well, this is a very simple XSLT challenge. Just match any node whose text data is empty with an empty template, so that the node does not appear in the output.
Like this: <xsl:template match="*[.='']"/>
Add it alongside your identity template, similar to the way Tomolak has nailed it.
The problem with this approach is that it deletes even your parent node (the <Person/> tag, for example) if it is empty.
If this is your xml:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
<data3/>
</Person>
<Person/>
</Persons>
From the above XML, even the empty <Person/> tag is removed. So the output XML will be:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
</Person>
</Persons>
If you want to avoid that, then add an exception.
<xsl:template match="*[name()!='Person' and not(node())]"/>
Add it to your identity template. Your XSLT will be:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
<xsl:output method="xml" indent="yes"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="*[name()!='Person' and not(node())]"/>
</xsl:stylesheet>
And the output xml will be:
<Persons>
<Person>
<data>text</data>
<data2>text</data2>
</Person>
<Person/>
</Persons>
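If more than one element name needs to be protected, the exception can also be written with the self axis instead of comparing name(). The sketch below is an assumption-laden variant rather than part of the answer: it protects both Person and Persons (names taken from the sample XML), and it assumes the elements are in no namespace.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- identity template: copies everything that has no special handler -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- remove elements without child nodes, except Person and Persons;
       extend the "or" list with further names as needed -->
  <xsl:template match="*[not(node()) and not(self::Person or self::Persons)]"/>
</xsl:stylesheet>
```

The self:: test matches on the element's expanded name, so it behaves more predictably than a name() string comparison once namespace prefixes are involved.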