XSLT: How to remove the self-closed element

Developer https://www.devze.com 2022-12-23 15:57 Source: Internet
I have a large XML file that contains a lot of self-closed tags. How can I remove all of them using XSLT?

For example:

<?xml version="1.0" encoding="utf-8" ?>
<Persons>
  <Person>
    <Name>user1</Name>
    <Tel />
    <Mobile>123</Mobile>
  </Person>
  <Person>
    <Name>user2</Name>
    <Tel>456</Tel>
    <Mobile />
  </Person>
  <Person>
    <Name />
    <Tel>123</Tel>
    <Mobile />
  </Person>
  <Person>
    <Name>user4</Name>
    <Tel />
    <Mobile />
  </Person>
</Persons>

I'm expecting the result:

<?xml version="1.0" encoding="utf-8" ?>
<Persons>
  <Person>
    <Name>user1</Name>
    <Mobile>123</Mobile>
  </Person>
  <Person>
    <Name>user2</Name>
    <Tel>456</Tel>
  </Person>
  <Person>
    <Tel>123</Tel>
  </Person>
  <Person>
    <Name>user4</Name>
  </Person>
</Persons>

Note: there are thousands of different elements, so how can I remove all the self-closed tags programmatically? Another question is how to remove empty elements such as <name></name> as well.

Can anyone help me on this? Many thanks.


The self-closed tags are equivalent to empty tags. You can remove all empty tags, but you have no way of knowing whether they were self-closed in the input XML or not (<tag/> and <tag></tag> are indistinguishable).

<!-- the identity template copies everything that has no special handler -->
<xsl:template match="node()|@*">
  <xsl:copy>
    <xsl:apply-templates select="node()|@*" />
  </xsl:copy>
</xsl:template>

<!-- special handler for elements that have no child nodes:
     they are removed by this empty template -->
<xsl:template match="*[not(node())]" />

If elements that contain whitespace only are "empty" by your definition as well, then replace the second template with:

<xsl:template match="*[normalize-space() = '']" />
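
For completeness, here is a minimal sketch of how the two templates above could be assembled into a full stylesheet. The xsl:output and xsl:strip-space settings are my own assumptions and not part of the answer above: stripping whitespace-only text nodes avoids leftover blank lines where elements were removed, and it also means an element such as <Tel> </Tel> counts as having no child nodes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- assumption: re-indent the output and drop whitespace-only text nodes
       from the input before any template is matched -->
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>
  <xsl:strip-space elements="*"/>

  <!-- identity template: copies every node that has no more specific template -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- elements without child nodes match this empty template and are dropped;
       its default priority (0.5) is higher than the identity template's (-0.5) -->
  <xsl:template match="*[not(node())]"/>
</xsl:stylesheet>
```

Applied to the sample in the question, this should keep only the Name, Tel and Mobile elements that have text. Note that an element carrying attributes but no children is also removed by this pattern, so adjust the predicate if such elements matter to you.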


From the XML point of view, there is no difference between a "self-closed" element like <element/> and an empty element like <element></element> (see spec).

Here is a transformation to strip all empty elements:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes" encoding="utf-8" />
    <xsl:strip-space elements="*" />

    <xsl:template match="@*|node()">
        <xsl:if test=".!=''">
            <xsl:copy>
                <xsl:apply-templates select="@*|node()"/>
            </xsl:copy>
        </xsl:if>
    </xsl:template>

</xsl:stylesheet>


You might want to check whether these elements are required by your schema before removing them. If they are, the declaration should look something like this: use="required". Also check whether they are declared as type="nonEmptyString".


You can remove all empty elements, meaning those that have neither text content nor non-empty attributes, on themselves or on any descendant. If that definition works for you, you can do the following:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="*">
    <xsl:if test="string(.) != '' or descendant-or-self::*/@*[string(.)]">
      <xsl:element name="{name()}" >
        <xsl:copy-of select="@*[string(.)]"/>
        <xsl:apply-templates select="* | text()" />
      </xsl:element>
    </xsl:if>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:value-of select="."/>
  </xsl:template>

</xsl:stylesheet>


I'm posting this answer because you haven't accepted any of the existing answers yet.

Well, this is a very simple XSLT challenge. Match any element whose text content is empty with an empty template, so that the element does not appear in the output, like this: <xsl:template match="*[.='']"/>. Add it alongside your identity template, similar to what Tomolak has shown.

The problem with this approach is that it deletes even a parent node (the <Person/> tag, for example) if it is empty.

If this is your xml:

<Persons>
   <Person>
     <data>text</data>
     <data2>text</data2>
     <data3/>
   </Person>
   <Person/>
</Persons>

With the XML above, even the empty <Person/> element is removed, so the output XML will be:

<Persons>
   <Person>
     <data>text</data>
     <data2>text</data2>
   </Person>
</Persons>

If you want to avoid that, then add an exception.

<xsl:template match="*[name()!='Person' and not(node())]"/>

Add it alongside your identity template. Your XSLT will then be:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt" exclude-result-prefixes="msxsl"
>
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[name()!='Person' and not(node())]"/>
</xsl:stylesheet>

And the output xml will be:

<Persons>
   <Person>
     <data>text</data>
     <data2>text</data2>
   </Person>
   <Person/>
</Persons>
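
If hard-coding the element name is not practical (the question mentions thousands of different elements), one possible variation is to restrict removal by depth instead of by name. The sketch below assumes that the elements you want to keep even when empty are always direct children of the document element, as <Person> is here. That assumption is mine, not the original answer's.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- identity template: copy everything by default -->
    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>

    <!-- drop empty elements only when they have at least two element ancestors,
         so empty children of the document element (e.g. <Person/>) survive -->
    <xsl:template match="*/*/*[not(node())]"/>
</xsl:stylesheet>
```

For the <Persons> example above this should remove <data3/> while keeping the empty <Person/>, the same result as the name-based exception.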