
Does XSLT provide a means to identify xml elements by using regular expressions?

Developer https://www.devze.com 2023-03-07 17:31 Source: Internet

I have a sample xml file which looks like this:

--- before transformation ---
<root-node>

   <child-type-A> ... </child-type-A>
   <child-type-A> ... </child-type-A>
   <child-type-B> ... </child-type-B>
   <child-type-C>
      <child-type-B> ... </child-type-B>
      ...
   </child-type-C>


   ...

</root-node>

I want to transform this xml file into something that looks like that:

--- after transformation ---
<root-node>

   <child-node> ... </child-node>
   <child-node> ... </child-node>
   <child-node> ... </child-node>
   <child-node>
      <child-node> ... </child-node>
      ...
   </child-node>

   ...

</root-node>

Effectively that means that the document structure remains the same, but some 'chosen' elements are renamed. These chosen elements start with the same prefix (in this example with "child-type-") but have varying suffixes ("A" | "B" | "C" | etc.).

Why all this hassle? I have a piece of software that requires an XML file as input. For convenience I use an XML schema while editing the file, which helps ensure that it is correct. Sadly, XML schemas fall somewhat short when it comes to context sensitivity, which is why the file ends up looking like the one shown under "before transformation". The software cannot process such a file because it expects one like that shown under "after transformation". Hence the need for the transformation.


I want to do the transformation with XSLT, and I have already figured out how. My approach was to define one rule for the identity transformation and one rule for each "child-type-*" element that needs to be renamed. This works, but it isn't elegant: you end up with lots of rules.

--- sample transformation rules ---

<!-- Identity transformation -->
<xsl:template match="@*|node()">
   <xsl:copy>
      <xsl:apply-templates select="@*|node()" />
   </xsl:copy>
</xsl:template>

<xsl:template match="child-type-A">
   <xsl:element name="child-node">
      <xsl:apply-templates select="@*|node()" />
   </xsl:element>
</xsl:template>

...

Is there a way to condense that into only two rules? One for the identity transformation and one for all "child-type-*" elements? Maybe by using XSLT in combination with some regular expression? Or do you have to take a different approach to tackle such a problem?


(Revised my answer)

This snippet works fine with your sample XML. I merged the two templates, because they both want to act on 'all elements'. My earlier templates didn't work because both matched the same selection.

<xsl:template match="@*|node()">
    <xsl:choose>
        <xsl:when test="starts-with(name(), 'child-type')">
            <xsl:element name="child-node">
                <xsl:apply-templates select="@*|node()"/>
            </xsl:element>
        </xsl:when>
        <xsl:otherwise>
           <xsl:copy>
              <xsl:apply-templates select="@*|node()" />
           </xsl:copy>
        </xsl:otherwise>
    </xsl:choose>
</xsl:template>

Given your source XML of:

<root-node>
   <child-type-A> ... </child-type-A>
   <child-type-A> ... </child-type-A>
   <child-type-B> ... </child-type-B>
   <child-type-C>
      <child-type-B> ... </child-type-B>
   </child-type-C>
</root-node>

This results in the following output:

<root-node>
<child-node> ... </child-node>
<child-node> ... </child-node>
<child-node> ... </child-node>
<child-node>
    <child-node> ... </child-node>
</child-node>
</root-node>
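
For comparison, the same renaming can probably also be written as the two separate rules the question asks for. The following is a minimal XSLT 1.0 sketch, not the answerer's original code. It assumes the processor applies the default template priorities, under which a pattern with a predicate (priority 0.5) takes precedence over `@*|node()` (priority -0.5). It also assumes the trailing hyphen belongs to the prefix, as in the question's "child-type-".

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Rule 1: identity transformation, copies every node and attribute as-is -->
 <xsl:template match="@*|node()">
  <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
 </xsl:template>

 <!-- Rule 2: any element whose name starts with 'child-type-' is renamed.
      The predicate gives this pattern a higher default priority than
      the identity rule, so it wins for the matching elements. -->
 <xsl:template match="*[starts-with(name(), 'child-type-')]">
  <xsl:element name="child-node">
   <xsl:apply-templates select="@*|node()"/>
  </xsl:element>
 </xsl:template>
</xsl:stylesheet>
```

Because `*` matches only elements, an attribute whose name happens to start with the prefix is copied unchanged rather than turned into an element.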


XSLT has a starts-with function, which can be used to identify elements whose names start with 'child-type', allowing you to use a single template match. See this related question:

select the element which match the start-with name


It's not a good idea to capture information by attaching meaning to the internal syntax of an element name (in extremis, one could have an XML document in which all the information was captured in the name of the root element, <Surname_Kay.Firstname_Michael.Country_UK/>). However, if you've got data in that form it's certainly possible to process it, for example in XSLT 2.0 (where the matches() function is available) with a template rule of the form <xsl:template match="*[matches(name(), 'child-type-[A-Z]')]">
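
To show how such a rule might sit in a complete stylesheet, here is a minimal sketch. It assumes an XSLT 2.0 processor, since matches() is not available in XSLT 1.0. The `^` and `$` anchors and the single-letter suffix are assumptions added here to restrict the match to names like "child-type-A"; without them the pattern would also match names that merely contain that text.

```xml
<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

 <!-- Identity transformation for everything that is not renamed -->
 <xsl:template match="@*|node()">
  <xsl:copy>
   <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
 </xsl:template>

 <!-- Rename elements whose whole name is 'child-type-' plus one
      uppercase letter (anchored regex; the anchors are an assumption) -->
 <xsl:template match="*[matches(name(), '^child-type-[A-Z]$')]">
  <xsl:element name="child-node">
   <xsl:apply-templates select="@*|node()"/>
  </xsl:element>
 </xsl:template>
</xsl:stylesheet>
```

If the suffixes can be longer than one letter, the character class would need a quantifier, for example `[A-Z]+`.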


Here is a generic XSLT 1.0 transformation that works with parameters specifying the desired prefixes and, for each prefix, a set of suffixes, so that any element whose name has this prefix and one of these suffixes is renamed to the desired new name:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:my="my:my" exclude-result-prefixes="my" >
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <my:renames>
  <rename prefix="child-type-"
          newVal="child-node">
    <suffix>A</suffix>
    <suffix>B</suffix>
    <suffix>C</suffix>
  </rename>
 </my:renames>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="/*//*">
  <xsl:choose>
  <xsl:when test=
   "document('')/*
         /my:renames
           /rename
             [@prefix[starts-with(name(current()),.)]
            and
              suffix
               [substring(name(current()),
                          string-length(name(current()))
                          - string-length(.) +1
                          )
               =
                 .
               ]
              ]
    ">

  <xsl:variable name="vNewName" select=
   "document('')/*
         /my:renames
           /rename
             [@prefix[starts-with(name(current()),.)]
            and
              suffix
               [substring(name(current()),
                          string-length(name(current()))
                          -string-length(.) +1
                          )
               =
                 .
               ]
              ]
              /@newVal
   "/>

      <xsl:element name="{$vNewName}">
       <xsl:apply-templates select="node()|@*"/>
      </xsl:element>
   </xsl:when>
   <xsl:otherwise>
    <xsl:call-template name="identity"/>
   </xsl:otherwise>
  </xsl:choose>
 </xsl:template>
</xsl:stylesheet>

When applied on the provided XML document:

<root-node>
    <child-type-A> ... </child-type-A>
    <child-type-A> ... </child-type-A>
    <child-type-B> ... </child-type-B>
    <child-type-C>
      <child-type-B> ... </child-type-B>
      ...
    </child-type-C>
      ...
</root-node>

the wanted, correct result is produced:

<root-node>
   <child-node> ... </child-node>
   <child-node> ... </child-node>
   <child-node> ... </child-node>
   <child-node>
      <child-node> ... </child-node>
      ...
    </child-node>
      ...
</root-node>

Do note: Using this transformation you can simultaneously rename different elements with different prefixes and their associated suffixes, specified as external parameters/documents.

II. Equivalent XSLT 2.0 solution:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <xsl:variable name="vRules">
  <rule prefix="^child\-type\-" newVal="child-node">
    <suffix>A$</suffix>
    <suffix>B$</suffix>
    <suffix>C$</suffix>
  </rule>
 </xsl:variable>

 <xsl:template match="node()|@*" name="identity">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match=
  "*[for $n in name(.),
         $r in $vRules/*
                 [matches($n, @prefix)], 
         $s in $vRules/*/suffix
                 [matches($n, .)]
      return $r and $s
    ]">

    <xsl:variable name="vN" select="name()"/>

    <xsl:variable name="vNewName" select=
     "$vRules/*
           [matches($vN, @prefix)
           and 
            suffix[matches($vN, .)]
           ]
           /@newVal
     "/>
   <xsl:element name="{$vNewName}">
    <xsl:apply-templates select="node()|@*"/>
   </xsl:element>
 </xsl:template>
</xsl:stylesheet>

When applied to the same XML document (above), the same correct output is again produced.
