
XSL to find all nodes between nodes

devze.com (https://www.devze.com), 2023-01-22 11:22. Source: Internet

I have a large, poorly formed XML file in which the information for a single line item is spread across several consecutive LINE_INFO elements. I'm trying to group those elements under the parent line item (the one that contains ITEM_ID). The information is sequential, so the key is the ITEM_ID node, but I can't seem to write the XSL needed to group everything related to an item. The XML source is below (updated to include a newly discovered grandchild element):

<LINE_INFO>
    <ITEM_ID>some_part_num</ITEM_ID>
    <DESC>some_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
    <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>some_other_part_num</ITEM_ID>
    <DESC>some_other_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
</LINE_INFO>
<LINE_INFO>
    <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
    <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
    <ADDTL_NOTE_DETAIL>
        <NOTE>This is the grandchild note that sometimes appears in my data</NOTE>
    </ADDTL_NOTE_DETAIL>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>yet_another_part_num</ITEM_ID>
    <DESC>yet_another_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
</LINE_INFO>
  ...

Desired output:

<LINE_INFO>
    <ITEM_ID>some_part_num</ITEM_ID>
    <DESC>some_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
    <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>some_other_part_num</ITEM_ID>
    <DESC>some_other_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
    <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
    <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
    <NOTE>This is the grandchild note that sometimes appears in my data</NOTE>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>yet_another_part_num</ITEM_ID>
    <DESC>yet_another_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
</LINE_INFO>


This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:key name="kFollowing" match="LINE_INFO[not(ITEM_ID)]"
 use="generate-id(preceding-sibling::LINE_INFO[ITEM_ID][1])"/>

 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="LINE_INFO[ITEM_ID]">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>

   <xsl:apply-templates select="key('kFollowing', generate-id())/node()"/>
  </xsl:copy>
 </xsl:template>

 <xsl:template match="LINE_INFO[not(ITEM_ID)]"/>
</xsl:stylesheet>

when applied to the provided XML document (wrapped in a single top element to make it well-formed):

<t>
    <LINE_INFO>
        <ITEM_ID>some_part_num</ITEM_ID>
        <DESC>some_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
    <LINE_INFO>
        <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>some_other_part_num</ITEM_ID>
        <DESC>some_other_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
    <LINE_INFO>
        <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
    </LINE_INFO>
    <LINE_INFO>
        <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>yet_another_part_num</ITEM_ID>
        <DESC>yet_another_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
</t>

produces the wanted, correct result:

<t>
    <LINE_INFO>
        <ITEM_ID>some_part_num</ITEM_ID>
        <DESC>some_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
        <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>some_other_part_num</ITEM_ID>
        <DESC>some_other_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
        <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
        <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>yet_another_part_num</ITEM_ID>
        <DESC>yet_another_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
</t>

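Do note: the use of a key to identify, easily and efficiently, all LINE_INFO elements that don't have an ITEM_ID child and that follow a given LINE_INFO element with an ITEM_ID child. Each such element is indexed under the generate-id() of its nearest preceding sibling LINE_INFO that has an ITEM_ID.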
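The sample input above leaves out the ADDTL_NOTE_DETAIL grandchild from the question. Below is a sketch of one way to cover it, assuming the wrapper should be dropped and its NOTE child promoted, as in the desired output. The ADDTL_NOTE_DETAIL template and the xsl:strip-space line are additions for illustration and are not part of the answer above.

```xml
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <!-- Addition: drop whitespace-only text nodes so the re-indented output stays clean -->
 <xsl:strip-space elements="*"/>

 <!-- Index every ITEM_ID-less LINE_INFO under the id of the nearest
      preceding sibling LINE_INFO that has an ITEM_ID -->
 <xsl:key name="kFollowing" match="LINE_INFO[not(ITEM_ID)]"
  use="generate-id(preceding-sibling::LINE_INFO[ITEM_ID][1])"/>

 <!-- Identity rule: copy everything that no other template handles -->
 <xsl:template match="node()|@*">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
  </xsl:copy>
 </xsl:template>

 <!-- A group starter: copy its own content, then pull in the content
      of the LINE_INFO elements indexed under it -->
 <xsl:template match="LINE_INFO[ITEM_ID]">
  <xsl:copy>
   <xsl:apply-templates select="node()|@*"/>
   <xsl:apply-templates select="key('kFollowing', generate-id())/node()"/>
  </xsl:copy>
 </xsl:template>

 <!-- Continuation lines are emitted only through the key lookup above -->
 <xsl:template match="LINE_INFO[not(ITEM_ID)]"/>

 <!-- Addition (assumption): unwrap the grandchild container so that
      NOTE ends up as a direct child of the merged LINE_INFO -->
 <xsl:template match="ADDTL_NOTE_DETAIL">
  <xsl:apply-templates/>
 </xsl:template>
</xsl:stylesheet>
```

Because the unwrapping lives in its own template, the grouping logic is unchanged. Removing that template gives back the behavior of the stylesheet above, which copies ADDTL_NOTE_DETAIL through as-is.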


This XSLT 2.0 stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>
    <xsl:template match="root">
        <xsl:for-each-group select="LINE_INFO"
                            group-starting-with="LINE_INFO[ITEM_ID]">
            <xsl:copy>
                <xsl:apply-templates select="current-group()/node()"/>
            </xsl:copy>
        </xsl:for-each-group>
    </xsl:template>
</xsl:stylesheet>

With this input:

<root>
    <LINE_INFO>
        <ITEM_ID>some_part_num</ITEM_ID>
        <DESC>some_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
    <LINE_INFO>
        <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>some_other_part_num</ITEM_ID>
        <DESC>some_other_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
    <LINE_INFO>
        <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
    </LINE_INFO>
    <LINE_INFO>
        <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
    </LINE_INFO>
    <LINE_INFO>
        <ITEM_ID>yet_another_part_num</ITEM_ID>
        <DESC>yet_another_part_num_description</DESC>
        <QTY>nn</QTY>
        <UNIT>uom</UNIT>
    </LINE_INFO>
</root>

Output:

<LINE_INFO>
    <ITEM_ID>some_part_num</ITEM_ID>
    <DESC>some_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
    <EXT_DESC>more_description_for_some_part_num</EXT_DESC>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>some_other_part_num</ITEM_ID>
    <DESC>some_other_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
    <EXT_DESC>more_description_for_some_other_part_num</EXT_DESC>
    <LINE_NOTE>This is a note related to some_other_part_num</LINE_NOTE>
</LINE_INFO>
<LINE_INFO>
    <ITEM_ID>yet_another_part_num</ITEM_ID>
    <DESC>yet_another_part_num_description</DESC>
    <QTY>nn</QTY>
    <UNIT>uom</UNIT>
</LINE_INFO>
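Note that this stylesheet emits the groups without the root wrapper, and that it would copy an ADDTL_NOTE_DETAIL element through unchanged. A possible variant is sketched below, under two assumptions that are not in the answer above: the wrapper element should be kept, and the grandchild NOTE should be promoted as in the desired output. Matching the parent as `*[LINE_INFO]` is also an illustrative choice, made so that the sketch does not depend on the wrapper's name.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity rule -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: any element with LINE_INFO children is the container.
         Only its attributes and LINE_INFO children are processed here;
         other children, if any, would be dropped. -->
    <xsl:template match="*[LINE_INFO]">
        <xsl:copy>
            <xsl:apply-templates select="@*"/>
            <xsl:for-each-group select="LINE_INFO"
                                group-starting-with="LINE_INFO[ITEM_ID]">
                <!-- The context item is the first LINE_INFO of the group -->
                <xsl:copy>
                    <xsl:apply-templates select="current-group()/node()"/>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: drop the wrapper and keep its NOTE child -->
    <xsl:template match="ADDTL_NOTE_DETAIL">
        <xsl:apply-templates/>
    </xsl:template>
</xsl:stylesheet>
```

If the first LINE_INFO in the input has no ITEM_ID, group-starting-with still places it (and any ITEM_ID-less elements after it) in a leading group of its own, so no data is lost.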


This is a classic grouping problem. The best approach depends on whether you have XSLT 2.0, or have to use 1.0.

If 2.0, you'll want to use <xsl:for-each-group>:

<table>
   <xsl:for-each-group select="LINE_INFO" group-starting-with="LINE_INFO[ITEM_ID]">

The above XPath expressions for select and group-starting-with assume that the context node is the parent of the LINE_INFO elements. Alternatively you could put // on the front of both expressions, at the risk of lesser performance.

Output a row for each group, with the data put in table cells according to your most recent comment:

      <tr>
         <td><xsl:value-of select="current-group()/ITEM_ID" /></td>
         <td>
           <xsl:value-of select="concat(current-group()/DESC, current-group()/EXT_DESC)"/>
           <br />
           <xsl:value-of select="current-group()/LINE_NOTE" />
           <br />
           <xsl:value-of select="current-group()/NOTE" />
         </td>
         <td><xsl:value-of select="current-group()/QTY" /></td>
         <td><xsl:value-of select="current-group()/ADDTL_NOTE_DETAIL/NOTE" /></td>
      </tr>
   </xsl:for-each-group>
</table>
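For reference, here is a sketch of the row-per-group idea assembled into a complete stylesheet. The `*[LINE_INFO]` match pattern, the html output method, and the use of string-join() are illustrative choices rather than part of the fragments above. string-join() is used because concat() raises a type error if a group ever holds more than one DESC or EXT_DESC.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="html" indent="yes"/>

    <!-- Assumption: all LINE_INFO elements share one parent element -->
    <xsl:template match="*[LINE_INFO]">
        <table>
            <xsl:for-each-group select="LINE_INFO"
                                group-starting-with="LINE_INFO[ITEM_ID]">
                <tr>
                    <td><xsl:value-of select="current-group()/ITEM_ID"/></td>
                    <td>
                        <!-- Join the short and extended descriptions with a space -->
                        <xsl:value-of select="string-join((current-group()/DESC,
                                                           current-group()/EXT_DESC), ' ')"/>
                        <br/>
                        <xsl:value-of select="current-group()/LINE_NOTE"/>
                    </td>
                    <td><xsl:value-of select="current-group()/QTY"/></td>
                    <!-- NOTE is a grandchild, so it is reached through its wrapper -->
                    <td><xsl:value-of select="current-group()/ADDTL_NOTE_DETAIL/NOTE"/></td>
                </tr>
            </xsl:for-each-group>
        </table>
    </xsl:template>
</xsl:stylesheet>
```

In XSLT 2.0, xsl:value-of outputs every item in the selected sequence, separated by spaces, so a group with several LINE_NOTE elements still renders all of them in one cell.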

(The rest of this answer is somewhat obsolete as the OP has XSLT 2.0.)

If 1.0, your best bet is Muenchian grouping. For the identifying-the-groups step (step 1), you would use a key like

<xsl:key name="LINE_INFO-by-section" match="LINE_INFO"
    use="generate-id((. | preceding-sibling::LINE_INFO)[ITEM_ID][last()])" />

To iterate over the groups:

<xsl:for-each select="LINE_INFO[ITEM_ID]">
   <xsl:copy>

To iterate over the members of the group:

      <xsl:variable name="section-starter-id" select="generate-id(.)" />
      <xsl:for-each select="key('LINE_INFO-by-section', $section-starter-id)">
         <xsl:copy-of select="node()|@*" />
      </xsl:for-each>
   </xsl:copy>
</xsl:for-each>

(Untested.)
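Put together, the 1.0 fragments might look like the sketch below. The enclosing template and its `*[LINE_INFO]` match pattern are assumptions added to make it a complete stylesheet. It copies only child nodes (`node()`) from each group member, because in XSLT 1.0 adding attributes to an element after child nodes have already been written is an error, which would apply to the second and later members of a group.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Index every LINE_INFO under the id of the section starter:
         the last LINE_INFO with an ITEM_ID at or before it -->
    <xsl:key name="LINE_INFO-by-section" match="LINE_INFO"
        use="generate-id((. | preceding-sibling::LINE_INFO)[ITEM_ID][last()])"/>

    <!-- Assumption: the parent of the LINE_INFO elements is matched generically -->
    <xsl:template match="*[LINE_INFO]">
        <xsl:copy>
            <!-- One output LINE_INFO per section starter -->
            <xsl:for-each select="LINE_INFO[ITEM_ID]">
                <xsl:copy>
                    <xsl:variable name="section-starter-id" select="generate-id(.)"/>
                    <!-- The key returns the starter itself plus its continuation lines -->
                    <xsl:for-each select="key('LINE_INFO-by-section', $section-starter-id)">
                        <xsl:copy-of select="node()"/>
                    </xsl:for-each>
                </xsl:copy>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
```

Since xsl:copy-of makes a deep copy, an ADDTL_NOTE_DETAIL element keeps its wrapper in this version. Flattening it would require switching to xsl:apply-templates with an identity rule and a separate template for the wrapper.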
