I had a tough time formulating the question title. Maybe the example will make more sense.
Suppose I have an XML document that looks like this from system A:
<root>
<phone_numbers>
<phone_number type="work">123-WORK</phone_number>
<phone_number type="home">456-HOME</phone_number>
<phone_number type="work">789-WORK</phone_number>
<phone_number type="other">012-OTHER</phone_number>
</phone_numbers>
<email_addresses>
<email_address type="home">a@home</email_address>
<email_address type="other">b@other</email_address>
<email_address type="home">c@home</email_address>
<email_address type="work">d@work</email_address>
<email_address type="other">e@other</email_address>
<email_address type="other">f@other</email_address>
</email_addresses>
</root>
And I have to fit these into a structure like this so they can be used in system B:
<root>
<addresses>
<address name="work1">
<phone_number>123-WORK</phone_number>
<email_address>d@work</email_address>
</address>
<address name="work2">
<phone_number>789-WORK</phone_number>
</address>
<address name="other1">
<phone_number>012-OTHER</phone_number>
<email_address>b@other</email_address>
</address>
<address name="other2">
<email_address>e@other</email_address>
</address>
<address name="other3">
<email_address>f@other</email_address>
</address>
<address name="home1">
<phone_number>456-HOME</phone_number>
<email_address>a@home</email_address>
</address>
<address name="home2">
<email_address>c@home</email_address>
</address>
</addresses>
</root>
There can be any number (from 0 to infinity, as far as I know) of email addresses of each type. There can also be any number of phone numbers of each type, and the number of phone numbers of one type does not have to match the number of email addresses of the same type.
The email addresses and phone numbers in the first document aren't really related to each other, except that they are entered in the order they were added to system A.
I have to pair the emails and phone numbers up by type to fit into system B, and I would like to pair them so that the first phone number of type X is paired with the first email address of type X and so that no phone number of type X is paired with an email of a type other than X.
Since I have to pair them up, and since the order they were entered into the system is the closest I'll get to finding a relationship between the pairs, I would like to order them this way. I'll have to tell the users to go over the results, to make sure they make sense, but I have to pair them - no choice.
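To make the pairing rule concrete, here is a minimal sketch in Python (not XSLT, purely to illustrate the intended result). The function name pair_by_type and the tuple layout are my own assumptions; types are listed in order of first appearance, which differs slightly from the order in my sample output above.

```python
import xml.etree.ElementTree as ET
from itertools import zip_longest

SOURCE = """<root>
  <phone_numbers>
    <phone_number type="work">123-WORK</phone_number>
    <phone_number type="home">456-HOME</phone_number>
    <phone_number type="work">789-WORK</phone_number>
    <phone_number type="other">012-OTHER</phone_number>
  </phone_numbers>
  <email_addresses>
    <email_address type="home">a@home</email_address>
    <email_address type="other">b@other</email_address>
    <email_address type="home">c@home</email_address>
    <email_address type="work">d@work</email_address>
    <email_address type="other">e@other</email_address>
    <email_address type="other">f@other</email_address>
  </email_addresses>
</root>"""


def pair_by_type(xml_text):
    """Return (name, phone, email) tuples; a missing partner is None."""
    root = ET.fromstring(xml_text)
    phones, emails = {}, {}
    # Group values by @type, preserving the order they were entered.
    for el in root.iter("phone_number"):
        phones.setdefault(el.get("type"), []).append(el.text)
    for el in root.iter("email_address"):
        emails.setdefault(el.get("type"), []).append(el.text)
    # Distinct types in order of first appearance (phones, then emails).
    types = list(dict.fromkeys(list(phones) + list(emails)))
    result = []
    for t in types:
        # zip_longest pads the shorter list with None, so nothing is dropped.
        paired = zip_longest(phones.get(t, []), emails.get(t, []))
        for i, (phone, email) in enumerate(paired, start=1):
            result.append((f"{t}{i}", phone, email))
    return result


pairs = pair_by_type(SOURCE)
for name, phone, email in pairs:
    print(name, phone, email)
```

The n-th phone number of a type is only ever paired with the n-th email address of the same type, and the longer list decides how many address entries that type gets.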
To complicate matters, my actual XML document has more nodes that I'll need to merge with phone_numbers and email_addresses, and I have more than two @types.
One other note: I'm already calculating the maximum number of nodes with any given @type, so with my example docs, I know that the maximum number of <address> nodes of a single @type is three (three <email_address> nodes with @type=other = three <address> nodes with @name=otherX).
This transformation is much simpler (only three templates and no modes):
<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>

 <!-- Keys: every @type by its value, and phones/emails by their @type -->
 <xsl:key name="kTypeByVal" match="@type" use="."/>
 <xsl:key name="kPhNumByType" match="phone_number"
  use="@type"/>
 <xsl:key name="kAddrByType" match="email_address"
  use="@type"/>

 <!-- Muenchian grouping: keep only the first @type node for each distinct value -->
 <xsl:variable name="vallTypes" select=
  "/*/*/*/@type
            [generate-id()
            =
             generate-id(key('kTypeByVal',.)[1])
            ]"/>

 <xsl:template match="/">
  <root>
   <addresses>
    <xsl:apply-templates select="$vallTypes"/>
   </addresses>
  </root>
 </xsl:template>

 <!-- Called once per distinct type value -->
 <xsl:template match="@type">
  <xsl:variable name="vcurType" select="."/>
  <xsl:variable name="vPhoneNums" select="key('kPhNumByType',.)"/>
  <xsl:variable name="vAddresses" select="key('kAddrByType',.)"/>

  <!-- Exactly one operand of the union is non-empty: the longer node-set -->
  <xsl:variable name="vLonger" select=
   "$vPhoneNums[count($vPhoneNums) > count($vAddresses)]
   |
    $vAddresses[not(count($vPhoneNums) > count($vAddresses))]
   "/>

  <!-- One <address> per position in the longer node-set -->
  <xsl:for-each select="$vLonger">
   <xsl:variable name="vPos" select="position()"/>
   <address name="{$vcurType}{$vPos}">
    <xsl:apply-templates select="$vPhoneNums[position()=$vPos]"/>
    <xsl:apply-templates select="$vAddresses[position()=$vPos]"/>
   </address>
  </xsl:for-each>
 </xsl:template>

 <!-- Copy the element and its content, but not its type attribute -->
 <xsl:template match="phone_number|email_address">
  <xsl:copy>
   <xsl:copy-of select="node()"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>
When applied to the provided XML document (or any document with the described properties):
<root>
<phone_numbers>
<phone_number type="work">123-WORK</phone_number>
<phone_number type="home">456-HOME</phone_number>
<phone_number type="work">789-WORK</phone_number>
<phone_number type="other">012-OTHER</phone_number>
</phone_numbers>
<email_addresses>
<email_address type="home">a@home</email_address>
<email_address type="other">b@other</email_address>
<email_address type="home">c@home</email_address>
<email_address type="work">d@work</email_address>
<email_address type="other">e@other</email_address>
<email_address type="other">f@other</email_address>
</email_addresses>
</root>
the wanted, correct result is produced:
<root>
<addresses>
<address name="work1">
<phone_number>123-WORK</phone_number>
<email_address>d@work</email_address>
</address>
<address name="work2">
<phone_number>789-WORK</phone_number>
</address>
<address name="home1">
<phone_number>456-HOME</phone_number>
<email_address>a@home</email_address>
</address>
<address name="home2">
<email_address>c@home</email_address>
</address>
<address name="other1">
<phone_number>012-OTHER</phone_number>
<email_address>b@other</email_address>
</address>
<address name="other2">
<email_address>e@other</email_address>
</address>
<address name="other3">
<email_address>f@other</email_address>
</address>
</addresses>
</root>
Explanation:
1. All different values of the type attribute are collected in the $vallTypes variable, using the Muenchian method for grouping.
2. For every distinct value found in 1. above, one or more <address> elements are output as follows.
3. Two node-sets are captured in variables: one containing all phone_number elements that have this specific value of their type attribute, and another containing all email_address elements that have this specific value of their type attribute.
4. For every element of the longer of these two node-sets, an <address> element is generated whose name attribute is the concatenation of the current type and the current position().
5. Inside each <address>, one element (or, where both node-sets have an element at that position, a pair of elements) is copied to the final output, omitting the type attribute.
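For readers who don't know the Muenchian method, here is a rough Python analogue of steps 1 and 4 (an illustration only, not what the XSLT processor actually does). The names index, first_of_each, and names are hypothetical; the list index plays the role of xsl:key, and the identity test plays the role of the generate-id() comparison.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DOC = """<root>
  <phone_numbers>
    <phone_number type="work">123-WORK</phone_number>
    <phone_number type="home">456-HOME</phone_number>
    <phone_number type="work">789-WORK</phone_number>
    <phone_number type="other">012-OTHER</phone_number>
  </phone_numbers>
  <email_addresses>
    <email_address type="home">a@home</email_address>
    <email_address type="other">b@other</email_address>
    <email_address type="home">c@home</email_address>
    <email_address type="work">d@work</email_address>
    <email_address type="other">e@other</email_address>
    <email_address type="other">f@other</email_address>
  </email_addresses>
</root>"""

root = ET.fromstring(DOC)
# All elements carrying a type attribute, in document order.
typed = [el for el in root.iter() if el.get("type") is not None]

# Analogue of xsl:key: index the nodes by the value of their type.
index = defaultdict(list)
for el in typed:
    index[el.get("type")].append(el)

# Analogue of generate-id() = generate-id(key(...)[1]):
# keep a node only if it is the first node in its own group.
first_of_each = [el for el in typed if index[el.get("type")][0] is el]
distinct_types = [el.get("type") for el in first_of_each]

names = []
for t in distinct_types:
    phones = [e for e in index[t] if e.tag == "phone_number"]
    emails = [e for e in index[t] if e.tag == "email_address"]
    # Same choice as $vLonger: phones only if strictly longer, else emails.
    longer = phones if len(phones) > len(emails) else emails
    names.extend(f"{t}{pos}" for pos in range(1, len(longer) + 1))

print(distinct_types)
print(names)
```

The longer list is only used to count positions; the actual phone number and email address at each position are then looked up separately, which is why a type with no phone numbers still gets its address entries.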
This stylesheet:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <!-- Three groups: all contact info, phone numbers, and emails, each by type -->
    <xsl:key name="byType" match="/root/*/*" use="@type" />
    <xsl:key name="phoneByType" match="phone_numbers/phone_number"
        use="@type" />
    <xsl:key name="emailByType" match="email_addresses/email_address"
        use="@type" />

    <xsl:template match="/">
        <root>
            <addresses>
                <xsl:apply-templates />
            </addresses>
        </root>
    </xsl:template>

    <!-- Ignore every contact element that is not the first of its type -->
    <xsl:template match="/root/*/*" />

    <!-- First occurrence of each type: wrap the phones, then the leftover emails -->
    <xsl:template
        match="/root/*/*[generate-id()=generate-id(key('byType', @type)[1])]">
        <xsl:apply-templates select="key('phoneByType', @type)"
            mode="wrap" />
        <xsl:apply-templates
            select="key('emailByType', @type)
                        [position() > count(key('phoneByType', @type))]"
            mode="wrap" />
    </xsl:template>

    <!-- A phone number plus the email address at the same position, if any -->
    <xsl:template match="phone_numbers/phone_number" mode="wrap">
        <xsl:variable name="pos" select="position()" />
        <address name="{concat(@type, $pos)}">
            <xsl:apply-templates select="." mode="out" />
            <xsl:apply-templates select="key('emailByType', @type)[$pos]"
                mode="out" />
        </address>
    </xsl:template>

    <!-- An unpaired email address; numbering continues after the phones -->
    <xsl:template match="email_addresses/email_address" mode="wrap">
        <address
            name="{concat(@type,
                          position() + count(key('phoneByType', @type)))}">
            <xsl:apply-templates select="." mode="out" />
        </address>
    </xsl:template>

    <!-- Copy the element and its text, dropping the type attribute -->
    <xsl:template match="/root/*/*" mode="out">
        <xsl:copy>
            <xsl:apply-templates />
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>
On this input:
<root>
<phone_numbers>
<phone_number type="work">123-WORK</phone_number>
<phone_number type="home">456-HOME</phone_number>
<phone_number type="work">789-WORK</phone_number>
<phone_number type="other">012-OTHER</phone_number>
</phone_numbers>
<email_addresses>
<email_address type="home">a@home</email_address>
<email_address type="other">b@other</email_address>
<email_address type="home">c@home</email_address>
<email_address type="work">d@work</email_address>
<email_address type="other">e@other</email_address>
<email_address type="other">f@other</email_address>
<email_address type="test">g@other</email_address>
</email_addresses>
</root>
Produces:
<root>
<addresses>
<address name="work1">
<phone_number>123-WORK</phone_number>
<email_address>d@work</email_address>
</address>
<address name="work2">
<phone_number>789-WORK</phone_number>
</address>
<address name="home1">
<phone_number>456-HOME</phone_number>
<email_address>a@home</email_address>
</address>
<address name="home2">
<email_address>c@home</email_address>
</address>
<address name="other1">
<phone_number>012-OTHER</phone_number>
<email_address>b@other</email_address>
</address>
<address name="other2">
<email_address>e@other</email_address>
</address>
<address name="other3">
<email_address>f@other</email_address>
</address>
<address name="test1">
<email_address>g@other</email_address>
</address>
</addresses>
</root>
Explanation:
- There are three groups: 1) all contact info by type; 2) all phone numbers by type; 3) all email addresses by type
- The first group is used to get the first occurrence of each type
- Then we go through each of the phone numbers, pairing with any email address in the same position
- Finally, we account for all of the email addresses that did not have a corresponding phone number
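The last two bullets, and in particular how the numbering continues for unpaired email addresses, might be sketched like this in Python (illustration only; wrap_type is a hypothetical helper that handles the values of a single type):

```python
def wrap_type(type_name, phones, emails):
    """Build (name, phone, email) entries for one type; None marks a gap."""
    entries = []
    # Pass 1: every phone number, with the email at the same position if any.
    for pos, phone in enumerate(phones, start=1):
        email = emails[pos - 1] if pos <= len(emails) else None
        entries.append((f"{type_name}{pos}", phone, email))
    # Pass 2: emails beyond the phone count; numbering continues after
    # the phones, like position() + count(key('phoneByType', @type)).
    leftover = emails[len(phones):]
    for pos, email in enumerate(leftover, start=1):
        entries.append((f"{type_name}{pos + len(phones)}", None, email))
    return entries


other = wrap_type("other", ["012-OTHER"], ["b@other", "e@other", "f@other"])
test = wrap_type("test", [], ["g@other"])
work = wrap_type("work", ["123-WORK", "789-WORK"], ["d@work"])
for entry in other + test + work:
    print(entry)
```

A type with no phone numbers at all, like test in the input above, falls entirely into the second pass and still starts at 1.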