2011-09-23 65 views

A tricky XSLT transformation

I have loosely structured XHTML data that I need to convert into better-structured XML.

Here is an example:

<tbody> 
<tr> 
    <td class="header"><img src="http://www.abc.com/images/icon_apples.gif"/><img src="http://www.abc.com/images/flag/portugal.gif" alt="Portugal"/> First Grade</td> 
</tr> 
<tr> 
    <td>Green</td> 
    <td>Round shaped</td> 
    <td>Tasty</td> 
</tr> 
<tr> 
    <td>Red</td> 
    <td>Round shaped</td> 
    <td>Bitter</td> 
</tr> 
<tr> 
    <td>Pink</td> 
    <td>Round shaped</td> 
    <td>Tasty</td> 
</tr> 
<tr> 
    <td class="header"><img src="http://www.abc.com/images/icon_strawberries.gif"/><img src="http://www.abc.com/images/flag/usa.gif" alt="USA"/> Fifth Grade</td> 
</tr> 
<tr> 
    <td>Red</td> 
    <td>Heart shaped</td> 
    <td>Super tasty</td> 
</tr> 
<tr> 
    <td class="header"><img src="http://www.abc.com/images/icon_bananas.gif"/><img src="http://www.abc.com/images/flag/congo.gif" alt="Congo"/> Third Grade</td> 
</tr> 
<tr> 
    <td>Yellow</td> 
    <td>Smile shaped</td> 
    <td>Fairly tasty</td> 
</tr> 
<tr> 
    <td>Brown</td> 
    <td>Smile shaped</td> 
    <td>Too sweet</td> 
</tr> 
</tbody> 

I want to achieve the following structure:

<data> 
    <entry> 
     <type>Apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Green</color> 
     <shape>Round shaped</shape> 
     <taste>Tasty</taste> 
    </entry> 
    <entry> 
     <type>Apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Red</color> 
     <shape>Round shaped</shape> 
     <taste>Bitter</taste> 
    </entry> 
    <entry> 
     <type>Apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Pink</color> 
     <shape>Round shaped</shape> 
     <taste>Tasty</taste> 
    </entry> 
    <entry> 
     <type>Strawberries</type> 
     <country>USA</country> 
     <rank>Fifth Grade</rank> 
     <color>Red</color> 
     <shape>Heart shaped</shape> 
     <taste>Super tasty</taste> 
    </entry> 
    <entry> 
     <type>Bananas</type> 
     <country>Congo</country> 
     <rank>Third Grade</rank> 
     <color>Yellow</color> 
     <shape>Smile shaped</shape> 
     <taste>Fairly tasty</taste> 
    </entry> 
    <entry> 
     <type>Bananas</type> 
     <country>Congo</country> 
     <rank>Third Grade</rank> 
     <color>Brown</color> 
     <shape>Smile shaped</shape> 
     <taste>Too sweet</taste> 
    </entry> 
</data> 

First, I need to extract the fruit type from tbody/tr/td/img[1]/@src, second the country from the tbody/tr/td/img[2]/@alt attribute, and finally the grade from the text of tbody/tr/td itself.

Next, I need to emit every entry under each category, repeating these three values in each one (as shown above).

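But... as you can see, the data I have been given is very loosely structured. A category is just a single td, followed by all the items in that category. To make matters worse, in my dataset the number of items under each category varies between 1 and 100...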

I have tried several approaches but can't seem to get it right. Any help is greatly appreciated. I know XSLT 2.0 introduced xsl:for-each-group, but I am limited to XSLT 1.0.

Answer

3

In this case you are not actually grouping elements. It is more like ungrouping them.

One way to do this is to use an xsl:key that looks up the "header" row for each detail row.

<xsl:key name="fruity" 
    match="tr[not(td[@class='header'])]" 
    use="generate-id(preceding-sibling::tr[td[@class='header']][1])"/> 

That is, each detail row is indexed by the generated id of the nearest preceding header row.
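To check that the key files each detail row under the right header, a small diagnostic stylesheet can help. The following is only a sketch: it reuses the key definition above and, like the rest of this answer, assumes the input document has tbody as its root element. For each header row it prints the grade text and the number of detail rows the key returns for it.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Same key as above: detail rows indexed by the id of their nearest preceding header row -->
  <xsl:key name="fruity"
           match="tr[not(td[@class='header'])]"
           use="generate-id(preceding-sibling::tr[td[@class='header']][1])"/>

  <xsl:template match="/tbody">
    <!-- Visit only the header rows -->
    <xsl:for-each select="tr[td[@class='header']]">
      <!-- Grade text of the header cell, with surrounding whitespace trimmed -->
      <xsl:value-of select="normalize-space(td)"/>
      <xsl:text>: </xsl:text>
      <!-- Number of detail rows indexed under this header row -->
      <xsl:value-of select="count(key('fruity', generate-id()))"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input in the question, this should report 3 rows for First Grade, 1 for Fifth Grade and 2 for Third Grade.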

Next, you can match all the header cells like this:

<xsl:apply-templates select="tr/td[@class='header']"/> 

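In the matching template you can then extract the type, country and rank. Getting the associated detail rows is then simply a matter of looking up the parent row in the key: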

<xsl:apply-templates select="key('fruity', generate-id(..))"/> 

Here is the complete XSLT:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
    <xsl:output method="xml" indent="yes"/> 

    <xsl:key name="fruity" 
        match="tr[not(td[@class='header'])]" 
        use="generate-id(preceding-sibling::tr[td[@class='header']][1])"/> 

    <xsl:template match="/tbody"> 
        <data> 
            <!-- Match header rows --> 
            <xsl:apply-templates select="tr/td[@class='header']"/> 
        </data> 
    </xsl:template> 

    <xsl:template match="td"> 
        <!-- Match associated detail rows --> 
        <xsl:apply-templates select="key('fruity', generate-id(..))"> 
            <!-- Extract relevant parameters from the td cell --> 
            <xsl:with-param name="type" select="substring-before(substring-after(img[1]/@src, 'images/icon_'), '.gif')"/> 
            <xsl:with-param name="country" select="img[2]/@alt"/> 
            <xsl:with-param name="rank" select="normalize-space(text())"/> 
        </xsl:apply-templates> 
    </xsl:template> 

    <xsl:template match="tr"> 
        <xsl:param name="type"/> 
        <xsl:param name="country"/> 
        <xsl:param name="rank"/> 
        <entry> 
            <type> 
                <xsl:value-of select="$type"/> 
            </type> 
            <country> 
                <xsl:value-of select="$country"/> 
            </country> 
            <rank> 
                <xsl:value-of select="$rank"/> 
            </rank> 
            <color> 
                <xsl:value-of select="td[1]"/> 
            </color> 
            <shape> 
                <xsl:value-of select="td[2]"/> 
            </shape> 
            <taste> 
                <xsl:value-of select="td[3]"/> 
            </taste> 
        </entry> 
    </xsl:template> 
</xsl:stylesheet> 
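If you would rather not use a key, the same "ungrouping" can be sketched from the other direction: process only the detail rows and have each one look up its own header through the preceding-sibling axis. This is an alternative sketch, not a replacement for the stylesheet above; the variable name header is my own choice, and it again assumes tbody is the root element.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/tbody">
    <data>
      <!-- Process only detail rows; each one finds its own header -->
      <xsl:apply-templates select="tr[not(td[@class='header'])]"/>
    </data>
  </xsl:template>

  <xsl:template match="tr">
    <!-- Header cell of the nearest preceding header row -->
    <xsl:variable name="header"
                  select="preceding-sibling::tr[td[@class='header']][1]/td"/>
    <entry>
      <type>
        <xsl:value-of select="substring-before(substring-after($header/img[1]/@src, 'images/icon_'), '.gif')"/>
      </type>
      <country><xsl:value-of select="$header/img[2]/@alt"/></country>
      <rank><xsl:value-of select="normalize-space($header)"/></rank>
      <color><xsl:value-of select="td[1]"/></color>
      <shape><xsl:value-of select="td[2]"/></shape>
      <taste><xsl:value-of select="td[3]"/></taste>
    </entry>
  </xsl:template>
</xsl:stylesheet>
```

The trade-off is that every detail row walks back through its siblings to find its header, whereas the key is an index the processor can build once. With at most around 100 rows per category that is unlikely to matter, but the key version is probably the safer choice for large tables.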

When applied to your input document, this produces the following output:

<data> 
    <entry> 
     <type>apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Green</color> 
     <shape>Round shaped</shape> 
     <taste>Tasty</taste> 
    </entry> 
    <entry> 
     <type>apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Red</color> 
     <shape>Round shaped</shape> 
     <taste>Bitter</taste> 
    </entry> 
    <entry> 
     <type>apples</type> 
     <country>Portugal</country> 
     <rank>First Grade</rank> 
     <color>Pink</color> 
     <shape>Round shaped</shape> 
     <taste>Tasty</taste> 
    </entry> 
    <entry> 
     <type>strawberries</type> 
     <country>USA</country> 
     <rank>Fifth Grade</rank> 
     <color>Red</color> 
     <shape>Heart shaped</shape> 
     <taste>Super tasty</taste> 
    </entry> 
    <entry> 
     <type>bananas</type> 
     <country>Congo</country> 
     <rank>Third Grade</rank> 
     <color>Yellow</color> 
     <shape>Smile shaped</shape> 
     <taste>Fairly tasty</taste> 
    </entry> 
    <entry> 
     <type>bananas</type> 
     <country>Congo</country> 
     <rank>Third Grade</rank> 
     <color>Brown</color> 
     <shape>Smile shaped</shape> 
     <taste>Too sweet</taste> 
    </entry> 
</data> 
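Note that the type values come out in lowercase (apples), because they are cut straight from the image file name, while the structure requested in the question uses Apples. XSLT 1.0 has no upper-case function, but translate() can capitalize the first letter. The sketch below is standalone and only demonstrates the idea on the literal string 'apples'; the template name capitalize and the variables lower and upper are hypothetical names, and it only handles the letters a to z.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:variable name="lower" select="'abcdefghijklmnopqrstuvwxyz'"/>
  <xsl:variable name="upper" select="'ABCDEFGHIJKLMNOPQRSTUVWXYZ'"/>

  <!-- Hypothetical helper: upper-case the first character of $text -->
  <xsl:template name="capitalize">
    <xsl:param name="text"/>
    <xsl:value-of select="concat(translate(substring($text, 1, 1), $lower, $upper),
                                 substring($text, 2))"/>
  </xsl:template>

  <!-- Demonstration on a fixed string; the input document is ignored -->
  <xsl:template match="/">
    <type>
      <xsl:call-template name="capitalize">
        <xsl:with-param name="text" select="'apples'"/>
      </xsl:call-template>
    </type>
  </xsl:template>
</xsl:stylesheet>
```

In the full stylesheet, the same call could wrap $type inside the type element in place of the plain xsl:value-of.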

+1 for a good answer.