2015-11-11 85 views

Removing duplicate elements using distinct-values and XSLT 2.0

I am trying to solve a problem where I want to remove duplicates from a series of elements.

I have been playing around with this for a while, and the code below looks like something I thought would work, but I am getting an error:

XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is not a node

The XSLT:

<?xml version="1.0" encoding="UTF-8"?> 
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="xs" version="2.0"> 
    <xsl:strip-space elements="*"/> 
    <xsl:output method="xml" indent="yes"/> 

    <xsl:template match="/"> 

     <xsl:for-each select="distinct-values(/tobject/tobject.subject/@tobject.subject.refnum)"> 
      <xsl:copy-of select="."/> 
     </xsl:for-each> 

    </xsl:template> 
</xsl:stylesheet> 

The XML:

<?xml version="1.0" encoding="UTF-8"?> 
<tobject tobject.type="Utenriks"> 
    <tobject.property tobject.property.type="Nyheter"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/> 
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/> 
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/> 
</tobject> 

The wanted result:

<?xml version="1.0" encoding="UTF-8"?> 
<tobject tobject.type="Utenriks"> 
    <tobject.property tobject.property.type="Nyheter"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/> 
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/> 
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/> 
</tobject> 

Answers


the code below sort of looks like something I thought would work, but I am getting an error:

XPTY0020: Leading '/' cannot select the root node of the tree containing the context item: the context item is not a node

This error cannot be reproduced by running your code - see: http://xsltransform.net/gWvjQfa

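However, the result of distinct-values() is a sequence of atomic values, not nodes. Your expected result - removing duplicate elements - is easier to achieve using grouping: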

XSLT 2.0

<xsl:stylesheet version="2.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/> 
<xsl:strip-space elements="*"/> 

<xsl:template match="/tobject"> 
    <xsl:copy> 
     <xsl:copy-of select="@* | tobject.property"/> 
     <xsl:for-each-group select="tobject.subject" group-by="@tobject.subject.refnum"> 
      <xsl:copy-of select="current-group()[1]"/> 
     </xsl:for-each-group> 
    </xsl:copy> 
</xsl:template> 

</xsl:stylesheet> 
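
If you would rather keep distinct-values(), the following is a minimal sketch of one way to do it. Inside the xsl:for-each the context item is a string, not a node, so any path that needs a tree (for example one starting with "/") has nothing to navigate from, which is consistent with the XPTY0020 message quoted above. The sketch works around that by holding the elements in a variable first. The variable name subjects is my own choice, not something from your stylesheet, and note that the order of the values returned by distinct-values() is implementation-dependent, so the output order is not guaranteed to follow document order.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/tobject">
    <!-- Hypothetical variable: keeps the elements reachable once the
         context item has become an atomic value. -->
    <xsl:variable name="subjects" select="tobject.subject"/>
    <xsl:copy>
      <xsl:copy-of select="@* | tobject.property"/>
      <xsl:for-each select="distinct-values($subjects/@tobject.subject.refnum)">
        <!-- Here "." and current() are the refnum value, not a node.
             Copy the first element that carries this value. -->
        <xsl:copy-of select="$subjects[@tobject.subject.refnum = current()][1]"/>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The grouping version above remains the simpler choice, since xsl:for-each-group processes groups in order of first appearance and needs no extra variable.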

I. A shorter solution that is pure XSLT 1.0 and does not name any elements unnecessarily.

Also, it is no less efficient than the XSLT 2.0 solution that uses <xsl:for-each-group>, because here we use the Muenchian method for grouping:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 
<xsl:output omit-xml-declaration="yes" indent="yes"/> 
<xsl:strip-space elements="*"/> 
<xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/> 

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match=
    "tobject.subject[generate-id() != generate-id(key('kOS', @tobject.subject.refnum)[1])]"/>
</xsl:stylesheet> 

When this transformation is applied to the provided XML document:

<tobject tobject.type="Utenriks"> 
    <tobject.property tobject.property.type="Nyheter"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/> 
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/> 
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/> 
</tobject> 

the wanted, correct result is produced:

<tobject tobject.type="Utenriks"> 
    <tobject.property tobject.property.type="Nyheter"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" tobject.subject.type="økonomi og næringsliv"/> 
    <tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" tobject.subject.matter="olje og energi"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" tobject.subject.type="politikk"/> 
    <tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" tobject.subject.matter="valg"/> 
    <tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" tobject.subject.type="kriminalitet og rettsvesen"/> 
    <tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" tobject.subject.type="fritid"/> 
</tobject> 
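
The empty template in this transformation suppresses every tobject.subject that is not the first node the key returns for its refnum. The same test is often written with count() and a union instead of generate-id(). The following is only a sketch of that variant, reusing the key name kOS from above: the union of an element with the first node of its group contains two nodes exactly when the element is not that first node.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:key name="kOS" match="tobject.subject" use="@tobject.subject.refnum"/>

  <!-- Identity rule: copy every node and attribute as is. -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress an element when it differs from the first element
       that has the same refnum (the union then holds two nodes). -->
  <xsl:template match=
    "tobject.subject[count(. | key('kOS', @tobject.subject.refnum)[1]) = 2]"/>
</xsl:stylesheet>
```

Which of the two forms to use is mostly a matter of taste; both compare each element against the first member of its key group.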

II. A one-liner XPath 2.0 expression that selects the wanted unique elements (one from each group):

$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]] 

where $vElems must be defined as:

/*/tobject.subject 

When this XPath 2.0 expression is evaluated against the provided XML document, the wanted sequence of elements is selected:

<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04000000" 
      tobject.subject.type="økonomi og næringsliv"/> 
<tobject.subject tobject.subject.code="OKO" tobject.subject.refnum="04005000" 
      tobject.subject.matter="olje og energi"/> 
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11000000" 
      tobject.subject.type="politikk"/> 
<tobject.subject tobject.subject.code="POL" tobject.subject.refnum="11003000" 
      tobject.subject.matter="valg"/> 
<tobject.subject tobject.subject.code="KRE" tobject.subject.refnum="02000000" 
      tobject.subject.type="kriminalitet og rettsvesen"/> 
<tobject.subject tobject.subject.code="FRI" tobject.subject.refnum="10000000" 
      tobject.subject.type="fritid"/>
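
To show where such an expression could live, here is a minimal sketch of an XSLT 2.0 stylesheet that defines $vElems as a global variable and copies the selected elements. The surrounding template is my own illustration, not part of the expression itself, and it assumes that every tobject.subject has a tobject.subject.refnum attribute (index-of() needs exactly one search value). The predicate works because its value is a number: an element is kept only when its position in $vElems equals the first position at which its refnum occurs.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Defined as required by the expression above. -->
  <xsl:variable name="vElems" select="/*/tobject.subject"/>

  <xsl:template match="/*">
    <xsl:copy>
      <!-- Attributes and all children other than tobject.subject. -->
      <xsl:copy-of select="@* | *[not(self::tobject.subject)]"/>
      <!-- One tobject.subject per distinct refnum, in document order. -->
      <xsl:copy-of select=
        "$vElems[index-of($vElems/@tobject.subject.refnum, @tobject.subject.refnum)[1]]"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Note that this sketch writes the other children before the tobject.subject elements, which matches the provided document but would reorder a document in which they are interleaved.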