Collapse nodes if no other element is between them

I have this XML document:

<?xml version="1.0" encoding="iso-8859-1"?> 
<doclist> 
<text attribute="a">This is a <tag1>sentence</tag1> <tag1>with</tag1> a few    
<tag1>words</tag1>.</text> 
<!-- many more text nodes with none, one or several '<tag1>' in it --> 
</doclist>

and I would like to get this result:

<?xml version="1.0" encoding="iso-8859-1"?> 
<doclist> 
<text attribute="a">This is a <tag1>sentence with</tag1> a few <tag1>words</tag1>. 
</text> 
<!-- many more text nodes with none, one or several '<tag1>'s in it --> 
</doclist>

I tried to do it with a regular expression:

<xsl:template match="text"> 
<text> 
<xsl:apply-templates select="@*"/> <!-- templ. to copy attributes of text --> 
<xsl:analyze-string select="." 
regex="&lt;tag1>(.+)&lt;tag1>&lt;tag1>(.+)&lt;/tag1>"> 
<!-- also tried . instead of &lt; --> 
<xsl:matching-substring> 
<xsl:for-each select="."> 
<tag1> 
<xsl:value-of select="regex-group(1)"/> 
<xsl:text> </xsl:text> 
<xsl:value-of select="regex-group(2)"/> 
</tag1> 
</xsl:for-each> 
</xsl:matching-substring> 
<xsl:non-matching-substring> 
<xsl:for-each select="."> 
<xsl:value-of select="."/> 
</xsl:for-each> 
</xsl:non-matching-substring> 
</xsl:analyze-string> 
</text> 
</xsl:template>

But my output looks like this:

<?xml version="1.0" encoding="iso-8859-1"?> 
<doclist> 
<text attribute="a">This is a sentencewitha few words. 
</text> 
<!-- many more text nodes with none, one or several '<tag1>'s in it --> 
</doclist>

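My guess is that no matches are found, since no <tag1> elements appear in the result. But I don't understand why only the words surrounded by tags lose their whitespace... How can I correctly collapse <tag1> elements that are direct neighbours?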

Source

2013-07-03 Beehgr

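If you need to operate on nodes (mixed content of element nodes and text nodes), use xsl:for-each-group with group-adjacent; you can't use xsl:analyze-string to operate on element nodes. xsl:analyze-string works on the string value of the selected item, and that string contains no markup, so a regular expression looking for <tag1> never matches.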
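
To see why the regular expression finds nothing, it may help to look at what xsl:analyze-string actually receives. The following is only a minimal sketch; the text output method and the square brackets around each value are choices made for illustration, not something from the question. It prints the string value of each text element, which is the string that select="." hands to xsl:analyze-string:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain text output, so any markup in the value would show up literally -->
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//text">
      <!-- string(.) is the string value of the element: the text of all
           descendant text nodes concatenated, with no tags in it -->
      <xsl:value-of select="concat('[', string(.), ']', '&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input, the bracketed line should contain the words and their whitespace but no <tag1> at all, so a pattern such as &lt;tag1> has nothing to match.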

So I think

<xsl:stylesheet version="2.0" 
       xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> 

<xsl:template match="@* | node()"> 
    <xsl:copy> 
    <xsl:apply-templates select="@* , node()"/> 
    </xsl:copy> 
</xsl:template> 

<xsl:template match="text"> 
    <xsl:copy> 
    <xsl:apply-templates select="@*"/> 
    <xsl:for-each-group select="node()" group-adjacent="self::tag1 or self::text()[not(normalize-space())]"> 
     <xsl:choose> 
     <xsl:when test="current-grouping-key()"> 
      <tag1> 
      <xsl:apply-templates select="current-group()"/> 
      </tag1> 
     </xsl:when> 
     <xsl:otherwise> 
      <xsl:apply-templates select="current-group()"/> 
     </xsl:otherwise> 
     </xsl:choose> 
    </xsl:for-each-group> 
    </xsl:copy> 
</xsl:template> 

<xsl:template match="text/tag1"> 
    <xsl:apply-templates/> 
</xsl:template> 

</xsl:stylesheet>

should do. That stylesheet, when applied with Saxon 9, transforms the input

<doclist> 
<text attribute="a">This is a <tag1>sentence</tag1> <tag1>with</tag1> a few    
<tag1>words</tag1>.</text> 
<!-- many more text nodes with none, one or several '<tag1>' in it --> 
</doclist>

into the result

<doclist> 
<text attribute="a">This is a <tag1>sentence with</tag1> a few 
<tag1>words</tag1>.</text> 
<!-- many more text nodes with none, one or several '<tag1>' in it --> 
</doclist>

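and I think the approach should handle more complex input samples as well. But please test it yourself and report back; if there are problems, add more complex input samples to the question so that we can test them.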
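
As one possible further test, here is a hypothetical input (made up for illustration, not taken from the question). It has three adjacent tag1 elements, tag1 elements separated from their neighbours by ordinary words, and a text element without any tag1:

```xml
<doclist>
<!-- hypothetical test data: attribute values and words are invented -->
<text attribute="b">One <tag1>two</tag1> <tag1>three</tag1> <tag1>four</tag1> five <tag1>six</tag1> seven <tag1>eight</tag1>.</text>
<text attribute="c">No tagged words here.</text>
</doclist>
```

Going by the grouping rule, the stylesheet above should merge the first run into <tag1>two three four</tag1>, leave <tag1>six</tag1> and <tag1>eight</tag1> as they are, and copy the second text element unchanged. One case that seems worth checking separately: a whitespace-only text node also gets the key true, so a text element that contains nothing but whitespace would apparently end up wrapped in a tag1 as well.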

Source

2013-07-03 12:24:57

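Thx, I added the template matching tag1 and the use of group-adjacent (I didn't know about them). The output looks like the expected result. I wonder why the "or self::text()" part is needed. I looked it up in O'Reilly's XSLT book and tried it with "boolean(self::tag1)", but that did not collapse the elements. What is the difference between the two? Also, I'm curious why you use <xsl:copy> instead of the <text> element. Is that a style habit? – Beehgr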

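In the XPath data model that XSLT operates on, the input sequence '<tag1>sentence</tag1> <tag1>with</tag1>' results in a node sequence made up of an element node, a text node with a single whitespace character, and a second element node. We want to group all three of them together, which is why I used 'xsl:for-each-group select="node()" group-adjacent="self::tag1 or self::text()[not(normalize-space())]"'. With 'boolean(self::tag1)' alone, the whitespace text node gets a different key than the two elements, so they never end up in the same group. –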
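
The difference between the two keys can be made visible with a small diagnostic stylesheet. This is only a sketch: the text output and the "key count" line format are choices made for illustration. For each group it prints the grouping key followed by the number of nodes in that group:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="//text">
      <!-- To compare, replace the key below with boolean(self::tag1) -->
      <xsl:for-each-group select="node()"
          group-adjacent="self::tag1 or self::text()[not(normalize-space())]">
        <!-- One line per group: the boolean key, a space, the group size -->
        <xsl:value-of select="current-grouping-key(), count(current-group())"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:for-each-group>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input, this should report one group with key true and three nodes (the two tag1 elements plus the whitespace text node between them). With boolean(self::tag1) as the key, that whitespace node gets the key false, so the run should fall apart into single-node groups and nothing is merged.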

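As for using 'xsl:copy': if you have 'match="text"', then 'xsl:copy' creates a copy of the matched 'text' element, and it allows you to use the same template to process several elements; that is, if you have other elements that need the same treatment, you can use ' ...'. –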
