Finding the minimum value with XPath 1.0 does not work

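I want to find the minimum value of an element in an XML document (it is actually an HTML table converted to XML). However, this does not work as expected.

The query is similar to the one used in "How can I use XPath to find the minimum value of an attribute in a set of elements?". It looks like this: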

/table[@id="search-result-0"]/tbody/tr[ 
    not(substring-before(td[1], " ") > substring-before(../tr/td[1], " ")) 
]

The sample XML the query is run against:

<table class="tablesorter" id="search-result-0"> 
    <thead> 
     <tr> 
      <th class="header headerSortDown">Preis</th> 
      <th class="header headerSortDown">Zustand</th> 
     </tr> 
    </thead> 
    <tbody> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">20 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">25 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">35 CHF</td> 
      <td width="175px">Ausgepack und doch nie gebraucht</td> 
     </tr> 
     <tr> 
      <td width="45px">14 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
     <tr> 
      <td width="45px">15 CHF</td> 
      <td width="175px">Gebraucht, aber noch in Ordnung</td> 
     </tr> 
    </tbody> 
</table>

The query returns the following result:

<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Ausgepack und doch nie gebraucht</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">14 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr> 
----------------------- 
<tr> 
<td width="45px">15 CHF</td> 
<td width="175px">Gebraucht, aber noch in Ordnung</td> 
</tr>

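Why is more than one node returned? Since there is only one minimum value, only one node should be returned. Does anyone see what is wrong with the query? It should return only the node containing 14 CHF. Tested using http://xpath.online-toolz.com/tools/xpath-editor.php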

2014-09-22 str

In the meantime, I decided to use XSLT instead. This is the stylesheet I came up with:

<?xml version="1.0" encoding="UTF-8"?> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns="http://www.w3.org/1999/xhtml"> 

    <xsl:output method="text" omit-xml-declaration="yes" indent="no" encoding="UTF-8"/> 
    <xsl:strip-space elements="*"/> 

    <xsl:template match="//table[@id='search-result-0']/tbody"> 
     <ul> 
      <xsl:for-each select="tr/td[@width='45px']"> 
       <xsl:sort select="substring-before(., ' ')" data-type="number" order="ascending"/> 

       <xsl:if test="position() = 1"> 
        <xsl:value-of select="substring-before(., ' ')"/> 
       </xsl:if> 
      </xsl:for-each> 
     </ul> 
    </xsl:template> 

    <xsl:template match="text()"/> <!-- ignore the plain text --> 

</xsl:stylesheet>

2014-09-27 12:30:56 str

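The "minimum" XPath query you are using here only gives the right result when there are no duplicate values and the values were sorted before being written to the nodes. This is because it only compares the current value, substring-before(td[1], " "), with the first value found, substring-before(../tr/td[1], " "). To break down the comparisons: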

[1] not(15 > 15) 
[2] not(20 > 15) 
[3] not(25 > 15) 
[4] not(35 > 15) 
[5] not(14 > 15) 
[6] not(15 > 15) 
[7] not(15 > 15)

Comparisons 1, 5, 6, and 7 evaluate to true (the left-hand side is not greater than the right-hand side).
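The breakdown above can be replayed outside of XPath. The following is a minimal Python sketch, not an XPath engine: the helper names `substring_before` and `first_node_string` are made up for illustration and only mimic the one XPath 1.0 rule that matters here, namely that a node-set passed where a string is expected contributes only its first node.

```python
# Text of td[1] for each row of the sample table, in document order.
prices = ["15 CHF", "20 CHF", "25 CHF", "35 CHF", "14 CHF", "15 CHF", "15 CHF"]


def substring_before(text, sep):
    """Mimic XPath 1.0 substring-before() for a single string (hypothetical helper)."""
    head, found, _ = text.partition(sep)
    return head if found else ""


def first_node_string(node_set):
    """Mimic node-set-to-string conversion: only the first node is used."""
    return node_set[0] if node_set else ""


# Right-hand side: substring-before(../tr/td[1], " ") only ever sees row 1.
rhs = float(substring_before(first_node_string(prices), " "))

# Left-hand side: evaluated once per row; a row is kept when not(lhs > rhs).
selected = [
    row
    for row, text in enumerate(prices, start=1)
    if not (float(substring_before(text, " ")) > rhs)
]
print(selected)  # [1, 5, 6, 7]
```

Every row is compared against 15 only, which is why the three 15 CHF rows survive alongside the 14 CHF row.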

2014-09-22 21:15:19 TML

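You are right. Calling a function on a node-set only returns the result for the first node, not for the whole set. Any suggestions on how to solve this? – str 2014-09-23 10:00:59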

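@str I am tempted to say this is not possible in XPath 1.0. Can you manipulate the elements beforehand? If 'substring-before' could be a separate step performed before applying the XPath expression (so that only the number is left), then I have a solution. – 2014-09-23 10:18:57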

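I agree with Mathias. This *is* not possible in XPath 1.0 without changing the input XML. – Tomalak 2014-09-23 10:43:07

TML has already pointed out why your current path expression does not work, but has not suggested a viable alternative.

The reason is simple, as @Tomalak said:

I agree with Mathias. This *is* not possible in XPath 1.0 without changing the input XML.

I am adding this answer to detail how you would have to pre-process your XML before looking for the lowest amount of Swiss francs. Keep in mind: this is only so complicated because you are asking for a solution in XPath 1.0. With XPath 2.0, your problem could be solved with a single path expression.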

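XML design

I think your question illustrates why careful design of an XML format really matters. Why? Because your problem boils down to this: your XML is designed in a way that makes its content hard to process. More precisely, in a td element like this: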

<td width="45px">15 CHF</td>

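there is both an amount (a number) and a currency, and both sit in the same text node of the td element. If your XML input were designed in a smarter or more canonical way, it would look like this: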

<td width="45px" currency="CHF">15</td>

See the difference? Now the different kinds of content are clearly separated from each other.
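If the pre-processing has to happen in a host language rather than at the source, one possible shape for it is sketched below in Python with the standard library's xml.etree.ElementTree. This is an assumption-laden illustration: the function name `split_amount_and_currency` and the shortened `SAMPLE` document are invented here, and the text of every first td is assumed to have the form "amount, space, currency".

```python
import xml.etree.ElementTree as ET

# Shortened, hypothetical excerpt of the table from the question.
SAMPLE = """<table id="search-result-0"><tbody>
<tr><td width="45px">15 CHF</td><td width="175px">Ausgepack und doch nie gebraucht</td></tr>
<tr><td width="45px">14 CHF</td><td width="175px">Gebraucht, aber noch in Ordnung</td></tr>
</tbody></table>"""


def split_amount_and_currency(table):
    """Move the currency of every first <td> into a 'currency' attribute."""
    for cell in table.findall("./tbody/tr/td[1]"):
        amount, _, currency = (cell.text or "").strip().partition(" ")
        cell.text = amount              # e.g. "15"
        cell.set("currency", currency)  # e.g. "CHF"
    return table


table = split_amount_and_currency(ET.fromstring(SAMPLE))

# With clean numeric content, finding the cheapest cell is a plain min().
cells = table.findall("./tbody/tr/td[1]")
cheapest = min(cells, key=lambda cell: float(cell.text))
print(ET.tostring(cheapest, encoding="unicode"))
```

Once the amount is the only text content, an XPath 1.0 engine can also compare the cells directly, because no string function is needed any more.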

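Revised XPath

Assuming that in the newly designed XML the only content of a tr/td[1] element is the amount, the XPath expression by Pavel Minaev that you are using works: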

/table[@id="search-result-0"]/tbody/tr[not(td[1] > ../tr/td[1])][1]

XML result (tested with the tool you use)

<tr> 
<td width="45px">14</td> 
<td width="175px">Ausgepack und doch nie gebraucht</td> 
</tr>

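Why does Pavel's expression stop working just because I added substring-before?

You have already found part of the answer. It has to do with how sets of nodes are handled by XPath 1.0 functions.

substring-before() is an XPath 1.0 function that takes two arguments, both of which are strings. Most importantly, if a set of nodes is passed as the first argument of substring-before(), it is converted to a string first, and that conversion uses only the string-value of the first node in document order. All other nodes are ignored.

Pavel's answer, adapted to this question: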

tr[not(td[1] > ../tr/td[1])][1]

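relies on the fact that the second part of the expression, ../tr/td[1], finds the first td child of every tr element in the tbody element. No function is involved, and a node-set as an operand of > is not a problem: the comparison is true if it holds for at least one node in the set.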
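To make that comparison rule concrete, here is a small Python sketch of how "value > node-set" behaves in XPath 1.0. The list `amounts` stands for the already cleaned td[1] values and `greater_than_any` is a hypothetical helper name; this models only the comparison semantics, not a full XPath evaluation.

```python
# Cleaned td[1] values in document order (amount only, no currency).
amounts = [15, 20, 25, 35, 14, 15, 15]


def greater_than_any(value, node_set):
    """XPath 1.0 'value > node-set': true if value exceeds at least one member."""
    return any(value > other for other in node_set)


# tr[not(td[1] > ../tr/td[1])] keeps the rows that exceed no other row.
minimal_rows = [
    row
    for row, amount in enumerate(amounts, start=1)
    if not greater_than_any(amount, amounts)
]

# The trailing [1] keeps only the first of them, which matters for ties.
first_minimal = minimal_rows[:1]
print(minimal_rows, first_minimal)  # [5] [5]
```

With a tie at the minimum, for example amounts of 14, 20 and 14, the predicate alone would keep rows 1 and 3, and the trailing [1] reduces that to row 1.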

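If we need substring-before() because the text content holds both a number (which we want) and a currency (which we want to ignore), we have to wrap it around both parts of the expression: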

tr[not(substring-before(td[1],' ') > substring-before(../tr/td[1],' '))][1]

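There is no problem on the left-hand side of >, because the current tr has only one td[1]. But on the right-hand side there is a set of nodes, namely ../tr/td[1]. Sadly, substring-before() is only able to process the first of them.

See @TML's answer for the consequences of this.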

2014-09-23 16:29:42

Great elaboration and detail, Mathias. – TML 2014-09-23 18:59:16

I see. Since I cannot change the source document, I came up with an XSLT solution (see my answer). – str 2014-09-27 12:32:03
