Java XML JDOM2 XPath - Read text values from XML attributes and elements using XPath expressions

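A program should be able to read data from an XML file using XPath expressions. I have already started the project with JDOM2, and switching to another API is not wanted. The difficulty is that the program does not know beforehand whether it has to read an element or an attribute. Does the API provide any function that returns the content (as a string) when given nothing but the XPath expression? From what I know about XPath in JDOM2, it uses objects of different types to evaluate XPath expressions that point to attributes or to elements. I am only interested in the content of the attribute or element that the XPath expression points to.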

Here is an example XML file:

<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
    <book category="COOKING">
        <title lang="en">Everyday Italian</title>
        <author>Giada De Laurentiis</author>
        <year>2005</year>
        <price>30.00</price>
    </book>
    <book category="CHILDREN">
        <title lang="en">Harry Potter</title>
        <author>J K. Rowling</author>
        <year>2005</year>
        <price>29.99</price>
    </book>
    <book category="WEB">
        <title lang="en">XQuery Kick Start</title>
        <author>James McGovern</author>
        <author>Per Bothner</author>
        <author>Kurt Cagle</author>
        <author>James Linn</author>
        <author>Vaidyanathan Nagarajan</author>
        <year>2003</year>
        <price>49.99</price>
    </book>
    <book category="WEB">
        <title lang="en">Learning XML</title>
        <author>Erik T. Ray</author>
        <year>2003</year>
        <price>39.95</price>
    </book>
</bookstore>

This is what my program looks like:

package exampleprojectgroup;

import java.io.IOException;
import java.util.LinkedList;
import java.util.List;
import org.jdom2.Attribute;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.JDOMException;
import org.jdom2.filter.Filters;
import org.jdom2.input.SAXBuilder;
import org.jdom2.input.sax.XMLReaders;
import org.jdom2.xpath.XPathExpression;
import org.jdom2.xpath.XPathFactory;

public class ElementAttribute2String
{
    ElementAttribute2String()
    {
        run();
    }

    /* Entry point so the example can be started directly. */
    public static void main(String[] args)
    {
        new ElementAttribute2String();
    }

    public void run()
    {
        final String PATH_TO_FILE = "c:\\readme.xml";
        /* It is essential that the program has to work with a variable amount of XPath expressions. */
        LinkedList<String> xPathExpressions = new LinkedList<>();
        /* Simulate user input.
         * First XPath expression points to attribute,
         * second one points to element.
         * Many more expressions follow in a real situation.
         */
        xPathExpressions.add("/bookstore/book/@category");
        xPathExpressions.add("/bookstore/book/price");

        /* One list should be sufficient to store the result. */
        List<Element> elementsResult = null;
        List<Attribute> attributesResult = null;
        List<Object> objectsResult = null;
        try
        {
            SAXBuilder saxBuilder = new SAXBuilder(XMLReaders.NONVALIDATING);
            Document document = saxBuilder.build(PATH_TO_FILE);
            XPathFactory xPathFactory = XPathFactory.instance();
            for (String string : xPathExpressions)
            {
                /* Works only for elements, uncomment to give it a try. */
//              XPathExpression<Element> xPathToElement = xPathFactory.compile(string, Filters.element());
//              elementsResult = xPathToElement.evaluate(document);
//              for (Element element : elementsResult)
//              {
//                  System.out.println("Content of " + string + ": " + element.getText());
//              }

                /* Works only for attributes, uncomment to give it a try. */
//              XPathExpression<Attribute> xPathToAttribute = xPathFactory.compile(string, Filters.attribute());
//              attributesResult = xPathToAttribute.evaluate(document);
//              for (Attribute attribute : attributesResult)
//              {
//                  System.out.println("Content of " + string + ": " + attribute.getValue());
//              }

                /* I want to receive the content of the XPath expression as a string
                 * without having to know if it is an attribute or element beforehand.
                 */
                XPathExpression<Object> xPathExpression = xPathFactory.compile(string);
                objectsResult = xPathExpression.evaluate(document);
                for (Object object : objectsResult)
                {
                    if (object instanceof Attribute)
                    {
                        System.out.println("Content of " + string + ": " + ((Attribute) object).getValue());
                    }
                    else if (object instanceof Element)
                    {
                        System.out.println("Content of " + string + ": " + ((Element) object).getText());
                    }
                }
            }
        }
        catch (IOException ioException)
        {
            ioException.printStackTrace();
        }
        catch (JDOMException jdomException)
        {
            jdomException.printStackTrace();
        }
    }
}

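Another idea is to look for the "@" character in the XPath expression to determine whether it points to an attribute or to an element. This gives me the desired result, but I was hoping for a more elegant solution. Does the JDOM2 API provide anything useful for this problem? Could the code be redesigned to meet my requirements?

Thank you in advance!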

2016-10-20 Stefan

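XPath expressions are hard to type/cast because they have to be compiled in a system that is sensitive to the return types of the XPath functions and values in the expression. JDOM relies on third-party code to do that, and that third-party code has no mechanism to correlate those types at the time your JDOM code is compiled. Note that XPath expressions can return a number of different types of content, including String, boolean, Number, and Node-List-like content.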
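To make that range of result types concrete, here is a minimal sketch. It deliberately uses the JDK's built-in javax.xml.xpath API rather than JDOM2, purely so that it runs without any extra library; the class name XPathResultTypes, its helper methods, and the cut-down bookstore document are assumptions made for illustration. It also shows that, under XPath 1.0 rules, wrapping a path in string(...) yields the string-value of the first matched node, whether that node is an attribute or an element.

```java
import java.io.StringReader;
import javax.xml.namespace.QName;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK XPath API, not JDOM2.
final class XPathResultTypes {

    // Cut-down version of the bookstore document from the question.
    static final String XML =
            "<bookstore>"
            + "<book category=\"COOKING\"><price>30.00</price></book>"
            + "<book category=\"WEB\"><price>39.95</price></book>"
            + "</bookstore>";

    // Evaluates the expression against XML, asking for one specific result type.
    private static Object eval(String expression, QName type) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(XML)), type);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    static double asNumber(String expression) {
        return (Double) eval(expression, XPathConstants.NUMBER);
    }

    static boolean asBoolean(String expression) {
        return (Boolean) eval(expression, XPathConstants.BOOLEAN);
    }

    static String asString(String expression) {
        return (String) eval(expression, XPathConstants.STRING);
    }

    static int nodeCount(String expression) {
        return ((NodeList) eval(expression, XPathConstants.NODESET)).getLength();
    }

    public static void main(String[] args) {
        System.out.println(asNumber("count(/bookstore/book)"));             // 2.0
        System.out.println(asBoolean("/bookstore/book/@category = 'WEB'")); // true
        System.out.println(nodeCount("/bookstore/book/price"));             // 2
        // string() takes the first node in document order, attribute or element alike.
        System.out.println(asString("string(/bookstore/book/@category)"));  // COOKING
        System.out.println(asString("string(/bookstore/book/price)"));      // 30.00
    }
}
```

Whether the same string(...) wrapping behaves identically through JDOM2's XPath engine is not something this thread confirms, so treat it as something to verify rather than a given. It also only ever returns the first match, which may not suit expressions that select many nodes.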

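In most cases, the return type of an XPath expression is known before the expression is evaluated, and the programmer has the "right" cast or expectation in place for processing the results.

In your case, you don't, and the expression is more dynamic.

I recommend that you declare a helper function to process the content: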

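// Needs: import org.jdom2.Attribute; import org.jdom2.Content;
// Returns the string value of whatever an XPath evaluation produced.
private static String extractValue(Object source) {
    // Attribute does not extend Content, so it needs its own branch.
    if (source instanceof Attribute) {
        return ((Attribute) source).getValue();
    }
    // Element, Text, CDATA, Comment and the other node types extend Content.
    if (source instanceof Content) {
        return ((Content) source).getValue();
    }
    // Strings, numbers and booleans returned by XPath functions.
    return String.valueOf(source);
}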

That will at least neaten up your code and, if you use Java 8 streams, it can be quite compact:

// Needs: import java.util.List; import java.util.stream.Collectors;
List<String> values = xPathExpression.evaluate(document)
        .stream()
        .map(o -> extractValue(o))
        .collect(Collectors.toList());

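Note that, according to the XPath specification, the string-value of an Element node is the concatenation of the Element's own text() content and the content of all its child elements. So, in the following XML snippet: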

<a>bilbo <b>samwise</b> frodo</a>

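getValue() on the a element will return "bilbo samwise frodo", but getText() will return "bilbo  frodo" (the text of the child element b is skipped, so the spaces on either side of it remain). Choose carefully which mechanism you use to extract the value.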
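The distinction described above can be reproduced without JDOM2. The following sketch uses the JDK's built-in DOM instead, where getTextContent() plays the role of the XPath string-value and a manual loop over the direct text children plays the role of JDOM2's getText(). The class StringValueDemo and its method names are hypothetical, and the mapping onto the two JDOM2 methods is an analogy, not the same code path.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; uses the JDK DOM API, not JDOM2.
final class StringValueDemo {

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse: " + xml, e);
        }
    }

    // XPath-style string-value: text of the element and of all its descendants.
    static String stringValue(Element element) {
        return element.getTextContent();
    }

    // Only the text and CDATA nodes that are direct children of the element.
    static String directText(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        Element a = parseRoot("<a>bilbo <b>samwise</b> frodo</a>");
        System.out.println("[" + stringValue(a) + "]"); // [bilbo samwise frodo]
        System.out.println("[" + directText(a) + "]");  // [bilbo  frodo]
    }
}
```

The brackets in the output are only there to make the doubled space in the second result visible.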

2016-10-20 13:25:32 rolfl

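Is `Attribute` in JDOM2 a subclass of Content? http://www.jdom.org/docs/apidocs/org/jdom2/Attribute.html does not show that, so I am confused as to why your answer seems to suggest that `XPathExpression xPathExpression = xPathFactory.compile(xPathExpressions.get(i), Filters.content())` handles both elements and attributes.

Ah... crap. I forgot that Attribute is not a Content. It does have a `getValue()` method, which is what I was assuming. Let me think about it. – rolfl

I can't think of a better way to handle the ambiguous XPath result than to inspect it. If Element and Attribute nodes shared a common ancestor, JDOM could make things a little easier, but there are other reasons why that is not feasible. I have edited the answer to recommend extracting a function to neaten the code, rather than changing the basic mechanism the OP describes. – rolfl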

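I had exactly the same problem and took the approach of recognizing when an attribute is the focus of the XPath. I solved it with two functions. The first compiles the XPathExpression for later use: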

XPathExpression<?> xpExpression;
if (xpath.matches(".*/@[\\w]++$")) {
    // must be an attribute value we're after..
    xpExpression = xpfac.compile(xpath, Filters.attribute(), null, myNSpace);
} else {
    xpExpression = xpfac.compile(xpath, Filters.element(), null, myNSpace);
}

The second evaluates the expression and returns a value:

Object target = xpExpression.evaluateFirst(baseEl);
if (target != null) {
    String value = null;
    if (target instanceof Element) {
        Element targetEl = (Element) target;
        value = targetEl.getTextNormalize();
    } else if (target instanceof Attribute) {
        Attribute targetAt = (Attribute) target;
        value = targetAt.getValue();
    }
    // value now holds the text of the matched element or attribute
}

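I suspect it comes down to coding style whether you prefer the helper function suggested in the earlier answer or this approach. Either will work.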
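Before relying on the pattern-matching approach, it may help to see which expressions the regular expression from the first snippet actually recognizes. The sketch below is plain Java with no JDOM2 dependency; the class AttributeXPathCheck and the sample expressions are made up for illustration. It suggests that the pattern catches the common /path/@name form but misses several other valid ways of selecting attributes.

```java
import java.util.regex.Pattern;

// Hypothetical demo class wrapping the regular expression from the answer above.
final class AttributeXPathCheck {

    // Same pattern as in the answer: the expression must end in "/@" plus word characters.
    private static final Pattern ENDS_IN_ATTRIBUTE = Pattern.compile(".*/@[\\w]++$");

    static boolean looksLikeAttribute(String xpath) {
        return ENDS_IN_ATTRIBUTE.matcher(xpath).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "/bookstore/book/@category",          // recognized as an attribute
            "/bookstore/book/price",              // correctly treated as an element
            "/bookstore/book[@category='WEB']/price", // predicate only, still an element
            "@category",                          // relative attribute path: missed
            "//@*",                               // attribute wildcard: missed
            "/bookstore/book/title/@xml:lang",    // prefixed attribute name: missed
            "/bookstore/book/attribute::category" // unabbreviated axis: missed
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + looksLikeAttribute(sample));
        }
    }
}
```

If user-supplied expressions can take any of the missed forms, inspecting the type of each result object, as in the helper-function answer, avoids having to parse the expression at all.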

2017-01-18 21:55:37
