2008-10-27 87 views
42

How to get the XPath from an XmlNode instance

Could someone supply some code that gets the XPath of a System.Xml.XmlNode instance?

Thanks!

+0

Just to clarify: do you mean a list of node names from the root down to the node, separated by /? – 2008-10-27 20:18:30

+0

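Exactly. So something like "root/mycars/toyota/description/paragraph". There may be several paragraphs inside the description element, but I only want the XPath to point to the one that the XmlNode instance refers to. – joe 2008-10-27 20:28:04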

+2

People shouldn't just "ask for code"; they should supply some code they have at least attempted. – bgmCoder 2015-01-11 16:31:20

Answers

52

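Okay, I couldn't resist having a go at it. It only works for attributes and elements, but hey... what can you expect in 15 minutes? :) Likewise, there may well be a cleaner way of doing it.

Including the index on every element (particularly the root one!) is superfluous, but it's easier than trying to work out whether there would be any ambiguity otherwise.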

using System; 
using System.Text; 
using System.Xml; 

class Test 
{ 
    static void Main() 
    { 
     string xml = @" 
<root> 
    <foo /> 
    <foo> 
    <bar attr='value'/> 
    <bar other='va' /> 
    </foo> 
    <foo><bar /></foo> 
</root>"; 
     XmlDocument doc = new XmlDocument(); 
     doc.LoadXml(xml); 
     XmlNode node = doc.SelectSingleNode("//@attr"); 
     Console.WriteLine(FindXPath(node)); 
     Console.WriteLine(doc.SelectSingleNode(FindXPath(node)) == node); 
    } 

    static string FindXPath(XmlNode node) 
    { 
     StringBuilder builder = new StringBuilder(); 
     while (node != null) 
     { 
      switch (node.NodeType) 
      { 
       case XmlNodeType.Attribute: 
        builder.Insert(0, "/@" + node.Name); 
        node = ((XmlAttribute) node).OwnerElement; 
        break; 
       case XmlNodeType.Element: 
        int index = FindElementIndex((XmlElement) node); 
        builder.Insert(0, "/" + node.Name + "[" + index + "]"); 
        node = node.ParentNode; 
        break; 
       case XmlNodeType.Document: 
        return builder.ToString(); 
       default: 
        throw new ArgumentException("Only elements and attributes are supported"); 
      } 
     } 
     throw new ArgumentException("Node was not in a document"); 
    } 

    static int FindElementIndex(XmlElement element) 
    { 
     XmlNode parentNode = element.ParentNode; 
     if (parentNode is XmlDocument) 
     { 
      return 1; 
     } 
     XmlElement parent = (XmlElement) parentNode; 
     int index = 1; 
     foreach (XmlNode candidate in parent.ChildNodes) 
     { 
      if (candidate is XmlElement && candidate.Name == element.Name) 
      { 
       if (candidate == element) 
       { 
        return index; 
       } 
       index++; 
      } 
     } 
     throw new ArgumentException("Couldn't find element within parent"); 
    } 
} 
2

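There's no such thing as "the" XPath for a node. For any given node there may well be many XPath expressions that match it.

You can probably work your way up the tree to build an expression that will match it, taking into account the index of particular elements and so on, but it's not going to be terribly nice code.

Why do you need this? There may be a better solution.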

+0

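I am calling the API of an XML editing application. I need to tell the application to hide certain nodes, which I do by calling ToggleVisibleElement, which takes an XPath. I was hoping there was an easy way to do this. – joe 2008-10-27 20:26:12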

20

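Jon's correct that any number of XPath expressions will yield the same node in an instance document. The simplest way to build an expression that unambiguously yields one specific node is a chain of node tests that use the node's position in the predicate, e.g.:

/node()[1]/node()[2]/node()[6]/node()[1]/node()[2] 

Obviously, this expression isn't using element names, but then if all you're trying to do is locate a node within a document, you don't need its name. It also can't be used to find attributes (because attributes are not child nodes of their element and have no position; you can only find them by name), but it will find all other node types.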
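As a rough illustration of how such a position chain resolves, here is a minimal sketch. The class name `NodeTestChainDemo`, the method `FindSecondItemText` and the sample XML are made up for this example; only `XmlDocument.LoadXml` and `SelectSingleNode` come from System.Xml. It assumes the default `PreserveWhitespace = false`, so there are no whitespace-only text nodes to count.

```csharp
using System;
using System.Xml;

public static class NodeTestChainDemo
{
    // Hypothetical helper: follows a purely positional path down to a text node.
    public static XmlNode FindSecondItemText()
    {
        var doc = new XmlDocument();
        // Children of <root>: a comment (1), an <item> (2), another <item> (3).
        doc.LoadXml("<root><!--note--><item>first</item><item>second</item></root>");

        // /node()[1]  -> the root element (the document's only child here)
        // /node()[3]  -> the second <item> (the comment counts as position 1)
        // /node()[1]  -> the text node inside that <item>
        return doc.SelectSingleNode("/node()[1]/node()[3]/node()[1]");
    }

    public static void Main()
    {
        XmlNode node = FindSecondItemText();
        Console.WriteLine(node.NodeType); // Text
        Console.WriteLine(node.Value);    // second
    }
}
```

Note that the comment takes up a position of its own, which is exactly why the chain works for every child node type without needing a name.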

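To build this expression, you need to write a method that returns a node's position among its parent's child nodes, because XmlNode doesn't expose that as a property: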

static int GetNodePosition(XmlNode child) 
{ 
    for (int i=0; i<child.ParentNode.ChildNodes.Count; i++) 
    { 
     if (child.ParentNode.ChildNodes[i] == child) 
     { 
      // tricksy XPath, not starting its positions at 0 like a normal language 
      return i + 1; 
     } 
    } 
    throw new InvalidOperationException("Child node somehow not found in its parent's ChildNodes property."); 
} 

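(There's probably a more elegant way to do this using LINQ, since XmlNodeList implements IEnumerable, but I'm going with what I know here.)

Then you can write a recursive method like this: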
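For what it's worth, the LINQ variant hinted at above might look something like the following. This is only a sketch: `NodePositionSketch` and `GetNodePositionLinq` are names invented here, and it relies on `Enumerable.Cast<T>` to turn the non-generic `XmlNodeList` into an `IEnumerable<XmlNode>`.

```csharp
using System;
using System.Linq;
using System.Xml;

public static class NodePositionSketch
{
    // Returns the 1-based position of a node among its parent's child nodes.
    public static int GetNodePositionLinq(XmlNode child)
    {
        if (child.ParentNode == null)
        {
            throw new ArgumentException("Node has no parent.", nameof(child));
        }

        // Count the siblings that come before the node (reference comparison),
        // then add 1 because XPath positions start at 1.
        int precedingSiblings = child.ParentNode.ChildNodes
            .Cast<XmlNode>()
            .TakeWhile(n => n != child)
            .Count();
        return precedingSiblings + 1;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><a/>text<b/></root>");
        XmlNode b = doc.DocumentElement.LastChild;
        Console.WriteLine(GetNodePositionLinq(b)); // 3
    }
}
```

Whether this is actually more elegant than the plain loop is a matter of taste; both walk the sibling list once.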

static string GetXPathToNode(XmlNode node) 
{ 
    if (node.NodeType == XmlNodeType.Attribute) 
    { 
     // attributes have an OwnerElement, not a ParentNode; also they have 
     // to be matched by name, not found by position 
     return String.Format(
      "{0}/@{1}", 
      GetXPathToNode(((XmlAttribute)node).OwnerElement), 
      node.Name 
      );    
    } 
    if (node.ParentNode == null) 
    { 
     // the only node with no parent is the root node, which has no path 
     return ""; 
    } 
    // the path to a node is the path to its parent, plus "/node()[n]", where 
    // n is its position among its siblings. 
    return String.Format(
     "{0}/node()[{1}]", 
     GetXPathToNode(node.ParentNode), 
     GetNodePosition(node) 
     ); 
} 

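As you can see, I snuck in a way for it to find attributes too.

Jon slipped in with his version while I was writing mine. There's something about his code that's going to make me rant a bit now, and I apologize in advance if it sounds like I'm ragging on Jon. (I'm not. I'm pretty sure that the list of things Jon has to learn from me is exceedingly short.) But I think the point I'm going to make is a pretty important one for anyone who works with XML to think about.

I suspect that Jon's solution emerged from something I see a lot of developers do: thinking of XML documents as trees of elements and attributes. I think this largely comes from developers whose primary use of XML is as a serialization format, because all the XML they're used to using is structured this way. You can spot these developers because they use the terms "node" and "element" interchangeably. This leads them to come up with solutions that treat all other node types as special cases. (I was one of these guys myself for a very long time.)

This feels like a simplifying assumption while you're making it. But it's not. It makes problems harder and code more complex. It leads you to bypass the pieces of XML technology (like the node() function in XPath) that are specifically designed to treat all node types generically.

There's a red flag in Jon's code that would make me query it in a code review even if I didn't know what the requirements were, and that's GetElementsByTagName. Whenever I see that method in use, the question that leaps to mind is always "why does it have to be an element?" And the answer is very often "oh, does this code need to handle text nodes too?"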

0

This is even easier:

''' <summary> 
    ''' Gets the full XPath of a single node. 
    ''' </summary> 
    ''' <param name="node"></param> 
    ''' <returns></returns> 
    ''' <remarks></remarks> 
    Private Function GetXPath(ByVal node As Xml.XmlNode) As String 
     Dim temp As String 
     Dim sibling As Xml.XmlNode 
     Dim previousSiblings As Integer = 1 

     'I don't want to know that it was a generic document 
     If node.Name = "#document" Then Return "" 

     'Prime it 
     sibling = node.PreviousSibling 
     'Percolate up, counting all of this node's siblings before it. 
     While sibling IsNot Nothing 
      'Only count if the sibling has the same name as this node 
      If sibling.Name = node.Name Then 
       previousSiblings += 1 
      End If 
      sibling = sibling.PreviousSibling 
     End While 

     'Append this node's 1-based index among its same-named siblings. 
     ' previousSiblings starts at 1, so the condition below always holds and the index is always emitted. 
     temp = node.Name + IIf(previousSiblings > 0 OrElse node.NextSibling IsNot Nothing, "[" + previousSiblings.ToString() + "]", "").ToString() 

     If node.ParentNode IsNot Nothing Then 
      Return GetXPath(node.ParentNode) + "/" + temp 
     End If 

     Return temp 
    End Function 
3

My 10p worth is a hybrid of Robert's and Corey's answers. I can only claim credit for the actual typing of the extra lines of code.

private static string GetXPathToNode(XmlNode node) 
    { 
     if (node.NodeType == XmlNodeType.Attribute) 
     { 
      // attributes have an OwnerElement, not a ParentNode; also they have 
      // to be matched by name, not found by position 
      return String.Format(
       "{0}/@{1}", 
       GetXPathToNode(((XmlAttribute)node).OwnerElement), 
       node.Name 
       ); 
     } 
     if (node.ParentNode == null) 
     { 
      // the only node with no parent is the root node, which has no path 
      return ""; 
     } 
     //get the index 
     int iIndex = 1; 
     XmlNode xnIndex = node; 
     while (xnIndex.PreviousSibling != null) { iIndex++; xnIndex = xnIndex.PreviousSibling; } 
     // the path to a node is the path to its parent, plus "/node()[n]", where 
     // n is its position among its siblings. 
     return String.Format(
      "{0}/node()[{1}]", 
      GetXPathToNode(node.ParentNode), 
      iIndex 
      ); 
    } 
1

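If you do it this way, you get a path with the names of the nodes plus their positions when several nodes share the same name, like this: "/Service[1]/System[1]/Group[1]/Folder[2]/File[2]"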

public string GetXPathToNode(XmlNode node) 
{   
    if (node.NodeType == XmlNodeType.Attribute) 
    {    
     // attributes have an OwnerElement, not a ParentNode; also they have    
     // to be matched by name, not found by position    
     return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name); 
    } 
    if (node.ParentNode == null) 
    {    
     // the only node with no parent is the root node, which has no path 
     return ""; 
    } 

    //get the index 
    int iIndex = 1; 
    XmlNode xnIndex = node; 
    while (xnIndex.PreviousSibling != null && xnIndex.PreviousSibling.Name == xnIndex.Name) 
    { 
     iIndex++; 
     xnIndex = xnIndex.PreviousSibling; 
    } 

    // the path to a node is the path to its parent, plus "/name[n]", where 
    // n is one plus the number of same-named siblings immediately preceding it.   
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, iIndex); 
} 
1

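I found that none of the above worked with XDocument, so I wrote my own code to support XDocument, using recursion. I think this code handles multiple identical nodes better than some of the other code here, because it first tries to go as deep into the XML path as it can and then backs up to build only what is needed. So if you have /home/white/bob and /home/white/mike and you want to create /home/white/bob/garage, the code will know how to create that. However, I didn't want to mess with predicates or wildcards, so I explicitly disallowed those; it would be easy to add support for them, though.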

Private Sub NodeItterate(XDoc As XElement, XPath As String) 
    'get the deepest path 
    Dim nodes As IEnumerable(Of XElement) 

    nodes = XDoc.XPathSelectElements(XPath) 

    'if it doesn't exist, try the next shallow path 
    If nodes.Count = 0 Then 
     NodeItterate(XDoc, XPath.Substring(0, XPath.LastIndexOf("/"))) 
     'by this time all the required parent elements will have been constructed 
     Dim ParentPath As String = XPath.Substring(0, XPath.LastIndexOf("/")) 
     Dim ParentNode As XElement = XDoc.XPathSelectElement(ParentPath) 
     Dim NewElementName As String = XPath.Substring(XPath.LastIndexOf("/") + 1, XPath.Length - XPath.LastIndexOf("/") - 1) 
     ParentNode.Add(New XElement(NewElementName)) 
    End If 

    'if we find there are more than 1 elements at the deepest path we have access to, we can't proceed 
    If nodes.Count > 1 Then 
     Throw New ArgumentOutOfRangeException("There are too many paths that match your expression.") 
    End If 

    'if there is just one element, we can proceed 
    If nodes.Count = 1 Then 
     'just proceed 
    End If 

End Sub 

Public Sub CreateXPath(ByVal XDoc As XElement, ByVal XPath As String) 

    If XPath.Contains("//") Or XPath.Contains("*") Or XPath.Contains(".") Then 
     Throw New ArgumentException("Can't create a path based on searches, wildcards, or relative paths.") 
    End If 

    If Regex.IsMatch(XPath, "[\[\]()@='<>|]") Then 
     Throw New ArgumentException("Can't create a path based on predicates.") 
    End If 

    'we will process this recursively. 
    NodeItterate(XDoc, XPath) 

End Sub 
3

Here's a simple method that I've used; it worked for me.

static string GetXpath(XmlNode node) 
    { 
     if (node.Name == "#document") 
      return String.Empty; 
     return GetXpath(node.SelectSingleNode("..")) + "/" + (node.NodeType == XmlNodeType.Attribute ? "@":String.Empty) + node.Name; 
    } 
5

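I know, old post, but the version I liked the most (the one with names) was flawed: when a parent node has child nodes with different names, it stops counting the index as soon as it finds the first sibling whose name doesn't match.

Here is my fixed version of it:

For use as a class extension: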
/// <summary> 
/// Gets the X-Path to a given Node 
/// </summary> 
/// <param name="node">The Node to get the X-Path from</param> 
/// <returns>The X-Path of the Node</returns> 
public string GetXPathToNode(XmlNode node) 
{ 
    if (node.NodeType == XmlNodeType.Attribute) 
    { 
     // attributes have an OwnerElement, not a ParentNode; also they have    
     // to be matched by name, not found by position    
     return String.Format("{0}/@{1}", GetXPathToNode(((XmlAttribute)node).OwnerElement), node.Name); 
    } 
    if (node.ParentNode == null) 
    { 
     // the only node with no parent is the root node, which has no path 
     return ""; 
    } 

    // Get the Index 
    int indexInParent = 1; 
    XmlNode siblingNode = node.PreviousSibling; 
    // Loop thru all Siblings 
    while (siblingNode != null) 
    { 
     // Increase the Index if the Sibling has the same Name 
     if (siblingNode.Name == node.Name) 
     { 
      indexInParent++; 
     } 
     siblingNode = siblingNode.PreviousSibling; 
    } 

    // the path to a node is the path to its parent, plus "/name[n]", where n is its position among its same-named siblings.   
    return String.Format("{0}/{1}[{2}]", GetXPathToNode(node.ParentNode), node.Name, indexInParent); 
} 
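To make the difference concrete, here is a small sketch that compares the two counting strategies on a document where a differently named sibling sits between two same-named ones. The class and method names (`SiblingIndexDemo`, `IndexStoppingEarly`, `IndexCountingAll`) are invented for this illustration; the loops mirror the two versions quoted in this thread.

```csharp
using System;
using System.Xml;

public static class SiblingIndexDemo
{
    // Mirrors the flawed loop: stops at the first sibling with a different name.
    public static int IndexStoppingEarly(XmlNode node)
    {
        int index = 1;
        XmlNode current = node;
        while (current.PreviousSibling != null
               && current.PreviousSibling.Name == current.Name)
        {
            index++;
            current = current.PreviousSibling;
        }
        return index;
    }

    // Mirrors the fixed loop: walks every preceding sibling and counts matches.
    public static int IndexCountingAll(XmlNode node)
    {
        int index = 1;
        for (XmlNode s = node.PreviousSibling; s != null; s = s.PreviousSibling)
        {
            if (s.Name == node.Name)
            {
                index++;
            }
        }
        return index;
    }

    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><a/><b/><a/></root>");
        XmlNode secondA = doc.DocumentElement.LastChild;

        Console.WriteLine(IndexStoppingEarly(secondA)); // 1
        Console.WriteLine(IndexCountingAll(secondA));   // 2
        Console.WriteLine(doc.SelectSingleNode("/root[1]/a[2]") == secondA); // True
    }
}
```

With the early-stopping loop the generated path would be /root[1]/a[1], which selects the first a element rather than the one that was passed in.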
1

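What about this? ;) My version (built on other people's work) uses the syntax name[index], with the index omitted if the element has no same-named "siblings". The loop that gets the element index lives outside, in a separate routine (also a class extension).

Just paste the following into any static utility class (or into the main Program class):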

static public int GetRank(this XmlNode node) 
{ 
    // return 0 if unique, else return position 1...n in siblings with same name 
    try 
    { 
     if(node is XmlElement) 
     { 
      int rank = 1; 
      bool alone = true, found = false; 

      foreach(XmlNode n in node.ParentNode.ChildNodes) 
       if(n.Name == node.Name) // sibling with same name 
       { 
        if(n.Equals(node)) 
        { 
         if(! alone) return rank; // no need to continue 
         found = true; 
        } 
        else 
        { 
         if(found) return rank; // no need to continue 
         alone = false; 
         rank++; 
        } 
       } 

     } 
    } 
    catch{} 
    return 0; 
} 

static public string GetXPath(this XmlNode node) 
{ 
    try 
    { 
     if(node is XmlAttribute) 
      return String.Format("{0}/@{1}", (node as XmlAttribute).OwnerElement.GetXPath(), node.Name); 

     if(node is XmlText || node is XmlCDataSection) 
      return node.ParentNode.GetXPath(); 

     if(node.ParentNode == null) // the only node with no parent is the root node, which has no path 
      return ""; 

     int rank = node.GetRank(); 
     if(rank == 0) return String.Format("{0}/{1}",  node.ParentNode.GetXPath(), node.Name); 
     else   return String.Format("{0}/{1}[{2}]", node.ParentNode.GetXPath(), node.Name, rank); 
    } 
    catch{} 
    return ""; 
} 
1

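I produced some VBA for Excel to do this for a work project. It outputs tuples of an XPath and the associated text of an element or attribute. The purpose was to let business analysts identify and map some XML. I appreciate that this is a C# forum, but I thought it might be of interest.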

Sub Parse2(oSh As Long, inode As IXMLDOMNode, Optional iXstring As String = "", Optional indexes) 


Dim chnode As IXMLDOMNode 
Dim attr As IXMLDOMAttribute 
Dim oXString As String 
Dim chld As Long 
Dim idx As Variant 
Dim addindex As Boolean 
chld = 0 
idx = 0 
addindex = False 


'determine the node type: 
Select Case inode.NodeType 

    Case NODE_ELEMENT 
     If inode.ParentNode.NodeType = NODE_DOCUMENT Then 'This gets the root node name but ignores all the namespace attributes 
      oXString = iXstring & "//" & fp(inode.nodename) 
     Else 

      'Need to deal with indexing. Where an element has siblings with the same nodeName,it needs to be indexed using [index], e.g swapstreams or schedules 

      For Each chnode In inode.ParentNode.ChildNodes 
       If chnode.NodeType = NODE_ELEMENT And chnode.nodename = inode.nodename Then chld = chld + 1 
      Next chnode 

      If chld > 1 Then '//inode has siblings of the same nodeName, so needs to be indexed 
       'Lookup the index from the indexes array 
       idx = getIndex(inode.nodename, indexes) 
       addindex = True 
      Else 
      End If 

      'build the XString 
      oXString = iXstring & "/" & fp(inode.nodename) 
      If addindex Then oXString = oXString & "[" & idx & "]" 

      'If type is element then check for attributes 
      For Each attr In inode.Attributes 
       'If the element has attributes then extract the data pair XString + Element.Name, @Attribute.Name=Attribute.Value 
       Call oSheet(oSh, oXString & "/@" & attr.Name, attr.Value) 
      Next attr 

     End If 

    Case NODE_TEXT 
     'build the XString 
     oXString = iXstring 
     Call oSheet(oSh, oXString, inode.NodeValue) 

    Case NODE_ATTRIBUTE 
    'Do nothing 
    Case NODE_CDATA_SECTION 
    'Do nothing 
    Case NODE_COMMENT 
    'Do nothing 
    Case NODE_DOCUMENT 
    'Do nothing 
    Case NODE_DOCUMENT_FRAGMENT 
    'Do nothing 
    Case NODE_DOCUMENT_TYPE 
    'Do nothing 
    Case NODE_ENTITY 
    'Do nothing 
    Case NODE_ENTITY_REFERENCE 
    'Do nothing 
    Case NODE_INVALID 
    'do nothing 
    Case NODE_NOTATION 
    'do nothing 
    Case NODE_PROCESSING_INSTRUCTION 
    'do nothing 
End Select 

'Now call Parse2 on each of inode's children. 
If inode.HasChildNodes Then 
    For Each chnode In inode.ChildNodes 
     Call Parse2(oSh, chnode, oXString, indexes) 
    Next chnode 
Set chnode = Nothing 
Else 
End If 

End Sub 

The count of elements is managed using:

Function getIndex(tag As Variant, indexes) As Variant 
'Function to get the latest index for an xml tag from the indexes array 
'indexes array is passed from one parser function to the next up and down the tree 

Dim i As Integer 
Dim n As Integer 

If IsArrayEmpty(indexes) Then 
    ReDim indexes(1, 0) 
    indexes(0, 0) = "Tag" 
    indexes(1, 0) = "Index" 
Else 
End If 
For i = 0 To UBound(indexes, 2) 
    If indexes(0, i) = tag Then 
     'tag found, increment and return the index then exit 
     'also destroy all recorded tag names BELOW that level 
     indexes(1, i) = indexes(1, i) + 1 
     getIndex = indexes(1, i) 
     ReDim Preserve indexes(1, i) 'should keep all tags up to i but remove all below it 
     Exit Function 
    Else 
    End If 
Next i 

'tag not found so add the tag with index 1 at the end of the array 
n = UBound(indexes, 2) 
ReDim Preserve indexes(1, n + 1) 
indexes(0, n + 1) = tag 
indexes(1, n + 1) = 1 
getIndex = 1 

End Function 
0

Another solution to your problem may be to "mark" the XmlNodes that you will want to identify later with a custom attribute:

var id = _currentNode.OwnerDocument.CreateAttribute("some_id"); 
id.Value = Guid.NewGuid().ToString(); 
_currentNode.Attributes.Append(id); 

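You can store the id value in a dictionary, for example. Later you can find the marked node with an XPath query:

newOrOldDocument.SelectSingleNode(string.Format("//*[contains(@some_id,'{0}')]", id.Value)); 

I know this is not a direct answer to your question, but it can help if the reason you want to know the XPath of a node is to have a way of "reaching" the node later, after you have lost the reference to it in code.

This also overcomes the problems that arise when elements are added to or moved within the document, which can mess up the XPath (or the indexes, as suggested in other answers).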
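Put together, the marking approach could be sketched roughly as follows. The attribute name `some_id` comes from the snippet above; the class name `MarkerAttributeDemo`, the method `MarkAndFindAfterChange` and the sample XML are assumptions made for this example. It uses an exact match (`@some_id='...'`) rather than `contains`, which is a deliberate substitution to avoid partial matches.

```csharp
using System;
using System.Xml;

public static class MarkerAttributeDemo
{
    // Marks an element with a GUID, changes the document structure,
    // then looks the element up again by its marker.
    public static bool MarkAndFindAfterChange()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<root><item>a</item><item>b</item></root>");
        var target = (XmlElement)doc.DocumentElement.LastChild;

        // Tag the element; the GUID string contains no quote characters,
        // so it is safe to embed in the XPath literal below.
        string id = Guid.NewGuid().ToString();
        target.SetAttribute("some_id", id);

        // A structural change that would invalidate an index-based path.
        doc.DocumentElement.PrependChild(doc.CreateElement("item"));

        XmlNode found = doc.SelectSingleNode(
            string.Format("//*[@some_id='{0}']", id));
        return found == target;
    }

    public static void Main()
    {
        Console.WriteLine(MarkAndFindAfterChange()); // True
    }
}
```

The trade-off is that the document itself is modified, so this only fits cases where an extra attribute is acceptable (or can be stripped before saving).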

0
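public static string GetFullPath(this XmlNode node) 
{ 
    if (node.ParentNode == null) 
    { 
        // a node without a parent (the document node, a detached node, 
        // or an attribute) contributes nothing to the path 
        return ""; 
    } 
    // XPath steps are separated by "/", and each step names the node itself 
    return $"{GetFullPath(node.ParentNode)}/{node.Name}"; 
}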