Problem parsing an XML document in C#

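I want to get the InnerText of specific elements in an XML document that is passed in as a string, and I don't know why no nodes are being found.

The code runs without errors but never enters the foreach loops, because ocNodesCompany and ocNodesOrgs both contain zero elements. Why doesn't GetElementsByTagName find the nodes?

By the way, I also tried:

XmlNodeList ocNodesOrgs = thisXmlDoc.SelectNodes("//OpenCalaisSimple/CalaisSimpleOutputFormat/Company");

Code: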

public static ArrayList getTwitterHandles(String ocXML) 
{ 
    ArrayList thisList = new ArrayList(); 
    XmlDocument thisXmlDoc = new XmlDocument(); 
    thisXmlDoc.LoadXml(ocXML); 

    //get Companies 
    XmlNodeList ocNodesCompany = thisXmlDoc.GetElementsByTagName("Company"); 
    foreach (XmlElement element in ocNodesCompany) 
    { 
     thisList.Add(element.InnerText); 
    } 

    //Get Organisations 
    XmlNodeList ocNodesOrgs = thisXmlDoc.GetElementsByTagName("Organization"); 
    foreach (XmlElement element in ocNodesOrgs) 
    { 
     thisList.Add(element.InnerText); 
    } 

    //Get Organisations 

    return thisList; 
}

My XML string is:

<!--Use of the Calais Web Service is governed by the Terms of Service located at http://www.opencalais.com. By using this service or the results of the service you agree to these terms of service.--><!-- Company: BBC,T-mobile,Vodafone,GE, IndustryTerm: open calais services, Organization: Federal Bureau of Investigation,Red Cross,Greenpeace,Royal Navy,--> 

<OpenCalaisSimple> 
    <Description> 
     <calaisRequestID>38cb8898-48ba-85ff-12e9-f8d629568428</calaisRequestID> 
     <id>http://id.opencalais.com/lt0Hf8XWIr2DNIJzNlaXlA</id> 
     <about>http://d.opencalais.com/dochash-1/ff929eb2-de43-3ed1-8ee4-6109abf6bf77</about> 
     <docTitle/> 
     <docDate>2011-03-10 06:36:08.646</docDate> 
     <externalMetadata/> 
    </Description> 
    <CalaisSimpleOutputFormat> 
     <Company count="1" relevance="0.603" normalized="British Broadcasting Corporation">BBC</Company> 
     <Company count="1" relevance="0.603" normalized="T-MOBILE NETHERLANDS HOLDING B.V.">T-mobile</Company> 
     <Company count="1" relevance="0.603" normalized="Vodafone Group Plc">Vodafone</Company> 
     <Company count="1" relevance="0.603" normalized="General Electric Company">GE</Company> 
     <IndustryTerm count="1" relevance="0.603">open calais services</IndustryTerm> 
     <Organization count="1" relevance="0.603">Red Cross</Organization> 
     <Organization count="1" relevance="0.603">Greenpeace</Organization> 
     <Organization count="1" relevance="0.603">Royal Navy</Organization> 
     <Topics> 
      <Topic Taxonomy="Calais" Score="0.899">Human Interest</Topic> 
      <Topic Taxonomy="Calais" Score="0.694">Technology_Internet</Topic> 
     </Topics> 
    </CalaisSimpleOutputFormat> 
</OpenCalaisSimple>

Source

2011-03-11 Ben Drury

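It looks like you should use an XPath query to get the elements you want. You can read about it here: http://msdn.microsoft.com/en-us/library/bb156083.aspx – madcyree 2011-03-11 09:25:34

This may be a silly question, but have you inspected the InnerXml of the XmlDocument object to verify that the string passed in as a parameter was loaded correctly? – FarligOpptreden 2011-03-11 09:32:14

I tested this, and the code seems to work correctly. Check that the ocXML parameter contains the right data and confirm that it was loaded correctly. – mgronber 2011-03-11 09:41:39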
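The check suggested in the two comments above could be sketched roughly as follows. This is only an illustration: the class and method names (XmlLoadCheck, Describe) are hypothetical and do not appear in the original post. It reports which root element XmlDocument actually parsed and how many Company and Organization elements it can see.

```csharp
using System;
using System.Xml;

public static class XmlLoadCheck
{
    // Hypothetical helper (not from the original post): summarizes what
    // XmlDocument parsed from the input string.
    public static string Describe(string ocXml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(ocXml); // throws XmlException if the string is not well-formed XML

        string root = doc.DocumentElement != null ? doc.DocumentElement.Name : "(none)";
        int companies = doc.GetElementsByTagName("Company").Count;
        int organizations = doc.GetElementsByTagName("Organization").Count;

        return string.Format("root={0}, Company={1}, Organization={2}",
                             root, companies, organizations);
    }
}
```

If the counts come back as zero for a string that should contain these elements, the string reaching the method is probably not the XML shown in the question.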

It seems you should use an XPath query to get the elements you want. You can read about it here.

Source

2011-03-11 09:25:34 madcyree

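This is not an answer; it is a comment. – 2011-03-11 09:26:51

This should be a comment, not an answer. – Manatherin 2011-03-11 09:28:07

I was going to say the same. This warrants a downvote. – FarligOpptreden 2011-03-11 09:28:17

Note that Microsoft also recommends using XPath. Here is their help page for the GetElementsByTagName method; see the note in the middle of the page, which recommends using SelectNodes (XPath) instead.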

http://msdn.microsoft.com/en-us/library/dc0c9ekk.aspx

Your method, rewritten to use XPath, would be:

public static ArrayList getTwitterHandles(String ocXML) 
{ 
    ArrayList thisList = new ArrayList(); 
    XmlDocument thisXmlDoc = new XmlDocument(); 
    thisXmlDoc.LoadXml(ocXML); 

    //get Companies 
    XmlNodeList ocNodesCompany = thisXmlDoc.SelectNodes("//Company"); 
    foreach (XmlElement element in ocNodesCompany) 
    { 
     thisList.Add(element.InnerText); 
    } 

    //Get Organisations 
    XmlNodeList ocNodesOrgs = thisXmlDoc.SelectNodes("//Organization"); 
    foreach (XmlElement element in ocNodesOrgs) 
    { 
     thisList.Add(element.InnerText); 
    } 

    //Get Organisations 

    return thisList; 
}

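Note that the above implements what I believe is the functionality you have in your example; it is not quite the same as the XPath you tried. Basically, in XPath a leading "//" matches at any depth, so "//Company" selects every element named Company anywhere beneath the root of the document you pass in.

If you only need the Company nodes at one specific location, you can be more specific: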

XmlNodeList ocNodesCompany = thisXmlDoc.SelectNodes("//Company");

becomes

XmlNodeList ocNodesCompany = thisXmlDoc.SelectNodes("/OpenCalaisSimple/CalaisSimpleOutputFormat/Company");

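Note that the key difference is that there is only one forward slash at the start, which anchors the path at the document root.

I have just tested both variants, and they work fine.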
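The difference between the two expressions can be seen with a small sketch. Everything here is illustrative: the XPathComparison class and its Count method are hypothetical, and the sample document places an extra Company element under Description purely to show the effect (the real Calais output in the question does not do this).

```csharp
using System;
using System.Xml;

public static class XPathComparison
{
    // Hypothetical helper: loads the XML and returns how many nodes the XPath selects.
    public static int Count(string xml, string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.SelectNodes(xpath).Count;
    }

    // Made-up sample with a Company element in two different places.
    public const string Sample =
        "<OpenCalaisSimple>" +
        "<Description><Company>Elsewhere</Company></Description>" +
        "<CalaisSimpleOutputFormat><Company>BBC</Company></CalaisSimpleOutputFormat>" +
        "</OpenCalaisSimple>";
}
```

With this sample, XPathComparison.Count(XPathComparison.Sample, "//Company") returns 2, because both Company elements match at any depth, while XPathComparison.Count(XPathComparison.Sample, "/OpenCalaisSimple/CalaisSimpleOutputFormat/Company") returns 1, because only the element at that exact path matches.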

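If you are working with XML files, I strongly recommend reading up on XPath and becoming proficient with it. It is very handy for quickly writing code that parses XML files and picks out exactly what you need (although I should add that it is not the only way to do this, and it is certainly not suitable for every situation :))

Hope this helps.

Source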

2011-03-11 09:48:18

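Thanks. I think you're right; I need to read up more on the options. I'm new to working with XML. – 2011-03-11 09:54:33

You can also use the classes in the System.Xml.Linq namespace. The following snippet is almost equivalent to your code. The return type is List<string> rather than ArrayList.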

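// Requires: using System.Collections.Generic; using System.Linq; using System.Xml.Linq;
public static List<string> getTwitterHandles(String ocXml) 
{ 
    // Parse the string, then collect the text of every Company and Organization element.
    var xml = XDocument.Parse(ocXml); 
    var list = xml.Descendants("Company") 
      .Concat(xml.Descendants("Organization")) 
      .Select(element => element.Value) 
      .ToList(); 
    return list; 
}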

Source

2011-03-11 10:19:36 mgronber
