Reading Excel XML into a dictionary

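I want to read a simple Excel XML file into a dictionary. I tried xlrd 7.1, but it returns a format error. Now I am trying xml.etree.ElementTree, also without success. I cannot change the structure of the .xml file. Here is the file and my code: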

<?xml version="1.0" encoding="UTF-8"?> 
-<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns:html="http://www.w3.org/TR/REC-html40"> 
    -<Styles> 
    -<Style ss:Name="Normal" ss:ID="Default"> 
     <Alignment ss:Vertical="Bottom"/> 
     <Borders/> 
     <Font ss:FontName="Verdana"/> 
     <Interior/> 
     <NumberFormat/> 
     <Protection/> 
    </Style> -<Style ss:ID="s22"> 
     <NumberFormat ss:Format="General Date"/> 
    </Style> 
    </Styles> -<Worksheet ss:Name="Linkfeed"> 
    -<Table> 
     -<Row> 
     -<Cell> 
      <Data ss:Type="String">ID</Data> 
     </Cell> -<Cell> 
      <Data ss:Type="String">URL</Data> 
     </Cell> 
     </Row> -<Row> 
     -<Cell> 
      <Data ss:Type="String">22222</Data> 
     </Cell> -<Cell> 
      <Data ss:Type="String">Hello there</Data> 
     </Cell> 
     </Row> 
    </Table> 
    </Worksheet> 
</Workbook>

Reading:

import xml.etree.cElementTree as etree

def xml_to_list(fname):
    with open(fname) as xml_file:
        tree = etree.parse(xml_file)

        for items in tree.getiterator(tag="Table"):
            for item in items: # Items is None!
                print item.text

Update: it works now, but how do I exclude the garbage?

def xml_to_list(fname):
    with open(fname) as xml_file:
        tree = etree.iterparse(xml_file)
        for item in tree:
            print item[1].text

2011-11-21 User

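What "garbage" are you talking about? – Constantinius

Empty items in the tree – User

Sorry, I still can't find your problem. Maybe you could clarify what is wrong. I can't find any syntax errors, and your use of 'etree' also seems correct. – Constantinius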

Exclude the "garbage" with an if statement:

def xml_to_list(fname):
    with open(fname) as xml_file:
        tree = etree.iterparse(xml_file)
        for item in tree:
            if item[1].text.strip() != '-':
                print item[1].text
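
Filtering on '-' works for this file, but another option is to ask only for the elements that carry data. What follows is a minimal sketch, not a tested solution for the real file: it assumes Python 3, that the file parses as well-formed XML, and that the first row holds the column headers. `rows_to_dicts` and the `SS` constant are names made up for this illustration. Every tag in the file lives in the default namespace urn:schemas-microsoft-com:office:spreadsheet, so the namespace has to be part of the tag name when searching, which is likely why a search for plain "Table" finds nothing.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on <Workbook>; ElementTree stores tags as "{uri}Name".
SS = "{urn:schemas-microsoft-com:office:spreadsheet}"


def rows_to_dicts(source):
    """Return one dict per data row, keyed by the header row's cell values.

    `source` is a file name or a binary file object (hypothetical helper).
    """
    tree = ET.parse(source)
    rows = []
    for row in tree.iter(SS + "Row"):
        # Only <Data> elements are read, so stray text between tags is never seen.
        rows.append([data.text or "" for data in row.iter(SS + "Data")])
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    return [dict(zip(header, values)) for values in body]
```

For the two rows shown in the question this should give [{'ID': '22222', 'URL': 'Hello there'}]. Rows with fewer cells than the header simply produce shorter dictionaries, because zip stops at the shorter sequence.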

2011-11-21 16:56:28

Thanks, that did it. What if I cleaned up the original XML before parsing? – User
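
On cleaning first: if the '-' characters are the expand/collapse markers a browser shows in front of elements, and they really are present in the file, one option is to strip them from the raw text and parse the result. This is a rough sketch, assuming Python 3 and a UTF-8 file. `strip_viewer_markers` and `parse_cleaned` are hypothetical helpers, and the regular expression is only a heuristic: it removes a '-' at the start of a line or after whitespace when it sits directly before an opening tag, so cell text that happens to match that pattern would be altered too.

```python
import re
import xml.etree.ElementTree as ET

# A "-" at line start or after whitespace, immediately followed by an opening tag.
_MARKER = re.compile(r"(^|\s)-(?=<[A-Za-z])", re.MULTILINE)


def strip_viewer_markers(xml_text):
    """Remove '-' markers that sit directly in front of an opening tag."""
    return _MARKER.sub(r"\1", xml_text)


def parse_cleaned(fname):
    """Read the file, strip the markers, and return the root element."""
    with open(fname, encoding="utf-8") as xml_file:
        return ET.fromstring(strip_viewer_markers(xml_file.read()))
```

A hyphen inside a value such as 1-2 is left alone, because it is neither preceded by whitespace nor followed by a tag.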

I would add an extra check: if item[1].text and item[1].text.strip() != '-': –
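
The extra check matters because `text` is None for an element with no character data (for example `<Borders/>`), and calling `.strip()` on None raises AttributeError. Here is a small sketch in Python 3 syntax; `iter_texts` and its `junk` parameter are names chosen for this illustration, and unlike the condition above it also skips text that is only whitespace.

```python
import xml.etree.ElementTree as ET


def iter_texts(source, junk=("-",)):
    """Yield stripped element text, skipping None, blank, and junk values."""
    # iterparse yields (event, element) pairs; by default only "end" events.
    for _event, elem in ET.iterparse(source):
        text = (elem.text or "").strip()
        if text and text not in junk:
            yield text
```

Calling `list(iter_texts(fname))` collects the remaining values in document order.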
