How to get data from an XLSX file using Python

I have an XLSX file (shown as a screenshot in the original post), and I want to convert its data into a dictionary like this:

{
    0: {
        'a': 1,
        'b': 100,
        'c': 2,
        'd': 10
    },
    1: {
        'a': 8,
        'b': 480,
        'c': 3,
        'd': 14
    }
    ...
}

Does anyone know a Python library that can do this, starting at row 124 and ending at row 141?

Thanks


2011-04-02 zjm1126

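Your first output dict has data from rows 124 and 125; your second one has data from row 126... please edit your question. Also, please confirm that the columns you want data from are B, C, E and G. – 2011-04-02 04:43:26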

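'xlrd' (since version 0.8.0) supports reading '.xlsx' files directly. (The "bolt-on" module that John Machin mentions in his answer was eventually merged into the 'xlrd' package.) Related: http://stackoverflow.com/questions/4371163/reading-xlsx-files-using-python – 2013-03-05 15:53:13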

I think you mean that the first part is 'd: 12'; how big is your file? – 2014-02-27 12:42:03

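Options with xlrd:

(1) Your XLSX file doesn't look very large; save it as XLS. (2) Use xlrd plus the bolt-on beta-test module xlsxrd (find my e-mail address and ask for it); the combination will read data from XLS and XLSX files seamlessly (same API; it examines the file contents to determine whether the file is XLS, XLSX, or an imposter).

In either case, something like the following (untested) code should do what you want: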

from xlrd import open_workbook
# from xlsxrd import open_workbook
# Choose one of the above (only one import should be active)

# These could be function args in real-life code
column_map = {
    # The numbers are zero-relative column indexes
    'a': 1,
    'b': 2,
    'c': 4,
    'd': 6,
}
first_row_index = 124 - 1
last_row_index = 141 - 1
file_path = 'your_file.xls'

# The action starts here
book = open_workbook(file_path)
sheet = book.sheet_by_index(0)  # first worksheet
key0 = 0
result = {}
# range() and items() work on both Python 2 and Python 3
for row_index in range(first_row_index, last_row_index + 1):
    d = {}
    for key1, column_index in column_map.items():
        d[key1] = sheet.cell_value(row_index, column_index)
    result[key0] = d
    key0 += 1
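The comment in the code above notes that the settings could be function arguments. A minimal sketch of that refactoring follows. The names `rows_to_dict` and `FakeSheet` are hypothetical, introduced only for illustration, and the only thing assumed about the sheet object is an xlrd-style `cell_value(rowx, colx)` method taking zero-based indexes.

```python
def rows_to_dict(sheet, column_map, first_row, last_row):
    """Build {0: {...}, 1: {...}, ...} from a 1-based, inclusive row range.

    `sheet` is any object with an xlrd-style cell_value(rowx, colx) method
    that takes zero-based indexes. This helper is hypothetical.
    """
    result = {}
    for key, row_index in enumerate(range(first_row - 1, last_row)):
        result[key] = {
            name: sheet.cell_value(row_index, column_index)
            for name, column_index in column_map.items()
        }
    return result


class FakeSheet:
    """Hypothetical in-memory stand-in for an xlrd sheet, used for the demo."""

    def __init__(self, rows):
        self._rows = rows

    def cell_value(self, rowx, colx):
        return self._rows[rowx][colx]


# Columns B, C, E and G as zero-based indexes, as in the answer above.
COLUMN_MAP = {'a': 1, 'b': 2, 'c': 4, 'd': 6}

demo_sheet = FakeSheet([
    ['', 1, 100, '', 2, '', 10],
    ['', 8, 480, '', 3, '', 14],
])
demo_result = rows_to_dict(demo_sheet, COLUMN_MAP, first_row=1, last_row=2)
```

With a real workbook, a call such as `rows_to_dict(book.sheet_by_index(0), COLUMN_MAP, 124, 141)` would cover the rows asked about. Note that xlrd releases from 2.0 onward read only `.xls` files, so an `.xlsx` file would need to be saved as `.xls` first or read with another library.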


2011-04-02 05:04:29

Another option is openpyxl. I've been meaning to try it out, but haven't gotten around to it yet, so I can't say how good it is.


2011-04-03 09:54:20 joshayers

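Since posting this answer, I've had a chance to try openpyxl. It's very easy to use. I managed to write out a fairly large spreadsheet - 20 tabs, each with 200 columns and 500 rows. The operation used about 2GB of memory. It also has an optimized append-only writer, which the author claims can write spreadsheets of unlimited size, but I haven't had a reason to try it. – joshayers 2011-06-19 22:06:12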
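For reference, a rough sketch of how the question's row range might be read with openpyxl. It assumes a reasonably recent openpyxl (the `values_only` argument of `iter_rows` is not available in very old releases) and that the wanted columns are B, C, E and G, as in the xlrd answer. The helper names `records_from_rows` and `read_xlsx_rows` are made up for this example.

```python
def records_from_rows(rows, column_map):
    """Turn an iterable of row tuples into {0: {...}, 1: {...}, ...}.

    `column_map` maps output keys to zero-based positions in each tuple.
    """
    return {
        key: {name: row[index] for name, index in column_map.items()}
        for key, row in enumerate(rows)
    }


def read_xlsx_rows(path, first_row=124, last_row=141, max_col=7):
    """Read a 1-based, inclusive row range from the first worksheet.

    Requires the third-party openpyxl package. It is imported lazily so
    that the pure-Python helper above can be used without it.
    """
    from openpyxl import load_workbook

    workbook = load_workbook(path, data_only=True)
    sheet = workbook.worksheets[0]
    # Each row comes back as a tuple of plain cell values (columns A..G here).
    return list(sheet.iter_rows(min_row=first_row, max_row=last_row,
                                min_col=1, max_col=max_col,
                                values_only=True))


# Columns B, C, E and G as zero-based positions within each row tuple.
COLUMN_MAP = {'a': 1, 'b': 2, 'c': 4, 'd': 6}

# Demo on in-memory tuples shaped like the rows returned by read_xlsx_rows.
sample_rows = [
    (None, 1, 100, None, 2, None, 10),
    (None, 8, 480, None, 3, None, 14),
]
sample_result = records_from_rows(sample_rows, COLUMN_MAP)
```

On a real file, `records_from_rows(read_xlsx_rows('yourfile.xlsx'), COLUMN_MAP)` would combine the two steps; the file name here is only a placeholder.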

Here is a very rough implementation that uses only the standard library.

def xlsx(fname):
    import zipfile
    from xml.etree.ElementTree import iterparse
    z = zipfile.ZipFile(fname)
    # Shared string table: one entry per <si>, joining the <t> runs inside it
    # so that rich-text strings do not shift the indexes
    strings = [''.join(t.text or '' for t in el.iter() if t.tag.endswith('}t'))
               for e, el in iterparse(z.open('xl/sharedStrings.xml'))
               if el.tag.endswith('}si')]
    rows = []
    row = {}
    value = ''
    for e, el in iterparse(z.open('xl/worksheets/sheet1.xml')):
        if el.tag.endswith('}v'):  # <v>84</v>
            value = el.text
        if el.tag.endswith('}c'):  # <c r="A3" t="s"><v>84</v></c>
            if el.attrib.get('t') == 's':  # value is an index into the string table
                value = strings[int(value)]
            letter = el.attrib['r']  # AZ22
            while letter[-1].isdigit():
                letter = letter[:-1]
            row[letter] = value
            value = ''  # do not carry this value over to a cell without <v>
        if el.tag.endswith('}row'):
            rows.append(row)
            row = {}
    return dict(enumerate(rows))
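The loop that strips trailing digits from a reference such as `AZ22` keeps only the column letters. If the row number is also needed, for example to keep only rows 124 to 141, the reference can be split in one step. A small sketch, where `split_cell_ref` and `column_index` are hypothetical helper names and references are assumed to be plain uppercase ones without `$` markers:

```python
import re

# Column letters followed by a row number, e.g. "AZ22".
_CELL_REF = re.compile(r'([A-Z]+)([0-9]+)')


def split_cell_ref(ref):
    """Split a cell reference into (column letters, 1-based row number)."""
    match = _CELL_REF.fullmatch(ref)
    if match is None:
        raise ValueError('not a cell reference: %r' % (ref,))
    return match.group(1), int(match.group(2))


def column_index(letters):
    """Convert column letters to a zero-based index ('A' is 0, 'AA' is 26)."""
    index = 0
    for ch in letters:
        index = index * 26 + (ord(ch) - ord('A') + 1)
    return index - 1
```

Inside the parsing loop, a check such as `124 <= split_cell_ref(el.attrib['r'])[1] <= 141` could then be used to skip cells outside the wanted rows.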


2014-02-27 12:14:32

Suppose you have data like this:

a,b,c,d 
1,2,3,4 
2,3,4,5 
...

In 2014, one of many potential answers is:

import pyexcel 


r = pyexcel.SeriesReader("yourfile.xlsx") 
# make a filter function 
filter_func = lambda row_index: row_index < 124 or row_index > 141 
# apply the filter on the reader 
r.filter(pyexcel.filters.RowIndexFilter(filter_func)) 
# get the data 
data = pyexcel.utils.to_records(r) 
print(data)

Now data is an array of dictionaries:

[{ 
    'a':1, 
    'b':100, 
    'c':2, 
    'd':10 
}, 
{ 
    'a':8, 
    'b':480, 
    'c':3, 
    'd':14 
}... 
]

See the pyexcel documentation for details.
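To show what the records structure amounts to, here is a plain-Python sketch that builds the same kind of list from a header row and data rows, then re-keys it by position to match the dictionary asked for in the question. This is not pyexcel's implementation; `rows_to_records` is a hypothetical name, and the sample values are the ones from the a,b,c,d example above.

```python
def rows_to_records(header, rows):
    """Pair each data row with the header names, one dict per row."""
    return [dict(zip(header, row)) for row in rows]


header = ['a', 'b', 'c', 'd']
rows = [
    [1, 2, 3, 4],
    [2, 3, 4, 5],
]

records = rows_to_records(header, rows)

# Re-key the list by position to get the {0: {...}, 1: {...}} shape.
indexed = dict(enumerate(records))
```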


2014-09-21 21:29:20 chfw
