Counting Excel rows dynamically with Python

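I am writing a Python script that parses an Excel file. The purpose of the script is to count, for each cell value in column 1, how many times each value occurs next to it in column 2.

For example, for an Excel file that looks like this: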

12 abc 
12 abc 
12 efg 
12 efg 
13 hij 
13 hij 
13 klm

my script would return:

For cell value 12 : 2 values "abc", 2 values "efg" and for cell value 13 : 2 values "hij" and 1 value "klm".

I tried this using a hash in Python. Here is what I am trying to do:

import xlrd 
workbook = xlrd.open_workbook('myexcelfile.xls') 
worksheet = workbook.sheet_by_name('myexcelsheet') 
num_rows = worksheet.nrows - 1 
num_cells = worksheet.ncols - 1 
first_col = 0 
scnd_col = 1 
curr_row = 1 
hash = [] 
while curr_row < num_rows:
    curr_row += 1
    curr_cell = -1
    print 'IN ROW', curr_row
    while curr_cell < num_cells:
        curr_cell += 1
        print 'IN CELL', curr_cell
        cell0_val = int(worksheet.cell_value(curr_row,first_col))
        cell1_val = worksheet.cell_value(curr_row,scnd_col)
        print 'CELL VALUE', cell0_val, cell1_val
        hash[cell0_val][cell1_val]+=1

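I am surely using this hash the wrong way, but I am really new to Python and I could not find a good example that matches what I want. Any help would be much appreciated. Thanks.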

Source

2012-11-26 salamey

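Are you sure you are parsing an _Excel_ file, and not something more like a 'csv' or some other format? I very much doubt you could easily parse an '.xls' or '.xlsx' file with Python. – jdotjdot

He is using 'xlrd', a library for reading Excel files. –

You mean a dictionary.
Maybe put a list inside each key. First of all, it should be hash = {}

And if there are only two columns, you do not need the second loop. Just do this: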

cell0_val = int(worksheet.cell_value(curr_row,first_col)) 
cell1_val = worksheet.cell_value(curr_row,scnd_col) 

if cell0_val in hash: 
    hash[cell0_val].append(cell1_val) 
else: 
    hash[cell0_val] = [cell1_val]

You should get something like hash = {12: ['abc', 'abc', 'efg', 'efg'], 13: ['hij', 'hij', 'klm']}
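
The lists above still need one more step to become the counts the question asks for. A minimal sketch of that step, assuming the sheet has already been read into `(column 1, column 2)` pairs: the `rows` list and the names `grouped` and `counts` are illustrative and not from the thread, and the snippet is written for Python 3 while the thread's code uses Python 2 print statements.

```python
from collections import Counter

# Hypothetical stand-in for the pairs read from the worksheet.
rows = [(12, 'abc'), (12, 'abc'), (12, 'efg'), (12, 'efg'),
        (13, 'hij'), (13, 'hij'), (13, 'klm')]

# Build the dictionary of lists described above.
# "grouped" is used instead of "hash" to avoid shadowing the built-in.
grouped = {}
for key, value in rows:
    # setdefault creates the empty list the first time a key is seen.
    grouped.setdefault(key, []).append(value)

# Collapse each list into {value: number of occurrences}.
counts = {key: dict(Counter(values)) for key, values in grouped.items()}
print(counts)
```

Keeping the intermediate lists is only useful if the individual values are needed later; otherwise counting directly avoids the second pass.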

Source

2012-11-26 14:31:22

I would use a two-level dictionary:

So your dictionary is defined as:

celldict = dict()  # or celldict = {}

import xlrd

workbook = xlrd.open_workbook('myexcelfile.xls')
worksheet = workbook.sheet_by_name('myexcelsheet')

# range() already stops before num_rows, so no "- 1" is needed here;
# subtracting 1 would skip the last row of the sheet.
num_rows = worksheet.nrows

first_col = 0
scnd_col = 1

# Read data into a two-level dictionary
celldict = dict()
for curr_row in range(num_rows):
    cell0_val = int(worksheet.cell_value(curr_row, first_col))
    cell1_val = worksheet.cell_value(curr_row, scnd_col)

    # If this cell number isn't in the dictionary yet, add it
    if cell0_val not in celldict:
        celldict[cell0_val] = dict()

    # If the entry isn't in the second-level dictionary, add it with count 1
    if cell1_val not in celldict[cell0_val]:
        celldict[cell0_val][cell1_val] = 1
    # Otherwise increase the count
    else:
        celldict[cell0_val][cell1_val] += 1

# Output the dictionary hierarchy
print celldict
# Output it more prettily
for cellval in celldict:
    print "For cell value ", cellval, ":"
    for cellval2 in celldict[cellval]:
        print cellval2, " values", celldict[cellval][cellval2]
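
The membership checks above can also be left to the standard library. A minimal sketch, assuming the rows are already available as `(column 1, column 2)` pairs: `count_pairs` and the sample `rows` are hypothetical names for illustration, and the snippet targets Python 3 rather than the Python 2 used in the thread.

```python
from collections import Counter, defaultdict


def count_pairs(rows):
    """Return {first_col_value: Counter({second_col_value: count})}."""
    counts = defaultdict(Counter)
    for key, value in rows:
        # defaultdict creates the inner Counter on first access,
        # and a Counter treats missing entries as 0.
        counts[key][value] += 1
    return counts


# Hypothetical stand-in for the pairs read from the worksheet.
rows = [(12, 'abc'), (12, 'abc'), (12, 'efg'), (12, 'efg'),
        (13, 'hij'), (13, 'hij'), (13, 'klm')]

celldict = count_pairs(rows)
for cellval in sorted(celldict):
    for cellval2, n in sorted(celldict[cellval].items()):
        print('For cell value {}: {} x {!r}'.format(cellval, n, cellval2))
```

The result has the same two-level shape as `celldict` above, so the printing loop from the answer works on it unchanged.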

Source

2012-11-26 14:54:18 JPH

You could also do it like this:

from itertools import groupby 
from operator import itemgetter 
from collections import Counter 
import xlrd 

workbook = xlrd.open_workbook('myexcelfile.xls') 
sheet = workbook.sheet_by_name('myexcelsheet') 

# groupby only merges adjacent rows, so sort on the same column it groups by
as_list = sorted([sheet.row_values(rownum) for rownum in range(sheet.nrows)],
                 key=itemgetter(0))

for cell_value, vals in groupby(as_list, itemgetter(0)): 
    letter_values = [v[1] for v in vals] 
    occurrences = dict(Counter(letter_values)) 

    print 'For cell value {}:'.format(int(cell_value)) 
    print ', '.join('{} values {}'.format(str(c), v) 
        for v, c in occurrences.items())

Then format the output as needed.
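
As one possible way to format it, here is a small sketch that works on in-memory rows instead of a worksheet and builds one line per cell value. The `summarize` helper and the deliberately unordered `rows` are assumptions for illustration, not part of the answer, and the snippet is written for Python 3. It also shows why the sort matters: `itertools.groupby` only groups consecutive items, so the rows are sorted on the grouping column first.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter

# Hypothetical rows, deliberately out of order.
rows = [(13, 'hij'), (12, 'abc'), (13, 'klm'), (12, 'efg'),
        (12, 'abc'), (13, 'hij'), (12, 'efg')]


def summarize(rows):
    """Return one formatted line per distinct first-column value."""
    lines = []
    # Sort on column 0 so that groupby sees each key as one contiguous run.
    ordered = sorted(rows, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        counts = Counter(value for _, value in group)
        parts = ['{} value{} "{}"'.format(n, '' if n == 1 else 's', value)
                 for value, n in sorted(counts.items())]
        lines.append('For cell value {}: {}'.format(key, ', '.join(parts)))
    return lines


summary = summarize(rows)
print('\n'.join(summary))
```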

Source

2012-11-26 15:01:34
