2014-10-29 98 views
0

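I have a CSV file with 100 columns. I want to compute the sums of columns 4 through n. I can produce the sum for a single column, but it fails when I try to do it for all of the columns. Below is what I have so far for summing a specific column in the CSV file: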

import decimal 
import numpy as np 
import os as os 
import csv as csv 
import re as re 
import sys 

col=10 
values=[] 
with open('test.csv', 'r') as f: 
    reader = csv.reader(f) 
    headers = reader.next() 
    for line in reader: 
        #print line 
        line = [int(i) for i in line] 
    col_totals = [sum(result) for result in zip(*line)] 
    print col_totals 
    #values.append(int(line[col])) 
    #csum=sum(values) 
    #print csum 

Thanks,

+0

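Do you want to compute the sum of columns 4-10 for each row? Or do you want the sum of column 4 over all rows, the sum of column 5 over all rows, and so on? – inspectorG4dget 2014-10-29 19:39:40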

+0

Have you tried `reduce`, as explained [here](https://docs.python.org/2/library/functions.html#reduce)? – 2014-10-29 19:47:30

+1

Yes, I want to compute the sum of column 4 over all rows, the sum of column 5 over all rows, and so on. – learningcurve 2014-10-29 21:02:09

Answers

0

If you want to sum over a run of consecutive columns, this will do:

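import csv

i, j = 3, 5  # 0-based slice bounds: columns 3 and 4 (j is excluded)

with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    headers = next(reader)  # skip the header row
    table = list(reader)
    # transpose rows into columns, then sum each selected column
    sums = [sum(float(elt) for elt in col) for col in list(zip(*table))[i:j]]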

You can also try the following, which sums an arbitrary list of columns:

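import csv

requested = [4, 7, 12, 13, 21, 81]  # 0-based column indices to sum

with open('test.csv', 'r') as f:
    reader = csv.reader(f)
    headers = next(reader)  # skip the header row
    table = list(reader)
    # keep only the columns whose index is in `requested`
    sums = [sum(float(elt) for elt in col) for i, col in enumerate(zip(*table)) if i in requested]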
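
The same idea can be wrapped in a small helper that reads the file row by row. This is only a sketch for Python 3, not the code from the answer: the function name `sum_columns` and the in-memory sample data are hypothetical, and it assumes every selected cell parses as a float.

```python
import csv
import io


def sum_columns(lines, wanted):
    """Return {column index: total} for the 0-based column indices in `wanted`.

    `lines` is any iterable of CSV text lines (an open file works too).
    The first row is treated as a header and skipped.
    """
    reader = csv.reader(lines)
    next(reader)  # skip the header row
    totals = {i: 0.0 for i in wanted}
    for row in reader:
        for i in wanted:
            totals[i] += float(row[i])  # csv.reader yields strings
    return totals


# Hypothetical sample data standing in for test.csv
sample = io.StringIO("a,b,c,d,e\n1,2,3,4,5\n10,20,30,40,50\n")
print(sum_columns(sample, range(3, 5)))  # {3: 44.0, 4: 55.0}
```

Unlike `list(reader)`, this keeps only the running totals in memory, which may matter for a large file.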
+0

Thanks for the reply. But when I try this I get TypeError: unsupported operand type(s) for +: 'int' and 'str' – learningcurve 2014-10-29 21:05:43

+0

Er, editing... – gboffi 2014-10-29 21:19:04

+0

@learnincurve I tested my solution on a _synthetic_ table before posting, happily forgetting the sad truth that `csv.reader` only returns strings. Oh, my bad! One of these days I will learn `pandas`... – gboffi 2014-10-29 21:34:24
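
The error is easy to reproduce. A minimal sketch (Python 3, with hypothetical in-memory data in place of test.csv): `csv.reader` hands back every cell as a `str`, so `sum` fails until the values are converted.

```python
import csv
import io

# Hypothetical two-row sample standing in for test.csv
rows = list(csv.reader(io.StringIO("1,2,3\n4,5,6\n")))
first_column = [row[0] for row in rows]  # ['1', '4'], strings rather than numbers

try:
    sum(first_column)  # sum() starts from the int 0, so 0 + '1' raises TypeError
    failed = False
except TypeError:
    failed = True

total = sum(float(cell) for cell in first_column)  # convert before summing
print(failed, total)  # True 5.0
```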

1

This is very, very easy in pandas:

import pandas as pd 
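df = pd.read_csv(filename)  # filename: path to the CSV file, e.g. 'test.csv'
df[df.columns[4:]].sum()  # per-column totals; [4:] is 0-based, so it starts at the fifth column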

And if you want the sum across those columns for each row, that is:

df[df.columns[4:]].sum(axis=1) 
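
A small end-to-end check of both calls, as a sketch only: the column names and values are made up, and `io.StringIO` stands in for the real file. Note that `df.columns[4:]` is 0-based, so it skips the first four columns.

```python
import io

import pandas as pd

# Hypothetical six-column sample standing in for the real CSV file
data = io.StringIO(
    "id,name,x,y,c4,c5\n"
    "1,a,0,0,10,1.5\n"
    "2,b,0,0,20,2.5\n"
)
df = pd.read_csv(data)

numeric = df[df.columns[4:]]       # columns at positions 4 and 5 (c4, c5)
col_totals = numeric.sum()         # one total per column, over all rows
row_totals = numeric.sum(axis=1)   # one total per row, across the columns

print(col_totals)
print(row_totals)
```

Here `col_totals` holds 30 for `c4` and 4.0 for `c5`, and `row_totals` holds 11.5 and 22.5.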
+0

Thanks, it worked.. – learningcurve 2014-10-29 21:50:30

+0

Thanks, mind accepting the answer? – acushner 2014-10-30 18:21:34