2012-03-30 157 views
0

Aggregating daily data to compute monthly averages

Hello, I'm new to Python and I'm stuck on what I think is a fairly basic task.

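I have a number of csv files (> 50) containing daily snow depth data. I want to loop over the csv files and compute the monthly snow depth for each. Sample data: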

Date,SD 
1/1/2000,36 
1/2/2000,36 
1/3/2000,38 
1/4/2000,40 
2/1/2000,48 
2/2/2000,48 

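In other words, I want to compute the mean snow depth for each month and write the result to a new csv file. I was able to adapt a different code sample to my data, but I'm getting a KeyError when using Date as the key in my dictionary.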

Any suggestions?

Code so far:

from __future__ import division 
import csv 
from collections import defaultdict 

def default_factory(): 
    return [0, None, None, 0] 

reader = csv.DictReader(open(r'C:\SandBox\VALIDATION\TestTable.csv')) 

dates = defaultdict(default_factory) 
for row in reader: 
    sd = int(row["SD"]) 
    dates[row["Dates"]][0] += sd 
    max = dates[row["Dates"]][1] 
    dates[row["Dates"]][1] = amount if max is None else amount if amount > max else max 
    min = dates[row["Date"]][2] 
    dates[row["Dates"]][2] = amount if min is None else amount if amount < min else min 
    dates[row["Dates"]][3] += 1 

for date in dates: 
    dates[date][3] = dates[date][0]/dates[date][3] 

writer = csv.writer(open(r'C:\SandBox\VALIDATION\TestAvg.csv', 'w', newline = '')) 
writer.writerow(["Date", "SD", "max", "min", "mean"]) 
writer.writerows([date] + dates[date] for date in dates) 

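Edit: just to clarify, I want the overall average per month, i.e. the January average, the February average, and so on, not the average for each individual date.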

+2

Can you post the whole stack trace/error? – jgritty 2012-03-30 20:29:16

+2

If you're calculating the mean rather than the median, why do you care about the min and max? – jgritty 2012-03-30 20:37:18

+1

Date,Snowdepth or Date,SD? – WolframH 2012-03-30 20:42:58

Answers

0

You might want to use dictionaries to make the code more readable.

from __future__ import division
import csv
from collections import defaultdict

def default_factory():
    # Running statistics for one month.
    return {"sum": 0, "max": None, "min": None, "count": 0}

reader = csv.DictReader(open(r'sd.csv'))

dates = defaultdict(default_factory)  # keyed by month, e.g. "1" for January
rows = []
for row in reader:
    date = row["Date"]
    sd = int(row["Snowdepth"])  # use row["SD"] if that is the header in your file
    rows.append([date, sd])
    month = date.split("/")[0]  # the first field of the date is taken as the month
    r = dates[month]
    r["sum"] += sd
    # Call the built-in max/min instead of shadowing them with local names.
    r["max"] = sd if r["max"] is None else max(sd, r["max"])
    r["min"] = sd if r["min"] is None else min(sd, r["min"])
    r["count"] += 1

for month in dates:
    r = dates[month]
    r["avg"] = r["sum"]/r["count"]

writer = csv.writer(open(r'TestAvg.csv', 'w'))
writer.writerow(["Date", "SD", "max", "min", "mean"])
# Each daily row is written out together with its month's max, min and mean.
for row in rows:
    r = dates[row[0].split("/")[0]]
    writer.writerow(row + [r["max"], r["min"], r["avg"]])
+0

Thanks Gebb, works great! – 2012-03-30 21:56:20
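
The code above keys on the month number alone, so January 2000 and January 2001 would be pooled together, and it writes one row per day. If one row per calendar month is wanted instead, a minimal sketch could look like the following. It assumes Python 3, dates written in month/day/year order, and a column named Snowdepth; the helper name monthly_stats is made up for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Sample rows from the question; the header name "Snowdepth" is an assumption.
SAMPLE = """Date,Snowdepth
1/1/2000,36
1/2/2000,36
1/3/2000,38
1/4/2000,40
2/1/2000,48
2/2/2000,48
"""

def monthly_stats(csv_file, date_col="Date", value_col="Snowdepth"):
    """Return {(year, month): (mean, minimum, maximum)} for a daily CSV.

    Hypothetical helper: assumes dates are written month/day/year.
    """
    values = defaultdict(list)
    for row in csv.DictReader(csv_file):
        day = datetime.strptime(row[date_col].strip(), "%m/%d/%Y")
        values[(day.year, day.month)].append(int(row[value_col]))
    # Sort the keys so months come out in chronological order.
    return {
        key: (sum(v) / len(v), min(v), max(v))
        for key, v in sorted(values.items())
    }

stats = monthly_stats(io.StringIO(SAMPLE))
for (year, month), (mean, low, high) in stats.items():
    print(year, month, mean, low, high)
```

For the six sample rows, read as month/day/year, this gives a mean of 37.5 for January 2000 and 48.0 for February 2000. Passing an open file instead of the io.StringIO object works the same way.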

0

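In some places you have used Dates as the column name (e.g. max = dates[row["Dates"]][1]) and in others it is Date (e.g. min = dates[row["Date"]][2]). From your sample data it looks like Date is the column name, so if you use the same name everywhere it should work.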

s="""Date,Snowdepth 
1/1/2000,36 
1/2/2000,36 
1/3/2000,38 
1/4/2000,40 
2/1/2000,48 
2/2/2000,48""" 

import StringIO 
import csv 
reader = csv.DictReader(StringIO.StringIO(s)) 

for row in reader: 
    print row['Date'] 

Output:

1/1/2000 
1/2/2000 
1/3/2000 
1/4/2000 
2/1/2000 
2/2/2000 
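
A quick way to catch this kind of mismatch is to look at the header names that csv.DictReader actually parsed before indexing into a row. The following is a small sketch in Python 3 syntax (the snippet above is Python 2); the sample string is an assumption standing in for the real file.

```python
import csv
import io

# Stand-in for the real file; the header here is "Date", not "Dates".
sample = "Date,SD\n1/1/2000,36\n1/2/2000,36\n"

reader = csv.DictReader(io.StringIO(sample))
# fieldnames holds the header row exactly as it appears in the file.
print(reader.fieldnames)

first_row = next(reader)
try:
    first_row["Dates"]        # misspelled column name
except KeyError as exc:
    missing = exc.args[0]
    print("No such column:", missing)

print(first_row["Date"])      # the spelling used in the header works
```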
0
from __future__ import division 
import csv 
from collections import defaultdict 

def default_factory(): 
    return [0, None, None, 0] 

reader = csv.DictReader(open(r'snow_data.csv')) 

dates = defaultdict(default_factory) 

for row in reader: 
    amount = int(row["Snowdepth"]) 
    dates[row["Date"]][0] += amount 
    max = dates[row["Date"]][1] 
    dates[row["Date"]][1] = amount if max is None else amount if amount > max else max 
    min = dates[row["Date"]][2] 
    dates[row["Date"]][2] = amount if min is None else amount if amount < min else min 
    dates[row["Date"]][3] += 1 


for date in dates: 
    dates[date][3] = dates[date][0]/dates[date][3] 

writer = csv.writer(open(r'TestAvg.csv', 'w')) 
writer.writerow(["Date", "Snowdepth", "max", "min", "mean"]) 
writer.writerows([date] + dates[date] for date in dates) 

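I fixed the code to use Date and Snowdepth everywhere, since that is what your sample CSV provides. Also, you had a variable amount that was presumably meant to be sd; otherwise amount is undefined. I used amount everywhere.

It won't give very exciting results unless you have multiple entries for a single date.

For example, here is the output from your sample CSV: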

Date,Snowdepth,max,min,mean 

1/3/2000,38,38,38,38.0 

2/2/2000,48,48,48,48.0 

2/1/2000,48,48,48,48.0 

1/4/2000,40,40,40,40.0 

1/1/2000,36,36,36,36.0 

1/2/2000,36,36,36,36.0 
+0

I think you misunderstood my question. I want the monthly average (i.e. a January average of 36.6667), not the daily average. – 2012-03-30 21:02:22

+0

Oh, right, I completely missed that part. – jgritty 2012-03-30 23:32:05
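
Since the question mentions more than 50 files, the monthly aggregation can be wrapped in a function and applied to each file in turn. This is only a sketch: it assumes Python 3, month/day/year dates and a Snowdepth column, and the function names, the *.csv pattern and the _monthly.csv suffix are all made up for illustration.

```python
import csv
from collections import defaultdict
from datetime import datetime
from pathlib import Path

def monthly_means(in_path, date_col="Date", value_col="Snowdepth"):
    """Return a sorted list of (year, month, mean) for one daily CSV file."""
    totals = defaultdict(lambda: [0, 0])  # (year, month) -> [sum, count]
    with open(in_path, newline="") as f:
        for row in csv.DictReader(f):
            day = datetime.strptime(row[date_col].strip(), "%m/%d/%Y")
            entry = totals[(day.year, day.month)]
            entry[0] += int(row[value_col])
            entry[1] += 1
    return [(y, m, s / n) for (y, m), (s, n) in sorted(totals.items())]

def summarize_folder(folder, pattern="*.csv", suffix="_monthly.csv"):
    """Write one monthly summary next to each input file; return the new paths."""
    written = []
    # sorted() lists the inputs up front, before any summary file is created.
    for in_path in sorted(Path(folder).glob(pattern)):
        if in_path.name.endswith(suffix):
            continue  # skip summaries left over from an earlier run
        out_path = in_path.with_name(in_path.stem + suffix)
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["Year", "Month", "MeanSnowdepth"])
            writer.writerows(monthly_means(in_path))
        written.append(out_path)
    return written
```

Calling summarize_folder(r'C:\SandBox\VALIDATION') would then process every csv file in the folder from the question. Opening the output with newline="" follows the csv module's recommendation and avoids blank lines between rows on Windows.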
