Split a list of datetimes by year+month in Python
I have the following CSV file:
# simulate a csv file
from StringIO import StringIO
data = StringIO("""
2012-04-01,00:10, A, 10
2012-04-01,00:20, B, 11
2012-04-01,00:30, B, 12
2012-04-02,00:10, A, 18
2012-05-02,00:20, A, 14
2012-05-02,00:30, B, 11
2012-05-03,00:10, A, 10
2012-06-03,00:20, B, 13
2012-06-03,00:30, C, 12
""".strip())
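I want to group the rows by year+month and by category (i.e. A, B, C).
The final data should be grouped by month and then by category, with each group holding the matching rows of the original data: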
2012-04, A
>> array[0,] => 2012-04-01,00:10, A, 10
>> array[3,] => 2012-04-02,00:10, A, 18
2012-04, B
>> array[1,] => 2012-04-01,00:20, B, 11
>> array[2,] => 2012-04-01,00:30, B, 12
2012-05, A
>> array[4,] => 2012-05-02,00:20, A, 14
...
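Then I want to iterate over the groups and plot each one with the same function.
A few days ago I saw a similar question about splitting by date, "Split list of datetimes into days", and I can do that in my case a). But I am having trouble converting it into a year+month split in case b).
Here is a snippet showing the problem I have run into so far: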
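The grouping and iteration described above can be sketched roughly as follows. This is a minimal Python 3 sketch, not the Python 2 code from the question. It assumes the year+month key can be taken from the first seven characters of the date string, and `handle_group` is a hypothetical stand-in for whatever plotting function is used.

```python
import csv
from io import StringIO

import numpy as np

raw = StringIO("""2012-04-01,00:10, A, 10
2012-04-01,00:20, B, 11
2012-04-01,00:30, B, 12
2012-04-02,00:10, A, 18
2012-05-02,00:20, A, 14
2012-05-02,00:30, B, 11
2012-05-03,00:10, A, 10
2012-06-03,00:20, B, 13
2012-06-03,00:30, C, 12""")

# skipinitialspace drops the blank that follows each comma
rows = list(csv.reader(raw, skipinitialspace=True))
arr = np.array(rows)  # 2-D array of strings, one row per CSV line

# Assumption: 'YYYY-MM' is simply the first 7 characters of the date column
months = np.array([d[:7] for d in arr[:, 0]])
cats = arr[:, 2]


def handle_group(month, cat, group):
    """Hypothetical placeholder for a plotting function."""
    return (str(month), str(cat), len(group))


summary = []
for month in np.unique(months):
    for cat in np.unique(cats):
        # Boolean mask selecting rows that match both month and category
        group = arr[(months == month) & (cats == cat)]
        if group.size:  # skip empty month/category combinations
            summary.append(handle_group(month, cat, group))
```

Each `group` is a regular 2-D array, so it can be passed straight to a plotting routine. Note that boolean-mask indexing returns a new array rather than a view of `arr`.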
#! /usr/bin/python
import numpy as np
import csv
import os
from datetime import datetime
def strToDate(string):
    d = datetime.strptime(string, '%Y-%m-%d')
    return d

def strToMonthDate(string):
    d = datetime.strptime(string, '%Y-%m-%d')
    # Normalize every date to the first day of its month
    d_by_month = datetime(d.year, d.month, 1)
    return d_by_month
# simulate a csv file
from StringIO import StringIO
data = StringIO("""
2012-04-01,00:10, A, 10
2012-04-01,00:20, B, 11
2012-04-01,00:30, B, 12
2012-04-02,00:10, A, 18
2012-05-02,00:20, A, 14
2012-05-02,00:30, B, 11
2012-05-03,00:10, A, 10
2012-06-03,00:20, B, 13
2012-06-03,00:30, C, 12
""".strip())
arr = np.genfromtxt(data, delimiter=',', dtype=object)
# a) If we were to just group by dates
# Get unique dates
#keys = np.unique(arr[:,0])
#keys1 = np.unique(arr[:,2])
# Group by unique dates
#for key in keys:
#    print key
#    for key1 in keys1:
#        group = arr[ (arr[:,0]==key) & (arr[:,2]==key1) ]
#        if group.size:
#            print "\t" + key1
#            print group
#    print "\n"
# b) But if we want to group by year+month in the dates
dates_by_month = np.array(map(strToMonthDate, arr[:,0]))
keys2 = np.unique(dates_by_month)
print dates_by_month
# >> [datetime.datetime(2012, 4, 1, 0, 0), datetime.datetime(2012, 4, 1, 0, 0), ...
print "\n"
print keys2
# >> [2012-04-01 00:00:00 2012-05-01 00:00:00 2012-06-01 00:00:00]
for key in keys2:
    print key
    print type(key)
    group = arr[dates_by_month==key]
    print group
    print "\n"
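Question: I get the monthly keys, but for each group I only get [2012-04-01 00:10 A 10]. The keys in keys2 are of type datetime.datetime. Any idea what could be wrong? Suggestions for alternative implementations are welcome. I would prefer not to use the itertools.groupby solution, because it returns an iterator rather than an array, which is less suitable for plotting.
Edit 1: Problem solved. The dates_by_month that I use for indexing in case b) should have been initialized as an np.array, not as the list returned by map: dates_by_month = np.array(map(strToMonthDate, arr[:,0])). I have fixed this in the snippet above, and the example now works.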
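The difference between the list and the array can be shown with a small sketch. It is written for Python 3 with made-up sample values (`dates`, `rows`), so it illustrates the comparison behaviour rather than reproducing the original script. In Python 3, `map` returns an iterator, so it would also need to be wrapped in `list(...)` before being passed to `np.array`.

```python
from datetime import datetime

import numpy as np

# Illustrative month-start dates, as strToMonthDate would produce
dates = [datetime(2012, 4, 1), datetime(2012, 4, 1), datetime(2012, 5, 1)]
key = datetime(2012, 4, 1)

# Comparing a plain list with one datetime yields a single bool,
# which is not a usable row mask
list_result = (dates == key)

# An object array compares element by element and yields a boolean mask
dates_arr = np.array(dates, dtype=object)
mask = (dates_arr == key)

# Illustrative rows standing in for the CSV data
rows = np.array(["row0", "row1", "row2"])
selected = rows[mask]
```

Here `list_result` is a single `False`, while `mask` has one entry per row and selects the two April rows.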
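Please feel free to accept your own answer, so that future users facing the same problem can benefit from your knowledge. – Hyperboreus
@Hyperboreus, thanks. I will wait out the two-day limit before I am allowed to accept my own answer. – frank
@frank I have slightly changed the formatting of your post to try to make it easier to follow... feel free to adjust/roll back... –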