Python: split a list of datetimes by year+month

I have the following CSV file:

# simulate a csv file 
from StringIO import StringIO 
data = StringIO(""" 
2012-04-01,00:10, A, 10 
2012-04-01,00:20, B, 11 
2012-04-01,00:30, B, 12 
2012-04-02,00:10, A, 18 
2012-05-02,00:20, A, 14 
2012-05-02,00:30, B, 11 
2012-05-03,00:10, A, 10 
2012-06-03,00:20, B, 13 
2012-06-03,00:30, C, 12 
""".strip())

I want to group it by year+month plus category (i.e. A, B, C).

I want the final data grouped by month and then by category, as views of the raw data:

2012-04, A 

>> array[0,] => 2012-04-01,00:10, A, 10 

>> array[3,] => 2012-04-02,00:10, A, 18 

2012-04, B 

>> array[1,] => 2012-04-01,00:20, B, 11 

>> array[2,] => 2012-04-01,00:30, B, 12 

2012-05, A 

>> array[4,] => 2012-05-02,00:20, A, 14 

...

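Then I want to iterate over the group views and plot each one with the same function.

I have seen a similar question, "Split list of datetimes into days", about splitting by date, and I can do that in my case a). But I am having trouble adapting it to year+month in case b).

Here is a snippet showing the problem I have run into so far: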

#! /usr/bin/python 

import numpy as np 
import csv 
import os 
from datetime import datetime 

def strToDate(string): 
    d = datetime.strptime(string, '%Y-%m-%d') 
    return d; 

def strToMonthDate(string): 
    d = datetime.strptime(string, '%Y-%m-%d') 
    d_by_month = datetime(d.year,d.month,1) 
    return d_by_month; 

# simulate a csv file 
from StringIO import StringIO 
data = StringIO(""" 
2012-04-01,00:10, A, 10 
2012-04-01,00:20, B, 11 
2012-04-01,00:30, B, 12 
2012-04-02,00:10, A, 18 
2012-05-02,00:20, A, 14 
2012-05-02,00:30, B, 11 
2012-05-03,00:10, A, 10 
2012-06-03,00:20, B, 13 
2012-06-03,00:30, C, 12 
""".strip()) 

arr = np.genfromtxt(data, delimiter=',', dtype=object) 


# a) If we were to just group by dates 
# Get unique dates 
#keys = np.unique(arr[:,0]) 
#keys1 = np.unique(arr[:,2]) 
# Group by unique dates 
#for key in keys: 
# print key 
# for key1 in keys1:  
#  group = arr[ (arr[:,0]==key) & (arr[:,2]==key1) ]      
#  if group.size: 
#   print "\t" + key1 
#   print group 
# print "\n"  

# b) But if we want to group by year+month in the dates 
dates_by_month = np.array(map(strToMonthDate, arr[:,0])) 
keys2 = np.unique(dates_by_month) 
print dates_by_month 
# >> [datetime.datetime(2012, 4, 1, 0, 0), datetime.datetime(2012, 4, 1, 0, 0), ... 
print "\n" 
print keys2 
# >> [2012-04-01 00:00:00 2012-05-01 00:00:00 2012-06-01 00:00:00] 

for key in keys2: 
    print key  
    print type(key) 
    group = arr[dates_by_month==key] 
    print group 
    print "\n"

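Problem: I get the key for each month, but for the group I only get [2012-04-01 00:10 A 10]. The keys in keys2 are of type datetime.datetime. Any idea what might be wrong? Suggestions for alternative implementations are welcome. I would prefer not to use the itertools.groupby solution, because it returns an iterator rather than an array, which is less convenient for plotting.

Edit 1: Problem solved. The issue was that dates_by_month, which I use for advanced indexing in case b), should be initialized as an np.array rather than the list returned by map: dates_by_month = np.array(map(strToMonthDate, arr[:,0])). I have fixed this in the snippet above, and the example now works.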
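For readers on Python 3, here is a minimal sketch of the same year+month plus category grouping. It is not the code from the question: it assumes the rows can be parsed with plain string splitting instead of np.genfromtxt, and it takes the first seven characters of the date string ('YYYY-MM') as the month key instead of building datetime objects. The names `months`, `cats` and `groups` are made up for illustration.

```python
from io import StringIO

import numpy as np

# Simulate the CSV file from the question.
data = StringIO("""
2012-04-01,00:10, A, 10
2012-04-01,00:20, B, 11
2012-04-01,00:30, B, 12
2012-04-02,00:10, A, 18
2012-05-02,00:20, A, 14
2012-05-02,00:30, B, 11
2012-05-03,00:10, A, 10
2012-06-03,00:20, B, 13
2012-06-03,00:30, C, 12
""".strip())

# Parse into a 2-D array of strings, stripping the padding around fields.
arr = np.array([[field.strip() for field in line.split(",")]
                for line in data.read().splitlines()])

# Assumption: dates are always 'YYYY-MM-DD', so [:7] is the year+month key.
months = np.array([date[:7] for date in arr[:, 0]])
cats = arr[:, 2]

# Map (month, category) -> 2-D array holding the matching rows.
groups = {}
for month in np.unique(months):
    for cat in np.unique(cats):
        group = arr[(months == month) & (cats == cat)]
        if group.size:
            groups[(month, cat)] = group

for (month, cat), group in sorted(groups.items()):
    print(month, cat)
    print(group)
```

Each value in `groups` is an ordinary 2-D array, so it can be passed to the same plotting function in a loop. Note that boolean-mask indexing returns a copy of the rows, not a view of `arr`.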

Source

2013-07-30 frank

I found the problem in my original solution.

In case b),

dates_by_month = map(strToMonthDate, arr[:,0])

returns a list, not a NumPy array. The advanced indexing:

group = arr[dates_by_month==key]

therefore does not work. If instead I have:

dates_by_month = np.array(map(strToMonthDate, arr[:,0]))

then the grouping works as expected.
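The difference can be seen in a small sketch. It is written for Python 3, which is an assumption on my part since the code above is Python 2; the helper `str_to_month_date` and the three sample dates are illustrative only. Comparing a list with a datetime gives a single bool, while comparing a NumPy array gives one bool per element, which is what the indexing needs.

```python
from datetime import datetime

import numpy as np


def str_to_month_date(string):
    """Parse 'YYYY-MM-DD' and truncate it to the first day of its month."""
    d = datetime.strptime(string, "%Y-%m-%d")
    return datetime(d.year, d.month, 1)


dates = ["2012-04-01", "2012-04-02", "2012-05-02"]
key = datetime(2012, 4, 1)

as_list = list(map(str_to_month_date, dates))
as_array = np.array(as_list)  # 1-D object array of datetime objects

# A list compared with a datetime is just one bool, not a mask.
list_result = (as_list == key)

# An array compared with a datetime is an elementwise boolean mask.
mask = (as_array == key)

# Python 3 caveat: map() returns an iterator, so wrapping it directly
# yields a 0-dimensional object array, not an array of dates.
wrapped_map = np.array(map(str_to_month_date, dates))

print(list_result)       # False
print(mask)              # [ True  True False]
print(wrapped_map.ndim)  # 0
```

So on Python 3 the fix would need to be np.array(list(map(...))) or a list comprehension inside np.array.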

Source

2013-07-30 02:01:01 frank

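Please feel free to accept your own answer, so that future users facing the same problem can benefit from your knowledge. – Hyperboreus

@Hyperboreus, thanks. I will wait out the two-day limit before I am allowed to accept my own answer. – frank

@frank I have slightly changed the formatting of your post to try to make it easier to follow... feel free to adjust or roll back... –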
