2016-11-15 77 views

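I need to check a list of index values every day, and to make them easier to read I put them into a DataFrame. I am using Python 2.7 and want to write the results of a loop into a pandas DataFrame.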

First, I collect my results in a list:

import pandas as pd

index_list = [df1, df2, df3, df4, df5, df6, df7]
value_list = [20, 22, 28, 29, 30, 31, 32, 33]
myarray = []

def minimum(dataframe, value):
    # earliest Datetime among the rows whose IDXType equals value
    return dataframe['Datetime'][dataframe['IDXType'] == value].min()

for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i, value_i))

This produces a long list of 56 items, which I then turn into a DataFrame by hand.

result = {'df1':pd.Series(myarray[0:8], index=value_list), 
    'df2':pd.Series(myarray[8:16], index=value_list), 
    'df3':pd.Series(myarray[16:24], index=value_list), 
    'df4':pd.Series(myarray[24:32], index=value_list), 
    'df5':pd.Series(myarray[32:40], index=value_list), 
    'df6':pd.Series(myarray[40:48], index=value_list), 
    'df7':pd.Series(myarray[48:56], index=value_list), 
    } 
result = pd.DataFrame(result) 
result 

It displays an 8 x 7 DataFrame, like the one below:

Expected Result

Is there a shortcut for this procedure, such as writing the results from the loop directly into a DataFrame?

My list keeps growing, so I cannot keep fixing my code every other day.

Is `index_list` a `list` of DataFrames with the columns `Datetime` and `IDXType`? – jezrael

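`index_list` is a list of DataFrames that contain those columns. `Datetime` and `IDXType` are the two columns I have to check in the original source DataFrames. –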

Answer

You can use:

import pandas as pd

df1 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 33, 33, 33]})

print(df1)
    Datetime  IDXType
0 2015-01-04       20
1 2015-01-05       20
2 2015-01-06       33
3 2015-01-07       33
4 2015-01-08       33

df2 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [30, 30, 21, 21, 10]})

print(df2)
    Datetime  IDXType
0 2015-01-04       30
1 2015-01-05       30
2 2015-01-06       21
3 2015-01-07       21
4 2015-01-08       10

df3 = pd.DataFrame({'Datetime': pd.date_range('2015-01-04', '2015-01-08'),
                    'IDXType': [20, 20, 30, 31, 31]})

print(df3)
    Datetime  IDXType
0 2015-01-04       20
1 2015-01-05       20
2 2015-01-06       30
3 2015-01-07       31
4 2015-01-08       31

index_list = [df1, df2, df3]
value_list = [20, 22, 28, 29, 30, 31, 32, 33]
myarray = []

def minimum(dataframe, value):
    # earliest Datetime among the rows whose IDXType equals value (NaT if none)
    return dataframe.loc[dataframe['IDXType'] == value, 'Datetime'].min()

for i in index_list:
    for value_i in value_list:
        myarray.append(minimum(i, value_i))
#print(myarray)

result = { 
'df1':pd.Series(myarray[0:8], index=value_list), 
'df2':pd.Series(myarray[8:16], index=value_list), 
'df3':pd.Series(myarray[16:24], index=value_list) 
} 
result = pd.DataFrame(result) 
print (result) 
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT
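
If the loop itself should stay, the hand-written slices can probably be avoided by building one Series per DataFrame inside the loop. The following is only a minimal sketch: the dict `frames` (label mapped to DataFrame) is a hypothetical name that does not appear in the question, and the sample data is copied from the frames above.

```python
import pandas as pd

# Sample frames copied from the example above.
dates = pd.date_range('2015-01-04', '2015-01-08')
frames = {  # hypothetical name: maps a column label to its DataFrame
    'df1': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 33, 33, 33]}),
    'df2': pd.DataFrame({'Datetime': dates, 'IDXType': [30, 30, 21, 21, 10]}),
    'df3': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 30, 31, 31]}),
}
value_list = [20, 22, 28, 29, 30, 31, 32, 33]


def minimum(dataframe, value):
    # earliest Datetime among the rows whose IDXType equals value (NaT if none)
    return dataframe.loc[dataframe['IDXType'] == value, 'Datetime'].min()


# Build one column per DataFrame; no slice boundaries to maintain by hand.
columns = {}
for name, frame in frames.items():
    columns[name] = pd.Series([minimum(frame, v) for v in value_list],
                              index=value_list)

result = pd.DataFrame(columns)
print(result)
```

With this layout, adding a DataFrame or a value only means extending `frames` or `value_list`; the shape of `result` follows automatically.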

My solution uses `groupby` with the `min` aggregation, then `concat` and `reindex`, and finally removes the index name with `rename_axis` (new in pandas 0.18.0):

print(df1.groupby('IDXType')['Datetime'].min())
IDXType
20   2015-01-04
33   2015-01-06
Name: Datetime, dtype: datetime64[ns]

df = pd.concat([df1.groupby('IDXType')['Datetime'].min(),
                df2.groupby('IDXType')['Datetime'].min(),
                df3.groupby('IDXType')['Datetime'].min()],
               axis=1,
               keys=('df1', 'df2', 'df3')).reindex(value_list).rename_axis(None)
print(df)
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT

You can also use a more dynamic solution with a list comprehension inside `concat`, but it needs an extra list of column names for the new `df5`:

index_list = [df1, df2, df3]
value_list = [20, 22, 28, 29, 30, 31, 32, 33]
namesdf = ['df1', 'df2', 'df3']
df5 = pd.concat([x.groupby('IDXType')['Datetime'].min() for x in index_list],
                axis=1,
                keys=namesdf).reindex(value_list).rename_axis(None)
print(df5)
          df1        df2        df3
20 2015-01-04        NaT 2015-01-04
22        NaT        NaT        NaT
28        NaT        NaT        NaT
29        NaT        NaT        NaT
30        NaT 2015-01-04 2015-01-06
31        NaT        NaT 2015-01-07
32        NaT        NaT        NaT
33 2015-01-06        NaT        NaT
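
The separate list of names can likely be dropped if the DataFrames are stored in a dict, because `pd.concat` also accepts a mapping and uses its keys as the column labels. This is a sketch under that assumption; `frames` is a hypothetical name, not something from the question.

```python
import pandas as pd

# Sample frames copied from the example above.
dates = pd.date_range('2015-01-04', '2015-01-08')
frames = {  # hypothetical name: maps a column label to its DataFrame
    'df1': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 33, 33, 33]}),
    'df2': pd.DataFrame({'Datetime': dates, 'IDXType': [30, 30, 21, 21, 10]}),
    'df3': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 30, 31, 31]}),
}
value_list = [20, 22, 28, 29, 30, 31, 32, 33]

# One min-per-IDXType Series per frame; the dict keys become the column names.
minima = {name: frame.groupby('IDXType')['Datetime'].min()
          for name, frame in frames.items()}

df5 = (pd.concat(minima, axis=1)
         .reindex(value_list)   # keep only the wanted values, in this order
         .rename_axis(None))    # drop the 'IDXType' index name
print(df5)
```

Registering a new DataFrame then takes a single new entry in `frames`, and the labels cannot drift out of step with the data.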

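Thanks for the `reindex` and `concat` ideas. My biggest problem is how to write directly into a DataFrame instead of converting existing data (which means I have to change the DataFrame's size, names, and so on every day). I need help getting from loop -> list -> DataFrame to loop -> DataFrame. –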

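Hmm, but why do you need a loop? pandas works best when you avoid loops altogether. I don't understand why my solution doesn't work for you; can you explain? – jezrael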

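Well, I have to read each DataFrame (in `index_list`) and, for each specific column (here `IDXType`), find the minimum where that column equals a specific value (in `value_list`)... I don't know another way, so I used nested loops... That may be a bad idea; do you have any other approach? –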
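
One possible loop-free way to express that lookup is to stack all frames into a single table, tag each row with the frame it came from, and let `groupby` plus `unstack` compute every minimum at once. This is only a sketch: the dict `frames` and the helper column `source` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Sample frames copied from the answer above.
dates = pd.date_range('2015-01-04', '2015-01-08')
frames = {  # hypothetical name: maps a column label to its DataFrame
    'df1': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 33, 33, 33]}),
    'df2': pd.DataFrame({'Datetime': dates, 'IDXType': [30, 30, 21, 21, 10]}),
    'df3': pd.DataFrame({'Datetime': dates, 'IDXType': [20, 20, 30, 31, 31]}),
}
value_list = [20, 22, 28, 29, 30, 31, 32, 33]

# Stack every frame into one table; 'source' records where each row came from.
combined = pd.concat([frame.assign(source=name) for name, frame in frames.items()],
                     ignore_index=True)

result = (combined.groupby(['IDXType', 'source'])['Datetime'].min()
                  .unstack('source')     # one column per source frame
                  .reindex(value_list))  # rows for the wanted values only
print(result)
```

Values that never occur in a frame show up as `NaT`, which matches the output of the nested loops. A completely empty frame would contribute no column, so that case would need separate handling.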