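How do I apply group by to a Pandas DataFrame while ignoring NaN values?

Apologies if this is too simple, but I have searched a lot and could not find a solution to this problem. How do I apply group by to a Pandas DataFrame while ignoring NaN values?

I populate my DataFrame (df) as follows: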

import pandas as pd

# weather_path is the path to the weather CSV file (not shown here)
weather = pd.read_csv(weather_path)
weather_stn1 = weather[weather['Station'] == 1][['Tavg']]
weather_stn2 = weather[weather['Station'] == 2][['Tavg']]

df = pd.DataFrame(columns=['xAxis', 'yAxis1', 'yAxis2'])
df['xAxis'] = pd.to_datetime(weather['Date'])
df['yAxis1'] = weather_stn1['Tavg']
df['yAxis2'] = weather_stn2['Tavg']

My DataFrame looks like this:

       xAxis  yAxis1  yAxis2
0 2009-05-01      53     NaN
1 2009-05-01     NaN      55
2 2009-05-02      55     NaN
3 2009-05-02     NaN      55
4 2009-05-03      57     NaN
5 2009-05-03     NaN      58

But I would like my result to look like this:

       xAxis  yAxis1  yAxis2
0 2009-05-01      53      55
2 2009-05-02      55      55
4 2009-05-03      57      58

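I have been trying to reindex weather_stn1 and weather_stn2 and then apply group by, but it does not work the way I want. I end up with nothing displayed at all!

How can I solve this?

Thanks in advance for your time.

Source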

2015-05-26 Amir

Guys, I found the solution myself. In case anyone else gets stuck on this, here it is.

plot_df = pd.DataFrame(columns=['xAxis', 'yAxis1', 'yAxis2'])
plot_df['xAxis'] = pd.to_datetime(weather['Date'])
plot_df['yAxis1'] = weather_stn1['Tavg']
plot_df['yAxis2'] = weather_stn2['Tavg']

# mean() skips NaN by default, so the two rows per date collapse into one
plot_df = plot_df.groupby(plot_df['xAxis']).mean()

print(plot_df.reset_index())

Now my output is:

       xAxis  yAxis1  yAxis2
0 2009-05-01      53      55
1 2009-05-02      55      55
2 2009-05-03      57      58
3 2009-05-04      57      60
4 2009-05-05      60      62
5 2009-05-06      63      66

It's that simple!
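For anyone who wants to try the idea without the weather file, here is a minimal sketch. It assumes a small made-up frame shaped like the one in the question (the values are copied from the sample rows above, not read from the real CSV). The grouped mean skips NaN by default, so each date keeps the one real reading in each column.

```python
import numpy as np
import pandas as pd

# Made-up sample mirroring the frame in the question (not the real weather data).
plot_df = pd.DataFrame({
    "xAxis": pd.to_datetime([
        "2009-05-01", "2009-05-01",
        "2009-05-02", "2009-05-02",
        "2009-05-03", "2009-05-03",
    ]),
    "yAxis1": [53, np.nan, 55, np.nan, 57, np.nan],
    "yAxis2": [np.nan, 55, np.nan, 55, np.nan, 58],
})

# The grouped mean ignores NaN, so the two half-empty rows per date
# collapse into a single row holding both stations' values.
result = plot_df.groupby("xAxis").mean().reset_index()
print(result)
```

Note that if a date had several readings for the same station, mean() would average them. Using first() instead of mean() would keep the first non-null value in each column, which may or may not be what you want.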

Source

2015-05-26 17:10:08 Amir

What you really want to do is pivot the table so that the values in the Station column become the column headers. Try this:

df = weather.pivot(index='Date', columns='Station', values='Tavg')

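As long as there is no more than one record per station for each date, you will get what you want, except that the dates will be the index rather than a column. If you prefer, you can reset the index and rename the columns.

Source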
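A rough sketch of the pivot followed by the reset and rename steps is below. The sample rows are made up, and the target names xAxis, yAxis1 and yAxis2 are simply taken from the question; it assumes the stations are labelled 1 and 2 as integers.

```python
import pandas as pd

# Made-up long-format sample; column names follow the question's CSV.
weather = pd.DataFrame({
    "Date": ["2009-05-01", "2009-05-01", "2009-05-02", "2009-05-02"],
    "Station": [1, 2, 1, 2],
    "Tavg": [53, 55, 55, 55],
})
weather["Date"] = pd.to_datetime(weather["Date"])

# One row per date, one column per station.
wide = weather.pivot(index="Date", columns="Station", values="Tavg")

# Move the dates back into a column and use the names from the question.
wide = wide.reset_index().rename(
    columns={"Date": "xAxis", 1: "yAxis1", 2: "yAxis2"}
)
wide.columns.name = None  # drop the leftover "Station" label on the column axis
print(wide)
```

If the data does contain more than one record per station and date, pivot raises a ValueError. In that case pivot_table, which aggregates duplicates (mean by default), is one possible alternative.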

2015-05-26 19:14:12 JoeCondron
