pandas: groupby produces output in an unwanted format... how do I make groupby().sum() return a tabular structure?

I searched the pandas documentation and unfortunately could not find the answer.

Essentially, after some data wrangling, I have this DataFrame:

ticker_id   close_date   sector sector_index 
0   1 2014-02-28 00:00:00 Consumer Goods  31.106653 
1   1 2014-02-27 00:00:00 Consumer Goods  30.951213 
2   2 2014-02-28 00:00:00 Consumer Goods  19.846387 
3   2 2014-02-27 00:00:00 Consumer Goods  19.671747 
4   3 2014-02-28 00:00:00 Consumer Goods 1208.552000 
5   3 2014-02-27 00:00:00 Consumer Goods 1193.352000 
6   4 2014-02-28 00:00:00 Consumer Goods  9.893989 
7   4 2014-02-27 00:00:00 Consumer Goods  9.857385 
8   5 2014-02-28 00:00:00 Consumer Goods  52.196757 
9   5 2014-02-27 00:00:00 Consumer Goods  53.101520 
10   6 2014-02-28 00:00:00   Services  5.449554 
11   6 2014-02-27 00:00:00   Services  5.440019 
12   7 2014-02-28 00:00:00 Basic Materials 4149.237000 
13   7 2014-02-27 00:00:00 Basic Materials 4130.704000

I ran a groupby:

df_all2 = df_all.groupby(['close_date','sector']).sum()
print(df_all2)

and the result was this:

      ticker_id sector_index 
close_date sector         
2014-02-27 Basic Materials   7 4130.704000 
      Consumer Goods   15 1306.933865 
      Services     6  5.440019 
2014-02-28 Basic Materials   7 4149.237000 
      Consumer Goods   15 1321.595786 
      Services     6  5.449554

But in this form I can't upload it to MySQL properly. To upload it to MySQL correctly, I need to do the following, plus a few other things:

data2 = list(tuple(x) for x in df_all2.values)

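But data2 contains meaningless garbage.

To make a long story short, how can I get groupby to give me the following result (where close_date is filled in on every row and the column headers are laid out as a flat table)?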

close_date sector   ticker_id sector_index 
2014-02-27 Basic Materials   7 4130.704000 
2014-02-27 Consumer Goods   15 1306.933865 
2014-02-27 Services     6  5.440019 
2014-02-28 Basic Materials   7 4149.237000 
2014-02-28 Consumer Goods   15 1321.595786 
2014-02-28 Services     6  5.449554

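Also, to help the community, how should I change the title so that other pandas users who run into this problem can find the solution? I greatly appreciate your help.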

Asked 2014-03-13 by vt2424253

You have to call reset_index on the MultiIndex before using to_sql*:

In [11]: df.groupby(['close_date','sector']).sum().reset_index() 
Out[11]: 
    close_date   sector ticker_id sector_index 
0 2014-02-27 Basic Materials   7 4130.704000 
1 2014-02-27 Consumer Goods   15 1306.933865 
2 2014-02-27   Services   6  5.440019 
3 2014-02-28 Basic Materials   7 4149.237000 
4 2014-02-28 Consumer Goods   15 1321.595786 
5 2014-02-28   Services   6  5.449554
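
To illustrate why the tuple conversion from the question loses the dates and sectors, here is a minimal sketch. It assumes a cut-down copy of the question's data with close_date stored as plain strings (the original column holds timestamps). The group keys live in the index, and `.values` only returns the data columns, so the keys only show up in the tuples once the index has been reset.

```python
import pandas as pd

# Cut-down copy of the question's data; close_date as strings is an assumption.
df_all = pd.DataFrame({
    "ticker_id": [1, 1, 2, 2, 6, 6],
    "close_date": ["2014-02-28", "2014-02-27"] * 3,
    "sector": ["Consumer Goods"] * 4 + ["Services"] * 2,
    "sector_index": [31.106653, 30.951213, 19.846387,
                     19.671747, 5.449554, 5.440019],
})

grouped = df_all.groupby(["close_date", "sector"]).sum()

# .values covers only the data columns; close_date and sector sit in the index.
keys_missing = [tuple(x) for x in grouped.values]

# After reset_index the keys are ordinary columns and appear in every tuple.
flat = grouped.reset_index()
rows = list(flat.itertuples(index=False, name=None))

print(len(keys_missing[0]), len(rows[0]))
```

Using `itertuples(index=False, name=None)` instead of `.values` is a design choice in this sketch: it avoids building one mixed-type array out of string and numeric columns.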

Or you can use as_index=False in the groupby:

In [12]: df.groupby(['close_date','sector'], as_index=False).sum() 
Out[12]: 
    close_date   sector ticker_id sector_index 
0 2014-02-27 Basic Materials   7 4130.704000 
1 2014-02-27 Consumer Goods   15 1306.933865 
2 2014-02-27   Services   6  5.440019 
3 2014-02-28 Basic Materials   7 4149.237000 
4 2014-02-28 Consumer Goods   15 1321.595786 
5 2014-02-28   Services   6  5.449554
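
As a quick check that the two spellings agree, the sketch below runs both on a small made-up frame (the numbers are illustrative, not the question's data) and compares the results.

```python
import pandas as pd

# Small illustrative frame; the values are made up for this sketch.
df = pd.DataFrame({
    "ticker_id": [1, 1, 2, 2, 6, 6],
    "close_date": ["2014-02-28", "2014-02-27"] * 3,
    "sector": ["Consumer Goods"] * 4 + ["Services"] * 2,
    "sector_index": [31.0, 30.5, 19.5, 19.25, 5.5, 5.25],
})

keys = ["close_date", "sector"]

# Option 1: group, aggregate, then move the index levels back into columns.
via_reset = df.groupby(keys).sum().reset_index()

# Option 2: ask groupby not to build an index from the keys at all.
via_flag = df.groupby(keys, as_index=False).sum()

print(via_reset.equals(via_flag))
```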

*Note: this should be fixed from 0.14 onwards, i.e. you should be able to save a MultiIndex to SQL.

See How to insert pandas dataframe via mysqldb into database?.
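
A rough sketch of what that looks like in a recent pandas version, using an in-memory SQLite database as a stand-in for MySQL so the example is self-contained. The table name `sector_totals` and the sample values are made up for illustration. With the default `index=True`, `to_sql` writes each index level as its own column.

```python
import sqlite3

import pandas as pd

# Illustrative data; SQLite stands in for MySQL in this sketch.
df = pd.DataFrame({
    "ticker_id": [1, 1, 2, 2, 6, 6],
    "close_date": ["2014-02-28", "2014-02-27"] * 3,
    "sector": ["Consumer Goods"] * 4 + ["Services"] * 2,
    "sector_index": [31.0, 30.5, 19.5, 19.25, 5.5, 5.25],
})
grouped = df.groupby(["close_date", "sector"]).sum()  # MultiIndex result

conn = sqlite3.connect(":memory:")

# index=True is the default, so both index levels are written as columns.
grouped.to_sql("sector_totals", conn)

stored = pd.read_sql("SELECT * FROM sector_totals", conn)
conn.close()

print(stored)
```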

Answered 2014-03-13 01:01:15

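Thanks a lot for the answer and for the link to "How to insert pandas dataframe via mysqldb into database?". I couldn't get that to work, so I am using the pymysql package. Do you know whether pandas can work with pymysql? – vt2424253

@vt2424253 I think some users on here have said that it does.

p.s. If it helped, don't forget to upvote/accept!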
