Pandas DataFrame slicing

I have the following DataFrame:

     2012   2013   2014    2015   2016   2017   2018                Kategorie
0    5.31   5.27   5.61    4.34   4.54   5.02   7.07  Gewinn pro Aktie in EUR
1   13.39  14.70  12.45   16.29  15.67  14.17  10.08                      KGV
2  -21.21  -0.75   6.45  -22.63  -7.75   9.76  47.52           Gewinnwachstum
3  -17.78   2.27  -0.55    3.39   1.48   0.34    NaN                      PEG

Now I select only the KGV row with

df[df["Kategorie"] == "KGV"]

which outputs:

    2012  2013   2014   2015   2016   2017   2018 Kategorie
1  13.39  14.7  12.45  16.29  15.67  14.17  10.08       KGV

How can I calculate the mean() of the last five years (in this example 2016, 15, 14, 13, 12)?
I tried

df[df["Kategorie"] == "KGV"]["2016":"2012"].mean()

but this throws a TypeError. Why can't I slice the columns here?

Source

2016-09-17 Jan

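Why are the last five years 2012-2016?

Once you start slicing with '__getitem__' (square-bracket indexing), [pandas looks at the rows](https://github.com/pydata/pandas/blob/master/pandas/core/frame.py#L2043) rather than the columns. Also, a plain slice only moves forward. In your case, the indexing can instead be done with 'df.loc[df["Kategorie"] == "KGV", "2012":"2016"]'.

@AmiTavory: *Last* as in *counting **backwards** from now*, not last as in the final elements. – Jan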
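To illustrate the comment above about square brackets targeting rows, here is a minimal sketch that rebuilds the question's frame (assuming, as the string slice in the question suggests, that the year columns are strings). The question reports a TypeError for the string slice; the exact exception may depend on the pandas version, so the sketch only records whether it failed.

```python
import pandas as pd

# Rebuild the frame from the question; year labels are assumed to be strings.
df = pd.DataFrame(
    {
        "2012": [5.31, 13.39, -21.21, -17.78],
        "2013": [5.27, 14.70, -0.75, 2.27],
        "2014": [5.61, 12.45, 6.45, -0.55],
        "2015": [4.34, 16.29, -22.63, 3.39],
        "2016": [4.54, 15.67, -7.75, 1.48],
        "2017": [5.02, 14.17, 9.76, 0.34],
        "2018": [7.07, 10.08, 47.52, float("nan")],
        "Kategorie": ["Gewinn pro Aktie in EUR", "KGV", "Gewinnwachstum", "PEG"],
    }
)

# A slice inside plain [] is applied to the rows, not the columns.
first_two_rows = df[0:2]

# The slice from the question is therefore matched against the integer row
# index, where string bounds make no sense.
kgv = df[df["Kategorie"] == "KGV"]
try:
    kgv["2016":"2012"]
    outcome = "no exception"
except Exception as exc:  # the question reports a TypeError
    outcome = type(exc).__name__

# .loc takes a row selector and a column selector, so the column slice works.
last_five = df.loc[df["Kategorie"] == "KGV", "2012":"2016"]
mean_kgv = last_five.mean(axis=1)
print(outcome)
print(mean_kgv)
```

The row selector and the column slice are passed to .loc together, which avoids chaining two separate indexing steps.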

loc supports this kind of slicing (from left to right):

df.loc[df["Kategorie"] == "KGV", "2012":"2016"].mean(axis=1) 
Out: 
1 14.5 
dtype: float64

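Note that this does not mean 2012, 2013, 2014, 2015 and 2016. The labels are strings, so the slice means all columns between df['2012'] and df['2016']. If a column named foo sat between them, it would be selected too.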
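The caveat about a stray column can be sketched as follows. The frame and its "foo" column are hypothetical, made up only to show that a label slice follows column position rather than the year values.

```python
import pandas as pd

# Hypothetical frame: a non-year column "foo" sits between the year columns.
df = pd.DataFrame(
    [[13.39, 14.70, 99.0, 12.45, "KGV"]],
    columns=["2012", "2013", "foo", "2014", "Kategorie"],
)

# The label slice takes every column positioned from "2012" through "2014",
# both endpoints included, so "foo" comes along.
picked_by_slice = list(df.loc[:, "2012":"2014"].columns)

# Listing the labels explicitly does not depend on column order.
years = [str(y) for y in range(2012, 2015)]
picked_by_list = list(df.loc[:, years].columns)

print(picked_by_slice)
print(picked_by_list)
```

When the column order is not guaranteed, passing an explicit list of labels is the more defensive choice.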

Source

2016-09-17 14:50:39 ayhan

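Thanks a lot! There is no "foo" column in between, and the year columns are in sorted order. – Jan

I'm not sure why the last five years would be 2012-2016 (they look like the first five years). Nevertheless, to find the mean of 'KGV' for 2012-2016, you can use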

df[df['Kategorie'] == 'KGV'][[c for c in df.columns if c != 'Kategorie' and 2012 <= int(c) <= 2016]].mean(axis=1)

Source

2016-09-17 14:42:17

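*Last* as in *counting **backwards** from now*. Why would one use this as opposed to @ayhan's approach? – Jan

@Jan No particular reason. I answered before him, but I like his answer better.

I'd use filter and iloc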

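row = df[df.Kategorie == 'KGV']

# keep the four-digit year columns, sort them, take the last five, average across the row
row.filter(regex=r'\d{4}').sort_index(axis=1).iloc[:, -5:].mean(axis=1)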

1 13.732 
dtype: float64
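Note that the result above, 13.732, is the mean over 2014 through 2018, the last five columns in the frame. The sketch below shows which columns the pipeline picks, plus one possible variant that stops at 2016. The variant is an assumption about what the asker wants (2016 treated as "now"), not part of the original answer.

```python
import pandas as pd

# The KGV row from the question, with string year labels.
row = pd.DataFrame(
    [[13.39, 14.70, 12.45, 16.29, 15.67, 14.17, 10.08, "KGV"]],
    columns=["2012", "2013", "2014", "2015", "2016", "2017", "2018", "Kategorie"],
    index=[1],
)

# Keep only the four-digit year columns and sort them by label.
years = row.filter(regex=r"\d{4}").sort_index(axis=1)

# The last five columns of the sorted frame are 2014 through 2018.
latest_five = years.iloc[:, -5:]
latest_mean = latest_five.mean(axis=1)

# Assumed variant: cut the sorted columns at "2016" first, then take the
# last five, which gives 2012 through 2016.
up_to_2016 = years.loc[:, :"2016"].iloc[:, -5:]
mean_up_to_2016 = up_to_2016.mean(axis=1)

print(list(latest_five.columns), latest_mean.loc[1])
print(list(up_to_2016.columns), mean_up_to_2016.loc[1])
```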

Source

2016-09-17 14:43:53 piRSquared
