2017-05-03 81 views

Pandas DataFrame: group by A, take the largest B, output C

What are the top two C values for each A, based on the values in B?

import pandas as pd

df = pd.DataFrame({
    'A': ["first", "second", "second", "first",
          "second", "first", "third", "fourth",
          "fifth", "second", "fifth", "first",
          "first", "second", "third", "fourth", "fifth"],
    'B': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'C': ["a", "b", "c", "d",
          "e", "f", "g", "h",
          "i", "j", "k", "l",
          "m", "n", "o", "p", "q"]})

I tried:

x = df.groupby(['A'])['B'].nlargest(2) 

A
fifth   16    7
        10    4
first   12    6
        11    5
fourth  15    7
        7     3
second  13    6
        9     4
third   14    6
        6     3

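But this drops column C, which holds the actual values I need.

I want C in the result, not the row index of the original df. Do I have to join back? I would even settle for a separate list of just the C values...

I need to act on the top 2 C values (based on B) for each A.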

Answer


IIUC:

In [42]: df.groupby('A')[['B', 'C']].apply(lambda x: x.nlargest(2, columns=['B']))
Out[42]:
           B  C
A
fifth  16  7  q
       10  4  k
first  12  6  m
       11  5  l
fourth 15  7  p
       7   3  h
second 13  6  n
       9   4  j
third  14  6  o
       6   3  g

That's it. This one had been giving me a headache for hours. I'm new to both pandas and Python, and this helped a lot! Thanks, MaxU! – user4445586