2016-04-20 117 views

Selecting the top rows of a pandas DataFrame based on grouping

A related question is here: Reordering pandas dataframe based on multiple column and sum of one column

How can I use the sort column to select the top 2 countries in this DataFrame? In this case, the top 2 countries would be Australia and Afghanistan.

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
2      Algeria  bus    827000.0    829351.0
3      Algeria  bus      2351.0    829351.0

- EDIT:

I would also like to keep the type column. In this case, the result should look like this:

   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0

Answers

1

UPDATE:

In [166]: df.loc[df.Country_FAO.isin(df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index)] 
Out[166]: 
   Country_FAO type   mean_area        sort
5    Australia  car  12141000.0  18910501.0
4    Australia  car   6475695.0  18910501.0
6    Australia  bus    293806.0  18910501.0
0  Afghanistan  car   2029000.0   2141000.0
1  Afghanistan  car    112000.0   2141000.0
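
For readers who want to run the filter end to end, here is a minimal, self-contained sketch. The frame is rebuilt by hand from the table in the question, and the intermediate names `totals`, `top_countries`, and `result` are illustrative, not from the original answer. Selecting `mean_area` before `sum()` is a small variation on the one-liner above, so that only that numeric column is aggregated.

```python
import pandas as pd

# Sample data copied from the question, in the row order shown there.
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia"] * 3 + ["Afghanistan"] * 2 + ["Algeria"] * 2,
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0] * 3 + [2141000.0] * 2 + [829351.0] * 2,
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# Total mean_area per country (a Series indexed by Country_FAO).
totals = df.groupby("Country_FAO")["mean_area"].sum()

# Names of the two countries with the largest totals.
top_countries = totals.nlargest(2).index

# Keep every original row (including `type`) for those countries.
result = df[df["Country_FAO"].isin(top_countries)]
print(result)
```

Because `isin` builds a boolean mask over the original rows, the result keeps all columns and the original row order.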

Here is how I would do it, step by step:

In [153]: df.groupby('Country_FAO').sum() 
Out[153]: 
              mean_area
Country_FAO
Afghanistan   2141000.0
Algeria        829351.0
Australia    18910501.0

In [154]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area') 
Out[154]: 
              mean_area
Country_FAO
Australia    18910501.0
Afghanistan   2141000.0

In [155]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').index 
Out[155]: Index(['Australia', 'Afghanistan'], dtype='object', name='Country_FAO') 

Also, you may want to reset the index:

In [156]: df.groupby('Country_FAO').sum().nlargest(2, 'mean_area').reset_index() 
Out[156]: 
   Country_FAO   mean_area
0    Australia  18910501.0
1  Afghanistan   2141000.0
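
Since the question asks about using the existing sort column, a possible alternative is sketched below. It assumes that `sort` already holds the per-country total of `mean_area`, which matches the sample data but is not stated explicitly in the question. The frame is rebuilt by hand from the question's table, and the names `top_countries` and `result` are illustrative.

```python
import pandas as pd

# Sample data copied from the question, in the row order shown there.
df = pd.DataFrame(
    {
        "Country_FAO": ["Australia"] * 3 + ["Afghanistan"] * 2 + ["Algeria"] * 2,
        "type": ["car", "car", "bus", "car", "car", "bus", "bus"],
        "mean_area": [12141000.0, 6475695.0, 293806.0,
                      2029000.0, 112000.0, 827000.0, 2351.0],
        "sort": [18910501.0] * 3 + [2141000.0] * 2 + [829351.0] * 2,
    },
    index=[5, 4, 6, 0, 1, 2, 3],
)

# Assumption: `sort` is constant within a country, so one row per country
# is enough to rank the countries by it.
top_countries = (
    df.drop_duplicates("Country_FAO")
      .nlargest(2, "sort")["Country_FAO"]
)

# Keep all rows and columns for the selected countries.
result = df[df["Country_FAO"].isin(top_countries)]
print(result)
```

This skips the groupby entirely, at the cost of trusting that the `sort` column is up to date.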

Thanks @MaxU, this solution drops the 'type' column. Is there a way to keep it? – user308827

@user308827, I have updated my answer, please check. – MaxU

Thanks @MaxU, this works! – user308827