2017-04-18 16 views
1

I have a pandas DataFrame, relation_between_countries, and I want to find the difference within each relationship:

country_from country_to points 
1 Albania  Austria  10 
2 Denmark  Austria  5 
3 Austria  Albania  2 
4 Greece   Norway  4 
5 Norway   Greece  5 

I am trying to get the difference in points within each relationship, like this:

country_from_or_to country_to_or_from difference 
Albania    Austria    8 
Denmark    Austria    
Greece    Norway    -1 

Do you have any idea how to do this?

Answer

5

Use DataFrameGroupBy.diff:

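cols = ['country_from','country_to']
# sort the two country names within each row, so A -> B and B -> A become the same pair
# (result_type='broadcast' keeps the original columns; it needs pandas 0.23+)
df[cols] = df[cols].apply(sorted, axis=1, result_type='broadcast')
# difference between each row's points and the next row's points in the same pair
df['difference'] = df.groupby(cols)['points'].diff(-1)
print(df)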
    country_from country_to points difference 
1  Albania Austria  10   8.0 
2  Austria Denmark  5   NaN 
3  Albania Austria  2   NaN 
4  Greece  Norway  4  -1.0 
5  Greece  Norway  5   NaN 
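
For reference, here is a self-contained sketch of the same idea built on the sample data from the question. It makes two choices that are not part of the answer above: it uses numpy.sort for the row-wise sort instead of apply, and it adds a drop_duplicates step (with the hypothetical names pairs and summary) to keep one row per pair, as in the desired output.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "country_from": ["Albania", "Denmark", "Austria", "Greece", "Norway"],
        "country_to": ["Austria", "Austria", "Albania", "Norway", "Greece"],
        "points": [10, 5, 2, 4, 5],
    },
    index=[1, 2, 3, 4, 5],
)

cols = ["country_from", "country_to"]

# Work on a copy so the original frame keeps its direction information.
pairs = df.copy()

# Sort the two names within each row so both directions share one key.
pairs[cols] = np.sort(pairs[cols].to_numpy(), axis=1)

# Points of this row minus points of the next row in the same pair.
pairs["difference"] = pairs.groupby(cols)["points"].diff(-1)

# Assumption: one row per pair is wanted, so keep only the first occurrence.
summary = pairs.drop_duplicates(subset=cols, keep="first").drop(columns="points")
print(summary)
```

Note that after sorting, the Denmark/Austria pair appears as Austria/Denmark, and its difference stays NaN because there is no row in the opposite direction.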

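You can also replace NaN with an empty string, but the column then holds mixed values (numbers alongside strings), so some functions may return unexpected output: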

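cols = ['country_from','country_to']
# sort the two country names within each row (result_type needs pandas 0.23+)
df[cols] = df[cols].apply(sorted, axis=1, result_type='broadcast')
# fillna('') replaces NaN with an empty string, so the column is no longer purely numeric
df['difference'] = df.groupby(cols)['points'].diff(-1).fillna('')
print(df)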
    country_from country_to points difference 
1  Albania Austria  10   8 
2  Austria Denmark  5   
3  Albania Austria  2   
4  Greece  Norway  4   -1 
5  Greece  Norway  5   
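
The mixed-value caveat can be seen with a small standalone sketch. The Series below is typed in by hand to mirror the difference column (the names difference, filled and recovered are illustrative only), and pd.to_numeric is shown as one possible way back to numbers, not as part of the answer.

```python
import pandas as pd

# Hand-written stand-in for the 'difference' column computed above.
difference = pd.Series([8.0, float("nan"), float("nan"), -1.0, float("nan")])

# Replacing NaN with '' forces the column to hold floats and strings together.
filled = difference.fillna("")

print(difference.dtype)  # numeric dtype
print(filled.dtype)      # generic object dtype

# Numeric reductions skip NaN on the original column.
print(difference.sum())

# If the blanks are already in place, coercing back turns '' into NaN again.
recovered = pd.to_numeric(filled, errors="coerce")
print(recovered.sum())
```

If the blanks are only wanted for display, it may be simpler to keep the numeric column and fill it just before printing.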
+0

Thank you very much, @jezrael! – Kristoffer

+0

Great answer, great explanation! – piRSquared