2016-04-13 63 views
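Difference between two columns of a pandas DataFrame

I have a DataFrame named pricecomp_df, and I want to compare the "market price" column against each of the other price columns ("apple price", "mangoes price", "watermelon price"), taking the difference according to a priority rule: watermelon price has first priority, mangoes price second, and apple price third. The input DataFrame is given below: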

   code  apple price  mangoes price  watermelon price  market price
0   101          101            NaN               NaN           122
1   102          123            123               NaN           124
2   103          NaN            NaN               NaN           123
3   105          123            167               NaN           154
4   107          165            NaN               177           176
5   110          123            NaN               NaN           123

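In the first row only the apple price is present alongside the market price, so I take the difference between those two. In the second row both the apple and mangoes prices are present, so I must take only the difference between the market price and the mangoes price. The same priority rule should decide the difference in every row, and rows where all three prices are NaN should be skipped. Can anyone help?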

Can anyone help me? – User1090

Three years later, I came up with a solution. Do you still need it, @User1090? – MERose

Answer

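I hope I'm not too late. The idea is to compute the differences and overwrite them according to your priority list: start with the lowest priority (apple), then overwrite with mangoes, then with watermelon wherever those prices are present.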

import numpy as np 
import pandas as pd 

df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110], 
        'apple price': [101, 123, np.nan, 123, 165, 123], 
        'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan], 
        'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan], 
        'market price': [122, 124, 123, 154, 176, 123]}) 

# Calculate difference to apple price 
df['diff'] = df['market price'] - df['apple price'] 
# Overwrite with difference to mangoes price 
df['diff'] = df.apply(lambda x: x['market price'] - x['mangoes price'] if not np.isnan(x['mangoes price']) else x['diff'], axis=1) 
# Overwrite with difference to watermelon price 
df['diff'] = df.apply(lambda x: x['market price'] - x['watermelon price'] if not np.isnan(x['watermelon price']) else x['diff'], axis=1) 

print(df)
   code  apple price  mangoes price  watermelon price  market price  diff
0   101        101.0            NaN               NaN           122  21.0
1   102        123.0          123.0               NaN           124   1.0
2   103          NaN            NaN               NaN           123   NaN
3   105        123.0          167.0               NaN           154 -13.0
4   107        165.0            NaN             177.0           176  -1.0
5   110        123.0            NaN               NaN           123   0.0
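
A vectorized variant of the same idea, offered as a sketch and not as part of the original answer. It assumes the priority order stated in the question (watermelon, then mangoes, then apple) and picks, for each row, the first non-NaN price in that order. The names `priority`, `chosen`, and `result` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                   'apple price': [101, 123, np.nan, 123, 165, 123],
                   'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                   'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                   'market price': [122, 124, 123, 154, 176, 123]})

# Price columns in priority order, highest priority first (assumed from the question)
priority = ['watermelon price', 'mangoes price', 'apple price']

# Start from the highest-priority column and fill its gaps from the next ones
chosen = df[priority[0]]
for col in priority[1:]:
    chosen = chosen.fillna(df[col])

# Difference between the market price and the selected price
df['diff'] = df['market price'] - chosen

# Skip rows where all three prices are NaN, as the question asks
result = df.dropna(subset=priority, how='all')
print(result)
```

This avoids the row-wise `apply` calls, and adding another price column only requires extending `priority`. The row with code 103 is dropped from `result` because it has no price to compare against.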