2017-02-10 264 views
0

Selecting rows in pandas

I have a pandas.DataFrame df:

Property Area dist 
A   50  2 
B   100 3 
C   20  10 
D   1  15 
E   20  16 
F   3  25 

I would like the final data frame to have the following form:

Property Area dist 
A   50  2 
C   20  10 
F   3  25 

That is, I want to omit the rows whose dist values are closer than 8 to each other.

+1

What have you tried? – haifzhan

+1

What do you mean by "closer than 8 to each other"? – Zero

Answers

1

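I believe this code matches your problem statement. The basic idea is to collect the set of dist values to keep, and then apply that set to the data frame.

Code: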

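# find the dist values to keep
to_keep = set()
min_value = None  # last dist value that was kept
min_dist = 8      # minimum gap required between kept values
for dist in sorted(df['dist']):
    # always keep the first value; comparing None with a number fails on Python 3
    if min_value is None or min_value <= dist - min_dist:
        min_value = dist
        to_keep.add(dist)

# build a new data frame with just the keep values
new_df = df.query('dist in @to_keep')
print(new_df)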

Produces:

Area dist 
A 50  2 
C 20 10 
F  3 25 
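
The same idea can also be sketched as a small self-contained function. This is only an illustration, not the answer's original code: the names keep_spaced_rows, column and min_gap are hypothetical, and the final filter uses boolean indexing with Series.isin in place of df.query, which may help on installations where query raises the numexpr ImportError reported in the comments below.

```python
import pandas as pd


def keep_spaced_rows(df, column="dist", min_gap=8):
    """Keep rows whose `column` value is at least `min_gap` above the last kept value.

    Hypothetical helper for illustration; the names are not from the original answer.
    """
    kept = []
    last = None  # last value that was kept
    for value in sorted(df[column]):
        if last is None or value - last >= min_gap:
            kept.append(value)
            last = value
    # Boolean indexing with isin does not go through DataFrame.query
    return df[df[column].isin(kept)]


# Same sample values as in the question
df = pd.DataFrame(
    {"Area": [50, 100, 20, 1, 20, 3], "dist": [2, 3, 10, 15, 16, 25]},
    index=["A", "B", "C", "D", "E", "F"],
)
new_df = keep_spaced_rows(df)
print(new_df)
```

As in the code above, the filter works on dist values rather than row labels, so if several rows shared a kept dist value they would all be retained.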

Sample data:

import numpy as np 
import pandas as pd 
props = np.array([ 
    ('Property', 'Area', 'dist'), 
    ('A',   50,  2), 
    ('B',   100,  3), 
    ('C',   20,  10), 
    ('D',   1,  15), 
    ('E',   20,  16), 
    ('F',   3,  25), 
    ]) 

df = pd.DataFrame(data=props[1:, 1:], 
        index=props[1:, 0], 
        columns=props[0, 1:]).apply(pd.to_numeric) 
+0

Thanks, everything works until the step new_df = df.query('dist in @to_keep'), where I get the error: raise ImportError("'numexpr' not found. Cannot use " ImportError: 'numexpr' not found. Cannot use engine='numexpr' for query/eval if 'numexpr' is not installed – Ssank

+0

I installed numexpr and everything works, thanks – Ssank