2017-08-01 108 views
1

pandas: removing repeated data ranges

Hi everyone, I have the following DataFrame:

df1 
     WL  WM  WH  WP  
1  low medium high premium 
2  26  26  15  14 
3  32  32  18  29 
4  41  41  19  42 
5  apple dog  fur  napkins   
6  orange cat  tesla earphone 
7  NaN  rat  tobias controller 
8  NaN  NaN  phone 
9  low  medium high    
10  1  3  5 
11  2  4  6 
12 low  medium high 
13  4  8  10 
14  5  9  11 

Is there a way to remove each repeated 'low' row plus the 2 rows that follow it, so that the output looks like this:

df1 
     WL  WM  WH  WP  
1  low medium high premium 
2  26  26  15  14 
3  32  32  18  29 
4  41  41  19  42 
5  apple dog  fur  napkins   
6  orange cat  tesla earphone 
7  NaN  rat  tobias controller 
8  NaN  NaN  phone 

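Unfortunately, the code has to be dynamic, because I have multiple DataFrames and the position of each 'low' row differs between them. My initial attempt: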

df1 = df1[~df1.iloc[:,0].isin(['LOW'])+2].reset_index(drop=True) 

However, this does not do what I expected. Any help is appreciated.
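A short illustration of why the attempt above does not behave as hoped. This is only a sketch on a made-up Series (`first_col` is a hypothetical stand-in for the first column of `df1`): `isin` matches case-sensitively, and `+ 2` is applied to the boolean values themselves rather than shifting the selection by two rows.

```python
import pandas as pd

# Hypothetical stand-in for df1.iloc[:, 0]
first_col = pd.Series(["low", 26, 32, "low", 1, 2])

# isin() is case-sensitive, so 'LOW' never matches 'low'
mask = first_col.isin(["LOW"])
print(mask.any())  # False

# ~... + 2 is evaluated as (~...) + 2: integer arithmetic on the
# booleans, not a shift of the selection by two rows
shifted = ~first_col.isin(["low"]) + 2
print(shifted.tolist())  # [2, 3, 3, 2, 3, 3]
```

The result is a Series of integers, so it can no longer be used as a boolean row filter.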

Answers

1

You can use:

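import numpy as np

# index labels of the rows whose first column is 'low'
a = df.index[df.iloc[:, 0] == 'low']

# number of rows to remove after each repeated 'low' row
size = 2
# for every 'low' except the first (a[1:]), build the range of labels to drop;
# min() caps the range at the last label so no non-existent rows are selected
# (this relies on the index running from 1 to len(df), as in the question)
arr = [np.arange(i, min(i + size + 1, len(df) + 1)) for i in a[1:]]
idx = np.unique(np.concatenate(arr))
print(idx)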
[ 9 10 11 12 13 14] 

#remove rows 
df = df.drop(idx) 
print (df) 
     WL  WM  WH   WP 
1  low medium high  premium 
2  26  26  15   14 
3  32  32  18   29 
4  41  41  19   42 
5 apple  dog  fur  napkins 
6 orange  cat tesla earphone 
7  NaN  rat tobias controller 
8  NaN  NaN phone   NaN 
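For reuse across several DataFrames, the same idea can be wrapped in a function. The following is a minimal sketch, not part of the original answer: the function name `drop_repeated_blocks` and its parameters `marker` and `size` are made up for illustration, and it works on row positions instead of index labels, so it does not depend on the index starting at 1. The sample data is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd

def drop_repeated_blocks(df, marker="low", size=2):
    """Drop every repeat of the marker row (all but the first)
    together with the `size` rows that follow it."""
    # 0-based positions of rows whose first column equals the marker
    hits = np.flatnonzero((df.iloc[:, 0] == marker).to_numpy())
    drop = np.zeros(len(df), dtype=bool)
    # Skip the first hit: that row and the data below it are kept
    for pos in hits[1:]:
        # Slice assignment clips at the end of the array automatically
        drop[pos:pos + size + 1] = True
    return df[~drop]

# Sample data reconstructed from the question (index 1..14)
df = pd.DataFrame(
    {
        "WL": ["low", 26, 32, 41, "apple", "orange", np.nan, np.nan,
               "low", 1, 2, "low", 4, 5],
        "WM": ["medium", 26, 32, 41, "dog", "cat", "rat", np.nan,
               "medium", 3, 4, "medium", 8, 9],
        "WH": ["high", 15, 18, 19, "fur", "tesla", "tobias", "phone",
               "high", 5, 6, "high", 10, 11],
        "WP": ["premium", 14, 29, 42, "napkins", "earphone", "controller"]
              + [np.nan] * 7,
    },
    index=range(1, 15),
)

result = drop_repeated_blocks(df, marker="low", size=2)
print(result)
```

With `size=2`, only the rows labelled 1 through 8 remain, matching the output requested in the question.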
+0

Thanks! Out of curiosity, what is the size variable? – codeninja

+0

It is the window size. – jezrael

+0

You need to remove 2 rows, so 'size = 2'. – jezrael