2017-05-26 72 views

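I have a (268x4) df and have found the outliers of one column (22,1). I want to remove those outliers from df. How do I do that? How do I remove outliers from a DataFrame?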

> import pandas as pd  # to manipulate dataframes
> import numpy as np   # to manipulate arrays
>
> df = df_nonull  # df_nonull is created earlier (not shown here)
>
> # a number "a" from the vector "x" is an outlier if
> # a > median(x) + 1.5*iqr(x) or a < median(x) - 1.5*iqr(x)
> # iqr: interquartile range = third quartile - first quartile
> def outliers(x):
>     return np.abs(x - x.median()) > 1.5 * (x.quantile(0.75) - x.quantile(0.25))
>
> # Give the outliers for the first column, for example.
> # Stored under a new name so the outliers() function is not overwritten.
> stock_outliers = df.StockValue[outliers(df.StockValue)]

Answers


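You can only delete whole rows, not individual cells such as (22,1). If you want to delete the whole row, use df.drop(df.index[[22]]). Note that drop returns a new DataFrame, so assign the result back to df.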
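
To remove every flagged row at once rather than a single position, one option is to reuse the boolean mask that outliers() returns. This is a minimal sketch, assuming the goal is to drop each row whose StockValue is flagged. Only the StockValue column name and the outliers() rule come from the question; the sample data and the names mask, df_clean and df_dropped are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's 268x4 frame; the values are invented.
df = pd.DataFrame({
    "StockValue": [10.0, 11.0, 12.0, 13.0, 14.0, 100.0, -50.0],
    "Other": list("abcdefg"),
})


def outliers(x):
    """Boolean mask: True where a value lies more than 1.5*IQR from the median."""
    iqr = x.quantile(0.75) - x.quantile(0.25)
    return np.abs(x - x.median()) > 1.5 * iqr


mask = outliers(df["StockValue"])

# Option 1: keep only the rows that are not flagged.
df_clean = df[~mask]

# Option 2: drop the flagged rows by their index labels.
df_dropped = df.drop(df[mask].index)
```

Both options return a new DataFrame and leave df unchanged. The mask form is usually the shorter of the two; the drop form is handy when the outlier index labels were already stored separately.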