2014-05-05 50 views
0

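Remove entire rows containing zeros from two Pandas Series

I have a function that plots the log of two columns of a Pandas DataFrame. Zeros therefore cause errors and need to be removed. At the moment, the inputs to the function are two columns of a DataFrame. Is there a way to remove any rows that contain zeros? For example, an equivalent of df = df[df.ColA != 0].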

import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def logscatfit(x, y, title):
    xvals2 = np.arange(-2, 6, 1)
    a = np.log(x)  # These are what I want to remove the zeros from
    b = np.log(y)
    plt.scatter(a, b, c='g', marker='x', s=35)
    slope, intercept, r_value, p_value, std_err = stats.linregress(a, b)
    plt.plot(xvals2, (xvals2*slope + intercept), color='red')
    plt.title(title)
    plt.show()
    print("Slope is:", slope, ". Intercept is:", intercept, ". R-value is:", r_value, ". P-value is:", p_value, ". Std_err is:", std_err)

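I can't think of a way to remove the zeros from both a and b while keeping them the same length so that I can still draw the scatter plot. Is my only option to rewrite the function to take a DataFrame and then remove the zeros with something like df1 = df[df.ColA != 0] followed by df2 = df1[df1.ColB != 0]?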

Answers

2

As I understand your question, you need to remove the rows where x or y (or both) is zero.

A simple way is:

keepThese = (x > 0) & (y > 0) 
a = x[keepThese] 
b = y[keepThese] 

and then proceed with the rest of your code.
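To illustrate why this keeps both Series the same length, here is a minimal sketch on made-up data (the sample values and the name keep_these are assumptions for the example, not part of the original question). The same boolean mask is applied to both Series, so a row is dropped from both whenever either value is not strictly positive.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the zeros sit at different positions in each Series.
x = pd.Series([1.0, 0.0, 10.0, 100.0, 1000.0])
y = pd.Series([2.0, 5.0, 0.0, 200.0, 2000.0])

# One shared mask: True only where both values are strictly positive.
keep_these = (x > 0) & (y > 0)

# Applying the same mask to both Series keeps them aligned and equally long.
a = np.log(x[keep_these])
b = np.log(y[keep_these])

print(len(a), len(b))
print(list(a.index))
```

Note that x > 0 also drops negative values, not just zeros, which is usually what you want before taking a log. If only exact zeros should be removed, (x != 0) & (y != 0) would be the closer match to the question.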

0

Plugging FooBar's answer into your function gives:

def logscatfit(x,y,title): 
    xvals2 = np.arange(-2,6,1) 
    keepThese = (x > 0) & (y > 0) 
    a = x[keepThese] 
    b = y[keepThese]
    a = np.log(a) 
    b = np.log(b) 
    plt.scatter(a, b, c='g', marker='x', s=35) 
    slope, intercept, r_value, p_value, std_err = stats.linregress(a,b) 
    plt.plot(xvals2, (xvals2*slope + intercept), color='red') 
    plt.title(title) 
    plt.show() 
    print("Slope is:", slope, ". Intercept is:", intercept, ". R-value is:", r_value, ". P-value is:", p_value, ". Std_err is:", std_err)
1

I like FooBar's answer for its simplicity. A more general approach is to pass the DataFrame to your function and use the .any() method.

def logscatfit(df, x_col_name, y_col_name, title):
    two_cols = df[[x_col_name, y_col_name]]
    # True for every row in which either column contains a zero
    mask = (two_cols == 0).any(axis=1)
    # Invert the mask so that only the rows without zeros are kept
    df_to_use = df[~mask]
    x = df_to_use[x_col_name]
    y = df_to_use[y_col_name]

    # your code
    a = np.log(x)
    # etc.
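
As a rough, self-contained sketch of the DataFrame approach, the filtering step can be pulled out into a small helper. The helper name drop_zero_rows and the sample data are hypothetical and only serve the illustration; the column names ColA and ColB are taken from the question.

```python
import numpy as np
import pandas as pd

def drop_zero_rows(df, cols):
    """Return the rows of df in which none of the given columns equals zero."""
    has_zero = (df[cols] == 0).any(axis=1)  # True where any listed column is 0
    return df[~has_zero]

# Hypothetical sample data; ColC is all zeros but is not checked.
df = pd.DataFrame({
    "ColA": [1.0, 0.0, 10.0, 100.0],
    "ColB": [2.0, 5.0, 0.0, 200.0],
    "ColC": [0, 0, 0, 0],
})

clean = drop_zero_rows(df, ["ColA", "ColB"])
a = np.log(clean["ColA"])
b = np.log(clean["ColB"])
print(clean)
```

Only the columns passed in cols are checked, so zeros in other columns do not cause rows to be dropped. Unlike the x > 0 mask above, this comparison removes exact zeros only, so negative values would still need separate handling before taking the log.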