2017-05-31

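Passing column values to a lambda function in Pandas

I am trying to create a new column for the lower confidence interval using other values in the row. I have written (and published) the confidence interval calculations as the package public-health-cis on PyPI. These functions take float values and return a float.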

In my analysis script I am trying to call this function from a pandas DataFrame. I have tried several ways to achieve this, but to no avail.

df_for_ci_calcs = df[['Value', 'Count', 'Denominator']].copy()
df_for_ci_calcs = df_for_ci_calcs.applymap(lambda x: -1 if x == '*' else x)
df_for_ci_calcs = df_for_ci_calcs.astype(np.float)
df['LowerCI'].apply(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'].astype(float),
                                                             df_for_ci_calcs['Count'].astype(float),
                                                             df_for_ci_calcs['Denominator'].astype(float), indicator.rate))

This comes back with the following traceback:

Internal Server Error:/

df['LowerCI'].apply(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'].astype(float), df_for_ci_calcs['Count'].astype(float), df_for_ci_calcs['Denominator'].astype(float), indicator.rate))

TypeError: cannot convert the series to <class 'float'> 

I have also tried:

df['LowerCI'] = df_for_ci_calcs.applymap(lambda x: public_health_cis.wilson_lower(df_for_ci_calcs['Value'], df_for_ci_calcs['Count'],
                                                                                  df_for_ci_calcs['Denominator'], indicator.rate), axis=1)

which gives the error:

applymap() got an unexpected keyword argument 'axis'

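When I take the axis kwarg out, I get the same error as with the first approach. So how do I pass the values from each row to the function, to get a value based on the data in that row?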

Answer

I think you need apply with axis=1 to process the data row by row, so that the function receives its inputs as floats:

df['LowerCI'] = df[['Value', 'Count', 'Denominator']] \
    .replace('*', -1) \
    .astype(float) \
    .apply(lambda x: public_health_cis.wilson_lower(x['Value'],
                                                    x['Count'],
                                                    x['Denominator'],
                                                    indicator.rate),
           axis=1)
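
To see why the whole-column call fails while the row-wise call works, here is a minimal sketch. It does not use public_health_cis: lower_bound_stub is a hypothetical stand-in that only mimics a function converting each argument with float(), and its formula is arbitrary, not the Wilson interval.

```python
import pandas as pd


def lower_bound_stub(value, count, denominator, rate):
    """Hypothetical stand-in for a scalar-only function (NOT the Wilson interval)."""
    value, count, denominator = float(value), float(count), float(denominator)
    return rate * count / denominator - value


df = pd.DataFrame({'Value': [1.0, 2.0],
                   'Count': [4.0, 5.0],
                   'Denominator': [8.0, 10.0]})

# Passing whole columns: float() receives a multi-element Series and raises TypeError.
try:
    lower_bound_stub(df['Value'], df['Count'], df['Denominator'], 100)
    whole_column_error = None
except TypeError as exc:
    whole_column_error = exc

# Passing one row at a time: with axis=1 the lambda receives a single row,
# so row['Value'], row['Count'] and row['Denominator'] are single numbers.
result = df.apply(
    lambda row: lower_bound_stub(row['Value'], row['Count'], row['Denominator'], 100),
    axis=1,
)
```

The same reasoning explains the applymap attempt in the question: applymap calls the function once per cell and has no axis argument, and the lambda there still passed the full columns instead of using its own argument.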

Sample (for simplicity I changed indicator.rate to the scalar 100):

df = pd.DataFrame({'Value': ['*', 2, 3],
                   'Count': [4, 5, 6],
                   'Denominator': [7, 8, '*'],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})

print(df)
   Count  D Denominator  E  F Value
0      4  1           7  5  7     *
1      5  3           8  3  4     2
2      6  5           *  6  3     3

df['LowerCI'] = df[['Value', 'Count', 'Denominator']] \
    .replace('*', -1) \
    .astype(float) \
    .apply(lambda x: public_health_cis.wilson_lower(x['Value'],
                                                    x['Count'],
                                                    x['Denominator'],
                                                    100), axis=1)

print(df)
   Count  D Denominator  E  F Value    LowerCI
0      4  1           7  5  7     *  14.185885
1      5  3           8  3  4     2  18.376210
2      6  5           *  6  3     3  99.144602
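
Note that in the sample the rows containing '*' still get a LowerCI, computed from the placeholder -1. If that is not wanted, one possible variation (a sketch, not part of the answer above) is to turn '*' into NaN with pd.to_numeric(errors='coerce') and only call the function on complete rows. ratio_stub below is a hypothetical stand-in for the real function, used so the example runs without public_health_cis.

```python
import pandas as pd


def ratio_stub(value, count, denominator, rate):
    """Hypothetical stand-in for a scalar-only function (NOT the Wilson interval)."""
    return rate * float(count) / float(denominator)


df = pd.DataFrame({'Value': ['*', 2, 3],
                   'Count': [4, 5, 6],
                   'Denominator': [7, 8, '*']})

cols = ['Value', 'Count', 'Denominator']

# errors='coerce' converts anything non-numeric (here '*') to NaN.
numeric = df[cols].apply(pd.to_numeric, errors='coerce')

# Keep only rows where all three inputs are present.
complete = numeric.dropna()

# Assignment aligns on the index, so skipped rows end up as NaN in LowerCI.
df['LowerCI'] = complete.apply(
    lambda row: ratio_stub(row['Value'], row['Count'], row['Denominator'], 100),
    axis=1,
)
```

Whether suppressed values should be skipped or given a sentinel depends on what the downstream analysis expects, so treat this as one option rather than a correction.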

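That's it, thank you! I feel like an idiot for not indexing ['Value'], ['Count'] etc. on what I was sending, so I was passing the whole Series. No wonder it didn't like it! – RustyBrain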

Glad I could help, good luck! – jezrael