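Set specific values in a mixed-valued DataFrame to a fixed value?

I have a data frame with response and predictor variables in the columns and observations in the rows. Some of the values in the responses are below a given limit of detection (LOD). Since I plan to apply a rank transformation to the responses, I would like to set all of those values equal to the LOD. Say the data frame is: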

data.head() 

   age  response1  response2  response3 risk     sex smoking
0   33   0.272206   0.358059   0.585652   no  female     yes
1   38   0.425486   0.675391   0.721062  yes  female      no
2   20   0.910602   0.200606   0.664955  yes  female      no
3   38   0.966014   0.584317   0.923788  yes  female      no
4   27   0.756356   0.550512   0.106534   no  female     yes

I would like to do

responses = ['response1', 'response2', 'response3'] 
LOD = 0.2 

data[responses][data[responses] <= LOD] = LOD

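which does not work for several reasons (for example, pandas does not know whether it should produce a view of the data or not, and I guess it won't).

How do I set all values where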
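As a minimal sketch of why the chained form has no effect: selecting a list of columns returns a new object, so the assignment lands on a temporary frame rather than on `data`. The small frame and its values below are made up for illustration and are not the asker's data.

```python
import warnings

import pandas as pd

# Hypothetical toy data, only for illustration
data = pd.DataFrame({
    "response1": [0.05, 0.50, 0.90],
    "response2": [0.30, 0.10, 0.70],
    "sex": ["female", "male", "female"],
})
responses = ["response1", "response2"]
LOD = 0.2

before = data.copy()

with warnings.catch_warnings():
    # pandas warns about chained assignment here; silenced for the demo
    warnings.simplefilter("ignore")
    # data[responses] is a copy, so this modifies a temporary object
    data[responses][data[responses] <= LOD] = LOD

# The original frame still contains the values below the LOD
print(data.equals(before))
```

Assigning the result back to `data[responses]` in a single step avoids the temporary copy.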

data[responses] <= LOD

equal to LOD?

Minimal example:

import numpy as np
import pandas as pd

from pandas import Series, DataFrame

# rename_categories replaces the old `x.cat.categories = [...]` setter,
# which was removed in pandas 2.0
x = Series(np.random.randint(0,2,50), dtype='category')
x = x.cat.rename_categories(['no', 'yes'])

y = Series(np.random.randint(0,2,50), dtype='category')
y = y.cat.rename_categories(['no', 'yes'])

z = Series(np.random.randint(0,2,50), dtype='category')
z = z.cat.rename_categories(['male', 'female'])

a = Series(np.random.randint(20,60,50), dtype='category')

data = DataFrame({'risk':x, 'smoking':y, 'sex':z,
    'response1': np.random.rand(50),
    'response2': np.random.rand(50),
    'response3': np.random.rand(50),
    'age':a})

Source

2016-10-19 Thomas Möbius

Do 'data[data[responses] <= LOD] = 0.2' – EdChum

You can use DataFrame.mask:

import numpy as np 
import pandas as pd 

np.random.seed(123) 
# rename_categories replaces the old `.cat.categories = [...]` setter,
# which was removed in pandas 2.0
x = pd.Series(np.random.randint(0,2,10), dtype='category')
x = x.cat.rename_categories(['no', 'yes'])
y = pd.Series(np.random.randint(0,2,10), dtype='category')
y = y.cat.rename_categories(['no', 'yes'])
z = pd.Series(np.random.randint(0,2,10), dtype='category')
z = z.cat.rename_categories(['male', 'female'])

a = pd.Series(np.random.randint(20,60,10), dtype='category') 

data = pd.DataFrame({ 
'risk':x, 
'smoking':y, 
'sex':z, 
'response1': np.random.rand(10), 
'response2': np.random.rand(10), 
'response3': np.random.rand(10), 
'age':a}) 
print (data) 
   age  response1  response2  response3 risk     sex smoking
0   24   0.722443   0.425830   0.866309   no    male     yes
1   23   0.322959   0.312261   0.250455  yes    male     yes
2   22   0.361789   0.426351   0.483034   no  female      no
3   40   0.228263   0.893389   0.985560   no  female     yes
4   59   0.293714   0.944160   0.519485   no  female      no
5   22   0.630976   0.501837   0.612895   no    male     yes
6   40   0.092105   0.623953   0.120629   no  female      no
7   27   0.433701   0.115618   0.826341  yes    male     yes
8   55   0.430863   0.317285   0.603060  yes    male     yes
9   48   0.493685   0.414826   0.545068   no    male      no

responses = ['response1', 'response2', 'response3'] 
LOD = 0.2 

print (data[responses] <= LOD) 
   response1  response2  response3
0      False      False      False
1      False      False      False
2      False      False      False
3      False      False      False
4      False      False      False
5      False      False      False
6       True      False       True
7      False       True      False
8      False      False      False
9      False      False      False

data[responses] = data[responses].mask(data[responses] <= LOD, LOD) 
print (data) 
   age  response1  response2  response3 risk     sex smoking
0   24   0.722443   0.425830   0.866309   no    male     yes
1   23   0.322959   0.312261   0.250455  yes    male     yes
2   22   0.361789   0.426351   0.483034   no  female      no
3   40   0.228263   0.893389   0.985560   no  female     yes
4   59   0.293714   0.944160   0.519485   no  female      no
5   22   0.630976   0.501837   0.612895   no    male     yes
6   40   0.200000   0.623953   0.200000   no  female      no
7   27   0.433701   0.200000   0.826341  yes    male     yes
8   55   0.430863   0.317285   0.603060  yes    male     yes
9   48   0.493685   0.414826   0.545068   no    male      no
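For this particular case (raising everything at or below a lower bound to that bound), `DataFrame.clip` with `lower=` should give the same result as the `mask` call above. The sketch below compares the two on a small made-up frame; the data values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
data = pd.DataFrame({
    "response1": [0.05, 0.50, 0.90],
    "response2": [0.30, 0.10, 0.70],
    "sex": ["female", "male", "female"],
})
responses = ["response1", "response2"]
LOD = 0.2

# Approach from the answer: replace values where the condition holds
masked = data[responses].mask(data[responses] <= LOD, LOD)

# Alternative: clip every response value from below at the LOD
clipped = data[responses].clip(lower=LOD)

print(masked.equals(clipped))

# Assign back in one step so the original frame is updated
data[responses] = clipped
print(data)
```

`mask` is the more general tool, since it accepts any boolean condition, while `clip` only covers the lower/upper bound case.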

Source

2016-10-19 13:45:02 jezrael

How is it working? – jezrael

Thx, it works perfectly! Learned another pandas feature today. .mask looks really powerful. –
