Randomly generating data with a higher proportion of zeros in Python

I want to simulate a variable whose values lie between 0 and 1, but I also want 80% of its values to be zero. Currently I can do the following:

data['response']=np.random.uniform(0,1,15000)#simulate response 
data['response']=data['response'].apply(lambda x:0 if x<0.85 else x)

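However, this only produces extreme values in the variable (0 and 0.8+). I want 80% of the rows to be zero and the remaining 20% to hold values between 0 and 1. The positions must be chosen at random.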

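Source

2017-04-02 user2906657

You can generate the 80% zeros, then append values between 0 and 1 numbering a quarter of the count of zeros, and then shuffle the result. – Garmekain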

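Here is one approach with np.random.choice. Its optional argument replace, set to False (or 0), fits this case: it yields unique indices along the full length of 15000. Random numbers generated with np.random.uniform are then assigned at those indices.

The implementation would look something along these lines: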

import numpy as np

# Parameters
s = 15000  # Length of array
zeros_ratio = 0.8  # Ratio of zeros expected in the array

out = np.zeros(s)  # Initialize output array
nonzeros_count = int(np.rint(s * (1 - zeros_ratio)))  # Count of nonzeros in array

# Generate unique indices where nonzeros are to be placed
idx = np.random.choice(s, nonzeros_count, replace=False)

# Generate values between 0 and 1 for those positions
nonzeros_num = np.random.uniform(0, 1, nonzeros_count)

# Finally assign into those unique positions
out[idx] = nonzeros_num

Sample run:

In [233]: np.isclose(out, 0).sum() 
Out[233]: 12000 

In [234]: (~np.isclose(out, 0)).sum() 
Out[234]: 3000
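
For reuse, the same steps can be wrapped in a function. This is only a minimal sketch: the function name `zero_inflated_uniform` and its `seed` parameter are made up for illustration, and it swaps in NumPy's newer `np.random.default_rng` generator for the legacy `np.random.*` calls. It draws the nonzero values as `1 - rng.random(k)`, which lies in (0, 1], so the number of exact zeros is fixed by construction.

```python
import numpy as np

def zero_inflated_uniform(n, zeros_ratio, seed=None):
    """Return n values in [0, 1] with a fixed share of exact zeros.

    Hypothetical helper: name and signature are illustrative only.
    """
    rng = np.random.default_rng(seed)
    out = np.zeros(n)
    # Number of positions that receive a nonzero value
    k = int(round(n * (1 - zeros_ratio)))
    # Unique random positions for the nonzero values
    idx = rng.choice(n, size=k, replace=False)
    # rng.random samples [0, 1); 1 - x maps that to (0, 1], never exactly 0
    out[idx] = 1.0 - rng.random(k)
    return out

out = zero_inflated_uniform(15000, 0.8, seed=0)
n_zeros = int((out == 0).sum())
```

With a pandas DataFrame of 15000 rows, the result could then be assigned as in the question, e.g. `data['response'] = out`.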

Source

2017-04-02 11:36:53 Divakar

Here is another approach, using numpy.random.shuffle:

import numpy as np

# Proportion of zeros in the final array
proportion = .8
n_non_zeros = 200

# Generate fake non-zero data.
# np.random.uniform samples from [0, 1); subtracting from 1 maps this to (0, 1],
# so none of these values is exactly 0
non_zeros = 1 - np.random.uniform(size=n_non_zeros)

# Create proportion/(1 - proportion) zeros for each non-zero
# (round before int() to guard against floating-point error)
n_zeros = int(round(n_non_zeros * proportion / (1 - proportion)))
zeros = np.zeros(n_zeros)

# Concatenate both zeros and non-zeros
data = np.concatenate((zeros, non_zeros), axis=0)

# Shuffle data in place
np.random.shuffle(data)

# 'data' now contains 200 non-zeros and 800 zeros,
# i.e. 20% and 80% of 1000 values

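Source

2017-04-02 11:40:19 Garmekain

Building on your code, you can rescale x when it is greater than 0.8, so that the interval from 0.8 to 1 is stretched onto 0 to 1: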

lambda x: 0 if x < 0.8 else 5 * (x - 0.8)
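
As a rough sketch of how that lambda could be applied to a whole array at once, the same rule can be written with `np.where`. The names `u` and `response` and the fixed seed are illustrative assumptions, not part of the original answer. Note that here the share of zeros is only approximately 80%, because each draw falls below 0.8 independently.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable

# One uniform draw per row, as in the question
u = rng.uniform(0, 1, 15000)

# Below 0.8 -> zero; otherwise stretch [0.8, 1) onto [0, 1)
response = np.where(u < 0.8, 0.0, 5 * (u - 0.8))

# Fraction of zeros: close to 0.8, but not exactly 0.8
zero_fraction = float(np.mean(response == 0))
```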

Source

2017-04-02 11:41:14 BlackBear

We can draw numbers from a uniform distribution that extends into the negative side, then take the maximum with zero:

>>> numpy.maximum(0, numpy.random.uniform(-4, 1, 15000)) 
array([ 0.57310319, 0.  , 0.02696571, ..., 0.  , 
     0.  , 0.  ]) 
>>> a = _ 
>>> sum(a <= 0) 
12095 
>>> sum(a > 0) 
2905 
>>> 12095/15000 
0.8063333333333333

Here -4 is used because 4 / (4 + 1) = 80%: the negative part of the interval [-4, 1] is four fifths of its total length.
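
The same reasoning extends to other ratios: for a zero share p, the lower bound a must satisfy (0 - a) / (1 - a) = p, which gives a = -p / (1 - p). A small sketch under that assumption follows; the helper names `lower_bound` and `clipped_uniform` are hypothetical, and the share of zeros is approximate rather than exact.

```python
import numpy as np

def lower_bound(zeros_ratio):
    # Solve (0 - a) / (1 - a) = zeros_ratio for a
    return -zeros_ratio / (1 - zeros_ratio)

def clipped_uniform(n, zeros_ratio, seed=None):
    """Draw from uniform(a, 1) and clip negatives to zero (illustrative helper)."""
    rng = np.random.default_rng(seed)
    return np.maximum(0, rng.uniform(lower_bound(zeros_ratio), 1, n))

a = clipped_uniform(15000, 0.8, seed=0)
zero_fraction = float(np.mean(a <= 0))  # roughly 0.8
```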

Since the result is a sparse array, a SciPy sparse matrix may be more appropriate.

>>> a = scipy.sparse.rand(1, 15000, 0.2) 
>>> a.toarray() 
array([[ 0.  , 0.03971366, 0.  , ..., 0.  , 
     0.  , 0.9252341 ]])

Here 0.2 = 1 − 0.8 is the density of the array. The nonzero values are uniformly distributed between 0 and 1.

Source

2017-04-02 11:47:31 kennytm
