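I have implemented a matrix factorization model, say R = U * V, and now I want to train and test it. To do that, given a sparse matrix R (missing values are stored as zeros), I want to hide some of the non-zero elements during training and then use those hidden elements as the test set.
How can I randomly select some non-zero elements from a numpy.ndarray? I also need to remember the row and column positions of the selected elements so that I can use them in testing.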
For example:
In [2]: import numpy as np
In [4]: mtr = np.random.rand(10,10)
In [5]: mtr
Out[5]:
array([[ 0.92685787, 0.95496193, 0.76878455, 0.12304856, 0.13804963,
0.30867502, 0.60245974, 0.00797898, 0.1060602 , 0.98277982],
[ 0.88879888, 0.40209901, 0.35274404, 0.73097713, 0.56238248,
0.380625 , 0.16432029, 0.5383006 , 0.0678564 , 0.42875591],
[ 0.42343761, 0.31957986, 0.5991212 , 0.04898903, 0.2908878 ,
0.13160296, 0.26938537, 0.91442668, 0.72827097, 0.4511198 ],
[ 0.63979934, 0.33421621, 0.09218392, 0.71520048, 0.57100522,
0.37205284, 0.59726293, 0.58224992, 0.58690505, 0.4791199 ],
[ 0.35219557, 0.34954002, 0.93837312, 0.2745864 , 0.89569075,
0.81244084, 0.09661341, 0.80673646, 0.83756759, 0.7948081 ],
[ 0.09173706, 0.86250006, 0.22121994, 0.21097563, 0.55090202,
0.80954817, 0.97159981, 0.95888693, 0.43151554, 0.2265607 ],
[ 0.00723128, 0.95690539, 0.94214806, 0.01721733, 0.12552314,
0.65977765, 0.20845669, 0.44663729, 0.98392716, 0.36258081],
[ 0.65994805, 0.47697842, 0.35449045, 0.73937445, 0.68578224,
0.44278095, 0.86743906, 0.5126411 , 0.75683392, 0.73354572],
[ 0.4814301 , 0.92410622, 0.85267402, 0.44856078, 0.03887269,
0.48868498, 0.83618382, 0.49404473, 0.37328248, 0.18134919],
[ 0.63999748, 0.48718656, 0.54826717, 0.1001681 , 0.1940816 ,
0.3937014 , 0.48768013, 0.70610649, 0.03213063, 0.88371607]])
In [6]: mtr = np.where(mtr>0.5, 0, mtr)
In [7]: %clear
In [8]: mtr
Out[8]:
array([[ 0. , 0. , 0. , 0.12304856, 0.13804963,
0.30867502, 0. , 0.00797898, 0.1060602 , 0. ],
[ 0. , 0.40209901, 0.35274404, 0. , 0. ,
0.380625 , 0.16432029, 0. , 0.0678564 , 0.42875591],
[ 0.42343761, 0.31957986, 0. , 0.04898903, 0.2908878 ,
0.13160296, 0.26938537, 0. , 0. , 0.4511198 ],
[ 0. , 0.33421621, 0.09218392, 0. , 0. ,
0.37205284, 0. , 0. , 0. , 0.4791199 ],
[ 0.35219557, 0.34954002, 0. , 0.2745864 , 0. ,
0. , 0.09661341, 0. , 0. , 0. ],
[ 0.09173706, 0. , 0.22121994, 0.21097563, 0. ,
0. , 0. , 0. , 0.43151554, 0.2265607 ],
[ 0.00723128, 0. , 0. , 0.01721733, 0.12552314,
0. , 0.20845669, 0.44663729, 0. , 0.36258081],
[ 0. , 0.47697842, 0.35449045, 0. , 0. ,
0.44278095, 0. , 0. , 0. , 0. ],
[ 0.4814301 , 0. , 0. , 0.44856078, 0.03887269,
0.48868498, 0. , 0.49404473, 0.37328248, 0.18134919],
[ 0. , 0.48718656, 0. , 0.1001681 , 0.1940816 ,
0.3937014 , 0.48768013, 0. , 0.03213063, 0. ]])
Given a sparse ndarray like this, how can I select 20% of the non-zero elements and remember their positions?
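To make the goal concrete, here is a minimal sketch of the kind of split I have in mind, assuming `np.nonzero` is used to list the positions of the non-zero entries and a random subset of those positions is then drawn without replacement. The helper name `split_nonzero` and its parameters `test_fraction` and `seed` are made up for illustration; they are not part of NumPy.

```python
import numpy as np


def split_nonzero(R, test_fraction=0.2, seed=None):
    """Hide a random fraction of the non-zero entries of R.

    Hypothetical helper (not a NumPy function). Returns the training
    matrix plus the row indices, column indices, and original values
    of the hidden entries.
    """
    rng = np.random.default_rng(seed)

    # Row and column indices of every non-zero entry.
    rows, cols = np.nonzero(R)

    # Number of entries to hold out, rounded to the nearest integer.
    n_test = int(round(test_fraction * rows.size))

    # Pick positions in the list of non-zero entries, without repeats.
    pick = rng.choice(rows.size, size=n_test, replace=False)
    test_rows, test_cols = rows[pick], cols[pick]

    # Remember the held-out values before zeroing them.
    test_values = R[test_rows, test_cols].copy()

    # Work on a copy so the original matrix stays untouched.
    R_train = R.copy()
    R_train[test_rows, test_cols] = 0

    return R_train, test_rows, test_cols, test_values


# Build a sparse-looking matrix the same way as in the session above.
rng = np.random.default_rng(0)
mtr = rng.random((10, 10))
mtr = np.where(mtr > 0.5, 0, mtr)

R_train, test_rows, test_cols, test_values = split_nonzero(mtr, 0.2, seed=1)
```

With this, `R_train` would be used for fitting U and V, and the predictions at `(test_rows, test_cols)` would be compared against `test_values`. Note that this assumes a stored value of zero always means "missing", as stated above.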