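How can I check what percentage of the values in one array are present in another array in pandas?

I have a pandas DataFrame that looks like this: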
app_id_x period app_id_y
10 [pb6uhl15, xn66n2cr, e68t39yp, s7xun0k1, wab2z... 2015-19 NaN
11 [uscm6kkb, tja4ma8u, qcwhw33w, ux5bbkjz, mmt3s... 2015-20 NaN
12 [txdbauhy, dib24pab, xt69u57g, n9e6a6ol, d9f7m... 2015-21 NaN
13 [21c2b5ca5e7066141b2e2aea35d7253b3b8cce11, oht... 2015-22 [g8m4lecv, uyhsx6lo, u9ue1zzo, kw06m3f5, wvqhq...
14 [64lbiaw3, jum7l6yd, a5d00f6aba8f1505ff22bc1fb... 2015-23 [608a223c57e1174fc64775dd2fd8cda387cc4a47, ze4...
15 [gcg8nc8k, jkrelo7v, g9wqigbc, n806bjdu, piqgv... 2015-24 [kz8udlea, zwqo7j8w, 6d02c9d74b662369dc6c53ccc...
16 [uc311krx, wpd7gm75, am8p0spd, q64dcnlm, idosz... 2015-25 [fgs0qhtf, awkcmpns, e0iraf3a, oht91x5j, mv4uo...
17 [wilhuu0x, b51xiu51, ezt7goqr, qj6w7jh6, pkzkv... 2015-26 [zwqo7j8w, dzdfiof5, phwoy1ea, e7hfx7mu, 40fdd...
18 [xn43bho3, uwtjxy6u, ed65xcuj, ejbgjh61, hbvzt... 2015-27 [ze4rr0vi, kw06m3f5, be532399ca86c053fb0a69d13...
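What I want to do, for each `period` (that is, for each row), is check what percentage of the values in `app_id_x` are also in the list `app_id_y`. For example, if `ze4rr0vi` and `gm83klja` are in `app_id_x`, which contains 53 values for that row, then there should be a new column called `adoption_rate`, like this: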
period adoption_rate
2015-9 0%
2015-22 3.56%
2015-25 4.56%
2015-26 5.10%
2015-35 4.58%
2015-36 1.23%
I think you could use a better sample and add the desired output for that sample. Maybe this helps: `print(pd.DataFrame({'app_id_x': {10: ['pb6uhl15', 'pb6uhl15', 'pb6uhl15'], 11: ['pb6uhl15', 'pb6uhl15', 'e68t39yp', 's7xun0k1'], 12: ['pb6uhl15', 's7xun0k1'], 13: ['s7xun0k1'], 14: ['pb6uhl15', 'pb6uhl15', 'e68t39yp', 's7xun0k1']}, 'app_id_y': {10: ['pb6uhl15'], 11: ['pb6uhl15'], 12: np.nan, 13: ['e68t39yp', 'xn66n2cr'], 14: ['e68t39yp', 'xn66n2cr']}, 'period': {10: '2015-19', 11: '2015-20', 12: '2015-21', 13: '2015-22', 14: '2015-23'}}))`. Feel free to modify it to get a better result. Good luck. – jezrael
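
One possible way to compute the column described in the question is a row-wise function that counts how many entries of `app_id_x` also appear in `app_id_y` and divides by the length of `app_id_x`. This is only a minimal sketch under some assumptions: the sample data is made up for illustration, a missing (`NaN`) or empty list is treated as a rate of 0, and duplicates in `app_id_x` are counted once per occurrence. The helper name `adoption_rate` is hypothetical and simply mirrors the column name from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real lists in the question are much longer.
df = pd.DataFrame({
    "period": ["2015-19", "2015-20", "2015-21", "2015-22"],
    "app_id_x": [
        ["pb6uhl15", "xn66n2cr", "e68t39yp", "s7xun0k1"],
        ["pb6uhl15", "e68t39yp"],
        ["pb6uhl15", "s7xun0k1"],
        ["s7xun0k1"],
    ],
    "app_id_y": [
        ["pb6uhl15"],
        ["pb6uhl15", "e68t39yp", "xn66n2cr"],
        np.nan,  # no list available for this period
        ["e68t39yp", "xn66n2cr"],
    ],
})


def adoption_rate(row):
    """Percentage of the values in app_id_x that also occur in app_id_y."""
    x, y = row["app_id_x"], row["app_id_y"]
    # Assumption: a missing (NaN) or empty list yields a rate of 0.
    if not isinstance(x, list) or not isinstance(y, list) or not x:
        return 0.0
    y_set = set(y)  # set lookup keeps the membership test cheap
    hits = sum(1 for value in x if value in y_set)
    return 100.0 * hits / len(x)


# Apply the helper to every row (axis=1) and store the result as a new column.
df["adoption_rate"] = df.apply(adoption_rate, axis=1)
print(df[["period", "adoption_rate"]])
```

If the rate is needed as text such as `3.56%`, the numeric column could be formatted afterwards, for example with `df["adoption_rate"].map("{:.2f}%".format)`. Whether the denominator should be the length of `app_id_x` or of `app_id_y` depends on the intended definition; the sketch follows the "53 values" example in the question and divides by the length of `app_id_x`.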