Is this what you're looking for?
In [36]: a = np.random.random(20)
In [37]: a
Out[37]:
array([ 0.68574307, 0.15743428, 0.68006876, 0.63572484, 0.26279663,
0.14346269, 0.56267286, 0.47250091, 0.91168387, 0.98915746,
0.22174062, 0.11930722, 0.30848231, 0.1550406 , 0.60717858,
0.23805205, 0.57718675, 0.78075297, 0.17083826, 0.87301963])
In [38]: b = np.array((0.3,0.7))
In [39]: np.sum(a[:,None]<b[None,:], axis=0)
Out[39]: array([ 8, 16])
In [40]: np.sum(a[:,None]<b, axis=0) # b's new axis above is unnecessary...
Out[40]: array([ 8, 16])
In [41]: (a[:,None]<b).sum(axis=0) # even simpler
Out[41]: array([ 8, 16])
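For anyone unfamiliar with the broadcasting trick, here is a minimal sketch of what the one-liner does step by step. The function name `count_below` and the small sample arrays are made up for illustration; they are not part of the session above.

```python
import numpy as np

def count_below(a, b):
    """For each threshold in b, count the elements of a strictly below it."""
    a = np.asarray(a)
    b = np.asarray(b)
    # a[:, None] has shape (len(a), 1); comparing it with b, shape (len(b),),
    # broadcasts to a boolean matrix of shape (len(a), len(b)).
    mask = a[:, None] < b
    # Summing over axis 0 counts the True entries in each column,
    # i.e. one count per threshold.
    return mask.sum(axis=0)

# Hypothetical sample data, small enough to check by hand.
a = np.array([0.1, 0.2, 0.5, 0.65, 0.9])
b = np.array([0.3, 0.7])
counts = count_below(a, b)
print(counts)  # [2 4]
```

Note that the intermediate boolean matrix holds len(a) * len(b) entries, which matters once b gets long.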
Timings are always appreciated (for a somewhat longer array, 2E6 elements)
In [47]: a = np.random.random(2000000)
In [48]: %timeit (a[:,None]<b).sum(axis=0)
10 loops, best of 3: 78.2 ms per loop
In [49]: %timeit np.searchsorted(a, b, 'right',sorter=a.argsort())
1 loop, best of 3: 448 ms per loop
For a smaller array
In [50]: a = np.random.random(2000)
In [51]: %timeit (a[:,None]<b).sum(axis=0)
10000 loops, best of 3: 89 µs per loop
In [52]: %timeit np.searchsorted(a, b, 'right',sorter=a.argsort())
The slowest run took 4.86 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 141 µs per loop
Edit
Divakar says that things may be different for a lengthy b; let's see
In [71]: a = np.random.random(2000)
In [72]: b = np.random.random(200)
In [73]: %timeit (a[:,None]<b).sum(axis=0)
1000 loops, best of 3: 1.44 ms per loop
In [74]: %timeit np.searchsorted(a, b, 'right',sorter=a.argsort())
10000 loops, best of 3: 172 µs per loop
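Different indeed! Thank you for prompting my curiosity.
Probably the OP should test his own use case: a very long sample with a short sequence of cuts, or something different? Where is the balance?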
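One plausible reading of the crossover: the broadcasting version builds a len(a) by len(b) boolean matrix, while the searchsorted version pays for one sort of a and then a binary search per element of b. The sketch below, under those assumptions, checks that the two approaches agree. The helper name `count_below_sorted` is hypothetical, and it uses side='left' so that ties are handled exactly like the strict `<` comparison (with random floats, ties are practically absent, so 'right' gives the same counts).

```python
import numpy as np

def count_below_sorted(a, b):
    """Count, for each threshold in b, the elements of a strictly below it."""
    # Sort once; each threshold is then located by binary search.
    a_sorted = np.sort(a)
    # side='left' returns the number of elements strictly less than each value.
    return np.searchsorted(a_sorted, b, side='left')

rng = np.random.default_rng(0)  # seeded only to make the sketch repeatable
a = rng.random(2000)
b = rng.random(200)

via_broadcast = (a[:, None] < b).sum(axis=0)
via_search = count_below_sorted(a, b)
print(np.array_equal(via_broadcast, via_search))  # True
```

Which one is faster for a given pair of sizes is best settled with %timeit on your own data, as in the sessions above.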
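Edit #2
I made a blunder in my timings: I forgot the axis=0 argument to .sum()...
I've edited the timings with the corrected statement and, of course, the corrected timings. My apologies.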
Broadcasting magic – gboffi
I love broadcasting! But you must try a bigger 'b', rather than considering just 2 elements. – Divakar
@Divakar You're right! I've edited my post. – gboffi