If you need performance, you should use numpy or numba, which do all the low-level work for you at close to C speed:
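import numpy as np

# 10**4 blocks of 3 rows x 2 columns, filled with random bits
bigarray = np.random.randint(0, 2, 10**4*3*2).reshape(10**4, 3, 2)
# the same data as nested Python lists
biglist = [[[e for e in B] for B in A] for A in bigarray]
# [[[1, 0], [0, 0], [1, 0]],
# [[0, 0], [0, 1], [0, 1]],
# [[1, 0], [1, 0], [0, 0]], ...

def your_count(biglist):
    # original approach: build a bit string from column 0, then parse it
    integers = []
    for k in biglist:
        num = int("".join(str(row[0]) for row in k), 2)
        integers.append(num)
    return integers

def count_python(big):
    # plain loops and integer arithmetic, no strings
    m = len(big)
    integers = np.empty(m, np.int32)
    for i in range(m):
        n = len(big[i])
        b = 1  # weight of the current bit, starting at the least significant
        s = 0
        for j in range(n-1, -1, -1):
            s = s + big[i][j][0]*b
            b = b*2
        integers[i] = s
    return integers

def count_numpy(bigarray):
    # vectorized: weight column 0 by [4, 2, 1] (3 bits) and sum each row
    integers = (bigarray[:, :, 0]*[4, 2, 1]).sum(axis=1)
    return integers

from numba import njit
# compiled version of the same loop; call it with the numpy array
count_numba = njit(count_python)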
And some timings:
In [125]: %timeit your_count(biglist)
145 ms ± 22.1 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [126]: %timeit count_python(biglist)
29.6 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [127]: %timeit count_numpy(bigarray)
354 µs ± 10.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [128]: %timeit count_numba(bigarray)
73 µs ± 938 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
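The timings above use the `%timeit` magic, which only exists in IPython. As a rough sketch of how a comparable measurement could be taken from a plain script, assuming the standard `timeit` module is an acceptable substitute (the `number` and `repeat` values are arbitrary choices, not taken from the answer):

```python
import timeit
import numpy as np

bigarray = np.random.randint(0, 2, 10**4 * 3 * 2).reshape(10**4, 3, 2)

def count_numpy(bigarray):
    # same function as above: weight column 0 by [4, 2, 1] and sum each row
    return (bigarray[:, :, 0] * [4, 2, 1]).sum(axis=1)

# Each measurement runs the call 100 times; take 5 measurements, keep the best.
times = timeit.repeat(lambda: count_numpy(bigarray), number=100, repeat=5)
best_per_call = min(times) / 100
print(f"count_numpy: {best_per_call * 1e6:.1f} µs per call (best of 5)")
```

The absolute numbers depend on the machine, so only the ratios between the implementations are meaningful.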
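Numba lets you compile a low-level version of some Python code (not yours, because Numba does not handle strings and lists, only numpy arrays). Numpy gives you special syntax to do a lot of work in a single instruction, with good performance.
The Numba solution here is about 2000 times faster than yours (145 ms versus 73 µs).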
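The weights `[4, 2, 1]` in `count_numpy` hard-code three bits per block. Below is a minimal sketch of the same vectorized idea for any number of rows per block. It assumes the bits always sit in column 0 and that the most significant bit comes first. The names `count_numpy_general` and `count_strings` are hypothetical and not part of the answer above:

```python
import numpy as np

def count_numpy_general(bigarray):
    """Read column 0 of each (n, 2) block as an n-bit number, MSB first."""
    bits = bigarray[:, :, 0]                   # shape (m, n)
    n = bits.shape[1]
    weights = 2 ** np.arange(n - 1, -1, -1)    # [2**(n-1), ..., 2, 1]
    return bits @ weights                      # one dot product per block

def count_strings(biglist):
    """Reference implementation: string-based conversion, as in your_count."""
    return [int("".join(str(row[0]) for row in k), 2) for k in biglist]

# Check both approaches against each other on random 5-bit blocks.
rng = np.random.default_rng(0)
sample = rng.integers(0, 2, size=(1000, 5, 2))
result = count_numpy_general(sample)
assert result.tolist() == count_strings(sample.tolist())
```

With fixed-width integer dtypes this only holds while `n` stays below the bit width of the dtype (fewer than 63 bits for `int64`).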
The counts are then computed efficiently with collections.Counter or np.unique:
In [150]: %timeit {k:v for k,v in zip(*np.unique(integers,return_counts=True))}
46.4 µs ± 1.55 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [151]: %timeit Counter(integers)
218 µs ± 11.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
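For reference, here is a small self-contained sketch of the two counting approaches timed above, run on made-up values. The `int()` conversions are an addition so that the resulting dictionary holds plain Python integers rather than numpy scalars:

```python
from collections import Counter
import numpy as np

integers = np.array([5, 0, 6, 5, 5, 0])   # illustrative values only

# np.unique returns the sorted distinct values and how often each occurs.
values, counts = np.unique(integers, return_counts=True)
counts_unique = {int(v): int(c) for v, c in zip(values, counts)}

# Counter tallies the elements one by one in Python.
counts_counter = Counter(integers.tolist())

# Both approaches give the same value -> count mapping.
assert counts_unique == dict(counts_counter)
```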
I think that if your code works correctly but you want to improve it, you should post it on Code Review Stack Exchange (https://codereview.stackexchange.com/) – RoyaumeIX
What is the possible range of the integer values? – randomir
Is the number of columns in each "row" always the same? That is, are you only dealing with 3 bits here, or could they be larger? –