Counter
有一個主要的瓶頸,當你數短iterables:它檢查if isinstance(iterable, Mapping)
。這個測試是相當緩慢的,因爲collections.abc.Mapping
是abstract metaclass並且這樣isinstance
-check是有點比正常isinstance
-checks更復雜,例如參見:"Why is checking isinstance(something, Mapping) so slow?"
所以它是不是真的令人驚訝,如果其他方法都更快簡短的迭代。然而,對於長迭代文件,檢查無關緊要,並且Counter
應該更快(至少對於其中實際計數功能_count_elements
寫在c中的python-3.x(CPython))。
找出瓶頸的簡單方法是分析。我在這裏使用line_profiler:
%load_ext line_profiler
from collections import Counter
x = range(50)
# Profile the function Counter.update when executing the command "Counter(x)"
%lprun -f Counter.update Counter(x)
結果:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
604 1 8 8.0 3.9 if not args:
605 raise TypeError("descriptor 'update' of 'Counter' object "
606 "needs an argument")
607 1 13 13.0 6.4 self, *args = args
608 1 6 6.0 3.0 if len(args) > 1:
609 raise TypeError('expected at most 1 arguments, got %d' % len(args))
610 1 5 5.0 2.5 iterable = args[0] if args else None
611 1 3 3.0 1.5 if iterable is not None:
612 1 94 94.0 46.3 if isinstance(iterable, Mapping):
613 if self:
614 self_get = self.get
615 for elem, count in iterable.items():
616 self[elem] = count + self_get(elem, 0)
617 else:
618 super(Counter, self).update(iterable) # fast path when counter is empty
619 else:
620 1 69 69.0 34.0 _count_elements(self, iterable)
621 1 5 5.0 2.5 if kwds:
622 self.update(kwds)
所以花費的時間從一個迭代初始化Counter
有一個相當大的常數因子(時46%都花在在isinstance
檢查中,使用計數字典只需要34%)。
但是長期iterables沒關係(多),因爲它只是做一次:
%lprun -f Counter.update Counter([1]*100000)
Line # Hits Time Per Hit % Time Line Contents
==============================================================
604 1 12 12.0 0.0 if not args:
605 raise TypeError("descriptor 'update' of 'Counter' object "
606 "needs an argument")
607 1 12 12.0 0.0 self, *args = args
608 1 6 6.0 0.0 if len(args) > 1:
609 raise TypeError('expected at most 1 arguments, got %d' % len(args))
610 1 6 6.0 0.0 iterable = args[0] if args else None
611 1 3 3.0 0.0 if iterable is not None:
612 1 97 97.0 0.3 if isinstance(iterable, Mapping):
613 if self:
614 self_get = self.get
615 for elem, count in iterable.items():
616 self[elem] = count + self_get(elem, 0)
617 else:
618 super(Counter, self).update(iterable) # fast path when counter is empty
619 else:
620 1 28114 28114.0 99.5 _count_elements(self, iterable)
621 1 13 13.0 0.0 if kwds:
622 self.update(kwds)
只給你如何將這些執行的概述根據元素的數量,比較我包括一個優化版本的計數以及Counter
使用的_count_elements
函數。不過我排除在外,你sorted
的項目,創造了計數的列表,以避免其他影響的部分 - 尤其是因爲sorted
比計算不同的運行時行爲(O(n log(n))
)(O(n)
):
# Setup
import random
from collections import Counter, _count_elements
def count(iterable):
"""Explicit iteration over items."""
dctCounter = {}
for item in iterable:
if item in dctCounter:
dctCounter[item] += 1
else:
dctCounter[item] = 1
return dctCounter
def count2(iterable):
"""Iterating over the indices"""
dctCounter = {}
lenLstItems = len(iterable)
for idx in range(lenLstItems):
item = iterable[idx]
if item in dctCounter.keys():
dctCounter[item] += 1
else:
dctCounter[item] = 1
return dctCounter
def c_count(iterable):
"""Internal counting function that's used by Counter"""
d = {}
_count_elements(d, iterable)
return d
# Timing
timings = {Counter: [], count: [], count2: [], c_count: []}
for i in range(1, 20):
print(2**i)
it = [random.randint(0, 2**i) for _ in range(2**i)]
for func in (Counter, count, count2, c_count):
res = %timeit -o func(it)
timings[func].append(res)
# Plotting
%matplotlib notebook
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(1)
ax = plt.subplot(111)
n = 2**np.arange(1, 5)
ax.plot(n,
[time.average for time in timings[count]],
label='my custom function', c='red')
ax.plot(n,
[time.average for time in timings[count2]],
label='your custom function', c='green')
ax.plot(n,
[time.average for time in timings[Counter]],
label='Counter', c='blue')
ax.plot(n,
[time.average for time in timings[c_count]],
label='_count_elements', c='purple')
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('elements')
ax.set_ylabel('time to count them [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()
而結果:
個人定時:
2
30.5 µs ± 177 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1.67 µs ± 3.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
6.03 µs ± 31.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1.67 µs ± 1.45 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
4
30.7 µs ± 75.5 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2.63 µs ± 25.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
7.81 µs ± 13.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
1.97 µs ± 5.59 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
8
34.3 µs ± 1.14 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
4.3 µs ± 16.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
11.3 µs ± 23.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
3 µs ± 25.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
16
34.2 µs ± 599 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7.46 µs ± 42 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
17.5 µs ± 83.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
4.24 µs ± 17.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
32
38.4 µs ± 578 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
13.7 µs ± 95.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
29.8 µs ± 383 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
7.56 µs ± 90.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
64
43.5 µs ± 1.92 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
24 µs ± 107 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
52.8 µs ± 1.04 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
11.6 µs ± 83.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
128
53.5 µs ± 1.14 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
47.8 µs ± 507 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
101 µs ± 3.25 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
21.7 µs ± 161 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
256
69.6 µs ± 239 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
92.1 µs ± 1.94 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
188 µs ± 3.29 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
39.5 µs ± 1.07 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
512
123 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
200 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
409 µs ± 3.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
90.9 µs ± 958 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
1024
230 µs ± 2.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
428 µs ± 5.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
855 µs ± 14.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
193 µs ± 2.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
2048
436 µs ± 7.11 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
868 µs ± 10.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.76 ms ± 31.9 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
386 µs ± 1.17 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
4096
830 µs ± 10.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.8 ms ± 33.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
3.75 ms ± 153 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.06 ms ± 89.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
8192
2.3 ms ± 246 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.8 ms ± 163 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
7.8 ms ± 425 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.69 ms ± 100 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
16384
4.53 ms ± 349 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
8.22 ms ± 359 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
15.9 ms ± 371 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
3.9 ms ± 243 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
32768
9.6 ms ± 17.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
17.2 ms ± 51.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
32.5 ms ± 215 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
10.4 ms ± 687 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
65536
24.8 ms ± 490 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
40.1 ms ± 710 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
66.8 ms ± 271 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
24.5 ms ± 659 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
131072
54.6 ms ± 756 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
84.2 ms ± 728 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
142 ms ± 1.29 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
54.1 ms ± 424 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
262144
120 ms ± 4.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
182 ms ± 4.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
296 ms ± 3.87 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
117 ms ± 320 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
524288
244 ms ± 3.31 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
368 ms ± 2.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
601 ms ± 13.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
252 ms ± 6.13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
另外,計算50個元素實際上太小了一些要測試的元素。至少從一千或一萬個元素開始。 –
@MartijnPieters查看更新的問題,考慮到您的意見,使您的評論過時。 – Claudio