This looked like a neat exercise where I could finally use tee, so I did this:
from itertools import tee

words = ['A','B','C']
dicts = [{'A': 0.1, 'B': 0.5, 'C': 0.01},
         {'A': 0.4, 'B': 0.11, 'C': 0.21}]

newdict = {word: (min(minfeed), max(maxfeed))
           for word in words
           for minfeed, maxfeed in [tee(d[word] for d in dicts)]}
Output:
{'B': (0.11, 0.5), 'C': (0.01, 0.21), 'A': (0.1, 0.4)}
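In case the `for minfeed, maxfeed in [tee(...)]` part looks odd: looping over a one-element list is just a way to bind names inside a comprehension. Here is a minimal sketch of what happens for a single word, unrolled into plain statements. The names `lowest` and `highest` are only for illustration.

```python
from itertools import tee

dicts = [{'A': 0.1, 'B': 0.5, 'C': 0.01},
         {'A': 0.4, 'B': 0.11, 'C': 0.21}]

# tee() splits one generator into two independent iterators,
# so the values for 'B' are looked up only once.
minfeed, maxfeed = tee(d['B'] for d in dicts)

lowest = min(minfeed)    # consumes the first iterator; tee buffers the values
highest = max(maxfeed)   # reads the buffered values through the second iterator

print(lowest, highest)   # 0.11 0.5
```

Note that min() runs to the end before max() starts, so tee has to buffer every value anyway. The itertools documentation says that in this situation it is faster to use list() instead of tee().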
Edit: I got curious and tried a few more versions, then ran a speed test on lots of random data: 100,000 words and 100 dictionaries. Results first:
5.418 seconds Chin
4.364 seconds Stefan
3.460 seconds Stefan2
3.471 seconds Stefan3
Code:
from itertools import tee

def Stefan(words, dicts):
    return {word: (min(minfeed), max(maxfeed))
            for word in words
            for minfeed, maxfeed in [tee(d[word] for d in dicts)]}

def Stefan2(words, dicts):
    out = {}
    for word in words:
        values = [d[word] for d in dicts]
        out[word] = min(values), max(values)
    return out

def Stefan3(words, dicts):
    return {word: (min(values), max(values))
            for word in words
            for values in [[d[word] for d in dicts]]}

def Chin(myList, myDict):
    myNewDict = {}
    for words in myList:
        NewList = []
        for dicts in myDict:
            tmps = dicts[words]
            NewList.append(tmps)
        myNewDict[words] = (min(NewList), max(NewList))
    return myNewDict
from random import sample, randrange, random
from string import ascii_letters
from time import time

WORDS = 100000
DICTS = 100

def random_word():
    # 2 to 9 distinct random letters
    return ''.join(sample(ascii_letters, randrange(2, 10)))

words = [random_word() for _ in range(WORDS)]
dicts = [{w: random() for w in words} for _ in range(DICTS)]

# Time each version and check that they all produce the same result
prev = None
for func in Chin, Stefan, Stefan2, Stefan3:
    t0 = time()
    result = func(words, dicts)
    print('%6.3f seconds' % (time() - t0), func.__name__)
    if prev and result != prev:
        print('fail')
    prev = result
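The numbers above come from a single run with time(), so they can be a bit noisy. As a rough sketch, the same kind of measurement could be repeated with the standard timeit module. The function name `min_max_per_word`, the word names and the much smaller data sizes are my own assumptions, chosen only so it finishes quickly.

```python
from random import random
from timeit import repeat

def min_max_per_word(words, dicts):
    # Same approach as Stefan2: collect the values in a list, then min and max
    out = {}
    for word in words:
        values = [d[word] for d in dicts]
        out[word] = min(values), max(values)
    return out

# Hypothetical small test data (not the sizes used in the results above)
words = ['w%d' % i for i in range(1000)]
dicts = [{w: random() for w in words} for _ in range(20)]

CALLS = 5
# repeat() returns one total time per repetition, each covering CALLS calls
timings = repeat(lambda: min_max_per_word(words, dicts), number=CALLS, repeat=3)
best = min(timings) / CALLS
print('%.6f seconds per call (best of 3)' % best)
```

Taking the minimum of several repetitions is the usual way to reduce the influence of other processes on the timing.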
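It would be easier to give you advice if you posted a complete working example with 'myList' and 'myDict' filled in with actual values. They don't need to have 100 and 60 entries respectively, as in the original example; just a few entries will do. That way we can see what the actual data looks like. –
...although knowing the actual _sizes_ would be good too, because many optimizations that make sense for, say, a small list of huge dictionaries used once make no sense for, say, a small list of small dictionaries used billions of times. – abarnert
Dear Michaael and Abarnet, thank you. I have improved the example. –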