Does order matter when hashing tuples in Python?

I have:

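token1, token2 = 'a', 'b'  # placeholder values so the snippet runs
counts = {}                # renamed from `dict` to avoid shadowing the builtin
tuple1 = token1, token2
tuple2 = token2, token1
for key in [tuple1, tuple2]:  # `key` instead of `tuple`, which is also a builtin
    if key in counts:
        counts[key] += 1
    else:
        counts[key] = 1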

However, both tuple1 and tuple2 end up with the same count. Is there a way to hash a group of two things so that order matters?

2013-01-17 Shazam

Order is taken into account when hashing:

>>> hash((1,2)) 
1299869600 
>>> hash((2,1)) 
1499606158

This assumes that the objects themselves have distinct hash values.
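The same point can be checked without relying on specific hash values, which differ between Python versions and builds. The following is a minimal sketch: two tuples holding the same items in a different order compare unequal, so a dict treats them as separate keys. The names `counts`, `forward`, and `backward` are illustrative and do not come from the question.

```python
# Tuples with the same items in a different order are different dict keys.
counts = {}
forward = ('a', 'b')
backward = ('b', 'a')

# Only the forward order is counted here.
counts[forward] = counts.get(forward, 0) + 1

print(forward == backward)      # False: tuple equality is order-sensitive
print(counts.get(forward, 0))   # 1
print(counts.get(backward, 0))  # 0
```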

>>> t1 = 'a',hash('a') 
>>> [hash(x) for x in t1] #both elements in the tuple have same hash value since `int` hash to themselves in cpython 
[-468864544, -468864544] 
>>> t2 = hash('a'),'a' 
>>> hash(t1) 
1486610051 
>>> hash(t2) 
1486610051 
>>> d = {t1:1,t2:2} #This is OK. dict's don't fail when there is a hash collision 
>>> d 
{('a', -468864544): 1, (-468864544, 'a'): 2} 
>>> d[t1]+=7 
>>> d[t1] 
8 
>>> d[t1]+=7 
>>> d[t1] 
15 
>>> d[t2] #didn't touch d[t2] as expected. 
2

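Even if they don't, you can still be OK when using the tuples as dictionary keys, as long as the objects themselves are not equal as defined by their __eq__ method; the session above shows this. Note that because of the hash collision, this dict is likely to be less efficient than another dict that has no collisions :)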
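The collision case can also be reproduced on any Python version with a small class that forces every instance to share one hash. This is only a sketch: `Token` is a hypothetical class invented for illustration, and the constant `42` is arbitrary.

```python
class Token:
    """Hypothetical class whose instances all share one hash value."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # deliberate collision: every Token hashes the same

    def __eq__(self, other):
        return isinstance(other, Token) and self.name == other.name


a, b = Token('a'), Token('b')
t1 = (a, b)
t2 = (b, a)

d = {t1: 1, t2: 2}
d[t1] += 7

print(hash(t1) == hash(t2))  # True: same element hashes in both orders
print(t1 == t2)              # False: __eq__ still tells the tuples apart
print(d[t1], d[t2])          # 8 2
```

The dict falls back on `__eq__` whenever two keys share a hash, which is why both entries survive; the cost is extra comparisons on each lookup.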

2013-01-17 21:18:41 mgilson

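Assuming `token1` and `token2` themselves hash() to different values, of course. –

@sr2222 - Yes, of course, but I think it's worth noting, because you can override `__hash__` in a class so that order doesn't matter when a tuple contains instances of that class... but even then, depending on how `__eq__` is implemented, your `dict` can still come out OK (albeit inefficiently, because of the hash collisions). – mgilson

@sr2222 - I've updated my answer because of your comment. This was interesting to me, and I thought you might find it interesting too. – mgilson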

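They get the same count because your code explicitly increments both the token1,token2 count and the token2,token1 count. If you don't do that, the counts won't stay in lockstep: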

In [16]: import collections 

In [17]: d = collections.defaultdict(int) 

In [18]: d[1,2] += 1 

In [19]: d[1,2] 
Out[19]: 1 

In [20]: d[2,1] 
Out[20]: 0

2013-01-17 21:19:04 NPE

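I think I would usually write that as `d[(1,2)]` rather than `d[1,2]`, even though they are the same... – mgilson

You're right, sorry, that was silly of me – Shazam

It looks like you've posted one instance of a loop body. I'd suggest using collections.Counter for what you're trying to do; it does exactly what you want, but in one line: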

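import collections

# The reversed pair must be parenthesized; a bare `j,i` in a comprehension is a SyntaxError.
counter = (collections.Counter(myListOfTuples) +
           collections.Counter([(j, i) for i, j in myListOfTuples]))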
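To see what this produces, here is a runnable sketch of the Counter approach on made-up data. The list `pairs` is a hypothetical stand-in for `myListOfTuples`, and each reversed pair is written as a parenthesized tuple.

```python
import collections

pairs = [(1, 2), (1, 2), (3, 4)]  # hypothetical sample data

# Count each pair as given, then add the counts of the reversed pairs.
counter = (collections.Counter(pairs) +
           collections.Counter((j, i) for i, j in pairs))

print(counter[(1, 2)])  # 2
print(counter[(2, 1)])  # 2
print(counter[(3, 4)])  # 1
print(counter[(5, 6)])  # 0: a Counter returns zero for keys it has not seen
```

If only the original order should be counted, `collections.Counter(pairs)` on its own keeps `(1, 2)` and `(2, 1)` as separate keys.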

2013-01-17 21:25:23 inspectorG4dget

It's not quite that simple, but thanks – Shazam
