2010-02-19 51 views
3

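(My apologies: an earlier version of this question showed the wrong function, which I needed to fix. That has been resolved, and I hope the question makes more sense now.) Positional rankings and dealing with ties in Python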

I have a list of objects with scores, and I'm trying to assign them ranks based on those scores. Here is how my data is output.

sorted_scores = [ 
    ('Apolo Ohno', 0), 
    ('Shanie Davis', -1), 
    ('Bodie Miller', -2), 
    ('Lindsay Vohn', -3), 
    ('Shawn White', -3), 
    ('Bryan Veloso', -4) 
] 

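So I have a tie. Right now, the function that assigns positions to the objects above is a simple for loop that just assigns the value of i as the object's final position.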

positions = {} 

i = 1 
for key, value in sorted_list:
    # Since in my codebase the strings are IDs, I use the key to fetch the object.
    if value is not None:
        positions[key] = i
        i += 1

So that will obviously return:

positions = { 
    'Apolo Ohno': 1, 
    'Shanie Davis': 2, 
    'Bodie Miller': 3, 
    'Lindsay Vohn': 4,   
    'Shawn White': 5, 
    'Bryan Veloso': 6 
} 

Hopefully that makes some sense. The crux of the problem is that loop. What would make more sense is if it returned them like this:

positions = { 
    'Apolo Ohno': 1, 
    'Shanie Davis': 2, 
    'Bodie Miller': 3, 
    'Lindsay Vohn': 4, # Same value. 
    'Shawn White': 4, # Same value. 
    'Bryan Veloso': 6 
} 

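How would I modify the function above to do that, keeping in mind that I could have any number of ties at any given time, depending on how many of my members ranked said object? The highest rank should be 1, so it can be displayed as: <rank>/<total # of people>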

Thanks in advance. :)

+0

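It always made sense; you just needed to clean out all the irrelevant stuff. Keep going. Member ranking is irrelevant to your problem. Lose foo_ranking and bar_ranking. Start with: "I have a list of tuples (object_id, score). Given that ties can occur, how do I ...". State whether the highest score or the lowest score means rank 0 (or 1). – 2010-02-22 04:41:39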

+0

Thanks for your help. I really appreciate learning how to ask a concise question, especially when it comes to a topic like this. :) – 2010-02-22 06:40:54

+0

Any particular reason why you accepted that answer rather than the earlier equivalent answer? – 2010-02-23 09:14:51

Answers

7
>>> sorted_scores = [ 
...  ('Apolo Ohno', 0), 
...  ('Shanie Davis', -1), 
...  ('Bodie Miller', -2), 
...  ('Lindsay Vohn', -3), 
...  ('Shawn White', -3), 
...  ('Bryan Veloso',-4) 
... ] 
>>> 
>>> res = {} 
>>> prev = None 
>>> for i,(k,v) in enumerate(sorted_scores): 
...  if v!=prev: 
...   place,prev = i+1,v 
...  res[k] = place 
... 
>>> print res 
{'Apolo Ohno': 1, 'Bryan Veloso': 6, 'Shanie Davis': 2, 'Lindsay Vohn': 4, 'Bodie Miller': 3, 'Shawn White': 4} 

Remember that dicts are unordered, so to iterate in order of place you need to do this:

>>> from operator import itemgetter 
>>> print sorted(res.items(),key=itemgetter(1)) 
[('Apolo Ohno', 1), ('Shanie Davis', 2), ('Bodie Miller', 3), ('Lindsay Vohn', 4), ('Shawn White', 4), ('Bryan Veloso', 6)] 
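For readers on Python 3, the same enumerate technique can be sketched as below. This is a port under stated assumptions, not the answerer's own code: the names competition_ranks and format_rank are made up for illustration, and the input is assumed to be already sorted best-first. Since Python 3.7, dicts preserve insertion order, so the result iterates in rank order without a separate sort.

```python
def competition_ranks(sorted_scores):
    """Map each name to a 1-based rank; tied scores share the same rank.

    Assumes sorted_scores is a list of (name, score) pairs that is already
    sorted best-first. The function name is hypothetical.
    """
    ranks = {}
    prev = object()  # sentinel that never compares equal to a real score
    place = 0
    for i, (name, score) in enumerate(sorted_scores, start=1):
        if score != prev:
            # Score changed: this entry's rank is its 1-based position.
            place, prev = i, score
        ranks[name] = place
    return ranks


def format_rank(rank, total):
    """Render a rank in the '<rank>/<total>' form the question asks for."""
    return f"{rank}/{total}"


sorted_scores = [
    ('Apolo Ohno', 0),
    ('Shanie Davis', -1),
    ('Bodie Miller', -2),
    ('Lindsay Vohn', -3),
    ('Shawn White', -3),
    ('Bryan Veloso', -4),
]

ranks = competition_ranks(sorted_scores)
for name, rank in ranks.items():
    print(name, format_rank(rank, len(ranks)))
```

The sentinel object() replaces prev = None so that a legitimate score of None in first position would still be treated as a new score.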
+0

Much shorter than my solution. Nice call on using enumerate. – vishvananda 2010-02-22 08:35:23

+0

@vishvananda: Please critique my (earlier) answer too :-) – 2010-02-22 13:04:47

+0

@John: Sorry John, I missed the fact that your answer uses the same technique. – vishvananda 2010-02-23 08:11:36

3

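The way to do this is not to calculate the element's position in some arbitrary sequence, but rather to calculate how many other elements have a better score.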
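A minimal sketch of that counting idea, written for Python 3 and not taken from the answer: each rank is one plus the number of entries with a strictly better score. The name ranks_by_counting is hypothetical, and "better" is assumed to mean a higher score, which matches the question's data.

```python
def ranks_by_counting(scores):
    """Rank each (name, score) pair by counting strictly better scores.

    Assumes a higher score is better. The input does not need to be sorted,
    at the cost of O(n^2) comparisons. The function name is hypothetical.
    """
    ranks = {}
    for name, score in scores:
        better = sum(1 for _, other in scores if other > score)
        ranks[name] = better + 1  # nobody better means rank 1
    return ranks


# Deliberately unsorted input: the order does not affect the ranks.
scores = [
    ('Shawn White', -3),
    ('Apolo Ohno', 0),
    ('Bryan Veloso', -4),
    ('Lindsay Vohn', -3),
    ('Bodie Miller', -2),
    ('Shanie Davis', -1),
]
print(ranks_by_counting(scores))
```

Because every rank depends only on the full list of scores and not on the previous entry, the per-entry counts could be computed independently of one another.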

EDIT:

By popular demand, O(n)'ed and everything:

positions = {} 
cur_score = None # Score we're examining 
cur_count = 0 # Number of others that we've seen with this score 

for ix, (name, score) in enumerate(sorted_scores):
    if score == cur_score:  # Same score for this player as previous
        cur_count += 1
    else:  # Different score from before
        cur_score = score
        cur_count = 0
    positions[name] = ix - cur_count + 1  # Add 1 because ix is 0-based

print positions 
+0

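I don't know why this was downvoted, since the suggestion is perfectly reasonable and correct. It is an invariant that each rank value in a "ranks with ties" list is the number of better scores (usually +1). Depending on the strategy used to solve the problem, Ignacio's suggestion has additional value because it is perfectly parallelizable, which makes it well suited to multi-threaded/multi-process solutions. – 2010-02-22 05:45:36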

+0

Well, it wasn't me ... but reasons could include "no code". – 2010-02-22 06:04:59

0

I'm making a bunch of assumptions about what it is you want to do, but here's an attempt:

scores = { 
    'lorem': 100, 
    'ipsum': 200, 
    'dolor': 300, 
    'sit': 300, 
    'amet': 300, 
    'quia': 400, 
    'consectetur': 500, 
    'adipiscing': 500, 
    'elit': 600, 
    } 

groups = {} 
for (member, score) in scores.items(): 
    if score not in groups: 
     groups[score] = [member] 
    else: 
     groups[score].append(member) 

positions = {} 
for (rank, (score, members)) in enumerate(groups.items()): 
    for member in members: 
     positions[member] = rank 

Showing my work:

>>> import pprint 
>>> scores = { 
...  'lorem': 100, 
...  'ipsum': 200, 
...  'dolor': 300, 
...  'sit': 300, 
...  'amet': 300, 
...  'quia': 400, 
...  'consectetur': 500, 
...  'adipiscing': 500, 
...  'elit': 600, 
...  } 
>>> groups = {} 
>>> for (member, score) in scores.items(): 
...  if score not in groups: 
...   groups[score] = [member] 
...  else: 
...   groups[score].append(member) 
... 
>>> pprint.pprint(groups) 
{100: ['lorem'], 
200: ['ipsum'], 
300: ['sit', 'dolor', 'amet'], 
400: ['quia'], 
500: ['consectetur', 'adipiscing'], 
600: ['elit']} 
>>> positions = {} 
>>> for (rank, (score, members)) in enumerate(groups.items()): 
...  for member in members: 
...   positions[member] = rank 
... 
>>> pprint.pprint(positions) 
{'adipiscing': 4, 
'amet': 2, 
'consectetur': 4, 
'dolor': 2, 
'elit': 5, 
'ipsum': 1, 
'lorem': 0, 
'quia': 3, 
'sit': 2} 
>>> pprint.pprint(sorted(positions.items(), key=lambda i: i[1])) 
[('lorem', 0), 
('ipsum', 1), 
('sit', 2), 
('dolor', 2), 
('amet', 2), 
('quia', 3), 
('consectetur', 4), 
('adipiscing', 4), 
('elit', 5)] 
+0

You import 'itertools' but never use it. Try putting standalone code in a file and running it from the command line. – 2010-02-22 03:17:06

+0

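Even worse: it doesn't work. Your single correct result is an accident. Change elit's score from 600 to 6000 (still operating under the lower-score-is-better regime) and see what happens. You are wrongly assuming that groups.items() is sorted by score. – 2010-02-22 03:35:20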
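One possible repair, sketched here for Python 3 rather than quoted from any answer, is to sort the groups by score before enumerating them. The sketch keeps that answer's own conventions (lower score is better, ranks start at 0, tied groups leave no gaps), so it does not produce the 1, 2, 3, 4, 4, 6 numbering the question asks for. The function name dense_ranks is made up for illustration.

```python
from collections import defaultdict


def dense_ranks(scores):
    """Map each member to a 0-based rank, lower score first, no gaps.

    Assumes scores is a mapping of member -> score. The function name is
    hypothetical.
    """
    groups = defaultdict(list)
    for member, score in scores.items():
        groups[score].append(member)

    positions = {}
    # Sorting the distinct scores makes the rank independent of dict order.
    for rank, score in enumerate(sorted(groups)):
        for member in groups[score]:
            positions[member] = rank
    return positions


scores = {
    'lorem': 100,
    'ipsum': 200,
    'dolor': 300,
    'sit': 300,
    'amet': 300,
    'quia': 400,
    'consectetur': 500,
    'adipiscing': 500,
    'elit': 6000,  # the value suggested in the comment above
}
print(dense_ranks(scores))
```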

3
=== Update after change/clarification of specs ===

# coding: ascii 

def ranks_from_scores(sorted_scores): 
    """sorted_scores: a list of tuples (object_id, score), sorted by score DESCENDING 
     return a mapping of object IDs to ranks 
    """ 
    ranks = {} 
    previous_score = object() 
    for index, (obj_id, score) in enumerate(sorted_scores): 
     if score != previous_score: 
      previous_score = score 
      rank = index + 1 
     ranks[obj_id] = rank 
    return ranks 

from operator import itemgetter 
import pprint 

scores0 = dict([ 
    ('Apolo Ohno', 0), 
    ('Shanie Davis', -1), 
    ('Bodie Miller', -2), 
    ('Lindsay Vohn', -3), 
    ('Shawn White', -3) 
    ]) 

scores1 = { 
    'lorem': 100, 
    'ipsum': 200, 
    'dolor': 300, 
    'sit': 300, 
    'amet': 300, 
    'quia': 400, 
    'consectetur': 500, 
    'adipiscing': 500, 
    'elit': 600, 
    } 

scores2 = { 
    'lorem': 100, 
    'ipsum': 200, 
    'dolor': 300, 
    'sit': 300, 
    'amet': 300, 
    'quia': 400, 
    'consectetur': 500, 
    'adipiscing': 500, 
    'elit': 6000, 
    } 

import pprint 
funcs = (ranks_from_scores,) # Watch this space! 
tests = (scores0, scores1, scores2) 

for test in tests: 
    print 
    test_list = sorted(test.items(), key=itemgetter(1), reverse=True) 
    print "Input:", test_list 
    for func in funcs: 
     result = func(test_list) 
     print "%s ->" % func.__name__ 
     pprint.pprint(result) 

Results:

Input: [('Apolo Ohno', 0), ('Shanie Davis', -1), ('Bodie Miller', -2), ('Lindsay 
Vohn', -3), ('Shawn White', -3)] 
ranks_from_scores -> 
{'Apolo Ohno': 1, 
'Bodie Miller': 3, 
'Lindsay Vohn': 4, 
'Shanie Davis': 2, 
'Shawn White': 4} 

Input: [('elit', 600), ('consectetur', 500), ('adipiscing', 500), ('quia', 400), 
('dolor', 300), ('sit', 300), ('amet', 300), ('ipsum', 200), ('lorem', 100)] 
ranks_from_scores -> 
{'adipiscing': 2, 
'amet': 5, 
'consectetur': 2, 
'dolor': 5, 
'elit': 1, 
'ipsum': 8, 
'lorem': 9, 
'quia': 4, 
'sit': 5} 

Input: [('elit', 6000), ('consectetur', 500), ('adipiscing', 500), ('quia', 400) 
, ('dolor', 300), ('sit', 300), ('amet', 300), ('ipsum', 200), ('lorem', 100)] 
ranks_from_scores -> 
{'adipiscing': 2, 
'amet': 5, 
'consectetur': 2, 
'dolor': 5, 
'elit': 1, 
'ipsum': 8, 
'lorem': 9, 
'quia': 4, 
'sit': 5} 

=== Original submission ===

This code assumes that you really want the highest score to get rank 1, not the lowest score to get rank 1 (or 0!).

# coding: ascii 

def ranks_from_scores(scores, debug=False): 
    """scores (a mapping of object IDs to scores) 
     return a mapping of object IDs to ranks 
    """ 
    alist = [(v, k) for k, v in scores.items()] 
    alist.sort(reverse=True) 
    if debug: print 'alist:', alist 
    bdict = {} 
    previous_score = object() 
    for posn, (score, obj_id) in enumerate(alist): 
     if score != previous_score: 
      previous_score = score 
      rank = posn + 1 
     bdict[obj_id] = rank 
    if debug: 
     print 'bdict:', bdict 
     blist = [(v, k) for k, v in bdict.items()] 
     print 'blist:', sorted(blist) 
    return bdict 

ranks_from_scores(
    {'q': 10, 'w': 20, 'e': 20, 'r': 20, 't': 30}, 
    debug=True, 
    ) 

Output:

alist: [(30, 't'), (20, 'w'), (20, 'r'), (20, 'e'), (10, 'q')] 
bdict: {'q': 5, 'r': 2, 'e': 2, 't': 1, 'w': 2} 
blist: [(1, 't'), (2, 'e'), (2, 'r'), (2, 'w'), (5, 'q')] 
1

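Looks like you can use the sorted and enumerate builtins, the groupby method from itertools, and the itemgetter method from operator. This assumes that higher scores are better... (if lower scores are better, change reverse=True to reverse=False)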

>>> from itertools import groupby 
>>> from operator import itemgetter 
>>> scores = { 
...  'lorem': 100, 
...  'ipsum': 200, 
...  'dolor': 300, 
...  'sit': 300, 
...  'amet': 300, 
...  'quia': 400, 
...  'consectetur': 500, 
...  'adipiscing': 500, 
...  'elit': 600, 
...  } 
>>> sorted_items = sorted(scores.items(), key=itemgetter(1), reverse=True) 
>>> groups = groupby(sorted_items, itemgetter(1)) 
>>> for rank, (score, items) in enumerate(groups): 
...  print rank+1, map(itemgetter(0), items) 
... 
1 ['elit'] 
2 ['consectetur', 'adipiscing'] 
3 ['quia'] 
4 ['dolor', 'sit', 'amet'] 
5 ['ipsum'] 
6 ['lorem'] 
+0

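Doesn't seem to do what the OP wants: note his example of "more sensible" output, positions = {'1': 1, '2': 2, '3': 2, '4': 2, '5': 5, '6': 6}, i.e. one first place, three tied for second, then a fifth. You would give that fifth one a rank of 3. – 2010-02-22 04:26:32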
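The difference that comment points at can be sketched as follows (Python 3; both function names are hypothetical and the input is assumed to be sorted best-first). Enumerating the groups numbers them 1, 2, 3 with no gaps, whereas the question wants the rank after a tie to skip ahead by the size of the tie.

```python
from itertools import groupby
from operator import itemgetter


def dense_group_ranks(sorted_items):
    """Rank by group index: ties share a rank and the next rank is +1."""
    ranks = {}
    groups = groupby(sorted_items, key=itemgetter(1))
    for rank, (_score, group) in enumerate(groups, start=1):
        for name, _ in group:
            ranks[name] = rank
    return ranks


def competition_group_ranks(sorted_items):
    """Rank by position: ties share a rank and the next rank skips ahead."""
    ranks = {}
    place = 1
    for _score, group in groupby(sorted_items, key=itemgetter(1)):
        members = list(group)
        for name, _ in members:
            ranks[name] = place
        place += len(members)  # leave a gap as wide as the tie
    return ranks


sorted_items = [('t', 30), ('w', 20), ('e', 20), ('r', 20), ('q', 10)]
print(dense_group_ranks(sorted_items))        # {'t': 1, 'w': 2, 'e': 2, 'r': 2, 'q': 3}
print(competition_group_ranks(sorted_items))  # {'t': 1, 'w': 2, 'e': 2, 'r': 2, 'q': 5}
```

Note that groupby only merges adjacent items, so both sketches rely on the input already being sorted by score.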

0

Here's a simple way to do it:

last = None 
position = 0 
delta = 1 
for key, value in sorted_list: 
    if value is not None: 
     if value != last: 
      position += delta 
      delta = 1 
     else: 
      delta += 1 
     # i believe this is supposed to be [key] not [value] in OP's code 
     positions[key] = position 
     last = value 
+1

If there is a two-way tie for 4th place, I think the next one should be 6th – 2010-02-22 06:10:40

+0

@gnibbler: What you say has the fortuitous combination of two properties: (a) it is reasonable (b) it is what the OP explicitly wants. – 2010-02-22 06:24:52

+0

@gnibbler: Good point. I fixed the code based on your suggestion – vishvananda 2010-02-22 08:30:30

0
>>> sorted_scores = [ 
...  ('Apolo Ohno', 0), 
...  ('Shanie Davis', -1), 
...  ('Bodie Miller', -2), 
...  ('Lindsay Vohn', -3), 
...  ('Shawn White', -3), 
...  ('Bryan Veloso',-4) 
... ] 
>>> 
>>> from itertools import groupby 
>>> from operator import itemgetter 
>>> 
>>> place=1 
>>> res={} 
>>> for _,items in groupby(sorted_scores,key=itemgetter(1)): 
...  for i,item in enumerate(items): 
...   res[item[0]]= place 
...  place+=i+1 
... 
>>> print res 
{'Apolo Ohno': 1, 'Bryan Veloso': 6, 'Shanie Davis': 2, 'Lindsay Vohn': 4, 'Bodie Miller': 3, 'Shawn White': 4} 

Remember that dicts are unordered, so to iterate in order you need to do this:

>>> print sorted(res.items(),key=itemgetter(1)) 
[('Apolo Ohno', 1), ('Shanie Davis', 2), ('Bodie Miller', 3), ('Lindsay Vohn', 4), ('Shawn White', 4), ('Bryan Veloso', 6)] 
1

Solution

Here's a simple way to do it by modifying your code a little rather than importing modules:

prev = None 
rank = 0 
incr = 1 
positions = {} 
for key, value in sorted_list: 
    if value is not None: 
     if value != prev: 
      rank += incr 
      incr = 1 
     else: 
      incr += 1 
     positions[key] = rank 
     prev = value 

Test

For

sorted_list = [ 
    ('Apolo Ohno', 0), 
    ('Shanie Davis', -1), 
    ('Bodie Miller', -2), 
    ('Lindsay Vohn', -3), 
    ('Shawn White', -3), 
    ('Bryan Veloso',-4) 
] 

I get positions as:

{'Apolo Ohno': 1, 
'Shanie Davis': 2, 
'Bodie Miller': 3, 
'Lindsay Vohn': 4, 
'Shawn White': 4, 
'Bryan Veloso': 6} 

I think this is what you want, even though you aren't very clear about whether there should be a 6 after the two 4s.

+0

Indeed, there should be a 6. I've edited my question to reflect that. – 2010-02-22 06:52:33

+0

This seems to be an exact copy of my answer. I guess great minds think alike? – vishvananda 2010-02-23 08:02:15

+0

@vishvananda: It seems to be endemic :-( – 2010-02-23 09:16:10

1

Here's an approach that merges aspects of some of the other solutions into a flexible generator function.

def rank_sorted(sequence, start=1, key=None, reverse=True): 
    """A combination of `enumerate` and `sorted` iterators that deals 
    with tied ranks. 

    """ 
    previous_value = object() # won't compare equal to anything 
    sorted_iterator = sorted(sequence, key=key, reverse=reverse) 
    for index, item in enumerate(sorted_iterator, start=start): 

     # use key function to choose value if given 
     if key is None: 
      value = item 
     else: 
      value = key(item) 

     # only update rank when sort value changes 
     if value != previous_value: 
      previous_value = value 
      rank = index 

     yield rank, item 

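It can be called with different values for start, key, and reverse to let ranks start at 0 or 1, to pass a custom key function (such as itemgetter(1) for sorting a dictionary by value), and to switch easily to treating lower scores as higher ranks. Using the example from the original question: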

from operator import itemgetter 

sorted_scores = [ 
    ('Apolo Ohno', 0), 
    ('Shanie Davis', -1), 
    ('Bodie Miller', -2), 
    ('Lindsay Vohn', -3), 
    ('Shawn White', -3), 
    ('Bryan Veloso', -4) 
] 

higher_is_better = dict(
    (name, rank) 
    for rank, (name, score) 
    in rank_sorted(sorted_scores, key=itemgetter(1)) 
) 
# {'Apolo Ohno': 1, 'Bryan Veloso': 6, 'Shanie Davis': 2, 'Lindsay Vohn': 4, 'Bodie Miller': 3, 'Shawn White': 4} 

lower_is_better = dict(
    (name, rank) 
    for rank, (name, score) 
    in rank_sorted(sorted_scores, key=itemgetter(1), reverse=False) 
) 
# {'Apolo Ohno': 6, 'Bryan Veloso': 1, 'Shanie Davis': 5, 'Lindsay Vohn': 2, 'Bodie Miller': 4, 'Shawn White': 2}