Rank a list of numbers, allowing ties

Say I have a list like this:

newIndexVertList = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22]

I want to convert it to:

newIndexVertList = [0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

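Here, the conversion is based on the position each number has when the distinct values of the original list are sorted in ascending order. Equal numbers (ties) get the same rank.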

0 --> 0 0th position in sorted list 
1 --> 1 1st position in sorted list 
2 --> 2 2nd position in sorted list 
3 --> 3 3rd position in sorted list 
20 --> 4 4th position in sorted list 
21 --> 5 5th position in sorted list 
22 --> 6 6th position in sorted list 
23 --> 7 7th position in sorted list

Below is my code to achieve this:

c = 0 
for i in xrange(len(newIndexVertList)): 
    if c < newIndexVertList[i]: 
     newIndexVertList[i] = c 
     c += 1 
     continue 
    elif c == newIndexVertList[i]: 
     c += 1 
     continue 
    else: 
     continue 

# actual output: [0, 1, 2, 2, 1, 3, 4, 5, 6, 3, 7, 8] 
# expected output: [0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

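What is wrong with my code, such that the numbers in the new list are not replaced according to that logic? And what is an elegant way to achieve this?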

Since my vertex list will be in the 100k range, I am looking for the fastest execution.

Source

2017-01-31 RedForty

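`indices = [sorted(list(set(vertices))).index(v) for v in vertices]` is, I think, what the OP is looking for. I agree the wording of the question could be improved. – zinfandel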

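By the way, zinfandel's answer will work, but it has a huge time complexity. On every iteration of the list comprehension it converts to a set, converts to a list, sorts, and then searches for `v`, because `sorted(list(set(vertices))).index(v)` is re-evaluated for each element. –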

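@MoinuddinQuadri Ah, so although it is easier to read, it may not be the fastest solution. Since my vertex list will be in the 100k range, I should be looking for the fastest execution. Would your answer be faster? – RedForty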
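To make the complexity point in these comments concrete, here is a rough sketch that times the one-liner against a version that builds the lookup once. The function names, the random sample, and its size are assumptions chosen for illustration, and the timings will vary by machine, so none are quoted here.

```python
import random
import timeit


def rank_rebuild_each_time(values):
    # Rebuilds and re-sorts the unique values, then does a linear
    # .index() search, once for every element of the input.
    return [sorted(set(values)).index(v) for v in values]


def rank_with_lookup(values):
    # Builds the value -> rank mapping a single time, then uses
    # constant-time dict lookups for every element.
    ranks = {v: i for i, v in enumerate(sorted(set(values)))}
    return [ranks[v] for v in values]


random.seed(0)
# Assumed test data: 2000 random integers in [0, 100).
sample = [random.randrange(100) for _ in range(2000)]

# Both approaches must agree on the result.
assert rank_rebuild_each_time(sample) == rank_with_lookup(sample)

for func in (rank_rebuild_each_time, rank_with_lookup):
    elapsed = timeit.timeit(lambda: func(sample), number=3)
    print(func.__name__, round(elapsed, 4))
```

The gap should widen as the list and the number of distinct values grow, since the first version repeats the sort for every element.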

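You can achieve it by using sorted() and set() with enumerate() to create an intermediate dict object that maps each number to its position in the sorted list of unique values: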

>>> my_list = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22] 
>>> num_map = {j: i for i, j in enumerate(sorted(set(my_list)))} 
#   set() keeps only the unique elements 
#   sorted() puts them in ascending order, and enumerate() numbers them 

>>> [num_map[n] for n in my_list] 
[0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]

As Stefan commented, it can be done in a single line using map():

list(map({j: i for i, j in enumerate(sorted(set(my_list)))}.get, my_list)) 
#^type-cast `map` object to `list` for Python 3.x compatibility
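On the "what is wrong with my code" part of the question, a minimal sketch may help. The question's loop compares each element against a running counter in the order the elements appear, so it hands out new numbers in encounter order rather than sorted order, and it never remembers which rank a repeated value already received. The function names below are hypothetical, and the loop is ported to Python 3 (`range` instead of `xrange`) and run on a copy.

```python
def rank_by_counter(values):
    # The question's loop, ported to Python 3 and applied to a copy.
    result = list(values)
    c = 0
    for i in range(len(result)):
        if c < result[i]:
            # Assigns the next counter value in encounter order,
            # even if this number was already seen and renumbered.
            result[i] = c
            c += 1
        elif c == result[i]:
            c += 1
    return result


def dense_rank(values):
    # Rank of each value among the sorted distinct values; ties share a rank.
    rank_of = {value: rank for rank, value in enumerate(sorted(set(values)))}
    return [rank_of[value] for value in values]


vertices = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22]
print(rank_by_counter(vertices))  # [0, 1, 2, 2, 1, 3, 4, 5, 6, 3, 7, 8]
print(dense_rank(vertices))       # [0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6]
```

In the first output the two occurrences of 20 become 3 and 6, which shows the ties being lost. The dict keeps one rank per distinct value, so repeats always agree.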

Source

2017-01-31 10:31:20

Just a one-line version: `map({j: i for i, j in enumerate(sorted(set(my_list)))}.get, my_list)` –

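You mentioned in the comments that your data will be large (100k) and that you are looking for the fastest execution. You should consider using numpy: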

>>> import numpy as np 
>>> vertices = [0, 1, 2, 2, 1, 20, 21, 21, 20, 3, 23, 22] 
>>> np.unique(vertices, return_inverse=True)[1] 
array([0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6])

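For a 100k-long array of randomly distributed integers between 0 and 100, this is more than 3 times faster than the currently accepted answer.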

Another high-performance option, suggested by user DSM in the Python chat room, is to use scipy.stats to rank the data:

>>> import scipy.stats 
>>> (scipy.stats.rankdata(vertices, 'dense') - 1).astype(int) 
array([0, 1, 2, 2, 1, 4, 5, 5, 4, 3, 7, 6])

Source

2017-02-01 16:05:12 wim

Sweet! A unique solution with the unique function. – MYGz
