2013-11-28 44 views
0

Alphanumeric sorting of a list of lists in Python

I am currently trying to sort a list of lists of the form:

[["Chr1", "949699", "949700"],["Chr11", "3219", "444949"], 
["Chr10", "699", "800"],["Chr2", "232342", "235345234"], 
["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]] 

Using the built-in sorted(), I get:

[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['Chr2', '232342', '235345234'], ['ChrX', '4567', '45634']]
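For illustration, here is a minimal sketch of why the built-in ordering comes out this way: strings are compared character by character, so the digit '1' in "Chr10" is compared against the '2' in "Chr2" before the length of the number matters. The variable name `lexicographic` is only used for this example.

```python
# Plain string comparison works one character at a time,
# so "Chr10" sorts before "Chr2" because '1' < '2'.
print("Chr10" < "Chr2")   # string comparison
print(10 < 2)             # integer comparison, for contrast

lexicographic = sorted(["Chr2", "Chr10", "Chr1"])
print(lexicographic)
```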

But I want "Chr2" to come before "Chr10". My current solution adapts some code from this page: Does Python have a built in function for string natural sort?

My current solution looks like this:

import re

def naturalSort(l):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [convert(c) for c in re.split('([0-9]+)', key)]
    if isinstance(l[0], list):
        return sorted(l, key=lambda k: [alphanum_key(x) for x in k])
    else:
        return sorted(l, key=alphanum_key)

This yields the correct order:

[['Chr1', '949699', '949700'], ['Chr1', '950000', '960000'], ['Chr2', '232342', '235345234'], ['Chr10', '699', '800'], ['Chr11', '3219', '444949'], ['ChrX', '4567', '45634']] 

Is there a better way to do this?

This is called "natural sorting".

Ah, but I think this may not be a duplicate, since he is trying to build it himself. This question might be a better fit for http://codereview.stackexchange.com, though. – aIKid

I referenced the natural sort page. I am asking specifically how to sort a list of lists. – Megatron

Answers

0

Would something like this work?

In [1]: l = [["Chr1", "949699", "949700"],["Chr11", "3219", "444949"],["Chr10", "699", "800"],["Chr2", "232342", "235345234"],["ChrX", "4567", "45634"],["Chr1", "950000", "960000"]] 

In [2]: sorted(l, key=lambda x: int(x[0].replace('Chr', '')) if x[0].replace('Chr', '').isdigit() else x[0]) 
Out[2]: 
[['Chr1', '949699', '949700'], 
['Chr1', '950000', '960000'], 
['Chr2', '232342', '235345234'], 
['Chr10', '699', '800'], 
['Chr11', '3219', '444949'], 
['ChrX', '4567', '45634']] 

Or a more elegant variant:

sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if re.findall(r'\d+$', x[0]) else x[0]) 
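One caveat: both keys above return an int for numeric labels and a str for labels like 'ChrX'. Python 2 tolerates that mix, but Python 3 raises a TypeError when it has to compare an int with a str. A minimal sketch of a Python 3 friendly variant follows, assuming the label is in the first column and any number sits at the end of it; `chrom_key` is a hypothetical helper name, not part of the original answer.

```python
import re

def chrom_key(row):
    """Sort key based on the first column only (hypothetical helper)."""
    label = row[0]
    match = re.search(r'\d+$', label)
    if match:
        # Labels ending in digits sort first, by integer value.
        return (0, int(match.group()), '')
    # Labels without trailing digits (e.g. 'ChrX') sort after all numeric ones.
    return (1, 0, label)

l = [["Chr1", "949699", "949700"], ["Chr11", "3219", "444949"],
     ["Chr10", "699", "800"], ["Chr2", "232342", "235345234"],
     ["ChrX", "4567", "45634"], ["Chr1", "950000", "960000"]]

result = sorted(l, key=chrom_key)
print([row[0] for row in result])
```

Because every key is a tuple with the same layout, the comparison never mixes types. Like the one-liners above, this only looks at the first column, so rows with the same label keep their input order.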

The input is not always in this form. Sometimes it is just "1", "2", "11", "X" without the "Chr" prefix. – Megatron

Changed the sort to something like `sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if [i for i in x[0] if i.isdigit()] else x[0])` – greg

A more interesting variant: `import re; sorted(l, key=lambda x: int(''.join([i for i in x[0] if i.isdigit()])) if re.findall(r'\d+$', x[0]) else x[0])` – greg

0

Here's a more compact solution:

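import re
natkey = lambda e: [x or int(y) for x, y in re.findall(r'(\D+)|(\d+)', e)]
# data is the list of lists from the question; a list (not map) keeps the key comparable on Python 3
print(sorted(data, key=lambda item: [natkey(s) for s in item]))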
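The compact key mixes str and int chunks, which Python 3 refuses to compare when a text chunk meets a number chunk at the same position (for example 'Chr1' against a bare '1'). As a rough sketch of one way around that, each chunk can be tagged with a tuple so that numbers always sort before text. The names `natural_key` and `natural_sort_rows` are made up for this example, and lowercasing the text chunks is an assumption borrowed from the question's code.

```python
import re

def natural_key(text):
    """Split text into number / non-number chunks, tagged so any two chunks compare."""
    return [(0, int(digits), '') if digits else (1, 0, other.lower())
            for digits, other in re.findall(r'(\d+)|(\D+)', text)]

def natural_sort_rows(rows):
    """Sort a list of lists naturally, column by column."""
    return sorted(rows, key=lambda row: [natural_key(cell) for cell in row])

data = [["Chr1", "949699", "949700"], ["Chr11", "3219", "444949"],
        ["Chr10", "699", "800"], ["Chr2", "232342", "235345234"],
        ["ChrX", "4567", "45634"], ["Chr1", "950000", "960000"]]

print(natural_sort_rows(data))

# Labels without the "Chr" prefix work the same way.
print(natural_sort_rows([["11"], ["X"], ["2"], ["1"]]))
```

Unlike a key on the first column alone, this compares every cell, so the two 'Chr1' rows are ordered by their second column as well.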