2017-01-25 104 views
0

How to sort a list using specific rules

Hi, I have complex data objects that I want to sort by s. A simplified version is below:

class Data(object):
    def __init__(self, s):
        self.s = s

Each of these data objects will be placed in a specific category for easier use later. Again, a simplified version is below:

class DataCategory(object):
    def __init__(self, id1, id2, linked_data=None):
        self.id1 = id1
        self.id2 = id2
        self.ld = linked_data

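I want to sort the data by their s number, but with a few extra rules. If a data object from the first data collection has just been used, then I want to use a data object from the second collection next, provided its number is the same or lower. Here is what I get and what I want to achieve: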

# order I get 
# [['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2], ['p01g01r06', 2], ['p02g01r02', 2], ['p02g01r03', 2], ['p02g01r04', 2], ['p01g01r04', 1], ['p02g01r06', 1]] 
# order I want 
# [['p02g01r05', 5], ['p01g01r05', 4], ['p02g01r01', 4], ['p01g01r01', 3], ['p02g01r02', 2], ['p01g01r02', 2], ['p02g01r03', 2], ['p01g01r03', 2], ['p02g01r04', 2], ['p01g01r06', 2], ['p02g01r06', 1], ['p01g01r04', 1]]

This is what I have created so far, but I think I am heading in the wrong direction. I believe the list of indexes to be replaced is correct.

# Some data objects 
p01g01r01 = Data(3) 
p01g01r02 = Data(2) 
p01g01r03 = Data(2) 
p01g01r04 = Data(1) 
p01g01r05 = Data(4) 
p01g01r06 = Data(2) 

p02g01r01 = Data(4) 
p02g01r02 = Data(2) 
p02g01r03 = Data(2) 
p02g01r04 = Data(2) 
p02g01r05 = Data(5) 
p02g01r06 = Data(1) 

p01g01 = DataCategory("01", "01", []) 
p02g01 = DataCategory("02", "01", []) 


# link data to data category 
def ldtdc(dc):
    lst = []
    data = "p" + dc.id1 + "g" + dc.id2 + "r"
    for i in range(1, 7):
        if i < 10:
            lst.append(data + "0" + str(i))
        else:
            lst.append(data + str(i))
    return lst

p01g01.ld = ldtdc(p01g01) 
p02g01.ld = ldtdc(p02g01) 


# /@= This starts to get way too complicated fast ############################ 
def lstu(ag, dg):
    lst = []
    # data list of first collection
    dlofc = []
    # data list of second collection
    dlosc = []

    # for every data unit that exists in data collection
    for unit in ag.ld:
        # lst.append([unit, globals()[unit].s+10])
        lst.append([unit, globals()[unit].s])
        dlofc.append([unit, globals()[unit].s])

    for unit in dg.ld:
        lst.append([unit, globals()[unit].s])
        dlosc.append([unit, globals()[unit].s])

    # lambda function is used here to sort list by data value ([1] is index of the item)
    lst = sorted(lst, key=lambda x: x[1], reverse=True)
    # current index
    ci = 0

    previous_data = ["last data unit will be stored here", 0]
    # sorted list
    slst = []

    for unit in lst:
        try:
            next_data = lst[ci+1]
        except IndexError:
            next_data = ["endoflist", 0]
        if previous_data[0] == "last data unit will be stored here":
            pass
        elif previous_data[0][:6] == unit[0][:6]:
            if unit[0][:6] not in dlofc[0][0]:
                slst.append([unit[0], unit[1], ci])
            elif unit[0][:6] not in dlosc[0][0]:
                slst.append([unit[0], unit[1], ci])
            else:
                print("Error")

        previous_data = unit
        ci += 1

    print("slist below")
    print(slst)

    return lst
# \@= END ##################################################################### 


print(p01g01.ld)
print(p02g01.ld)


data_list = lstu(p01g01, p02g01)
print(data_list)

What is a fast and correct way to sort this kind of data?

+1

Have you considered the 'sorted' function or the 'list.sort' method? – skyking

+0

As you can see in the example above, I already use sorted, but it is not enough to satisfy all the requirements for the new list. – Hsin

+0

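Do you know/realize that you can control how 'sorted' and 'list.sort' compare elements while sorting? Once you can control that, I don't see why you shouldn't be able to use 'sorted' or 'list.sort'. – skyking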
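For readers unfamiliar with what the comment refers to, here is a minimal sketch of controlling the comparison through the `key` argument of `sorted`. The sample list and the helper name `by_s_then_name` are made up for illustration. Note that this only shows the mechanism: a key looks at one item at a time, so it does not by itself express the "alternate between collections" rule from the question.

```python
# Hypothetical sample data: [name, s] pairs shaped like those in the question.
items = [['p01g01r05', 4], ['p02g01r01', 4], ['p02g01r05', 5], ['p01g01r01', 3]]


def by_s_then_name(item):
    """Sort key: highest s first (hence the negation), then name ascending."""
    name, s = item
    return (-s, name)


ordered = sorted(items, key=by_s_then_name)
print(ordered)
```

Tuples compare element by element, so the name is only consulted when two items share the same s.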

Answers

0

Found a solution. The new lstu function:

# replaced lambda with normal function
def get_key(item):
    return item[1]


def lstu(ag, dg):
    # ag list
    agslst = []
    # dg list
    dgslst = []

    # for every unit in first data collection
    for unit in ag.ld:
        agslst.append([unit, globals()[unit].s])
    # sorted first data collection list
    agslst = sorted(agslst, key=get_key, reverse=True)
    print(agslst)

    for unit in dg.ld:
        dgslst.append([unit, globals()[unit].s])
    # 2nd collection sorted list
    dgslst = sorted(dgslst, key=get_key, reverse=True)
    print(dgslst)

    lst = []
    # last item
    li = ["Empty", 0]

    # merge while both lists still have items
    while agslst and dgslst:
        if agslst[0][1] == dgslst[0][1]:
            # tie: take from the collection that did not supply the last item
            if li[0][:6] == agslst[0][0][:6]:
                li = dgslst.pop(0)
            else:
                li = agslst.pop(0)
        elif agslst[0][1] > dgslst[0][1]:
            li = agslst.pop(0)
        else:
            li = dgslst.pop(0)
        lst.append(li)

    # one list is exhausted; append whatever is left in the other
    lst.extend(agslst)
    lst.extend(dgslst)

    return lst

This way the new (final) list fulfils the requirements mentioned earlier.

Output:

[['p02g01r05', 5], ['p01g01r05', 4], ['p02g01r01', 4], ['p01g01r01', 3], ['p02g01r02', 2], ['p01g01r02', 2], ['p02g01r03', 2], ['p01g01r03', 2], ['p02g01r04', 2], ['p01g01r06', 2], ['p02g01r06', 1], ['p01g01r04', 1]]

I am open to any optimization suggestions.
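One possible tidy-up, offered only as a sketch: the same merge idea written against plain lists of [name, s] pairs, so it needs neither `globals()` nor name prefixes. The function name `merge_alternating` and the flag `last_from_a` are hypothetical, and the choice to prefer the first list on the very first tie is an assumption that mirrors the `li = ["Empty", 0]` starting value above.

```python
def merge_alternating(first, second):
    """Merge two lists of [name, s] pairs, highest s first.

    On a tie, take from whichever list did not supply the previous item.
    """
    a = sorted(first, key=lambda item: item[1], reverse=True)
    b = sorted(second, key=lambda item: item[1], reverse=True)
    merged = []
    i = j = 0
    last_from_a = False  # assumption: on the very first tie, prefer `first`
    while i < len(a) and j < len(b):
        if a[i][1] > b[j][1]:
            take_a = True
        elif a[i][1] < b[j][1]:
            take_a = False
        else:
            take_a = not last_from_a  # tie: switch source
        if take_a:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
        last_from_a = take_a
    # One list is exhausted; the rest of the other is already in order.
    merged.extend(a[i:])
    merged.extend(b[j:])
    return merged


# The s values from the question, keyed by the same names.
first = [['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2],
         ['p01g01r04', 1], ['p01g01r05', 4], ['p01g01r06', 2]]
second = [['p02g01r01', 4], ['p02g01r02', 2], ['p02g01r03', 2],
          ['p02g01r04', 2], ['p02g01r05', 5], ['p02g01r06', 1]]
result = merge_alternating(first, second)
print(result)
```

Using indexes instead of `pop(0)` avoids shifting the whole list on every step, which should matter only for long lists.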

1

Have you tried sorting by the string first and then by the number in each item?

>>> items = [['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p01g01r02', 2], ['p01g01r03', 2], ['p01g01r06', 2], ['p02g01r02', 2], ['p02g01r03', 2], ['p02g01r04', 2], ['p01g01r04', 1], ['p02g01r06', 1]] 
>>> partially_sorted = sorted(items, key=lambda item: item[0], reverse=True) 
>>> sorted(partially_sorted, key=lambda item: item[1], reverse=True) 
[['p02g01r05', 5], ['p02g01r01', 4], ['p01g01r05', 4], ['p01g01r01', 3], ['p02g01r04', 2], ['p02g01r03', 2], ['p02g01r02', 2], ['p01g01r06', 2], ['p01g01r03', 2], ['p01g01r02', 2], ['p02g01r06', 1], ['p01g01r04', 1]] 
+0

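It won't work. If they have the same "s", there should be one item from p01g01 followed by one item from p02g01. In the example above, we would get many items with the same "s" from the same collection. – Hsin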

+0

Is it basically merging two sorted lists? One sorted list named p01g01 and the other p02g01? – aisbaa
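For comparison, a plain merge of two already sorted lists can be sketched with the standard library's `heapq.merge`. The two short sample lists are hypothetical. As far as I can tell, on equal keys it keeps taking from the earlier input list first, so it merges correctly by s but does not alternate between collections the way the question asks.

```python
import heapq

# Hypothetical sample data: each list is already sorted by s, highest first.
first = [['p01g01r05', 4], ['p01g01r02', 2], ['p01g01r03', 2]]
second = [['p02g01r05', 5], ['p02g01r02', 2], ['p02g01r03', 2]]

# key= and reverse= are available on heapq.merge from Python 3.5 onwards.
merged = list(heapq.merge(first, second, key=lambda item: item[1], reverse=True))
print(merged)
```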

+0

No, Python's sort is stable: https://en.wikipedia.org/wiki/Sorting_algorithm#Stability – aisbaa
