2013-10-18

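Ranking tuples with ties

I am writing an extension of the Wilcoxon rank-sum test, which requires me to write the basic functionality of the test first. It also means I cannot use SciPy for this exercise.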

I have the basic skeleton code in place, but I am having trouble averaging the ranks of tied values. Here is my code:

#read in data 
m1 = [0,0,0,0,0,2,3,3,3,4,4,5,6,10,10,10,11,12,15,15,15,20,22,25,25,27,30] 
w1 = [0,0,0,0,0,0,1,3,3,3,3,7,8,8,19,20,27,30] 

#convert to tuples, incl where they came from 
m1t = [] 
for m in m1: 
    m1t.append((m, "m1")) 
w1t = [] 
for w in w1: 
    w1t.append((w, "w1")) 

all1t = m1t + w1t #combine 

all1ts = sorted(all1t, key=lambda tup: tup[0]) #sort 

all1tsr = [row+(i,) for i,row in enumerate(all1ts,0)] #rank 

#revert back to original grouping
m1r = [i for i in all1tsr if i[1]=="m1"] 
w1r = [i for i in all1tsr if i[1]=="w1"] 

And here is the current output:

>>> all1tsr[:15] 
[(0, 'm1', 0), 
(0, 'm1', 1), 
(0, 'm1', 2), 
(0, 'm1', 3), 
(0, 'm1', 4), 
(0, 'w1', 5), 
(0, 'w1', 6), 
(0, 'w1', 7), 
(0, 'w1', 8), 
(0, 'w1', 9), 
(0, 'w1', 10), 
(1, 'w1', 11), 
(2, 'm1', 12), 
(3, 'm1', 13), 
(3, 'm1', 14)] 

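Element 1 of each tuple is the value the tuples are sorted by, element 2 is an identifier, and element 3 is the rank according to element 1. There are 11 observations with 0 as element 1. Right now they are assigned ascending ranks (0 through 10), but I would like to average those ranks somehow, so that all of them get rank 5.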

In other words, I would like something like this:

[(0, 'm1', 5), 
(0, 'm1', 5), 
(0, 'm1', 5), 
(0, 'm1', 5), 
(0, 'm1', 5), 
(0, 'w1', 5), 
(0, 'w1', 5), 
(0, 'w1', 5), 
(0, 'w1', 5), 
(0, 'w1', 5), 
(0, 'w1', 5), 
(1, 'w1', 11), 
(2, 'm1', 12), 
(3, 'm1', 13.5), 
(3, 'm1', 13.5)] 

All feedback is welcome, thanks.

1 Answer

For starters, I would build all1ts in a shorter way:

import itertools 

all1ts = sorted(itertools.chain(((m, "m1") for m in m1), 
           ((w, "w1") for w in w1))) 

all1tsr = [row+(i,) for i,row in enumerate(all1ts)] 
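As a quick sanity check, the sketch below (reusing the data from the question; the names `by_key` and `by_tuple` are made up for illustration) compares the question's key-based sort with plain tuple sorting. The two orders agree here only because the label 'm1' happens to sort before 'w1' and the question concatenates m1t before w1t. With other labels the order inside a tie group could differ, which would not change the averaged ranks.

```python
import itertools

m1 = [0,0,0,0,0,2,3,3,3,4,4,5,6,10,10,10,11,12,15,15,15,20,22,25,25,27,30]
w1 = [0,0,0,0,0,0,1,3,3,3,3,7,8,8,19,20,27,30]

# Question's approach: concatenate, then stable-sort on the value only.
by_key = sorted([(m, "m1") for m in m1] + [(w, "w1") for w in w1],
                key=lambda tup: tup[0])

# Shorter approach: sort the tuples themselves (value first, then label).
by_tuple = sorted(itertools.chain(((m, "m1") for m in m1),
                                  ((w, "w1") for w in w1)))

print(by_key == by_tuple)
```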

Then I would use itertools.groupby, which is basically designed for this kind of thing.

groups = []
for _, group in itertools.groupby(all1tsr, lambda x: x[0]):
    group = list(group)
    # float() forces true division, so a tie keeps a fractional rank such as 20.5
    rank = sum(x[2] for x in group) / float(len(group))
    groups.extend((val, identifier, rank) for val, identifier, _ in group)

Running this on your test data gives me this result:

[(0, 'm1', 5.0),
(0, 'm1', 5.0),
(0, 'm1', 5.0),
(0, 'm1', 5.0),
(0, 'm1', 5.0),
(0, 'w1', 5.0),
(0, 'w1', 5.0),
(0, 'w1', 5.0),
(0, 'w1', 5.0),
(0, 'w1', 5.0),
(0, 'w1', 5.0),
(1, 'w1', 11.0),
(2, 'm1', 12.0),
(3, 'm1', 16.0),
(3, 'm1', 16.0),
(3, 'm1', 16.0),
(3, 'w1', 16.0),
(3, 'w1', 16.0),
(3, 'w1', 16.0),
(3, 'w1', 16.0),
(4, 'm1', 20.5),
(4, 'm1', 20.5),
(5, 'm1', 22.0),
(6, 'm1', 23.0),
(7, 'w1', 24.0),
(8, 'w1', 25.5),
(8, 'w1', 25.5),
(10, 'm1', 28.0),
(10, 'm1', 28.0),
(10, 'm1', 28.0),
(11, 'm1', 30.0),
(12, 'm1', 31.0),
(15, 'm1', 33.0),
(15, 'm1', 33.0),
(15, 'm1', 33.0),
(19, 'w1', 35.0),
(20, 'm1', 36.5),
(20, 'w1', 36.5),
(22, 'm1', 38.0),
(25, 'm1', 39.5),
(25, 'm1', 39.5),
(27, 'm1', 41.5),
(27, 'w1', 41.5),
(30, 'm1', 43.5),
(30, 'w1', 43.5)]

I think this is what you want.
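The same idea can be wrapped in one function. What follows is a minimal sketch rather than a definitive implementation: the names `average_ranks`, `start` and `rank_sums` are made up here. `start=0` matches the 0-based ranks used in the question; the rank-sum test is conventionally stated with ranks starting at 1, so `start=1` may be the better choice for the actual statistic. Instead of averaging stored ranks, it takes the midpoint of the positions a tie group occupies, which is the same number.

```python
import itertools

def average_ranks(samples, start=0):
    """Return (value, label, rank) tuples; tied values share their mean rank.

    `samples` maps a label (hypothetical names such as "m1") to a list of values.
    """
    labelled = sorted((value, label)
                      for label, values in samples.items()
                      for value in values)
    ranked = []
    position = start  # rank of the first item in the current tie group
    for _, group in itertools.groupby(labelled, key=lambda pair: pair[0]):
        group = list(group)
        # The group occupies ranks position .. position + len(group) - 1,
        # whose mean is the midpoint of that range.
        rank = position + (len(group) - 1) / 2.0
        ranked.extend((value, label, rank) for value, label in group)
        position += len(group)
    return ranked

m1 = [0,0,0,0,0,2,3,3,3,4,4,5,6,10,10,10,11,12,15,15,15,20,22,25,25,27,30]
w1 = [0,0,0,0,0,0,1,3,3,3,3,7,8,8,19,20,27,30]

ranked = average_ranks({"m1": m1, "w1": w1})

# Sum the averaged ranks per group, which is the quantity the test builds on.
rank_sums = {}
for _, label, rank in ranked:
    rank_sums[label] = rank_sums.get(label, 0.0) + rank

print(ranked[:3])
print(rank_sums)
```

Because groupby only merges adjacent items, the data has to be sorted by value before grouping, which the function does itself.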

This is what I was after, thanks. I need to learn more about this magic you call "itertools". – alexhli

That's great. :-) If this is what you were after, could you mark this answer as accepted? Thanks! –