2013-08-21 66 views
0

How do I stop computing a running average as soon as a condition fails? Consider the following code:

sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,  1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375] 
avg = []; final = [] 

def runningMean(seq, n=0, total=0): # function called recursively
    if not seq:
        return []
    total = total + int(seq[-1])
    return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

def main():
    avg = runningMean(sub, n=0, total=0) # running mean starting from the last element in the list, i.e. 1710375
    print avg
    for i in range(len(sub)):
        if int(sub[i]) > float(avg[i] * 0.9): # checking the condition
            final.append(sub[i])
    print final


if __name__ == '__main__':
    main()

Output: the list returned by runningMean, followed by the elements of sub that pass the check:

[1282960.6216216215, 1297286.75, 1312372.4571428571, 1328319.6764705882, 1345230.0909090908, 1363181.3125, 1382289.2580645161, 1402634.7, 1409742.7931034483, 1417241.142857143, 1425232.111111111, 1433651.3846153845, 1442738.76, 1452397.5, 1462798.0869565217, 1474143.2727272727, 1486568.142857143, 1492803.2, 1499691.7368421052, 1507344.111111111, 1515724.0, 1525005.25, 1535471.9333333333, 1547401.642857143, 1561126.2307692308, 1577136.75, 1595934.1818181819, 1618484.2, 1646032.3333333333, 1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0] 

    [1361867, 1361921, 1361949, 1364886, 1367224, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375] 

What I need is for it to stop computing the running average as soon as this condition is no longer satisfied:

(sub[i] > float(avg[i] * 0.9)) 

That is, the running-average result should be:

[1680349.875, 1710198.857142857, 1710330.6666666667, 1710344.0, 1710353.0, 1710363.3333333333, 1710370.0, 1710375.0] 
    [1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375] 
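The cutoff in the expected output can be checked by hand. The short sketch below only redoes the arithmetic for 1471407, the first value (counting from the end of sub) that fails the check; the variable names are illustrative and not part of the original code.

```python
value = 1471407              # first element, counting from the end, that fails
mean_so_far = 1680349.875    # running mean from the end, including that element
threshold = 0.9 * mean_so_far

print(threshold)             # roughly 1512314.89
print(value > threshold)     # False, so the search should stop here

# The element just after it in sub still passes the same check.
print(1709408 > 0.9 * 1710198.857142857)  # True
```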

It would be helpful if anyone could suggest a solution for this in Python.

+1

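I don't think your moving-average algorithm is correct. Basically, each result is the equally weighted average of the current value and all the values that follow it. Previous values are ignored. This isn't really an answer, though, so I'll leave it as a comment. – Blckknght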

+0

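@Blckknght My running-average algorithm works as intended. I compute the running average starting from the last item of the list, e.g. for 1710375 the running average is 1710375.0, next is (1710365 + 1710375) / 2 == 1710370.0, and so on –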

+0

OK, I just wanted to make sure that's really what you want. It's not the first thing I'd expect when you describe it as a "running average". – Blckknght

Answers

1

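I'd suggest reimplementing your average calculator as a generator. A generator only computes the next value it needs to yield when it is iterated over. If you stop iterating early, the rest of the computation is never done.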

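Also, it's easier to design the code to iterate forwards rather than backwards. If you need to go backwards, use the reversed function to get a reverse iterator, or call the reverse method on the list.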
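As a small illustration of the two options just mentioned, the sketch below contrasts the built-in reversed with the list.reverse method. The sample list is made up for the example.

```python
data = [1, 2, 3, 4]

# reversed() returns a lazy iterator and leaves the list untouched.
backwards = reversed(data)
first = next(backwards)
print(first)   # 4
print(data)    # [1, 2, 3, 4]

# list.reverse() reorders the list in place and returns None.
copy = list(data)
result = copy.reverse()
print(copy)    # [4, 3, 2, 1]
print(result)  # None
```

For stopping early, reversed is usually the better fit, since it never copies or reorders the data.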

Here's a generator that computes a cumulative average (in the forward direction, rather than looking backwards):

def runningMean(iterable):
    """A generator, yielding a cumulative average of its input."""
    num = 0.0  # float, so the division below is not truncated in Python 2
    denom = 0
    for x in iterable:
        num += x
        denom += 1
        yield num/denom
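To see that a generator like this really does no work beyond what is consumed, here is a minimal Python 3 sketch. The names running_mean and logged, and the sample numbers, are made up for this illustration; logged simply records which inputs were actually pulled.

```python
def running_mean(iterable):
    """Yield the cumulative average of the values seen so far."""
    total = 0.0
    count = 0
    for x in iterable:
        total += x
        count += 1
        yield total / count


def logged(values, seen):
    """Yield values unchanged, recording each one that is actually consumed."""
    for v in values:
        seen.append(v)
        yield v


seen = []
means = running_mean(logged([10, 20, 30, 40, 50], seen))
first_two = [next(means), next(means)]

print(first_two)  # [10.0, 15.0]
print(seen)       # [10, 20] -- the remaining inputs were never touched
```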

To get the reverse cumulative average you want, use a reversed iterator over your original data:

>>> sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,  1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375] 
>>> list(runningMean(reversed(sub))) 
[1710375.0, 1710370.0, 1710363.3333333333, 1710353.0, 1710344.0, 1710330.6666666667, 1710198.857142857, 1680349.875, 1646032.3333333333, 1618484.2, 1595934.1818181819, 1577136.75, 1561126.2307692308, 1547401.642857143, 1535471.9333333333, 1525005.25, 1515724.0, 1507344.111111111, 1499691.7368421052, 1492803.2, 1486568.142857143, 1474143.2727272727, 1462798.0869565217, 1452397.5, 1442738.76, 1433651.3846153845, 1425232.111111111, 1417241.142857143, 1409742.7931034483, 1402634.7, 1382289.2580645161, 1363181.3125, 1345230.0909090908, 1328319.6764705882, 1312372.4571428571, 1297286.75, 1282960.6216216215] 

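You can flip this around with the list.reverse() method if you want to see it in the same order as the original input, but if you want to stop the computation early, I think you need to keep it going backwards a little longer.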

To stop when you find a value that is more than 10% below the cumulative average, you can use itertools.takewhile:

import itertools 

results = list(itertools.takewhile(lambda x: x[0] > 0.9 * x[1], 
            itertools.izip(reversed(sub), 
                runningMean(reversed(sub))))) 

In Python 3, use the regular zip builtin instead of itertools.izip.
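For reference, a Python 3 version of the same idea might look like the sketch below. It assumes a generator equivalent to runningMean (renamed running_mean here) and, to keep the example short, uses only the last ten values of sub, which is enough to reach the cutoff.

```python
import itertools


def running_mean(iterable):
    """Yield the cumulative average of the values seen so far."""
    total = 0
    count = 0
    for x in iterable:
        total += x
        count += 1
        yield total / count  # "/" is true division in Python 3


# Last ten values of `sub` from the question.
tail = [1370551, 1371492, 1471407, 1709408, 1710264,
        1710308, 1710322, 1710350, 1710365, 1710375]

# zip is lazy in Python 3, so nothing past the first failing pair is computed.
pairs = zip(reversed(tail), running_mean(reversed(tail)))
results = list(itertools.takewhile(lambda p: p[0] > 0.9 * p[1], pairs))
results.reverse()  # back to the original order

for value, average in results:
    print(value, average)
```

Both reversed(tail) calls create independent iterators, so the values and their averages stay in step.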

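This gives you a list of (value, average) pairs that satisfy your condition, starting from the end and stopping just before the first value that fails the test. Here's how you can view them: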

results.reverse() # put them back in regular order
for value, average in results:
    print value, average

Output:

1709408 1710198.857142857 
1710264 1710330.6666666667 
1710308 1710344.0 
1710322 1710353.0 
1710350 1710363.3333333333 
1710365 1710370.0 
1710375 1710375.0 
+0

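@Blckknght def runningMean(iterable): num = 0 denom = 0 for x in iterable: num += x denom += 1 yield num/denom avg = list(runningMean(reversed(sub))) print avg When I do this, it doesn't give me the running averages; it just prints the sub list in reverse order –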

+0

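@JhonWatson: Hmm, I can't see how that would happen. Are you sure the output is exactly the original values reversed, and not just some other integers? In Python 2, you may need to make sure you do 'from __future__ import division' at the top of the file (before any other imports), otherwise the division will truncate to the next lowest integer. Or, I suppose, another fix would be to initialize 'num' or 'denom' to '0.0' rather than '0'. – Blckknght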

0
sub = [767220, 769287, 770167, 770276, 770791, 770835, 771926, 1196500, 1199789,  1201485, 1206331, 1206467, 1210929, 1213184, 1213204, 1213221, 1361867, 1361921, 1361949, 1364886, 1367224, 1368005, 1368456, 1368982, 1369000, 1370365, 1370434, 1370551, 1371492, 1471407, 1709408, 1710264, 1710308, 1710322, 1710350, 1710365, 1710375] 

def runningMean(seq, n=0, total=0): # function called recursively
    if not seq:
        return []
    total = total + int(seq[-1])
    if int(seq[-1]) < total/float(n+1) * 0.9: # Check your condition to see if it's time to stop averaging.
        return []
    return runningMean(seq[:-1], n=n+1, total=total) + [total/float(n+1)]

avg = runningMean(sub, n = 0, total = 0) 

print avg 
print sub[-len(avg):] 
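The same early stop can also be written as a loop. This is only a sketch in Python 3, not part of the answer above: the name tail_running_mean and the ratio parameter are made up here. A loop avoids copying the list with seq[:-1] at every step and does not run into Python's recursion limit on long inputs.

```python
def tail_running_mean(seq, ratio=0.9):
    """Walk seq from the end; stop at the first value below ratio * running mean."""
    means = []
    total = 0
    for count, value in enumerate(reversed(seq), start=1):
        total += value
        mean = total / count
        if value < ratio * mean:  # same stop test as the recursive version
            break
        means.append(mean)
    means.reverse()  # back to the original order
    return means


# Last ten values of `sub` from the question.
tail = [1370551, 1371492, 1471407, 1709408, 1710264,
        1710308, 1710322, 1710350, 1710365, 1710375]

means = tail_running_mean(tail)
kept = tail[len(tail) - len(means):]  # also correct when means is empty

print(means)
print(kept)
```

Slicing with len(tail) - len(means) rather than -len(means) matters only when nothing passes the check: a slice starting at -0 would return the whole list.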
+0

What if I also want to print the elements of seq that satisfy the condition? i.e. [1709408,1710264,1710308,1710322,1710350,1710365,1710375] –

+0

@JhonWatson Added it above. – Brionius

0

To get the expected running average, I ran:

sub.reverse() 
avg = runningMean(sub,n = 0,total = 0) #function call to obtain running mean starting from last element in the list i,e 1710375 
print avg 

The comparison part that follows is unclear. Can you describe the algorithm in words?