I have a question about how to reduce running time.
My code is written in Python.
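It takes a huge dataset as input, processes it, performs calculations, and writes the output to an array. Most of the calculations are quite simple, such as summation. The input file has about 100 million rows and 3 columns. The problem I'm facing is that the run time is far too long. How can I reduce it?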
Here is the code I wrote.
I need to write all of the new values (from GenePair to RM_pval, with a header) to a new file. Thanks in advance.
import math
import sys
sys.path.append('/tools/lib/python2.7/site-packages')
import numpy as np
import scipy.stats

def log(x):
    return math.log(x)

fi = open('1.txt')
fo = open('2.txt', 'w')

for line in fi:  # iterates lazily; replaces the Python 2-only xreadlines()
    tmp = line.split('\t')
    GenePair = tmp[0].strip()
    PCC_A = float(tmp[1].strip())
    PCC_B = float(tmp[2].strip())
    # Fisher z-transform of each correlation
    ZVAL_A = 0.5 * log((1+PCC_A)/(1-PCC_A))
    ZVAL_B = 0.5 * log((1+PCC_B)/(1-PCC_B))
    ABS_ZVAL_A = abs(ZVAL_A)
    ABS_ZVAL_B = abs(ZVAL_B)
    Var_A = float(1)/float(21-3)  # SAMPLESIZE - 3
    Var_B = float(1)/float(18-3)  # SAMPLESIZE - 3
    WT_A = 1/Var_A  # float
    WT_B = 1/Var_B  # float
    ZVAL_A_X_WT_A = ZVAL_A * WT_A  # float
    ZVAL_B_X_WT_B = ZVAL_B * WT_B  # float
    SumofWT = (WT_A + WT_B)  # float
    SumofZVAL_X_WT = (ZVAL_A_X_WT_A + ZVAL_B_X_WT_B)  # float
    # FIXED MODEL
    meanES = SumofZVAL_X_WT/SumofWT  # float
    Var = float(1)/SumofWT  # float
    SE = math.sqrt(float(Var))  # float
    LL = meanES - (1.96 * SE)  # lower 95% limit
    UL = meanES + (1.96 * SE)  # upper 95% limit (was "-", same as LL)
    z_score = meanES/SE  # float
    p_val = scipy.stats.norm.sf(z_score)
    # CAL
    ES_POWER_X_WT_A = pow(ZVAL_A,2) * WT_A  # float
    ES_POWER_X_WT_B = pow(ZVAL_B,2) * WT_B  # float
    WT_POWER_A = pow(WT_A,2)
    WT_POWER_B = pow(WT_B,2)
    SumofES_POWER_X_WT = ES_POWER_X_WT_A + ES_POWER_X_WT_B
    SumofWT_POWER = WT_POWER_A + WT_POWER_B
    # COMPUTE TAU
    tmp_A = ZVAL_A - meanES
    tmp_B = ZVAL_B - meanES
    temp = pow(SumofZVAL_X_WT,2)
    Q = SumofES_POWER_X_WT - (temp /(SumofWT))
    if PCC_A != 0 or PCC_B != 0:
        df = 0
    else:
        df = 1
    # NOTE: this expression reduces algebraically to 0, and SumofWT_POWER
    # (computed above) is never used. The usual DerSimonian-Laird form is
    # SumofWT - SumofWT_POWER/SumofWT; left unchanged here.
    c = SumofWT - ((pow(SumofWT,2))/SumofWT)
    if c == 0:
        tau_square = 0
    else:
        tau_square = (Q - df)/c
    # calculation
    Var_total_A = Var_A + tau_square
    Var_total_B = Var_B + tau_square
    WT_total_A = float(1)/Var_total_A
    WT_total_B = float(1)/Var_total_B
    ZVAL_X_WT_total_A = ZVAL_A * WT_total_A
    ZVAL_X_WT_total_B = ZVAL_B * WT_total_B
    Sumoftotal_WT = WT_total_A + WT_total_B
    Sumoftotal_ZVAL_X_WT = ZVAL_X_WT_total_A + ZVAL_X_WT_total_B
    # RANDOM MODEL
    RM_meanES = Sumoftotal_ZVAL_X_WT/Sumoftotal_WT
    RM_Var = float(1)/Sumoftotal_WT
    RM_SE = math.sqrt(float(RM_Var))
    RM_LL = RM_meanES - (1.96 * RM_SE)
    RM_UL = RM_meanES + (1.96 * RM_SE)
    RM_z_score = RM_meanES/RM_SE  # was RM_meanES/RM_Var; the fixed model divides by SE
    RM_p_val = scipy.stats.norm.sf(RM_z_score)
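
One possible direction, sketched under assumptions rather than taken from the original post: the per-row arithmetic in the loop above can be applied to whole NumPy arrays at once, so the Python interpreter does not execute each statement 100 million times. The function name `fixed_effect` and the sample arrays are hypothetical. Only the fixed-model part is shown, and `np.arctanh(r)` is used because it equals `0.5 * log((1 + r) / (1 - r))`.

```python
import numpy as np
from scipy.stats import norm

# Sample sizes as they appear in the question's code (21 and 18).
N_A, N_B = 21, 18
WT_A, WT_B = N_A - 3.0, N_B - 3.0   # weight = 1 / variance = n - 3
SUM_WT = WT_A + WT_B


def fixed_effect(pcc_a, pcc_b):
    """Fixed-model summary for arrays of correlations (hypothetical helper)."""
    z_a = np.arctanh(pcc_a)          # Fisher z-transform, element-wise
    z_b = np.arctanh(pcc_b)
    mean_es = (z_a * WT_A + z_b * WT_B) / SUM_WT
    se = np.sqrt(1.0 / SUM_WT)       # scalar: identical for every row
    z_score = mean_es / se
    p_val = norm.sf(z_score)         # one call for the whole array
    return mean_es, se, z_score, p_val


# Made-up correlations standing in for columns 2 and 3 of the input file.
pcc_a = np.array([0.5, -0.2, 0.0])
pcc_b = np.array([0.3, 0.1, 0.0])
mean_es, se, z_score, p_val = fixed_effect(pcc_a, pcc_b)
print(mean_es, p_val)
```

For a file of roughly 100 million rows, the arrays would presumably be filled chunk by chunk (for example with `pandas.read_csv(..., sep='\t', header=None, chunksize=...)`) so that memory use stays bounded.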
1. Use a **profiler** to find the bottleneck. 2. Search for a solution for that specific piece of code. 3. If you don't find one, ask here. – sashkello
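
A minimal sketch of the profiling step, assuming Python 3 and the standard-library `cProfile` and `pstats` modules. `process_line`, `run`, and the synthetic `sample` rows are hypothetical stand-ins for the real loop and input file. In practice one would likely profile a slice of the real data rather than all 100 million rows.

```python
import cProfile
import io
import math
import pstats


def process_line(line):
    """Hypothetical stand-in for the per-row work in the question's loop."""
    _, a, b = line.split('\t')
    pcc_a, pcc_b = float(a), float(b)
    z_a = 0.5 * math.log((1 + pcc_a) / (1 - pcc_a))
    z_b = 0.5 * math.log((1 + pcc_b) / (1 - pcc_b))
    return (z_a * 18 + z_b * 15) / 33


def run(lines):
    return [process_line(line) for line in lines]


# Synthetic rows in the same tab-separated layout as the input file.
sample = ['gene%d\t0.5\t-0.25\n' % i for i in range(10000)]

profiler = cProfile.Profile()
profiler.enable()
results = run(sample)
profiler.disable()

# Collect the five most expensive entries, sorted by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats('cumulative').print_stats(5)
report = stream.getvalue()
print(report)
```

The report lists call counts and time per function, which shows whether parsing, the math, or a library call dominates.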
Also, this question seems like a good fit for http://codereview.stackexchange.com – aIKid
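But I actually don't think you can get any significant speedup here. I mean, there are lots of small things, but they won't contribute much. (Why do you recompute `SumofWT` every time? Isn't it a constant?) – sashkello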
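
To illustrate the point about `SumofWT`: everything derived only from the two sample sizes is the same for every row, so it can be computed once before the loop. The sketch below is an assumption-laden illustration, not the original poster's code. The name `fixed_effect_row` is hypothetical, and `math.erfc` is swapped in for `scipy.stats.norm.sf` on the assumption that per-call overhead on scalars may matter. The two agree because sf(z) = 0.5 * erfc(z / sqrt(2)).

```python
import math

# Loop-invariant values, computed once (sample sizes 21 and 18 from the question).
VAR_A = 1.0 / (21 - 3)
VAR_B = 1.0 / (18 - 3)
WT_A = 1.0 / VAR_A
WT_B = 1.0 / VAR_B
SUM_WT = WT_A + WT_B
SE = math.sqrt(1.0 / SUM_WT)
SQRT2 = math.sqrt(2.0)


def fixed_effect_row(pcc_a, pcc_b):
    """Per-row fixed-model values; only row-dependent work stays here."""
    z_a = math.atanh(pcc_a)   # same as 0.5 * log((1 + r) / (1 - r))
    z_b = math.atanh(pcc_b)
    mean_es = (z_a * WT_A + z_b * WT_B) / SUM_WT
    z_score = mean_es / SE
    p_val = 0.5 * math.erfc(z_score / SQRT2)   # normal survival function
    return mean_es, z_score, p_val


mean_es, z_score, p_val = fixed_effect_row(0.5, 0.3)
print(mean_es, z_score, p_val)
```

Whether this gives a noticeable gain is best checked with a profiler.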