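I am trying to do the following, but it takes a very long time. Could someone suggest how to simplify or speed up this big-data processing script?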
f = open('answer.csv', 'w')
f.write('Datetime,0: Vm,0: Va,1: Vm,1: Va,2: Vm,2: Va,3: Vm,3: Va,4: Vm,4: Va,5: Vm,5: Va,6: Vm,6: Va,7: Vm,7: Va,8: Vm,8: Va,9: Vm,9: Va,10: Vm,10: Va,11: Vm,11: Va,12: Vm,12: Va,13: Vm,13: Va\n')
# 'n' is around 8000000
# 'PQ_data' is a pandas DataFrame with more than n rows
# 'class_' is an instance of a Python class with some methods
# (trailing underscore because 'class' is a reserved word)
for i in range(n):
    p = []
    q = []
    for j in range(1, 14):
        if j <= 10:
            # only columns '1 P'..'10 P' and '1 Q'..'10 Q' exist
            p.append(PQ_data['{} P'.format(j)][i])
            q.append(PQ_data['{} Q'.format(j)][i])
        else:
            # inputs 11..13 are always zero
            p.append(0)
            q.append(0)
    class_.do_something(p, q)
    vm = class_.get_Vm().tolist()
    va = class_.get_Va().tolist()
    # above methods return 14 length lists.
    # PQ_data.index has datetime values
    f.write('{}'.format(PQ_data.index[i]))
    for j in range(len(vm)):
        f.write(',{},{}'.format(vm[j], va[j]))
    f.write('\n')
f.close()
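One possible way to cut the loop overhead, offered as a sketch rather than a definitive fix: pull the `P` and `Q` columns out of `PQ_data` once as NumPy arrays, so the loop no longer does about 20 pandas label lookups per row, and let the `csv` module build each line. `DummySolver` and `write_answers` are hypothetical names introduced here; the real class's computation is not shown in the question, so the stand-in just returns 14 values per call. If `do_something` itself dominates the run time, this will help only marginally.

```python
import csv
import io

import numpy as np
import pandas as pd


class DummySolver:
    """Hypothetical stand-in for the asker's class (real computation unknown)."""

    def do_something(self, p, q):
        # 13 inputs in, 14 outputs out, mirroring the shapes in the question.
        self._vm = np.concatenate(([1.0], np.asarray(p, dtype=float)))
        self._va = np.concatenate(([0.0], np.asarray(q, dtype=float)))

    def get_Vm(self):
        return self._vm

    def get_Va(self):
        return self._va


def write_answers(PQ_data, solver, out, n=None, n_inputs=13, n_measured=10, n_outputs=14):
    """Write one CSV row per DataFrame row to the file-like object `out`."""
    frame = PQ_data if n is None else PQ_data.iloc[:n]
    rows = len(frame)

    # Extract the measured columns once; inputs 11..13 stay zero.
    p_all = np.zeros((rows, n_inputs))
    q_all = np.zeros((rows, n_inputs))
    p_cols = ["{} P".format(j) for j in range(1, n_measured + 1)]
    q_cols = ["{} Q".format(j) for j in range(1, n_measured + 1)]
    p_all[:, :n_measured] = frame[p_cols].to_numpy()
    q_all[:, :n_measured] = frame[q_cols].to_numpy()

    writer = csv.writer(out, lineterminator="\n")
    header = ["Datetime"]
    for k in range(n_outputs):
        header += ["{}: Vm".format(k), "{}: Va".format(k)]
    writer.writerow(header)

    for ts, p, q in zip(frame.index, p_all, q_all):
        solver.do_something(p.tolist(), q.tolist())
        vm = solver.get_Vm().tolist()
        va = solver.get_Va().tolist()
        row = [ts]
        for m, a in zip(vm, va):
            row += [m, a]
        writer.writerow(row)


# Tiny demo with made-up data (two rows instead of ~8,000,000).
idx = pd.date_range("2020-01-01", periods=2, freq="D")
data = {}
for j in range(1, 11):
    data["{} P".format(j)] = [float(j), float(2 * j)]
    data["{} Q".format(j)] = [0.5 * j, 1.0 * j]
demo = pd.DataFrame(data, index=idx)

buf = io.StringIO()
write_answers(demo, DummySolver(), buf)
print(buf.getvalue())
```

For the real job, `out` would be a file opened with `open('answer.csv', 'w', newline='')` inside a `with` block, and `solver` the actual class instance.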
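Thanks! @inspectorG4dget this is better than my code, but it still takes a lot of time. That is probably due to the time taken by the do_something function itself. –
@code_dragon: Quite likely. If you create a new post with the definition of `do_something`, we might be able to optimize it. – inspectorG4dget
@inspectorG4dget But do_something is not simple. The class uses some API to do some calculations and returns the necessary output. –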
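Before rewriting anything else, it may be worth measuring how much of the loop is actually spent inside `do_something`. The following is a rough timing sketch using only the standard library; `SlowSolver` and `measure` are hypothetical names, and the 1 ms sleep is an arbitrary stand-in for the real API call.

```python
import time


class SlowSolver:
    """Hypothetical stand-in whose do_something takes about 1 ms per call."""

    def do_something(self, p, q):
        time.sleep(0.001)


def measure(solver, samples):
    """Return (seconds spent inside do_something, seconds for the whole loop)."""
    in_call = 0.0
    start = time.perf_counter()
    for p, q in samples:
        t0 = time.perf_counter()
        solver.do_something(p, q)
        in_call += time.perf_counter() - t0
    total = time.perf_counter() - start
    return in_call, total


# 20 dummy rows of 13 inputs each, matching the shapes in the question.
samples = [([0.0] * 13, [0.0] * 13) for _ in range(20)]
in_call, total = measure(SlowSolver(), samples)

per_call = in_call / len(samples)
print("share of loop time in do_something: {:.0%}".format(in_call / total))
print("estimated hours for 8,000,000 rows: {:.1f}".format(per_call * 8000000 / 3600))
```

If the share is close to 100%, the remaining options are limited to making the class itself faster or calling it in parallel, since the surrounding Python loop is then not the bottleneck.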