Exporting a 2GB+ SELECT to CSV with Python (out of memory)
I am trying to export a large file from Netezza (using the Netezza ODBC driver + pyodbc). The solution below throws a MemoryError, and if I loop without the "list" it is very slow. Any ideas for a middle-ground solution that does not kill my server/Python process but can run faster?
cursorNZ.execute(sql)
archi = open("c:\test.csv", "w")
lista = list(cursorNZ.fetchall())
for fila in lista:
    registro = ''
    for campo in fila:
        campo = str(campo)
        registro = registro + str(campo) + ";"
    registro = registro[:-1]
    registro = registro.replace('None', 'NULL')
    registro = registro.replace("'NULL'", "NULL")
    archi.write(registro + "\n")
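A middle ground between fetchall() (the whole result set in memory) and fetching one row at a time is cursor.fetchmany(), which is part of the same DB-API 2.0 interface that pyodbc implements. A minimal sketch, using sqlite3 as a stand-in for the pyodbc connection; the helper name, batch size, and file path are illustrative, not from the original post:

```python
import csv
import sqlite3  # stand-in for pyodbc; fetchmany() is the same DB-API 2.0 call


def export_query_to_csv(cursor, sql, path, batch_size=10000):
    """Stream rows to CSV in batches so the full result set never sits in memory."""
    cursor.execute(sql)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f, delimiter=";")
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:  # empty list means the result set is exhausted
                break
            # Replace None with the literal NULL, as the original code did
            writer.writerows(
                ["NULL" if v is None else v for v in row] for row in rows
            )


# Demo against an in-memory SQLite table
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (a INTEGER, b TEXT)")
cur.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (2, None)])
conn.commit()
export_query_to_csv(cur, "SELECT a, b FROM t", "test_export.csv", batch_size=1)
```

The batch size trades memory for round trips; a few thousand to a hundred thousand rows per batch is a common starting point.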
---- EDIT ----
Thank you, I tried this, where "sql" is the query and cursorNZ is:
connMy = pyodbc.connect(DRIVER=.....)
cursorNZ = connNZ.cursor()

chunk = 10 ** 5  # tweak this
chunks = pandas.read_sql(sql, cursorNZ, chunksize=chunk)
with open('C:/test.csv', 'a') as output:
    for n, df in enumerate(chunks):
        write_header = n == 0
        df.to_csv(output, sep=';', header=write_header, na_rep='NULL')
And I get this: AttributeError: 'pyodbc.Cursor' object has no attribute 'cursor'. Any ideas?
Possible duplicate of http://stackoverflow.com/questions/17707264/iterating-over-pyodbc-result-without-fetchall, in particular the reference to fetchmany (http://code.google.com/p/pyodbc/wiki/Cursor#fetchmany). – tdelaney
Pass read_sql the connection instead. I will edit my answer to reflect this. –
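Following that suggestion: pandas.read_sql expects the connection object, not a cursor, which is what raises the AttributeError above. A minimal sketch of the corrected chunked export, with sqlite3 standing in for the pyodbc connection (table, chunk size, and output path are illustrative):

```python
import sqlite3  # stand-in for pyodbc.connect(DRIVER=.....)
import pandas as pd

# Build a small demo table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "x"), (2, None), (3, "y")])
conn.commit()

# Pass the CONNECTION, not the cursor; chunksize keeps memory bounded
chunks = pd.read_sql("SELECT a, b FROM t", conn, chunksize=2)
with open("test_chunks.csv", "w") as output:
    for n, df in enumerate(chunks):
        # Write the header only for the first chunk; render missing values as NULL
        df.to_csv(output, sep=";", header=(n == 0), index=False, na_rep="NULL")
```

Note the file is opened in "w" mode here rather than "a", so rerunning the export does not append to a stale file, and index=False keeps the DataFrame's row index out of the CSV.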