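Memory error using openpyxl with large Excel files

I have written a script that has to read a large number of Excel files (about 10,000) from a folder. The script loads each Excel file (some of them have more than 2,000 rows) and reads one column to count the rows, checking their content along the way. If the row count does not equal a given number, it writes a warning to a log.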
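The problem appears once the script has read more than 1,000 Excel files. At that point it throws a memory error, and I don't know where the problem is. Before opening any Excel file, the script reads two CSV files with 14,000 rows and stores them in lists. These lists contain the identifier of each Excel file and its corresponding number of rows. If that number does not equal the row count of the Excel file, the script writes a warning. Could reading these lists be the problem?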
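For scale, lists of this size hold only a few megabytes of short strings, so what they mainly cost is time: every workbook triggers a full scan of both lists. A minimal sketch of the alternative, assuming a two-column layout (identifier, then expected row count): load the CSV once into a dictionary and look each identifier up directly. The helper name `load_expected_counts` and the sample data are hypothetical, and the sketch is written for Python 3 rather than the Python 2 of the script.

```python
import csv
import io


def load_expected_counts(csv_file, delimiter="\t"):
    """Map each identifier in column 0 to the integer count in column 1."""
    expected = {}
    for row in csv.reader(csv_file, delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        expected[int(row[0])] = int(row[1])
    return expected


# Stand-in for Conditions.csv; real code would pass open(path, newline="").
sample = io.StringIO("101\t2000\n102\t1500\n")
expected_counts = load_expected_counts(sample)
print(expected_counts.get(101))  # 2000
print(expected_counts.get(999))  # None: identifier not listed
```

A dictionary lookup replaces the inner `for cond in cond_numbers` scan, and `.get()` makes a missing identifier explicit instead of silently reusing a value left over from a previous file.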
I use openpyxl to load the workbooks. Do I need to close each one before opening the next?
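One way to frame the closing question is a sketch that opens each workbook in read-only mode and closes it explicitly before moving on. This assumes a recent openpyxl release under Python 3, where `load_workbook(..., read_only=True)`, `iter_rows(..., values_only=True)` and `Workbook.close()` are available; it is not the API of the version the script was written against. The function name `count_data_rows` is hypothetical, and the default first row of 7 mirrors the `A7` cell used in the script.

```python
from openpyxl import load_workbook


def count_data_rows(path, sheet_name, first_row=7):
    """Count rows from first_row down to the first empty cell in column A."""
    # read_only=True streams rows instead of building every cell in memory.
    wb = load_workbook(path, read_only=True)
    try:
        ws = wb[sheet_name]
        count = 0
        for row in ws.iter_rows(min_row=first_row, max_col=1, values_only=True):
            value = row[0] if row else None
            if value is None:
                break  # first empty cell ends the data block
            count += 1
        return count
    finally:
        # Read-only workbooks keep the file handle open until closed.
        wb.close()
```

Because the workbook is created and closed inside the function, nothing from one file stays referenced while the next one is loaded, for example `count_data_rows(excel, 'Control System')` inside the `for excel in excel_files` loop.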
This is my code:
# -*- coding: utf-8 -*-
import os
from openpyxl import Workbook
import glob
import csv
import datetime  # required by the datetime.datetime.today() calls further down
from openpyxl import load_workbook

folder = ''
conditions = 0
a = 0
flight_error = 0
condition_error = 0
typical_flight_error = 0
SP_error = 0

cond_numbers = []
with open('Conditions.csv','rb') as csv_name:  # Open the CSV file that holds the equivalences
    csv_read = csv.reader(csv_name,delimiter='\t')
    for reads in csv_read:
        cond_numbers.append(reads)

flight_TF = []
with open('vuelo-TF.csv','rb') as vuelo_TF:
    csv_read = csv.reader(vuelo_TF,delimiter=';')
    for reads in csv_read:
        flight_TF.append(reads)

excel_files = glob.glob('*.xlsx')

# Python 2 code using the older openpyxl calls from the original post
# (ws.cell('A7'), get_highest_row(), cell(row=..., column=...)).
for excel in excel_files:
    print "Leyendo excel: "+excel
    wb = load_workbook(excel)
    ws = wb.get_sheet_by_name('Control System')
    flight = ws.cell('A7').value
    typical_flight = ws.cell('B7').value
    a = 0

    for row in range(6,ws.get_highest_row()):
        conditions = conditions + 1
        value_flight = int(ws.cell(row=row,column=0).value)
        value_TF = ws.cell(row=row,column=1).value
        value_SP = int(ws.cell(row=row,column=4).value)
        if value_flight == '':
            break
        if value_flight != flight:
            flight_error = 1  # Not all flight numbers within the flight are equal
        if value_TF != typical_flight:
            typical_flight_error = 2  # Not all typical flights within the flight are equal
        if value_SP != 100:
            SP_error = 1

    for cond in cond_numbers:
        if int(flight) == int(cond[0]):
            conds = int(cond[1])
            if conds != int(conditions):
                condition_error = 1  # The number of conditions does not match the expected one

    for vuelo_TF in flight_TF:
        if int(vuelo_TF[0]) == int(flight):
            TF = vuelo_TF[1]
            if typical_flight != TF:
                typical_flight_error = 1  # The flight does not match its typical flight

    if flight_error == 1:
        today = datetime.datetime.today()
        timestamp = today.strftime(" %Y-%m-%d %H.%M.%S")
        log = open('log.txt','a')
        message = timestamp+': Los flight numbers del vuelo '+str(flight)+' no coinciden.\n'
        log.write(message)
        log.close()
        flight_error = 0

    if condition_error == 1:
        today = datetime.datetime.today()
        timestamp = today.strftime(" %Y-%m-%d %H.%M.%S")
        log = open('log.txt','a')
        message = timestamp+': El número de condiciones del vuelo '+str(flight)+' no coincide. Condiciones esperadas: '+str(int(conds))+'. Condiciones obtenidas: '+str(int(conditions))+'.\n'
        log.write(message)
        log.close()
        condition_error = 0

    if typical_flight_error == 1:
        today = datetime.datetime.today()
        timestamp = today.strftime(" %Y-%m-%d %H.%M.%S")
        log = open('log.txt','a')
        message = timestamp+': El vuelo '+str(flight)+' no coincide con el typical flight. Typical flight respectivo: '+TF+'. Typical flight obtenido: '+typical_flight+'.\n'
        log.write(message)
        log.close()
        typical_flight_error = 0

    if typical_flight_error == 2:
        today = datetime.datetime.today()
        timestamp = today.strftime(" %Y-%m-%d %H.%M.%S")
        log = open('log.txt','a')
        message = timestamp+': Los typical flight del vuelo '+str(flight)+' no son todos iguales.\n'
        log.write(message)
        log.close()
        typical_flight_error = 0

    if SP_error == 1:
        today = datetime.datetime.today()
        timestamp = today.strftime(" %Y-%m-%d %H.%M.%S")
        log = open('log.txt','a')
        message = timestamp+': Hay algún Step Percentage del vuelo '+str(flight)+' menor que 100.\n'
        log.write(message)
        log.close()
        SP_error = 0

    conditions = 0

The if statements at the end check the error flags and write the warnings to the log.
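The five logging branches differ only in their message, so the timestamp, open, write and close steps could live in one helper. A minimal sketch under that assumption, written for Python 3; the name `append_warning` and its `now` parameter (there only to make the timestamp testable) are hypothetical and not part of the script.

```python
import datetime


def append_warning(log_path, message, now=None):
    """Append one timestamped line to the log file, then close the file."""
    now = now or datetime.datetime.today()
    stamp = now.strftime("%Y-%m-%d %H.%M.%S")
    # The with block closes the file even if the write raises.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(stamp + ": " + message + "\n")
```

Each branch would then shrink to a single call such as `append_warning('log.txt', 'Los flight numbers del vuelo ' + str(flight) + ' no coinciden.')`, which keeps the file handle open only for the duration of the write.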
I am using Windows XP with 8 GB of RAM and an Intel Xeon W3505 (dual core, 2.53 GHz).
This option no longer seems to exist (openpyxl 2.4.1), and the link you provided does not mention such an option. Do you know of an alternative? – SdaliM