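Python, multiprocessing: how can I optimize my code and make it faster?

I am using Python. I have 100 zip files, and each zip file contains more than 100 XML files. I create CSV files from the XML files.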
import csv
import zipfile
from multiprocessing import Process
from xml.etree.ElementTree import fromstring


def parse_xml_for_csv1(data, writer1):
    root = fromstring(data)
    for node in root.iter('name'):
        # writerow expects a sequence of fields; a bare string would be split into characters
        writer1.writerow([node.get('value')])


def create_csv1():
    with open('output1.csv', 'w') as f1:
        writer1 = csv.writer(f1)
        # range(1, 100) covers xml1.zip through xml99.zip
        for i in range(1, 100):
            z = zipfile.ZipFile('xml' + str(i) + '.zip')
            # z.namelist() contains more than 100 xml files
            for finfo in z.namelist():
                data = z.read(finfo)
                parse_xml_for_csv1(data, writer1)


def create_csv2():
    with open('output2.csv', 'w') as f2:
        writer2 = csv.writer(f2)
        for i in range(1, 100):
            ...


if __name__ == "__main__":
    # Each CSV file is built in its own process
    p1 = Process(target=create_csv1)
    p2 = Process(target=create_csv2)
    p1.start()
    p2.start()
    p1.join()
    p2.join()
Please tell me: how can I optimize my code and make it faster?
How big is each uncompressed XML file? And the CSVs you are writing? – goncalopp
goncalopp, the XML files are small (about 10 lines). I only need 2 CSV files. – Olga
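Since the XML files are small and only two CSV files are needed, one option worth trying is to split the work per zip archive instead of per CSV file. The following is only a sketch under assumptions, not a measured improvement: `extract_values`, `write_csv`, and the `workers` parameter are hypothetical names introduced here. Worker processes parse whole archives and return plain lists, and only the parent process writes to the CSV file, so no writer is shared between processes.

```python
import csv
import zipfile
from multiprocessing import Pool
from xml.etree.ElementTree import fromstring


def extract_values(zip_path):
    """Return the 'value' attribute of every <name> element in one zip archive."""
    values = []
    with zipfile.ZipFile(zip_path) as z:
        for member in z.namelist():
            root = fromstring(z.read(member))
            for node in root.iter('name'):
                values.append(node.get('value'))
    return values


def write_csv(zip_paths, csv_path, workers=4):
    """Parse archives in worker processes; write the rows from the parent only."""
    with Pool(workers) as pool, open(csv_path, 'w') as f:
        writer = csv.writer(f)
        # imap yields results in input order, so rows come out archive by archive.
        for values in pool.imap(extract_values, zip_paths):
            writer.writerows([v] for v in values)
```

It would be called from under `if __name__ == "__main__":`, for example `write_csv(['xml' + str(i) + '.zip' for i in range(1, 100)], 'output1.csv')`. Whether this beats the two-process version depends on how much time goes to parsing versus disk reads, so it is worth timing both on the real data.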
I would use lxml for the processing and do as much of the work as possible at the C level: http://lxml.de/FAQ.html#id1 –
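A rough sketch of what that suggestion could look like, assuming lxml is installed. `name_values` is a hypothetical helper name, and the fallback to the standard library is only there so the snippet runs either way. `fromstring` and `iter` exist in both `lxml.etree` and `xml.etree.ElementTree`, so the parsing code from the question carries over with little change.

```python
try:
    # Assumption: lxml is installed (pip install lxml).
    from lxml import etree
except ImportError:
    # Fallback so the sketch still runs without lxml.
    import xml.etree.ElementTree as etree


def name_values(data):
    """Return the 'value' attribute of every <name> element in an XML document given as bytes."""
    root = etree.fromstring(data)
    return [node.get('value') for node in root.iter('name')]
```

The bytes returned by `z.read(finfo)` can be passed in directly, and each returned value can then be written as a one-field CSV row.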