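Making a script that checks 2 columns run fast

I have a script that checks two text files and prints the field they have in common. It is not fast enough for my needs, so I am looking for ways to optimize it.
FILE1 (10k lines, 3 columns) and FILE2 (200k lines, 2 columns) are both CSV files, and they share one field.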
FILE1:
92073263d,86674404000555506123,通信
FILE2:
163738212,7a93632111w7-01e7-40e7-9387-1863e7683eca
63729jd83,07633221122c-6598-4489-B539-e42e2dcb3235
8djdy37w8,2b8retyre396-2472-4b2d-8d07-e170fa3d1f64
92073263d,07633221122c-6ew8-4eww-B539-e42dsadsadsa
with open('FILE1') as file1:
    file1_contents = { tuple(line.split(',')) for line in file1 }
print file1_contents
with open('FILE2') as file2:
    for line in file2:
        c1,c2 = line.split()
        if c1 in file1_contents:
            f = open("FILE3","w")
            f.write(c2)
            f.close()
The `if c1 in file1_contents` check is giving me a hard time. I want to avoid any nested loops to keep the speed high. Any suggestions?
Try using 'pandas' ... –
Thanks, COLDSPEED. – Jul
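
A few things in the posted script look like they would get in the way before speed does. The set holds 3-tuples, so testing a plain string `c1` against it never matches. `line.split()` with no argument splits on whitespace, so a comma-separated line cannot be unpacked into `c1, c2`. And `FILE3` is reopened in "w" mode for every match, which discards earlier output. Below is a minimal sketch of one way around this, assuming Python 3 and assuming the shared field is the first column of both files (as the sample rows suggest). The function name `write_matches` and its parameters are made up for illustration.

```python
import csv


def write_matches(file1_path, file2_path, out_path):
    """Write column 2 of every FILE2 row whose column 1 also appears in FILE1.

    Assumption: the shared field is the first column in both files.
    Returns the number of rows written.
    """
    # Keep only the key column of FILE1; set lookups are O(1) on average,
    # so no nested loop over the two files is needed.
    with open(file1_path, newline="") as file1:
        keys = {row[0] for row in csv.reader(file1) if row}

    matched = 0
    # Open the output once, outside the loop, so earlier matches survive.
    with open(file2_path, newline="") as file2, open(out_path, "w") as out:
        for row in csv.reader(file2):
            # Skip blank or short rows instead of failing on them.
            if len(row) >= 2 and row[0] in keys:
                out.write(row[1] + "\n")
                matched += 1
    return matched
```

With the file names from the question this would be called as `write_matches("FILE1", "FILE2", "FILE3")`. Building the set reads FILE1 once and the loop reads FILE2 once, so the work grows with the total number of lines rather than with their product.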