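Nested loops reading different CSV files in Python

Basically, I have to open a CSV report (about 30,000 lines) and replace the ARTIST and TITLE on each line if they appear in a second CSV file of corrected ARTIST and TITLE values (about 10,000 lines).

The code I came up with scans all 31,400 lines, but for some reason it only replaces the first instance it finds.

Here is my code: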

def convert(): # StackOverflow refuses to display the indents correctly
    global modified
    print "\n\nConverting: " + logfile + "\n\n"
    songCount = 0  # Number of lines required to be reported
    unclaimedCount = 0 # Number of lines not required to be reported (used to double check accuracy or report)
    freport = open(musicreportname, "w") # This is the new report we will create
    flogfile = open(logfile, "r")  # This is the existing report
    freplacefile = open(replacefile, "r")# This file contains corrected names to be substituted and ISRC Codes
    freport.write("^NAME_OF_SERVICE^|^TRANSMISSION_CATEGORY^|^FEATURED_ARTIST^|^SOUND_RECORDING_TITLE^|^ISRC^|^ALBUM_TITLE^|^MARKETING_LABEL^|^ACTUAL_TOTAL_PERFORMANCES^\n")
    lineCount = 0
    rlinecount = 0
    for line in csv.reader(flogfile, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
        lineCount += 1
        if line[0][0] == "#":
            continue
        if line[16] == "S":
            songCount += 1
            matched = "FALSE"
            rlineCount = 0
            for rline in csv.reader(freplacefile, delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
                rlineCount += 1
                if line[3] == rline[2]:
                    print "Matched " + line[3]
                    if line[4] == rline[1]:
                        print "Matched " + line[3], rline[1]
                        output = "^" + service + "^|^" + "B" + "^|^" + rline[8] + "^|^" + rline[7] + "^|^" + rline[6] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                        freport.write(output)
                        matched = "TRUE"
                        modified += 1
                        break
                if matched == "FALSE":
                    output = "^" + service + "^|^" + "B" + "^|^" + line[3] + "^|^" + line[4] + "^|^" + line[8] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                    freport.write(output)
        else:
            unclaimedCount += 1
    freport.close()
    flogfile.close()
    freplacefile.close()
    print str(songCount) + " Total Songs Found."
    print "Checked " + str(lineCount) + " lines."
    print "Replaced " + str(modified) + " lines."

Any help would be greatly appreciated! Thanks in advance!

Source

2016-04-27 L Purloi

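Another user and I edited the code formatting to make it a bit clearer. Can you confirm that we didn't screw up the indentation? For future reference, if you put four spaces in front of your code, it will be placed in a code block and be easier to read. – thegrinner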

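Nested loops are the wrong way to do this, because they mean the second file has to be read 30,000 times. Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renames. – Barmar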
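The dictionary approach suggested above might look roughly like the sketch below. It is written for Python 3 (the question's code is Python 2) and is only an illustration: the function names `build_corrections` and `convert_rows` and the sample rows are made up, while the column positions are taken from the question's code. It yields only the artist, title and ISRC rather than the full report line.

```python
import csv
import io


def build_corrections(replace_rows):
    """Read the corrections once and key them by (artist, title) as logged."""
    corrections = {}
    for rline in replace_rows:
        # The question compares line[3] with rline[2] and line[4] with rline[1].
        # setdefault keeps the first matching row, like the original `break`.
        corrections.setdefault((rline[2], rline[1]), rline)
    return corrections


def convert_rows(log_rows, corrections):
    """Yield (artist, title, isrc) for every song row, corrected if mapped."""
    for line in log_rows:
        if not line or line[0].startswith("#"):
            continue  # skip blank rows and comment rows
        if line[16] != "S":
            continue  # not a song row
        rline = corrections.get((line[3], line[4]))
        if rline is not None:
            yield rline[8], rline[7], rline[6]
        else:
            yield line[3], line[4], line[8]


# Made-up data standing in for the two files.
replace_csv = io.StringIO("x,Old Title,Old Artist,x,x,x,ISRC123,New Title,New Artist\n")
corrections = build_corrections(csv.reader(replace_csv, skipinitialspace=True))

matched_row = ["svc", "x", "1", "Old Artist", "Old Title", "Album", "Label", "x", "ISRC000"] + ["x"] * 7 + ["S"]
unmatched_row = ["svc", "x", "1", "Other Artist", "Other Title", "Album", "Label", "x", "ISRC999"] + ["x"] * 7 + ["S"]
results = list(convert_rows([matched_row, unmatched_row], corrections))
```

Each log line then costs one dictionary lookup instead of a scan of the whole corrections file.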

I'm still trying to edit this so that it displays the correct indentation... –

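Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renames. - Barmar, Apr 27 at 18:12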

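I followed Barmar's suggestion. I just started over. I used tuples instead of a dictionary, but it's the same idea. I never found the error in the code above, but now everything works as expected. Thanks, Barmar.

Source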
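One likely explanation for the original behaviour, though nobody in the thread confirmed it: the inner `csv.reader` is created on the same open file object every time, and a file that has been read to the end stays at the end. Later passes through the inner loop therefore see few or no rows. A small Python 3 sketch with made-up in-memory data shows the effect:

```python
import csv
import io

# Stand-in for the corrections file (made-up contents).
replace_file = io.StringIO("a,1\nb,2\nc,3\n")

first_pass = [row[0] for row in csv.reader(replace_file)]

# The file position is now at end-of-file, so a new reader finds nothing.
second_pass = [row[0] for row in csv.reader(replace_file)]

# Rewinding makes the rows available again.
replace_file.seek(0)
third_pass = [row[0] for row in csv.reader(replace_file)]

print(first_pass, second_pass, third_pass)
```

Calling `seek(0)` before each inner loop would work around this, but reading the corrections once into memory avoids rereading the file at all.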

2016-04-30 22:44:22
