2016-04-27 36 views
0

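Nested loops reading different CSV files in Python

Basically, I have to open a CSV report (about 30,000 lines) and replace the ARTIST and TITLE on each line if they appear in a second CSV file (about 10,000 lines) of corrected ARTIST and TITLE values.

The code I came up with scans all 31,400 lines, but for some reason it only replaces the first instance it finds.

Here is my code: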

def convert(): # StackOverflow refuses to display the indents correctly
    global modified
    print "\n\nConverting: " + logfile + "\n\n"
    songCount = 0       # Number of lines required to be reported
    unclaimedCount = 0  # Number of lines not required to be reported (used to double check accuracy or report)
    freport = open(musicreportname, "w")   # This is the new report we will create
    flogfile = open(logfile, "r")          # This is the existing report
    freplacefile = open(replacefile, "r")  # This file contains corrected names to be substituted and ISRC Codes
    freport.write("^NAME_OF_SERVICE^|^TRANSMISSION_CATEGORY^|^FEATURED_ARTIST^|^SOUND_RECORDING_TITLE^|^ISRC^|^ALBUM_TITLE^|^MARKETING_LABEL^|^ACTUAL_TOTAL_PERFORMANCES^\n")
    lineCount = 0
    rlinecount = 0
    for line in csv.reader(flogfile, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
        lineCount += 1
        if line[0][0] == "#":
            continue
        if line[16] == "S":
            songCount += 1
            matched = "FALSE"
            rlineCount = 0
            for rline in csv.reader(freplacefile, delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
                rlineCount += 1
                if line[3] == rline[2]:
                    print "Matched " + line[3]
                    if line[4] == rline[1]:
                        print "Matched " + line[3], rline[1]
                        output = "^" + service + "^|^" + "B" + "^|^" + rline[8] + "^|^" + rline[7] + "^|^" + rline[6] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                        freport.write(output)
                        matched = "TRUE"
                        modified += 1
                        break
                if matched == "FALSE":
                    output = "^" + service + "^|^" + "B" + "^|^" + line[3] + "^|^" + line[4] + "^|^" + line[8] + "^|^" + line[5] + "^|^" + line[6] + "^|^" + line[2] + "^\n"
                    freport.write(output)
        else:
            unclaimedCount += 1
    freport.close()
    flogfile.close()
    freplacefile.close()
    print str(songCount) + " Total Songs Found."
    print "Checked " + str(lineCount) + " lines."
    print "Replaced " + str(modified) + " lines."

Any help would be greatly appreciated! Thanks in advance!

+1

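Another user and I edited the code formatting to make it a bit clearer - can you confirm we didn't screw up the indentation? For future reference, if you put four spaces in front of your code, it will be placed in a code block and be easier to read. – thegrinner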

+0

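Nested loops are the wrong way to do this, because they mean the second file has to be read 30,000 times. Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renaming. – Barmar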

+0

I'm still trying to edit this so that it displays the indentation correctly... –

Answer

0

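Read the second file once and create a dictionary containing all the mappings. Then read the first file and use the mapping dictionary to perform all the renaming. - Barmar Apr 27 at 18:12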
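The suggestion above could look roughly like the following. This is only a sketch, not the code that was actually used: it is written in Python 3 syntax (the question uses Python 2), the function names `load_corrections` and `convert_rows` are made up for illustration, and the column positions are taken from the question's code (log columns 3, 4 and 8 for artist, title and ISRC; corrections columns 2 and 1 for the original artist and title, and 8, 7 and 6 for the corrected artist, title and ISRC).

```python
import csv
import io


def load_corrections(replace_file):
    """Read the corrections file once into a dict keyed by (artist, title)."""
    corrections = {}
    for rline in csv.reader(replace_file, skipinitialspace=True):
        if len(rline) < 9:
            continue  # skip blank or short rows
        # (original artist, original title) -> (artist, title, ISRC).
        # setdefault keeps the first row for a key, like the original `break`.
        corrections.setdefault((rline[2], rline[1]),
                               (rline[8], rline[7], rline[6]))
    return corrections


def convert_rows(log_file, corrections):
    """Yield (artist, title, isrc, was_replaced) for every song row in the log."""
    for line in csv.reader(log_file, skipinitialspace=True):
        if not line or line[0].startswith("#"):
            continue  # comment or empty line
        if len(line) <= 16 or line[16] != "S":
            continue  # not a song row
        key = (line[3], line[4])
        if key in corrections:
            artist, title, isrc = corrections[key]
            yield artist, title, isrc, True
        else:
            yield line[3], line[4], line[8], False
```

With this shape, each file is read exactly once and every lookup is a single dictionary access, instead of a scan of up to 10,000 correction rows for each of the 30,000 report lines. The caller would open both files, pass them in, and format each yielded tuple into the `^...^|^...^` report line.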

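I followed Barmar's suggestion. I'm only just getting started. I used tuples instead of a dictionary, but it's the same idea. I never did find the bug in the code above, but now everything works as expected. Thanks, Barmar.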
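As for the bug that was never tracked down, one likely cause (an observation, not something confirmed in this thread) is that `freplacefile` is opened only once. A `csv.reader` wrapped around a file object reads from the file's current position, so after the inner loop has run to the end of the file, every later `csv.reader(freplacefile, ...)` starts at end of file and yields no rows. If a pass stops early at `break`, the next pass resumes from wherever the file was left rather than from the top. The sketch below, in Python 3 with an in-memory stand-in for the corrections file, shows the effect along with two ways around it: rewinding with `seek(0)`, or reading the rows into a list of tuples once.

```python
import csv
import io

# Stand-in for the corrections file; a real file object behaves the same way.
replace_file = io.StringIO("a,1\nb,2\nc,3\n")

first_pass = [row[0] for row in csv.reader(replace_file)]
second_pass = [row[0] for row in csv.reader(replace_file)]  # already at EOF

replace_file.seek(0)  # rewind before scanning again
third_pass = [row[0] for row in csv.reader(replace_file)]

# Alternative: read once into memory, then loop over the list as often as needed.
replace_file.seek(0)
rows = [tuple(row) for row in csv.reader(replace_file)]

print(first_pass)   # ['a', 'b', 'c']
print(second_pass)  # []
print(third_pass)   # ['a', 'b', 'c']
print(rows)         # [('a', '1'), ('b', '2'), ('c', '3')]
```

Reading the rows into memory once is closer to the tuple-based fix described above, and it avoids re-reading the file for every report line.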