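How do I read a CSV file line by line and store each line on a new row in a new CSV file?

I am new to Python. I am trying to read a CSV file, remove the stop words from it, and store the result in a new CSV file. My code does remove the stop words, but it copies the first row once for every row in the file, all onto a single line. (For example, if the file has three rows, the first row is written three times on the first line.) How do I read a CSV file line by line and store each line on a new row in a new CSV file?

From my analysis, I think the problem is in the loop, but I cannot figure it out. My code is attached below.

Code: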

import nltk
import csv
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def stop_Words(fileName, fileName_out):
    file_out = open(fileName_out, 'w')
    with open(fileName, 'r') as myfile:
        line = myfile.readline()
        stop_words = set(stopwords.words("english"))
        words = word_tokenize(line)
        filtered_sentence = [" "]
        for w in myfile:
            for n in words:
                if n not in stop_words:
                    filtered_sentence.append(' ' + n)
        file_out.writelines(filtered_sentence)
    print "All Done SW"

stop_Words("A_Nehra_updated.csv", "A_Nehra_final.csv")
print "all done :)"

Source

2016-06-08 SmartF

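This is not very clear. You should show an example of the input, the current output, and the expected output. – polku

You are only reading the first line of the file: line=myfile.readline(). You want to iterate over every line in the file. One way to do that is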

with open(fileName, 'r') as myfile:
    for line in myfile:
        # the rest of your code here, i.e.:
        stop_words = set(stopwords.words("english"))
        words = word_tokenize(line)

Also, you have this loop

for w in myfile:
    for n in words:
        if n not in stop_words:
            filtered_sentence.append(' ' + n)

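But notice that w, which is defined by the outermost loop, is never used inside the loop. You should be able to remove that loop and just write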

for n in words:
    if n not in stop_words:
        filtered_sentence.append(' ' + n)

Edit:

import nltk
import csv
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

def stop_Words(fileName, fileName_out):
    # Build the stop-word set once instead of once per line.
    stop_words = set(stopwords.words("english"))
    # Both files are closed automatically when the with block exits.
    with open(fileName, 'r') as myfile, open(fileName_out, 'w') as file_out:
        for line in myfile:
            words = word_tokenize(line)
            filtered_sentence = []
            for n in words:
                if n not in stop_words:
                    filtered_sentence.append(n)
            # One output line per input line, words separated by spaces.
            file_out.write(" ".join(filtered_sentence) + "\n")
    print("All Done SW")
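
Since the input is a CSV file, another option is to let the csv module handle the rows. What follows is only a rough Python 3 sketch, not the code from the question: DEMO_STOP_WORDS, filter_cell and filter_csv are made-up names, the small stop-word set stands in for set(stopwords.words("english")), and plain whitespace splitting with a lowercase comparison stands in for word_tokenize so that it runs without the NLTK data.

```python
import csv

# Hypothetical stand-in for set(stopwords.words("english")).
DEMO_STOP_WORDS = {"the", "is", "a", "of", "and"}

def filter_cell(text, stop_words):
    """Drop stop words from one cell; split on whitespace, compare lowercased."""
    return " ".join(w for w in text.split() if w.lower() not in stop_words)

def filter_csv(src_path, dst_path, stop_words):
    """Copy a CSV file row by row, removing stop words from every cell."""
    # newline="" lets the csv module handle line endings itself (Python 3).
    with open(src_path, newline="") as src, \
         open(dst_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Each input row produces exactly one output row.
            writer.writerow([filter_cell(cell, stop_words) for cell in row])
```

Calling filter_csv("A_Nehra_updated.csv", "A_Nehra_final.csv", DEMO_STOP_WORDS) would then write one output row per input row. Unlike the line-based version, this keeps the commas between columns intact, which may or may not be what you want.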

Source

2016-06-08 16:04:20 Greg

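I used this code: for line in myfile: line=myfile.readline() stop_words=set(stopwords.words("english")) words=word_tokenize(line) filtered_sentence=[" "] for n in words: if n not in stop_words: filtered_sentence.append(' '+n) file_out.writelines(filtered_sentence). It gives the following error: line=myfile.readline() ValueError: Mixing iteration and read methods would lose data – SmartF

You don't need 'line = myfile.readline()'. Use 'for line in myfile' in place of it. – Greg

Thank you very much. One problem is solved. But it still stores all the data on a single line. I cannot get the '\n' character concatenated into the string. Could you please help? – SmartF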
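
On the '\n' question: writelines() writes each string exactly as given and adds no separators, so the newline has to be added explicitly. Here is a small sketch of the difference, using an in-memory io.StringIO object as a stand-in for the output file (the word list is made up for illustration):

```python
import io

words = ["cat", "sat", "mat"]   # made-up tokens left after filtering
out = io.StringIO()             # in-memory stand-in for the output file

# writelines() adds neither spaces nor newlines between the strings.
out.writelines(words)
out.write("\n")

# Joining first and appending "\n" gives one space-separated line per row.
out.write(" ".join(words) + "\n")

result = out.getvalue()
print(repr(result))
```

The same " ".join(...) + "\n" pattern works with a real file object opened in 'w' mode.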
