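Replace tweet URLs with a string

I am trying to remove all URLs from a large CSV file and replace each one with the string "URL" (a so-called equivalence token). The code does what I want, but it merges some rows into a single line.
This means the original CSV has 63,000 rows, while the output CSV has only 55,000. That is not what I want. How can I replace the links with this token and keep all the rows separate?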
    # Replace links with the token "URL"
    import re

    with open('data_feat1.csv', "r", encoding="utf-8") as oldfile2, \
         open('data_feat2.csv', 'w', encoding="utf-8") as newfile2:
        for line in oldfile2:
            line = re.sub(r"http\S+", r"URL", line)  # replaces links with "URL"
            newfile2.write(line)
    # No explicit close() is needed: the with block closes both files.
Can you post some sample data?
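Without sample data this is only a guess, but one plausible cause is that `http\S+` keeps matching until the next whitespace. If a URL is the last thing inside a quoted field, the pattern also swallows the closing quote and the following delimiter and fields (for example `"see http://t.co/abc",pos` becomes `"see URL`). The quote is then left unbalanced, so a CSV parser reads several physical lines as one row. A minimal sketch that avoids this, assuming a standard comma-separated, double-quoted file, is to parse with the `csv` module and substitute inside each field. The function name `replace_urls` and the sample rows below are made up for illustration.

```python
import csv
import io
import re

URL_PATTERN = re.compile(r"http\S+")


def replace_urls(src, dst):
    """Copy CSV rows from src to dst, replacing URLs inside each field.

    Returns the number of rows written.
    """
    writer = csv.writer(dst)
    count = 0
    for row in csv.reader(src):
        # The substitution runs on parsed field values, so it cannot
        # consume quote characters or delimiters.
        writer.writerow([URL_PATTERN.sub("URL", field) for field in row])
        count += 1
    return count


# Hypothetical sample: one URL directly before a closing quote,
# and one field that contains an embedded newline.
sample = (
    'id,text,label\r\n'
    '1,"see http://t.co/abc",pos\r\n'
    '2,"two\nlines https://t.co/xyz end",neg\r\n'
)

out = io.StringIO(newline="")
n_rows = replace_urls(io.StringIO(sample, newline=""), out)

# Re-parse the output to check that rows and columns are intact.
parsed = list(csv.reader(io.StringIO(out.getvalue(), newline="")))
```

For the real files, the same function could be called with `open('data_feat1.csv', newline='', encoding='utf-8')` as the source and `open('data_feat2.csv', 'w', newline='', encoding='utf-8')` as the destination. Passing `newline=''` is what the `csv` documentation recommends so that newlines embedded in quoted fields are handled correctly.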