Copy a column, add some text, and write it to a new CSV file

I want to write a script that copies the second column from multiple CSV files in a folder, adds some text to each value, and saves the result to a single CSV file.

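Here is what I want to do:

1) Get the data in column 2 from all the CSV files

2) Add the text "Hello" at the start and "welcome" at the end of each row

3) Write the data to a single file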

I tried to create it using pandas:

import os 
import pandas as pd 
dataframes = [pd.read_csv(p, index_col=2, header=None) for p in ('1.csv','2.csv','3.csv')] 
merged_dataframe = pd.concat(dataframes, axis=0) 
merged_dataframe.to_csv("all.csv", index=False)

The problems are:

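In the code above I have to list the file names manually, which is impractical; I need a way to include all CSV files, i.e. *.csv
I need to use something like writer.writerow(("Hello"+r[1]+"welcome"))
There are multiple CSV files and each one has many rows (about 100k), so I need this to be fast.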

Here is a sample of the CSV files:

"1.csv"    "2.csv"      "3.csv"
a,Jac      b,William    c,James

Here is how I want the output in all.csv to look:

Hello Jac welcome 
Hello William welcome 
Hello James welcome

Is there any solution using .merge(), .append() or .concat()?

How can I do this using Python?

Source

2017-06-21 Nancy

Hi Nancy. You can get all the csv files with the glob module like this: `paths = glob.glob('foo/*.csv')`. –
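
The glob suggestion in this comment could be wrapped up roughly as follows. This is only a minimal sketch: the helper name `list_csv_files` and its `folder` argument are hypothetical, and `sorted` is added because `glob.glob` makes no promise about the order of its results.

```python
import glob
import os


def list_csv_files(folder):
    """Return the paths of all .csv files directly inside `folder`, sorted by name."""
    # "*.csv" matches every file name ending in .csv in that one folder
    pattern = os.path.join(folder, "*.csv")
    return sorted(glob.glob(pattern))
```

Something like `paths = list_csv_files("foo")` could then stand in for the hard-coded tuple `('1.csv','2.csv','3.csv')` in the question.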

You don't need pandas for this. Here is a solution with the csv module:

import csv
import glob


with open("path/to/output", 'w') as outfile:
    # loop over every CSV file in the directory
    for fpath in glob.glob('path/to/directory/*.csv'):
        with open(fpath, newline='') as infile:
            for row in csv.reader(infile):
                # row[1] is the second column
                outfile.write("Hello {} welcome\n".format(row[1]))

Source

2017-06-21 17:46:11 inspectorG4dget

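Wouldn't pandas speed this up? – Nancy

@Nancy: I can't say for sure, but I don't think pandas would speed this up "enough" for this application - you would still be bottlenecked by writing the output – inspectorG4dget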
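
Rather than guessing about speed, the csv-based approach can be timed on synthetic data of roughly the size mentioned in the question. The sketch below is an assumption-laden illustration, not a benchmark: the function names `merge_with_csv` and `time_merge`, the generated file contents, and the `all.out` output name are all made up for the example.

```python
import csv
import glob
import os
import tempfile
import time


def merge_with_csv(folder, out_path):
    """Stream column 2 of every .csv in `folder` into `out_path`; return rows written."""
    count = 0
    with open(out_path, "w") as outfile:
        for fpath in sorted(glob.glob(os.path.join(folder, "*.csv"))):
            with open(fpath, newline="") as infile:
                for row in csv.reader(infile):
                    outfile.write("Hello {} welcome\n".format(row[1]))
                    count += 1
    return count


def time_merge(n_files=3, rows_per_file=100_000):
    """Generate synthetic input files and time one merge; returns (seconds, rows)."""
    with tempfile.TemporaryDirectory() as folder:
        for i in range(n_files):
            with open(os.path.join(folder, "{}.csv".format(i)), "w") as f:
                f.writelines("a,Name{}\n".format(j) for j in range(rows_per_file))
        # the output does not end in .csv, so the glob never picks it up
        out_path = os.path.join(folder, "all.out")
        start = time.perf_counter()
        rows = merge_with_csv(folder, out_path)
        return time.perf_counter() - start, rows


if __name__ == "__main__":
    seconds, rows = time_merge()
    print("wrote {} rows in {:.2f} s".format(rows, seconds))
```

The measured time depends on the machine and disk, so it is only useful for comparing approaches on the same system.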

1) If you want to import all the .csv files in a folder, a very simple way to do it is:

import os
for i in [a for a in os.listdir() if a[-4:] == '.csv']:
    # code to read in the .csv file and concatenate it to the existing dataframe
    pass

2) To add the text and write to a file, you can map a function over each element of column 2 of the dataframe.

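# existing dataframe called df
# index=False and header=False keep only the text itself in the output
df[df.columns[1]].map(lambda x: "Hello {} welcome".format(x)).to_csv(<targetpath>, index=False, header=False)
# replace <targetpath> with your target path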

See http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.to_csv.html for all the arguments you can pass to to_csv.
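
Putting the two steps of this answer together might look something like the sketch below. It assumes, as in the sample from the question, that the files have no header row and at least two columns; the function name `merge_second_column` and its arguments are hypothetical.

```python
import glob
import os

import pandas as pd


def merge_second_column(folder, out_path):
    """Write 'Hello <value> welcome' for column 2 of every .csv in `folder`."""
    out_abs = os.path.abspath(out_path)
    # skip the output file itself in case it lives in the same folder
    paths = [p for p in sorted(glob.glob(os.path.join(folder, "*.csv")))
             if os.path.abspath(p) != out_abs]
    # usecols=[1] loads only the second column; header=None because the
    # sample files have no header row; dtype=str keeps the values as text
    frames = [pd.read_csv(p, header=None, usecols=[1], dtype=str) for p in paths]
    merged = pd.concat(frames, ignore_index=True)
    # with header=None the second column is labelled 1
    greetings = "Hello " + merged[1] + " welcome"
    greetings.to_csv(out_path, index=False, header=False)
    return len(greetings)
```

A call such as `merge_second_column("path/to/directory", "all.csv")` would then write one line per input row. Note that `pd.concat` raises an error when no files are found, and that `to_csv` quotes any value containing a comma or a quote character.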

Source

2017-06-21 17:52:30 victor

Here is a non-pandas solution using the built-in csv module. Not sure about the speed.

import os
import csv

path_to_files = "path to files"
all_csv = os.path.join(path_to_files, "all.csv")
file_list = os.listdir(path_to_files)

names = []

for file in file_list:
    # skip all.csv itself so a previous run's output is not read back in
    if file.endswith(".csv") and file != "all.csv":
        path_to_current_file = os.path.join(path_to_files, file)

        with open(path_to_current_file, "r", newline="") as current_csv:
            reader = csv.reader(current_csv, delimiter=',')

            for row in reader:
                names.append(row[1])

# newline="" lets the csv module control the line endings
with open(all_csv, "w", newline="") as out_csv:
    writer = csv.writer(out_csv, delimiter=',')

    for name in names:
        writer.writerow(["Hello {} welcome".format(name)])

Source

2017-06-21 17:58:46 Hopeless
