2014-01-29 86 views
0

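Merging 2 CSV files with the same number of features in Python

I have 2 files with many observations. I need to merge the two of them vertically. For example, CSV file A contains the following data for some people: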

Sr.no Name, Age , Sex, Weight, Height 
1.  A, 12, M, 40,  4ft 
2.  B, 13, F, 35,  3.9ft  
3.  C, 15, F, 45,  4.2ft 

CSV B contains:

1. D,20,M,55,5.3ft 
2. E,22,F,53,5.0ft 

I want the output to look like this:

1. A, 12, M, 40, 4ft 
2. B, 13, F, 35, 3.9ft 
3. C, 15, F, 45, 4.2ft 
4. D, 20, M, 55, 5.3ft 
5. E, 22, F, 53, 5.0ft 

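I tried a.merge but don't know how to handle the arguments.
Yes, I forgot to mention the index. The merged CSV should be numbered 1, 2, 3, 4, 5. CSV A has the index 1, 2, 3 and CSV B has 1, 2. After merging, the resulting index should be 1, 2, 3, 4, 5.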

+1

Why don't you just append the two files? The result is still a valid CSV file. – arocks

+0

And what about the index? Both start from 1, so I want the second file's index to continue from the index of CSV A. –

+1

The index is not stored in the file, right? So Python should index it correctly. – arocks

Answers

0

I got it. It is actually quite simple. All you need to do is
FileC = FileA.append(FileB ,ignore_index = True)

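The index is renumbered automatically. It starts from 0 rather than 1, but that is not a problem, because every observation still has its own unique index number.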
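In recent pandas versions `DataFrame.append` is no longer available (it was deprecated in 1.4 and removed in 2.0); `pd.concat` does the same job. The sketch below assumes both files load into DataFrames with identical columns. The inline CSV text and the column names are made up to mirror the question, and the last step shifts the index so that it starts at 1 instead of 0.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the two CSV files from the question.
csv_a = "Name,Age,Sex,Weight,Height\nA,12,M,40,4ft\nB,13,F,35,3.9ft\nC,15,F,45,4.2ft\n"
csv_b = "Name,Age,Sex,Weight,Height\nD,20,M,55,5.3ft\nE,22,F,53,5.0ft\n"

file_a = pd.read_csv(io.StringIO(csv_a))
file_b = pd.read_csv(io.StringIO(csv_b))

# Stack the rows vertically and renumber them 0..n-1.
file_c = pd.concat([file_a, file_b], ignore_index=True)

# Shift to 1-based numbering, as requested in the question.
file_c.index = file_c.index + 1

print(file_c)
```

With real files, `pd.read_csv('A.csv')` would replace the `io.StringIO` inputs, and `file_c.to_csv('out.csv', index_label='Sr.no')` would write the result including the new index column (the file names here are placeholders).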

0

Maybe this is what you want:

i = 0
line = ''  # stays empty if A.csv has no lines at all
# 'with' closes all three files automatically, even if an error occurs
with open('A.csv', 'r') as A, open('B.csv', 'r') as B, open('out.csv', 'w') as out:
    # writing A file
    for line in A:
        if i == 0:  # This is to handle the header line in csv A
            out.write(line)
        else:
            # replace the old "N." prefix with the running index
            out.write("%s. %s" % (i, line[line.find('.') + 1:]))
        i = i + 1

    # This is to handle the case where there is no end-of-line at the end of A.csv
    if line and not line.endswith("\n"):
        out.write("\n")

    # writing B file
    for line in B:
        out.write("%s. %s" % (i, line[line.find('.') + 1:]))
        i = i + 1

0

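In general, I would say it is probably faster to ignore the index when merging and to create your own index at run time (simply by counting the rows) when the file is read again. As for file.merge(), have a look at the pandas article "Merge, join, and concatenate". Merge is not what you are looking for: it combines CSV files the way a database join would. You could bend it to fit your purpose, but I think the best approach is the simple code given further below.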
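As an aside, the difference between a database-style merge and plain concatenation can be seen on two tiny frames. This is only an illustrative sketch with made-up data: with its default arguments, `pd.merge` performs an inner join on the columns the two frames share, so frames with no rows in common give an empty result, whereas `pd.concat` stacks the rows.

```python
import pandas as pd

# Made-up data with the same columns in both frames.
a = pd.DataFrame({"Name": ["A", "B", "C"], "Age": [12, 13, 15]})
b = pd.DataFrame({"Name": ["D", "E"], "Age": [20, 22]})

# merge() joins on the shared columns (an inner join by default),
# so two frames with no rows in common produce an empty result.
joined = pd.merge(a, b)

# concat() simply stacks the rows of one frame under the other.
stacked = pd.concat([a, b], ignore_index=True)

print(len(joined), len(stacked))
```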

I recommend using with to open the files (see "Python's with statement"). It has been available since Python 2.5 and is enabled by default from Python 2.6 on.

import shutil

def merge():
    print('*** Merging started ***')
    # opening all the files using with
    with open('fileA.csv', 'r') as fileA, open('fileB.csv', 'r') as fileB, open('fileOutput.csv', 'w') as output:
        # if all the files start at index no. 1, you don't need to copy the whole file line by line; you can copy it as a whole
        # you just need to count the number of lines in order to know which number to use for fileB
        lines_counter = -1  # start at -1 so that the header line of fileA is not counted (assumes fileA has one)
        line = ''
        for line in fileA:
            lines_counter += 1

        # copy the whole of fileA into fileOutput
        shutil.copyfile('fileA.csv', 'fileOutput.csv')
        # move the write position of output to the end of the copied data,
        # otherwise the writes below would overwrite it from the start
        output.seek(0, 2)

        # if the last line of fileA did not end with an end-of-line, append one
        if line and not line.endswith('\n'):
            output.write('\n')

        # copy all the lines of fileB and add the index which belongs to each line
        for line in fileB:
            lines_counter += 1
            line_without_index = line[line.find('.'):]
            output.write('{}{}'.format(lines_counter, line_without_index))

    print('*** Merging finished ***')

merge()

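EDIT:

If shutil does not work, you can simply delete

shutil.copyfile('fileA.csv','fileOutput.csv')

and add one line to the first for loop, so that it looks as follows:

for line in fileA:
    lines_counter += 1
    output.write(line)

It should work the same way. There may be only a small difference in performance, but I guess that is no big deal.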
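The same line-based idea can also be wrapped in a small function that works on any iterable of lines, which makes it easy to try without touching the disk. This is a minimal sketch, assuming every data line starts with an `N.` prefix and only the first file has a header line; `renumber` is a hypothetical helper name, and the in-memory sample data is made up to resemble the question.

```python
import io

def renumber(lines_a, lines_b, header=True):
    """Yield the lines of A, then B, replacing each 'N.' prefix with a running index."""
    index = 0
    for source, has_header in ((lines_a, header), (lines_b, False)):
        for pos, line in enumerate(source):
            line = line.rstrip("\n")
            if has_header and pos == 0:
                yield line + "\n"  # copy the header line unchanged
                continue
            index += 1
            # Keep only what follows the first '.', i.e. drop the old index.
            _, _, rest = line.partition(".")
            yield "{}. {}\n".format(index, rest.strip())

# In-memory stand-ins for the two files (the last line of A has no newline).
file_a = io.StringIO("Sr.no Name, Age\n1. A, 12\n2. B, 13\n3. C, 15")
file_b = io.StringIO("1. D,20\n2. E,22\n")

merged = list(renumber(file_a, file_b))
print("".join(merged), end="")
```

Because each line is stripped and re-terminated, a missing end-of-line at the end of the first file needs no special handling. A data line without any `.` would lose its content in this sketch, so real input may need a stricter check.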

+0

Hi, thanks for the code, but the shutil module is not callable. How do I install it (or import it)? –

+0

--------------------------------------------------------------------------- TypeError Traceback (most recent call last) in () 22 lines_counter += 1; 23 line_without_index = line[line.find('.'):]; ---> 24 output.write('{}{}'.format(str(lines_counter),line_without_index)); 26 print '*** Merging finished ***'; TypeError: 'module' object is not callable *** Merging started *** –

+0

Sorry. You can import ['shutil'](http://docs.python.org/release/2.5.2/lib/module-shutil.html) with 'import shutil' at the beginning of the code. I have edited the code. I hope it is OK now. – Marek

0

Try the following code:

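import csv
from itertools import chain

# newline='' lets the csv module handle line endings itself (Python 3)
with open('a.csv', newline='') as a, open('b.csv', newline='') as b, open('out.csv', 'w', newline='') as out:
    a = csv.reader(a)
    b = csv.reader(b)
    out = csv.writer(out)
    # number the rows of both files consecutively, starting at 1
    for i, row in enumerate(chain(a, b), 1):
        row[0] = i  # assumes the index is the first column and there is no header row
        out.writerow(row)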
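If the first file does have a header row, it can be copied across before the numbering starts. The following is a minimal sketch of that variation, assuming the index sits in its own first column and only the first file carries a header; the in-memory sample data is made up and stands in for the real files.

```python
import csv
import io
from itertools import chain

# Made-up stand-ins for a.csv (with header) and b.csv (without header).
a_text = "Sr.no,Name,Age\n1,A,12\n2,B,13\n3,C,15\n"
b_text = "1,D,20\n2,E,22\n"

out = io.StringIO()
reader_a = csv.reader(io.StringIO(a_text))
reader_b = csv.reader(io.StringIO(b_text))
writer = csv.writer(out, lineterminator="\n")

# Copy the header row of the first file unchanged.
writer.writerow(next(reader_a))

# Renumber the remaining rows of both files consecutively from 1.
for i, row in enumerate(chain(reader_a, reader_b), 1):
    row[0] = i
    writer.writerow(row)

print(out.getvalue(), end="")
```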