2016-12-16 88 views
1

I'm learning Python, so sorry for the silly question... Adding rows when the second CSV has the same first row

I have two files:

list.csv

john 
mary 
joanna 
lucas 
kate 

db.csv

john^chief^portland 
mary^secretary^ny 
joanna^supervisor^washington 

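What I want to achieve is to compare these two files and output all the names, sorted alphabetically by the first column, adding None in the second column for names that are not in db, like this: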

output.csv

joanna^supervisor^washington 
john^chief^portland 
kate^None 
lucas^None 
mary^secretary^ny

I started playing with this code, which I found on SO:

masterlist = list(reader22)

for hosts_row in reader21:
    row = 1
    found = False
    for master_row in masterlist:
        results_row = hosts_row
        if hosts_row[0] == master_row[0]:
            results_row.append('FOUNDTHISLINE in master list (row '
                               + str(row) + ')')
            found = True
            break
        row = row + 1
    if not found:
        results_row.append('THISLINENOTFOUND in master list')
    writer23.writerow(results_row)

Please help me understand the best way to do this.

Which value goes in the third column? – MMF

When you say "same first row", do you mean _column_? –

Sorry, I meant column. MMF: there is no third column – Lucas

Answers

2

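It's easy and efficient to do what you want using only the csv module and a couple of Python's built-in data structures, namely lists and dictionaries: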

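import csv

# Python 2: the csv module expects files opened in binary mode ('rb'/'wb').
with open('list.csv', 'rb') as csvfile:
    # Sorted list of the names in the first column of list.csv.
    masterlist = sorted(row[0] for row in csv.reader(csvfile))

with open('db.csv', 'rb') as csvfile:
    # Map each name to the remaining fields of its db.csv row.
    db = {row[0]: row[1:] for row in csv.reader(csvfile, delimiter='^')}

with open('output.csv', 'wb') as csvfile:
    writer = csv.writer(csvfile, delimiter='^')
    for name in masterlist:
        # Names missing from db get 'None' and an empty third field.
        writer.writerow([name] + db[name] if name in db else [name, 'None', ''])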

Contents of the output.csv file it creates:

joanna^supervisor^washington 
john^chief^portland 
kate^None^ 
lucas^None^ 
mary^secretary^ny 
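
For anyone on Python 3, here is a minimal sketch of the same approach. It is an adaptation, not the answer's original code: the function name `merge_names` is made up for illustration, files are opened in text mode with `newline=''` instead of `'rb'`/`'wb'`, stray whitespace around fields is stripped (an assumption about the input), and missing names get a single `None` field, as in the question's expected output.

```python
import csv


def merge_names(list_path, db_path, out_path, delimiter='^'):
    """Write one row per name in list_path, sorted, filled in from db_path.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # newline='' lets the csv module handle line endings itself (Python 3).
    with open(list_path, newline='') as f:
        # Skip blank lines and strip stray whitespace around each name.
        names = sorted(row[0].strip() for row in csv.reader(f) if row)

    with open(db_path, newline='') as f:
        # Map each name to the remaining fields of its row.
        db = {row[0].strip(): [field.strip() for field in row[1:]]
              for row in csv.reader(f, delimiter=delimiter) if row}

    with open(out_path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter=delimiter)
        for name in names:
            # Names missing from db get a single 'None' field.
            writer.writerow([name] + db.get(name, ['None']))
```

Calling `merge_names('list.csv', 'db.csv', 'output.csv')` with the question's files should write `kate^None` and `lucas^None` for the two names that are not in db.csv.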
Guys, correct me if I'm wrong, but in Python < 2.7 this line: `db = {row[0]: row[1:] for row in csv.reader(csvfile, delimiter='^')}` should look like this? `db = dict((row[0], row[1:]) for row in csv.reader(csvfile, delimiter='^'))` – Lucas

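Lucas: I don't know in which version of Python [dictionary displays](https://docs.python.org/2/reference/expressions.html#dictionary-displays) were introduced, but they've been around for a long time. That said, the [alternative](https://docs.python.org/2/library/stdtypes.html#mapping-types-dict) approach you showed, which builds the dictionary from a [generator expression](https://docs.python.org/2/reference/expressions.html#generator-expressions) producing a sequence of `key`, `value` pairs, should also work in versions from 2.4 through 2.7. – martineau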
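
To make the two spellings from these comments concrete, here is a small sketch using made-up in-memory rows instead of a file. Dict comprehensions arrived in Python 2.7, while `dict()` fed by a generator expression of pairs also works on older 2.x releases (generator expressions date from 2.4); both build the same dictionary.

```python
# Sample rows standing in for what csv.reader would yield (illustrative data).
rows = [
    ['john', 'chief', 'portland'],
    ['mary', 'secretary', 'ny'],
]

# Dict comprehension: Python 2.7 and later, and all of Python 3.
db_comprehension = {row[0]: row[1:] for row in rows}

# dict() fed by a generator expression of (key, value) pairs.
db_from_pairs = dict((row[0], row[1:]) for row in rows)

print(db_comprehension == db_from_pairs)  # True
```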

2

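This is a perfect use case for the pandas library. I know you're just starting to learn, but check it out for data manipulation (please ignore the numbering :))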

In [37]: list_df = pd.read_csv('list.csv', header=None)

In [38]: db_df = pd.read_csv('db.csv', sep='^', header=None)

In [51]: db_df
Out[51]:
        0           1           2
0    john       chief    portland
1    mary   secretary          ny
2  joanna  supervisor  washington


In [48]: list_df
Out[48]:
        0
0    john
1    mary
2  joanna
3   lucas
4    kate

In [52]: df = list_df.merge(db_df, how='left')

In [53]: df
Out[53]:
        0           1           2
0    john       chief    portland
1    mary   secretary          ny
2  joanna  supervisor  washington
3   lucas         NaN         NaN
4    kate         NaN         NaN

In [54]: df.sort_values(0)
Out[54]:
        0           1           2
2  joanna  supervisor  washington
0    john       chief    portland
4    kate         NaN         NaN
3   lucas         NaN         NaN
1    mary   secretary          ny

From there, you can call the df.to_csv function to get the output you're looking for.

(Writing the file back) http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_csv.html
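
Putting the whole thing together, including the write-back step, could look roughly like the sketch below. The function name `merge_with_pandas` is hypothetical, and it assumes a reasonably recent pandas where sorting is done with `sort_values`. Note that `na_rep='None'` fills every missing cell, so a name that is not in db.csv ends up as `kate^None^None` rather than `kate^None`.

```python
import pandas as pd


def merge_with_pandas(list_path, db_path, out_path, sep='^'):
    """Left-join the name list with the db file and write the sorted result.

    Hypothetical helper for illustration only.
    """
    # header=None: neither file has a header row, so columns are numbered 0, 1, 2.
    list_df = pd.read_csv(list_path, header=None)
    db_df = pd.read_csv(db_path, sep=sep, header=None)

    # With no 'on' argument, merge joins on the columns both frames share,
    # which here is column 0; how='left' keeps every name from the list file.
    merged = list_df.merge(db_df, how='left')

    # sort_values is the sorting method in current pandas releases.
    merged = merged.sort_values(0)

    # No header or index in the output; missing values are written as 'None'.
    merged.to_csv(out_path, sep=sep, header=False, index=False, na_rep='None')
    return merged
```

Usage would be `merge_with_pandas('list.csv', 'db.csv', 'output.csv')`, assuming the input files contain no stray whitespace around the fields.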

How do you write the file back? – praveenraj

Edited the answer to point to the documentation. It's very straightforward. – Kelvin
