2015-08-24 159 views
2

awk: search and append the matching name from another CSV file

I have 2 CSV files.

File 1 contains

product_id, category_id, price 
pid01,cat01,10 
pid02,cat01,10 
pid03,cat01,20 
pid04,cat02,30 
pid05,cat02,20 
pid06,cat03,30 

File 2 contains

category_id, category_name 
cat01,Mouse 
cat02,Cat 
cat03,Fish 
cat04,Dog 

I need a result like this:

product_id, category_id, category_name, price 
pid01,cat01,Mouse,10 
pid02,cat01,Mouse,10 
pid03,cat01,Mouse,20 
pid04,cat02,Cat,30 
pid05,cat02,Cat,20 
pid06,cat03,Fish,30 

product_id, category_name, price 
pid01,Mouse,10 
pid02,Mouse,10 
pid03,Mouse,20 
pid04,Cat,30 
pid05,Cat,20 
pid06,Fish,30 

How can I achieve this in Bash or awk?

+0

Does the first line of file2 contain the header? – amdixon

+0

Yes, let me update the question. – billyduc

Answers

4

This awk command will do it:

awk -F, 'NR==FNR{a[$1]=$2;next}FNR>1{print $1,$2,a[$2],$3}' OFS=, file2 file1 

By the way, you also need to add the header. Let me explain the script in multi-line form:

# Specify the field delimiter and print the headers 
BEGIN { 
    FS=OFS="," 
    $1="product_id" 
    $2="category_id" 
    $3="category_name" 
    $4="price" 
    print 
} 

# While the total number of records read so far (NR) equals 
# the number of records read from the current input file (FNR), 
# we are still reading the first file (file2), so we populate 
# the lookup table 'a' with its data 
NR==FNR{ 
    a[$1]=$2 
    next # Skip the following block and go on parsing file2 
} 

# Skip line 1 in file1, inject column 3 with the value from 
# the lookup table and output the record 
FNR>1{ 
    print $1,$2,a[$2],$3 
} 

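Please check anubhava's comment. With gawk or mawk, using -F', *' makes printing the header simpler. The optional space after the comma is needed because the column headers contain a space. I would simply remove that space before processing.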
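The clean-up step mentioned above (removing the space after each comma before processing) could look like the following sketch. It is only an illustration: the join_demo directory, the *.clean file names and result.csv are made up here, and the sample rows are a subset of the data from the question.

```shell
#!/bin/sh
# Sketch: strip the space after each comma, then do the lookup join.
# join_demo, *.clean and result.csv are hypothetical names for this demo.
mkdir -p join_demo

cat > join_demo/file1 <<'EOF'
product_id, category_id, price
pid01,cat01,10
pid04,cat02,30
pid06,cat03,30
EOF

cat > join_demo/file2 <<'EOF'
category_id, category_name
cat01,Mouse
cat02,Cat
cat03,Fish
cat04,Dog
EOF

# Normalize "a, b" to "a,b" so that a plain -F, splits the header cleanly
sed 's/, */,/g' join_demo/file1 > join_demo/file1.clean
sed 's/, */,/g' join_demo/file2 > join_demo/file2.clean

# After the clean-up, the header of file2 maps category_id -> category_name,
# so the header line of file1 is joined like any other row (no FNR>1 needed)
awk -F, -v OFS=, 'NR==FNR{a[$1]=$2;next}{print $1,$2,a[$2],$3}' \
    join_demo/file2.clean join_demo/file1.clean > join_demo/result.csv

cat join_demo/result.csv
```

With the clean-up done up front, the BEGIN block that prints the header by hand should no longer be necessary, since the header row falls out of the lookup itself.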

+0

awk -F', *' -v OFS=, 'FNR==NR{a[$1]=$2;next} {print $1,$2,a[$2],$3}' file2 file1 will also get the header row. – anubhava

+1

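@anubhava Good catch! :) I was already wondering why it didn't work in the first place, but wanted to finish my explanation. I missed the space in the header! Thanks! – hek2mgl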

+0

Fantastic! Thanks hek2mgl – billyduc

3

With join:

join --header -t , -1 2 -2 1 -o 1.1,1.2,2.2,1.3 file1 file2 

Output:

 
pid01,cat01,Mouse,10 
pid02,cat01,Mouse,10 
pid03,cat01,Mouse,20 
pid04,cat02,Cat,30 
pid05,cat02,Cat,20 
pid06,cat03,Fish,30 
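
The question also asks for a second layout without the category_id column. A possible variant of the join command, sketched under a few assumptions: GNU join (for --header), both inputs already sorted on the category id, and headers without the space after the comma. The join_demo2 directory and result.csv are hypothetical names, and the rows are a subset of the question's data.

```shell
#!/bin/sh
# Sketch: same join, but the output list drops the category_id column.
# Assumes GNU coreutils join; join_demo2 and result.csv are made-up names.
mkdir -p join_demo2

cat > join_demo2/file1 <<'EOF'
product_id,category_id,price
pid01,cat01,10
pid04,cat02,30
pid06,cat03,30
EOF

cat > join_demo2/file2 <<'EOF'
category_id,category_name
cat01,Mouse
cat02,Cat
cat03,Fish
cat04,Dog
EOF

# -1 2 -2 1 : join field 2 of file1 against field 1 of file2
# -o 1.1,2.2,1.3 : print product_id, category_name, price only
join --header -t , -1 2 -2 1 -o 1.1,2.2,1.3 \
    join_demo2/file1 join_demo2/file2 > join_demo2/result.csv

cat join_demo2/result.csv
```

Categories with no product (cat04 here) are simply not printed, because join only outputs paired lines by default.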
+1

Nice one, I always forget about this tool. – hek2mgl

0

You can create a shell script (process_csv.sh) like this:

#!/bin/sh 

# Keep only the data rows of file1 (from the first line containing "pid" to the end) 
data=`sed -n '/pid/,$ p' file1.csv` 
data2=`cat file2.csv` 
echo "product_id, category_id, price, category_name" > final.csv 
# category_id is common to both files, so we look up category names based on that id 
for row in $data 
      do 
        cat_id=`printf '%s\n' "$row" | awk -F "," '{print $2}'` 
        # anchor the match so that only the id column of file2 is searched 
        category_name=`printf '%s\n' "$data2" | grep "^$cat_id," | cut -f2 -d','` 
        # append category_name to the row and write the line to final.csv 
        echo "$row,$category_name" >> final.csv 


      done 

Just run "./process_csv.sh" and the final.csv file will contain your result.