Explain an awk command for merging CSV files

I found a useful awk command posted by someone who had the same problem as me:

awk -F, 'NR==FNR{a[$2]=$0;next}$2 in a{ print a[$2],$4, $5 }' OFS=, file1.csv file2.csv

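I want to modify it to fit our CSV format, but I'm having a hard time understanding what it does. I'm sorry for the short notice, and I hope you can help me.

Thanks!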


2014-12-01 H-H

-F,

Sets FS (the input field separator) to , so each line is split into fields on commas.

NR==FNR{a[$2]=$0;next}

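When the total number of lines read so far (NR) equals the line number within the current file (FNR), which is the case while the first non-empty file is being processed, save the input line in the array a under the key of that line's second field ($2), then skip straight to the next input line (next).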

$2 in a{ print a[$2],$4, $5 }

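When the second field of the current line ($2) is a key in the array a, print the line stored in a under that key (a[$2]), followed by OFS (a comma), then the fourth field of the current line ($4), then OFS, then the fifth field of the current line ($5).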

OFS=,

Sets OFS (the output field separator) to , before the input files are processed.

tl;dr: appends columns 4 and 5 of file2.csv to the matching lines of file1.csv (matched on field 2). Only lines with a match are printed.
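To make the walk-through above concrete, here is a small run of the exact command on made-up data. The file contents and the temporary directory are assumptions for illustration only, not the asker's real CSV layout.

```shell
# Hypothetical sample data; the real files will have different columns.
tmp=$(mktemp -d)
printf '%s\n' '1,alice,NY' '2,bob,LA' '3,carol,SF' > "$tmp/file1.csv"
printf '%s\n' 'x,bob,y,10,20' 'x,dave,y,30,40' 'x,alice,y,50,60' > "$tmp/file2.csv"

# The command from the question, unchanged apart from the file paths.
merged=$(awk -F, 'NR==FNR{a[$2]=$0;next}$2 in a{ print a[$2],$4, $5 }' OFS=, \
    "$tmp/file1.csv" "$tmp/file2.csv")

printf '%s\n' "$merged"
# 2,bob,LA,10,20
# 1,alice,NY,50,60

rm -r "$tmp"
```

Note that the output follows the order of file2.csv, that dave (no match in file1.csv) and carol (never seen in file2.csv) are dropped, and that adapting the command to another layout should mostly be a matter of changing $2 to your key column and $4, $5 to the columns you want appended.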


2014-12-01 17:33:11

-F,   # Set the field separator to a comma 

NR==FNR  # Test if we are looking at the first file
      # NR is incremented for every line read across all input files
      # FNR is incremented for every line read in the current file and restarts at 1 for each new file
      # The only time NR==FNR is when we are looking at the first file 

a[$2]=$0  # Create a lookup for the line based on the value in the 2nd column 

next   # Short circuit the script and get the next input line 

$2 in a  # If we are here we are looking at the second file 
      # Check if we have seen the second field in the first file 

a[$2],$4,$5 # Print the whole matching line from the first file 
      # with the 4th & 5th fields from the second 

OFS=,  # Separate the output with a comma
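If the NR versus FNR part is still hard to picture, a quick trace may help. This is only a sketch on two made-up files (first.txt and second.txt are hypothetical names); it prints each line together with both counters and whether NR==FNR holds.

```shell
# Two hypothetical input files: two lines, then three lines.
tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/first.txt"
printf '%s\n' c d e > "$tmp/second.txt"

# Print the line, NR, FNR and the result of the NR==FNR test.
trace=$(awk '{ print $0, NR, FNR, (NR==FNR ? "first-file" : "later-file") }' \
    "$tmp/first.txt" "$tmp/second.txt")

printf '%s\n' "$trace"
# a 1 1 first-file
# b 2 2 first-file
# c 3 1 later-file
# d 4 2 later-file
# e 5 3 later-file

rm -r "$tmp"
```

NR keeps counting across files while FNR starts over at 1, so the two only agree while the first file is being read.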


2014-12-01 17:36:51

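WRT 'The only time NR==FNR is when we are looking at the first file': or when we are looking at the second file and the first file is empty. It trips people up from time to time, so it is worth keeping in mind. – 2014-12-01 19:29:47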

@EdMorton Better to use 'ARGIND == 1', at the cost of two characters – 2014-12-01 20:43:23

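@Jidder That is GNU-specific, but it is a good solution for gawk. For non-gawk awks you can add 'FNR == 1 {ARGIND++}', but that brings back the failure on empty files. You could also use 'FILENAME == ARGV[1]', but that fails if you set variables in the file name area of the command line (another reason not to do that!). There really is no ideal solution, so 'NR == FNR' is usually fine; just remember that it will fail if your first file is empty. – 2014-12-01 20:46:38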
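The empty-first-file pitfall from these comments can be reproduced with a short sketch. The file names and the "second" label are made up for illustration, and OFS is set with -v instead of among the file names so that ARGV[1] really is the first file.

```shell
# Hypothetical setup: an empty first file and a one-line second file.
tmp=$(mktemp -d)
: > "$tmp/empty.csv"
printf '%s\n' 'x,bob,y,10,20' > "$tmp/file2.csv"

# NR==FNR: the first file contributes no lines, so the test is also true for
# every line of the second file; they are stored in a[] and nothing is printed.
naive=$(awk -F, -v OFS=, 'NR==FNR{a[$2]=$0;next} {print "second", $2}' \
    "$tmp/empty.csv" "$tmp/file2.csv")

# FILENAME==ARGV[1]: true only while reading the first file operand,
# so the second file's line reaches the print rule.
robust=$(awk -F, -v OFS=, 'FILENAME==ARGV[1]{a[$2]=$0;next} {print "second", $2}' \
    "$tmp/empty.csv" "$tmp/file2.csv")

printf 'naive: [%s]\nrobust: [%s]\n' "$naive" "$robust"
# naive: []
# robust: [second,bob]

rm -r "$tmp"
```

With the original command line, where OFS=, sits in front of the file names, ARGV[1] would be the string OFS=, rather than a file name, which is exactly the failure mode the last comment warns about.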
