File split command using AWK with a header on each file

awk -F "\",\"" 'NR==1 { hdr=$0; next } $10 != prev { prev=text=$10; gsub(/[^[:alnum:]_]/,"",text); $0 = hdr "\n" $0 } { print > ("test."text".batch.csv") }' test.batch1.csv

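I have an awk command that is not working properly. It splits the file (based on the value of column $10) and puts the header on each output file. I tried to understand the command line, but I don't quite follow it. I would appreciate it if someone could explain what each line does.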

Source

2016-10-06 Pooja25

awk -F "\",\"" '                        # set the field separator to ","
NR==1 {                                 # take the header from the first record
    hdr=$0; next                        # and skip to the next record
}
$10 != prev {                           # if the 10th field differs from the previous one
    prev=text=$10                       # set prev and text to the 10th field
    gsub(/[^[:alnum:]_]/,"",text)       # remove everything but letters, digits and _ from text
    $0 = hdr "\n" $0                    # prepend the header to the record
}
{                                       # e.g. ..,"foo/bar_123",... would be written
    print > ("test."text".batch.csv")   # to the file test.foobar_123.batch.csv
}
' test.batch1.csv                       # input file
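
To see what the field separator and the gsub() call do in isolation, here is a minimal sketch. The sample line is made up for illustration; only its quoting style is assumed to match the real data.

```shell
# Hypothetical one-line sample; the field contents are invented for this demo.
line='"a","b","c","d","e","f","g","h","i","foo/bar 123","k"'

# With -F '","' the 10th field carries no quotes; only $1 and the last field
# keep one stray quote each.
printf '%s\n' "$line" | awk -F '","' '{ print $1; print $10 }'

# gsub() then strips everything except letters, digits and underscore.
key=$(printf '%s\n' "$line" | awk -F '","' '{ text = $10; gsub(/[^[:alnum:]_]/, "", text); print text }')
echo "$key"
```

The first command prints `"a` and `foo/bar 123`; the last line prints `foobar123`, which is the part that ends up in the output file name.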

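If it is not working the way it used to, I would first check that the data file is sorted on the 10th field. The header is prepended every time the 10th field changes, so a key that comes back later in the file gets a second header in the middle of its output file.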
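
One way to check the grouping is sketched below. The file name sample.csv and its contents are made up for the demo; the script only assumes the same "," separator and that the key is in field 10.

```shell
# Hypothetical sample: the key in column 10 runs x, y, x, so it is not grouped.
cat > sample.csv <<'EOF'
"h1","h2","h3","h4","h5","h6","h7","h8","h9","key","h11"
"1","b","c","d","e","f","g","h","i","x","k"
"2","b","c","d","e","f","g","h","i","y","k"
"3","b","c","d","e","f","g","h","i","x","k"
EOF

# Report every key that comes back after a different key was seen in between.
report=$(awk -F '","' '
NR == 1 { next }                         # skip the header
$10 != prev {                            # the key changed
    if ($10 in seen) print "line " NR ": key " $10 " reappears"
    seen[$10] = 1
    prev = $10
}' sample.csv)
echo "$report"
```

For this sample it prints `line 4: key x reappears`. If it prints nothing, the rows are grouped by the 10th field and the original command should write exactly one header per file.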

Source

2016-10-06 16:22:28

That's right, James, it is not working as expected. It is making a mess of the data. Thank you very much for your advice and help. – Pooja25

Since you didn't provide an input sample, here is a simplified version.

Suppose you want to split the file on the key values:

$ cat file 
header 
1 
2 
2 
3 
3 
3 

$ awk 'NR==1{header=$0; next}           # save the header
       prev!=$1{fn=$1;                  # when the value changes, set the new file suffix,
                prev=$1;                # save the current key value,
                $0=header RS $0}        # and insert the header before the first record
       {print > (FILENAME"."fn)}' file  # print each record to its file

$ head file.{1..3} 
==> file.1 <== 
header 
1 

==> file.2 <== 
header 
2 
2 

==> file.3 <== 
header 
3 
3 
3
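
Like the original command, this assumes the rows are already grouped by key. If they are not, a variant along the following lines might help. It is only a sketch: the `seen` array and the file name `unsorted` are introduced here for illustration. It writes the header once per output file, no matter where the key shows up.

```shell
# Hypothetical input where key 1 appears twice, with key 2 in between.
printf '%s\n' header 1 2 1 3 > unsorted

awk '
NR == 1 { header = $0; next }            # save the header
{
    out = FILENAME "." $1                # output name, e.g. unsorted.1
    if (!(out in seen)) {                # first record for this key:
        print header > out               # start the file with the header
        seen[out] = 1
    }
    print > out                          # then write the record itself
}' unsorted
```

Here unsorted.1 ends up with the header followed by both `1` rows. With a very large number of distinct keys, some awk implementations may run out of open files; the sketch does not handle that case.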

Source

2016-10-06 15:54:09 karakfa
