2017-09-07

Formatting CSV files with awk | unix | Solaris | AWK

I have multiple CSV files like the ones below:

~/Prod/Jcs/BIN/Dash_PPLP/load$ ls -lt *csv 
-rw-rw-r-- 1 tellus tellus  81 Sep 7 14:27 extraction_MBBSCS_PPL_USAGE_IMPORT.csv 
-rw-rw-r-- 1 tellus tellus  83 Sep 7 14:27 extraction_MBBSCS_PPL_INVOICE_IMPORT.csv 
-rw-rw-r-- 1 tellus tellus  71 Sep 7 14:27 extraction_INVOICE.csv 
-rw-rw-r-- 1 tellus tellus  69 Sep 7 14:27 extraction_USGRERUN.csv 
-rw-rw-r-- 1 tellus tellus  69 Sep 7 14:27 extraction_USG.csv 
-rw-rw-r-- 1 tellus tellus  72 Sep 7 14:27 extraction_LIA.csv 
-rw-rw-r-- 1 tellus tellus  74 Sep 7 14:27 extraction_MSISDN.csv 

Opening one of the files:

cat extraction_LIA.csv 
PPL_LIABILITY,2468705,Fri Sep 01 06:56:41 2017,Fri Sep 01 06:58:33 2017 

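The format is name, rows, START_TIME and END_TIME for each flow I want to monitor. I need to transform the files so that they are "loadable" into an Oracle table.

I have written a script that does the transformation and overwrites each file, as shown below: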

cat transform_to_load.bash 
#!/bin/bash 
csv_files=$(ls *.csv) 
for i in $csv_files 
do 
x=$(nawk 'BEGIN { OFS=","; FS=","} {split($3,a," ");split($3,b," ")} 
{$3=a[3]"/"a[2]"/"a[5]" "a[4];$4=b[3]"/"b[2]"/"b[5]" "b[4]} 
{print}' $i) 
echo $x > $i 
done 

The problem is my nawk command:

x=$(nawk 'BEGIN { OFS=","; FS=","} {split($3,a," ");split($3,b," ")} 
    {$3=a[3]"/"a[2]"/"a[5]" "a[4];$4=b[3]"/"b[2]"/"b[5]" "b[4]} 
    {print}' $i) 

It produces the following (the start time and the end time are the same):

~/Prod/Jcs/BIN/Dash_PPLP/load$ cat extraction_LIA.csv 
PPL_LIABILITY,2468705,01/Sep/2017 06:56:41,01/Sep/2017 06:56:41 

What I want to achieve is to format each file with nawk (SunOS) like this:

PPL_LIABILITY,2468705,01/Sep/2017 06:56:41,01/Sep/2017 06:58:33 

Can you help me get my nawk command to output the correct format?

Thanks a lot!

Answers


You are almost there; only a small correction is needed.

Reason:

It happens because your code has:

{split($3,a," "); split($3,b," ")} 
         ^
        So the end time gets the same value as the start time 

Correct it like this.

Solution:

{split($3,a," "); split($4,b," ")} 
         ^
         Fourth Column will be used 
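
To check the fix in isolation, here is a minimal sketch that feeds the sample line from extraction_LIA.csv through the corrected one-liner. It uses plain awk so that it can be tried outside Solaris; that nawk or /usr/xpg4/bin/awk behave the same way on SunOS is an assumption. The variable names line and fixed are only for illustration.

```shell
#!/bin/sh
# Sample line copied from extraction_LIA.csv in the question.
line='PPL_LIABILITY,2468705,Fri Sep 01 06:56:41 2017,Fri Sep 01 06:58:33 2017'

# Assumption: awk stands in for nawk here.
fixed=$(printf '%s\n' "$line" | awk '
  BEGIN { FS = OFS = "," }
  {
      split($3, a, " ")   # start time parts: Fri Sep 01 06:56:41 2017
      split($4, b, " ")   # end time parts, now read from field 4
      $3 = a[3] "/" a[2] "/" a[5] " " a[4]
      $4 = b[3] "/" b[2] "/" b[5] " " b[4]
      print
  }')

echo "$fixed"
```

With field 4 split into b, the end time should come out as 01/Sep/2017 06:58:33 instead of repeating the start time.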

Also, if you are interested, the script can be simplified.

These are not needed:

  • csv_files=$(ls *.csv)
  • x=$(nawk '{..}')
  • echo $x > $i
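
One reason to drop the x=$(...) and echo $x pattern listed above: an unquoted $x goes through word splitting, so a file with more than one line would be written back as a single line. The sample files in the question have only one line each, so the effect is not visible there. A small sketch with hypothetical two-line content:

```shell
#!/bin/sh
# Hypothetical two-line CSV content held in a variable, as the original script does.
x=$(printf '%s\n' 'A,1,s1,e1' 'B,2,s2,e2')

unquoted=$(echo $x)            # word splitting joins the two lines with a space
quoted=$(printf '%s\n' "$x")   # quoting keeps the line break

echo "unquoted: $unquoted"
echo "quoted:"
echo "$quoted"
```

Similarly, looping over $(ls *.csv) splits file names on whitespace, while for i in *.csv lets the shell hand over each name intact.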

Simplified version:

$ cat test.sh 
#!/usr/bin/env bash 

for i in *.csv; do 

# On Solaris, prefer 
# /usr/xpg4/bin/awk or /usr/xpg6/bin/awk 

    nawk ' 
      BEGIN{ 
       FS=OFS="," 
      } 
      function format_dt(v, a){ 
       split($v,a,/ /); 
       $v=a[3]"/"a[2]"/"a[5]" "a[4] 
      } 
      { 
       format_dt(3); 
       format_dt(4) 
      }1 
     ' "$i" >tmpfile && mv tmpfile "$i" 
done 
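
One caveat with overwriting the files in place: running the conversion a second time on already-converted files would mangle the dates, because a converted field no longer splits into five parts. Below is a minimal sketch of a guard, not part of the original answer. The reformat function, the .tmp suffix and sample.csv are hypothetical names, awk stands in for nawk, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Hypothetical working copy; the line mirrors extraction_LIA.csv from the question.
workdir=$(mktemp -d) || exit 1
printf '%s\n' 'PPL_LIABILITY,2468705,Fri Sep 01 06:56:41 2017,Fri Sep 01 06:58:33 2017' > "$workdir/sample.csv"

reformat() {
    awk '
      BEGIN { FS = OFS = "," }
      function format_dt(v,    a, n) {
          n = split($v, a, " ")
          # Only rewrite fields that still look like: Fri Sep 01 06:56:41 2017
          if (n == 5)
              $v = a[3] "/" a[2] "/" a[5] " " a[4]
      }
      { format_dt(3); format_dt(4); print }
    ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

reformat "$workdir/sample.csv"
reformat "$workdir/sample.csv"   # second run leaves the converted line unchanged

result=$(cat "$workdir/sample.csv")
echo "$result"
rm -rf "$workdir"
```

The guard relies on the raw timestamp having exactly five space-separated parts; a different input layout would need a different check.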

Hey! Thanks a lot. So changing it to $4 will solve it, correct?

@tln_jupiter: Yes. As you can see, '$3' means the 3rd field/column.

Really useful, thanks a lot :)