2017-08-14 64 views

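I have millions of lines like the ones below, and I want to write a simple bash script that extracts some of the information by converting specific rows into columns.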

      Name:     1FJ 
         HA_RMSDs:   -1000.0000 
         HA_RMSDh:   -1000.0000 
         HA_RMSDm:    0.0000 
         Grid_Score:   -24.958729 
       Grid_vdw_energy:   -24.958729 
        Grid_es_energy:   0.000000 
     Internal_energy_repulsive:   5.894002 
          Name:  ZINC103990867 
         HA_RMSDs:   -1000.0000 
         HA_RMSDh:   -1000.0000 
         HA_RMSDm:    0.0000 
         Grid_Score:   -22.196136 
       Grid_vdw_energy:   -17.917459 
        Grid_es_energy:   -4.278677 
     Internal_energy_repulsive:   14.832469 

I want output like this:

Name   Grid_Score 
ZINC103990867 -22.196136 
1FJ    -24.958729 

I found some solutions, but I could not get them to work.

Any help would be greatly appreciated.

Answers


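When your input contains name-to-value mappings, it is usually best to first build an array that holds those mappings and then print the values by name: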

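$ cat tst.awk
# Strip the first colon so "Name:" becomes the field name "Name".
{ sub(/:/,"") }

# The first field name in the file marks the start of every record.
NR==1 { key=$1 }
# At the start of each record, print the header (first line) or the previous record.
$1==key { prt() }
# Store this line's value under its field name.
{ f[$1] = $2 }
# Print the final record.
END { prt() }

function prt(   i) {
    if (NR==1) {
        # First call: split the requested column list c and print the header row.
        numCols = split(c,cols,/,/)
        for (i=1; i<=numCols; i++) {
            printf "%s%s", cols[i], (i<numCols?OFS:ORS)
        }
    }
    else {
        # Later calls: print the stored values in the requested column order.
        for (i=1; i<=numCols; i++) {
            printf "%s%s", f[cols[i]], (i<numCols?OFS:ORS)
        }
    }
}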

$ awk -v c='Name,Grid_Score' -f tst.awk file | column -t 
Name           Grid_Score
1FJ            -24.958729
ZINC103990867  -22.196136

$ awk -v c='Name,Grid_Score,HA_RMSDs,Grid_es_energy' -f tst.awk file | column -t 
Name           Grid_Score  HA_RMSDs    Grid_es_energy
1FJ            -24.958729  -1000.0000  0.000000
ZINC103990867  -22.196136  -1000.0000  -4.278677

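The one-liner below prints the two columns separated by a single space. If you need better formatting than that, awk also has a printf statement.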

% awk 'BEGIN{print "Name","Grid_Score"}$1=="Name:"{name=$2}$1=="Grid_Score:"{print name,$2}' inputfile.txt 
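
As a rough illustration of the printf route, the one-liner above could be adapted as in the sketch below. The column widths (15 and 12) and the inline `sample` variable are assumptions made for this example; they are not part of the original answer, and real input would come from your file instead.

```shell
# Two records in the question's format, trimmed to the fields used here (illustrative sample).
sample='      Name:     1FJ
         Grid_Score:   -24.958729
          Name:  ZINC103990867
         Grid_Score:   -22.196136'

# %-15s left-justifies the name in 15 characters; %12s right-justifies the score in 12.
# Both widths are arbitrary choices for this sketch.
formatted=$(printf '%s\n' "$sample" | awk '
BEGIN               { printf "%-15s %12s\n", "Name", "Grid_Score" }
$1 == "Name:"       { name = $2 }                              # remember the current record name
$1 == "Grid_Score:" { printf "%-15s %12s\n", name, $2 }        # emit one row per record
')
printf '%s\n' "$formatted"
```

Fixed widths avoid a second pass over the data, which may matter with millions of lines, but a name longer than the chosen width will push its row out of alignment.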

What would I need to do to print the output in sorted order? @keithpjolley – sulejmani


your code | sort -n -k 2 – sulejmani
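
One caveat with piping everything through sort is that the header line gets sorted together with the data. A possible workaround, sketched below, is to read the header off the pipe first and sort only the remaining lines. The inline `sample` variable is an assumption for illustration, and its two records are deliberately listed in the opposite order from the question so that the effect of the sort is visible.

```shell
# Illustrative sample: the question's two records, trimmed and swapped so sorting changes the order.
sample='          Name:  ZINC103990867
         Grid_Score:   -22.196136
      Name:     1FJ
         Grid_Score:   -24.958729'

sorted=$(printf '%s\n' "$sample" |
  awk 'BEGIN               { print "Name", "Grid_Score" }
       $1 == "Name:"       { name = $2 }
       $1 == "Grid_Score:" { print name, $2 }' |
  {
    IFS= read -r header        # consume only the header line from the pipe
    printf '%s\n' "$header"    # print it unchanged
    sort -k2,2n                # numerically sort the remaining rows on column 2, ascending
  })
printf '%s\n' "$sorted"
```

Use `sort -k2,2nr` instead if the highest score should come first.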


If you need pretty output, pipe the plain awk output into 'column -t' –