Combine output by field using awk
I have a file in the format below, created using awk:
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6
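What I am trying to do is combine all lines with a matching $2, one row after another, after stripping off the "-" part (the "-" and the digits after it, as in the desired output below). Every line in the file will have a match, although these are not shown in the example. Thank you :).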
Desired output
chr2:211471445-211471675 CPS1|gc=48.3 230 bases with an average of 264.7
chr2:211460199-211460318 CPS1|gc=41.2 119 bases with an average of 105.6
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
I tried:
awk
awk '{k=$1 FS $2; a[k]+=split[$2] "-"; c[k]++}
END{for(k in a)
{split(k,ks,FS);
print ks[1],c[k],ks[2],a[k]/c[k]}}' file > output.txt
If "every line has a match", why not just strip the "-[digits]" from the second field?
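
A minimal sketch of what the comment suggests, not a confirmed answer. It assumes the fields are whitespace-separated, that the part to remove is always a "-" followed by digits right before the "|" in $2, and that the suffix should be stripped from every line (in the question's desired output the unmatched names keep their suffix, and CPS1 is listed first). The array names `lines` and `order` are made up for illustration. Lines sharing a name are printed together, in the order each name first appears in the input.

```shell
# Sample input copied from the question.
cat > file <<'EOF'
chr2:46603668-46603902 EPAS1-902|gc=54.3 234 bases with an average of 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 230 bases with an average of 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 221 bases with an average of 195.8
chr2:211460199-211460318 CPS1-1200|gc=41.2 119 bases with an average of 105.6
EOF

awk '
{
    # Drop "-<digits>" directly before "|" in field 2,
    # e.g. CPS1-1205|gc=48.3 becomes CPS1|gc=48.3.
    sub(/-[0-9]+[|]/, "|", $2)

    # The text before "|" is the name used as the grouping key.
    split($2, parts, "|")
    name = parts[1]

    if (name in lines) {
        # Name already seen: append this line to its group.
        lines[name] = lines[name] ORS $0
    } else {
        # First occurrence: remember the order and start the group.
        order[++n] = name
        lines[name] = $0
    }
}
END {
    # Print the groups in first-seen order.
    for (i = 1; i <= n; i++) print lines[order[i]]
}' file > output.txt
```

If the groups only need to be adjacent and their order does not matter, doing the `sub()` alone and piping the result through `sort -k2,2` would be a shorter alternative.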