2014-02-19
1

Hi, I am new to shell scripting and I have not been able to do this: edit data to remove line breaks and put everything on one row

My data looks like this (it is actually much bigger):

>SampleName_ZN189A 
01000001000000000000100011100000000111000000001000 
00110000100000000000010000000000001100000010000000 
00110000000000001110000010010011111000000100010000 
00000110000001000000010100000000010000001000001110 
0011 
>SampleName_ZN189B 
00110000001101000001011100000000000000000000010001 
00010000000000000010010000000000100100000001000000 
00000000000000000000000010000000000010111010000000 
01000110000000110000001010010000001111110101000000 
1100 

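Note: there is a line break after every 50 characters, but sometimes after fewer, where one sample's data ends and a new sample name begins.

I would like the line break after every 50 characters to be removed, so that my data looks like this: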

>SampleName_ZN189A 
0100000100000000000010001110000000011100000000100000110000100000000000010000000000001100000010000000... 
>SampleName_ZN189B 
0011000000110100000101110000000000000000000001000100010000000000000010010000000000100100000001000000... 

I tried using tr, but I got an error:

tr '\n' '' < my_file

tr: empty string2

Thanks in advance

Answers

1

You can use this awk:

awk '/^ *>/{if (s) print s; print; s="";next} {s=s $0;next} END {print s}' file 

>SampleName_ZN189A 
010000010000000000001000111000000001110000000010000011000010000000000001000000000000110000001000000000110000000000001110000010010011111000000100010000000001100000010000000101000000000100000010000011100011 
>SampleName_ZN189B 
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100 
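For readers who want to see what each clause does, the one-liner can be spelled out as below. This is only a sketch of the same logic with comments added. The file name demo.fa and its contents are made up for illustration, and the buffer variable is renamed from s to seq.

```shell
# demo.fa is a hypothetical two-record sample, not the data from the question.
printf '>ZN189A\n0100\n0011\n>ZN189B\n1100\n' > demo.fa

joined=$(awk '
  /^ *>/ {                       # header line: starts with ">" (optionally after spaces)
    if (seq != "") print seq     # flush the previous record, if there is one
    print                        # print the header itself
    seq = ""                     # start a new buffer
    next
  }
  { seq = seq $0 }               # data line: append it to the buffer without a newline
  END { if (seq != "") print seq }   # flush the last record
' demo.fa)

printf '%s\n' "$joined"
```

Because the header test only looks for a leading ">", it does not depend on what the sample names contain.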
+0

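Awesome! It works, but my names are actually very different from SampleName; they contain letters, numbers and "_". Is there a way to make awk recognise all of those characters instead of writing SampleName? – JM88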

+0

Do they all start with the '>' character? – anubhava

+0

Check the updated answer; hopefully it will work for you. – anubhava

0

Try this:

cat SampleName_ZN189A | tr -d '\r'
# tr -d deletes the specified character from the input (here '\r', the carriage return)

The same can be achieved with plain awk:

awk 'BEGIN{ORS=""} {print}' SampleName_ZN189A
# the output does not contain a line break at the end

If you want a line break at the end, this works:

awk 'BEGIN{ORS=""} {print} END{print "\r"}' SampleName_ZN189A
# select the correct line-break character, i.e. \r, \n, or \r\n, depending on the file format
+0

It did not work; besides, I have many files, each with lots of similar names. – JM88

+0

I have tried it and it works fine, and I did not understand your previous comment. – Fidel

2

tr with "-d" can delete the specified characters:

$ cat input.txt 
00110000001101000001011100000000000000000000010001 
00010000000000000010010000000000100100000001000000 
00000000000000000000000010000000000010111010000000 
01000110000000110000001010010000001111110101000000 
1100 
$ cat input.txt | tr -d "\n" 
001100000011010000010111000000000000000000000100010001000000000000001001000000000010010000000100000000000000000000000000000010000000000010111010000000010001100000001100000010100100000011111101010000001100 
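One caveat worth illustrating: tr -d takes no replacement set, which avoids the "empty string2" error from the question, but it removes every newline. Run on a whole file that contains '>' header lines, it would glue the headers onto the sequences as well. A small sketch with a made-up file (demo.fa is a hypothetical name) shows the effect:

```shell
# demo.fa is a hypothetical sample that includes header lines.
printf '>ZN189A\n0100\n0011\n>ZN189B\n1100\n' > demo.fa

# tr -d '\n' deletes all newlines, so headers and data end up on one line.
flat=$(tr -d '\n' < demo.fa)
printf '%s\n' "$flat"
```

For input like that, the header-aware awk and sed answers are probably the better fit.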
0

You can use this sed:

sed '/^>Sample/!{ :loop; N; /\n>Sample/{n}; s/\n//; b loop; }' file.txt 
1

Using awk:

awk '/>/{print (NR==1)?$0:RS $0;next}{printf $0}' file 

If you do not mind the result having an extra newline on the first line, here is a shorter one:

awk '{printf (/>/?RS $0 RS:$0)}' file 
1

This might work for you (GNU sed):

sed '/^\s*>/!{H;$!d};x;s/\n\s*//2gp;x;h;d' file 

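Build up the record in the hold space and, on encountering the start of the next record or the end of the file, remove the newlines and print it out.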
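The description above can be followed command by command in the sketch below, which is the same script written one command per line with comments. It assumes GNU sed, as the answer does, and the file demo.fa with its contents is a hypothetical sample.

```shell
# demo.fa is a hypothetical sample; GNU sed is assumed (for \s and the 2g flag).
printf '>A\n01\n10\n11\n>B\n00\n11\n' > demo.fa

out=$(sed '
  # Data line: append it to the hold space; unless it is the last
  # line of the file, end this cycle without printing anything.
  /^\s*>/!{
    H
    $!d
  }
  # Header line or last line: fetch the stored record into the pattern space.
  x
  # Keep the first newline (the one after the header), delete all
  # later newlines, and print the record if a substitution was made.
  s/\n\s*//2gp
  # Swap back and store the current line as the start of the next record.
  x
  h
  # Suppress the default output of the pattern space.
  d
' demo.fa)

printf '%s\n' "$out"
```

As far as I can tell, the p flag only fires when a substitution actually happens, so a record with a single data line (only one newline) would not be printed by this command. The data in the question has several lines per record, so it is not affected.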