Another option is to use GNU awk for RT:
$ printf 'abc}def}ghi\n' | awk -v RS='}' '{ORS=(RT?"}\n":"")}1'
abc}
def}
ghi
With other awks:
$ printf 'abc}def}ghi\n' | awk -v RS='}' -v ORS='}\n' 'NR>1{print p} {p=$0} END{printf "%s",p}'
abc}
def}
ghi
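The portable version works by delaying output by one record: a record is printed (with ORS appending `}` and a newline) only once the next record has been read, and the final record is printed in END without the added `}`. A commented sketch of the same script, wrapped in a function named split_on_brace that is purely illustrative and not part of the original commands:

```shell
# split_on_brace is a hypothetical wrapper name used only for this illustration.
split_on_brace() {
    awk -v RS='}' -v ORS='}\n' '
        NR > 1 { print p }        # the held record was followed by "}", so emit it plus ORS
        { p = $0 }                # hold the current record until it is known whether more input follows
        END { printf "%s", p }    # last record: print as-is, with no "}" appended
    '
}

printf 'abc}def}ghi\n' | split_on_brace
```

Because the last record still carries the input's trailing newline, the output ends with a newline just as the input did.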
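I decided to test all of the currently posted solutions for functionality and execution time, using an input file generated by this command: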
$ awk 'BEGIN{for(i=1;i<=1000000;i++)printf "foo}"; print "foo"}' > file1m
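Before timing anything, it can be worth confirming that the generated input has the intended shape: a single line containing one `}` per repetition. A small sketch, where gen_input is a hypothetical parameterized version of the command above so the same check can be run on a tiny input:

```shell
# gen_input N: hypothetical helper that prints N copies of "foo}" followed by "foo" and a newline.
gen_input() {
    awk -v n="$1" 'BEGIN{for(i=1;i<=n;i++)printf "foo}"; print "foo"}'
}

gen_input 3                          # prints foo}foo}foo}foo
gen_input 3 | wc -l                  # counts newline-terminated lines
gen_input 3 | tr -cd '}' | wc -c     # counts } characters
```

For the real test file the same two counts, run against file1m, should be 1 and 1000000.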
and here's what I got:
1) awk (both of the awk scripts above had similar results):
$ time awk -v RS='}' '{ORS=(RT?"}\n":"")}1' file1m
Got the expected output, timing =
real 0m0.608s
user 0m0.561s
sys 0m0.045s
2)shell loop:
$ cat tst.sh
#!/bin/bash
# as long as there exists another } in the file, read up to it...
while IFS= read -r -d '}' piece; do
# ...and print that content followed by '}' and a newline.
printf '%s}\n' "$piece"
done
# print any trailing content after the last }
[[ $piece ]] && printf '%s\n' "$piece"
$ time ./tst.sh < file1m
Got the expected output, timing =
real 1m52.152s
user 1m18.233s
sys 0m32.604s
3)tr+sed:
$ time tr '}' '\n' < file1m | sed 's/$/}/'
Did not produce the expected output (due to an undesirable } at the end of the file), timing =
real 0m0.577s
user 0m0.468s
sys 0m0.078s
With a tweak to remove that final undesirable }:
$ time tr '}' '\n' < file1m | sed 's/$/}/; $s/}//'
real 0m0.718s
user 0m0.670s
sys 0m0.108s
4)fold+sed+tr:
$ time fold -w 1000 file1m | sed 's/}/}\n\n/g' | tr -s '\n'
Got the expected output, timing =
real 0m0.811s
user 0m1.137s
sys 0m0.076s
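One caveat with this approach, offered as an observation rather than something the timings show: fold inserts a newline after every 1000 characters, and tr -s '\n' only squeezes runs of newlines, so it cannot remove a newline that fold placed in the middle of a piece. The test input happens to avoid this because each foo} piece is 4 characters long and 1000 is a multiple of 4. The sketch below uses a hypothetical fold_split wrapper with the width as a parameter, and assumes GNU sed for \n in the replacement (as the command above does), to make the effect visible on a tiny input:

```shell
# fold_split WIDTH: hypothetical wrapper around the pipeline above, with the fold width as a parameter.
# Assumes GNU sed, which treats \n in the replacement text as a newline.
fold_split() {
    fold -w "$1" | sed 's/}/}\n\n/g' | tr -s '\n'
}

# Width 4 is a multiple of the piece length (4): every piece stays intact.
printf 'foo}foo}foo}foo\n' | fold_split 4

# Width 10 is not a multiple of 4: fold breaks the third piece into "fo" and "o}".
printf 'foo}foo}foo}foo\n' | fold_split 10
```

So for input whose pieces do not line up with the fold width, this pipeline may need a different way of chunking.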
5)split+sed+cat:
$ cat tst2.sh
mkdir tmp$$
pwd="$(pwd)"
cd "tmp$$"
split -b 1m "${pwd}/${1}"
sed -i 's/}/}\n/g' x*
cat x*
rm -f x*
cd "$pwd"
rmdir tmp$$
$ time ./tst2.sh file1m
Got the expected output, timing =
real 0m0.983s
user 0m0.685s
sys 0m0.167s
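The "expected output" checks above could be automated along these lines. This is only a sketch: gen_input, expected, check and the sol_* functions are hypothetical names introduced here, the input is kept small, and the comparison goes through command substitution, which drops trailing newlines.

```shell
# Hypothetical helpers for comparing a solution's output with the expected result.
gen_input() { awk -v n="$1" 'BEGIN{for(i=1;i<=n;i++)printf "foo}"; print "foo"}'; }
expected()  { awk -v n="$1" 'BEGIN{for(i=1;i<=n;i++)print "foo}"; print "foo"}'; }

# Candidate solutions copied from the commands above, each reading stdin.
sol_awk()          { awk -v RS='}' -v ORS='}\n' 'NR>1{print p} {p=$0} END{printf "%s",p}'; }
sol_tr_sed()       { tr '}' '\n' | sed 's/$/}/'; }
sol_tr_sed_fixed() { tr '}' '\n' | sed 's/$/}/; $s/}//'; }

# check NAME: run solution NAME on a 1000-piece input and compare with the expected output.
check() {
    if [ "$(gen_input 1000 | "$1")" = "$(expected 1000)" ]; then
        echo "$1: ok"
    else
        echo "$1: MISMATCH"
    fi
}

check sol_awk
check sol_tr_sed
check sol_tr_sed_fixed
```

Separating the correctness check from the timing run keeps the cost of the comparison out of the measured numbers.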
You might want to consider using 'jq' to process your file in a streaming fashion. – chepner