Hold buffer to rearrange text

I don't know a good way to do this (sed/awk/perl). I am combining multiple chapters of an HTML document, which has the following structure:

<a href="#chapter11">title</a> 
<a href="#chapter12">title</a> 
<a href="#chapter13">title</a> 
<p>first chapter contents, multiple 
pages</p> 
<a href="#chapter21">title</a> 
<a href="#chapter22">title</a> 
<a href="#chapter23">title</a> 
<p>Second chapter contents, multiple pages 
more informations</p> 
<a href="#chapter31">title</a> 
<a href="#chapter32">title</a> 
<a href="#chapter33">title</a> 
<p>Third chapter contents, multiple pages 
few more details</p>

I would like them reorganized as below:

<a href="#chapter11">title</a> 
<a href="#chapter12">title</a> 
<a href="#chapter13">title</a> 
<a href="#chapter21">title</a> 
<a href="#chapter22">title</a> 
<a href="#chapter23">title</a> 
<a href="#chapter31">title</a> 
<a href="#chapter32">title</a> 
<a href="#chapter33">title</a> 
<p>first chapter contents, multiple 
pages</p> 
<p>Second chapter contents, multiple pages 
more informations</p> 
<p>Third chapter contents, multiple pages 
few more details</p>

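I have an HTML document with five chapters to rearrange. I tried to use the sed hold buffer, but it is beyond my current knowledge. I am not limited to sed or awk. Any help would be appreciated, thanks.

Edit

Sorry for changing the source file: it also has a few lines that do not always start with either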

<a or <p

Is there any way to do an inverse selection in a sed script, something like

/^<a!/p/


2013-12-09 kuruvi

How about running sed twice, first to output the <a> tags and then the <p> tags:

sed -n '/^<a/p' input.txt 
sed -n '/^<p/p' input.txt

With the hold space it can be done like this:

sed -n '/^<a/p; /^<p/H; ${g; s/\n//; p}' input.txt

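This prints all <a> tags, puts all <p> tags into the hold space and, at the end of the document ($), fetches the hold space and prints it. H always adds a newline before appending to the hold space. We don't want that first newline, which is why s/\n// is used to remove it.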
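The leading newline that H introduces can be made visible with sed's l command, which prints the pattern space with newlines written as \n and a $ marking the end. A minimal sketch, using made-up input lines (one, two, three) instead of the HTML file:

```shell
# Append every line to the hold space, then dump it at the last line.
# `l` shows the buffer unambiguously: newline as \n, end of buffer as $.
raw=$(printf '%s\n' one two three | sed -n 'H; ${g; l;}')
printf '%s\n' "$raw"       # \none\ntwo\nthree$

# Same again, but delete the first newline before dumping.
trimmed=$(printf '%s\n' one two three | sed -n 'H; ${g; s/\n//; l;}')
printf '%s\n' "$trimmed"   # one\ntwo\nthree$
```

The first dump starts with \n because H appends a newline followed by the line, even when the hold space is still empty.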

If you want to store the output, you can redirect it:

sed -n '/^<a/p; /^<p/H; ${g; s/\n//; p}' input.txt > output.txt

To use sed -i directly, the code needs a little adjusting:

sed -i '${x; G; s/\n//; p}; /^<p/{H;d}' input.txt

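But this is a bit tedious.

If you have lines that start with other characters and just want to move all lines starting with an <a> tag to the front, you can do the following. Because it holds every line that does not start with <a, continuation lines such as pages</p> are kept as well, whereas the /^<p/ patterns above skip them.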

sed -n '/^<a/p; /^<a/! H; ${g; s/\n//; p}' input.txt
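As a sketch of how the inverse selection behaves, the command can be wrapped in a function and run on a shortened copy of the question's input. The function name links_first and the temporary sample file are made up for this illustration, and a ; is placed before the closing brace so that BSD sed accepts the script too:

```shell
# Hypothetical wrapper: print lines starting with <a first, then all other lines.
links_first() {
    sed -n '/^<a/p; /^<a/!H; ${g; s/\n//; p;}' "$1"
}

# Shortened sample input, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="#chapter11">title</a>
<p>first chapter contents, multiple
pages</p>
<a href="#chapter21">title</a>
<p>Second chapter contents
EOF

links_first "$sample"
# <a href="#chapter11">title</a>
# <a href="#chapter21">title</a>
# <p>first chapter contents, multiple
# pages</p>
# <p>Second chapter contents
```

One edge case to be aware of: if every line starts with <a, the hold space stays empty and the final p prints one empty line.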


2013-12-09 22:47:41 pfnuesel

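Works nicely after including a ; at the closing brace (because of Mac BSD sed), like (sed -n '/^ – kuruvi

Edited my answer. sed -i should also work, but the code needs some restructuring. – pfnuesel

Thanks pfnuessel, my source file also has a few lines that do not always start with – kuruvi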

grep works too:

(grep -F '<a' test.txt ; grep -F '<p' test.txt)


2013-12-09 22:48:17

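This did the job, but how can I write the output to a file? I am trying something like (grep -F' final.txt), but I am not getting the complete result. – kuruvi

Sorry, my source file lines do not always start with – kuruvi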
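On the two points raised in these comments, one possible approach is to redirect the whole command group rather than a single grep, and to use grep -v for the second pass so that lines starting with neither tag are still kept. This is a sketch under assumptions, not the answerer's method: the helper name reorder_to_file is made up, and the pattern '^ *<[aA]' (optional leading spaces, either letter case) is an assumption about what counts as a link line.

```shell
# Hypothetical helper: $1 is the input file, $2 the output file.
# The two must be different files, because the shell truncates $2
# before grep gets to read $1.
reorder_to_file() {
    {
        grep '^ *<[aA]' "$1"       # lines that start with an <a tag
        grep -v '^ *<[aA]' "$1"    # every remaining line, in original order
    } > "$2"
}

# Usage, with the file names mentioned in the comments above:
#   reorder_to_file test.txt final.txt
```

Redirecting only the last command of a group would leave the first grep writing to the terminal, which is one way to end up with an incomplete file.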

sed -n '/^ *<[aA]/ !H 
/^ *<[aA]/ p 
$ {x;s/\n//;p;} 
' YourFile

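If <a (matched a bit more precisely, and allowing upper- and lower-case variants) is not present at the start of the line, save the line to the buffer.

If it is present, print the line.

At the end, load the buffer, remove the first newline (the buffer is filled by appending, so the first held line is preceded by a newline), and print the content.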


2013-12-10 07:05:38 NeronLeVelu

Using awk

awk '{if ($0~/<a/) a[NR]=$0; else b[NR]=$0} END {for (i=1;i<=NR;i++) if (a[i]) print a[i];for (j=1;j<=NR;j++) if (b[j]) print b[j]}' file 
<a href="#chapter11">title</a> 
<a href="#chapter12">title</a> 
<a href="#chapter13">title</a> 
<a href="#chapter21">title</a> 
<a href="#chapter22">title</a> 
<a href="#chapter23">title</a> 
<a href="#chapter31">title</a> 
<a href="#chapter32">title</a> 
<a href="#chapter33">title</a> 
<p>first chapter contents, multiple 
pages</p> 
<p>Second chapter contents, multiple pages 
more informations</p> 
<p>Third chapter contents, multiple pages 
few more details</p>
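A single-pass variant of the same idea, offered as a sketch rather than as the command above: link lines are printed as soon as they are read, and everything else is collected in one string that is flushed in END. The function name links_first_awk is made up, and anchoring the pattern at the start of the line ('^ *<[aA]') is an assumption that differs from the unanchored /<a/ test used above.

```shell
# Hypothetical wrapper around a one-pass awk program.
links_first_awk() {
    awk '
        /^ *<[aA]/ { print; next }     # link line: emit immediately
        { rest = rest $0 ORS }         # any other line: append to a buffer
        END { printf "%s", rest }      # flush the buffer after the last line
    ' "$1"
}

# Usage:
#   links_first_awk file > reordered.html
```

Since the buffered lines are stored as text rather than tested for truthiness, empty lines and lines consisting of 0 are preserved.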


2013-12-10 07:43:17 Jotne
