Trimming a file using regex / sed

I have a file with several lines that look like this:

*wordX*-Sentence1.;Sentence2.;Sentence3.;Sentence4.

Any of the sentences may or may not contain wordX. What I want is to trim the file so that each line looks like this:

*wordX*-Sentence1.;Sentence2.

where Sentence3 is the first sentence that contains wordX.

How can I do this with sed/awk?

Edit:

Here is a sample file:

*WordA*-This sentence does not contain what i want.%Neither does this one.;Not here either.;Not here.;Here is WordA.;But not here. 
*WordB*-WordA here.;WordB here, time to delete everything.;Including this sentece. 
*WordC*-WordA, WordB. %Sample sentence one.;Sample Sentence 2.;Sample sentence 3.;Sample sentence 4.;WordC.;Discard this.

Here is the desired output:

*WordA*-This sentence does not contain what i want.%Neither does this one.;Not here either.;Not here. 
*WordB*-WordA here. 
*WordC*-WordA, WordB. %Sample sentence one.;Sample Sentence 2.;Sample sentence 3.;Sample sentence 4.

2013-05-08 figos

So if 'Sentence[n].' contains 'WordX', delete from there to the end of the line? – 2013-05-08 19:05:45

Yes, that is correct. – figos 2013-05-08 19:09:07

This task is better suited to awk. Use the following awk command:

awk -F ";" '/^ *\*.*?\*/ {printf("%s;%s\n", $1, $2)}' inFile

This assumes the word you want to match is always wrapped in asterisks (*).
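The command above always keeps the first two ;-separated fields. What follows is a minimal sketch of a more general variant, not the answerer's code: it reads the word from between the two asterisks on each line and stops at the first sentence that contains it. The function name trim_awk is hypothetical. The sketch assumes every line has the form *word*-sentence;sentence;... with the hyphen directly after the closing asterisk, and it uses a plain case-sensitive substring test.

```shell
# trim_awk: hypothetical helper name; reads lines on stdin, writes trimmed lines.
# Assumed line layout: *word*-sentence;sentence;...
trim_awk() {
  awk '{
    # Offset of the closing asterisk, counted from the second character.
    star = index(substr($0, 2), "*")
    # Lines that do not start with *word* are passed through unchanged.
    if (substr($0, 1, 1) != "*" || star == 0) { print; next }

    word = substr($0, 2, star - 1)           # text between the asterisks
    out  = substr($0, 1, star + 2)           # the "*word*-" prefix
    n    = split(substr($0, star + 3), s, ";")

    for (i = 1; i <= n; i++) {
      if (index(s[i], word)) break           # first sentence with the word: stop
      out = out (i > 1 ? ";" : "") s[i]      # otherwise keep the sentence
    }
    print out
  }'
}
```

Usage, with inFile as in the command above: trim_awk < inFile. For the second sample line this prints *WordB*-WordA here. Lines in which no sentence contains the word are printed unchanged.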

2013-05-08 19:08:39 anubhava

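How can I apply this to the whole file without having to specify manually what wordX is (the word between the *s)? – figos 2013-05-08 19:45:14

Is wordX always enclosed by two '*'? – anubhava 2013-05-08 19:48:14

Yes, and the word differs from line to line. – figos 2013-05-09 14:52:00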

This might work for you (GNU sed):

sed -r 's/-/;/;:a;s/^(\*([^*]+)\*.*);[^;]+\2.*/\1;/;ta;s/;/-/;s/;$//' file

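Convert the - that follows wordX into a ;. Delete the sentences containing wordX (working from the back of the line toward the front). Restore the original -. Delete the trailing ;.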
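The steps above can be sketched as a small shell function, split into one -e per step. This variant deviates from the one-liner in one place, which is an assumption about the intent: it uses [^;]* where the one-liner has [^;]+. With +, at least one character must come between the ; and the word, so a sentence that starts with the word (such as "WordB here, ..." in the sample file) is not matched. The function name trim_sed is hypothetical, and GNU sed is assumed.

```shell
# trim_sed: hypothetical wrapper name; reads lines on stdin (GNU sed assumed).
# Steps, in order:
#   1. turn the first "-" (the one after *word*) into ";"
#   2. loop: cut from the last ";sentence" that contains the word to the end
#   3. turn the first ";" back into "-"
#   4. drop the ";" left at the end of the line by the cut
trim_sed() {
  sed -E \
    -e 's/-/;/' \
    -e ':a' \
    -e 's/^(\*([^*]+)\*.*);[^;]*\2.*/\1;/' \
    -e 'ta' \
    -e 's/;/-/' \
    -e 's/;$//'
}
```

Usage: trim_sed < file. This sketch still assumes the word itself contains no hyphen, because step 1 rewrites the first - on the line.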

2013-05-08 21:07:29 potong

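Didn't work, here is the output: *WordA*-This sentence does not contain what i want.%Neither does this one. *WordB*-WordA here.;WordB here, time to delete everything.;Including this sentece. *WordC*-WordA, WordB. %Sample sentence one.;Sample Sentence 2.;Sample sentence 3.;Sample sentence 4.;WordC.;Discard this. – figos 2013-10-23 20:20:09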

sed -r -e 's/\.;/\n/g' \
     -e 's/-/\n/' \
     -e 's/^(\*([^*]*).*\n)[^\n]*\2.*/\1/' \
     -e 's/\n/-/' \
     -e 's/\n/.;/g' \
     -e 's/;$//'

(Edit: added the - to \n swap to handle a match in the first sentence.)
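A hedged variant of the newline-swap approach, wrapped in a function with the hypothetical name trim_sed_nl. It makes two changes that are assumptions of this sketch and not part of the answer as posted. A \* is added after the capture group so the captured word must be the full text between the asterisks. A :a ... ta loop repeats the cut, since the greedy .* otherwise cuts only at the last sentence containing the word. GNU sed is assumed, for \n in the pattern and in the replacement.

```shell
# trim_sed_nl: hypothetical wrapper name; reads lines on stdin (GNU sed assumed).
# Steps, in order:
#   1. replace every ".;" sentence separator with a newline
#   2. replace the first "-" (after *word*) with a newline as well
#   3. loop: cut from the last newline-delimited sentence containing the word
#   4. restore the "-" and the ".;" separators
#   5. drop the ";" left at the end of the line by the cut
trim_sed_nl() {
  sed -E \
    -e 's/\.;/\n/g' \
    -e 's/-/\n/' \
    -e ':a' \
    -e 's/^(\*([^*]*)\*.*\n)[^\n]*\2.*/\1/' \
    -e 'ta' \
    -e 's/\n/-/' \
    -e 's/\n/.;/g' \
    -e 's/;$//'
}
```

Usage: trim_sed_nl < file. A line whose first sentence contains the word, such as *WordB*-WordB first.;Rest., is reduced to *WordB*- which is the case the edit note above refers to.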

2013-05-09 15:06:53 jthill
