2016-01-05 17 views

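Filtering records out of a flat file (awk, sed, etc.)

I am looking to filter specific records out of a file. I thought the simplest way would be to use awk (or sed, etc.) as follows: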

for i in aaa bbb ccc; do awk '$i,/Record Closing String/' filename.txt >> output_file.txt; done 

This would be run against a file that looks something like this...

aaa this is just scrap text 

There is more scrap text. 
And even more scrap text with the identifier again: aaa. 

And even more scrap text. 
Record Closing String 

xaa this is just scrap text 

There is different scrap text. 
And even more scrap text with the identifier again: xaa. 

And even more scrap text. 
Record Closing String 

bbb this is just scrap text 

There is more slightly different scrap text. 
And even more different scrap text with the identifier again: bbb. 

And even more scrap text. 
Record Closing String 

ddd this is just scrap text 

There is different scrap text. 
And even more different scrap text with the identifier again: ddd. 

And even more scrap text. 
Record Closing String 

eee this is just scrap text 

There is different scrap text. 
And even more different scrap text with the identifier again: eee. 

And even more scrap text. 
Record Closing String 

ccc this is just scrap text 

There is different scrap text. 
And even more different scrap text with the identifier again: ccc. 

And even more scrap text. 
Record Closing String 

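However, my result set is larger than my original file (it appears to contain small portions of the original file many, many times)... Is there a command I can run to get a single copy of each record, from the first instance of my match string through the next Record Closing String? Basically, I want to go from the first matching line to the next Record Closing String (see below)...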

aaa this is just scrap text 

There is more scrap text. 
And even more scrap text with the identifier again: aaa. 

And even more scrap text. 
Record Closing String 

bbb this is just scrap text 

There is more slightly different scrap text. 
And even more different scrap text with the identifier again: bbb. 

And even more scrap text. 
Record Closing String 

ccc this is just scrap text 

There is different scrap text. 
And even more different scrap text with the identifier again: ccc. 

And even more scrap text. 
Record Closing String 

To get an answer to this, you may need to add more information: http://stackoverflow.com/help/mcve – Brian

Thanks, Brian. I plan to put together an example later today :). – Padawan

Added an example. Hope this helps! – Padawan

Answer

If you can use gawk (GNU awk), you can use a regular expression as the record separator, and this becomes very easy:

gawk -v RS='Record Closing String' '/aaa|bbb|ccc/' filename 

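Note that this finds every record that contains one of the identifiers anywhere in it, which may or may not be what you need. If necessary, you can use a more specific regex pattern (depending on what the real data looks like).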
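
As for why the loop in the question produces so much output: `$i` sits inside single quotes, so the shell never expands it. awk then reads `i` as its own unset variable, which makes `$i` the same as `$0`, and that is true for nearly every non-empty line, so each pass copies almost the whole file. If the identifier always begins the first line of a record, as it does in the sample, a line-oriented flag is one way to write the more specific pattern. The following is only a sketch under that assumption, and it swaps the regex record separator for a plain flag so it does not depend on gawk. `filter_records` is a made-up helper name.

```shell
# filter_records is a hypothetical helper name, not part of awk.
# $1: identifiers separated by "|" (e.g. 'aaa|bbb|ccc'); $2: input file.
# Assumes a record starts on a line whose first word is the identifier
# and ends on a line that begins with "Record Closing String".
filter_records() {
    awk -v ids="$1" '
        # Identifier must be the first word on the line.
        BEGIN { start = "^(" ids ")([ \t]|$)" }

        # Open a record only when no record is currently open.
        !inrec && $0 ~ start { inrec = 1 }

        # Copy every line of an open record, closing line included.
        inrec { print }

        # Close the record and leave a blank line after it.
        inrec && /^Record Closing String/ { inrec = 0; print "" }
    ' "$2"
}
```

It could be called as `filter_records 'aaa|bbb|ccc' filename.txt > output_file.txt`. A single pass replaces the loop, so each matching record is written once, and a later mention of the identifier inside a record does not start a second copy because the flag is already set.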