2016-12-13 43 views
1

How to extract the last block from a file

I have a file that contains many blocks like this:

==9673== 
==9673== HEAP SUMMARY: 
==9673==  in use at exit: 0 bytes in 0 blocks 
==9673== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9673== 
==9673== All heap blocks were freed -- no leaks are possible 
==9673== 
==9673== For counts of detected and suppressed errors, rerun with: -v 
==9673== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 
.... 
.... 
.... 
.... 

==9655== 
==9655== HEAP SUMMARY: 
==9655==  in use at exit: 0 bytes in 0 blocks 
==9655== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9655== 
==9655== All heap blocks were freed -- no leaks are possible 
==9655== 
==9655== For counts of detected and suppressed errors, rerun with: -v 
==9655== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

.... 
.... 
.... 

==9699== 
==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

I want to extract the last block, which starts with the line:

==XXXX== HEAP SUMMARY: 

So in my example I want to extract only the last block:

==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

How can I do this with bash?

+1

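[edit] your input to get rid of all of the `...`s and make it a concrete, testable example. The text between the blocks is just as important as the blocks. For example, if there really are blank lines between each block, then all you need is `awk -v RS= '{s=$0} END{print s}' file`, and if every block is 8 lines, all you need is `tail -8 file`, but I don't know whether either of those is really how your input is formatted.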

Answers

1

Using grep -zoP with a negative lookahead regex:

grep -zoP '==\w{4}== HEAP SUMMARY:(?![\s\S]*==\w{4}== HEAP SUMMARY:)[\s\S]*\z' file 

==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 
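  • -z makes grep treat the input as NUL-terminated records instead of newline-terminated lines, so the whole file is matched as a single string.
  • (?![\s\S]*==\w{4}== HEAP SUMMARY:) is a negative lookahead asserting that no other instance of the header appears further down in the file.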
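
To make the two points above concrete, here is a minimal sketch on a small made-up sample. The helper name last_block_grep and the sample contents are assumptions for illustration, and it presumes a GNU grep built with -P support. Because -z also makes grep terminate its output with a NUL byte, the sketch strips that byte with tr.

```shell
# Small sample file (hypothetical data modelled on the question).
sample=$(mktemp)
printf '%s\n' \
  '==9673==' \
  '==9673== HEAP SUMMARY:' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  'some program output' \
  '==9699==' \
  '==9699== HEAP SUMMARY:' \
  '==9699==  in use at exit: 0 bytes in 0 blocks' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' > "$sample"

# last_block_grep is a hypothetical helper name.
# -z: read the file as one NUL-terminated record so the regex can span lines.
# -o: print only the matched part, which grep then terminates with a NUL.
last_block_grep() {
  grep -zoP '==\w{4}== HEAP SUMMARY:(?![\s\S]*==\w{4}== HEAP SUMMARY:)[\s\S]*\z' "$1" |
    tr -d '\0'   # drop the trailing NUL that -z adds to the output
}

last_block_grep "$sample"
```

Note that \w{4} only matches four-character process IDs, so a file with longer IDs would need something like \d+ instead, and the whole file is held in memory as one record.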


1

If you have tac, this is probably the simplest:

$ tac file | awk '1; /==....== HEAP SUMMARY/{exit}' | tac 
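
For systems without tac, the same result can probably be had in a single forward pass. This is a sketch of a different technique (buffering in awk), not the command above; the helper name last_block_awk and the sample data are made up for illustration.

```shell
# Small sample file (hypothetical data modelled on the question).
sample=$(mktemp)
printf '%s\n' \
  '==9673==' \
  '==9673== HEAP SUMMARY:' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  'some program output' \
  '==9699==' \
  '==9699== HEAP SUMMARY:' \
  '==9699==  in use at exit: 0 bytes in 0 blocks' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' > "$sample"

# last_block_awk is a hypothetical helper name.
last_block_awk() {
  awk '
    /==....== HEAP SUMMARY/ { buf = ""; seen = 1 }  # new header: restart the buffer
    seen { buf = buf $0 ORS }                       # collect lines from the header onward
    END  { printf "%s", buf }                       # text from the last header to end of file
  ' "$1"
}

last_block_awk "$sample"
```

Like the tac pipeline, this prints everything from the last header to the end of the file, so any lines that follow the last block are included.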
1

If you know that a block is always 9 lines, you can simply use tail:

tail -n9 file 
1

With sed:

$ sed -n '/HEAP SUMMARY/{:a;/ERROR SUMMARY/bb;N;ba;:b;$p;d}' infile 
==9699== HEAP SUMMARY: 
==9699==  in use at exit: 0 bytes in 0 blocks 
==9699== total heap usage: 75,308 allocs, 75,308 frees, 7,099,382 bytes allocated 
==9699== 
==9699== All heap blocks were freed -- no leaks are possible 
==9699== 
==9699== For counts of detected and suppressed errors, rerun with: -v 
==9699== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) 

Here is how it works:

sed -n '     # Do not print lines at end of each cycle 
    /HEAP SUMMARY/ {  # If line matches "HEAP SUMMARY" 
     :a     # Label to jump back to 
     /ERROR SUMMARY/bb # If line matches "ERROR SUMMARY", jump to :b 
     N     # Append next line to pattern space 
     ba     # Jump to :a 
     :b     # Label to jump forward to 
     $p     # If we are on the last line, print pattern space 
     d     # Delete pattern space 
    } 
' infile 

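Each time HEAP SUMMARY is encountered, it reads all lines up to the next ERROR SUMMARY into the pattern space. It then checks whether the last line of the file has been reached; if so, the pattern space is printed, otherwise it is deleted.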
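
One consequence of the $p test is that nothing is printed when the file does not end exactly on an ERROR SUMMARY line. If that matters, the same HEAP SUMMARY to ERROR SUMMARY logic can be sketched in awk, remembering the most recent complete block and printing it at the end. This is an alternative technique, not the sed command above; the helper name last_complete_block and the sample data are assumptions.

```shell
# Sample with extra output after the last block (hypothetical data).
sample=$(mktemp)
printf '%s\n' \
  '==9673== HEAP SUMMARY:' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  'some program output' \
  '==9699== HEAP SUMMARY:' \
  '==9699==  in use at exit: 0 bytes in 0 blocks' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' \
  'trailing output' > "$sample"

# last_complete_block is a hypothetical helper name.
last_complete_block() {
  awk '
    /HEAP SUMMARY/  { cur = ""; inblk = 1 }              # header: start a fresh block
    inblk           { cur = cur $0 ORS }                 # collect lines while inside a block
    /ERROR SUMMARY/ { if (inblk) last = cur; inblk = 0 } # footer: remember the finished block
    END             { printf "%s", last }                # print the last finished block
  ' "$1"
}

last_complete_block "$sample"
```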

0

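If the last line of the file also carries the block number, this gets that number quickly (without reading the whole file to find out which number it is):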

n="$(tail -n1 infile | awk '{print $1}')" 

Then we can select all the lines at the end of the file that carry that block number:

tac infile | awk -vn="$n" '!($1~n){exit};1'| tac 
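
The two steps can be wrapped into one function. The sketch below is a slight variation: it compares the first field as a plain string ($1 != n) rather than as a regex, so characters in the tag are never interpreted as a pattern. The helper name last_block_by_tag and the sample data are hypothetical, and tac is assumed to be available.

```shell
# Small sample file (hypothetical data modelled on the question).
sample=$(mktemp)
printf '%s\n' \
  '==9673==' \
  '==9673== HEAP SUMMARY:' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  'some program output' \
  '==9699==' \
  '==9699== HEAP SUMMARY:' \
  '==9699==  in use at exit: 0 bytes in 0 blocks' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' > "$sample"

# last_block_by_tag is a hypothetical helper name.
last_block_by_tag() {
  # Tag on the last line of the file, e.g. "==9699==".
  n=$(tail -n 1 "$1" | awk '{ print $1 }')
  # Walk the file backwards and stop at the first line with a different tag.
  tac "$1" | awk -v n="$n" '$1 != n { exit } 1' | tac
}

last_block_by_tag "$sample"
```

On this sample the output starts with the bare ==9699== line that precedes the header, because that line carries the same tag.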
0

This might work for you (GNU sed):

sed '/HEAP SUMMARY:/h;//!H;$!d;x' file 

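On encountering HEAP SUMMARY:, replace whatever is in the hold space (HS) with the current line. For any other line, append the line to the HS. Delete every line except the last; on the last line, swap the pattern space (PS) with the HS and print the PS.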
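
A quick way to check that behaviour is to run the command on a small made-up file. The helper name last_block_hold and the sample data are assumptions; the sed script is the one from this answer. The sample deliberately has a line after the last block, to show that such lines are appended to the hold space as well and so end up in the output.

```shell
# Sample with one line after the last block (hypothetical data).
sample=$(mktemp)
printf '%s\n' \
  '==9673== HEAP SUMMARY:' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  'some program output' \
  '==9699== HEAP SUMMARY:' \
  '==9699==  in use at exit: 0 bytes in 0 blocks' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' \
  'trailing output' > "$sample"

# last_block_hold is a hypothetical helper name wrapping the sed script.
last_block_hold() {
  # h    : on a header line, overwrite the hold space with that line
  # //!H : on any other line, append the line to the hold space
  # $!d  : on every line except the last, print nothing
  # x    : on the last line, swap in the hold space, which is then printed
  sed '/HEAP SUMMARY:/h;//!H;$!d;x' "$1"
}

last_block_hold "$sample"
```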

0

Using the id/group number at the front of each line:

id=$(tail -n1 file | grep -Po '(?<=\=\=)[0-9]*') && grep "$id" file |tail -n+2
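
Since grep "$id" matches the number anywhere on a line, it can also pick up lines from other blocks that happen to contain the same digits. A hedged variation anchors the match to the ==id== tag at the start of the line. The helper name last_block_by_id and the sample data are made up, and like the original it assumes the block begins with exactly one bare tag line, which tail -n +2 drops.

```shell
# Sample where another block mentions the same number (hypothetical data).
sample=$(mktemp)
printf '%s\n' \
  '==9673==' \
  '==9673== HEAP SUMMARY:' \
  '==9673== total heap usage: 9699 allocs' \
  '==9673== ERROR SUMMARY: 0 errors from 0 contexts' \
  '==9699==' \
  '==9699== HEAP SUMMARY:' \
  '==9699== ERROR SUMMARY: 0 errors from 0 contexts' > "$sample"

# last_block_by_id is a hypothetical helper name.
last_block_by_id() {
  # Extract the digits between the leading "==" pair on the last line.
  id=$(tail -n 1 "$1" | sed -n 's/^==\([0-9][0-9]*\)==.*/\1/p')
  [ -n "$id" ] || return 1   # last line carries no tag: give up
  # Keep only lines tagged with that id, then drop the bare tag line.
  grep "^==${id}==" "$1" | tail -n +2
}

last_block_by_id "$sample"
```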