2017-03-09

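I need help. I have been searching all day without finding a solution specific to my case: find and delete lines in a file based on a string, but keep the last occurrence.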

In a file:

Lots 
of 
other 
lines 
... 
... 
# [email protected] ..........1323 <- Do not include '# Client=HOSTNAME' 
# [email protected] ..........123123 <- Do not include '# Client=HOSTNAME' 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- delete line 
[email protected] ....rndChars.... <- keep last occurrence 
... 
... 
more 
lines 

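I want to find all lines matching "Client=" as shown above and delete each one except the last occurrence per hostname. The problem is that I never know in advance what the hostnames will be.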

The output should be:

Lots 
of 
other 
lines 
... 
... 
# [email protected] ..........1323 <- Do not include '# Client=HOSTNAME' 
# [email protected] ..........123123 <- Do not include '# Client=HOSTNAME' 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- keep last occurrence 
... 
... 
more 
lines 

Thanks in advance.

What have you tried so far?

Answers

0

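Perl to the rescue. Read the file twice: the first pass stores the line number of the last occurrence of each host in a hash, and the second pass prints a client line only if its line number matches the stored one.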

#!/usr/bin/perl 
use warnings; 
use strict; 

my $client_re = qr/Client=(.*?)@/; 

my $filename = shift; 

open my $IN, '<', $filename or die $!; 

my %lines; 
while(<$IN>) { 
    next if /^#/; 

    # Overwrite the line number if already present. 
    $lines{$1} = $. if /$client_re/; 
} 

seek $IN, 0, 0; # Rewind the file handle.
$. = 0;         # Restart the line counter.
while (<$IN>) {
    if (! /^#/ && (my ($hostname) = /$client_re/)) {
        print if $lines{$hostname} == $.; # Only print the stored line.
    } else {
        print;
    }
}
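
The same two-pass idea can also be sketched in awk by passing the file name twice. This is only a sketch, not the answer author's code: the function name `keep_last_two_pass` is hypothetical, and it assumes every client line contains `Client=HOSTNAME@` and that the input is a regular file (a pipe cannot be read twice).

```shell
# Hypothetical helper: keep only the last non-comment "Client=HOST@" line per host.
# Assumption: the hostname is the text between "Client=" and the next "@".
keep_last_two_pass() {
    awk '
        # Return the "Client=HOST@" key of a line, or "" for comments and other lines.
        function host(line) {
            if (line ~ /^#/ || !match(line, /Client=[^@]*@/)) return ""
            return substr(line, RSTART, RLENGTH)
        }
        # Pass 1 (first copy of the file): remember the last line number per host.
        NR == FNR { h = host($0); if (h != "") last[h] = FNR; next }
        # Pass 2 (second copy): print non-client lines and each last occurrence.
        { h = host($0); if (h == "" || last[h] == FNR) print }
    ' "$1" "$1"
}
```

It would be called as `keep_last_two_pass file > file.new`, after which the result can be inspected before replacing the original.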
0

Using tac & awk

tac file | awk '/^Client/{ if(!a[$1]){a[$1]++;print};next}1' | tac 

Output:

$ tac file | awk '/^Client/{ if(!a[$1]){a[$1]++;print};next}1' | tac 
Lots 
of 
other 
lines 
... 
... 
# [email protected] ..........1323 <- Do not include '# Client=HOSTNAME' 
# [email protected] ..........123123 <- Do not include '# Client=HOSTNAME' 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- keep last occurrence 
[email protected] ....rndChars.... <- keep last occurrence 
... 
... 
more 
lines 
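
To make the one-liner easier to follow, here is an expanded sketch of the same pipeline with comments. The function name `dedupe_keep_last` is hypothetical, and the sketch assumes GNU `tac` is available and that the first whitespace-separated field of a client line (`Client=HOSTNAME@...`) identifies the host, as the one-liner does.

```shell
# Hypothetical wrapper around the tac | awk | tac pipeline.
dedupe_keep_last() {
    # Reverse the file so the LAST occurrence of each client comes first.
    tac "$1" | awk '
        # Lines starting with "Client": print only the first time the key is seen.
        /^Client/ {
            if (!seen[$1]++) print
            next
        }
        # Everything else, including "# Client=..." comments, passes through.
        { print }
    ' | tac   # Reverse again to restore the original order.
}
```

Because the output goes to stdout, something like `dedupe_keep_last file > file.new && mv file.new file` would be one way to update the file; redirecting straight back onto the input file would truncate it before it is read.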
0
sed -r ':a;N;$!ba;:b;s/(.*)(Client=[^@]+\b)[^\n]+\n*(.*\2)/\1\3/;tb' file 