Multiple pattern matching

I have a hash containing about 15,000 keys:

say %hash =(key1=>[x,yz],key2=>[o,p,e] ,key3......till keys 15000)

I also have a file (100 MB).

The hash keys are repeated on many lines of that file, so the file might contain:

key1 there is adog 
key2 there is cattt 
key1 there is man 
key3 there is elephant 
key2 etc...............

What I have come up with so far is:

foreach my $key (keys %hash)
{
    open (IN, $file) or die;
    while ($input = <IN>) {
        ($animal) = $input =~ /$key.*?there is (.*?)/i;
        # I want to match the last occurrence of the pattern, i.e. key1 there is **man**
    }
    push @array, $animal;
}

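As you can see, this works, but the script opens the file once for every key (15,000 times), so it takes a lot of time.

How can I optimize the code so that it takes considerably less time?

I also tried: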

my $stg=`grep -w $key /path/to/file |tail -1`;

But the grep command is still executed 15,000 times, which also takes a lot of time.

My question is how I can perform this operation faster.

Source

2014-03-12 user3411200

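As you read each line, simply overwrite the value stored for that line's key with the value from the current line. This needs only a single pass through your file.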

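my %refs;
open my $IN, '<', $file or die "Cannot open $file: $!";
while (my $input = <$IN>)
{
    # The first token on the line is the key; capture the text after "there is".
    my ($key, $animal) = $input =~ /^(\S+).*?there is\s+(.*?)\s*$/i or next;
    $refs{$key} = $animal;    # later lines overwrite earlier ones
}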

Now %refs contains the animal name from the last entry for each key:

foreach my $key (keys %refs)
{ 
    print "$key = $refs{$key}\n"; 
}
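
If only the keys that already exist in %hash should be collected, the single-pass idea above can be combined with a hash lookup. The following is a minimal sketch under stated assumptions, not a definitive implementation: the sample data, the extra line for key9, the contents of %hash, and the helper name last_matches are all hypothetical and chosen only for illustration, and the key is assumed to be the first whitespace-delimited token on each line.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for the 100 MB file.
my $data = <<'END';
key1 there is adog
key2 there is cattt
key1 there is man
key3 there is elephant
key9 there is fish
END

# Hypothetical stand-in for the 15,000-key hash; only the keys matter here.
my %hash = (key1 => ['x', 'yz'], key2 => ['o', 'p', 'e'], key3 => []);

# Return a hash ref mapping each wanted key to its last "there is ..." value.
sub last_matches {
    my ($fh, $wanted) = @_;
    my %last;
    while (my $line = <$fh>) {
        # Assumes the key is the first token; skip lines that do not match.
        my ($key, $animal) = $line =~ /^(\S+).*?there is\s+(.*?)\s*$/i or next;
        next unless exists $wanted->{$key};   # one hash lookup per line
        $last{$key} = $animal;                # later lines overwrite earlier ones
    }
    return \%last;
}

# An in-memory filehandle keeps the sketch self-contained; open a real path in practice.
open my $fh, '<', \$data or die "Cannot open in-memory data: $!";
my $last = last_matches($fh, \%hash);
close $fh;

# Hash order is not stable, so sort the keys before building the result list.
my @array = map { $last->{$_} } sort keys %$last;
print "$_ = $last->{$_}\n" for sort keys %$last;
```

With the sample data this prints key1 = man, key2 = cattt and key3 = elephant, and the key9 line is ignored because key9 is not in %hash. The file is read once, and each line costs one regex match plus one hash lookup instead of a scan per key.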

Source

2014-03-12 15:08:16
