2012-08-17 150 views
-3

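Perl: string extraction based on pattern matching

FILE1 has several thousand lines that contain strings with the terminating pattern _Pattern1.

The second file also has several thousand lines that contain strings with the same terminating pattern _Pattern1.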

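I now have to:

  • Read FILE1 line by line

  • Find out whether the line contains any string terminated by _Pattern1

  • Extract that string and store it in a variable

  • Open FILE2 and read it line by line

  • Find out whether the line just read from FILE2 contains the string stored in the variable

How can this be done in Perl?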

EDIT2:

OK, after a bit of Googling and with the help of the links listed below, I solved my problem. Here is the code snippet.

#!/usr/bin/perl 
use strict; 
use warnings; 

my $OriginalHeader=$ARGV[0]; ## Source file 
my $GeneratedHeader=$ARGV[1];## File to compare against 
my $DeltaHeader=$ARGV[2]; ## File to store misses 

my $MatchingPattern="_Pos"; 
my $FoundPattern; 

open FILE1, $OriginalHeader or die $!; 
open FILE2, $GeneratedHeader or die $!; 
open (FILE3, ">$DeltaHeader") or die $!; 

my $lineFromOriginalHeader; 
my $lineFromGeneratedHeader; 
my $TotalMacrosExamined = 0; 
my $TotalMacrosMissed = 0; 

while($lineFromOriginalHeader=<FILE1>) 
{ 
if($lineFromOriginalHeader =~ /$MatchingPattern/) 
    { 
    my $index = index($lineFromOriginalHeader,$MatchingPattern); 

    my $BackIndex = $index; 
    my $BackIndexStart = $index; 

    $BackIndex = $BackIndex - 1; 

    ## Use this while loop to extract the substring. 
    while (1) 
    { 
     my $ExtractedChar = substr($lineFromOriginalHeader,$BackIndex,1); 
     if ($ExtractedChar =~ / /) ## stop at the space that precedes the macro name
     { 
     $FoundPattern = substr($lineFromOriginalHeader,$BackIndex + 1,$BackIndexStart + 3 - 
                       $BackIndex); 
     print "Identified $FoundPattern \n"; 
     $TotalMacrosExamined = $TotalMacrosExamined + 1; 
     ##Skip the next line 
     $lineFromOriginalHeader = <FILE1>; 
     last;  
     } 
    else 
    { 
     $BackIndex = $BackIndex - 1; 
    } 

    } ##while(1) 

## We now look for $FoundPattern in FILE2 
while ($lineFromGeneratedHeader = <FILE2>) 
{ 
    if (index($lineFromGeneratedHeader,$FoundPattern)!= -1) 
    { 
    ##Pattern found. Reset file pointer and break out of while loop 
    seek FILE2,0,0; 
    last; 
    } 
    else 
    { 
    if (eof(FILE2) == 1) 
     {   
     print FILE3 "Generated header misses $FoundPattern\n"; 
     $TotalMacrosMissed = $TotalMacrosMissed + 1; 
     seek FILE2,0,0; 
     last;  
     } 
    } 
} ##while ($lineFromGeneratedHeader)

} 
else 
{ 
    ##NOP 
} 
} ##while (linefromoriginalheader) 

close FILE1; 
close FILE2; 
close FILE3; 
print "Total number of bitfields examined = $TotalMacrosExamined\n"; 
print "Number of macros obsolete = $TotalMacrosMissed\n"; 
+1

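The steps you describe are pretty good. Are you new to Perl, or new to programming in general? If you are just new to Perl, almost everything you describe can be found at http://perldoc.perl.org/perlintro.html. Once you have some code to show, we can help you with the tricky parts. – DavidO 2012-08-17 04:56:35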

+0

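There are many ways to do this; here is one: `$ perl -ne'exec q;perl;,"-ne",q$print(/\Q$.$1.q;/?"$. YES":$..q\;NO\;);;"file2" if m;^(.*)_pat1;' file1` This should do the trick, minus a few gotchas. I don't know if this compiles, but I like the look of it. Note the use of `exec` as a loop terminator. I don't even *have* to assign to a variable :-) There are, of course, simpler ways. [What have you tried?](http://whathaveyoutried.com) – amon 2012-08-17 05:32:33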

+0

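When using regular expressions, we can declare *capture groups* by enclosing whatever we want to capture in parens. Your regex would look like `/($MatchingPattern)/`. The contents of the group will be in the special variable `$1` until you perform another regex match. The [Perl regex tutorial](http://perldoc.perl.org/perlretut.html#Extracting-matches) may come in handy while learning Perl regexes. – amon 2012-08-17 06:20:06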
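The capture-group advice in the comment above can be sketched roughly as follows. This is only an illustration: the helper name `extract_pos_name` and the sample lines are made up, and the pattern `\S*_Pos` assumes the wanted string is a run of non-whitespace characters ending in `_Pos`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns the
# non-whitespace token that ends in _Pos, or undef if the line has none.
sub extract_pos_name {
    my ($line) = @_;
    if ($line =~ /(\S*_Pos)/) {
        return $1;    # contents of the first capture group
    }
    return;           # no match: undef in scalar context
}

# Sample lines invented for illustration.
for my $line ("#define Something_Pos 4\n", "#define Something_Msk 0xF0\n") {
    my $name = extract_pos_name($line);
    print defined $name ? "captured: $name\n" : "no match\n";
}
```

Reading `$1` immediately after the match, inside the `if`, avoids picking up a stale value left over from an earlier successful match.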

Answers

0

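Having programmed in C all my life, I googled how to use the Perl constructs below and wrote a C-like program. It works flawlessly for me. :-)

Edit: This is to clarify why I have to skip a line in the algorithm below. The pattern that is retrieved and later searched for in the second file occurs on two consecutive lines, so it is enough to reliably detect its first occurrence. Also, to pre-empt a nitpick: the substring containing the pattern is always guaranteed to be the second substring on the line.

e.g. #define Something_Pos (some value)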

#!/usr/bin/perl 
use strict; 
use warnings; 

my $OriginalHeader=$ARGV[0]; 
my $GeneratedHeader=$ARGV[1]; 
my $DeltaHeader=$ARGV[2]; 

my $MatchingPattern="_Pos"; 
my $FoundPattern; 

open FILE1, $OriginalHeader or die $!; 
open FILE2, $GeneratedHeader or die $!; 
open (FILE3, ">$DeltaHeader") or die $!; 

my $lineFromOriginalHeader; 
my $lineFromGeneratedHeader; 
my $TotalMacrosExamined = 0; 
my $TotalMacrosMissed = 0; 

while($lineFromOriginalHeader=<FILE1>) 
{ 
if($lineFromOriginalHeader =~ /$MatchingPattern/) 
{ 
    my $index = index($lineFromOriginalHeader,$MatchingPattern); 

    my $BackIndex = $index; 
    my $BackIndexStart = $index; 

    $BackIndex = $BackIndex - 1; 

    ## Use this while loop to extract the substring. 
    while (1) 
    { 
    my $ExtractedChar = substr($lineFromOriginalHeader,$BackIndex,1); 
    if ($ExtractedChar =~ / /) ## stop at the space that precedes the macro name
    { 
    $FoundPattern = substr($lineFromOriginalHeader,$BackIndex + 1,$BackIndexStart + 3 - 
                       $BackIndex); 
    print "Identified $FoundPattern \n"; 
    $TotalMacrosExamined = $TotalMacrosExamined + 1; 
    ##Skip the next line 
    $lineFromOriginalHeader = <FILE1>; 
    last;  
    } 
    else 
    { 
    $BackIndex = $BackIndex - 1; 
    } 

} ##while(1) 

## We now look for $FoundPattern in FILE2 
while ($lineFromGeneratedHeader = <FILE2>) 
{ 
##print "Read the following line from FILE2: $lineFromGeneratedHeader\n"; 

    if (index($lineFromGeneratedHeader,$FoundPattern)!= -1) 
    { 
    ##Pattern found. Reset file pointer and break out of while loop
    seek FILE2,0,0; 
    last; 
    } 
    else 
    { 
    if (eof(FILE2) == 1) 
     {   
     print FILE3 "Generated header misses $FoundPattern\n"; 
     $TotalMacrosMissed = $TotalMacrosMissed + 1; 
     seek FILE2,0,0; 
     last;  
     } 
    } 
} ##while ($lineFromGeneratedHeader)

} 
else 
{ 

} 
} ##while (linefromoriginalheader) 

close FILE1; 
close FILE2; 
close FILE3; 
print "Total number of bitfields examined = $TotalMacrosExamined\n"; 
print "Number of macros obsolete = $TotalMacrosMissed\n"; 
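Since the answer states that the wanted substring is always the second one on the line, the backward character scan could alternatively be sketched with `split`. This is an assumption-laden illustration, not the author's method: the helper name `second_field_with_suffix` is hypothetical, and fields are assumed to be separated by whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes the second whitespace-separated field of a
# line and returns it up to and including the suffix, or undef if the
# field is missing or does not contain the suffix.
sub second_field_with_suffix {
    my ($line, $suffix) = @_;

    # split ' ' splits on runs of whitespace and skips leading whitespace.
    my @fields = split ' ', $line;
    return unless defined $fields[1];

    # \Q...\E quotes any regex metacharacters in $suffix.
    if ($fields[1] =~ /^(\S*?\Q$suffix\E)/) {
        return $1;
    }
    return;
}

# Sample line invented for illustration.
my $name = second_field_with_suffix("#define Something_Pos 4\n", "_Pos");
print "Identified $name\n" if defined $name;
```

Like the original loop, this keeps everything from the start of the field through the suffix and drops whatever follows it.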
+0

This code looks like it works, but I have added an answer that takes this code and makes it more Perlish. http://stackoverflow.com/a/12012931/468327 – 2012-08-17 21:09:06

0

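This is just a first pass at making your code more Perlish. There is actually a lot more that could be done, including the fact that $some_var is usually used in Perl rather than $SomeVar, but I did not go that far.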

#!/usr/bin/perl 
use strict; 
use warnings; 

my ($OriginalHeader, $GeneratedHeader, $DeltaHeader) = @ARGV; 
my $MatchingPattern=qr/(\S*_Pos)/; # all non-whitespace terminated by _Pos 

open my $file1, '<', $OriginalHeader or die $!; 
open my $file2, '<', $GeneratedHeader or die $!; 
open my $file3, '>', $DeltaHeader  or die $!; 

my $TotalMacrosExamined = 0; 
my $TotalMacrosMissed = 0; 

while(my $lineFromOriginalHeader=<$file1>) { 
    next unless $lineFromOriginalHeader =~ $MatchingPattern; 
    my $FoundPattern = $1; # matched string 

    print "Identified $FoundPattern \n"; 
    $TotalMacrosExamined++; 

    ##Skip the next line 
    <$file1>; 

    ## We now look for $FoundPattern in FILE2 
    my $match_found = 0; 
    while (my $lineFromGeneratedHeader = <$file2>) {
        if (index($lineFromGeneratedHeader,$FoundPattern)!= -1) {
            ##Pattern found. Record it and break out of while loop
            $match_found++;
            last;
        }
    }

    unless ($match_found) {
        print $file3 "Generated header misses $FoundPattern\n";
        $TotalMacrosMissed++;
    }

    seek $file2,0,0; 

} 

print "Total number of bitfields examined = $TotalMacrosExamined\n"; 
print "Number of macros obsolete = $TotalMacrosMissed\n";
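
One of the further simplifications hinted at above might be to avoid rescanning the second file for every name. The sketch below swaps in a different technique from the answer's: it assumes the generated header fits in memory as a single string, and the helper name `missing_names` and the sample data are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given an array reference of names and the whole
# generated header as one string, returns the names that occur nowhere
# in that string (plain substring search, as with index() per line).
sub missing_names {
    my ($names, $generated_text) = @_;
    return grep { index($generated_text, $_) == -1 } @$names;
}

# Made-up sample data standing in for the two header files.
my @names     = ('Foo_Pos', 'Bar_Pos');
my $generated = "#define Foo_Pos 0\n#define Foo_Msk 1\n";

my @missing = missing_names(\@names, $generated);
print "Generated header misses $_\n" for @missing;
print "Number of macros obsolete = ", scalar(@missing), "\n";
```

With a real file, the string could be read in one go with `my $generated = do { local $/; <$file2> };`, which removes the need for `seek` between searches. Because the extracted names contain no newline, searching the whole text gives the same result as searching line by line.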