Comparing 2 processed keys in 2 hashes

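I want to read in a file that contains symbols such as "!" and "^", and strip them before comparing each string with the strings from another file. If two strings are identical after the symbols are removed, I want to store them in another hash called "common". For example... FILEA: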

hello!world 
help?!3233 
oh no^!! 
yes!

FILEB:

hello 
help? 
oh no 
yes

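In this case FILEA and FILEB should be considered the same, because I only compare characters up to the point where "!" or "^" appears. I read the file with the following code: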

open FILEA, "< script/".$fileA or die; 
my %read_file; 
while (my $line=<FILEA>) { 
    (my $word1,my $word2) = split /\n/, $line; 
    $word1 =~ s/(!.+)|(!.*)|(\^.+)|(\^.*)//;#to remove ! and^
    $read_file{$word1} = $word1; 
} 
close(FILEA);

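When I print the keys of the hash, it shows the correct result (i.e. FILEA is converted to "hello", "help?", "oh no", "yes"). However, when I compare FILEA and FILEB using the code below, it always fails.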

while(($key,$value)=each(%config)) 
{ 
    $num=keys(%base_config); 
    $num--;#to get the correct index 
    while($num>=0) 
    { 
     $common{$value}=$value if exists $read_file{$key};#stored the correct matches in %common 
     $num--; 
    } 
}

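I tested my substitution and the comparison between two strings with the sample below, and it works. I don't know why it fails when the strings are read from a file into the hash.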

use strict; 
use warnings; 

my $str="hello^vsd"; 
my $test="hello"; 
$str =~ s/(!.+)|(!.*)|(\^.+)|(\^.*)//; 
my %hash=(); 
$hash{$str}=(); 
foreach my $key(keys %hash) 
{ 
    print "$key\n"; 
} 
print "yay\n" if exists $hash{$test}; 
print "boo\n" unless exists $hash{$test};

The two files may have different numbers of lines, and the lines need not be in the same order when searching, i.e. "oh no" may come before "hello".

2012-02-23 Sakura

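Can you assume that each line in file A should only be compared with the corresponding line in file B? And that file A and file B have the same number of lines? – ardnew 2012-02-23 03:20:10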

No. FileA and FileB can have different numbers of lines, and the lines need not be in the same order. – Sakura 2012-02-23 04:05:07

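You can use a regex character class, s/[?^]//g, to remove ^ and ?. Note that ^ must not be the first character in the class (putting it last is the simplest way), or you need to escape it. (Escaping it is probably safer, so that characters you add later do not end up negated.)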
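As a minimal sketch of the caret rule (the helper names `strip_marks` and `negated_by_mistake` and the sample strings are made up for illustration): an escaped caret is always matched literally, while an unescaped caret in first position negates the whole class.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every literal ? and ^ from a string.
sub strip_marks {
    my ($s) = @_;
    $s =~ s/[?\^]//g;    # escaped caret: literal wherever it sits
    return $s;
}

# Caret first and unescaped: the class is negated, so this deletes
# every character that is NOT a question mark.
sub negated_by_mistake {
    my ($s) = @_;
    $s =~ s/[^?]//g;
    return $s;
}

print strip_marks("help?^x"), "\n";           # helpx
print negated_by_mistake("help?^x"), "\n";    # ?
```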

I process all the files, using a hash to record which files each word appears in.

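To tell the files apart I use 2**(file index), which gives the values 2**0 = 1, 2**1 = 2, 2**2 = 4, and so on. The sum of these values shows which files a string belongs to. If a string exists in all the files, its value equals the sum over all files, so with two files 3 (1 + 2) means it is in both files, 1 means FileA only, and 2 means FileB only. You check this with a bitwise AND (&).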

Edit: added test cases

my @files = qw(FileA.txt FileB.txt); 
my %words; 
foreach my $i (0 .. $#files) { 
    my $file = $files[$i]; 
    open(FILE,$file) or die "Error: missing file $file\n$!\n"; 
    while (<FILE>) { 
     chomp; 
     next if /^$/; 
     my ($word) = split /[!\^]/; 
     $word =~ s/[?\^]//g; # removes ^ and ?
     $words{$word} += 2**$i; 
    } 
    close(FILE); 
} 

my %common; 
foreach my $key (sort keys %words) { 
    my @found; 
    foreach my $i (0 .. $#files) { 
     if ($words{$key} & 2**$i) { push @found, $files[$i] } 
    } 
    if ($words{$key} & 2**$#files) { $common{$key}++ } 
    printf "%10s %d: @found\n",$key,$words{$key}; 
} 

my @tests = qw(hello^vsd chuck help? test marymary^); 
print "\nTesting Words: @tests\n"; 
foreach (@tests) { 
    my ($word) = split /[!\^]/; 
    $word =~ s/[?\^]//g; # removes ^ and ?
    if (exists $common{ $word }) { 
     print "Found: $word\n"; 
    } 
    else { 
     print "Cannot find: $word\n"; 
    } 
}

Output:

    bahbah 2: FileB.txt
   chucker 1: FileA.txt
     hello 3: FileA.txt FileB.txt
      help 3: FileA.txt FileB.txt
  marymary 2: FileB.txt
     oh no 3: FileA.txt FileB.txt
      test 1: FileA.txt
       yes 3: FileA.txt FileB.txt

Testing Words: hello^vsd chuck help? test marymary^ 
Found: hello 
Cannot find: chuck 
Found: help 
Cannot find: test 
Found: marymary
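
Note that the test `$words{$key} & 2**$#files` looks only at the bit of the last file, which is why marymary (present in FileB.txt only) is reported as found above. If the intent is "present in every file", one possible variant is sketched below. It assumes made-up in-memory lines instead of the real files, and the names `normalize` and `common_words` are hypothetical. Bits are set with `|=` so a word repeated within one file is not counted twice, and the result is compared against the full mask.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the file contents: one array of lines per file.
my @files = (
    [ 'hello!world', 'help?!3233', 'test', 'hello!again' ],    # "FileA"
    [ 'hello', 'help?', 'marymary^' ],                          # "FileB"
);

# Normalize a line the same way as the code above: cut at ! or ^, drop ?.
sub normalize {
    my ($line) = @_;
    my ($word) = split /[!\^]/, $line;
    $word = '' unless defined $word;    # line that was empty
    $word =~ s/[?\^]//g;
    return $word;
}

# Return the words whose bitmask has the bit of every file set.
sub common_words {
    my @sets = @_;
    my %words;
    for my $i (0 .. $#sets) {
        $words{ normalize($_) } |= 2**$i for @{ $sets[$i] };
    }
    my $all = 2**@sets - 1;             # e.g. 3 for two files
    my @common = grep { $words{$_} == $all } sort keys %words;
    return @common;
}

print join(", ", common_words(@files)), "\n";    # hello, help
```

With this check, marymary (mask 2) and test (mask 1) are both rejected, because only a mask of 3 counts as common.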

2012-02-23 03:19:47 Rich

Here is another solution that reads both files simultaneously (assuming the two files have the same number of lines):

use strict; 
use warnings; 

our $INVALID = '!\^'; #regexp character class, must escape 

my $fileA = "file1.txt"; 
my $fileB = "file2.txt"; 

sub readl 
{ 
    my $fh = shift; 
    my $ln = ""; 

    if ($fh and $ln = <$fh>)
    {
        chomp $ln;
        $ln =~ s/[$INVALID]+.*//g;
    }

    return $ln; 
} 

my ($fhA, $fhB); 
my ($wdA, $wdB); 
my %common =(); 

open $fhA, "<", $fileA or die "$fileA: $!\n";
open $fhB, "<", $fileB or die "$fileB: $!\n";

while ($wdA = readl($fhA) and $wdB = readl($fhB)) 
{ 
    $common{$wdA} = undef if $wdA eq $wdB; 
} 

print "$_\n" foreach keys %common;

Output

comparefiles$ cat file1.txt
hello!world 
help?!3233 
oh no^!! 
yes! 

comparefiles$ cat file2.txt
hello 
help? 
oh no 
yes 

comparefiles$ perl comparefiles.pl
yes 
oh no 
hello 
help?

2012-02-23 03:44:22 ardnew

Start by packaging the reusable pieces into subroutines:

sub read_file {
    open my $fh, "<", $_[0] or die "read_file($_[0]) error: $!";
    # lexical handles auto-close when they fall out of scope
    # and detailed error messages are good
    my %file;
    while (my $line = <$fh>) {
        chomp $line;             # remove newline
        $line =~ s{[!^].*}{};    # remove everything starting from ! or ^
        $file{$line}++;
    }
    \%file
}

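read_file takes an input file name and returns a reference to a hash of the line segments that precede any ! or ^ character. Each line segment is a key, and its value is the number of times it appears.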

Using this, the next step is to find which lines match between the files:

my ($fileA, $fileB) = map {read_file $_} your_file_names_here(); 

my %common; 
$$fileA{$_} and $common{$_}++ for keys %$fileB; 

print "common: $_\n" for keys %common;

which will print:

 
common: yes 
common: oh no 
common: hello 
common: help?

You can define your_file_names_here as follows if you want to test it:

sub your_file_names_here {\(<<'/A', <<'/B')}
hello!world
help?!3233
oh no^!!
yes!
/A
hello
help?
oh no
yes
/B

2012-02-23 03:44:28

Hi. I have just started learning Perl and don't quite understand "my ($fileA, $fileB) = map {read_file $_} your_file_names_here();". Could you explain further? – Sakura 2012-02-23 08:42:48

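'map' applies a transformation to a list and returns the transformed list. It is the same as 'my $fileA = read_file('filea.txt'); my $fileB = read_file('fileb.txt');' if the list '('filea.txt', 'fileb.txt')' is returned by the 'your_file_names_here' placeholder. – 2012-02-23 13:57:12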
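
To make that equivalence concrete, here is a small sketch. `fake_read` is a made-up stand-in for read_file that needs no files on disk, and the file names are only illustrative.

```perl
use strict;
use warnings;

# Hypothetical stand-in for read_file: returns a hash reference
# describing the name it was given.
sub fake_read {
    my ($name) = @_;
    return { name => $name, chars => length($name) };
}

# map runs the block once per list element, with $_ set to that
# element, and collects the results into a new list.
my ($fileA, $fileB) = map { fake_read($_) } ('filea.txt', 'fileb.txt');

# The same thing written out by hand:
my $byHandA = fake_read('filea.txt');
my $byHandB = fake_read('fileb.txt');

print "$fileA->{name} $fileB->{name}\n";    # filea.txt fileb.txt
```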

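First we have to normalize your input. The code below creates a hash for each path. For each line in a given file, it removes everything from the first ! or ^ character onward and records the line's presence.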

sub read_inputs {
    my @result;

    foreach my $path (@_) {
        my $data = {};

        open my $fh, "<", $path or die "$0: open $path: $!";
        while (<$fh>) {
            chomp;
            s/[!^].*//; # don't put the caret first without escaping!
            ++$data->{$_};
        }

        push @result, $data;
    }

    wantarray ? @result : \@result;
}

Computing the intersection of two arrays is covered in the Data Manipulation section of the Perl FAQ list. Adapting the technique to your situation, we want to know the lines common to all inputs.

sub common { 
    my %matches; 
    for (@_) {
        ++$matches{$_} for keys %$_;
    }

    my @result = grep $matches{$_} == @_, keys %matches; 
    wantarray ? @result : \@result; 
}

Tie it together with

my @input = read_inputs "FileA", "FileB"; 
my @common = common @input; 
print "$_\n" for sort @common;

which gives

hello 
help? 
oh no 
yes
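
Because common() counts how many inputs contain each key and keeps the keys whose count equals the number of inputs, the same idea should carry over to more than two files. A minimal sketch, assuming three made-up, already-normalized hashes in place of real files (`common_keys` repeats the counting logic under a hypothetical name so the example runs on its own):

```perl
use strict;
use warnings;

# Same counting idea as common(): a key is kept only if every
# input hash contains it.
sub common_keys {
    my @hashes = @_;
    my %seen;
    for my $h (@hashes) {
        ++$seen{$_} for keys %$h;
    }
    my @result = grep { $seen{$_} == @hashes } sort keys %seen;
    return @result;
}

# Made-up data standing in for three normalized files.
my %first  = (hello => 1, 'oh no' => 2, yes => 1);
my %second = (yes => 1, hello => 3, extra => 1);
my %third  = (hello => 1, yes => 1, 'oh no' => 1);

print "$_\n" for common_keys(\%first, \%second, \%third);    # hello, then yes
```

The counting works because each hash contributes a given key at most once, so a count equal to the number of inputs means the key is present in all of them.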

2012-12-06 13:12:36
