Find keys missing from a file, given a list of keys in another file

I have a file, keys.txt, that lists every key (one per line), for example:

VIEW_ACCOUNT_NAME_LABEL 
VIEW_ACCOUNT_NAME_DESCR 
VIEW_ACCOUNT_STREET_LABEL 
VIEW_ACCOUNT_CITY_SUBURB_LABEL 
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL 
VIEW_ACCOUNT_COUNTRY_LABEL

I also have various matching language files that provide a value for each key. For example, en-GB.view.acccount.ini has one entry per line, like this:

VIEW_ACCOUNT_NAME_LABEL="Name:" 
VIEW_ACCOUNT_NAME_DESCR="Name of the account holder." 
VIEW_ACCOUNT_STREET_LABEL="Street:" 
VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:" 
VIEW_ACCOUNT_ZIP="Zip Code" 
VIEW_ACCOUNT_COUNTRY_LABEL="Country"

Note: there are many key files and language files, and the real files have many more entries, typically over 1000.

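I need to be able to find:

which keys are missing from the language file (e.g. VIEW_ACCOUNT_ZIP_POSTCODE_LABEL)
which keys are in the language file but not in the key file (usually obsolete keys such as VIEW_ACCOUNT_ZIP)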

For the first requirement I tried grep with the -v (invert match) option, but the result was not what I expected:

cppl ~ grep -v --file=keys.txt en-GB.view.acccount.ini 
VIEW_ACCOUNT_NAME_LABEL="Name:" 
VIEW_ACCOUNT_NAME_DESCR="Name of the account holder." 
VIEW_ACCOUNT_STREET_LABEL="Street:" 
VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:" 
VIEW_ACCOUNT_ZIP="Zip Code" 
cppl ~


2013-07-02 Craig

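You can do this with the standard UNIX tools join and uniq. Here is one way.

I assume your keys file is named file1 in the examples below.

First, generate a file that contains only the keys, without the values.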

sed 's/=.*//' en-GB.view.acccount.ini > file2

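You now have file1 and file2, each containing only keys. Note that join expects both inputs to be sorted on the join field, so sort them first if they are not already. For this example: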

$ cat file1 
A 
B 
C 
D 

$ cat file2 
C 
D 
E

You can now use a combination of join, sort and uniq to get the output you need.

# Keys which are common to both files. 
$ join file1 file2 | cat - file1 | sort | uniq -d 
C 
D 

# Keys in file1 but not in file2 
$ join file1 file2 | cat - file1 | sort | uniq -u 
A 
B 

# Keys in file2 but not in file1 
$ join file1 file2 | cat - file2 | sort | uniq -u 
E
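To tie this back to the files in the question, here is a minimal sketch that runs the same pipelines on a small copy of the sample data. The scratch directory and the `missing`/`obsolete` variable names are assumptions made for illustration. Both inputs are passed through `sort -u` under `LC_ALL=C`, because join needs sorted input and the `uniq -u` trick assumes no key is repeated within a file.

```shell
#!/bin/sh
# Sketch only: sample data is recreated in a throwaway directory (an assumption).
LC_ALL=C; export LC_ALL          # keep sort and join on the same collation
work=$(mktemp -d)
cd "$work" || exit 1

cat > keys.txt <<'EOF'
VIEW_ACCOUNT_NAME_LABEL
VIEW_ACCOUNT_NAME_DESCR
VIEW_ACCOUNT_STREET_LABEL
VIEW_ACCOUNT_CITY_SUBURB_LABEL
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL
VIEW_ACCOUNT_COUNTRY_LABEL
EOF

cat > en-GB.view.acccount.ini <<'EOF'
VIEW_ACCOUNT_NAME_LABEL="Name:"
VIEW_ACCOUNT_NAME_DESCR="Name of the account holder."
VIEW_ACCOUNT_STREET_LABEL="Street:"
VIEW_ACCOUNT_CITY_SUBURB_LABEL="City/Suburb:"
VIEW_ACCOUNT_ZIP="Zip Code"
VIEW_ACCOUNT_COUNTRY_LABEL="Country"
EOF

# file1: expected keys; file2: keys actually present in the language file.
sort -u keys.txt > file1
sed 's/=.*//' en-GB.view.acccount.ini | sort -u > file2

# Keys in file1 but not in file2 (missing from the language file).
missing=$(join file1 file2 | cat - file1 | sort | uniq -u)
# Keys in file2 but not in file1 (obsolete entries in the language file).
obsolete=$(join file1 file2 | cat - file2 | sort | uniq -u)

echo "Missing:  $missing"
echo "Obsolete: $obsolete"
```

With this sample data the first variable should hold VIEW_ACCOUNT_ZIP_POSTCODE_LABEL and the second VIEW_ACCOUNT_ZIP.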


2013-07-02 07:57:31

Use comm.

To find which keys are missing from the language file:

$ comm -23 <(sort keys.txt) <(cut -d= -f1 en-GB.view.acccount.ini | sort) 
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL

To find which keys are in the language file but not in the key file:

$ comm -13 <(sort keys.txt) <(cut -d= -f1 en-GB.view.acccount.ini | sort) 
VIEW_ACCOUNT_ZIP
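Since the question mentions many language files, the two comm calls can be wrapped in a loop. The following is a rough sketch, not a finished tool: the shortened sample data, the fr-FR file, the scratch directory and the `report` variable are all hypothetical. It uses temporary files instead of process substitution so that it also runs under a plain POSIX sh.

```shell
#!/bin/sh
# Sketch only: checks every *.ini file in a throwaway directory against keys.txt.
LC_ALL=C; export LC_ALL          # comm needs both inputs sorted in one collation
work=$(mktemp -d)
cd "$work" || exit 1

cat > keys.txt <<'EOF'
VIEW_ACCOUNT_NAME_LABEL
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL
VIEW_ACCOUNT_COUNTRY_LABEL
EOF

cat > en-GB.view.acccount.ini <<'EOF'
VIEW_ACCOUNT_NAME_LABEL="Name:"
VIEW_ACCOUNT_ZIP="Zip Code"
VIEW_ACCOUNT_COUNTRY_LABEL="Country"
EOF

# Hypothetical second language file, invented for this example.
cat > fr-FR.view.acccount.ini <<'EOF'
VIEW_ACCOUNT_NAME_LABEL="Nom :"
VIEW_ACCOUNT_ZIP_POSTCODE_LABEL="Code postal :"
EOF

sort -u keys.txt > keys.sorted

report=$(
  for ini in *.ini; do
    # Keep only the key part of each KEY="value" line.
    cut -d= -f1 "$ini" | sort -u > lang.sorted
    # Column 1 only: keys in keys.txt that the language file lacks.
    comm -23 keys.sorted lang.sorted | awk -v f="$ini" '{ print f ": missing " $0 }'
    # Column 2 only: keys in the language file that keys.txt lacks.
    comm -13 keys.sorted lang.sorted | awk -v f="$ini" '{ print f ": obsolete " $0 }'
  done
)

echo "$report"
```

Each output line names the file, the kind of problem and the key, which makes the result easy to filter with grep afterwards.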


2013-07-02 07:59:20 dogbane

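I like it. Using 'comm' seems better suited to this exact task than the approach I came up with. –

Thanks for your answer, but I don't get the same result. If I understand this correctly, both files are sorted and then passed to 'comm'. The language file goes through cut first so that only the keys are left before sorting, so 'comm' is effectively comparing two sets of keys. That all makes perfect sense, except that I get this: 'cppl ~ comm -23 <(sort keys.txt) <(cut -d= -f1 en-GB.view.acccount.ini | sort) VIEW_ACCOUNT_CITY_SUBURB_LABEL VIEW_ACCOUNT_NAME_DESCR VIEW_ACCOUNT_NAME_LABEL VIEW_ACCOUNT_STREET_LABEL VIEW_ACCOUNT_ZIP_POSTCODE_LABEL' – Craig

Do you have any trailing whitespace in the keys file? If so, you will have to remove it. Try 'comm -23 <(sed 's/ *$//' keys.txt | sort) <(cut -d= -f1 en-GB.view.acccount.ini | sort)' – dogbane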
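The effect of trailing whitespace can be reproduced with a small sketch. The two-line sample files, the scratch directory and the `raw`/`clean` variable names are assumptions for illustration. The sed expression uses `[[:space:]]` so that tabs and carriage returns are stripped as well as spaces, which goes slightly beyond the command in the comment above.

```shell
#!/bin/sh
# Sketch only: shows how a trailing space in keys.txt defeats comm.
LC_ALL=C; export LC_ALL
work=$(mktemp -d)
cd "$work" || exit 1

# Every key in this keys.txt is followed by one trailing space.
printf 'VIEW_ACCOUNT_NAME_LABEL \nVIEW_ACCOUNT_ZIP_POSTCODE_LABEL \n' > keys.txt
printf 'VIEW_ACCOUNT_NAME_LABEL="Name:"\nVIEW_ACCOUNT_ZIP="Zip Code"\n' > en-GB.view.acccount.ini

cut -d= -f1 en-GB.view.acccount.ini | sort > lang.sorted

# Without cleanup, "KEY " never equals "KEY", so every key looks missing.
sort keys.txt > keys.raw
raw=$(comm -23 keys.raw lang.sorted)

# Strip trailing whitespace before sorting; only the real gap remains.
sed 's/[[:space:]]*$//' keys.txt | sort > keys.clean
clean=$(comm -23 keys.clean lang.sorted)

echo "Before cleanup:"; echo "$raw"
echo "After cleanup:";  echo "$clean"
```

Before the cleanup both keys are reported as missing; afterwards only VIEW_ACCOUNT_ZIP_POSTCODE_LABEL is.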

Are you able to use perl? If so, perl makes this super simple. Here is a quick and dirty script I whipped up. Modify it to suit your taste.

#!/usr/bin/perl -w 

# usage: validate keys.txt file1.ini [file2.ini [file3.ini [...]]] 

open my $keys_file, "<", $ARGV[0] or die "cannot open $ARGV[0] for reading"; 

my %keys = (map { chomp; s/\s//g; $_ => 0 } <$keys_file>); 

close $keys_file; 

sub validate_file 
{ 
    my $filename = shift @_; 
    my (@missing, @unexpected, @repeated); 
    my %seen = %keys; 

    open my $f, "<", $filename or die "cannot open $filename for reading"; 

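    # Count every KEY="value" line; keys not listed in keys.txt are unexpected.
    while (my $line = <$f>)
    {
        chomp $line;

        # Anchor at line start and keep whitespace out of the captured key.
        if ($line =~ /^\s*([^=\s]+)\s*="[^"]*"/)
        {
            if (!defined $seen{$1})
            {
                push @unexpected, $1;
                $seen{$1} = 0;
            }
            $seen{$1}++;
        }
    }

    close $f;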

    @missing = grep { $seen{$_} == 0 } sort keys %keys; 
    @repeated = grep { $seen{$_} > 1 } sort keys %keys; 

    return \@missing, \@unexpected, \@repeated; 
} 


shift @ARGV; 

foreach my $file (@ARGV) 
{ 
    my ($missing, $unexpected, $repeated) = validate_file($file); 

    print "\nFile $file:\n"; 
    print "Missing keys:\n", join("\n", @$missing), "\n"; 
    print "Unexpected keys:\n", join("\n", @$unexpected), "\n"; 
    print "Repeated keys:\n", join("\n", @$repeated), "\n"; 
}


2013-07-02 08:02:36
