Sorting and indexing one list by the sorted indices of another in Perl

Say I have one list that holds words and another that holds the confidences associated with those words:

my @list = ("word1", "word2", "word3", "word4"); 
my @confidences = (0.1, 0.9, 0.3, 0.6);

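I would like to obtain a second pair of lists: the elements of @list whose confidence is above 0.4, sorted by confidence, together with their corresponding confidences. How can I do this in Perl? (That is, using the list of indices that sorts one list to index another.)

In the example above, the output would be: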

my @sorted_and_thresholded_list = ("word2", "word4"); 
my @sorted_and_thresholded_confidences = (0.9, 0.6);

The entries in @list may not be unique (i.e. the sort should be stable).
The sort should be in descending order.

Source

2012-09-14 Amelio Vazquez-Reina

Are the entries in @list unique? – Jean

When dealing with parallel arrays, you have to work with indexes.

my @sorted_and_thresholded_indexes = 
    sort { $confidences[$b] <=> $confidences[$a] } 
    grep $confidences[$_] > 0.4, 
     0..$#confidences; 

my @sorted_and_thresholded_list = 
    @list[ @sorted_and_thresholded_indexes ]; 
my @sorted_and_thresholded_confidences = 
    @confidences[ @sorted_and_thresholded_indexes ];
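
The question asks for a stable sort, and Perl's built-in sort has behaved stably in practice for a long time. If you would rather not rely on that, a minimal sketch is to break ties on the index itself. The sample data with duplicate words and equal confidences below is made up for illustration and is not from the question:

```perl
use strict;
use warnings;

# Hypothetical data with duplicate words and tied confidences.
my @list        = ("word1", "word2", "word1", "word4");
my @confidences = (0.9, 0.6, 0.9, 0.6);

# Keep the indexes above the threshold, then sort by descending confidence.
# On equal confidences, fall back to ascending index so the original
# relative order is preserved.
my @idx =
    sort { $confidences[$b] <=> $confidences[$a] or $a <=> $b }
    grep { $confidences[$_] > 0.4 }
    0 .. $#confidences;

my @sorted_list        = @list[@idx];
my @sorted_confidences = @confidences[@idx];

print "@sorted_list\n";          # word1 word1 word2 word4
print "@sorted_confidences\n";   # 0.9 0.9 0.6 0.6
```

The extra `or $a <=> $b` only matters when two confidences compare equal, so it does not change the result for the data in the question.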

Source

2012-09-14 15:28:29 ikegami

If you are sure you won't have duplicate words, I think it may be easier to use a hash for this task, for example:

my %hash = ("word1" => 0.1, 
      "word2" => 0.9, 
      "word3" => 0.3, 
      "word4" => 0.6 
      );

Then you can iterate over the keys of the hash and pick out only the keys that meet your criterion:

foreach my $key (keys %hash) { 
    if ($hash{$key} > 0.4) { 
     print $key; 
    } 
}
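
The loop above filters but does not order the result. A possible way to add the descending sort to the hash approach, still assuming the words are unique, is sketched below. The names `@words` and `@confs` are only illustrative:

```perl
use strict;
use warnings;

my %hash = (
    "word1" => 0.1,
    "word2" => 0.9,
    "word3" => 0.3,
    "word4" => 0.6,
);

# Filter the keys by threshold, then order them by descending value.
my @words =
    sort { $hash{$b} <=> $hash{$a} }
    grep { $hash{$_} > 0.4 }
    keys %hash;

# A hash slice returns the values in the same order as @words.
my @confs = @hash{@words};

print "$_ => $hash{$_}\n" for @words;
```

Because hash keys come back in no particular order, entries with equal confidences would appear in an arbitrary order here, so this does not give the stable ordering the question asks for.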

Source

2012-09-14 15:26:00 j0nes

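What if there are duplicates in '@list'? – duri

Then this won't work properly - I edited my answer according to your comment. – j0nes

Did you forget to sort? – ikegami

Using pairwise and part from List::MoreUtils: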

use Data::Dumper; 
use List::MoreUtils qw(pairwise part); 
my @list = ("word1", "word2", "word3", "word4"); 
my @confidences = (0.1, 0.9, 0.3, 0.6); 

my $i = 0; 
my @ret = part { $i++ % 2 } 
      grep { defined } 
      pairwise { $b > .4 ? ($a, $b) : undef } @list, @confidences; 

print Dumper @ret;

Output:

$VAR1 = [ 
      'word2', 
      'word4' 
     ]; 
$VAR2 = [ 
      '0.9', 
      '0.6' 
     ];

Source

2012-09-14 15:29:52 simbabque

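Did you forget to sort? – ikegami

Feel free to add it. – simbabque

Although ikegami has already shown my first choice of solution, using indices, here is another option: combine the arrays into a two-dimensional array (*). The benefit is that all the data is collected in a single data structure, so it is easy to manipulate.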

use strict; 
use warnings; 
use Data::Dumper; 

my @list = ("word1", "word2", "word3", "word4"); 
my @conf = (0.1, 0.9, 0.3, 0.6); 
my @comb; 

for (0 .. $#list) {      # create two-dimensional array 
    push @comb, [ $list[$_], $conf[$_] ]; 
} 

my @all = sort { $b->[1] <=> $a->[1] } # sort according to conf 
      grep { $_->[1] > 0.4 } @comb; # conf limit 

my @list_done = map $_->[0], @all;  # break the lists apart again 
my @conf_done = map $_->[1], @all; 

print Dumper \@all, \@list_done, \@conf_done;

Output:

$VAR1 = [ 
      [ 
      'word2', 
      '0.9' 
      ], 
      [ 
      'word4', 
      '0.6' 
      ] 
     ]; 
$VAR2 = [ 
      'word2', 
      'word4' 
     ]; 
$VAR3 = [ 
      '0.9', 
      '0.6' 
     ];

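(*) = Using a hash is also an option, assuming 1) the original order does not matter and 2) all the words are unique. However, unless fast lookup is a concern, there is no drawback to using arrays.

Source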

2012-09-14 16:38:32 TLP

my @list = ("word1", "word2", "word3", "word4"); 
my @confidences = (0.1, 0.9, 0.3, 0.6); 

my @result = map { $list[$_] } 
       sort { $confidences[$b] <=> $confidences[$a] } 
       grep { $confidences[$_] > 0.4 } (0..$#confidences);
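
The snippet above yields only the words. If the confidences are needed too, one option is to keep the sorted indexes and slice both arrays with them. The sketch below wraps that in a subroutine. The name `sort_by_confidence` and its argument order are hypothetical and do not come from any module:

```perl
use strict;
use warnings;

# Hypothetical helper: takes two array refs and a threshold and returns
# two array refs (words and confidences), sorted by descending confidence.
sub sort_by_confidence {
    my ($words, $confs, $min) = @_;
    my @idx =
        sort { $confs->[$b] <=> $confs->[$a] }
        grep { $confs->[$_] > $min }
        0 .. $#{$confs};
    return ([ @{$words}[@idx] ], [ @{$confs}[@idx] ]);
}

my @list        = ("word1", "word2", "word3", "word4");
my @confidences = (0.1, 0.9, 0.3, 0.6);

my ($top_words, $top_confs) = sort_by_confidence(\@list, \@confidences, 0.4);

print "@$top_words\n";   # word2 word4
print "@$top_confs\n";   # 0.9 0.6
```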

Source

2012-09-15 00:20:11 snoofkin
