Reject hashes whose id is not in an array

I have this array:

array = ["1", "2", "3", "4"]

I have this array of hashes:

ah = [ 
{:id=>"1", :value=>"A"}, 
{:id=>"2", :value=>"B"}, 
{:id=>"3", :value=>"C"}, 
{:id=>"4", :value=>"D"}, 
{:id=>"5", :value=>"E"}, 
{:id=>"6", :value=>"F"}, 
{:id=>"7", :value=>"G"}, 
{:id=>"8", :value=>"H"}, 
    ]

I need to reject the hashes in ah whose id is not in array.

What is the best way to achieve this?


2015-10-26 Seal

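This smells like an XY problem. I'd reconsider using this structure. Instead, build a simple hash that uses the `:id` values as keys and the associated `:value` values as values. Then `ah.keys - array` makes it easy to get the keys to reject, and it will be very fast. Faster still would be to use `ah.values_at(*values)` to extract the wanted values. This breaks if duplicate keys are possible, but allowing arrays of associated values would fix that.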
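A minimal sketch of the structure that comment describes, assuming every id is unique. The name `by_id` is made up for illustration and the data is a shortened copy of the question's:

```ruby
array = ["1", "2", "3", "4"]

ah = [
  { id: "1", value: "A" }, { id: "2", value: "B" },
  { id: "3", value: "C" }, { id: "4", value: "D" },
  { id: "5", value: "E" }, { id: "6", value: "F" }
]

# Key each :value by its :id (hypothetical name; assumes ids are unique).
by_id = ah.each_with_object({}) { |h, memo| memo[h[:id]] = h[:value] }

rejected_keys = by_id.keys - array       # ids that are not in array
wanted_values = by_id.values_at(*array)  # values for the ids in array

p rejected_keys
p wanted_values
```

Note that `values_at` returns nil for any id in array that has no entry in the hash, so the result may need a `compact` if array can contain unknown ids.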

You can select the inverse (the hashes whose id is in array) with this code:

ah.select{|el| array.include?(el[:id])}

If you prefer reject, you can use:

ah.reject{|el| !array.include?(el[:id])}

For more information, see Array#select. These methods create a new array; if you want to modify the array in place, use Array#reject! or Array#select!.
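To illustrate the difference between the two forms, here is a small sketch on shortened sample data (the variable names `kept`, `size_after_select` and `size_after_select_bang` are only for illustration):

```ruby
array = ["1", "2"]
ah = [
  { id: "1", value: "A" },
  { id: "2", value: "B" },
  { id: "3", value: "C" }
]

# select returns a new array and leaves ah untouched.
kept = ah.select { |el| array.include?(el[:id]) }
size_after_select = ah.size

# select! removes the non-matching hashes from ah itself.
ah.select! { |el| array.include?(el[:id]) }
size_after_select_bang = ah.size

p kept
p [size_after_select, size_after_select_bang]
```

One thing to watch for: select! and reject! return nil when they did not remove anything, so it is safer not to chain further calls on their return value.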


2015-10-26 15:00:35 tbrisker

Rather than rejecting the hashes whose id is not in the array, a better solution is to accept only the ones whose id is:

ah.select { |hash| array.include?(hash[:id]) }


2015-10-26 15:02:13 Josh

For large chunks of data I would use some preprocessing to avoid O(n*m) lookups.

array = ["1", "2", "3", "4"] 
array_hash = array.each_with_object({}){ |i, h| h[i] = true } 
ah.select{ |obj| array_hash[obj[:id]] }


2015-10-26 16:31:23 fl00r

I was working on an alternative to the O(n^2) solutions, but I really like your solution. If you don't mind, I'll include it in my benchmark results. – Anthony

Nice one! ...

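I know there is already an accepted answer, but since all the answers here are O(n*m), I thought I would suggest an alternative in O(n)*.

Here is a rough benchmark for the case where the ah array has 100_000 items and the sub-array has 10_000 items. I'm including fl00r's answer and Cary's here, since we are all trying to avoid the O(n*m) case.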

                           user     system      total        real
select with include   34.610000   0.110000  34.720000 ( 34.924679)
reject with include   34.320000   0.100000  34.420000 ( 34.611992)
group and select       0.170000   0.010000   0.180000 (  0.182358)
select by value        0.040000   0.000000   0.040000 (  0.041073)
select with set        0.040000   0.000000   0.040000 (  0.048331)
hashify then values    0.130000   0.010000   0.140000 (  0.139686)

Code to reproduce this:

require 'benchmark'
require 'set'

list_size = 100_000
sub_list_size = 10_000

# 100_000 hashes with integer ids 0...list_size.
ah = Array.new(list_size) { |i| { id: i, value: "A" } }

# 10_000 random ids to keep (duplicates are possible).
array = Array.new(sub_list_size) { rand(0..list_size) }

def group_then_select(ah, array)
    grouped = ah.group_by { |x| x[:id] }

    # Keep only the ids that also appear in array.
    good_keys = grouped.keys & array
    good_keys.flat_map { |i| grouped[i] }
end

def select_by_fl00r(ah, array)
    array_hash = array.each_with_object({}) { |i, h| h[i] = true }
    ah.select { |obj| array_hash[obj[:id]] }
end

def select_with_set(ah, array)
    array_to_set = array.to_set
    ah.select { |h| array_to_set.include?(h[:id]) }
end

def hashify_then_values_at(ah, array)
    by_id = ah.each_with_object({}) { |g, memo| memo.update(g[:id] => g) }
    by_id.values_at(*(by_id.keys & array))
end

Benchmark.bm(25) do |x|
    x.report("select with include") do
        ah.select { |el| array.include?(el[:id]) }
    end
    x.report("reject with include") do
        ah.reject { |e| !array.include?(e[:id]) }
    end
    x.report("group and select") do
        group_then_select(ah, array)
    end
    x.report("select by value") do
        select_by_fl00r(ah, array)
    end
    x.report("select with set") do
        select_with_set(ah, array)
    end
    x.report("hashify then values") do
        hashify_then_values_at(ah, array)
    end
end

* Hash map lookups are typically O(1), although an O(n) worst case is possible.


2015-10-26 16:47:17 Anthony

I recommend the [fruity](https://github.com/marcandre/fruity) gem for benchmarks. It handles all the dirty work for you.

Here are two possibilities.

array = ["1", "2", "3", "4", "99999999"]

#1

I would expect the solutions that use include? to speed up considerably if array is first converted to a set:

require 'set' 

def select_with_set(ah, array) 
    array_to_set = array.to_set 
    ah.select { |h| array_to_set.include?(h[:id]) } 
end 

select_with_set(ah, array) 
    #=> [{:id=>"1", :value=>"A"}, {:id=>"2", :value=>"B"}, 
    # {:id=>"3", :value=>"C"}, {:id=>"4", :value=>"D"}]

#2

If, as in the example, the hash elements all have distinct values for :id, we can do this:

def hashify_then_values_at(ah, array)
    # Index every hash by its :id; later duplicates would overwrite earlier ones.
    by_id = ah.each_with_object({}) { |g, memo| memo.update(g[:id] => g) }
    by_id.values_at(*(by_id.keys & array))
end

hashify_then_values_at(ah, array) 
    #=> [{:id=>"1", :value=>"A"}, {:id=>"2", :value=>"B"}, 
    # {:id=>"3", :value=>"C"}, {:id=>"4", :value=>"D"}]
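If the :id values are not distinct, keying a hash directly by :id would silently drop all but the last hash for each id. One possible variant, sketched here under the assumption that every hash with a matching id should be kept, groups the hashes by id first. The method name `group_then_values_at` and the sample data are made up for illustration:

```ruby
array = ["1", "2", "99999999"]

# Two hashes share id "1", so keying by id directly would drop one of them.
ah = [
  { id: "1", value: "A" },
  { id: "1", value: "B" },
  { id: "2", value: "C" },
  { id: "3", value: "D" }
]

# Hypothetical helper: group by :id, then pull out the groups whose id is in array.
def group_then_values_at(ah, array)
  grouped = ah.group_by { |g| g[:id] }  # id => [hash, hash, ...]
  grouped.values_at(*(grouped.keys & array)).flatten
end

result = group_then_values_at(ah, array)
p result
```

The result follows the order in which each id first appears in ah, because `grouped.keys & array` takes its order from the hash keys.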


2015-10-27 03:24:36

Converting to a set is a great idea, and it benchmarks quite fast! – Anthony
