2012-02-01 23 views
3

我有這個散列:如何刪除哈希陣列中的部分?

{"path_1" => [1,2,3], "path_2" => [1,4,5], "path_3" => [1,2,3,4]} 

我想從哈希刪除所有「部分」路徑。所以path_1將需要去,因爲它是path_3的部分; [1,2,3][1,2,3,4]的「未完成」數組。所有「部分」需要從這個散列中刪除。

這裏是我當前的代碼的作品,但與大哈希處理時,它的速度慢:

# hash sorted by length of value 
hash_array = {"path_1" => [1,2,3], "path_2" => [1,4,5], "path_3" => [1,2,3,4]} 
# make a separate copy of the hash 
cloned_hash_array = hash_array.clone 

hash_array.each {|path_index, path| 
    # delete this path from the cloned hash so it doesn't match itself 
    cloned_hash_array.delete(path_index) 

    cloned_hash_array.each{|cloned_path_index, cloned_path| 
    if cloned_path[0,path.length] == path.clone 
     hash_array.delete(path_index) 
    end 
    } 
} 
+1

這僅僅是慣例,但通常多行塊使用'做... end'而不是'{...}'喜歡你正在處理你的每個人。 – 2012-02-01 18:24:04

+0

同意,多行'{...}'塊看起來很奇怪:-) – 2012-02-01 18:25:35

+0

這些數組中的順序是否重要,還是更合適?如果他們是集合,你可以利用Set#proper_subset – DGM 2012-02-01 18:41:48

回答

1

取決於你想要多快的速度是和你有多少元素都有。你可以嘗試這樣的事情(看起來瘋狂,但它的真快):

scatter = 
    lambda { |tree, path, name| 
    if path.empty? 
     tree[:tag] = name 
     tree[:path] unless tree.has_key?(:path) 
    else 
     head, *tail = path 
     unless tree[:path].has_key?(head) 
     tree[:path][head] = {:path => {}} 
     end 
     scatter[tree[:path][head], tail, name] 
    end 
    } 

gather = 
    lambda { |tree| 
    if tree[:path].empty? 
     [[tree[:tag], []]] 
    else 
     tree[:path].map { |k, v| 
     gather[v].map do |tag, path| 
      [tag, [k] + path] 
     end 
     }.flatten(1) 
    end 
    } 

scatter_gather = 
    lambda { |paths| 
    tree = {:path => {}} 
    paths.each do |tag, path| 
     scatter[tree, path, tag] 
    end 
    Hash[gather[tree]] 
    } 

scatter_gather["path_1" => [1,2,3], "path_2" => [1,4,5], "path_3" => [1,2,3,4]] 
#=> {"path_2"=>[1, 4, 5], "path_3"=>[1, 2, 3, 4]} 
+0

維克多,你是一個天才。雖然其他方法也起作用,但此方法速度最快並返回正確的結果。在109秒內完成我的哈希。 A + – 2012-02-02 15:54:05

1

你可以試試這個,應該是快一點點(沒有雙迴路)。

h = {"path_1" => [1,2,3], "path_2" => [1,4,5], "path_3" => [1,2,3,4]} 

h2 = {} 

a = h.sort{|l, r| r[1] <=> l[1]} 
puts a.inspect 
# => [["path_2", [1, 4, 5]], ["path_3", [1, 2, 3, 4]], ["path_1", [1, 2, 3]]] 

last_path = nil 
a.each do |key, path| 
    # now all paths are sorted in descending order. 
    # if a current path is a prefix for last_path, then discard it. 
    # otherwise, write it to a result and start comparing next ones against it. 
    if !last_path || last_path[0, path.length] != path 
    h2[key] = path 
    last_path = path 
    end 
end 

puts h2.inspect 
# => {"path_2"=>[1, 4, 5], "path_3"=>[1, 2, 3, 4]} 
0

這個怎麼樣:

hash_array = {"path_1" => [1,2,3], "path_2" => [1,4,5], "path_3" => [1,2,3,4]} 
cloned_hash_array = hash_array.clone 

cloned_hash_array.each do |key, value| 

    hash_array.each do |key2, value2| 
    if key != key2 and value.length <= value2.length 
     if value.all? { |i| value2.include?(i) } 
     cloned_hash_array.delete(key) 
     break 
     end 
    end 
    end 
end 

puts cloned_hash_array.inspect 
+0

這是否更快?它也有2個循環,所以我不知道它有多快。 – 2012-02-01 19:36:00

+0

你正在做'path.clone',如果這可能會減慢速度。試試我的答案,看看它的更快=) – 2012-02-01 19:41:07

+0

'.clone'不會減慢速度。這是循環緩慢,我已經基準測試;) – 2012-02-01 21:05:03