2015-10-06

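What is the Ruby way to add a hash to an existing array of hashes?

I have a large data set with a fixed number of categories. Originally I stored everything in an array of hashes. That worked well, but given the size of the data and the redundancy of the categories, it was not efficient.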

I am now using a hash keyed by type/category, with an array of hashes stored under each category.

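My current method of adding data is to delete the :type key from each hash before adding it to the array for its type. Everything works. However, I'm sure there is a more streamlined "Ruby way" to do this: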

# Very large data set with redundant types. 
gigantic_array = [ 
    { type: 'a', organization: 'acme inc', president: 'bugs bunny' }, 
    { type: 'a', organization: 'looney toons', president: 'donald' }, 
    { type: 'b', organization: 'facebook', president: 'mark' }, 
    { type: 'b', organization: 'myspace', president: 'whoknows' }, 
    { type: 'c', organization: 'walmart', president: 'wall' } 
    # multiply length by ~1000 
] 

# Still gigantic, but more efficient. 
# Stores each type as symbol. 
# Each type is an array of hashes. 
more_efficient_hash = {
    type: {
        a: [
            { organization: 'acme inc', president: 'bugs bunny' },
            { organization: 'looney toons', president: 'donald' }
        ],

        b: [
            { organization: 'facebook', president: 'mark' },
            { organization: 'myspace', president: 'whoknows' }
        ],

        c: [
            { organization: 'walmart', president: 'wall' }
            # etc....
        ]
    }
}

hash_to_add = { type: 'c', organization: 'target', president: 'sharp' } 

# Adds hash to array of types inside the gigantic more_efficient_hash. 
# Is there a better way? 
more_efficient_hash[:type][hash_to_add[:type].to_sym].push(hash_to_add.delete(:type)) 

How is the second hash more efficient?

@TheCha͢mp More normalized? – binarymason

I don't know what you're asking.

Answers

I agree with undur_gongor that some small data classes would be beneficial, and that the :type key in your result isn't adding any value.
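As a rough illustration of what a small data class could look like here, the sketch below uses Ruby's built-in Struct. The class name Organization, its field names, and the by_type variable are assumptions made up for this example; they do not come from the question or from undur_gongor's suggestion.

```ruby
# Hypothetical data class; the name and fields are assumptions for illustration.
Organization = Struct.new(:name, :president)

records = [
  { type: 'a', organization: 'acme inc', president: 'bugs bunny' },
  { type: 'b', organization: 'facebook', president: 'mark' }
]

# Group by type, turning each hash into a small Struct instance.
# The default block creates an empty array the first time a type is seen.
by_type = Hash.new { |hash, key| hash[key] = [] }
records.each do |record|
  by_type[record[:type].to_sym] << Organization.new(record[:organization], record[:president])
end
```

After this, by_type[:a].first.name gives the organization name through a method call rather than a hash lookup, which tends to make typos fail loudly instead of returning nil.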

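For the initial conversion from gigantic_array, you can do this easily with group_by. Note that Hash#delete returns the value of the deleted key, not the hash, so I'm not sure your last line works the way you want it to.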

> more_efficient_hash = gigantic_array.group_by {|item| item.delete(:type).to_sym} 
{
    a: [
        {:organization=>"acme inc", :president=>"bugs bunny"},
        {:organization=>"looney toons", :president=>"donald"}
    ],
    b: [
        {:organization=>"facebook", :president=>"mark"},
        {:organization=>"myspace", :president=>"whoknows"}
    ],
    c: [
        {:organization=>"walmart", :president=>"wall"}
    ]
}
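One thing to be aware of: the group_by call above uses delete, so it also strips :type from the hashes still held in gigantic_array. If that side effect is unwanted, a non-mutating variant is possible. The following is only a sketch; the names sample, removed, grouped and rest are made up for the example.

```ruby
gigantic_array = [
  { type: 'a', organization: 'acme inc', president: 'bugs bunny' },
  { type: 'b', organization: 'facebook', president: 'mark' },
  { type: 'c', organization: 'walmart', president: 'wall' }
]

# Hash#delete returns the removed value, not the hash it was removed from.
sample = { type: 'c', organization: 'target' }
removed = sample.delete(:type) # removed holds 'c'; sample no longer has :type

# Non-mutating grouping: build a copy of each hash without :type
# instead of deleting the key in place.
grouped = gigantic_array.each_with_object({}) do |item, result|
  rest = item.reject { |key, _value| key == :type }
  (result[item[:type].to_sym] ||= []) << rest
end
```

Here gigantic_array keeps its :type keys, at the cost of allocating one extra hash per item.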

From this point, your last line is pretty clean. Since delete is destructive, we can shorten it a bit.

> more_efficient_hash[hash_to_add.delete(:type).to_sym] << hash_to_add 
# ... 
    c: [
        {:organization=>"walmart", :president=>"wall"},
        {:organization=>"target", :president=>"sharp"}
    ]
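The one-liner above assumes the type already exists as a key. For a type that has not been seen yet, the lookup returns nil and << raises NoMethodError. A minimal sketch of one way to guard against that follows; the helper name add_record and the 'd' record are hypothetical and exist only for illustration.

```ruby
more_efficient_hash = {
  c: [{ organization: 'walmart', president: 'wall' }]
}

# Hypothetical helper: appends a record under its type,
# creating the array first if the type is new.
def add_record(store, record)
  record = record.dup                # leave the caller's hash untouched
  type = record.delete(:type).to_sym # delete returns the removed value
  (store[type] ||= []) << record
  store
end

add_record(more_efficient_hash, { type: 'c', organization: 'target', president: 'sharp' })
# Made-up record with a type that is not yet a key in the hash.
add_record(more_efficient_hash, { type: 'd', organization: 'example org', president: 'nobody' })
```

Building the hash with Hash.new { |hash, key| hash[key] = [] } would give the same effect without a helper, but group_by returns a plain Hash, so the ||= form works regardless of how the hash was created.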