2011-11-15 61 views
0

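How do I combine arrays of similar data in Ruby?

I am working on a program that will compare two .csv files. After extracting the relevant data from one csv file into an array of arrays, I need to combine related entries. For example, I want to turn this array: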

[["11/13/15", ["4001", "1392"], "INBOUND"], 
["11/13/15", ["4090", "540"], "INBOUND"], 
["11/13/15", ["1139", "162"], "INBOUND"], 
["11/13/15", ["1158", "64"], "INBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["4055", "448"], "OUTBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["1139", "162"], "OUTBOUND"], 
["11/13/15", ["1158", "64"], "OUTBOUND"], 
["11/13/15", ["4091", "520"], "OUTBOUND"]] 

into this:

[["11/13/15", ["4001", "1392"], "INBOUND"], 
["11/13/15", ["4090", "540"], "INBOUND"], 
["11/13/15", ["1139", "162"], "INBOUND"], 
["11/13/15", ["1158", "64"], "INBOUND"], 
["11/13/15", ["4055", "1152"], "OUTBOUND"], 
["11/13/15", ["1139", "162"], "OUTBOUND"], 
["11/13/15", ["1158", "64"], "OUTBOUND"], 
["11/13/15", ["4091", "520"], "OUTBOUND"]] 

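That is, whenever the items at [0], [1][0] and [2] of one element match those of another element, create a new element (array) whose item at [1][1] is the sum of the [1][1] items of all matching elements, and delete the old arrays. If it would make things easier, I can change the way I extract the relevant data so that the item at [1] is not an array and each row has 4 items instead of 3.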

+0

Are the elements to be combined consecutive? – tokland

+0

The data will be sorted so that it looks like the top array when printed, so yes (if I understand your question). –

+0

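Assuming tokland's answer really is what you want, your question has a typo in the value at `[4][1][1]` of the result array, which is the only value that matters. It should be 1152, not 1115. I must say your question is rather sloppy. – sawa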

Answers

2

Assuming the elements to be grouped are consecutive, we can use Enumerable#chunk. A functional approach:

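# Group consecutive rows that share the same date, first id and direction.
grouped_xs = xs.chunk { |date, (id1, id2), direction| [date, id1, direction] }
# Sum the second id over each group and build one flat 4-element row per group.
grouped_xs.map do |(date, id1, direction), ary|
  id2_sum = ary.map { |_, (_, id2), _| id2.to_i }.inject(:+)
  [date, id1, id2_sum.to_s, direction]
end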

Output (you wanted 4 elements in each output array, right?):

[["11/13/15", "4001", "1392", "INBOUND"], 
["11/13/15", "4090", "540", "INBOUND"], 
["11/13/15", "1139", "162", "INBOUND"], 
["11/13/15", "1158", "64", "INBOUND"], 
["11/13/15", "4055", "1152", "OUTBOUND"], 
["11/13/15", "1139", "162", "OUTBOUND"], 
["11/13/15", "1158", "64", "OUTBOUND"], 
["11/13/15", "4091", "520", "OUTBOUND"]] 
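
Since Enumerable#chunk only merges rows that sit next to each other, the answer depends on the input being sorted. The following is a minimal sketch of that behaviour, not part of the original answer: `sum_runs` is a hypothetical wrapper around the same chunk/map steps, and the three sample rows are made up so that the two "4055" rows are not adjacent.

```ruby
# Hypothetical helper: sums the last number over each run of ADJACENT rows
# that share date, id and direction (same idea as the chunk-based answer).
def sum_runs(rows)
  rows.chunk { |date, (id, _), direction| [date, id, direction] }.map do |(date, id, direction), run|
    total = run.map { |_, (_, count), _| count.to_i }.inject(:+)
    [date, id, total.to_s, direction]
  end
end

# Made-up rows (not from the question): the two "4055" rows are separated.
rows = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"]
]

# Without sorting, chunk sees three separate runs and merges nothing.
unsorted = sum_runs(rows)

# Sorting by the grouping key first makes matching rows adjacent.
sorted = sum_runs(rows.sort_by { |date, (id, _), direction| [date, direction, id] })

p unsorted.length
p sorted
```

If the input order must be preserved and the rows may not be sorted, a hash-based or group_by-based approach is probably the safer choice.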
+0

This is perfect. Thank you! –

+0

@Sean: You're welcome. As general advice, I think it's better not to accept an answer too quickly; someone may come up with a better solution :-) – tokland

+0

That is some very compact Ruby @tokland :) –

0

This should do it:

def lookup(list, id, direction)
  index = nil
  list.each_with_index do |e, i|
    if id == e[1][0] && e[2] == direction
      index = i
      break
    end
  end
  index
end

b = []

a.each do |e|
  id = e[1][0]
  direction = e[2]
  i = lookup(b, id, direction)
  if i.nil?
    b << e
  else
    count = e[1][1].to_i
    sum = count + b[i][1][1].to_i
    b[i][1][1] = sum.to_s
  end
end

b.each { |e| p e }

Output:

["11/13/15", ["4001", "1392"], "INBOUND"] 
["11/13/15", ["4090", "540"], "INBOUND"] 
["11/13/15", ["1139", "162"], "INBOUND"] 
["11/13/15", ["1158", "64"], "INBOUND"] 
["11/13/15", ["4055", "1152"], "OUTBOUND"] 
["11/13/15", ["1139", "162"], "OUTBOUND"] 
["11/13/15", ["1158", "64"], "OUTBOUND"] 
["11/13/15", ["4091", "520"], "OUTBOUND"] 
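
One thing to be aware of: the loop pushes the original row objects into `b`, so writing the running sum into `b[i][1][1]` also changes the corresponding row in `a`. If that matters, here is a minimal sketch of a variant that leaves the input untouched. `merge_rows` is a hypothetical name, the sample rows are a subset of the question's data, and like `lookup` it matches on id and direction only.

```ruby
# Hypothetical non-mutating variant: builds fresh arrays for every output row,
# so the rows passed in are never modified.
def merge_rows(rows)
  merged = []
  rows.each do |date, (id, count), direction|
    # Array#index with a block returns the position of the first match, or nil.
    i = merged.index { |_, (other_id, _), other_dir| other_id == id && other_dir == direction }
    if i.nil?
      merged << [date, [id, count], direction]
    else
      merged[i][1][1] = (merged[i][1][1].to_i + count.to_i).to_s
    end
  end
  merged
end

a = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"],
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"]
]

b = merge_rows(a)
b.each { |e| p e }
```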
0
h = Hash.new(0) 
[["11/13/15", ["4001", "1392"], "INBOUND"], 
["11/13/15", ["4090", "540"], "INBOUND"], 
["11/13/15", ["1139", "162"], "INBOUND"], 
["11/13/15", ["1158", "64"], "INBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["4055", "448"], "OUTBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["1139", "162"], "OUTBOUND"], 
["11/13/15", ["1158", "64"], "OUTBOUND"], 
["11/13/15", ["4091", "520"], "OUTBOUND"]] 
.each{|a, (b, c), d| h[[a, b, d]] += c.to_i} 
p h.map{|(a, b, d), c| [a, [b, c], d]} 

will give (note that the sums come out as Integers rather than Strings):

[["11/13/15", ["4001", 1392], "INBOUND"], 
["11/13/15", ["4090", 540], "INBOUND"], 
["11/13/15", ["1139", 162], "INBOUND"], 
["11/13/15", ["1158", 64], "INBOUND"], 
["11/13/15", ["4055", 1152], "OUTBOUND"], 
["11/13/15", ["1139", 162], "OUTBOUND"], 
["11/13/15", ["1158", 64], "OUTBOUND"], 
["11/13/15", ["4091", 520], "OUTBOUND"]] 
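
If the sums should stay Strings, as in the question's desired output, the same hash idea can be wrapped as below. This is only a sketch: `sum_by_key` is a hypothetical name and the sample rows are a subset of the question's data. It relies on Hash iteration following insertion order, which holds on Ruby 1.9 and later but not on 1.8.

```ruby
# Hypothetical wrapper around the Hash.new(0) approach that converts
# each total back to a String.
def sum_by_key(rows)
  totals = Hash.new(0) # missing keys start at 0, so += works on first sight
  rows.each { |date, (id, count), direction| totals[[date, id, direction]] += count.to_i }
  totals.map { |(date, id, direction), total| [date, [id, total.to_s], direction] }
end

rows = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"],
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"]
]

summed = sum_by_key(rows)
p summed
```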
2

And just as an example, here is my one-liner (works with both Ruby 1.8 and 1.9):

table = [["11/13/15", ["4001", "1392"], "INBOUND"], 
["11/13/15", ["4090", "540"], "INBOUND"], 
["11/13/15", ["1139", "162"], "INBOUND"], 
["11/13/15", ["1158", "64"], "INBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["4055", "448"], "OUTBOUND"], 
["11/13/15", ["4055", "352"], "OUTBOUND"], 
["11/13/15", ["1139", "162"], "OUTBOUND"], 
["11/13/15", ["1158", "64"], "OUTBOUND"], 
["11/13/15", ["4091", "520"], "OUTBOUND"]] 

result = table.group_by {|a, (b, c), d| [a, [b], d]}.map {|k, v| k[1] << v.map {|a| a[1][1].to_i}.inject(:+).to_s; k} 
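
For readability, the one-liner can be unrolled roughly as follows. This is a sketch rather than the answerer's code: `combine` is a hypothetical name, it builds each output row from scratch instead of appending to the hash key, and the sample rows are made up to show that group_by also collects rows that are not adjacent.

```ruby
# Hypothetical unrolled version of the group_by one-liner.
def combine(rows)
  groups = rows.group_by { |date, (id, _), direction| [date, id, direction] }
  groups.map do |(date, id, direction), group|
    total = group.map { |_, (_, count), _| count.to_i }.inject(:+)
    [date, [id, total.to_s], direction]
  end
end

# Made-up rows (not from the question): the two "4055" rows are separated.
scattered = [
  ["11/13/15", ["4055", "352"], "OUTBOUND"],
  ["11/13/15", ["1139", "162"], "OUTBOUND"],
  ["11/13/15", ["4055", "448"], "OUTBOUND"]
]

combined = combine(scattered)
p combined
```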
+0

I upvoted this, but there is a typo: the final ] – pguardiario

+0

Corrected, thanks! –

+1

I especially like the "group_by {|a, (b, c), d|" part. Very nice. – pguardiario