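How to condense/simplify a code block for extracting data from a CSV file?
I have this code that extracts data from a CSV file and then reformats it so that it can be compared with other data sets: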
def dataExtract
  # Allow one- or two-digit month and day so dates like 11/4/15 (as in the sample) match
  dates = File.open(@filename_data).read.scan /\d{1,2}\/\d{1,2}\/\d{2}/
  data_extracted = []
  index = 0
  dates.each do |date|
    inbound_row = @data[4+(11*index)]
    outbound_row = @data[6+(11*index)]
    data_extracted.push [date, '4001', (inbound_row[1].gsub(/\,/,"").to_i + inbound_row[2].gsub(/\,/,"").to_i).to_s, 'AI', 'INBOUND']
    data_extracted.push [date, '4090', inbound_row[3].gsub(/\,/,""), 'AI', 'INBOUND']
    data_extracted.push [date, '1139', inbound_row[4].gsub(/\,/,""), 'RU STANDRD', 'INBOUND']
    data_extracted.push [date, '1158', inbound_row[5].gsub(/\,/,""), 'RU STANDRD', 'INBOUND']
    data_extracted.push [date, '4055', outbound_row[1].gsub(/\,/,""), 'RU PLUS', 'OUTBOUND']
    data_extracted.push [date, '4055', outbound_row[2].gsub(/\,/,""), 'AR', 'OUTBOUND']
    data_extracted.push [date, '1139', outbound_row[4].gsub(/\,/,""), 'RU STANDRD', 'OUTBOUND']
    data_extracted.push [date, '1158', outbound_row[5].gsub(/\,/,""), 'RU STANDRD', 'OUTBOUND']
    data_extracted.push [date, '4091', outbound_row[3].gsub(/\,/,""), 'RU STANDRD', 'OUTBOUND']
    index += 1
  end
  return data_extracted
end
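Here is a sample of the CSV data (this is a single day; for multiple days there are blocks like this, separated by a blank line):
Date,BLOCK,,Wood,Miscellaneous,,Totals,MO
Monday,4055-RU,4055-AR,4091,1139,1158,,100
11/4/15,C Sort,B,C,iGPS,PECO,,
Starting,714,228,858,82,129,"2,011",
Sorted,"2,738",190,"1,110",144,228,"4,410",
Subtotal 1,"3,452",418,"1,968",226,357,"6,421",
Shipped,"2,700",0,"1,865",0,0,"4,565",
,752,418,103,226,357,"1,856",
Physical,752,418,103,226,357,"1,856",
Variance,0,0,0,0,0,0,
Apart from the date, the Sorted and Shipped rows are the only data used from this CSV file. Anyway, like I said, this works; it just isn't very pretty. Is there a better way to write the dates.each block, given that each array repeats the same information (the date plus inbound/outbound)?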
Not sure what you are trying to achieve... Replacing your 'data_extracted' array with two arrays named 'inbound' and 'outbound' might be a start. – steenslag
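If your first row contains column headers, then every column should have one. It doesn't really matter, but it looks better and might come in handy. –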
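As a rough sketch of how the repetition in the dates.each block might be reduced: the code/column/label triples from the question can be moved into two lookup tables, one for inbound and one for outbound, so that the date and direction are written only once. All names here (`data_extract`, `build_rows`, `to_count`, `INBOUND_FIELDS`, `OUTBOUND_FIELDS`, `BLOCK_SIZE`) are hypothetical. The sketch assumes the rows are already parsed into arrays of strings, as `@data` appears to be in the question, and that the dates are passed in rather than scanned from the file.

```ruby
# [code, column indexes to add together, label], copied from the pushes in the question.
INBOUND_FIELDS = [
  ['4001', [1, 2], 'AI'],
  ['4090', [3],    'AI'],
  ['1139', [4],    'RU STANDRD'],
  ['1158', [5],    'RU STANDRD']
].freeze

OUTBOUND_FIELDS = [
  ['4055', [1], 'RU PLUS'],
  ['4055', [2], 'AR'],
  ['1139', [4], 'RU STANDRD'],
  ['1158', [5], 'RU STANDRD'],
  ['4091', [3], 'RU STANDRD']
].freeze

# Each day occupies 10 rows plus one blank line in the sample.
BLOCK_SIZE = 11

# Remove thousands separators and convert to an integer ("2,738" becomes 2738).
def to_count(cell)
  cell.delete(',').to_i
end

# Build one output row per table entry for a single CSV row.
# Note: unlike the original, every value goes through to_i, so a blank
# or non-numeric cell would come out as "0" (an assumption of this sketch).
def build_rows(date, row, fields, direction)
  fields.map do |code, columns, label|
    total = columns.sum { |i| to_count(row[i]) }
    [date, code, total.to_s, label, direction]
  end
end

# dates: array of date strings; data: parsed CSV rows (arrays of strings).
def data_extract(dates, data)
  dates.each_with_index.flat_map do |date, index|
    offset = BLOCK_SIZE * index
    build_rows(date, data[4 + offset], INBOUND_FIELDS, 'INBOUND') +
      build_rows(date, data[6 + offset], OUTBOUND_FIELDS, 'OUTBOUND')
  end
end

# Usage with only the Sorted and Shipped rows of the sample filled in.
rows = Array.new(BLOCK_SIZE) { [] }
rows[4] = ['Sorted', '2,738', '190', '1,110', '144', '228', '4,410', nil]
rows[6] = ['Shipped', '2,700', '0', '1,865', '0', '0', '4,565', nil]
p data_extract(['11/4/15'], rows).first
# => ["11/4/15", "4001", "2928", "AI", "INBOUND"]
```

Keeping the tables separate from the loop means a new code or label is a one-line change, and the inbound and outbound halves can be returned as two arrays instead of being concatenated if that suits the later comparison better.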