How to "split and group" an array of objects based on one of their properties

I have an Array of instances of a class called TimesheetEntry.

Here is the TimesheetEntry constructor:

def initialize(parameters = {})
    @date       = parameters.fetch(:date)
    @project_id = parameters.fetch(:project_id)
    @article_id = parameters.fetch(:article_id)
    @hours      = parameters.fetch(:hours)
    @comment    = parameters.fetch(:comment)
end
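The snippets further down call entry.date, so the class presumably exposes its instance variables through readers. A minimal sketch of the whole class under that assumption (the attr_reader line and the sample values are not part of the original code):

```ruby
require 'date'

class TimesheetEntry
  # Assumed: readers for every field, since later code calls entry.date.
  attr_reader :date, :project_id, :article_id, :hours, :comment

  def initialize(parameters = {})
    # Hash#fetch raises KeyError when a key is missing, so every field is required.
    @date       = parameters.fetch(:date)
    @project_id = parameters.fetch(:project_id)
    @article_id = parameters.fetch(:article_id)
    @hours      = parameters.fetch(:hours)
    @comment    = parameters.fetch(:comment)
  end
end

# Made-up sample values, for illustration only.
entry = TimesheetEntry.new(
  :date       => Date.new(2015, 1, 16),
  :project_id => 1,
  :article_id => 2,
  :hours      => 7.5,
  :comment    => "N/A"
)

# Leaving out a required key fails fast with a KeyError.
missing_key_error =
  begin
    TimesheetEntry.new(:date => Date.new(2015, 1, 16))
    nil
  rescue KeyError => error
    error
  end
```

Because of fetch, a malformed row surfaces as an exception at construction time instead of as a nil field later on.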

The Array of TimesheetEntry objects is built from the data in a .csv file:

timesheet_entries = []
CSV.parse(source_file, csv_parse_options).each do |row|
    timesheet_entries.push(TimesheetEntry.new(
        :date       => Date.parse(row['Date']),
        :project_id => row['Project'].to_i,
        :article_id => row['Article'].to_i,
        :hours      => row['Hours'].gsub(',', '.').to_f,
        :comment    => row['Comment'].to_s.empty? ? "N/A" : row['Comment']
    ))
end

I also have a Set of Hashes, each containing two elements, created like this:

all_timesheets = Set.new []
timesheet_entries.each do |entry|
    all_timesheets << { 'date' => entry.date, 'entries' => [] }
end

Now I want to fill the arrays inside those hashes with TimesheetEntries. Each hash's array must contain only the TimesheetEntries for one specific date.

I do that like this:

timesheet_entries.each do |entry|
    all_timesheets.each do |timesheet|
        if entry.date == timesheet['date']
            timesheet['entries'].push entry
        end
    end
end

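Although this approach gets the job done, it is not very efficient (I'm fairly new to this).

Question

What would be a more efficient way to achieve the same end result? In essence, I want to "split" the Array of TimesheetEntry objects, "grouping" the objects that share the same date.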

Source

2015-01-16 Leif

You can fix the performance problem by replacing the Set with a Hash, which is a dictionary-like data structure.

This means that your inner loop, all_timesheets.each do |timesheet| ... if entry.date ..., is replaced by a much more efficient hash lookup: all_timesheets[entry.date].
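As a rough sketch of that intermediate step, here is the question's two-pass shape with the Set swapped for a Hash keyed by date. SampleEntry is a hypothetical stand-in for TimesheetEntry, and the dates and hours are made up:

```ruby
require 'date'

# Hypothetical stand-in for TimesheetEntry, reduced to two fields.
SampleEntry = Struct.new(:date, :hours)

timesheet_entries = [
  SampleEntry.new(Date.new(2015, 1, 15), 2.0),
  SampleEntry.new(Date.new(2015, 1, 16), 3.5),
  SampleEntry.new(Date.new(2015, 1, 15), 1.5)
]

# First pass: create one empty array per distinct date.
all_timesheets = {}
timesheet_entries.each { |entry| all_timesheets[entry.date] = [] }

# Second pass: a single hash lookup per entry replaces the scan
# over every timesheet in the Set.
timesheet_entries.each do |entry|
  all_timesheets[entry.date] << entry
end
```

With n entries spread over d dates, the nested loops make n * d comparisons, while this version does n lookups.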

Also, there is no need to create the keys in advance and then populate the date groups afterwards. Both can be done in a single pass:

all_timesheets = {} 

timesheet_entries.each do |entry| 
    all_timesheets[entry.date] ||= [] # create the key if it's not already there 
    all_timesheets[entry.date] << entry 
end
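In case the ||= operator is unfamiliar: x ||= y assigns y only when x is currently nil or false. A small sketch with a made-up key:

```ruby
groups = {}

groups[:monday] ||= []   # key is missing (nil), so [] is assigned
groups[:monday] << 1

groups[:monday] ||= []   # value is already a (truthy) array, so nothing changes
groups[:monday] << 2

groups[:monday]          # => [1, 2]
```

One caveat: because the test is truthiness, a key that is deliberately set to nil or false would be overwritten as well. That does not matter here, since every value is an array.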

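A nice thing about Hashes is that you can customize how they behave when a non-existent key is looked up. You can use the constructor with a block to specify what happens in that case. Let's tell our hash to add new keys automatically and initialize them with an empty array. That lets us drop the all_timesheets[entry.date] ||= [] line from the code above: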

all_timesheets = Hash.new { |hash, key| hash[key] = [] } 

timesheet_entries.each do |entry| 
    all_timesheets[entry.date] << entry 
end
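The block form matters here. Passing the array as a plain default value, Hash.new([]), looks similar but behaves differently, as this small sketch with made-up keys illustrates:

```ruby
# Default *value*: a single array object is shared by every missing key,
# and reading a missing key never stores anything in the hash.
shared = Hash.new([])
shared[:a] << 1
shared[:b] << 2
shared.keys     # => []
shared[:c]      # => [1, 2]

# Default *block*: each missing key gets its own new array,
# and the block stores it under that key.
per_key = Hash.new { |hash, key| hash[key] = [] }
per_key[:a] << 1
per_key[:b] << 2
per_key.keys    # => [:a, :b]
per_key[:a]     # => [1]
```

So for grouping, the block (or the explicit ||= line) is the form to use.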

There is, however, an even more concise way to do this grouping, using the Enumerable#group_by method:

all_timesheets = timesheet_entries.group_by { |e| e.date }

And of course there is a way to make this even more concise, using yet another trick:

all_timesheets = timesheet_entries.group_by(&:date)
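The trick is that & in front of a Symbol calls Symbol#to_proc, which builds a proc that sends that method to its argument. A short sketch showing that the three spellings group identically; DemoEntry is a hypothetical stand-in for TimesheetEntry with made-up data:

```ruby
require 'date'

# Hypothetical stand-in for TimesheetEntry.
DemoEntry = Struct.new(:date, :hours)

entries = [
  DemoEntry.new(Date.new(2015, 1, 15), 2.0),
  DemoEntry.new(Date.new(2015, 1, 16), 3.5),
  DemoEntry.new(Date.new(2015, 1, 15), 1.5)
]

by_block  = entries.group_by { |e| e.date }
by_symbol = entries.group_by(&:date)
by_proc   = entries.group_by(&:date.to_proc)   # what & does with a Symbol

by_block == by_symbol   # => true
by_symbol == by_proc    # => true

# If the array-of-hashes shape from the question is still needed,
# it can be rebuilt from the grouped Hash.
timesheets = by_symbol.map do |date, group|
  { 'date' => date, 'entries' => group }
end
```

group_by returns a Hash whose keys are the dates and whose values are arrays of entries, in the order the entries were encountered.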

Source

2015-01-16 08:01:09 GolfWolf

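Thanks! I learned several new things from your answer. I had always assumed that 'Set' was faster than 'Hash' or 'Array' for filtering out unique items, since duplicates are ignored automatically when you try to add them; I need to dig deeper into that. Also, I didn't know about '||=' or 'group_by'. – Leif

Following your advice, I replaced almost all of my code with a few lines: 'all_timesheets = timesheet_entries.group_by(&:date)', and it is faster too. Man, I love Ruby. Thanks again for the pointers. – Leif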
