2017-02-04
0

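dplyr: merging two datasets, finished-goods sales and bill of materials

I am analyzing my company's raw material demand. My approach is to combine the sales records of finished goods with the bill of materials for each finished good. The problem is that each finished good is made up of multiple components, and many finished goods share common components. I want to keep every individual sales record for each finished good and multiply FG_UnitsSold by each component's per-unit quantity to get the raw material demand. Here is some sample code: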

library(dplyr)

# One row per sales order of a finished good
fg_Sales <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 2),
                   Order_Date = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold = c(100, 200, 300, 400, 500, 600))

# One row per (finished good, component) pair; Qty is the per-unit quantity
bill_materials <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 4),
                         Components = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
                         Qty = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)

I am familiar with dplyr's left_join, but it does not seem to work here: it always gives me only the first component of each finished good.

Can anyone help? Thanks.

Answers

0

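Maybe I am misunderstanding the question, but if you group both data frames by FG_PartNumber and build a summary table of the quantity you are interested in for each, you can get the totals you are looking for: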

library(dplyr)

# Create data
set.seed(1)
fg_Sales <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 2),
                   Order_Date = seq.Date(as.Date("2011-1-1"), as.Date("2012-1-10"), length.out = 6),
                   FG_UnitsSold = c(100, 200, 300, 400, 500, 600))

bill_materials <- tibble(FG_PartNumber = rep(c("A", "B", "C"), 4),
                         Components = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
                         Qty = rnorm(n = 12, mean = 3, sd = 1)) %>%
  arrange(FG_PartNumber)

# Make summary tables for sales and quantity
tot_sales <- fg_Sales %>%
  group_by(FG_PartNumber) %>%
  summarise(tot_sales = sum(FG_UnitsSold))

tot_materials <- bill_materials %>%
  group_by(FG_PartNumber) %>%
  summarise(tot_qty = sum(Qty))

# Join the summary tables together on the shared key
df <- left_join(tot_sales, tot_materials, by = "FG_PartNumber")
> df
# A tibble: 3 × 3
  FG_PartNumber tot_sales  tot_qty
          <chr>     <dbl>    <dbl>
1             A       500 13.15087
2             B       700 14.76326
3             C       900 11.30953
0

I think inner_join from dplyr is the best option here:

library(dplyr)
fg_Sales_ext <- inner_join(x = fg_Sales,
                           y = bill_materials,
                           by = "FG_PartNumber")

From the inner_join documentation: "If there are multiple matches between x and y, all combination of the matches are returned."

With fg_Sales_ext you can now use group_by and summarise to perform any kind of analysis.
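As a rough sketch of what that analysis could look like for the raw material demand in the question: multiply the units sold by the per-unit component quantity on every joined row, then total by component. The column names Component_Demand and Total_Demand are made up for illustration, and the data is the sample data from the question with a fixed seed. Recent dplyr versions may warn that the join is many-to-many, which is expected here because each part number appears several times in both tables.

```r
library(dplyr)

# Sample data from the question, with a fixed seed so the run is repeatable
set.seed(1)
fg_Sales <- tibble(
  FG_PartNumber = rep(c("A", "B", "C"), 2),
  Order_Date    = seq.Date(as.Date("2011-01-01"), as.Date("2012-01-10"), length.out = 6),
  FG_UnitsSold  = c(100, 200, 300, 400, 500, 600)
)

bill_materials <- tibble(
  FG_PartNumber = rep(c("A", "B", "C"), 4),
  Components    = c("C1", "C2", "C3", "C4", "C5", "C6", "C7", "C7", "C7", "C8", "C8", "C9"),
  Qty           = rnorm(n = 12, mean = 3, sd = 1)
)

# Every sales record is paired with every component of its finished good
fg_Sales_ext <- inner_join(fg_Sales, bill_materials, by = "FG_PartNumber")

# Component_Demand and Total_Demand are hypothetical names chosen for this sketch
component_demand <- fg_Sales_ext %>%
  mutate(Component_Demand = FG_UnitsSold * Qty) %>%  # demand per sales record and component
  group_by(Components) %>%
  summarise(Total_Demand = sum(Component_Demand))    # shared components are added together

component_demand
```

Because fg_Sales_ext keeps one row per sales record and component, the same table can also be grouped by Order_Date or FG_PartNumber if demand is needed per order or per finished good.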

+0

Hi evgeniC, this is exactly what I needed. Thanks for your help!