2016-12-31 43 views
3

Data categorization

I have a data frame named Data with the following columns:

Model    Garage  City  Unit.Price  Invoice.Date  Components
Hyundai  A       NY    500         31/12/2016    HL
Honda    B       NJ    700         31/12/2016    TL
Porsche  A       NY    800         30/12/2016    TL
BMW      B       NJ    800         30/12/2016    HL
BMW      A       NJ    700         31/12/2016    HL
Porsche  B       NY    800         30/12/2016    TL
Honda    A       NY    400         30/12/2016    TL
Honda    A       NY    500         30/12/2016    HL
Honda    B       NY    600         30/12/2016    HL
Honda    A       NY    200         29/12/2016    TL
Honda    A       NY    300         29/12/2016    HL
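To make the examples reproducible, the table above can be typed in as a data frame. This is only a sketch: it assumes the text columns are plain character vectors and the prices are integers, and it keeps Invoice.Date as a dd/mm/yyyy string, because that is what the code further down parses. The object name Data follows the question; the answers refer to the same data as df1, dat and df.

```r
# Sample data copied row by row from the table above.
Data <- data.frame(
  Model        = c("Hyundai", "Honda", "Porsche", "BMW", "BMW", "Porsche",
                   "Honda", "Honda", "Honda", "Honda", "Honda"),
  Garage       = c("A", "B", "A", "B", "A", "B", "A", "A", "B", "A", "A"),
  City         = c("NY", "NJ", "NY", "NJ", "NJ", "NY", "NY", "NY", "NY", "NY", "NY"),
  Unit.Price   = c(500L, 700L, 800L, 800L, 700L, 800L, 400L, 500L, 600L, 200L, 300L),
  Invoice.Date = c("31/12/2016", "31/12/2016", "30/12/2016", "30/12/2016",
                   "31/12/2016", "30/12/2016", "30/12/2016", "30/12/2016",
                   "30/12/2016", "29/12/2016", "29/12/2016"),
  Components   = c("HL", "TL", "TL", "HL", "HL", "TL", "TL", "HL", "HL", "TL", "HL"),
  stringsAsFactors = FALSE
)

# Inspect the column types; Invoice.Date is still text at this point.
str(Data)
```

Sorting the raw strings only gives chronological order by accident here, since every invoice falls in the same month and year; parsing them with as.Date(Data$Invoice.Date, format = "%d/%m/%Y") avoids relying on that.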

I want the output split by car model and sorted by Invoice.Date, so that the current (most recent) price is the one captured.

Example: Honda

Components  GarageA  GarageB
HL          500      600
TL          400      700

This is how I started:

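library(reshape2)

Category <- c("BMW","Honda","Porsche","Hyundai")

for(m in Category)
{
    X <- subset(Data, Model == m)
    # Sort the subset (not the full Data), newest invoice first
    X <- X[order(as.Date(X$Invoice.Date, "%d/%m/%Y"), decreasing = TRUE),]
    # length() only counts the invoices in each cell; this is the part I need to replace
    Pivot <- dcast(X, Components ~ Garage, value.var = "Unit.Price", fun.aggregate = function(x) length(x))
    write.csv(Pivot, file = paste(m, "Cars.csv", sep = "_"))
}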

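The only problem I have is mapping the correct unit price. Is there any code or function that does this with dcast? dcast offers aggregations such as sum and count. What if I want the exact amount rather than a sum or an average?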

Answers

0

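We can do this with dcast from data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)), order the rows by 'Invoice.Date' with the newest invoice first, and dcast from 'long' to 'wide' while specifying fun.aggregate to select only the first observation in each cell, which is then the most recent price.

library(data.table)
library(lubridate)
# Sort newest first, then keep the first (most recent) price per cell
dcast(setDT(df1)[order(dmy(Invoice.Date), decreasing = TRUE)],
    Model + Components ~ paste0("Garage", Garage),
    value.var = "Unit.Price", fun.aggregate = function(x) x[1])
#      Model Components GarageA GarageB
#1:     BMW         HL     700     800
#2:   Honda         HL     500     600
#3:   Honda         TL     400     700
#4: Hyundai         HL     500      NA
#5: Porsche         TL     800     800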
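A quick way to check this kind of reshape against the Honda table in the question is to run it on that model alone. The sketch below assumes the sample rows from the question, parses the dates with base as.Date rather than lubridate, and stores the column labels in a helper column called GarageCol, a name introduced here purely for illustration.

```r
library(data.table)

# Sample rows from the question (City is omitted because the reshape does not use it).
df1 <- data.table(
  Model        = c("Hyundai", "Honda", "Porsche", "BMW", "BMW", "Porsche",
                   "Honda", "Honda", "Honda", "Honda", "Honda"),
  Garage       = c("A", "B", "A", "B", "A", "B", "A", "A", "B", "A", "A"),
  Unit.Price   = c(500L, 700L, 800L, 800L, 700L, 800L, 400L, 500L, 600L, 200L, 300L),
  Invoice.Date = c("31/12/2016", "31/12/2016", "30/12/2016", "30/12/2016",
                   "31/12/2016", "30/12/2016", "30/12/2016", "30/12/2016",
                   "30/12/2016", "29/12/2016", "29/12/2016"),
  Components   = c("HL", "TL", "TL", "HL", "HL", "TL", "TL", "HL", "HL", "TL", "HL")
)

# Parse dd/mm/yyyy so that ordering is chronological.
df1[, Invoice.Date := as.Date(Invoice.Date, format = "%d/%m/%Y")]
# Hypothetical helper column holding the wide column labels.
df1[, GarageCol := paste0("Garage", Garage)]
# Newest invoice first, so the first value in each cell is the current price.
setorder(df1, -Invoice.Date)

honda <- dcast(
  df1[Model == "Honda"],
  Components ~ GarageCol,
  value.var = "Unit.Price",
  fun.aggregate = function(x) x[1]
)
print(honda)
```

Using x[1] rather than something like tail(x, 1) is a deliberate choice: when a cell has no rows, x[1] on an empty vector still returns a single NA, which is what dcast needs to fill the gap.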
1

You can do this:

library(tidyverse) # dplyr and tidyr would be enough...
dat %>%
    mutate(Invoice.Date = as.Date(Invoice.Date, "%d/%m/%Y")) %>%
    group_by(Model, Garage, Components) %>%
    # last() with order_by picks the price from the most recent invoice
    summarise(Unit.Price = last(Unit.Price, order_by = Invoice.Date)) %>%
    spread(Garage, Unit.Price, sep = "")

Which gives you:

    Model Components GarageA GarageB
*   <chr>      <chr>   <int>   <int>
1     BMW         HL     700     800
2   Honda         HL     500     600
3   Honda         TL     400     700
4 Hyundai         HL     500      NA
5 Porsche         TL     800     800

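Now, I am not sure how to interpret the part of your question about splitting the output by car. You can pipe (%>%) the above into

  • split(.$Model) to get a list in which each element holds one model.
  • nest(-Model) to get a nested tibble ...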
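As one possible reading of "split by car", the split() route can be combined with the per-model CSV files from the question. This is a sketch under assumptions: it rebuilds the sample data as dat, uses pivot_wider (the newer tidyr counterpart of spread), writes into a temporary directory, and reuses the _Cars.csv suffix from the question's own loop. The names wide, by_model and out_dir are introduced here for illustration.

```r
library(dplyr)
library(tidyr)

# Sample rows from the question (City is omitted because it is not used).
dat <- data.frame(
  Model        = c("Hyundai", "Honda", "Porsche", "BMW", "BMW", "Porsche",
                   "Honda", "Honda", "Honda", "Honda", "Honda"),
  Garage       = c("A", "B", "A", "B", "A", "B", "A", "A", "B", "A", "A"),
  Unit.Price   = c(500L, 700L, 800L, 800L, 700L, 800L, 400L, 500L, 600L, 200L, 300L),
  Invoice.Date = c("31/12/2016", "31/12/2016", "30/12/2016", "30/12/2016",
                   "31/12/2016", "30/12/2016", "30/12/2016", "30/12/2016",
                   "30/12/2016", "29/12/2016", "29/12/2016"),
  Components   = c("HL", "TL", "TL", "HL", "HL", "TL", "TL", "HL", "HL", "TL", "HL"),
  stringsAsFactors = FALSE
)

# Most recent price per model, garage and component, spread into one column per garage.
wide <- dat %>%
  mutate(Invoice.Date = as.Date(Invoice.Date, "%d/%m/%Y")) %>%
  group_by(Model, Garage, Components) %>%
  summarise(Unit.Price = last(Unit.Price, order_by = Invoice.Date), .groups = "drop") %>%
  pivot_wider(names_from = Garage, values_from = Unit.Price, names_prefix = "Garage")

# Named list with one table per model.
by_model <- split(wide, wide$Model)

# Write each model's table to its own CSV file.
out_dir <- tempdir()
for (m in names(by_model)) {
  write.csv(by_model[[m]], file.path(out_dir, paste0(m, "_Cars.csv")), row.names = FALSE)
}

print(by_model$Honda)
```

The list form keeps each model addressable by name (by_model$Honda), which is convenient when every table ends up in a separate file anyway.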
0

And consider the best solution in R, base:

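library(base) # COMPLETELY REDUNDANT =)

# Sort chronologically; the dd/mm/yyyy strings have to be parsed first
df <- df[order(as.Date(df$Invoice.Date, "%d/%m/%Y")),]
# tail(i, 1) keeps the last, i.e. most recent, price in each group
dfagg <- aggregate(Unit.Price ~ Model + Components + Garage, df, function(i) tail(i, 1))
dfwide <- reshape(dfagg, timevar='Garage', idvar=c('Model', 'Components'), direction="wide")
# fixed = TRUE treats the dots literally instead of as regex wildcards
names(dfwide) <- gsub("Unit.Price.", "Garage", names(dfwide), fixed = TRUE)

#     Model Components GarageA GarageB
# 1     BMW         HL     700     800
# 2   Honda         HL     500     600
# 3 Hyundai         HL     500      NA
# 4   Honda         TL     400     700
# 5 Porsche         TL     800     800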