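Reshape a long data frame to a wide data frame using the merge function, grouped by two factors
I want to reshape a long data frame into a wide data frame using two grouping variables (resp & company) and three numeric response variables (Quality, Amount, Sense). I tried to do this with the dcast function, but it does not let me group by two variables. Can anyone help me?
#Current long dataframe: two grouping variables (resp & company), three numerical response variables (Quality, Amount, Sense)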
resp <- c(1325851107,1325851108,1325851109,1325851107,1325851108,1325851109,1325851107,1325851108,1325851109,1325851107,1325851108,1325851109)
company <- c("Dark.nl","Dark.nl","Dark.nl","Dark.nl","Dark.nl","Dark.nl","Manual.nl","Manual.nl","Manual.nl","Dark.nl","Dark.nl","Dark.nl")
question <- c("Quality","Quality","Quality","Amount","Amount","Amount","Quality","Quality","Quality","Sense","Sense","Sense")
score <- c(4,1,2,6,8,10,5,5,7,4,6,7)
current <- data.frame(resp,company,question,score); current
#Desired wide dataframe
resp2 <- c(1325851107,1325851107,1325851108,1325851108,1325851109,1325851109)
company2 <- c("Dark.nl","Manual.nl","Dark.nl","Manual.nl","Dark.nl","Manual.nl")
Quality <- c(4,5,1,5,2,7)
Amount <- c(6,NA,8,NA,10,NA)
Sense <- c(4,NA,6,NA,7,NA)
desired <- data.frame(resp2,company2,Quality,Amount,Sense); desired
#Using dcast function to reshape
library("reshape2")
dcast(current, resp + company ~ question, value.var="score")
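For comparison, here is a minimal sketch of the same long-to-wide reshape with base R's `reshape()`, which takes both grouping columns through `idvar`. It rebuilds the sample data from the four vectors above (resp, company, question, score); the name `wide` and the final renaming and sorting steps are illustrative assumptions, not part of the original script.

```r
# Sample data in long format: one row per respondent, company and question
current <- data.frame(
  resp     = rep(c(1325851107, 1325851108, 1325851109), times = 4),
  company  = rep(c("Dark.nl", "Dark.nl", "Manual.nl", "Dark.nl"), each = 3),
  question = rep(c("Quality", "Amount", "Quality", "Sense"), each = 3),
  score    = c(4, 1, 2, 6, 8, 10, 5, 5, 7, 4, 6, 7)
)

# Both grouping variables go into idvar; question supplies the new column names
wide <- reshape(current,
                idvar     = c("resp", "company"),
                timevar   = "question",
                direction = "wide")

# reshape() names the new columns score.Quality, score.Amount, ...; drop the prefix
names(wide) <- sub("^score\\.", "", names(wide))

# Sort by respondent and company so the rows line up with the desired layout
wide <- wide[order(wide$resp, wide$company), ]
wide
```

Combinations that have no score for a question (here Amount and Sense for Manual.nl) come out as NA.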
The merge approach provided by Parfait works. I am posting the script that did the trick here (thanks Parfait ;)).
cols2keep <- c("resp", "company", "score")
df <- merge(current[current$question=='Quality', cols2keep], #merge the Quality and Amount subsets
            current[current$question=='Amount', cols2keep],
            by=c("resp", "company"), all=TRUE)
df <- merge(df, current[current$question=='Sense', cols2keep], #merge third response variable into new dataframe
            by=c("resp", "company"), all=TRUE)
colnames(df) <- c("resp","company","quality","amount","sense")
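This solution works, but my real dataset contains 53 response variables, so writing out one merge per variable is very time-consuming. I tried Parfait's iterative approach, but I get the following error.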
dfList <- lapply(unique(current$question), function(i){
temp <- setNames(current[current$question==i, c("resp", "company", "score")],
c("resp", "company", paste0(i)))
})
finaldf <- Reduce(function(...) merge(..., y=c("resp", "company"), all=T), dfList)
Error in fix.by(by.x, x) :
'by' must specify one or more columns as numbers, names or logical
I am fairly new to R and cannot work out what I did wrong. I am happy with the current solution, but I am open to a more efficient one.
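The error most likely comes from the `y=` argument in the `Reduce` call. `Reduce` passes two data frames to `merge` positionally; because `y` is already taken by the character vector, the second data frame appears to be matched to `by`, which `fix.by` then rejects. A minimal sketch of the corrected call, assuming the sample data above built without the `answer` column:

```r
# Sample data in long format: one row per respondent, company and question
current <- data.frame(
  resp     = rep(c(1325851107, 1325851108, 1325851109), times = 4),
  company  = rep(c("Dark.nl", "Dark.nl", "Manual.nl", "Dark.nl"), each = 3),
  question = rep(c("Quality", "Amount", "Quality", "Sense"), each = 3),
  score    = c(4, 1, 2, 6, 8, 10, 5, 5, 7, 4, 6, 7)
)

# One data frame per question, with the score column renamed after the question
dfList <- lapply(unique(current$question), function(i) {
  setNames(current[current$question == i, c("resp", "company", "score")],
           c("resp", "company", as.character(i)))
})

# by= names the key columns; x and y stay the two data frames supplied by Reduce
finaldf <- Reduce(function(x, y) merge(x, y, by = c("resp", "company"), all = TRUE),
                  dfList)
finaldf
```

Because the list is built from `unique(current$question)`, the same code should cover all 53 response variables without further changes.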
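Thank you very much, Parfait. This script is easy to use and produces the data frame I had in mind. – SHW
Good to hear! Glad to help. Please accept the answer to confirm the solution. Happy coding! – Parfait
I am now running into trouble when one of my grouping variables (company) has more than two levels (see the additional code I added to the original post: #Grouping variable with more than two levels, including "Senses"). I get this error: Error in fix.by(by.x, x) : 'by' must specify one or more columns as numbers, names or logical. Any idea what is going wrong here? – SHW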