Multiple conditions in the by clause for a data.table sum
Here is some toy data:
df = data.frame(ID = c(1,1,1,2,2,2,2,3,3),
food = c("bacon","bacon","bacon","bacon","bacon","cheese","sausage","avocado","ham"),
enjoyment = c(20,20,20,20,20,20,20,20,20))
which gives
  ID    food enjoyment
1  1   bacon        20
2  1   bacon        20
3  1   bacon        20
4  2   bacon        20
5  2   bacon        20
6  2  cheese        20
7  2 sausage        20
8  3 avocado        20
9  3     ham        20
What I want to do is, for each person (ID), sum their enjoyment of bacon and cheese only.
My code so far is
library(data.table)
setDT(df)
df[,id_enjoyment_sum := sum(enjoyment), by =.(ID,food == "bacon"|food == "cheese")]
which gives
   ID    food enjoyment id_enjoyment_sum
1:  1   bacon        20               60
2:  1   bacon        20               60
3:  1   bacon        20               60
4:  2   bacon        20               60
5:  2   bacon        20               60
6:  2  cheese        20               60
7:  2 sausage        20               20
8:  3 avocado        20               40
9:  3     ham        20               40
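This does what I want for bacon and cheese, but it also sums, for each person, their enjoyment of the foods that are not bacon or cheese. Note that ID 3 ate no bacon or cheese, yet my code still sums his enjoyment of what he did eat.
Ideally, the code would give
   ID    food enjoyment id_enjoyment_sum
1:  1   bacon        20               60
2:  1   bacon        20               60
3:  1   bacon        20               60
4:  2   bacon        20               60
5:  2   bacon        20               60
6:  2  cheese        20               60
7:  2 sausage        20               60
8:  3 avocado        20                0
9:  3     ham        20                0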
So my question is: how do I build the by clause to sum, for each ID, the enjoyment of bacon and cheese only?
I think you want `df[food %in% c("bacon", "cheese"), s := sum(enjoyment), by = ID]`. I recommend going through the vignettes, which clarify the typical syntax patterns: https://github.com/Rdatatable/data.table/wiki/Getting-started – Frank
`df[, s := sum(enjoyment[food %in% c("bacon", "cheese")]), by = ID]` computes the expected result – HubertL
Thank you both. HubertL's solution worked on my real data; for some reason, Frank's gave the same result as my original solution –
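For anyone comparing the two suggestions in the comments, here is a minimal sketch that runs both on the toy data from the question. The column names `sum_in_j` and `sum_in_i` and the helper vector `wanted` are made up for illustration and are not from the original posts. On this toy data, subsetting inside `j` while grouping by ID alone reproduces the desired column, whereas filtering in `i` only assigns to the rows that pass the filter and leaves the others as NA.

```r
library(data.table)

# Toy data from the question
df <- data.table(
  ID = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  food = c("bacon", "bacon", "bacon", "bacon", "bacon",
           "cheese", "sausage", "avocado", "ham"),
  enjoyment = rep(20, 9)
)

# Hypothetical helper: the foods whose enjoyment should be summed
wanted <- c("bacon", "cheese")

# Subset inside j: group by ID only, sum just the matching rows, and
# assign that total to every row of the ID. An ID with no matching rows
# gets sum(numeric(0)), which is 0.
df[, sum_in_j := sum(enjoyment[food %in% wanted]), by = ID]

# Filter in i: only matching rows are grouped and assigned, so rows
# that fail the filter keep NA in the new column.
df[food %in% wanted, sum_in_i := sum(enjoyment), by = ID]

print(df)
```

The condition does not need to appear in `by` at all: `by` only decides which rows share a total, while the subset inside `sum()` decides which rows contribute to it.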