How do I create a new column in R using a variable lookup?
I have a data table that looks like this:
                    Cause of Death                Ethnicity Count
1: ACCIDENTS EXCEPT DRUG POISONING ASIAN & PACIFIC ISLANDER  1368
2: ACCIDENTS EXCEPT DRUG POISONING                 HISPANIC  3387
3: ACCIDENTS EXCEPT DRUG POISONING       NON-HISPANIC BLACK  3240
4: ACCIDENTS EXCEPT DRUG POISONING       NON-HISPANIC WHITE  6825
5:              ALZHEIMERS DISEASE ASIAN & PACIFIC ISLANDER   285
---
I want to create a new column that is simply the percentage each ethnicity contributes to the deaths from one specific cause. Like this:
                    Cause of Death                Ethnicity Count PercentofDeath
1: ACCIDENTS EXCEPT DRUG POISONING ASIAN & PACIFIC ISLANDER  1368     0.09230769
2: ACCIDENTS EXCEPT DRUG POISONING                 HISPANIC  3387     0.22854251
3: ACCIDENTS EXCEPT DRUG POISONING       NON-HISPANIC BLACK  3240     0.21862348
4: ACCIDENTS EXCEPT DRUG POISONING       NON-HISPANIC WHITE  6825     0.46052632
5:              ALZHEIMERS DISEASE ASIAN & PACIFIC ISLANDER   285     0.04049446
---
Here is my code to do this, which is rather ugly:
library(data.table)
#load library, change to data table
COD.dt <- as.data.table(COD)
#function for adding the percent column
lala <- function(x){
  #see if I have initialized data.table I'm going to append to
  if(exists("started")){
    p <- COD.dt[x ==`Cause of Death`]
    blah <- COD.dt[x ==`Cause of Death`]$Count/sum(COD.dt[x ==`Cause of Death`]$Count)
    p$PercentofDeath <- blah
    started <<- rbind(started,p)
  }
  #initialize data table
  else{
    l <- COD.dt[x ==`Cause of Death`]
    blah <- COD.dt[x ==`Cause of Death`]$Count/sum(COD.dt[x ==`Cause of Death`]$Count)
    l$PercentofDeath <- (blah)
    started <<- l
  }
  #if finished return
  if(x == unique(COD.dt$`Cause of Death`)[length(unique(COD.dt$`Cause of Death`))]){
    return(started)
  }
}
#run function
h <- sapply(unique(COD.dt$`Cause of Death`), lala)
#remove from environment
rm(started)
#h actually ends up being a list; the last element happens to be the one I want, so I take that one
finalTable <- h$`VIRAL HEPATITIS`
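So, as you can see, this code is pretty ugly and not adaptable. I'm hoping for some guidance on how to make this better. Maybe using dplyr or some other function?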
Best answer
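The comments below mention %>% and mutate, which points to a grouped dplyr pipeline. Here is a minimal sketch of that idea, not necessarily the exact code the commenters were looking at. It assumes the dplyr package is installed, and it rebuilds only the four ACCIDENTS rows shown in the question as toy data; the name COD comes from the question, but the rest of the real table is not reproduced.

```r
library(dplyr)

# Toy stand-in for the question's COD table: only the first four rows shown there.
# check.names = FALSE keeps the column name with spaces intact.
COD <- data.frame(
  `Cause of Death` = rep("ACCIDENTS EXCEPT DRUG POISONING", 4),
  Ethnicity = c("ASIAN & PACIFIC ISLANDER", "HISPANIC",
                "NON-HISPANIC BLACK", "NON-HISPANIC WHITE"),
  Count = c(1368, 3387, 3240, 6825),
  check.names = FALSE
)

# Within each cause of death, divide each count by that cause's total.
finalTable <- COD %>%
  group_by(`Cause of Death`) %>%
  mutate(PercentofDeath = Count / sum(Count)) %>%
  ungroup()

print(finalTable)
```

Since the question already converts to a data.table, the same grouped calculation can probably stay in data.table as well, along the lines of COD.dt[, PercentofDeath := Count / sum(Count), by = "Cause of Death"], which adds the column by reference with no loop and no global variable.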
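Whoa. This is great. I've been meaning to use the %>% operator. Thanks a lot. – njBernstein
Not sure it improves readability, but with magrittr, 'PercentofDeath = Count %>% {./sum(.)}' works inside 'mutate'. – Frank
@Frank I'd say it considerably reduces readability. – Gregor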
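To make the comparison in the comments concrete, here is a small sketch of the two spellings side by side. It assumes the magrittr package is installed and uses the four counts from the question as a bare vector; the variable names plain and piped are made up for illustration.

```r
library(magrittr)

# The four ACCIDENTS counts from the question.
Count <- c(1368, 3387, 3240, 6825)

# Plain form: each count over the total.
plain <- Count / sum(Count)

# Piped form from the comment: wrapping the right-hand side in braces makes
# `.` stand for Count everywhere inside, rather than being inserted as the
# first argument of a function call.
piped <- Count %>% { . / sum(.) }

# Both spellings give the same numbers.
stopifnot(isTRUE(all.equal(plain, piped)))
```

Both forms compute the same values, so the choice between them comes down to readability, which is the point the two commenters disagree on.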