In base R, you can compute a binary (0/1) vector with ave:
Democrat$winner <- ave(Democrat$fraction_votes, Democrat$fips, FUN=function(i) i == max(i))
which returns
Democrat
    state state_abbreviation  county fips    party candidate votes fraction_votes winner
1 Alabama                 AL Autauga 1001 Democrat    Bernie   544          0.182      0
2 Alabama                 AL Autauga 1001 Democrat   Hillary  2387          0.800      1
3 Alabama                 AL Baldwin 1003 Democrat    Bernie  2694          0.329      0
4 Alabama                 AL Baldwin 1003 Democrat   Hillary  5290          0.647      1
5 Alabama                 AL Barbour 1005 Democrat    Bernie   222          0.078      0
6 Alabama                 AL Barbour 1005 Democrat   Hillary  2567          0.906      1
If needed, this can be converted to a logical vector by wrapping the ave call in as.logical.
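As a minimal sketch of that conversion, the snippet below wraps the same ave call in as.logical. The data frame `demo` is a hypothetical stand-in that copies only the fips, candidate and fraction_votes values from the Democrat data in this answer.

```r
# Hypothetical stand-in for Democrat, using values from the data in this answer
demo <- data.frame(
  fips = c(1001L, 1001L, 1003L, 1003L, 1005L, 1005L),
  candidate = c("Bernie", "Hillary", "Bernie", "Hillary", "Bernie", "Hillary"),
  fraction_votes = c(0.182, 0.800, 0.329, 0.647, 0.078, 0.906)
)

# ave() fills a copy of its numeric input, so the logical result of
# i == max(i) is coerced to 0/1; as.logical() converts it to FALSE/TRUE
demo$winner <- as.logical(
  ave(demo$fraction_votes, demo$fips, FUN = function(i) i == max(i))
)

print(demo)
```

Note that if two candidates tie for the maximum within a county, both rows are flagged, since the comparison is applied to every row in the group.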
This is also very simple with data.table. Assuming fips is the unique state-county ID:
library(data.table)
# convert to data.table
setDT(Democrat)
# get logical vector that proclaims winner if vote fraction is maximum
Democrat[, winner := fraction_votes == max(fraction_votes), by=fips]
which returns
Democrat
     state state_abbreviation  county fips    party candidate votes fraction_votes winner
1: Alabama                 AL Autauga 1001 Democrat    Bernie   544          0.182  FALSE
2: Alabama                 AL Autauga 1001 Democrat   Hillary  2387          0.800   TRUE
3: Alabama                 AL Baldwin 1003 Democrat    Bernie  2694          0.329  FALSE
4: Alabama                 AL Baldwin 1003 Democrat   Hillary  5290          0.647   TRUE
5: Alabama                 AL Barbour 1005 Democrat    Bernie   222          0.078  FALSE
6: Alabama                 AL Barbour 1005 Democrat   Hillary  2567          0.906   TRUE
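Both approaches group by fips, so they rely on fips identifying exactly one county. A quick way to check that assumption in base R might look like the sketch below; `counties` is a hypothetical stand-in holding only the state, county and fips values from the data in this answer.

```r
# Hypothetical stand-in with the identifying columns from the Democrat data
counties <- data.frame(
  state  = rep("Alabama", 6),
  county = c("Autauga", "Autauga", "Baldwin", "Baldwin", "Barbour", "Barbour"),
  fips   = c(1001L, 1001L, 1003L, 1003L, 1005L, 1005L)
)

# One row per distinct (state, county, fips) combination
ids <- unique(counties[, c("state", "county", "fips")])

# TRUE when no fips value is shared by two different state/county pairs
fips_is_unique <- !anyDuplicated(ids$fips)

# TRUE when no state/county pair is split across several fips values
county_has_one_fips <- !anyDuplicated(ids[, c("state", "county")])

print(c(fips_is_unique = fips_is_unique, county_has_one_fips = county_has_one_fips))
```

If either check came back FALSE on the full data, grouping by state and county together with (or instead of) fips would be worth considering.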
data
Democrat <-
structure(list(state = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Alabama", class = "factor"),
state_abbreviation = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "AL", class = "factor"),
county = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("Autauga",
"Baldwin", "Barbour"), class = "factor"), fips = c(1001L,
1001L, 1003L, 1003L, 1005L, 1005L), party = structure(c(1L,
1L, 1L, 1L, 1L, 1L), .Label = "Democrat", class = "factor"),
candidate = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Bernie",
"Hillary"), class = "factor"), votes = c(544L, 2387L, 2694L,
5290L, 222L, 2567L), fraction_votes = c(0.182, 0.8, 0.329,
0.647, 0.078, 0.906)), .Names = c("state", "state_abbreviation",
"county", "fips", "party", "candidate", "votes", "fraction_votes"
), row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")
Can we get an example of your data setup? –
[edit] your post! –
Okay, there it is –