R dplyr numeric column names: processing with a variable
I have a data frame in which some of the column names are numeric:
> names(spreadResults)
[1] "PupilMatchingRefAnonymous" "GENDER" "URN"
[4] "KS2Eng" "KS2Mat" "EVERFSM_6"
[7] "0001" "0003" "0009"
[10] "0015"
I want to run a report on each of the columns with a numeric name:
for(DiscID in colnames(spreadResults[7:length(spreadResults)]))
{
  #DiscIDcol <- match(DiscID,names(spreadResults))
  colID <- as.name(DiscID)
  print(colID)
  print(DiscID)
  #get data into format suitable for creating tables
  temp <- spreadResults %>% select(GENDER, EVERFSM_6, colID) %>%
    filter_(!is.na(colID)) %>%
    group_by_(GENDER, EVERFSM_6, colID) %>%
    summarise(n = n()) %>%
    ungroup()
}
but I get:
`0001`
[1] "0001"
Error: All select() inputs must resolve to integer column positions.
The following do not:
* colID
However, if I use backticks and name the column explicitly:
temp <- spreadResults %>% select(GENDER, EVERFSM_6, `0001`)
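This works fine. Is there a way to use a variable to refer to the column name? I know you can use matches(DiscID) in select(), but matches(...) does not work in group_by, spread, etc.
The first five rows of the data frame I am working with: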
structure(list(
PupilMatchingRefAnonymous = c(12345L, 12346L, 12347L, 12348L, 12349L),
GENDER = structure(c(2L, 2L, 1L, 1L, 1L), .Label = c("F", "M"), class = "factor"),
URN = c(123456L, 123456L, 123456L, 123456L, 123456L),
KS2Eng = c(4L, 3L, 4L, 5L, 3L),
KS2Mat = c(4L, 5L, 4L, 4L, 3L),
EVERFSM_6 = c(1L, 1L, 0L, 0L, 1L),
`0001` = c(66, 44, NA_real_, 55, 66),
`0003` = c(22, NA_real_, NA_real_, NA_real_, NA_real_),
`0009` = c(NA_real_, 66, NA_real_, NA_real_, NA_real_),
`0015` = c(33, NA_real_, 55, NA_real_, NA_real_)),
.Names = c("PupilMatchingRefAnonymous", "GENDER", "URN", "KS2Eng", "KS2Mat", "EVERFSM_6",
"0001", "0003", "0009", "0015"),
row.names = c(NA, 5L), class = "data.frame")
Desired output:
GENDER EVERFSM_6 0001 n
(fctr) (int) (dbl) (int)
1 F 0 55 1
2 F 1 66 1
3 M 1 44 1
4 M 1 66 1
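The easiest thing is probably to change the column names so they start with a letter. – Thomas
It's not the numbers (though those will usually be a pain); it's about non-standard evaluation. `dplyr` functions take unquoted column names by default, so if you want to pass something else, you need to use the SE versions that end in an underscore (`select_`). – alistaire
Running spreadResults <- rename(spreadResults, "n0001" = `0001`) and then running the code again still throws the same error on n0001. I can rename, but it makes no difference. – pluke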
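Following up on the non-standard evaluation point, here is a minimal sketch of one way to drive the pipeline from a string variable. It assumes a recent dplyr (1.0 or later), where `all_of()`, `across()` and the `.data` pronoun are available; this is a different approach from the underscore-suffixed SE verbs mentioned above, not a fix for them. The `results` list is a hypothetical name introduced only for illustration, and the data frame is a cut-down copy of the sample in the question.

```r
library(dplyr)

# Cut-down copy of the sample data; check.names = FALSE keeps "0001" as a name.
spreadResults <- data.frame(
  GENDER    = factor(c("M", "M", "F", "F", "F")),
  EVERFSM_6 = c(1L, 1L, 0L, 0L, 1L),
  `0001`    = c(66, 44, NA, 55, 66),
  `0003`    = c(22, NA, NA, NA, NA),
  check.names = FALSE
)

# Hypothetical container: one summary table per numeric-named column.
results <- list()

for (DiscID in c("0001", "0003")) {
  results[[DiscID]] <- spreadResults %>%
    # all_of() selects the column whose name is stored in the string DiscID
    select(GENDER, EVERFSM_6, all_of(DiscID)) %>%
    # .data[[DiscID]] looks the column up by name inside the data frame
    filter(!is.na(.data[[DiscID]])) %>%
    # across(all_of(...)) lets group_by() take column names as strings
    group_by(across(all_of(c("GENDER", "EVERFSM_6", DiscID)))) %>%
    summarise(n = n(), .groups = "drop")
}

print(results[["0001"]])
```

The key point is that the column name stays a plain string throughout, so the leading zeros and the numeric-looking name need no backticks or renaming.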