Recoding factor levels

I have the following data frame:
forStack
AGE BMI time A B ID
1 59 23.8 0 (0,75] (4,14.9] 9000099
2 69 29.8 0 (96.4,100] (-Inf,0] 9000296
3 71 22.7 0 (75,89.3] (4,14.9] 9000622
4 56 32.4 0 (0,75] (14.9,68] 9000798
5 72 30.7 0 (0,75] (14.9,68] 9001104
6 75 23.5 0 (96.4,100] (0,4] 9001400
dput(forStack)
structure(list(AGE = c(59, 69, 71, 56, 72, 75), BMI = c(23.8,
29.8, 22.7, 32.4, 30.7, 23.5), time = c(0, 0, 0, 0, 0, 0), A = structure(c(2L,
5L, 3L, 2L, 2L, 5L), .Label = c("(-Inf,0]", "(0,75]", "(75,89.3]",
"(89.3,96.4]", "(96.4,100]", "(100, Inf]"), class = "factor"),
B = structure(c(3L, 1L, 3L, 4L, 4L, 2L), .Label = c("(-Inf,0]",
"(0,4]", "(4,14.9]", "(14.9,68]", "(68, Inf]"), class = "factor"),
ID = c(9000099, 9000296, 9000622, 9000798, 9001104, 9001400
)), .Names = c("AGE", "BMI", "time", "A", "B", "ID"), row.names = c(NA,
6L), class = "data.frame")
The variables A and B are factors representing quartiles:
forStack$A
[1] (0,75] (96.4,100] (75,89.3] (0,75] (0,75] (96.4,100]
Levels: (-Inf,0] (0,75] (75,89.3] (89.3,96.4] (96.4,100] (100, Inf]
forStack$B
[1] (4,14.9] (-Inf,0] (4,14.9] (14.9,68] (14.9,68] (0,4]
Levels: (-Inf,0] (0,4] (4,14.9] (14.9,68] (68, Inf]
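I want to recode A and B into two-level factors as follows:
For A, the upper factor levels (96.4,100] and (100, Inf] should be recoded as level 0, and all other levels as level 1.
For B, the lowest factor levels (-Inf,0] and (0,4] should be recoded as level 0, and all other levels as level 1.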
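The rule for A can be sketched on the factor alone, as one possible approach rather than the only one. Base R lets you assign a named list to levels(), which merges the listed old levels under each new name. The vector name zero_levels is a hypothetical helper introduced only for this illustration, and A is rebuilt here from the dput() output so the snippet runs on its own.

```r
# Factor A rebuilt from the dput() output in the question
A <- factor(
  c("(0,75]", "(96.4,100]", "(75,89.3]", "(0,75]", "(0,75]", "(96.4,100]"),
  levels = c("(-Inf,0]", "(0,75]", "(75,89.3]",
             "(89.3,96.4]", "(96.4,100]", "(100, Inf]")
)

# Hypothetical helper: the old levels that should collapse into "0"
zero_levels <- c("(96.4,100]", "(100, Inf]")

# A named list on the right-hand side maps each new level name
# to the set of old levels it replaces; everything else becomes "1"
levels(A) <- list("0" = zero_levels,
                  "1" = setdiff(levels(A), zero_levels))

print(A)
```

The same pattern should apply to B with (-Inf,0] and (0,4] as the levels mapped to 0.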
So the data frame should look like this:
forStack
AGE BMI time A B ID
1 59 23.8 0 1 1 9000099
2 69 29.8 0 0 0 9000296
3 71 22.7 0 1 1 9000622
4 56 32.4 0 1 1 9000798
5 72 30.7 0 1 1 9001104
6 75 23.5 0 0 0 9001400
What is the most efficient way to do this? Thank you very much in advance.
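For reference, here is a minimal sketch of one vectorised way to do this on the whole data frame, using %in% to test level membership. It is not claimed to be the fastest option. The function name recode_binary and the vectors zero_A and zero_B are hypothetical names made up for this example; the data frame is rebuilt from the dput() output above.

```r
# Data frame rebuilt from the dput() output in the question
forStack <- data.frame(
  AGE  = c(59, 69, 71, 56, 72, 75),
  BMI  = c(23.8, 29.8, 22.7, 32.4, 30.7, 23.5),
  time = rep(0, 6),
  A = factor(
    c("(0,75]", "(96.4,100]", "(75,89.3]", "(0,75]", "(0,75]", "(96.4,100]"),
    levels = c("(-Inf,0]", "(0,75]", "(75,89.3]",
               "(89.3,96.4]", "(96.4,100]", "(100, Inf]")),
  B = factor(
    c("(4,14.9]", "(-Inf,0]", "(4,14.9]", "(14.9,68]", "(14.9,68]", "(0,4]"),
    levels = c("(-Inf,0]", "(0,4]", "(4,14.9]", "(14.9,68]", "(68, Inf]")),
  ID = c(9000099, 9000296, 9000622, 9000798, 9001104, 9001400)
)

# Hypothetical helper: 0 if the value is one of zero_levels, 1 otherwise,
# returned as a factor with levels "0" and "1"
recode_binary <- function(x, zero_levels) {
  factor(as.integer(!(x %in% zero_levels)), levels = c(0L, 1L))
}

zero_A <- c("(96.4,100]", "(100, Inf]")  # upper levels of A
zero_B <- c("(-Inf,0]", "(0,4]")         # lowest levels of B

forStack$A <- recode_binary(forStack$A, zero_A)
forStack$B <- recode_binary(forStack$B, zero_B)

print(forStack)
```

Because %in% compares a factor by its level labels, the labels in zero_A and zero_B have to match the existing levels exactly, including the space in "(100, Inf]".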
Thank you so much, Ananda Mahto and MNEL! Your answers were very helpful. How can I accept both of them? – DSSS 2013-04-29 06:12:20
This works well too. +1 – A5C1D2H2I1M1N2O1R2T1 2013-04-29 06:20:44