2017-04-06 150 views

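Find the names of the largest and second-largest values in each row in R

Suppose I have a data frame df containing an ID, gender, and several numeric variables. See below.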

set.seed(123)
ID <- c(1,2,3,4,5,6,7,8,9,10)
gender <- c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m")
x1 <- rnorm(10, 0, 1)
x2 <- rnorm(10, 0, 1)
x3 <- rnorm(10, 0, 1)
x4 <- rnorm(10, 0, 1)
x5 <- rnorm(10, 0, 1)
df <- data.frame(ID, gender, x1, x2, x3, x4, x5)

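The goal is to create two columns, MAX1 and MAX2, where

MAX1 is the name of the variable with the largest value among (x1, x2, x3, x4, x5).

MAX2 is the name of the variable with the second-largest value among (x1, x2, x3, x4, x5).

So I need to find the variable names for MAX1 and MAX2 for each row in df.

Example: for ID = 1, MAX1 = "x2" and MAX2 = "x4".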


Very close to the question about an efficient way to concatenate the names of the 7 highest columns for each row (http://stackoverflow.com/questions/41052568, answer 41053043) – thelatemail

Answer


Here is a simple solution:

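# For each row, sort x1..x5 in decreasing order and keep the names of the top two.
# unlist() turns the one-row data frame into a named numeric vector before sorting.
maxes <- t(sapply(seq_len(nrow(df)), function(i) {
    names(sort(unlist(df[i, 3:7]), decreasing = TRUE))[1:2]
}))
colnames(maxes) <- c("MAX1", "MAX2")
df <- cbind(df, maxes)  # append the two name columns to df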

   ID gender          x1         x2         x3          x4          x5
1   1      m -0.56047565  1.2240818 -1.0678237  0.42646422 -0.69470698
2   2      m -0.23017749  0.3598138 -0.2179749 -0.29507148 -0.20791728
3   3      m  1.55870831  0.4007715 -1.0260044  0.89512566 -1.26539635
4   4      f  0.07050839  0.1106827 -0.7288912  0.87813349  2.16895597
5   5      f  0.12928774 -0.5558411 -0.6250393  0.82158108  1.20796200
6   6      m  1.71506499  1.7869131 -1.6866933  0.68864025 -1.12310858
7   7      m  0.46091621  0.4978505  0.8377870  0.55391765 -0.40288484
8   8      f -1.26506123 -1.9666172  0.1533731 -0.06191171 -0.46665535
9   9      f -0.68685285  0.7013559 -1.1381369 -0.30596266  0.77996512
10 10      m -0.44566197 -0.4727914  1.2538149 -0.38047100 -0.08336907
        MAX1        MAX2 MAX1 MAX2
1   1.224082   0.4264642   x2   x4
2  0.3598138  -0.2079173   x2   x5
3   1.558708   0.8951257   x1   x4
4   2.168956   0.8781335   x5   x4
5   1.207962   0.8215811   x5   x4
6   1.786913    1.715065   x2   x1
7   0.837787   0.5539177   x3   x4
8  0.1533731 -0.06191171   x3   x4
9  0.7799651   0.7013559   x5   x2
10  1.253815 -0.08336907   x3   x5
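
As a possible variation on the row-by-row approach, here is a minimal sketch that uses apply() with order() on the numeric columns only. The names value_cols, top_names and top2 are hypothetical and introduced just for this illustration; the data is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question
set.seed(123)
ID <- 1:10
gender <- c("m", "m", "m", "f", "f", "m", "m", "f", "f", "m")
x1 <- rnorm(10, 0, 1)
x2 <- rnorm(10, 0, 1)
x3 <- rnorm(10, 0, 1)
x4 <- rnorm(10, 0, 1)
x5 <- rnorm(10, 0, 1)
df <- data.frame(ID, gender, x1, x2, x3, x4, x5)

# Columns to rank within each row (assumed to be the numeric x1..x5)
value_cols <- c("x1", "x2", "x3", "x4", "x5")

# Hypothetical helper: names of the n largest entries of a named numeric vector
top_names <- function(row, n = 2) {
  names(row)[order(row, decreasing = TRUE)[seq_len(n)]]
}

# apply() over rows passes each row as a vector named by the column names;
# the result has one column per row, so transpose it to one row per record
top2 <- t(apply(as.matrix(df[, value_cols]), 1, top_names))

df$MAX1 <- unname(top2[, 1])
df$MAX2 <- unname(top2[, 2])
```

For ID = 1 this should give MAX1 = "x2" and MAX2 = "x4", matching the example in the question. Selecting the columns by name rather than by position (3:7) keeps the result stable if other columns are added to df later.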