2013-07-15

Recoding a variable into multiple new variables

I have a dataset containing observed scores for a group of people, like this:

person_id <- c(1:50) 
person_score <- rep(1:10,5) 
people <- data.frame(person_id, person_score) 

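I need to create a set of new variables that recode the observed scores. I have a set of variables that act as a "key" for converting each observed score into the new variables, for example: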

observed <- c(1,2,3,4,5,6,7,8,9,10) 
score1 <- c(10,14,17,18,20,21,22,26,28,31) 
score2 <- c(6,9,11,14,17,18,20,24,25,26) 
score3 <- c(11,13,15,17,19,21,23,25,27,29) 
score4 <- c(43,44,45,46,47,48,49,50,51,52) 
scores <- data.frame(observed,score1,score2, score3, score4) 

...where the first value corresponds to an observed score of 1, the second value to an observed score of 2, and so on.

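I need to create four new variables corresponding to score1, score2, score3, and score4. The only approach I can think of is recoding by hand, as shown below, but that is very slow and tedious: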

people$value1[person_score == 1] <- 10 
people$value1[person_score == 2] <- 14 

...and so on for score1

people$value2[person_score == 1] <- 6 
people$value2[person_score == 2] <- 9 

...and so on for score2

people$value3[person_score == 1] <- 11 
people$value3[person_score == 2] <- 13 

...and so on for score3

people$value4[person_score == 1] <- 43 
people$value4[person_score == 2] <- 44 

...and so on for score4

Answers

1

I would just use match to find the right rows in the scores data.frame ...

idx <- match(people$person_score , scores$observed) 

people_new <- cbind(people , scores[ idx , -1 ]) 

head(people_new) 
# person_id person_score score1 score2 score3 score4 
#1   1   1  10  6  11  43 
#2   2   2  14  9  13  44 
#3   3   3  17  11  15  45 
#4   4   4  18  14  17  46 
#5   5   5  20  17  19  47 
#6   6   6  21  18  21  48 
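
For anyone who wants the new columns named value1 to value4, as in the question, here is a minimal sketch of the same match approach. It rebuilds the example data so it runs on its own. The renaming step and the NA check are my additions, not part of the answer above.

```r
# Rebuild the example data from the question.
people <- data.frame(person_id = 1:50, person_score = rep(1:10, 5))
scores <- data.frame(
  observed = 1:10,
  score1 = c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31),
  score2 = c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26),
  score3 = c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29),
  score4 = c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
)

# For each person, find the row of `scores` whose key equals their score.
idx <- match(people$person_score, scores$observed)

# match() returns NA for a score that is missing from the key,
# so stop early rather than silently filling rows with NA.
stopifnot(!anyNA(idx))

# Pull the recoded columns (everything except the key column)
# and rename them value1..value4 (names assumed from the question).
recoded <- scores[idx, -1]
names(recoded) <- paste0("value", seq_along(recoded))
rownames(recoded) <- NULL

people_new <- cbind(people, recoded)
head(people_new)
```

Because the lookup goes through match, this should keep working if the key is not simply 1 to 10 or is stored in a different order.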

This seems to work, except that I only get the first two new variables (score1 and score2). – windy


@windy You first need to bind all of the separate scores into the `scores` data.frame (the one you use in `match`), e.g. `scores <- cbind(observed, score1, score2, score3, score4)` –


Great, this works. Thank you very much! – windy

0

You can use the qdap package's lookup function as follows:

## person_id <- c(1:50) 
## person_score <- rep(1:10,5) 
## people <- data.frame(person_id, person_score) 
## 
## observed <- c(1,2,3,4,5,6,7,8,9,10) 
## score1 <- c(10,14,17,18,20,21,22,26,28,31) 
## score2 <- c(6,9,11,14,17,18,20,24,25,26) 
## score3 <- c(11,13,15,17,19,21,23,25,27,29) 
## score4 <- c(43,44,45,46,47,48,49,50,51,52) 
## scores <- data.frame(observed,score1,score2, score3, score4) 

library(qdap) 
people[, 3:6] <- lapply(scores[, -1], function(x) lookup(people$person_score, scores[, 1], x)) 

people 
## person_id person_score score1 score2 score3 score4 
## 1   1   1  10  6  11  43 
## 2   2   2  14  9  13  44 
## 3   3   3  17  11  15  45 
## 4   4   4  18  14  17  46 
## 5   5   5  20  17  19  47 
## 6   6   6  21  18  21  48 
## 7   7   7  22  20  23  49 
. 
. 
. 
## 50  50   10  31  26  29  52 
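
If installing qdap is not an option, the same column-by-column pattern can be sketched in base R by swapping lookup for match. This is a substitute technique, not what qdap does internally, and it rebuilds the example data so it runs on its own.

```r
# Rebuild the example data from the question.
people <- data.frame(person_id = 1:50, person_score = rep(1:10, 5))
scores <- data.frame(
  observed = 1:10,
  score1 = c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31),
  score2 = c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26),
  score3 = c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29),
  score4 = c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
)

# Position of each person's score within the key column.
idx <- match(people$person_score, scores$observed)

# For every value column in `scores`, pick the entries at those positions
# and store them in `people` under the same column names.
value_cols <- names(scores)[-1]
people[value_cols] <- lapply(scores[value_cols], function(x) x[idx])

head(people)
```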
0

This is just a join of two data.frames. You can use merge:

merge(people, scores, by.x = "person_score", by.y = "observed", all.x = TRUE) 

or sqldf:

library(sqldf) 
sqldf(" 
    SELECT * 
    FROM  people 
    LEFT JOIN scores 
    ON  people.person_score = scores.observed 
")
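
One thing to keep in mind with the merge call above: by default merge sorts the result on the join column and puts that column first, so the rows come back ordered by person_score rather than person_id. A minimal sketch of restoring the original order and column layout follows. The reordering step is my addition, and the example data is rebuilt so the block runs on its own.

```r
# Rebuild the example data from the question.
people <- data.frame(person_id = 1:50, person_score = rep(1:10, 5))
scores <- data.frame(
  observed = 1:10,
  score1 = c(10, 14, 17, 18, 20, 21, 22, 26, 28, 31),
  score2 = c(6, 9, 11, 14, 17, 18, 20, 24, 25, 26),
  score3 = c(11, 13, 15, 17, 19, 21, 23, 25, 27, 29),
  score4 = c(43, 44, 45, 46, 47, 48, 49, 50, 51, 52)
)

# Left join: keep every person, attach the matching key row.
merged <- merge(people, scores,
                by.x = "person_score", by.y = "observed", all.x = TRUE)

# Put the rows back in person_id order and the id column first.
merged <- merged[order(merged$person_id),
                 c("person_id", "person_score", paste0("score", 1:4))]
rownames(merged) <- NULL

head(merged)
```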