2012-08-24 161 views
-1
Id   authId   sessionId       
139 "56763313.wrpy" "4233a31b52f92c6fe8af4f04f2116657" 
123 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
126 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
144 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
143 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
118 NA    "ae04ddacadaa3429ca77dab674a008bf" 
121 NA    "ae04ddacadaa3429ca77dab674a008bf" 
122 NA    "ae04ddacadaa3429ca77dab674a008bf" 
75 "5676614888888" "ca673b5e60a6f70963bf3017e3cb0780" 
276 "56711325.cc79" "f6075188c0f479d7a423744f6c8655b3" 
256 "56711325.cc79" "f6075188c0f479d7a423744f6c8655b3" 
275 "56711325.cc79" "f6075188c0f479d7a423744f6c8655b3" 
152 NA    "f6075188c0f479d7a423744f6c8655b3" 
158 NA    "f6075188c0f479d7a423744f6c8655b3" 
28 "221124184"  "fc71064548bb35d05293bd67d55f1693" 
31 "221124184"  "fc71064548bb35d05293bd67d55f1693" 

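I want to fill in the missing authId values based on sessionId, that is, replace the NA values in one column based on another column. I am trying to do this without using a loop. For example: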

143 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
118 "221156400"  "ae04ddacadaa3429ca77dab674a008bf" 
+1

FYI: there is no need to add the language to the title. That is what tags are for. – joran

Answers

3

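First create a data frame of the unique combinations of authId and sessionId. Then find the sessionId of every row whose authId is NA. Finally, use the table of unique combinations to look up the authId associated with each of those sessionId values.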

df <- read.table(text="Id   authId   sessionId       
139 56763313.wrpy 4233a31b52f92c6fe8af4f04f2116657 
123 221156400  ae04ddacadaa3429ca77dab674a008bf 
126 221156400  ae04ddacadaa3429ca77dab674a008bf 
144 221156400  ae04ddacadaa3429ca77dab674a008bf 
143 221156400  ae04ddacadaa3429ca77dab674a008bf 
118 NA    ae04ddacadaa3429ca77dab674a008bf 
121 NA    ae04ddacadaa3429ca77dab674a008bf 
122 NA    ae04ddacadaa3429ca77dab674a008bf 
75 5676614888888 ca673b5e60a6f70963bf3017e3cb0780 
276 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
256 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
275 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
152 NA    f6075188c0f479d7a423744f6c8655b3 
158 NA    f6075188c0f479d7a423744f6c8655b3 
28 221124184  fc71064548bb35d05293bd67d55f1693 
31 221124184  fc71064548bb35d05293bd67d55f1693", header=TRUE, stringsAsFactors=FALSE) 


# find unique combinations of authId and sessionID, but not when authId is NA 
uniques <- unique(df[c("authId", "sessionId")]) 
uniques <- uniques[!is.na(uniques$authId),] 

# replace authIds that are NA with the unique authId associated with the sessionId 
na.authId <- which(is.na(df$authId))         # row indices where authId is missing 
na.sessionId <- df$sessionId[na.authId]      # sessionId of each of those rows 
df$authId[na.authId] <- uniques$authId[match(na.sessionId, uniques$sessionId)] 


#  Id  authId      sessionId 
# 1 139 56763313.wrpy 4233a31b52f92c6fe8af4f04f2116657 
# 2 123  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 3 126  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 4 144  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 5 143  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 6 118  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 7 121  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 8 122  221156400 ae04ddacadaa3429ca77dab674a008bf 
# 9 75 5676614888888 ca673b5e60a6f70963bf3017e3cb0780 
# 10 276 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
# 11 256 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
# 12 275 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
# 13 152 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
# 14 158 56711325.cc79 f6075188c0f479d7a423744f6c8655b3 
# 15 28  221124184 fc71064548bb35d05293bd67d55f1693 
# 16 31  221124184 fc71064548bb35d05293bd67d55f1693 
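
As an alternative to the lookup table above, the same loop-free fill can be sketched with base R's ave(), which applies a function within each sessionId group. This is only an illustration, not the answerer's method: the helper name fill_first and the shortened sessionId strings are made up for the example, and it assumes every sessionId maps to at most one authId.

```r
# Small illustrative data set (shortened, hypothetical sessionId values).
df <- data.frame(
  Id        = c(143, 118, 75, 275, 152),
  authId    = c("221156400", NA, "5676614888888", "56711325.cc79", NA),
  sessionId = c("ae04", "ae04", "ca67", "f607", "f607"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: within one group, replace NA with the first known value.
# If the group has no known value, it is returned unchanged.
fill_first <- function(x) {
  known <- x[!is.na(x)]
  if (length(known) > 0) x[is.na(x)] <- known[1]
  x
}

# ave() splits authId by sessionId, applies fill_first to each piece,
# and puts the results back in the original row order.
df$authId <- ave(df$authId, df$sessionId, FUN = fill_first)
print(df)
```

Groups that contain no known authId simply keep their NA values.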
+0

I can create a data frame with the unique combinations of authId and sessionId, but I cannot get the match function to work. – pandhale

+1

If you want more help, you will have to explain in detail which function is not working... –

+0

Sorry, when I looked at my df more closely, I found that it also has NA values in the sessionId column. – pandhale
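
The situation described in this last comment matters because match() treats NA as a matchable value: an NA sessionId in the rows to fill would match an NA sessionId in the lookup table and pick up an unrelated authId. A minimal sketch of one way to guard against that follows, assuming rows with a missing sessionId should simply be left alone. The data and the names known and to_fill are invented for illustration.

```r
# Hypothetical data: sessionId is missing in rows 3 and 4.
df <- data.frame(
  Id        = c(1, 2, 3, 4, 5),
  authId    = c("A1", NA, "B7", NA, NA),
  sessionId = c("s1", "s1", NA, NA, "s2"),
  stringsAsFactors = FALSE
)

# Lookup table built only from rows where both columns are known.
complete_rows <- !is.na(df$authId) & !is.na(df$sessionId)
known <- unique(df[complete_rows, c("authId", "sessionId")])

# Only try to fill rows whose authId is missing but whose sessionId is present.
to_fill <- which(is.na(df$authId) & !is.na(df$sessionId))

# match() returns NA for a sessionId with no known authId, so those rows stay NA.
df$authId[to_fill] <- known$authId[match(df$sessionId[to_fill], known$sessionId)]
print(df)
```

Here row 2 is filled from row 1, row 4 stays NA because its sessionId is missing, and row 5 stays NA because no row with sessionId "s2" has a known authId.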
