2015-07-01 72 views
6

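Remove row-specific items from a list

I want to create the last column ('desired_result') from the first three columns ('group', 'animal', and 'full'). Below is the code for a reproducible example.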

library(data.table) 
data = data.table(group = c(1,1,1,2,2,2), animal = c('cat', 'dog', 'pig', 'giraffe', 'lion', 'tiger'), desired_result = c('dog, pig', 'cat, pig', 'cat, dog', 'lion, tiger', 'giraffe, tiger', 'giraffe, lion')) 
data[, full := list(list(animal)), by = 'group'] 
data = data[, .(group, animal, full, desired_result)] 

data 
   group  animal               full desired_result
1:     1     cat        cat,dog,pig       dog, pig
2:     1     dog        cat,dog,pig       cat, pig
3:     1     pig        cat,dog,pig       cat, dog
4:     2 giraffe giraffe,lion,tiger    lion, tiger
5:     2    lion giraffe,lion,tiger giraffe, tiger
6:     2   tiger giraffe,lion,tiger  giraffe, lion

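Basically, I want to modify 'full' so that it no longer contains the corresponding 'animal'. I have tried various lapply commands on both list and character versions of these columns, but couldn't work it out.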

Answers

3

Here is one possible approach:

data[, desired_result := { 
     temp <- unique(unlist(full)) 
     toString(temp[-match(animal, temp)]) 
     }, by = .(group, animal)] 
data 
#    group  animal               full desired_result
# 1:     1     cat        cat,dog,pig       dog, pig
# 2:     1     dog        cat,dog,pig       cat, pig
# 3:     1     pig        cat,dog,pig       cat, dog
# 4:     2 giraffe giraffe,lion,tiger    lion, tiger
# 5:     2    lion giraffe,lion,tiger giraffe, tiger
# 6:     2   tiger giraffe,lion,tiger  giraffe, lion
3

Another option:

data[, desired := .(Map(setdiff, list(animal), as.list(animal))), by = group] 

#or if starting from full 
data[, desired := .(Map(setdiff, full, animal))] 

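(Recycling magic makes the first version work: 'list(animal)' has length 1, so Map reuses it for every element of 'as.list(animal)'.)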
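To make the recycling a bit more concrete, here is a minimal base R sketch outside of data.table. The vector 'animals' and the result name 'others' are made up for illustration; they stand in for one group's 'animal' values.

```r
# One group's animals, as a plain character vector (illustrative data).
animals <- c("cat", "dog", "pig")

# list(animals) has length 1, as.list(animals) has length 3.
# Map() recycles the shorter argument, so setdiff() is called as
# setdiff(animals, "cat"), setdiff(animals, "dog"), setdiff(animals, "pig").
others <- Map(setdiff, list(animals), as.list(animals))

str(others)
```

Each element of 'others' is the full vector minus that row's own animal, which is what the 'by = group' version computes once per group.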

+0

With 'dplyr': 'library(dplyr); data %>% mutate(desired = Map(setdiff, full, animal))'

+0

This returns a list rather than a character vector (which is what the OP's desired output shows).

+1

I read the OP as not caring whether they get a list or a string, and the conversion is trivial – eddi
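For reference, the list-to-string conversion could look roughly like the sketch below. It assumes every list element is a character vector; 'desired' and 'desired_chr' are illustrative names, not something from the original post.

```r
# A list like the one Map(setdiff, full, animal) produces for group 1
# (written out by hand here so the snippet runs on its own).
desired <- list(c("dog", "pig"), c("cat", "pig"), c("cat", "dog"))

# toString() collapses each vector with ", "; vapply() guarantees that
# the result is a character vector with one string per element.
desired_chr <- vapply(desired, toString, character(1))

print(desired_chr)

# On a data.table list column the same idea would presumably be:
# data[, desired_chr := vapply(desired, toString, character(1))]
```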

1

I found a way as well!

By converting 'animal' to a list, I can use mapply.

data$animal = strsplit(data$animal, ' ') 
data$check = mapply(function(x, y) {list(x[x != y]) }, data$full, data$animal) 

data 
   group  animal               full desired_result         check
1:     1     cat        cat,dog,pig       dog, pig       dog,pig
2:     1     dog        cat,dog,pig       cat, pig       cat,pig
3:     1     pig        cat,dog,pig       cat, dog       cat,dog
4:     2 giraffe giraffe,lion,tiger    lion, tiger    lion,tiger
5:     2    lion giraffe,lion,tiger giraffe, tiger giraffe,tiger
6:     2   tiger giraffe,lion,tiger  giraffe, lion  giraffe,lion
+0

Your approach returns a list rather than a character vector (which is what your desired output shows).

+0

Fair enough, it will have to be converted and cleaned up if necessary. – DataBandit