2017-08-30

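Avoid creating columns that do not match any string

I want to create new columns based on string matching. I can create them, but columns are also created for patterns that match nothing. For example: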

    x <- data.frame(name = c("Java Hackathon", "Intro to Graphs", "Hands on Cypher"))
    toMatch <- c("Hackathon", "Hands on", "Test", "java")

    ## Sentence with phrases
    phrases11 <- as.vector(toMatch)
    # One logical column per pattern, one row per name
    res <- sapply(phrases11, grepl, x = as.character(x$name), ignore.case = TRUE)
    rownames(res) <- x$name

    # Replacement: overwrite each TRUE cell with the pattern that matched
    ones <- which(res == 1, arr.ind = TRUE)
    res[ones] <- colnames(res)[ones[, 2]]
    res

    Output:
                    Hackathon   Hands on   Test    java
    Java Hackathon  "Hackathon" "FALSE"    "FALSE" "java"
    Intro to Graphs "FALSE"     "FALSE"    "FALSE" "FALSE"
    Hands on Cypher "FALSE"     "Hands on" "FALSE" "FALSE"

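I don't want the "Test" column to be created, because I have a large amount of data to match. So basically, can the line res <- sapply(phrases11, grepl, x = as.character(x$name), ignore.case = TRUE) be changed so that it only creates columns for the entries of 'toMatch' that actually match something? Is there another way?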

Try 'do.call(cbind, Filter(Negate(is.null), setNames(lapply(phrases11, function(y) {i1 <- grepl(y, x$name, ignore.case = TRUE); if (any(i1)) i1}), phrases11)))' – akrun

Answer

Since grepl() gives you TRUE or FALSE, you can drop the columns whose sum is 0, that is, the patterns that matched no row:

    A <- sapply(toMatch, grepl, as.character(x$name), ignore.case = TRUE)
    # Keep only the columns with at least one match (column sum greater than 0)
    A[, colSums(A) > 0]
         Hackathon Hands on  java
    [1,]      TRUE    FALSE  TRUE
    [2,]     FALSE    FALSE FALSE
    [3,]     FALSE     TRUE FALSE
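
To connect this back to the question, here is a minimal sketch that filters out the unmatched columns and then does the label replacement in one step. The helper name match_columns is hypothetical (it is not from the original post), and using NA instead of the string "FALSE" for non-matching cells is an assumption made for readability, not something the question asked for.

```r
# Sketch only: `match_columns` is a hypothetical helper, not part of the original post.
match_columns <- function(names_vec, patterns) {
  names_vec <- as.character(names_vec)

  # Build one logical column per pattern; cbind keeps a matrix even for a single row.
  hits <- do.call(cbind, lapply(patterns, grepl, x = names_vec, ignore.case = TRUE))
  dimnames(hits) <- list(names_vec, patterns)

  # Drop patterns that matched nothing; drop = FALSE keeps the matrix shape
  # even when only one column survives.
  hits <- hits[, colSums(hits) > 0, drop = FALSE]

  # Assumption: non-matching cells become NA rather than the string "FALSE".
  labels <- matrix(NA_character_, nrow(hits), ncol(hits), dimnames = dimnames(hits))
  idx <- which(hits, arr.ind = TRUE)
  if (length(idx) > 0) {
    # Overwrite each matching cell with the pattern that matched it.
    labels[idx] <- colnames(hits)[idx[, 2]]
  }
  labels
}

x <- data.frame(name = c("Java Hackathon", "Intro to Graphs", "Hands on Cypher"),
                stringsAsFactors = FALSE)
toMatch <- c("Hackathon", "Hands on", "Test", "java")

res <- match_columns(x$name, toMatch)
print(res)
```

With the example data, no "Test" column is created, because that pattern matches none of the names. The drop = FALSE argument matters on real data: without it, a result with a single surviving column would collapse to a plain vector.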