2016-12-16 34 views
3

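I want to write a function that uses dplyr to count all unique values of z. The function works fine when the variable is actually named z. However, if the variable is named x, I get an error (code below). I can get non-standard evaluation to work with dplyr's filter_ and count_, but not with distinct_.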

test.data<-data.frame(y=c(1:10), 
        x=c(letters[1:10])) 
test.data$x<-as.character(test.data$x) 
obsfunction<-function(z,y,data){ 
filter_(data, 
      !is.na(deparse(substitute(y))))%>% 
    distinct_(., deparse(substitute(z)))%>% #the line that breaks it 
    count_(.) 
} 
obsfunction(z=x,y,data=test.data) 

So the code above does not work and gives this error:

>Error in eval(substitute(expr), envir, enclos) : unknown column 'z' 

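Changing z to x inside the function (or renaming x to z) makes it work, but I don't want to rename everything, especially since different names work elsewhere.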

I tried lazyeval::interp and quote(), following the vignette, this question, and this question:

distinct_(lazyeval::interp(as.name(z)))%>% 
>Error in as.name(z) : object 'x' not found 

distinct_(quote(z))%>% 
>Error in eval(substitute(expr), envir, enclos) : unknown column 'z' 

What am I missing? How do I get z to accept x as the column name?
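The two failed attempts can be reproduced without dplyr. The following is a minimal base-R sketch (the helper names `with_quote`, `with_substitute`, and `with_as_name` are hypothetical, not from the question): `quote(z)` always yields the symbol `z` itself, `substitute(z)` yields the expression the caller supplied, and `as.name(z)` forces `z` to be evaluated, which fails when no such object exists.

```r
# quote() captures its argument literally: inside the function this is always `z`.
with_quote <- function(z) deparse(quote(z))

# substitute() returns the unevaluated expression the caller passed for `z`.
with_substitute <- function(z) deparse(substitute(z))

# as.name() evaluates `z` first, so the caller's object must exist.
with_as_name <- function(z) as.name(z)

with_quote(x)        # "z"
with_substitute(x)   # "x"

# `undefined_column` is a made-up name that is assumed not to exist anywhere.
failed <- inherits(try(with_as_name(undefined_column), silent = TRUE), "try-error")
failed               # TRUE
```

This suggests why `distinct_(quote(z))` looks for a column literally called z, while `deparse(substitute(z))` at least produces the right name.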

Answers

3

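Since dplyr's standard-evaluation functions understand strings, I tried the following code, and it seems to work with other test data as well. I first extract the variable names and then build the expression as a string: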

library(dplyr)

test.data <- data.frame(y = c(1:10),
                        x = c(letters[1:10]))
test.data$x <- as.character(test.data$x)

f <- function(z, y, data){
    # capture the unquoted argument names as strings
    z <- deparse(substitute(z))
    y <- deparse(substitute(y))
    data %>%
        # the filter condition is built as a string, e.g. "!is.na(y)"
        filter_(paste('!is.na(', y, ')', sep = '')) %>%
        distinct_(z) %>%
        count_(.)
}


x <- f(z = x, y, test.data) 
# # A tibble: 1 × 1 
#  n 
# <int> 
# 1 10 



test.data <- data.frame(
    y=c(1:4, NA, NA, 7:10), 
    x=c(letters[c(1:8, 8, 8)]), 
    stringsAsFactors = FALSE) 

x <- f(z = x, y, test.data) 
# # A tibble: 1 × 1 
#  n 
# <int> 
# 1  6 
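Why the filter condition has to be assembled as a string can be seen in base R alone. In the question's version, `deparse(substitute(y))` is turned into the string "y" before `is.na()` sees it, so the condition is effectively always TRUE and no row is dropped. A minimal sketch (the helper names `always_true` and `build_condition`, and the small data frame `d`, are hypothetical):

```r
# What the question's filter effectively tests: is the *string* "y" missing?
always_true <- function(y) !is.na(deparse(substitute(y)))

# What this answer builds instead: an expression string that names the column.
build_condition <- function(y) paste0("!is.na(", deparse(substitute(y)), ")")

always_true(y)                 # TRUE, regardless of the data
cond <- build_condition(y)     # "!is.na(y)"

# Evaluating the string against a data frame gives a per-row condition.
d <- data.frame(y = c(1, NA, 3))
keep <- eval(parse(text = cond), d)
keep                           # TRUE FALSE TRUE
```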
2

You can use match.call to capture the function arguments and convert them to character before passing them to dplyr's SE functions:

obsfunction<-function(z, y, data){ 
    cl = match.call() 
    y = as.character(cl['y']) 
    z = as.character(cl['z']) 

    data %>% filter_(paste('!is.na(', y, ')', sep = '')) %>% 
      distinct_(z) %>% 
      count_(.) 
} 

obsfunction(z = x, y = y, data = test.data) 

# A tibble: 1 × 1 
#  n 
# <int> 
#1 10 

obsfunction(x, y, test.data) 

# A tibble: 1 × 1 
#  n 
# <int> 
#1 10 
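To see what `match.call()` hands back, here is a minimal base-R sketch (the function `show_args` is hypothetical). Note one deliberate difference from the answer: it extracts each argument with `[[`, which returns the unevaluated symbol, rather than with `[`.

```r
show_args <- function(z, y, data) {
  cl <- match.call()   # the call as written, with argument names filled in
  c(z = as.character(cl[["z"]]),
    y = as.character(cl[["y"]]))
}

args_positional <- show_args(x, y, NULL)   # z = "x", y = "y"
args_named      <- show_args(z = x, y = y) # same result with named arguments
```

This only gives a single clean name when a bare column name is passed; a compound expression such as `x + 1` would be converted to several strings.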
+0

Hi. I noticed that your answer gives 8 with the additional `test.data` from my answer, while mine gives 6. Am I missing something? – mt1022

+0

@mt1022 Yes. I just noticed that the `filter_` call was not working properly, and the `paste()` approach is the way to go. Thanks for pointing it out! – Psidom

1

Another lazyeval/dplyr variant, in which the variables are passed as formulas and f_interp substitutes uq(x) with the formula passed to it, similar to deparse(substitute(x)):

library(dplyr) 
library(lazyeval) 

test.data<-data.frame(y=c(1:10), 
        x=c(letters[1:10])) 
test.data$x<-as.character(test.data$x) 


obsfunction<-function(z, y, data){ 
    data %>% filter_(f_interp(~!is.na(uq(y)))) %>% 
    distinct_(f_interp(~uq(z))) %>% count() 
} 

obsfunction(z=~x,~y,data=test.data) 

#A tibble: 1 × 1 
#  n 
# <int> 
#1 10 

test.data.NA <- data.frame(
    y=c(1:4, NA, NA, 7:10), 
    x=c(letters[c(1:8, 8, 8)]), 
    stringsAsFactors = FALSE) 


obsfunction(z=~x,~y,data=test.data.NA) 
# # A tibble: 1 × 1 
#  n 
#  <int> 
# 1  6 
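The underscore verbs used throughout this thread were later deprecated in dplyr. As a sketch only, assuming dplyr 1.0 or later (which none of the answers above targets), the same function can be written with the `{{ }}` embrace operator. The name `count_unique` and its argument order are hypothetical.

```r
library(dplyr)

count_unique <- function(data, z, y) {
  data %>%
    filter(!is.na({{ y }})) %>%   # drop rows where the y column is NA
    distinct({{ z }}) %>%         # keep one row per unique value of z
    count()                       # number of remaining rows
}

test.data.NA <- data.frame(
  y = c(1:4, NA, NA, 7:10),
  x = letters[c(1:8, 8, 8)],
  stringsAsFactors = FALSE)

res <- count_unique(test.data.NA, z = x, y = y)
res$n   # 6
```

With this form the caller passes bare column names, as in the original question, and no string building or formulas are needed.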