2014-08-27 27 views
8

Cannot use dput output for a data.table in R

I have the following data.table, and I cannot recreate it from the output of the dput command:

> ddt 
    Unit Anything index new 
1: A  3.4  1 1 
2: A  6.9  2 1 
3: A1  1.1  1 2 
4: A1  2.2  2 2 
5: B  2.0  1 3 
6: B  3.0  2 3 
> 
> 
> str(ddt) 
Classes ‘data.table’ and 'data.frame': 6 obs. of 4 variables: 
$ Unit : Factor w/ 3 levels "A","A1","B": 1 1 2 2 3 3 
$ Anything: num 3.4 6.9 1.1 2.2 2 3 
$ index : num 1 2 1 2 1 2 
$ new  : int 1 1 2 2 3 3 
- attr(*, ".internal.selfref")=<externalptr> 
- attr(*, "sorted")= chr "Unit" "Anything" 
> 
> 
> dput(ddt) 
structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
"Anything")) 
> 

When I paste that output back in, I get the following error:

> dt = structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A", 
+ "A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2, 
+ 2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L, 
+ 3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
+ -6L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x8948f68>, sorted = c("Unit", 
Error: unexpected '<' in: 
"3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA, 
-6L), class = c("data.table", "data.frame"), .internal.selfref = <" 
> "Anything")) 
Error: unexpected ')' in ""Anything")" 

Where is the problem, and how can I fix it? Thanks for your help.

Answers

10

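The problem is that dput prints the address of an external pointer (something data.table uses internally and rebuilds when needed). That printed address is not valid R code, so you cannot paste it back in.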
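To see why the pasted text fails, here is a minimal sketch in base R. The string `bad` is a made-up, shortened stand-in for the real dput output, not something taken from data.table itself.

```r
# The text that dput emits for an external pointer is not valid R syntax.
# 'bad' is a made-up, shortened stand-in for the dput output in the question.
bad <- "list(a = 1, .internal.selfref = <pointer: 0x8948f68>)"

# parse() raises a syntax error; tryCatch() returns it as a condition object
res <- tryCatch(parse(text = bad), error = function(e) e)

# Inspect the captured message
conditionMessage(res)
```

The parser stops at the `<` character, which is the same failure shown in the question's console output.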

If you manually cut out the .internal.selfref part, it works just fine, apart from a one-time complaint from data.table on some operations.
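As a sketch of that manual fix, the block below is the dput output from the question with only the `.internal.selfref = <pointer: ...>` argument deleted. The closing `setDT()` call is an assumption on my part: it is one way that should let data.table rebuild its internal self-reference up front, and it is skipped if the package is not installed.

```r
# dput output from the question with the ".internal.selfref = <pointer: ...>"
# argument removed by hand; everything else is unchanged.
dt <- structure(list(Unit = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("A",
"A1", "B"), class = "factor"), Anything = c(3.4, 6.9, 1.1, 2.2,
2, 3), index = c(1, 2, 1, 2, 1, 2), new = c(1L, 1L, 2L, 2L, 3L,
3L)), .Names = c("Unit", "Anything", "index", "new"), row.names = c(NA,
-6L), class = c("data.table", "data.frame"), sorted = c("Unit",
"Anything"))

# Assumption: setDT() re-establishes data.table's internal self-reference,
# which should avoid the one-time complaint on later operations.
if (requireNamespace("data.table", quietly = TRUE)) {
  data.table::setDT(dt)
}
```

Without the `setDT()` step the object is still usable; data.table should simply warn once and repair the reference itself the first time it needs it.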

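You could file an FR (feature request) for this with data.table, but it would require data.table to modify a base function, similar to how rbind is currently handled.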

4

I also find this behaviour rather annoying, so I wrote my own dput function that ignores the .internal.selfref attribute.

library(data.table)  # provides is.data.table() and setattr()

dput <- function (x, file = "", control = c("keepNA", "keepInteger",
                                            "showAttributes"))
{
    if (is.character(file))
        if (nzchar(file)) {
            file <- file(file, "wt")
            on.exit(close(file))
        }
        else file <- stdout()
    opts <- .deparseOpts(control)
    # adding these three lines for data.tables
    # (setattr() works by reference, so the caller's object loses the attribute too)
    if (is.data.table(x)) {
        setattr(x, '.internal.selfref', NULL)
    }
    if (isS4(x)) {
        clx <- class(x)
        cat("new(\"", clx, "\"\n", file = file, sep = "")
        for (n in .slotNames(clx)) {
            cat(" ,", n, "= ", file = file)
            dput(slot(x, n), file = file, control = control)
        }
        cat(")\n", file = file)
        invisible()
    }
    else .Internal(dput(x, file, opts))
}
+0

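Thanks for your answer. Are you sure it won't affect the dput output of all other objects? One could always rename this function to dputdt and use it only for data.table objects. – rnso 2015-01-30 16:16:44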

+3

Why so complicated? Wouldn't a simple 'dput = function(x, ...) {if (is.data.table(x)) {setattr(x, ".internal.selfref", NULL)}; base::dput(x, ...)}' work? Or perhaps better, replace 'is.data.table' with 'inherits'. – eddi 2016-04-20 14:14:45
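The simpler wrapper suggested in this comment could look roughly like the sketch below. The name `dput_dt` is hypothetical, and I use base `attr<-` on the function's local copy instead of `setattr()`, so the caller's object keeps its attribute; that is a deliberate deviation from the comment, not what eddi wrote. The `fake` object is a hand-built stand-in so the sketch runs without data.table installed.

```r
# Hypothetical helper (the name dput_dt is not from the thread): drop the
# self-reference attribute from a local copy, then defer to base::dput().
dput_dt <- function(x, ...) {
  if (inherits(x, "data.table")) {
    # attr<- modifies the function's own copy of x, so the caller's
    # object is left untouched
    attr(x, ".internal.selfref") <- NULL
  }
  base::dput(x, ...)
}

# Stand-in object built by hand; the character string only marks the spot
# where a real data.table would carry its external pointer.
fake <- structure(list(a = 1:2), row.names = c(NA, -2L),
                  class = c("data.table", "data.frame"),
                  .internal.selfref = "placeholder")

out <- capture.output(dput_dt(fake))   # text that dput would print
rebuilt <- eval(parse(text = out))     # the text now parses and evaluates
```

Because the helper has its own name, `base::dput` stays unchanged for every other object, which also addresses the concern raised in the previous comment.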

0

If you have already dput the object to a file and you don't feel like editing it by hand before calling dget, you can use the following:

data.table.parse <- function (file = "", n = NULL, text = NULL, prompt = "?",
                              keep.source = getOption("keep.source"),
                              srcfile = NULL, encoding = "unknown")
{
    keep.source <- isTRUE(keep.source)
    if (!is.null(text)) {
        if (length(text) == 0L)
            return(expression())
        if (missing(srcfile)) {
            srcfile <- "<text>"
            if (keep.source)
                srcfile <- srcfilecopy(srcfile, text)
        }
        file <- stdin()
    }
    else {
        if (is.character(file)) {
            if (file == "") {
                file <- stdin()
                if (missing(srcfile))
                    srcfile <- "<stdin>"
            }
            else {
                filename <- file
                file <- file(filename, "r")
                if (missing(srcfile))
                    srcfile <- filename
                if (keep.source) {
                    text <- readLines(file, warn = FALSE)
                    if (!length(text))
                        text <- ""
                    close(file)
                    file <- stdin()
                    srcfile <- srcfilecopy(filename, text, file.mtime(filename),
                                           isFile = TRUE)
                }
                else {
                    text <- readLines(file, warn = FALSE)
                    if (!length(text)) {
                        text <- ""
                    } else {
                        # strip the unparseable external-pointer attribute written by dput
                        text <- gsub("(, .internal.selfref = <pointer: 0x[0-9A-Fa-f]+>)","",text,perl=TRUE)
                    }
                    on.exit(close(file))
                }
            }
        }
    }
    # text <- gsub("(, .internal.selfref = <pointer: 0x[0-9A-F]+>)","",text)
    .Internal(parse(file, n, text, prompt, srcfile, encoding))
}
data.table.get <- function(file, keep.source = FALSE)
    eval(data.table.parse(file = file, keep.source = keep.source))
dtget <- data.table.get

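Your dget call then becomes dtget. Note that because of the inline text substitution, dtget will be slower than dget, so use it only where you might be retrieving data.table objects.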
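If copying the internals of parse feels heavy, the same idea can be sketched with public base R functions only. The helper name `dget_dt` and the exact regular expression are assumptions for illustration. The substitution runs line by line, so it would miss a pointer argument that dput happened to wrap across two lines.

```r
# Hypothetical helper (name and regex are assumptions): read a dput file,
# delete any ".internal.selfref = <pointer: 0x...>" argument, then evaluate.
dget_dt <- function(file) {
  txt <- readLines(file, warn = FALSE)
  txt <- gsub(", \\.internal\\.selfref = <pointer: 0x[0-9A-Fa-f]+>", "", txt)
  eval(parse(text = txt))
}

# Usage on a temporary file holding a shortened, made-up dput result
tmp <- tempfile(fileext = ".R")
writeLines(c(
  'structure(list(a = c(1, 2)), row.names = c(NA, -2L), class = c("data.table",',
  '"data.frame"), .internal.selfref = <pointer: 0x8948f68>)'
), tmp)
obj <- dget_dt(tmp)
unlink(tmp)
```

As with the manually edited output, the returned object has no self-reference yet, so data.table may complain once before repairing it.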