2014-03-02 42 views
5

I need to sum the columns of a sparse data frame by ID. How do I do this with data.table?

require(data.table) 
sentEx = structure(list(abend = c(1, 1, 0, 0, 2), aber = c(0, 1, 0, 0, 
0), über = c(1, 0, 0, 0, 0), überall = c(0, 0, 0, 0, 0), überlegt = c(0, 
0, 0, 0, 0), ID = structure(c(1L, 1L, 2L, 2L, 2L), .Label = c("0019", 
"0021"), class = "factor"), abgeandert = c(1, 1, 1, 0, 0), abgebildet = c(0, 
0, 1, 1, 0), abgelegt = c(0, 0, 0, 0, 3)), .Names = c("abend", 
"aber", "über", "überall", "überlegt", "ID", "abgeandert", "abgebildet", 
"abgelegt"), row.names = c(1L, 2L, 16L, 17L, 18L), class = "data.frame") 

sentEx # How it looks 
   abend aber über überall überlegt   ID abgeandert abgebildet abgelegt
1      1    0    1       0        0 0019          1          0        0
2      1    1    0       0        0 0019          1          0        0
16     0    0    0       0        0 0021          1          1        0
17     0    0    0       0        0 0021          0          1        0
18     2    0    0       0        0 0021          0          0        3

Without the umlaut columns (dropped to avoid the weird umlaut error) it works fine:

sentEx.dt <- data.table(sentEx[,-c(3,4,5)])[, lapply(.SD, sum), by=ID] 
(sentExSum <- as.data.frame(sentEx.dt)) # Need again as dataframe, which looks like: 
    ID abend aber abgeandert abgebildet abgelegt
1 0019     2    1          2          0        0
2 0021     2    0          1          2        3

But with the umlaut columns included, I get this error:

sentEx.dt <- data.table(sentEx)[, lapply(.SD, sum), by=ID] 
# Error in gsum(`über`) : object 'über' not found 
sentExSum <- as.data.frame(sentEx.dt)

Some additional information (since the problem seems to be system-dependent; see the comments):

sessionInfo() 
R version 3.0.2 (2013-09-25) 
Platform: x86_64-w64-mingw32/x64 (64-bit) 

locale: 
[1] LC_COLLATE=German_Germany.1252 LC_CTYPE=German_Germany.1252 LC_MONETARY=German_Germany.1252 LC_NUMERIC=C     
[5] LC_TIME=German_Germany.1252  

attached base packages: 
[1] stats  graphics grDevices utils  datasets methods base  

other attached packages: 
[1] data.table_1.9.2 

loaded via a namespace (and not attached): 
[1] plyr_1.8.1  Rcpp_0.11.0 reshape2_1.2.2 stringr_0.6.2 tools_3.0.2 

Output of the additionally requested command:

require(data.table); test.data.table() 
Running C:/Users/Krohana/Documents/R/win-library/3.0/data.table/tests/tests.Rraw 
Loading required package: reshape 
Loading required package: hexbin 
Loading required package: xts 
Loading required package: bit64 
Test 167.2 not run. If required call library(hexbin) first. 
Don't know how to automatically pick scale for object of type ITime. Defaulting to continuous 
Don't know how to automatically pick scale for object of type ITime. Defaulting to continuous 
Tests 487 and 488 not run. If required call library(reshape) first. 
Test 841 not run. If required call library(xts) first. 
Tests 897-899 not run. If required call library(bit64) first. 
All 1220 tests in inst/tests/tests.Rraw completed ok in 24.321sec on Sun Mar 02 17:57:26 2014

Requested commands, part 2:

> Encoding(names(sentEx)) 
[1] "unknown" "unknown" "UTF-8" "UTF-8" "UTF-8" "unknown" "unknown" "unknown" "unknown" 
> options(datatable.verbose=TRUE) 
> options(datatable.verbose=TRUE); options(datatable.optimize=1L); 
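
For readers interpreting the Encoding() output above, here is a minimal sketch of why only the umlaut names are marked "UTF-8". The vector `nms` is a hypothetical stand-in for names(sentEx), written with \u escapes so that the snippet does not depend on the encoding of the script file. Pure-ASCII strings are never marked, so they report "unknown".

```r
# Hypothetical stand-in for names(sentEx); "\u00fcber" is "über" written as an escape
nms <- c("abend", "\u00fcber", "\u00fcberall", "ID")

# ASCII-only names carry no encoding mark; names built from \u escapes are marked UTF-8
enc <- Encoding(nms)
print(enc)

# enc2utf8() leaves ASCII names untouched, so converting is safe for mixed vectors
nms_utf8 <- enc2utf8(nms)
print(identical(nms_utf8[1], "abend"))
```

The mixed marks are therefore expected for these column names and do not by themselves indicate corrupted data.
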
+0

I can't reproduce this on my Mac with data.table 1.8.10. – Roland

+0

I just updated to 1.9.2 and I still can't reproduce the error. Please add the output of 'dput(sentEx)' to your question. – Roland

+0

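@Roland Hi, thanks! I renamed the final result to 'sentExSum', so it should be reproducible now. Previously the 'sentEx' input was overwritten in the second step, so the error no longer occurred. – alex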

Answer

4

I can't reproduce it either. However, Arun found another "object not found" error, and I hope this one has the same cause.

Now in v1.9.3, commit 1212. From NEWS:

o An error "object [name] not found" could occur in some circumstances, particularly after a previous error. Reported with non-ASCII characters in a column name, a red herring we hope since non-ASCII characters are supported in column names in data.table. Fix implemented and tests added.

If it happens again, please let us know. Your test has been added verbatim to the test suite, thanks.
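
For anyone who wants to re-check this on their own system, the following is a minimal sketch, not the test that was added to the suite. The data frame `df` is a hypothetical, shortened stand-in for sentEx. It assumes that setting `datatable.optimize` to 1L keeps data.table from dispatching to the internal `gsum` named in the error message. The base R `rowsum()` call serves as an independent cross-check.

```r
library(data.table)

# Hypothetical, shortened stand-in for sentEx; the second column is renamed to "über"
df <- data.frame(abend = c(1, 1, 0, 0, 2),
                 ueber = c(1, 0, 0, 0, 0),
                 ID = factor(c("0019", "0019", "0021", "0021", "0021")))
names(df)[2] <- "\u00fcber"

# Assumption: optimisation level 1 avoids the gsum code path
old <- options(datatable.optimize = 1L)
res <- as.data.frame(data.table(df)[, lapply(.SD, sum), by = ID])
options(old)  # restore the previous setting

# Base R cross-check: per-group column sums without data.table
chk <- rowsum(df[, c("abend", "\u00fcber")], group = df$ID)

print(res)
print(chk)
```

If the two results agree with the option set but the plain data.table call still fails, that would point to the optimised code path rather than to the data.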

+0

Thanks a lot. This seems to be caused by my Windows system. – alex

+1

@alex Why do you think it is because of Windows? Are you saying that you have tried v1.9.3 with this fix and it still doesn't work? –