2013-05-22 32 views

I have a strange problem with R: I am reaching the memory limit.

I have a large data.table, dataTs1:

Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables: 
$ TimeStamp: chr "01MAR13:07:15:00" "01MAR13:07:16:00" "01MAR13:07:18:00" ... 
$ col1  : chr "ALL1" "ALL1" "ALL1" "ALL1" ... 
$ col2  : int NA NA NA NA NA NA NA NA NA NA ... 
$ col3  : int 4 4 4 4 4 4 4 4 4 4 ... 
$ col4  : int 621 810 4 4 8 1 3 1 1 1 ... 

I loaded this table using the fread function.

Memory allocation seems fine.

> memory.size(max=TRUE) 
[1] 82.94 

I want to change the class of the first column to POSIX, so I wrote:

dataTs1$TimeStamp <- strptime(dataTs1$TimeStamp, "%d%b%y:%H:%M:%S")

With this line, I hit my 16 GB memory limit... but when I write:

test <- 1:length(dataTs1$TimeStamp) 
dataTs1$TimeStamp <- test 

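it works perfectly, without any memory overload.

I am quite new to R, and I would appreciate it if you could help me figure out what I am doing wrong here.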

THX


Edit:

Actually, in the cases where I do not hit the memory overload, I sometimes get a strange warning:

>dataTs1[,TimeStamp:=strptime(TimeStamp,"%d%b%y:%H:%M:%S")] 
Warning messages: 
1: In `[<-.data.table`(x, j = name, value = value) : 
    Supplied 9 items to be assigned to 419172 items of column 'TimeStamp' (recycled leaving remainder of 6 items). 
2: In `[<-.data.table`(x, j = name, value = value) : 
    Coerced 'list' RHS to 'character' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 419172 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'character' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please. 
> str(dataTs1) 
Classes ‘data.table’ and 'data.frame': 419172 obs. of 5 variables: 
$ TimeStamp: chr "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ "c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N"| __truncated__ ... 
$ V6FCDSB : chr "ALL1" "ALL1" "ALL1" "ALL1" ... 
$ V6FCDTD : int NA NA NA NA NA NA NA NA NA NA ... 
$ _TYPE_ : int 4 4 4 4 4 4 4 4 4 4 ... 
$ N  : int 621 810 4 4 8 1 3 1 1 1 ... 
- attr(*, ".internal.selfref")=<externalptr> 

Which version of R are you using? There used to be a memory leak in 'strptime'. – James

You should assign by reference: 'dataTs1[, TimeStamp := strptime(TimeStamp, "%d%b%y:%H:%M:%S")]' – Roland

@James I am using version 3.0.0 –

Answers
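
The warning in the question's edit says that 9 items were supplied for a column of 419172 rows. That is consistent with strptime() returning a POSIXlt object, which R stores internally as a list of date-time components rather than as one value per row. The following is only a minimal sketch of one possible workaround, not a confirmed fix for the memory problem: it converts the result to POSIXct (one number per row) before assigning by reference. The small table dt, its values, and tz = "UTC" are assumptions made for illustration. Note also that %b matches month abbreviations in the current locale, so "MAR" may parse as NA in a non-English locale.

```r
library(data.table)

# Hypothetical miniature of dataTs1, using the timestamp format from the question
dt <- data.table(
  TimeStamp = c("01MAR13:07:15:00", "01MAR13:07:16:00", "01MAR13:07:18:00"),
  col4      = c(621L, 810L, 4L)
)

# strptime() returns POSIXlt, which is a list of components under the hood
lt <- strptime(dt$TimeStamp, "%d%b%y:%H:%M:%S", tz = "UTC")  # tz is an assumption
class(lt)             # "POSIXlt" "POSIXt"
is.list(unclass(lt))  # TRUE

# as.POSIXct() turns it into an atomic vector with one value per row,
# which can be stored as an ordinary data.table column
dt[, TimeStamp := as.POSIXct(strptime(TimeStamp, "%d%b%y:%H:%M:%S", tz = "UTC"))]
str(dt)
```

If the parse succeeds, str(dt) should show TimeStamp as a POSIXct column with the same number of rows as before; if it shows NA values, the locale-dependent %b is the first thing to check.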