2014-10-30
1

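Filter a data.table on whether a Date column falls in certain months

I have a lot of outliers in January and December, so for now I want to exclude those months. Here is my data.table: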

> str(statistics2) 
Classes 'data.table' and 'data.frame': 1418 obs. of 4 variables: 
$ status: chr "hire" "normal" "hire" "hire" ... 
$ month : Date, format: "1993-01-01" "1993-01-01" ... 
$ NOBS : int 37459 765 12 16 24 17 2 12 2 11 ... 

I tried to build a condition that checks the month, but I got the error below.

format(statistics2['month'], "%m") 
Error in `[.data.table`(statistics2, "month") : 
    typeof x.month (double) != typeof i.month (character) 

Answers

1

Well, if statistics2 is a data.frame:

statistics2 <- data.frame(status=c("hire","normal","hire"), 
    month=as.Date(c("1993-01-01","1993-06-01", "1993-12-01")), 
    NOBS=c(37459,765,12) 
) 

then you should use

format(statistics2[["month"]], "%m") 
# [1] "01" "06" "12" 

(Note the double brackets; with single brackets you get back a list, which format() cannot interpret correctly.)
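
To make the single versus double bracket difference concrete, here is a minimal base R sketch. The object names (df, single, double, keep, filtered) are made up for illustration, and the data simply mirrors the toy example above. It also carries the idea one step further to the filtering the question asks for.

```r
# Hypothetical toy data mirroring the example above
df <- data.frame(
  status = c("hire", "normal", "hire"),
  month  = as.Date(c("1993-01-01", "1993-06-01", "1993-12-01")),
  NOBS   = c(37459, 765, 12)
)

single <- df["month"]    # single brackets: still a data.frame (a one-column list)
double <- df[["month"]]  # double brackets: the underlying Date vector

class(single)
class(double)

# Keep only the rows whose month is neither January ("01") nor December ("12")
keep <- !(format(double, "%m") %in% c("01", "12"))
filtered <- df[keep, ]
filtered
```

Note that format() returns zero-padded character strings, so the comparison is against "01" and "12" rather than the numbers 1 and 12.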

If statistics2 is a data.table:

library(data.table)
statistics2dt <- data.table(statistics2)

then I would have expected statistics2dt['month'] to return a different error, but in any case the correct syntax is

format(statistics2dt[, month], "%m") 
# [1] "01" "06" "12" 

(no quotes around the column name, and a comma before it)
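
As a rough sketch of how this extends to the actual filtering task, the same format() call can go in the row position (i) of the data.table. The names dt, mm and dt_filtered are hypothetical. As far as I understand it, a character value in i is treated as a join-style lookup rather than a column selection, which would explain the type-mismatch error in the question, but take that as an assumption rather than a full diagnosis.

```r
library(data.table)

# Hypothetical toy data mirroring the example above
dt <- data.table(
  status = c("hire", "normal", "hire"),
  month  = as.Date(c("1993-01-01", "1993-06-01", "1993-12-01")),
  NOBS   = c(37459, 765, 12)
)

# Inside [ ], bare column names are evaluated as variables
mm <- dt[, month]       # Date vector
format(mm, "%m")

# Double brackets also work, because a data.table is still a list
identical(mm, dt[["month"]])

# Row filter: drop January and December
dt_filtered <- dt[!(format(month, "%m") %in% c("01", "12"))]
dt_filtered
```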

0

You can use lubridate to extract the month and exclude those months from the data frame:

library(lubridate)  # provides month()

set.seed(0) 
months <- round(runif(100, 1, 12), digits = 0) 
years <- round(runif(100, 2013, 2014), digits = 0) 
day <- round(runif(100, 2, 25), digits = 0) 

dates <- paste(years, months, day, sep = "-") 

dates <- as.Date(dates, "%Y-%m-%d") 
NOBS <- round(runif(100, 1, 1000), digits = 0) 

statistics2 <- cbind.data.frame(dates, NOBS) 

months <- month(statistics2$dates) 

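# Negate the logical test rather than using -which(): if no row matched,
# -which() would drop every row instead of keeping them all.
excJanDec <- statistics2[!(months %in% c(1, 12)), ]
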
2

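Since your question asks specifically about data.table: the data.table package has a set of lubridate-like functions built in (load the package and type ?month, for example). You need neither format(...) nor lubridate.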

library(data.table) 
DT <- data.table(status=c("hire","normal","hire"), 
       month=as.Date(c("1993-01-01","1993-06-01", "1993-12-01")), 
       NOBS=c(37459,765,12)) 
DT 
# status  month NOBS 
# 1: hire 1993-01-01 37459 
# 2: normal 1993-06-01 765 
# 3: hire 1993-12-01 12 

DT[!(month(month) %in% c(1,12))] 
# status  month NOBS 
# 1: normal 1993-06-01 765 
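
One thing that may be worth knowing: if lubridate is loaded as well, both packages export a function called month(), so whichever was attached last wins. The sketch below sidesteps that by qualifying the call as data.table::month() and storing the result in a helper column. The column name mon and the object names DT2 and kept are made up for illustration.

```r
library(data.table)

# Hypothetical toy data mirroring the example above
DT2 <- data.table(
  status = c("hire", "normal", "hire"),
  month  = as.Date(c("1993-01-01", "1993-06-01", "1993-12-01")),
  NOBS   = c(37459, 765, 12)
)

# Add an integer month-number column by reference.
# Qualifying with data.table:: avoids picking up another package's month().
DT2[, mon := data.table::month(month)]

# Filter on the helper column
kept <- DT2[!(mon %in% c(1L, 12L))]
kept

# Quick check: row counts per month number before filtering
DT2[, .N, by = mon]
```

Keeping the month number as its own column also avoids the slightly confusing month(month) expression, where the first month is the function and the second is the column.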