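Date and temperature data frame: finding the maximum temperature for each day in R

I have a data frame in R that I loaded from a CSV file, and I am trying to find the maximum temperature for each day. The data.frame is laid out so that col(1) is the date (YYYY-MM-DD HH:mm format) and col(2) is the temperature at that date/time. I tried splitting the data into subsets from the top down (by year, then the months of that year, then the days of those months), but found it very complicated.

Here is a sample of the data frame: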

                 Date Unit Temp
1 2012-10-21 21:14:00    C 82.5
2 2012-10-21 21:34:00    C 37.5
3 2012-10-21 21:54:00    C 20.0
4 2012-10-21 22:14:00    C 26.5
5 2012-10-21 22:34:00    C 20.0
6 2012-10-21 22:54:00    C 19.0

Source

2013-06-18 user2498712

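Use 'dput' or 'head' to post part of your data frame so you can get a specific answer. See: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – harkmug

I would create a column holding the day of year (DOY) and then use the aggregate function to find the maximum temperature for each DOY. For example, say your data.frame is called Data and has two columns: the first named "Date" and the second named "Temperature". I would do the following: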

Data[,"DoY"] <- format.Date(Data[,"Date"], format="%j") #make sure that Data[,"Date"] is already in a recognizable format-- e.g., see as.POSIXct() 
MaxTemps <- aggregate(Data[,"Temperature"], by=list(Data[,"DoY"]), FUN=max) # can add na.rm=TRUE if there are missing values

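MaxTemps should contain the maximum temperature observed on each day. However, if the data set spans multiple years, so that, for example, day 169 (today) occurs more than once (e.g., today and one year ago), you can do the following: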

Data[,"DoY"] <- format.Date(Data[,"Date"], format="%Y_%j") #notice the date format, which will be unique for all combinations of year and day of year. 
MaxTemps <- aggregate(Data[,"Temperature"], by=list(Data[,"DoY"]), FUN=max) # can add na.rm=TRUE if there are missing values
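To see why the year needs to be part of the grouping key, here is a minimal sketch. The data frame and its four readings are made up for illustration; only the `%j` and `%Y_%j` format strings and the use of `aggregate` come from the answer above.

```r
# Hypothetical data: two readings on 18 June in each of two non-leap years,
# so both dates fall on day 169 of their year.
Data <- data.frame(
  Date = as.POSIXct(c("2013-06-18 09:00", "2013-06-18 15:00",
                      "2014-06-18 09:00", "2014-06-18 15:00"), tz = "UTC"),
  Temperature = c(18.5, 24.0, 20.0, 27.5)
)

# Key on day of year only: the two years collapse into a single group.
Data$DoY <- format(Data$Date, "%j")
by_doy <- aggregate(Temperature ~ DoY, data = Data, FUN = max)

# Key on year plus day of year: one group per calendar day.
Data$YearDoY <- format(Data$Date, "%Y_%j")
by_year_doy <- aggregate(Temperature ~ YearDoY, data = Data, FUN = max)

print(by_doy)
print(by_year_doy)
```

With the day-of-year key alone, the 2013 maximum is hidden behind the warmer 2014 reading; the combined key keeps the two days apart.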

I hope this helps!

Source

2013-06-18 21:01:56 rbatt

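This is not easy to answer without a reproducible example.

That said, you can use lubridate (date handling) and plyr (split-apply) to solve this problem.

Let's first create some data similar to yours: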

set.seed(123) 
tmp <- data.frame(Date = seq(as.POSIXct("2013-06-18 10:00"), 
        length.out = 100, by = "6 hours"), 
        Unit = "C", 
        Temp = rnorm(n = 100, mean = 20, sd = 5)) 
str(tmp) 
## 'data.frame': 100 obs. of 3 variables: 
## $ Date: POSIXct, format: "2013-06-18 10:00:00" ... 
## $ Unit: Factor w/ 1 level "C": 1 1 1 1 1 1 1 1 1 1 ... 
## $ Temp: num 17.2 18.8 27.8 20.4 20.6 ... 


write.csv(tmp, "/tmp/tmp.csv", row.names = FALSE) 
rm(tmp)

Now we can compute the maximum:

require(lubridate) 
require(plyr) 

### NULL is to not import the second column which is the unit 
tmp <- read.csv("/tmp/tmp.csv", 
       colClasses = c("POSIXct", "NULL", "numeric")) 


tmp <- transform(tmp, jday = yday(Date)) 


ddply(tmp, .(jday), summarise, max_temp = max(Temp)) 

##    jday max_temp
## 1   169   27.794
## 2   170   28.575
## 3   171   26.120
## 4   172   22.004
## 5   173   28.935
## 6   174   18.910
## 7   175   24.189
## 8   176   26.269
## 9   177   24.476
## 10  178   23.443
## 11  179   18.960
## 12  180   30.845
## 13  181   23.900
## 14  182   26.843
## 15  183   27.582
## 16  184   21.898
...................
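For readers who would rather avoid the two extra packages, the same grouping can be sketched in base R. This is an alternative, not what the answer uses: `as.POSIXlt()$yday` stands in for `yday()`, `aggregate` stands in for `ddply`, and the explicit `tz = "UTC"` is an assumption added so the result does not depend on the local time zone.

```r
set.seed(123)
tmp <- data.frame(
  Date = seq(as.POSIXct("2013-06-18 10:00", tz = "UTC"),
             length.out = 100, by = "6 hours"),
  Temp = rnorm(n = 100, mean = 20, sd = 5)
)

# as.POSIXlt()$yday counts from 0, so add 1 to get the 1-based day of year.
tmp$jday <- as.POSIXlt(tmp$Date)$yday + 1

# One row per day of year, holding that day's maximum temperature.
max_by_day <- aggregate(Temp ~ jday, data = tmp, FUN = max)
names(max_by_day)[2] <- "max_temp"

head(max_by_day)
```

As with the plyr version, grouping on the day of year alone only works when the data stay within a single year.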

Source

2013-06-18 21:02:33 dickoa

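Sorry, I should have added a sample of my data frame; let me do that now. I'm new to R and Stack Overflow! – user2498712

@user2498712 I updated my answer based on your data structure. Try it and see whether it works – dickoa

I get the following error message: > ddply(tmp, .(jday), summarize, max_temp = max(Temp)) Error in attributes(out) <- attributes(col) : 'names' attribute [9] must be the same length as the vector [4] – user2498712

I'll assume you have a data frame named df with variables date and temp. This code is untested, but with a bit of luck it may work.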

library(lubridate) 
df$justday <- floor_date(df$date, "day") 

# for just the maxima, you could use this: 
tapply(df$temp, df$justday, max) 

# if you would rather have the results in a data frame, use this: 
aggregate(temp ~ justday, data=df, FUN=max) # FUN is required; without it aggregate() stops with an error
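Since the answer is marked as untested, here is a small runnable sketch of the same idea in base R. The first six rows are the sample from the question; the last two rows are hypothetical readings added so that there is more than one day. `as.Date()` is used here as a stand-in for `floor_date(df$date, "day")`, and the UTC time zone is an assumption.

```r
# Rows 1-6: sample from the question. Rows 7-8: hypothetical extra readings.
df <- data.frame(
  date = as.POSIXct(c("2012-10-21 21:14:00", "2012-10-21 21:34:00",
                      "2012-10-21 21:54:00", "2012-10-21 22:14:00",
                      "2012-10-21 22:34:00", "2012-10-21 22:54:00",
                      "2012-10-22 00:14:00", "2012-10-22 00:34:00"),
                    tz = "UTC"),
  temp = c(82.5, 37.5, 20.0, 26.5, 20.0, 19.0, 18.5, 21.0)
)

# Keep only the calendar date. Passing tz explicitly keeps the day boundary
# in the same time zone as the timestamps.
df$justday <- as.Date(df$date, tz = "UTC")

# Named numeric vector: one maximum per day, named by date.
max_vec <- tapply(df$temp, df$justday, max)

# Data frame with one row per day.
max_df <- aggregate(temp ~ justday, data = df, FUN = max)

print(max_vec)
print(max_df)
```

`tapply` is handy when only the numbers are needed; `aggregate` keeps the day as a proper Date column, which is easier to merge or plot later.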

Source

2013-06-18 21:10:18

The apply.daily function in the xts package does exactly what you need.

install.packages("xts") 
require('xts') 

tmp <- data.frame(Date = seq(as.POSIXct("2013-06-18 10:00"), 
    length.out = 100, by = "6 hours"), 
    Unit = "C", 
    Temp = rnorm(n = 100, mean = 20, sd = 5)) # thanks to dickoa for this code 

head(tmp) 
data <- xts(x=tmp[ ,3], order.by=tmp[,1]) 
attr(data, 'Unit') <- tmp[,'Unit'] 
attr(data, 'Unit') 

dMax <- apply.daily(data, max) 
head(dMax)

Source

2013-07-19 10:53:24 sfuj
