2016-05-30

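Daily minimum values in R

I am trying to extract the daily minimum zenith angle from a data set of hourly values (one zenith angle per hour, 24 per day) covering 12 months of roughly 31 days each. It looks like this: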

JulianDay Azimuth Zenith Date (YYMMDD HH:MM:SS) 
2455928 174.14066 70.04650 2012-01-01 13:00:00 
2455928 188.80626 70.30747 2012-01-01 14:00:00 
2455928 203.03458 73.12297 2012-01-01 15:00:00 
2455928 216.28061 78.20131 2012-01-01 16:00:00 
2455928 228.35929 85.10759 2012-01-01 17:00:00 
.... 
2456293 146.33844 77.03456 2012-12-31 11:00:00 
2456293 159.80472 72.38003 2012-12-31 12:00:00 

Is there a function that can extract the maximum and minimum solar zenith angle for each day (i.e. 365 outputs)?

Answers

3

You can summarise by grouping on the day. Here is one way, assuming your data frame is called df:

library(data.table) 
setDT(df)[, .(maxZenith = max(Zenith), minZenith = min(Zenith)), .(JulianDay)] 

If you want to group by the Date column instead of JulianDay, do this:

setDT(df)[, .(maxZenith = max(Zenith), minZenith = min(Zenith)), .(as.Date(Date))] 

This assumes you have renamed Date (YYMMDD HH:MM:SS) to Date. As a side note, spaces in column names are allowed but are not considered good practice.

3

In base R (the examples below summarise Azimuth; substitute Zenith to get the zenith angle extremes):

my.data <- read.table(text = ' 

JulianDay Azimuth Zenith Date.YYMMDD Date.HHMMSS 
2455928 174.14066 70.04650 2012-01-01 13:00:00 
2455928 188.80626 70.30747 2012-01-01 14:00:00 
2455928 203.03458 73.12297 2012-01-01 15:00:00 
2455928 216.28061 78.20131 2012-01-01 16:00:00 
2455928 228.35929 85.10759 2012-01-01 17:00:00 
2455929 160.00000 70.04650 2012-01-02 13:00:00 
2455929 188.80626 70.30747 2012-01-02 14:00:00 
2455929 203.03458 73.12297 2012-01-02 15:00:00 
2455929 216.28061 78.20131 2012-01-02 16:00:00 
2455929 228.35929 85.10759 2012-01-02 17:00:00 
', header = TRUE) 

aggregate(Azimuth ~ JulianDay, data = my.data, FUN = function(x) c(Min = min(x), Max = max(x)))

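One problem with aggregate is that its output is not in an easy-to-use form: the minimum and maximum come back together in a single matrix column rather than as separate columns. It needs a little post-processing: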

my.min.max <- aggregate(my.data$Azimuth, by = list(my.data$JulianDay),
        FUN = function(x) c(MIN = min(x), MAX = max(x)))

# to convert output of aggregate into a data frame: 

my.min.max2 <- do.call(data.frame, my.min.max) 

# combine output from aggregate with original data set 

colnames(my.min.max2) <- c('JulianDay', 'my.min', 'my.max') 

my.data2 <- merge(my.data, my.min.max2, by = 'JulianDay') 
my.data2 

# JulianDay Azimuth Zenith Date.YYMMDD Date.HHMMSS my.min my.max 
#1 2455928 174.1407 70.04650 2012-01-01 13:00:00 174.1407 228.3593 
#2 2455928 188.8063 70.30747 2012-01-01 14:00:00 174.1407 228.3593 
#3 2455928 203.0346 73.12297 2012-01-01 15:00:00 174.1407 228.3593 
#4 2455928 216.2806 78.20131 2012-01-01 16:00:00 174.1407 228.3593 
#5 2455928 228.3593 85.10759 2012-01-01 17:00:00 174.1407 228.3593 
#6 2455929 160.0000 70.04650 2012-01-02 13:00:00 160.0000 228.3593 
#7 2455929 188.8063 70.30747 2012-01-02 14:00:00 160.0000 228.3593 
#8 2455929 203.0346 73.12297 2012-01-02 15:00:00 160.0000 228.3593 
#9 2455929 216.2806 78.20131 2012-01-02 16:00:00 160.0000 228.3593 
#10 2455929 228.3593 85.10759 2012-01-02 17:00:00 160.0000 228.3593 
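
To see why the post-processing step is needed, it can help to inspect what aggregate returns. The following is a minimal sketch on a cut-down, hypothetical four-row version of the sample data (the values are copied from the table above, but the reduced data frame is an assumption made for brevity):

```r
# Hypothetical reduced data set: two rows per Julian day
my.data <- data.frame(
  JulianDay = c(2455928, 2455928, 2455929, 2455929),
  Azimuth   = c(174.14066, 228.35929, 160.00000, 228.35929)
)

agg <- aggregate(Azimuth ~ JulianDay, data = my.data,
                 FUN = function(x) c(Min = min(x), Max = max(x)))

# The result has only two columns; the second one is a two-column matrix
ncol(agg)
is.matrix(agg$Azimuth)

# do.call(data.frame, ...) spreads the matrix into ordinary columns
flat <- do.call(data.frame, agg)
names(flat)
```

After flattening, each statistic sits in its own column, which is what merge and most other data frame functions expect.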

You can also use by, but the output from by needs some post-processing as well:

by.min.max <- as.data.frame(do.call("rbind", by(my.data$Azimuth, my.data$JulianDay, 
          FUN = function(x) c(Min = min(x), Max = max(x))))) 

by.min.max <- cbind(JulianDay = rownames(by.min.max), by.min.max) 

my.data2 <- merge(my.data, by.min.max, by = 'JulianDay') 
my.data2 

You can also use tapply:

my.data$Date_Time <- as.POSIXct(paste(my.data$Date.YYMMDD, my.data$Date.HHMMSS), 
           format = "%Y-%m-%d %H:%M:%S") 

ty.min.max <- as.data.frame(do.call("rbind", tapply(my.data$Azimuth, my.data$JulianDay, 
          FUN = function(x) c(Min = min(x), Max = max(x))))) 

ty.min.max <- cbind(JulianDay = rownames(ty.min.max), ty.min.max) 

my.data2 <- merge(my.data, ty.min.max, by = 'JulianDay') 
my.data2 
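
The same tapply pattern can group on a calendar date derived from a combined date-time column rather than on JulianDay, and can summarise Zenith, which is what the question asks about. This is only a sketch: the four-row data frame is hypothetical, and parsing in UTC is an assumption chosen so that the conversion to Date cannot shift a reading onto a neighbouring day.

```r
# Hypothetical reduced data set with separate date and time columns
my.data <- data.frame(
  Date.YYMMDD = c("2012-01-01", "2012-01-01", "2012-01-02", "2012-01-02"),
  Date.HHMMSS = c("13:00:00", "14:00:00", "13:00:00", "14:00:00"),
  Zenith      = c(70.04650, 70.30747, 70.04650, 73.12297)
)

# Parse the timestamp (assumption: the readings are treated as UTC)
my.data$Date_Time <- as.POSIXct(paste(my.data$Date.YYMMDD, my.data$Date.HHMMSS),
                                format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Calendar day of each reading, used as the grouping variable
day <- as.Date(my.data$Date_Time, tz = "UTC")

# One row per day with the minimum and maximum zenith angle
zen.min.max <- do.call("rbind", tapply(my.data$Zenith, day,
                       function(x) c(Min = min(x), Max = max(x))))
zen.min.max
```

The row names of the result are the dates, so they can be turned into a key column in the same way as the JulianDay row names above.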

You can also use a combination of split and sapply:

sy.min.max <- t(sapply(split(my.data$Azimuth, my.data$JulianDay), 
       function(x) c(Min = min(x), Max = max(x)))) 

sy.min.max <- data.frame(JulianDay = rownames(sy.min.max), sy.min.max, 
         stringsAsFactors = FALSE) 

my.data2 <- merge(my.data, sy.min.max, by = 'JulianDay') 
my.data2 

You can also use a combination of split and lapply:

ly.min.max <- lapply(split(my.data$Azimuth, my.data$JulianDay), 
        function(x) c(Min = min(x), Max = max(x))) 

ly.min.max <- as.data.frame(do.call("rbind", ly.min.max)) 

ly.min.max <- cbind(JulianDay = rownames(ly.min.max), ly.min.max) 

my.data2 <- merge(my.data, ly.min.max, by = 'JulianDay') 
my.data2 

You can also use ave, although I have not figured out how to use two functions in a single ave statement:

my.min <- ave(my.data$Azimuth, my.data$JulianDay, FUN = min) 
my.max <- ave(my.data$Azimuth, my.data$JulianDay, FUN = max) 

my.data2 <- data.frame(my.data, my.min, my.max) 
my.data2 
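
Since ave returns one value per input row, a single call does not seem able to carry both statistics at once. One possible workaround, offered as a sketch rather than a true single-call solution, is to loop over a named list of functions with sapply, which still calls ave once per function behind the scenes. The four-row data frame is a hypothetical reduction of the sample data:

```r
# Hypothetical reduced data set: two rows per Julian day
my.data <- data.frame(
  JulianDay = c(2455928, 2455928, 2455929, 2455929),
  Azimuth   = c(174.14066, 228.35929, 160.00000, 228.35929)
)

# Apply ave once for each summary function; the list names become column names
stats <- sapply(list(my.min = min, my.max = max),
                function(f) ave(my.data$Azimuth, my.data$JulianDay, FUN = f))

# Attach the per-group minimum and maximum to every original row
my.data2 <- data.frame(my.data, stats)
my.data2
```

Adding another statistic is then a matter of adding one more entry to the list.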
1

With dplyr

library(dplyr) 
df %>% 
    group_by(JulianDay) %>% #if you need `Date` class, use `as.Date(JulianDay)` 
    summarise(MaxZenith = max(Zenith), minZenith = min(Zenith)) 

Here 'JulianDay' is the renamed name of the (YYMMDD HH:MM:SS) column.
