2017-03-19 225 views

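Calculating the mean for specific dates

I have two data frames.

For each date in the second data frame, I want to compute the mean of the sp variable over the 5 days ending on that date (the date itself included).

For example, for 1997-05-05 the mean would cover 1997-05-01 through 1997-05-05, and for 1997-05-31 it would cover 1997-05-27 through 1997-05-31, averaged over only the days that have a value (3 in that case).

Here are the variables: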

sp <- c(7,9,9,4,2,5,2,9,NA,14,NA,NA,NA,NA,NA,14,25,NA,11,10,12,NA,9,NA,6,8,6,1,NA,7,NA) 

Date <- c("1997-05-01","1997-05-02","1997-05-03","1997-05-04","1997-05-05", 
       "1997-05-06","1997-05-07","1997-05-08","1997-05-09","1997-05-10", 
       "1997-05-11","1997-05-12","1997-05-13","1997-05-14","1997-05-15", 
       "1997-05-16","1997-05-17","1997-05-18","1997-05-19","1997-05-20", 
       "1997-05-21","1997-05-22","1997-05-23","1997-05-24","1997-05-25", 
       "1997-05-26","1997-05-27","1997-05-28","1997-05-29","1997-05-30", 
       "1997-05-31") 

data1 <- data.frame(sp, Date) 

DateX <- c("1997-05-05","1997-05-15","1997-05-31") 

data2 <- data.frame(DateX) 

What is the best way to do this? Any help would be greatly appreciated.

This is my expected result (in the second data frame, data2):

DateX      spMean
1997-05-05 6.2
1997-05-15 NA
1997-05-31 4.6
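As a quick check of the arithmetic behind these values, here is a minimal sketch that uses the sp values for the two windows copied by hand (the names win_0505 and win_0531 are made up for illustration). The second window averages to 14/3, roughly 4.67, so the 4.6 in the table looks like a truncated value.

```r
# sp values copied by hand for the two 5-day windows (hypothetical names)
win_0505 <- c(7, 9, 9, 4, 2)      # 1997-05-01 .. 1997-05-05
win_0531 <- c(6, 1, NA, 7, NA)    # 1997-05-27 .. 1997-05-31

# First window: all five days have a value
mean_0505 <- sum(win_0505) / length(win_0505)   # 31 / 5

# Second window: average only the days that have a value
vals_0531 <- win_0531[!is.na(win_0531)]
n_0531 <- length(vals_0531)                     # number of days with a value
mean_0531 <- sum(vals_0531) / n_0531            # 14 / 3

print(c(mean_0505, mean_0531))
```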

Answer


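I made a few type changes to your original code. Give the code below a shot: I use lapply to run a quick function against the data1 object for each date in the second object.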

sp <- c(7,9,9,4,2,5,2,9,NA,14,NA,NA,NA,NA,NA,14,25,NA,11,10,12,NA,9,NA,6,8,6,1,NA,7,NA) 

Date <- as.Date(c("1997-05-01","1997-05-02","1997-05-03","1997-05-04","1997-05-05", 
      "1997-05-06","1997-05-07","1997-05-08","1997-05-09","1997-05-10", 
      "1997-05-11","1997-05-12","1997-05-13","1997-05-14","1997-05-15", 
      "1997-05-16","1997-05-17","1997-05-18","1997-05-19","1997-05-20", 
      "1997-05-21","1997-05-22","1997-05-23","1997-05-24","1997-05-25", 
      "1997-05-26","1997-05-27","1997-05-28","1997-05-29","1997-05-30", 
      "1997-05-31")) 

data1 <- data.frame(sp, Date) 

DateX <- as.Date(c("1997-05-05","1997-05-15","1997-05-31")) 

data2 <- data.frame(DateX) 

# Add a column for the mean; any NA inside the window returns NA
# (m - 4 through m is a 5-day window that includes the target date)
data2$spMean_na <- unlist(lapply(DateX, 
       function(m) mean(data1$sp[data1$Date >= m - 4 & data1$Date <= m])))

# Add a column for the mean, ignoring NA values
# (a window with no values at all yields NaN)
data2$spMean_na_omit <- unlist(lapply(DateX, 
          function(m) mean(data1$sp[data1$Date >= m - 4 & data1$Date <= m], 
              na.rm = TRUE)))

> data2 
       DateX spMean_na spMean_na_omit
1 1997-05-05       6.2       6.200000
2 1997-05-15        NA            NaN
3 1997-05-31        NA       4.666667

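I think you may need to change your expected result. The sp value in row 29 is NA, and that row falls within the 5 days ending on 1997-05-31. So, as I understand your requirements, it should return NA.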
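To illustrate the point about NA, here is a small sketch of how base R's mean() treats missing values. The vectors are made up for illustration and are not taken from data1.

```r
# Illustrative vectors, not taken from data1
some_missing <- c(10, NA, 20)
all_missing  <- c(NA_real_, NA_real_, NA_real_)

# Without na.rm, a single NA makes the whole mean NA
m_default <- mean(some_missing)

# With na.rm = TRUE, NAs are dropped before averaging: (10 + 20) / 2
m_omit <- mean(some_missing, na.rm = TRUE)

# If every value is NA, nothing is left to average and the result is NaN
m_empty <- mean(all_missing, na.rm = TRUE)

print(c(m_default, m_omit, m_empty))
```

Note that is.na() is TRUE for both NA and NaN, so either result can be detected the same way downstream.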


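Hi Nick. Thank you very much. I don't think I worded it correctly... I would also like it to compute the average when only 2, 3 or 4 of the rows within the date range have numeric values. Do you have any idea how to include this in the function? – Gustavo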


@Gustavo, I made the changes based on your request. Let me know if this is what you are looking for. –


Yes, incredible! Thank you so much!! – Gustavo
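For reference, the requirement discussed in these comments (average the days that have a value, but return NA when too few do) could be sketched roughly as follows. The helper window_mean and its parameters days and min_n are hypothetical names that are not part of the original answer, and the cutoff min_n = 2 is only a guess at what was intended.

```r
# Hypothetical helper, not from the original answer: mean of the `days` days
# ending on `end_date` (inclusive), or NA when fewer than `min_n` of those
# days have a value. The default min_n = 2 is an assumed cutoff.
window_mean <- function(end_date, dates, values, days = 5, min_n = 2) {
  in_window <- dates > end_date - days & dates <= end_date
  vals <- values[in_window]
  vals <- vals[!is.na(vals)]
  if (length(vals) < min_n) NA_real_ else mean(vals)
}

sp <- c(7,9,9,4,2,5,2,9,NA,14,NA,NA,NA,NA,NA,14,25,NA,11,10,12,NA,9,NA,6,8,6,1,NA,7,NA)
Date <- seq(as.Date("1997-05-01"), as.Date("1997-05-31"), by = "day")
DateX <- as.Date(c("1997-05-05", "1997-05-15", "1997-05-31"))

# Index over DateX so each element keeps its Date class
spMean <- vapply(seq_along(DateX),
                 function(i) window_mean(DateX[i], Date, sp),
                 numeric(1))

data2 <- data.frame(DateX, spMean)
print(data2)
```

With these defaults the window ending on 1997-05-15 contains no values and comes back as NA, while the window ending on 1997-05-31 averages its three available values.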