2016-03-19 71 views
-3

How do I calculate the difference between values in different rows? I have data in this format:

PrecipMM   Date 
    122.7   2004-01-01 
    54.2   2005-01-01 
    31.9   2006-01-01 
    100.5   2007-01-01 
    144.9   2008-01-01 
    96.4   2009-01-01 
    75.3   2010-01-01 
    94.8   2011-01-01 
    67.6   2012-01-01 
    93.0   2013-01-01 
    184.6   2014-01-01 
    101.0   2015-01-01 
    149.3   2016-01-01 
    50.2   2004-02-01 
    46.2   2005-02-01 
    57.7   2006-02-01 

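The data are monthly. I want to calculate the difference in PrecipMM between consecutive years for the same month.

My desired output looks like this: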

PrecipMM   Date   PrecipMM_diff 
    122.7   2004-01-01   NA 
    54.2   2005-01-01   -68.5 
    31.9   2006-01-01   -22.3 
    100.5   2007-01-01   68.6 
    144.9   2008-01-01   44.4 
    96.4   2009-01-01   -48.5 
    75.3   2010-01-01   -21.1 
    94.8   2011-01-01   19.5 
    67.6   2012-01-01   -27.2 
    93.0   2013-01-01   25.4 
    184.6   2014-01-01   91.6 
    101.0   2015-01-01   -83.6 
    149.3   2016-01-01   48.3 
    50.2   2004-02-01   NA 
    46.2   2005-02-01   -4.0 
    57.7   2006-02-01   11.5 

I think diff() can do this, but I don't know how.
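For reference, the diff() idea can be sketched in base R roughly as follows. This is only an illustration, not something from the original post: it assumes the Date column has been parsed with as.Date(), and the names `month_key` and `PrecipMM_diff` are chosen here for the example. ave() applies a function within each group, and since diff() returns one value fewer than its input, a leading NA is added for the first year of each month.

```r
# First 16 rows of the example data from the question
df <- data.frame(
  PrecipMM = c(122.7, 54.2, 31.9, 100.5, 144.9, 96.4, 75.3, 94.8,
               67.6, 93.0, 184.6, 101.0, 149.3, 50.2, 46.2, 57.7),
  Date = as.Date(c(paste0(2004:2016, "-01-01"), paste0(2004:2006, "-02-01")))
)

# Sort chronologically so that, within each month, rows are in year order
df <- df[order(df$Date), ]

# Month of each observation ("01", "02", ...), used as the grouping key
month_key <- format(df$Date, "%m")

# diff() returns n - 1 values, so pad with a leading NA for the first year
df$PrecipMM_diff <- ave(df$PrecipMM, month_key,
                        FUN = function(x) c(NA, diff(x)))

df
```

Each row then holds the change from the same month in the previous available year, and the first year of every month is NA.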

Answer

0

I think you can do this by combining lag with group_by from dplyr. Here's how:

library(dplyr) 
library(lubridate) # makes dealing with dates easier 

# Load your example data 
df <- structure(list(PrecipMM = c(4.4, 66.7, 48.2, 60.9, 108.1, 109.2, 
101.7, 38.1, 53.8, 71.9, 75.4, 67.1, 92.7, 115.3, 68.9, 38.9), 
    Date = structure(5:20, .Label = c("101.7", "108.1", "109.2", 
    "115.3", "1766-01-01", "1766-02-01", "1766-03-01", "1766-04-01", 
    "1766-05-01", "1766-06-01", "1766-07-01", "1766-08-01", "1766-09-01", 
    "1766-10-01", "1766-11-01", "1766-12-01", "1767-01-01", "1767-02-01", 
    "1767-03-01", "1767-04-01", "38.1", "38.9", "4.4", "48.2", 
    "53.8", "60.9", "66.7", "67.1", "68.9", "71.9", "75.4", "92.7" 
    ), class = "factor")), class = "data.frame", row.names = c(NA, 
-16L), .Names = c("PrecipMM", "Date")) 

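# For each calendar month, subtract the previous year's value 
results <- df %>% 
    mutate(years = year(Date), months = month(Date)) %>% 
    group_by(months) %>% 
    # current dplyr's arrange() ignores grouping, so sort by month as well 
    arrange(months, years) %>% 
    mutate(lagged.rain = lag(PrecipMM), rain.diff = PrecipMM - lagged.rain) 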

results 
# Source: local data frame [16 x 6] 
# Groups: months [12] 
# 
# PrecipMM  Date years months lagged.rain rain.diff 
#  (dbl)  (fctr) (dbl) (dbl)  (dbl)  (dbl) 
# 1  4.4 1766-01-01 1766  1   NA  NA 
# 2  92.7 1767-01-01 1767  1   4.4  88.3 
# 3  66.7 1766-02-01 1766  2   NA  NA 
# 4  115.3 1767-02-01 1767  2  66.7  48.6 
# 5  48.2 1766-03-01 1766  3   NA  NA 
# 6  68.9 1767-03-01 1767  3  48.2  20.7 
# 7  60.9 1766-04-01 1766  4   NA  NA 
# 8  38.9 1767-04-01 1767  4  60.9  -22.0 
# 9  108.1 1766-05-01 1766  5   NA  NA 
# 10 109.2 1766-06-01 1766  6   NA  NA 
# 11 101.7 1766-07-01 1766  7   NA  NA 
# 12  38.1 1766-08-01 1766  8   NA  NA 
# 13  53.8 1766-09-01 1766  9   NA  NA 
# 14  71.9 1766-10-01 1766  10   NA  NA 
# 15  75.4 1766-11-01 1766  11   NA  NA 
# 16  67.1 1766-12-01 1766  12   NA  NA 
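The same pattern should carry over to the data in the question. Below is a minimal sketch on a subset of those rows, assuming Date is stored as a Date rather than a factor. The names `precip`, `month`, and `precip_diff` are made up for this example, and the month is extracted with base format() only to keep the example free of extra packages.

```r
library(dplyr)

# Subset of the question's data: three years each for January and February
precip <- data.frame(
  PrecipMM = c(122.7, 54.2, 31.9, 50.2, 46.2, 57.7),
  Date = as.Date(c("2004-01-01", "2005-01-01", "2006-01-01",
                   "2004-02-01", "2005-02-01", "2006-02-01"))
)

precip_diff <- precip %>%
  mutate(month = as.integer(format(Date, "%m"))) %>%  # calendar month as the grouping key
  arrange(month, Date) %>%                            # year order within each month
  group_by(month) %>%
  mutate(PrecipMM_diff = PrecipMM - lag(PrecipMM)) %>% # current value minus previous year's value
  ungroup() %>%
  select(PrecipMM, Date, PrecipMM_diff)

precip_diff
```

The first row of each month has no earlier year to compare against, so lag() leaves it as NA, which matches the NA entries in the desired output.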
+0

I've edited the format of my data; I think it's easier to work with now. Do you have any new ideas?

+0

@J.Zhao Doesn't this approach work with your new data? Also, it would be helpful to share your data using `dput`, so that others can easily read it into R.
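As a rough illustration of the `dput` suggestion (the `precip` data frame and the `restored` name here are just examples): dput() prints R code that rebuilds an object, so the printed text can be pasted into a question and evaluated by readers.

```r
# A small sample of the data to share
precip <- data.frame(
  PrecipMM = c(122.7, 54.2),
  Date = as.Date(c("2004-01-01", "2005-01-01"))
)

# Prints a structure(...) call that can be pasted into a post
dput(precip)

# A reader can rebuild the object by evaluating the pasted text
pasted <- paste(capture.output(dput(precip)), collapse = "\n")
restored <- eval(parse(text = pasted))
```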

+0

It works!! Thanks a lot. This is a really simple approach.
