Calculating percentage change in a "long" data frame

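I have a data frame containing GDP values for several countries, with a date column. The code below recreates a sample data set for two countries (FR and DE) and six years (2005-2010):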

df <- structure(list(geo = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
       2L, 2L, 2L, 2L, 2L), .Label = c("DE", "FR"), class = "factor"), 
       date = structure(c(12784, 13149, 13514, 13879, 14245, 14610, 
       12784, 13149, 13514, 13879, 14245, 14610), class = "Date"), 
       GDP = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 
       2371033.2, 1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 
       1707966.5)), .Names = c("geo", "date", "GDP"), row.names = c(NA, 
       -12L), class = "data.frame")

Now I want to compute an additional column showing the percentage difference from the previous year. I tried the following:

library(quantmod) 
# provides the Delt() function to calculate percent differences 

df$dtGDP <- as.numeric(Delt(df$GDP))

This is wrong, because it computes the 2005 value for FR from the 2010 value for DE. Is there a way to apply a function "per factor level"?
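To make the boundary problem concrete, here is a minimal base-R sketch. It does not use quantmod; the compact `data.frame()` call is assumed to be equivalent to the `structure()` definition in the question, and `naive` is a name introduced only for illustration.

```r
# Sample data from the question, rebuilt compactly (assumed equivalent)
df <- data.frame(
  geo  = factor(rep(c("DE", "FR"), each = 6)),
  date = rep(as.Date(paste0(2005:2010, "-01-01")), times = 2),
  GDP  = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 2371033.2,
           1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 1707966.5)
)

# Percent change over the whole column, ignoring the geo grouping
naive <- c(NA, diff(df$GDP) / head(df$GDP, -1))

# Row 7 is FR 2005, yet its "change" is measured against row 6 (DE 2010)
naive[7]
df$GDP[7] / df$GDP[6] - 1
```

The two printed values agree, which shows that the first FR row is being compared with the last DE row rather than being left as NA.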

2012-11-20 Tungurahua

This is a very typical "split-apply-combine" problem; you will probably find plenty of answers on SO. – BenBarnes

@BenBarnes I still like DWin's answer below! – Ali

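Actually, @BenBarnes is probably right. If you search for 'tapply' and 'ave', you will likely find many examples very similar to mine. (On the other hand, you will also find many examples using plyr-package functions that are essentially isomorphic to this one.) –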

> df$dtGDP <- with(df, ave(GDP, geo, FUN=Delt))
> df
   geo       date     GDP        dtGDP
1   DE 2005-01-01 2147975           NA
2   DE 2006-01-01 2249584  0.047304741
3   DE 2007-01-01 2373993  0.055302971
4   DE 2008-01-01 2382893  0.003748747
5   DE 2009-01-01 2224502 -0.066469970
6   DE 2010-01-01 2371033  0.065871558
7   FR 2005-01-01 1557585           NA
8   FR 2006-01-01 1621633  0.041120329
9   FR 2007-01-01 1715655  0.057979943
10  FR 2008-01-01 1713157 -0.001456178
11  FR 2009-01-01 1636336 -0.044841655
12  FR 2010-01-01 1707966  0.043774742
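If you would rather not depend on quantmod, the same grouped calculation can be sketched in base R. This is only an illustration under some assumptions: `pct_change` is a hypothetical helper (not part of any package), the compact `data.frame()` call is assumed to match the question's data, and rows are assumed to be in date order within each `geo`.

```r
# Sample data from the question, rebuilt compactly (assumed equivalent)
df <- data.frame(
  geo  = factor(rep(c("DE", "FR"), each = 6)),
  date = rep(as.Date(paste0(2005:2010, "-01-01")), times = 2),
  GDP  = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 2371033.2,
           1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 1707966.5)
)

# Hypothetical helper: relative change versus the previous element,
# with NA for the first element (which has no predecessor)
pct_change <- function(x) c(NA, diff(x) / head(x, -1))

# Make sure rows are in date order within each country before differencing
df <- df[order(df$geo, df$date), ]

# ave() applies pct_change separately within each level of geo and
# returns the results in the original row positions
df$dtGDP <- ave(df$GDP, df$geo, FUN = pct_change)
df
```

The first row of each country comes out as NA, so no value is computed across the DE/FR boundary.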

2012-11-21 00:55:12

Very nice! I needed two lines, and you did it with a single command. – Ali

Try this:

foo <- aggregate(GDP~geo, df, function(x) list(Delt(x))) 
df <- cbind(df, dtGDP = as.numeric(unlist(foo[,-1]))) 
df

This assumes you have already run the following:

library(quantmod) 
df <- structure(list(geo = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 
       2L, 2L, 2L, 2L, 2L), .Label = c("DE", "FR"), class = "factor"), 
       date = structure(c(12784, 13149, 13514, 13879, 14245, 14610, 
       12784, 13149, 13514, 13879, 14245, 14610), class = "Date"), 
       GDP = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 
       2371033.2, 1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 
       1707966.5)), .Names = c("geo", "date", "GDP"), row.names = c(NA, 
       -12L), class = "data.frame")
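For comparison, the split-apply-combine steps can also be written out explicitly in base R. This is a sketch, not the method used above: `pct_change` is a hypothetical helper, the compact `data.frame()` call is assumed to match the sample data, and rows are assumed to be in date order within each `geo`. One possible advantage is that `unsplit()` puts each result back in its original row, so the combine step does not depend on the rows being grouped by country.

```r
# Sample data, rebuilt compactly (assumed equivalent to the structure() call)
df <- data.frame(
  geo  = factor(rep(c("DE", "FR"), each = 6)),
  date = rep(as.Date(paste0(2005:2010, "-01-01")), times = 2),
  GDP  = c(2147975, 2249584.4, 2373993.1, 2382892.6, 2224501.8, 2371033.2,
           1557584.8, 1621633.2, 1715655.4, 1713157.1, 1636336.3, 1707966.5)
)

# Hypothetical helper: relative change versus the previous element
pct_change <- function(x) c(NA, diff(x) / head(x, -1))

pieces  <- split(df$GDP, df$geo)       # split: one GDP vector per country
changes <- lapply(pieces, pct_change)  # apply: percent change within each
df$dtGDP <- unsplit(changes, df$geo)   # combine: back into original row order
df
```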

2012-11-20 23:12:17 Ali
