2013-10-01

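Automatic CAGR calculation based on column headers in R

I have a dataset containing revenue data for multiple brands, spread across multiple columns. Each column is named Brandname.yr, with years running from 09 to 12. Some brands only have data for 10-12 or 11-12. I have developed the following code to calculate the CAGR for each client within each brand. However, when I run the code, the CAGR values come out in the thousands (e.g. 4800.74). Does anyone have any suggestions as to why the CAGR calculation is not being performed correctly?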

Data example:

ClientID Rv.Brand1.09 Rv.Brand1.10 Rv.Brand1.11 Rv.Brand1.12 Rv.Brand2.09 Rv.Brand2.10 Rv.Brand2.11 Rv.Brand2.12 
1 6991979 6931508 5071305 4944208 2079843 2990803 2111142 1977724 
2 0 0 0 0 0 0 0 0 
3 0 29425 0 0 0 29425 0 0 
4 0 0 0 0 0 0 0 0 
5 0 0 0 0 0 0 0 0 




library(data.table) 
dataset <- data.table(mdb) 
# Getting the list of column names for which CAGR needs to be calculated 
Instance09 = gsub(
    colnames(dataset)[ 
    grepl(colnames(dataset), pattern = ".09") 
    ], 
    pattern = ".09", 
    replacement = "" 
) 

Instance10 = gsub(
    colnames(dataset)[ 
    grepl(colnames(dataset), pattern = ".10") 
    ], 
    pattern = ".10", 
    replacement = "" 
) 

Instance11 = gsub(
    colnames(dataset)[ 
    grepl(colnames(dataset), pattern = ".11") 
    ], 
    pattern = ".11", 
    replacement = "" 
) 

Instance12 = gsub(
    colnames(dataset)[ 
    grepl(colnames(dataset), pattern = ".12") 
    ], 
    pattern = ".12", 
    replacement = "" 
) 

Instance0912 <- intersect(Instance09,Instance12) 

Instance1012 <- intersect(Instance10,Instance12) 

Instance1112 <- intersect(Instance11,Instance12) 

Instance1012 <- Instance1012[!Instance1012 %in% Instance0912] 

Instance1112 <- Instance1112[!Instance1112 %in% Instance0912] 

for (i in Instance0912) 
{ 
    #calculating CAGR for each i 
    #dataset is a data.table and not a data.frame 
    dataset[, 
      paste0("CAGR",i):= (get(paste0(i,".12"))/get(paste0(i,".09"))^(1/3)) - 1 
      ] 

} 

for (i in Instance1012) 
{ 
    #calculating CAGR for each i 
    #dataset is a data.table and not a data.frame 
    dataset[, 
      paste0("CAGR",i):= (get(paste0(i,".12"))/get(paste0(i,".10"))^(1/2)) - 1 
      ] 

} 

for (i in Instance1112) 
{ 
    #calculating CAGR for each i 
    #dataset is a data.table and not a data.frame 
    dataset[, 
      paste0("CAGR",i):= (get(paste0(i,".12"))/get(paste0(i,".11"))^1) - 1 
      ] 

} 

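Is this a _minimal_ example? That is, is all of the code here really needed to demonstrate your problem? Which line of code generates 4800.74? That is, can you include the output?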

Answer


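You need an extra set of parentheses when you calculate the CAGR. In R, `^` binds more tightly than `/`, so without them only the denominator is raised to the power. For the first case it should look like this: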

for (i in Instance0912) 
{ 
    #calculating CAGR for each i 
    #dataset is a data.table and not a data.frame 
    dataset[, 
      paste0("CAGR",i) := ((get(paste0(i,".12"))/get(paste0(i,".09")))^(1/3)) - 1 
      ] 

} 
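To see why the parentheses matter, here is a small sketch with made-up round numbers (not taken from the question's data): revenue growing from 1000 to 1331 over three years, which is 10% per year. The variable names are illustrative only.

```r
# Illustrative values only: 1000 growing to 1331 over 3 years is 10% per year.
start_rev <- 1000
end_rev   <- 1331
n_years   <- 3

# Without the extra parentheses, ^ is applied first, so this computes
# end_rev / (start_rev^(1/3)) - 1.
wrong <- end_rev / start_rev^(1 / n_years) - 1

# With the ratio parenthesised, the root is taken of the growth ratio.
right <- (end_rev / start_rev)^(1 / n_years) - 1

print(wrong)  # about 132.1
print(right)  # about 0.1
```

With revenues in the millions, the unparenthesised version divides by only the cube root of the starting revenue, which is how values in the thousands appear.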

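You should also be careful with grepl(): in a regular expression, the dot . matches any character. To match a literal dot you need to write it as \. (which is "\\." inside an R string). It is not a problem in this specific case, but it is something to be aware of.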
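A minimal sketch of the difference, assuming a hypothetical column name `Rv.Brand2090.12` that does not appear in the question's data:

```r
# "Rv.Brand2090.12" is a made-up name, added only to show a false match.
cols <- c("ClientID", "Rv.Brand1.09", "Rv.Brand1.10", "Rv.Brand2090.12")

# Unescaped dot: matches any character followed by "09", anywhere in the name.
loose <- grepl(".09", cols)

# Escaped dot plus end anchor: matches only names ending in a literal ".09".
strict <- grepl("\\.09$", cols)

# Strip the year suffix from the strictly matched names.
prefixes <- sub("\\.09$", "", cols[strict])

print(loose)     # FALSE  TRUE FALSE  TRUE
print(strict)    # FALSE  TRUE FALSE FALSE
print(prefixes)  # "Rv.Brand1"
```

Passing `fixed = TRUE` to grepl() is another way to treat the pattern as a literal string, though it then cannot be anchored to the end of the name.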

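I would also suggest reorganizing your data into long format, and perhaps defining a function to calculate the CAGR, to avoid the mistakes that come from spelling out the same calculation separately for each period.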
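One possible way to do this with data.table is sketched below. It is only an illustration under stated assumptions: the helper `calc_cagr`, the column names `col_name`, `brand`, `year` and `revenue`, and the choice to return NA when the starting revenue is zero are all mine, and the sample data is a cut-down version of the question's table with `Rv.Brand2.09` dropped to mimic a brand that only has 10-12 data.

```r
library(data.table)

# Hypothetical helper: CAGR from a start value to an end value over n_years.
# Returns NA when the start value is not positive (an assumption, not from the question).
calc_cagr <- function(start, end, n_years) {
  ifelse(start > 0, (end / start)^(1 / n_years) - 1, NA_real_)
}

# Cut-down sample data; Rv.Brand2.09 is omitted on purpose.
dataset <- data.table(
  ClientID     = 1:3,
  Rv.Brand1.09 = c(6991979, 0, 0),
  Rv.Brand1.10 = c(6931508, 0, 29425),
  Rv.Brand1.11 = c(5071305, 0, 0),
  Rv.Brand1.12 = c(4944208, 0, 0),
  Rv.Brand2.10 = c(2990803, 0, 29425),
  Rv.Brand2.11 = c(2111142, 0, 0),
  Rv.Brand2.12 = c(1977724, 0, 0)
)

# Wide to long: one row per client, brand and year.
long <- melt(dataset, id.vars = "ClientID",
             variable.name = "col_name", value.name = "revenue",
             variable.factor = FALSE)

# Split "Rv.<brand>.<yy>" into brand and two-digit year.
long[, brand := sub("^Rv\\.(.+)\\.([0-9]{2})$", "\\1", col_name)]
long[, year  := as.integer(sub("^Rv\\.(.+)\\.([0-9]{2})$", "\\2", col_name))]

# Sort so the first and last rows of each group are the earliest and latest years.
setorder(long, ClientID, brand, year)

# The number of periods comes from the years actually present for each brand.
result <- long[, .(CAGR = calc_cagr(revenue[1L], revenue[.N], max(year) - min(year))),
               by = .(ClientID, brand)]
print(result)
```

Because the number of periods is derived from the years present, the 09-12, 10-12 and 11-12 cases no longer need separate loops.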