2017-02-25 83 views
3

I'm learning R and don't know how to solve this problem: adding a column to each quantmod symbol.

library(quantmod) 
library(xts) 

# get market data 
Nasdaq100_Symbols <- c("AAPL", "AAL") 
getSymbols(Nasdaq100_Symbols) 

# merge them together 
nasdaq100 <- data.frame(as.xts(merge(AAPL, AAL))) 
#tail(nasdaq100[,1:12],2) 

#make percent difference column 
nasdaq100$PD <- (((nasdaq100$AAPL.High - nasdaq100$AAPL.Open)/nasdaq100$AAPL.Open) * 100) 

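I want to add a percent-difference column, but the code above only works for the AAPL symbol (or whichever single symbol is hard-coded); it does not create a PD column for every symbol.

Do you have to add the column somehow before merging with xts, or can I tell R to create it for each symbol in the newly merged frame?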

Edit: I'm training on the data, so I need all the symbols as headers, like:

  Date      AAPL.Open  AAPL.High  AAPL.Volume  AAL.Open  AAL.High
  1/3/2007  86.29      86.58      309579900    53.89     56.92
  1/4/2007  84.05      85.95      211815100    56.30     59.15
  1/5/2007  85.77      86.20      208685400    58.83     59.15

Answers

6

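In my experience it usually makes more sense to keep your financial data as xts objects, for later manipulation with other technical indicators and so on, unless you intend to run predictive models in, say, caret, in which case converting to a data.frame may make sense.

Consider keeping each symbol's data as an element of a container, e.g.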

update_sym_md <- function(sym, env = .GlobalEnv) { 
    x <- get(sym, env) 
    pd <- setNames((Hi(x) - Op(x))/Op(x), "PD") 
    merge(x, pd) 
} 

# Adjust env for location of xts symbol data 
l.syms <- lapply(Nasdaq100_Symbols, update_sym_md, env = .GlobalEnv) 

lapply(l.syms, head) 
# [[1]] 
# AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted   PD 
# 2007-01-03  86.29  86.58 81.90  83.80 309579900  10.85709 0.003360760 
# 2007-01-04  84.05  85.95 83.82  85.66 211815100  11.09807 0.022605556 
# 2007-01-05  85.77  86.20 84.40  85.05 208685400  11.01904 0.005013373 
# 2007-01-08  85.96  86.53 85.28  85.47 199276700  11.07345 0.006630991 
# 2007-01-09  86.45  92.98 85.15  92.57 837324600  11.99333 0.075534942 
# 2007-01-10  94.75  97.80 93.45  97.00 738220000  12.56728 0.032190006 
# 
# [[2]] 
# AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted   PD 
# 2007-01-03 53.89 56.92 53.89  56.30 2955600  54.80361 0.0562256273 
# 2007-01-04 56.30 59.15 53.65  58.84 2614500  57.27610 0.0506217238 
# 2007-01-05 58.83 59.15 57.90  58.29 1656300  56.74072 0.0054394015 
# 2007-01-08 57.30 60.48 57.04  57.93 2163200  56.39028 0.0554974006 
# 2007-01-09 59.44 60.20 57.56  57.90 2098600  56.36108 0.0127860366 
# 2007-01-10 60.03 60.04 57.34  58.93 3892200  57.36371 0.0001666167 
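
The container idea can also be sketched without any downloads. The following is a minimal illustration in base R, assuming toy data frames (values copied from the first rows in the question) in place of the xts objects that getSymbols() returns; `md`, `add_pd` and `md_pd` are hypothetical names, not part of quantmod.

```r
# Toy per-symbol data frames built from the first rows shown in the question;
# in practice these would be the xts objects returned by getSymbols().
md <- list(
  AAPL = data.frame(Open = c(86.29, 84.05, 85.77), High = c(86.58, 85.95, 86.20)),
  AAL  = data.frame(Open = c(53.89, 56.30, 58.83), High = c(56.92, 59.15, 59.15))
)

# Hypothetical helper: append the (High - Open) / Open column to one symbol's data.
add_pd <- function(x) {
  x$PD <- (x$High - x$Open) / x$Open
  x
}

# lapply() over a named list keeps the names, so results are addressed by symbol.
md_pd <- lapply(md, add_pd)
names(md_pd)
md_pd$AAL
```

Using a named list means each result can be reached as md_pd$AAPL or md_pd$AAL rather than by position.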

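Also, if you want returns or raw prices across symbols side by side in a single xts object, rather than in a data.frame, you may find the qmao package useful.

For example: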

install.packages("qmao", repos="http://R-Forge.R-project.org", type = "source") 
library(qmao) 

pf <- makePriceFrame(Nasdaq100_Symbols) 
head(pf, 3) 
#    AAPL  AAL 
# 2007-01-03 10.85709 54.80361 
# 2007-01-04 11.09807 57.27610 
# 2007-01-05 11.01904 56.74072 
rf <- makeReturnFrame(Nasdaq100_Symbols) 
head(rf) 

#     AAPL   AAL 
# 2007-01-03   NA   NA 
# 2007-01-04 0.021952895 0.0441273684 
# 2007-01-05 -0.007146715 -0.0093913155 
# 2007-01-08 0.004926208 -0.0061951917 
# 2007-01-09 0.079799692 -0.0005179716 
# 2007-01-10 0.046745798 0.0176329011 

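Update in response to the OP's comment:

To join all the data into one row per date, try this:

(Aside: if you're going to use non-linear prediction models on this data frame, make sure you first consider scaling your data points across the securities in each row.)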

x.cbind <- do.call(cbind, l.syms) 
head(x.cbind) 
# AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted   PD AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted   PD.1 
# 2007-01-03  86.29  86.58 81.90  83.80 309579900  10.85709 0.003360760 53.89 56.92 53.89  56.30 2955600  54.80361 0.0562256273 
# 2007-01-04  84.05  85.95 83.82  85.66 211815100  11.09807 0.022605556 56.30 59.15 53.65  58.84 2614500  57.27610 0.0506217238 
# 2007-01-05  85.77  86.20 84.40  85.05 208685400  11.01904 0.005013373 58.83 59.15 57.90  58.29 1656300  56.74072 0.0054394015 
# 2007-01-08  85.96  86.53 85.28  85.47 199276700  11.07345 0.006630991 57.30 60.48 57.04  57.93 2163200  56.39028 0.0554974006 
# 2007-01-09  86.45  92.98 85.15  92.57 837324600  11.99333 0.075534942 59.44 60.20 57.56  57.90 2098600  56.36108 0.0127860366 
# 2007-01-10  94.75  97.80 93.45  97.00 738220000  12.56728 0.032190006 60.03 60.04 57.34  58.93 3892200  57.36371 0.0001666167 

df.cbind <- data.frame("time" = index(x.cbind), coredata(x.cbind)) 
head(df.cbind) 
# time AAPL.Open AAPL.High AAPL.Low AAPL.Close AAPL.Volume AAPL.Adjusted   PD AAL.Open AAL.High AAL.Low AAL.Close AAL.Volume AAL.Adjusted   PD.1 
# 1 2007-01-03  86.29  86.58 81.90  83.80 309579900  10.85709 0.003360760 53.89 56.92 53.89  56.30 2955600  54.80361 0.0562256273 
# 2 2007-01-04  84.05  85.95 83.82  85.66 211815100  11.09807 0.022605556 56.30 59.15 53.65  58.84 2614500  57.27610 0.0506217238 
# 3 2007-01-05  85.77  86.20 84.40  85.05 208685400  11.01904 0.005013373 58.83 59.15 57.90  58.29 1656300  56.74072 0.0054394015 
# 4 2007-01-08  85.96  86.53 85.28  85.47 199276700  11.07345 0.006630991 57.30 60.48 57.04  57.93 2163200  56.39028 0.0554974006 
# 5 2007-01-09  86.45  92.98 85.15  92.57 837324600  11.99333 0.075534942 59.44 60.20 57.56  57.90 2098600  56.36108 0.0127860366 
# 6 2007-01-10  94.75  97.80 93.45  97.00 738220000  12.56728 0.032190006 60.03 60.04 57.34  58.93 3892200  57.36371 0.0001666167 
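
The scaling aside can be read in more than one way. One common interpretation, assumed here, is to standardize each numeric column to mean 0 and standard deviation 1 so that large-magnitude fields such as volume do not dominate. A minimal base R sketch on a toy stand-in for df.cbind (`df`, `num_cols` and `df_scaled` are illustrative names):

```r
# Toy stand-in for df.cbind: values copied from the first three rows above.
df <- data.frame(
  time        = as.Date(c("2007-01-03", "2007-01-04", "2007-01-05")),
  AAPL.Open   = c(86.29, 84.05, 85.77),
  AAPL.Volume = c(309579900, 211815100, 208685400),
  AAL.Open    = c(53.89, 56.30, 58.83)
)

# Pick out the numeric columns; the Date column is left untouched.
num_cols <- vapply(df, is.numeric, logical(1))

# Standardize every numeric column to mean 0 and standard deviation 1.
df_scaled <- df
df_scaled[num_cols] <- lapply(df[num_cols], function(v) (v - mean(v)) / sd(v))
df_scaled
```

When a model is trained and then applied to new data, the means and standard deviations would normally be estimated on the training rows only and reused afterwards.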

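To better understand how the qmao functions work, why not look at the examples in the documentation and go from there? See ?makeReturnFrame. Look at the source code to really understand what is going on (and pick up good coding style along the way, becoming a better R programmer).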
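
If the merged frame should carry symbol-prefixed names such as AAPL.PD and AAL.PD rather than PD and PD.1, one option is to build the column names from each symbol. A minimal base R sketch, assuming a toy merged frame with the question's column layout (values copied from its table); the loop is an illustration, not part of quantmod or qmao:

```r
# Toy merged frame with the column layout from the question.
nasdaq100 <- data.frame(
  AAPL.Open = c(86.29, 84.05, 85.77), AAPL.High = c(86.58, 85.95, 86.20),
  AAL.Open  = c(53.89, 56.30, 58.83), AAL.High  = c(56.92, 59.15, 59.15),
  row.names = c("2007-01-03", "2007-01-04", "2007-01-05")
)
symbols <- c("AAPL", "AAL")

# Build the column names from each symbol and add "<symbol>.PD" in a loop.
for (sym in symbols) {
  op <- nasdaq100[[paste0(sym, ".Open")]]
  hi <- nasdaq100[[paste0(sym, ".High")]]
  nasdaq100[[paste0(sym, ".PD")]] <- (hi - op) / op * 100  # percent, as in the question
}
names(nasdaq100)
```

In the xts-based update_sym_md() above, the analogous change would be to pass paste0(sym, ".PD") to setNames() instead of "PD".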

+0

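I'm training on the data, so I want all the symbols at the top. I couldn't really figure out how qmao works; I could only use it to pull that one data field. – Alteredorange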

+0

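@Alteredorange When you say you want all the symbols at the top, I take it you mean you want each security's data in its own columns of the data frame, so that each row is one observation in a predictive model that uses cross-sectional security price data. I have edited my answer to show one way of doing this. – FXQuantTrader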

+0

This seems to be exactly what I need! One more question: is there any way to name the columns "symbol.PD", so that they would be AAPL.PD and AAL.PD instead of PD and PD.1? – Alteredorange

1

When I use quantmod's getSymbols function, what I usually do is write my own little wrapper function, along these lines:

library(quantmod) 
# 1. write the wrapper function 
my_wrapper <- function(tickers, from, to) {
    result_list <- lapply(tickers, function(ticker) {
        # auto.assign = FALSE returns the xts object instead of assigning it
        tmp <- getSymbols(ticker, from = from, to = to, auto.assign = FALSE)
        data.frame(date = index(tmp),
                   ticker = ticker,
                   open = as.numeric(Op(tmp)),
                   high = as.numeric(Hi(tmp)),
                   close = as.numeric(Cl(tmp)),
                   adj = as.numeric(Ad(tmp)))
    })

    # stack the per-ticker data frames into one long data frame
    do.call(rbind, result_list)
}

# 2. download and inspect the data 
my_df <- my_wrapper(c("AAPL", "AAL"), from = "2010-01-01", to = "2016-12-31") 
summary(my_df) 
# date    ticker   open    high   close    adj   
# Min. :2010-01-04 AAPL:1762 Min. : 3.99 Min. : 4.06 Min. : 4.00 Min. : 3.894 
# 1st Qu.:2011-09-30 AAL :1762 1st Qu.: 17.16 1st Qu.: 17.46 1st Qu.: 17.23 1st Qu.: 16.770 
# Median :2013-07-04    Median : 72.94 Median : 73.45 Median : 73.02 Median : 41.920 
# Mean :2013-07-02    Mean :168.48 Mean :170.10 Mean :168.40 Mean : 49.208 
# 3rd Qu.:2015-04-06    3rd Qu.:318.11 3rd Qu.:320.39 3rd Qu.:318.23 3rd Qu.: 72.969 
# Max. :2016-12-30    Max. :702.41 Max. :705.07 Max. :702.10 Max. :127.966 

Then, to calculate the difference, I would suggest using dplyr, data.table, or some other data-frame manipulation package. Here I use dplyr.

# 3. Calculate the difference using dplyr 
library(dplyr) 

my_rets <- my_df %>% group_by(ticker) %>% mutate(pd = (high - open)/open) 

my_rets 
# Source: local data frame [3,524 x 7] 
# Groups: ticker [2] 
# 
#   date ticker open high close  adj   pd 
#  <date> <fctr> <dbl> <dbl> <dbl> <dbl>  <dbl> 
# 1 2010-01-04 AAPL 213.43 214.50 214.01 27.72704 0.0050133440 
# 2 2010-01-05 AAPL 214.60 215.59 214.38 27.77498 0.0046132153 
# 3 2010-01-06 AAPL 214.38 215.23 210.97 27.33318 0.0039649549 
# 4 2010-01-07 AAPL 211.75 212.00 210.58 27.28265 0.0011806659 
# 5 2010-01-08 AAPL 210.30 212.00 211.98 27.46403 0.0080837473 
# 6 2010-01-11 AAPL 212.80 213.00 210.11 27.22176 0.0009398731 
# 7 2010-01-12 AAPL 209.19 209.77 207.72 26.91211 0.0027725991 
# 8 2010-01-13 AAPL 207.87 210.93 210.65 27.29172 0.0147206905 
# 9 2010-01-14 AAPL 210.11 210.46 209.43 27.13366 0.0016657655 
# 10 2010-01-15 AAPL 210.93 211.60 205.93 26.68020 0.0031764188 
# # ... with 3,514 more rows 
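
Since the question asks for one set of columns per symbol, the long result can be pivoted to wide form afterwards. A minimal base R sketch, assuming a toy long-format frame shaped like my_df (values copied from the question's table); note that reshape() puts the ticker after the field name (pd.AAPL), not before it:

```r
# Toy long-format frame shaped like my_df.
my_df <- data.frame(
  date   = rep(as.Date(c("2007-01-03", "2007-01-04")), times = 2),
  ticker = rep(c("AAPL", "AAL"), each = 2),
  open   = c(86.29, 84.05, 53.89, 56.30),
  high   = c(86.58, 85.95, 56.92, 59.15)
)

# The percent difference is a row-wise calculation, so no grouping is needed here.
my_df$pd <- (my_df$high - my_df$open) / my_df$open

# Long -> wide: one row per date, one block of columns per ticker.
wide <- reshape(my_df, idvar = "date", timevar = "ticker", direction = "wide")
names(wide)
```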

P.S. You can find a good introduction to dplyr here: https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

2

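I'd like to add a solution using the relatively new and interesting package tidyquant, which is well suited to tasks like this. You can use all the tidyverse tooling while leveraging the quantitative functions of xts, quantmod and TTR. Have a look at the vignettes, which contain many examples.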

library(tidyquant) 
c("AAPL", "AAL") %>% 
    tq_get(get = "stock.prices") %>% 
    group_by(symbol.x) %>% 
    # OpHi() computes (High - Open) / Open, which matches the diff values shown below
    tq_mutate(ohlc_fun = OHLCV, mutate_fun = OpHi, col_rename = "diff") %>%
    select(-c(low, volume)) # deselect low and volume so the added column 'diff' is visible

Source: local data frame [5,110 x 7] 
Groups: symbol.x [2] 

    symbol.x  date open high close adjusted   diff 
     <chr>  <date> <dbl> <dbl> <dbl> <dbl>  <dbl> 
1  AAPL 2007-01-03 86.29 86.58 83.80 10.85709 0.0033607603 
2  AAPL 2007-01-04 84.05 85.95 85.66 11.09807 0.0226055559 
3  AAPL 2007-01-05 85.77 86.20 85.05 11.01904 0.0050133730 
4  AAPL 2007-01-08 85.96 86.53 85.47 11.07345 0.0066309913 
5  AAPL 2007-01-09 86.45 92.98 92.57 11.99333 0.0755349424 

UPDATE: someone asked about adding the symbol to the column names.

Assuming you saved the data frame above to the variable stocks:

lapply(unique(stocks$symbol.x), function(x) stocks[stocks$symbol.x == x,]) %>% 
lapply(function(x) { 
    names(x) <- paste0(x$symbol.x[1],'.',colnames(x)) 
    x 
}) 
[[1]] 
Source: local data frame [2,555 x 7] 
Groups: symbol.x [1] 

    AAPL.symbol.x AAPL.date AAPL.open AAPL.high AAPL.close 
      <chr>  <date>  <dbl>  <dbl>  <dbl> 
1   AAPL 2007-01-03  86.29  86.58  83.80 
2   AAPL 2007-01-04  84.05  85.95  85.66 
3   AAPL 2007-01-05  85.77  86.20  85.05 
4   AAPL 2007-01-08  85.96  86.53  85.47 
5   AAPL 2007-01-09  86.45  92.98  92.57 
6   AAPL 2007-01-10  94.75  97.80  97.00 
7   AAPL 2007-01-11  95.94  96.78  95.80 
8   AAPL 2007-01-12  94.59  95.06  94.62 
9   AAPL 2007-01-16  95.68  97.25  97.10 
10   AAPL 2007-01-17  97.56  97.60  94.95 
# ... with 2,545 more rows, and 2 more variables: 
# AAPL.adjusted <dbl>, AAPL.diff <dbl> 

[[2]] 
Source: local data frame [2,555 x 7] 
Groups: symbol.x [1] 

    AAL.symbol.x AAL.date AAL.open AAL.high AAL.close 
      <chr>  <date> <dbl> <dbl>  <dbl> 
1   AAL 2007-01-03 53.89 56.92  56.30 
2   AAL 2007-01-04 56.30 59.15  58.84 
3   AAL 2007-01-05 58.83 59.15  58.29 
4   AAL 2007-01-08 57.30 60.48  57.93 
5   AAL 2007-01-09 59.44 60.20  57.90 
6   AAL 2007-01-10 60.03 60.04  58.93 
7   AAL 2007-01-11 59.18 61.20  61.20 
8   AAL 2007-01-12 61.20 62.50  60.81 
9   AAL 2007-01-16 60.81 62.10  61.96 
10   AAL 2007-01-17 60.96 61.89  58.65 
# ... with 2,545 more rows, and 2 more variables: 
# AAL.adjusted <dbl>, AAL.diff <dbl> 
+0

OK, I got this working, and even stumbled my way into mutate and xts, but I'm training on the data, so I need all the symbols as headers. Is that possible? 'AAL.Volume AAL.Adjusted 2007-01-03 2955600 54.80361' '2007-01-04 2614500 57.27610' '2007-01-05 1656300 56.74072' '2007-01-08 2163200 56.39028' '2007-01-09 2098600 56.36108' '2007-01-10 3892200 57.36371' – Alteredorange

+0

@Alteredorange, see the updated solution. Is this what you mean by "symbols in the headers"? – hvollmeier

+0

Note that the 'tidyquant' package has been updated. The column name "symbol.x" is now "symbol". This was changed in v0.4.0. –
