2014-03-05 25 views

Summing by one factor, conditional on another factor

I am working with a data frame of stock information. Here is what it looks like:

> str(test) 
'data.frame': 211717 obs. of 19 variables: 
$ Symbol  : Factor w/ 3378 levels "AACC","AACE",..: 1 1 1 1 1 1 1 1 1 1 ... 
$ MktCategory : Factor w/ 3 levels "","NNM","SCM": 2 2 2 2 2 2 2 2 2 2 ... 
$ TSO   : num 37205115 37205115 37205115 37205115 37205115 ... 
$ TSO_Date  : Factor w/ 200 levels "","1/1/2006",..: 137 137 137 137 137 137 137 137 137 137 ... 
$ X.OfMP  : int 56 56 56 56 56 56 56 56 56 56 ... 
$ MPID   : Factor w/ 670 levels "","ABLE","ABNA",..: 608 459 533 618 550 635 307 146 387 482 ... 
$ MP_type  : Factor w/ 4 levels "","C","M","NR": 2 3 4 3 3 3 3 4 3 4 ... 
$ Total_Vol  : int 32900 0 2949 758522 41316 706131 29300 16898 362569 1490 ... 
$ Total_Rank : int 18 0 35 2 17 3 21 26 5 40 ... 
$ Total_Pct  : int 0 0 0 14 0 13 0 0 7 0 ... 
$ Block_Vol  : int 0 0 0 60800 20000 34900 19200 16600 0 0 ... 
$ Block_Rank : int 0 0 0 2 6 4 7 9 0 0 ... 
$ Block_Pct  : int 0 0 0 15 5 9 5 4 0 0 ... 
$ YTD_Total_Vol : num 81615 2929 10684 1949230 190874 ... 
$ YTD_Total_Rank: int 28 59 44 3 17 5 30 27 12 67 ... 
$ YTD_Total_Pct : int 0 0 0 9 0 7 0 0 2 0 ... 
$ YTD_Block_Vol : int 0 0 0 197420 80000 390600 60900 73787 55994 0 ... 
$ YTD_Block_Rank: int 0 0 0 5 13 3 16 14 17 0 ... 
$ YTD_Block_Pct : int 0 0 0 6 3 12 2 2 2 0 ... 

I know how to sum the volume (Total_Vol) by Symbol with the aggregate function:

volbystock<-aggregate(test$Total_Vol,by=list(test$Symbol),FUN=sum) 

But I want to analyze the volume for only a few MPID values. In other words, I want to add a row's Total_Vol to its Symbol's total only if the corresponding MPID is one of the following:

> use_MPID<-c("GSCO","LATS","TACT","INCA","LATS","LQNT","ITGI") 

Answers


It looks like you can simply subset the data.frame before aggregating:

use_MPID <- c("GSCO","LATS","TACT","INCA","LATS","LQNT","ITGI") 
relevant.symbols <- which(test$MPID %in% use_MPID) 
volbystock <- aggregate(test$Total_Vol[relevant.symbols], 
    by=list(test$Symbol[relevant.symbols]), 
    FUN=sum) 

Does this solve your problem?
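
To see what the subsetting approach does, here is a minimal sketch on a small made-up data frame. The `toy` data, its values, and the names `keep` and `vol_by_symbol` are assumptions for illustration only; just the column names and `use_MPID` come from the question.

```r
# Hypothetical toy data; only the column names mirror the question's data frame.
toy <- data.frame(
  Symbol    = c("AACC", "AACC", "AACC", "AACE", "AACE", "AAON"),
  MPID      = c("GSCO", "ABLE", "LATS", "GSCO", "ABNA", "ABLE"),
  Total_Vol = c(100, 50, 25, 10, 5, 7),
  stringsAsFactors = TRUE
)

# The repeated "LATS" is harmless: %in% only tests membership.
use_MPID <- c("GSCO", "LATS", "TACT", "INCA", "LATS", "LQNT", "ITGI")

# Logical mask of the rows whose MPID is in the list.
keep <- toy$MPID %in% use_MPID

# Sum Total_Vol per Symbol over the kept rows only.
vol_by_symbol <- aggregate(toy$Total_Vol[keep],
                           by = list(Symbol = toy$Symbol[keep]),
                           FUN = sum)
print(vol_by_symbol)
```

In this sketch AACC sums to 125 and AACE to 10. AAON has no row with a listed MPID, so it does not appear in the result at all rather than showing up with a total of 0.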

Edit

Even better, you can use the formula interface of aggregate together with its optional subset argument:

use_MPID <- c("GSCO","LATS","TACT","INCA","LATS","LQNT","ITGI") 
# The formula is passed positionally; with data=test the columns can be named directly. 
volbystock <- aggregate(Total_Vol ~ Symbol, 
    data=test, 
    subset=(MPID %in% use_MPID), 
    FUN=sum) 
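
As a quick illustration of the formula interface, here is a minimal sketch on made-up data. The `toy` data frame and the name `by_formula` are hypothetical; the column names and `use_MPID` follow the question.

```r
# Hypothetical toy data; only the column names mirror the question's data frame.
toy <- data.frame(
  Symbol    = c("AACC", "AACC", "AACE", "AACE"),
  MPID      = c("GSCO", "ABLE", "LATS", "INCA"),
  Total_Vol = c(100, 50, 25, 10)
)

use_MPID <- c("GSCO", "LATS", "TACT", "INCA", "LATS", "LQNT", "ITGI")

# With data = toy, both the formula and the subset expression are
# evaluated against toy's columns, so no toy$ prefixes are needed.
by_formula <- aggregate(Total_Vol ~ Symbol,
                        data = toy,
                        subset = MPID %in% use_MPID,
                        FUN = sum)
print(by_formula)
```

One practical difference from the index-based version is that the result columns are named after the variables in the formula (Symbol and Total_Vol here) instead of the generic Group.1 and x.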

Using dplyr you can do something like this:

# load dplyr 
library(dplyr) 

# create a vector of the MPIDs you are interested in 
use_MPID <- c("GSCO","LATS","TACT","INCA","LATS","LQNT","ITGI") 

# create a fake dataset just for illustration 
# (built column by column so TotalVol stays numeric; cbind() would coerce everything to character) 
test <- data.frame(Symbol = c("ci", "di", "bi", "bi"), 
    MPID = c("GSCO","LATS","TACT","INCA"), 
    TotalVol = c(35, 110, 201, 435)) 

# keep only the MPIDs of interest, then sum TotalVol per Symbol 
# (%>% replaces the older %.% pipe, which current dplyr no longer provides) 
volbystock <- test %>% 
    filter(MPID %in% use_MPID) %>% 
    group_by(Symbol) %>% 
    summarise(TotalVol = sum(TotalVol))