2013-03-15 11 views
1

Grouping values in separate columns in R

I am new to R and have data in the following form, with these two columns:

Broker_ID_Buy   Broker_ID_Sell 

    638      423 
    546      728 
    423      321 
    546      423 

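and so on. Around 28 different brokers appear in either the buy or the sell position at different times.

I need to arrange the data like this: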

Broker_ID_638   Broker_ID_423   Broker_ID_546 

BP       SP     IP 
IP       IP     BP 
IP       BP     IP 
IP       SP     BP 

where BP = buy position, SP = sell position, and IP = idle position.

I want to use these three states in a Markov chain.

+1

So at any one time (one row of the first table) there can be only one buying broker and one selling broker, right? – digEmAll 2013-03-15 17:53:18

Answers

2

This seems to get you in the right ballpark:

library(reshape2) 
x <- data.frame(BP = c(638,546,423,546), SP = c(423, 728, 321, 423)) 
x$index <- 1:nrow(x) 
x.m <- melt(x, id.vars = "index") 
out <- dcast(index ~ value, data = x.m, value.var="variable") 
out[is.na(out)] <- "IP" 
out 
#---
  index 321 423 546 638 728
1     1  IP  SP  IP  BP  IP
2     2  IP  IP  BP  IP  SP
3     3  SP  BP  IP  IP  IP
4     4  IP  SP  BP  IP  IP
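
If reshape2 is not available, roughly the same table can be sketched in base R with matrix indexing. This is only a sketch under the assumption raised in the comment above, namely that each row has exactly one buyer and one seller and that they are different brokers. The names `brokers`, `rows` and `states` are made up for illustration.

```r
# Same toy data as above: one buyer (BP) and one seller (SP) per time step
x <- data.frame(BP = c(638, 546, 423, 546), SP = c(423, 728, 321, 423))

# Every broker that appears in either column, in sorted order
brokers <- sort(unique(c(x$BP, x$SP)))

# Start with everybody idle ("IP") at every time step
states <- matrix("IP", nrow = nrow(x), ncol = length(brokers),
                 dimnames = list(NULL, paste0("Broker_ID_", brokers)))

# Overwrite the (row, column) cells of the buyers and sellers.
# Assumption: a broker is never buyer and seller in the same row,
# otherwise "SP" would silently overwrite "BP".
rows <- seq_len(nrow(x))
states[cbind(rows, match(x$BP, brokers))] <- "BP"
states[cbind(rows, match(x$SP, brokers))] <- "SP"

states
```

The result is a character matrix with one row per time step and one `Broker_ID_*` column per broker; wrap it in `as.data.frame()` if a data frame is needed.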
1

Here is another possible solution:

# create your table 
txt <- 
"Broker_ID_Buy,Broker_ID_Sell 
638,423 
546,728 
423,321 
546,423" 
dt1 <- read.csv(text=txt) 

# turn "Time, Broker_ID_Buy, Broker_ID_Sell" data.frame 
# into "Time, Broker_ID, Position" 
buyers <- data.frame(Time=1:nrow(dt1),
                     Broker_ID=dt1$Broker_ID_Buy,
                     Position="BP",
                     stringsAsFactors=FALSE)
sellers <- data.frame(Time=1:nrow(dt1),
                      Broker_ID=dt1$Broker_ID_Sell,
                      Position="SP",
                      stringsAsFactors=FALSE)
longDT <- rbind(buyers,sellers)

# pivot the broker ids onto the columns
wideDT <- reshape(data=longDT, direction="wide",
                  timevar="Broker_ID", idvar="Time", v.names="Position")

# tidy up the column names and turn NAs into "IP"
# (the dot is escaped so it matches a literal "." rather than any character)
names(wideDT) <- sub(pattern="^Position\\.", replacement="Broker_ID_", x=names(wideDT))
wideDT[is.na(wideDT)] <- "IP"

Result:

> wideDT
  Time Broker_ID_638 Broker_ID_546 Broker_ID_423 Broker_ID_728 Broker_ID_321
1    1            BP            IP            SP            IP            IP
2    2            IP            BP            IP            SP            IP
3    3            IP            IP            BP            IP            SP
4    4            IP            BP            SP            IP            IP
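
Since the stated goal is a Markov chain over the three states, one possible next step is to count the transitions between consecutive time steps and row-normalise them. The following is only a rough sketch, not a full model. It retypes three of the columns from the result above so that it runs on its own, and it assumes that all brokers share a single transition matrix so their counts can be pooled. The names `wide`, `transition_counts`, `counts` and `P` are hypothetical.

```r
# Three broker columns copied from the result above (one row per time step)
wide <- data.frame(
  Broker_ID_638 = c("BP", "IP", "IP", "IP"),
  Broker_ID_546 = c("IP", "BP", "IP", "BP"),
  Broker_ID_423 = c("SP", "IP", "BP", "SP"),
  stringsAsFactors = FALSE
)
states <- c("BP", "SP", "IP")

# Count transitions from the state at time t to the state at time t + 1
transition_counts <- function(s) {
  from <- factor(head(s, -1), levels = states)
  to   <- factor(tail(s, -1), levels = states)
  table(from, to)
}

# Pool the counts over all brokers
# (assumption: every broker follows the same chain)
counts <- Reduce(`+`, lapply(wide, transition_counts))

# Row-normalise to get estimated transition probabilities;
# a state that is never left would give a row of NaN
P <- prop.table(counts, margin = 1)
P
```

With only four time steps the estimates are of course very rough; the same code should apply unchanged to the full table with all 28 brokers, provided the columns hold only the three state labels.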