2015-08-24 92 views
2

Counting occurrences dynamically. R newbie here. Below is a small sample of my data.

TeamHome <- c("LAL", "HOU", "SAS", "LAL") 
TeamAway <- c("IND", "SAS", "LAL", "HOU") 
df <- data.frame(TeamHome, TeamAway, stringsAsFactors = FALSE)
df 

    TeamHome TeamAway 
    LAL  IND 
    HOU  SAS 
    SAS  LAL 
    LAL  HOU 

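Imagine these are the first four games of a season with thousands of games. For both the home team and the away team, I want to compute the cumulative number of games played at home, on the road, and in total. That means 3 new columns for the home team and 3 for the away team. I would like to get something like this (here I have only computed the new variables for the home team):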

TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames 
1  LAL  IND       1      0     1 
2  HOU  SAS       1      0     1 
3  SAS  LAL       1      1     2 
4  LAL  HOU       2      1     3 

To compute the first column (HomeTeamGamesPlayedatHome) I managed to do this:

df$HomeTeamGamesPlayedatHome <- ave(df$TeamHome==df$TeamHome, df$TeamHome, FUN=cumsum) 

But it feels overly complicated, and I can't compute the other columns with this approach.
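For reference, the running count within a group can be written a little more directly with `ave` and `seq_along`. This is only a minimal sketch on the sample data above; the helper name `running_count` and the variables `home_at_home` and `away_on_road` are made up for illustration. It covers each team's count in its own column, but not the cross-column counts (for example, the home team's games on the road).

```r
TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")

# Running count of each value: 1 at its first appearance, 2 at its second, and so on
running_count <- function(x) ave(seq_along(x), x, FUN = seq_along)

home_at_home <- running_count(TeamHome)  # home team's home games so far
away_on_road <- running_count(TeamAway)  # away team's road games so far
```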

I also thought of using the table function to count the number of occurrences:

table(df$TeamHome) 

but it only gives overall totals, and I want the result at any given point in time. Thanks!

+0

Good question, upvote for reproducible example and desired output – user2673238

Answers

2
library(dplyr) 
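# Running count of home games within each home team
df <- df %>% group_by(TeamHome) %>%
    mutate(HomeGames = seq_along(TeamHome)) %>%
    ungroup()
# For game i, count the games up to and including i in which the current home team played away
road <- integer(nrow(df))
for (i in seq_len(nrow(df))) road[i] <- sum(as.character(df$TeamAway[1:i]) == as.character(df$TeamHome[i]))
df$HomeTeamGamesPlayedRoad <- road
df %>% mutate(HomeTeamTotalgames = HomeGames + HomeTeamGamesPlayedRoad)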
    TeamHome TeamAway HomeGames HomeTeamGamesPlayedRoad HomeTeamTotalgames
1  LAL  IND   1      0   1 
2  HOU  SAS   1      0   1 
3  SAS  LAL   1      1   2 
4  LAL  HOU   2      1   3 

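HomeGames is created with seq_along, which numbers the rows within each TeamHome group. HomeTeamGamesPlayedRoad is created by a loop that checks the values in TeamAway up to and including the current game. The last line is the sum of the two created columns.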
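The per-row check described above can also be expressed without an explicit loop. The following is a rough sketch rather than part of the answer: it builds an n-by-n comparison matrix, so it assumes the number of games is small enough for that to fit in memory, and the names `matches` and `road_so_far` are made up for illustration.

```r
TeamHome <- c("LAL", "HOU", "SAS", "LAL")
TeamAway <- c("IND", "SAS", "LAL", "HOU")

# matches[j, i] is TRUE when the away team of game j is the home team of game i
matches <- outer(TeamAway, TeamHome, "==")

# Keep only games j played up to and including game i (j <= i), then count per game i
road_so_far <- colSums(matches & upper.tri(matches, diag = TRUE))
road_so_far
# 0 0 1 1
```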

+0

It works, thanks! I was hoping for something less complicated, but it works. – Sburg13

+0

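Hi Pierre. Thanks a lot for the help. Imagine I have an additional third column with the points scored by the home team (PTS) and a fourth with the points scored by the away team. How can I extend this formula to sum the points the home team has scored at home and on the road? Thanks a lot – Sburg13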

+1

It would be better to ask a follow-up question and link to this question as a reference. –

1

A loop solution:

TeamHome <- c("LAL", "HOU", "SAS", "LAL") 
TeamAway <- c("IND", "SAS", "LAL", "HOU") 
df <- data.frame(TeamHome,TeamAway,HomeTeamGamesPlayedatHome=ave(TeamHome==TeamHome, TeamHome, FUN=cumsum)) 

df$HomeTeamGamesPlayedRoad <- 0L
for (i in seq_len(nrow(df))) {
     # Count the games up to and including i in which the current home team played away
     df$HomeTeamGamesPlayedRoad[i] <- sum(as.character(df$TeamAway[1:i]) == as.character(df$TeamHome[i]))
}
df$HomeTeamTotalgames <- df$HomeTeamGamesPlayedatHome + df$HomeTeamGamesPlayedRoad 

     TeamHome TeamAway HomeTeamGamesPlayedatHome HomeTeamGamesPlayedRoad HomeTeamTotalgames 
1  LAL  IND       1      0     1 
2  HOU  SAS       1      0     1 
3  SAS  LAL       1      1     2 
4  LAL  HOU       2      1     3
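
The question also asks for the same three columns for the away team. One possible way to get all six in a single pass is to keep a running tally per team and read it off game by game. This is only a sketch under assumptions: the function name `add_game_counts` and the three `AwayTeam...` column names are hypothetical, chosen to mirror the home-team names used above.

```r
add_game_counts <- function(df) {
  home <- as.character(df$TeamHome)
  away <- as.character(df$TeamAway)
  teams <- unique(c(home, away))

  # Running tallies per team, updated game by game
  n_home <- setNames(integer(length(teams)), teams)
  n_road <- n_home

  cols <- c("HomeTeamGamesPlayedatHome", "HomeTeamGamesPlayedRoad",
            "AwayTeamGamesPlayedatHome", "AwayTeamGamesPlayedRoad")
  counts <- matrix(0L, nrow = nrow(df), ncol = length(cols),
                   dimnames = list(NULL, cols))

  for (i in seq_len(nrow(df))) {
    h <- home[i]
    a <- away[i]
    # The current game counts toward both teams' tallies
    n_home[h] <- n_home[h] + 1L
    n_road[a] <- n_road[a] + 1L
    counts[i, ] <- c(n_home[h], n_road[h], n_home[a], n_road[a])
  }

  out <- cbind(df, as.data.frame(counts))
  out$HomeTeamTotalgames <- out$HomeTeamGamesPlayedatHome + out$HomeTeamGamesPlayedRoad
  out$AwayTeamTotalgames <- out$AwayTeamGamesPlayedatHome + out$AwayTeamGamesPlayedRoad
  out
}

games <- data.frame(
  TeamHome = c("LAL", "HOU", "SAS", "LAL"),
  TeamAway = c("IND", "SAS", "LAL", "HOU"),
  stringsAsFactors = FALSE
)
result <- add_game_counts(games)
```

Because each game is visited once, the work grows linearly with the number of games, which may matter for a full season of data.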