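Calculate a team's lead at regular intervals from scoring data
I have scoring data from a series of hockey games, and I am at the analysis stage. I am trying to plot the home team's lead at 10-minute intervals in each game.
Here is an example of the dataset I have so far: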
library(tidyverse)
# Generate example data ordered by gameid and event_ts
game <- tibble(event_type = "goal", event_ts = runif(n = 1000, min = 0, max = 60),
team = sample(c("home", "away"), size = 1000, replace = TRUE, prob = c(0.55,0.45)),
gameid = sample(100:300, size = 1000, replace = TRUE)) %>%
arrange(gameid, event_ts)
I know I can use summarise to get the final score of each game. Here is a simple example that assumes both teams score at least one goal in every game:
game %>%
  group_by(gameid, team) %>%
  summarise(goals = n()) %>%
  spread(key = team, value = goals) %>%
  # spread() leaves NA (never NULL) for a team with no goals, so test with is.na()
  mutate(away = ifelse(is.na(away), 0, away))
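I want to calculate the home team's lead (positive or negative) at 10-minute intervals throughout each game. This requires summing all the goals scored up to each of those points in time. Here is an example of the structure I am trying to get: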
finished_demo <- tibble(
gameid = sort(rep_len(seq(100, 300, 1), 1206)),
timestamp = rep(seq(10, 60, 10), 201),
home_lead = round(runif(
n = 1206, min = -5, max = 7
))
) %>% arrange(gameid, timestamp)
'library(tidyverse); game %>% mutate(event_ts = ceiling(event_ts/10) * 10) %>% complete(event_ts, gameid, team) %>% group_by(gameid, team, event_ts) %>% summarize(score = coalesce(sum %>% summarise(ts = list(event_ts), score = list(cumsum(score))) %>% unnest() %>% spread(team, score) %>% mutate(home_lead = home - away)' – alistaire
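One possible way to reach the structure shown in finished_demo is sketched below. It is not the code from the comment above but a variant of the same idea: round each goal up to the end of its 10-minute bin, fill in the bins with no goals, then take a running sum per game. The tiny `goals` table, the `delta` column, and the choice to count a goal at exactly minute 10 in the 10-minute bin are assumptions made for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical hand-made data with the same columns as `game` in the question,
# kept small so the result can be checked by eye.
goals <- tibble(
  gameid   = c(1, 1, 1, 2),
  event_ts = c(3.5, 12.0, 47.2, 25.0),
  team     = c("home", "away", "home", "away")
)

home_lead <- goals %>%
  mutate(
    # +1 for a home goal, -1 for an away goal
    delta = if_else(team == "home", 1, -1),
    # End of the 10-minute bin the goal falls in (10, 20, ..., 60);
    # pmax() puts a goal at exactly minute 0 into the first bin.
    timestamp = pmax(ceiling(event_ts / 10) * 10, 10)
  ) %>%
  group_by(gameid, timestamp) %>%
  summarise(delta = sum(delta), .groups = "drop") %>%
  # Add the bins in which nobody scored, with a net change of 0
  complete(gameid, timestamp = seq(10, 60, 10), fill = list(delta = 0)) %>%
  group_by(gameid) %>%
  arrange(timestamp, .by_group = TRUE) %>%
  # Running total of the net change = home lead at the end of each bin
  mutate(home_lead = cumsum(delta)) %>%
  ungroup() %>%
  select(gameid, timestamp, home_lead)

print(home_lead)
```

For game 1 this gives a home lead of 1, 0, 0, 0, 1, 1 at minutes 10 through 60, and for game 2 it gives 0, 0, -1, -1, -1, -1. Replacing `goals` with `game` should apply the same steps to the example data in the question. Games with no goals at all never appear in the scoring data, so they would need to be added separately if they matter.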