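Fitting data with a proportion dependent variable
Below is a simplified version of my problem, with example data.
Every year I find 40 balls in my yard. Some proportion of them are red. I want to model the proportion of red balls over time.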
library(tidyverse)
library(modelr)

# generate some proportion data that changes by year
data = tibble(
  year = 2011:2020,
  reds = 1:10,             # red balls
  total = 40,              # total number of balls
  propRed = reds / total   # proportion of red balls each year
)

# fit to a model
model = glm(propRed ~ year, XXX_WHAT_GOES_HERE_XXX, data)

# graph the model's prediction and the data
tibble(year = 2000:2030) %>%
  modelr::add_predictions(model, "propRed") %>%
  ggplot() +
  aes(y = propRed, x = year) +
  geom_line() +
  geom_point(data = data)
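This could be a logistic regression. Call `glm` with something like `glm(cbind(reds, total - reds) ~ year, family = 'binomial', data = data)`. – bouncyball
It is unclear what you are asking. Also, you may want to post this on Cross Validated. – www
@bouncyball: I ran `tibble(year = 2000:2030) %>% predict.glm(model, .)`, and it predicted negative values, which should not be possible for a proportion. – sharoz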
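On the negative predictions: `predict()` on a `glm` returns values on the link scale by default, which for a binomial model is log-odds, so negative numbers are expected there. Below is a minimal sketch, assuming the toy data from the question (rebuilt with base R `data.frame` so the block runs on its own; `newdata` and `link_pred` are names made up for illustration). Asking for `type = "response"` should give proportions between 0 and 1.

```r
# Same toy data as in the question, rebuilt with base R (assumption: total is 40 every year)
data <- data.frame(year = 2011:2020, reds = 1:10, total = 40)

# Binomial GLM on (successes, failures) counts, as suggested in the comment above
model <- glm(cbind(reds, total - reds) ~ year, family = binomial, data = data)

# Years to predict for, including extrapolation outside the observed range
newdata <- data.frame(year = 2000:2030)

# Default predictions are on the link (log-odds) scale, so negative values are normal
link_pred <- predict(model, newdata)

# type = "response" applies the inverse logit and returns proportions in (0, 1)
newdata$propRed <- predict(model, newdata, type = "response")

range(link_pred)        # log-odds, may be negative
range(newdata$propRed)  # proportions
```

With `modelr`, `add_predictions()` also takes a `type` argument that is passed on to `predict()`, so `add_predictions(model, "propRed", type = "response")` should give the same curve in the original plotting pipeline. An equivalent way to fit the model keeps the proportion as the response and supplies the number of trials as weights: `glm(propRed ~ year, family = binomial, weights = total, data = data)`.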