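Logistic regression on NBA shot data
I am using NBA shot data and trying to build shot prediction models with different regression techniques. However, when I try to fit a logistic regression model, I get the following warning message: "Warning message: glm.fit: algorithm did not converge". In addition, the predictions do not seem to work at all: they are identical to the original Y variable (make or miss). My code is below. I got the data from here: Shot Data.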
nba_shots <- read.csv("shot_logs.csv")
library(dplyr)
library(ggplot2)
library(data.table)
library("caTools")
library(glmnet)
library(caret)
nba_shots_clean <- data.frame("game_id" = nba_shots$GAME_ID, "location" =
nba_shots$LOCATION, "shot_number" = nba_shots$SHOT_NUMBER,
"closest_defender" = nba_shots$CLOSEST_DEFENDER,
"defender_distance" = nba_shots$CLOSE_DEF_DIST, "points" = nba_shots$PTS,
"player_name" = nba_shots$player_name, "dribbles" = nba_shots$DRIBBLES,
"shot_clock" = nba_shots$SHOT_CLOCK, "quarter" = nba_shots$PERIOD,
"touch_time" = nba_shots$TOUCH_TIME, "game_result" = nba_shots$W
, "FGM" = nba_shots$FGM)
mean(nba_shots_clean$shot_clock) # NA
# this gave NA return which means that there are NAs in this column that we
# need to clean up
# if the shot clock was NA I assume that this means it was the end of a
# quarter and the shot clock was off.
# For now I'm going to just set all of these NAs equal to zero, so all zeros
# mean it is the end of a quarter
# checking the amount of NAs
last_shots <- nba_shots_clean[is.na(nba_shots_clean$shot_clock),]
nrow(last_shots) # this tells me there is 5567 shots taken when the shot
# clock was turned off at the end of a quarter
# setting these NAs equal to zero
nba_shots_clean[is.na(nba_shots_clean)] <- 0
# checking to see if it worked
nrow(nba_shots_clean[is.na(nba_shots_clean$shot_clock),]) # it worked
# create a test and train set
split = sample.split(nba_shots_clean, SplitRatio=0.75)
nbaTrain = subset(nba_shots_clean, split==TRUE)
nbaTest = subset(nba_shots_clean, split==FALSE)
# logistic regression
nbaLogitModel <- glm(FGM ~ location + shot_number + defender_distance +
points + dribbles + shot_clock + quarter + touch_time, data=nbaTrain,
family="binomial", na.action = na.omit)
nbaPredict = predict(nbaLogitModel, newdata=nbaTest, type="response")
cm = table(nbaTest$FGM, nbaPredict > 0.5)
print(cm)
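This gives me the output below, which tells me the predictions are not doing anything, because they are exactly the same as the original outcomes.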
    FALSE  TRUE
  0 21428     0
  1     0 17977
I would really appreciate any guidance.
Try reading this: https://stats.stackexchange.com/questions/5354/logistic-regression-model-does-not-converge – staove7
Please try to provide a minimal [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data. If we can't run the code, it is hard to help you. – MrFlick
@MrFlick Isn't the csv file provided through the link good enough? – Chris95
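One possible explanation, offered as a guess rather than a confirmed diagnosis: if `points` is 0 on every miss and 2 or 3 on every make, then it encodes `FGM` exactly. Including it as a predictor gives the model perfect separation, which is a common reason for `glm.fit` convergence warnings and for a confusion matrix with no errors. The sketch below uses simulated data, not the real `shot_logs.csv`; the data frame `shots`, its values, and the reduced set of predictors are all made up for illustration.

```r
# Simulated stand-in for the shot data (hypothetical values, not the real CSV).
set.seed(1)
n <- 500
shots <- data.frame(
  defender_distance = runif(n, 0, 10),
  shot_clock        = runif(n, 0, 24)
)
# Make or miss depends only weakly on defender distance.
shots$FGM <- rbinom(n, 1, plogis(-0.5 + 0.1 * shots$defender_distance))
# Assumption: points is 0 on a miss and 2 or 3 on a make, so it leaks the outcome.
shots$points <- shots$FGM * sample(c(2, 3), n, replace = TRUE)

# Model that includes the leaking predictor. Warnings are suppressed here
# because separation is exactly what this fit is meant to demonstrate.
leaky <- suppressWarnings(
  glm(FGM ~ defender_distance + shot_clock + points,
      data = shots, family = binomial)
)
leaky_cm <- table(shots$FGM, predict(leaky, type = "response") > 0.5)
print(leaky_cm)

# Same model without points: only information available before the shot outcome.
honest <- glm(FGM ~ defender_distance + shot_clock,
              data = shots, family = binomial)
honest_cm <- table(shots$FGM, predict(honest, type = "response") > 0.5)
print(honest_cm)
```

If this is what is happening in the real data, dropping `points` from the formula should make the warning go away and produce a confusion matrix with actual errors in it. Separately, `sample.split` from caTools expects a vector of labels such as `nba_shots_clean$FGM`, not the whole data frame, so the split itself may be worth checking too.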