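Using a simple example, a logistic regression model fitted to the mtcars
dataset, together with the algebra described here, I can produce a heat map with the decision boundary: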
library(ggplot2)
library(tidyverse)
data("mtcars")
m1 = glm(am ~ hp + wt, data = mtcars, family = binomial)
# Generate combinations of hp and wt across their observed range. Only
# generating 50 values of each here, which is not a lot but since each
# combination is included, you get 50 x 50 rows
pred_df = expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50)
)
pred_df$pred_p = predict(m1, pred_df, type = "response")
# For a given value of hp (predictor1), find the value of
# wt (predictor2) that will give predicted p = 0.5
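find_boundary = function(hp_val, coefs) {
  # [[ ]] extracts the bare numbers without the coefficient names
  beta_0 = coefs[['(Intercept)']]
  beta_1 = coefs[['hp']]
  beta_2 = coefs[['wt']]
  # Solve beta_0 + beta_1 * hp + beta_2 * wt = 0 for wt and return it
  (-beta_0 - beta_1 * hp_val) / beta_2
}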
# Find the boundary value of wt for each of the 50 values of hp
# Using the algebra in the linked question you can instead find
# the slope and intercept of the boundary, so you could potentially
# skip this step
boundary_df = pred_df %>%
  select(hp) %>%
  distinct() %>%
  mutate(wt = find_boundary(hp, coef(m1)))
ggplot(pred_df, aes(x = hp, y = wt)) +
  geom_tile(aes(fill = pred_p)) +
  geom_line(data = boundary_df)
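This produces a heat map of the predicted probabilities with the decision boundary drawn over it as a line.
Note that this only takes the fixed effects of the model into account, so it may get more complicated if you want to account for the random effects in some way.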
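The code comments above mention that you could skip the per-value boundary calculation by working out the slope and intercept of the boundary directly. Here is a minimal sketch of that variant, assuming the same `am ~ hp + wt` model; the names `boundary_intercept`, `boundary_slope` and `p` are only illustrative, and `geom_abline()` is swapped in for `geom_line()`:

```r
library(ggplot2)

data("mtcars")
m1 <- glm(am ~ hp + wt, data = mtcars, family = binomial)
b <- coef(m1)

# On the boundary the linear predictor is zero:
#   b0 + b1 * hp + b2 * wt = 0  =>  wt = -b0 / b2 - (b1 / b2) * hp
# (illustrative names, not part of the original answer)
boundary_intercept <- unname(-b["(Intercept)"] / b["wt"])
boundary_slope <- unname(-b["hp"] / b["wt"])

# Same 50 x 50 prediction grid as before
pred_df <- expand.grid(
  hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50),
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50)
)
pred_df$pred_p <- predict(m1, pred_df, type = "response")

# Draw the boundary as a straight line instead of joining computed points
p <- ggplot(pred_df, aes(x = hp, y = wt)) +
  geom_tile(aes(fill = pred_p)) +
  geom_abline(intercept = boundary_intercept, slope = boundary_slope)

# Sanity check: the linear predictor at a point on the line
# should be numerically close to 0
check_pt <- data.frame(hp = 150)
check_pt$wt <- boundary_intercept + boundary_slope * check_pt$hp
predict(m1, newdata = check_pt, type = "link")
```

Printing `p` draws the plot. Unlike `geom_line()`, `geom_abline()` does not affect the axis limits, so the tiles alone determine the plotted range.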
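Interesting question! People will be able to help more easily if you provide example data, e.g. a similar logistic regression model fitted to one of the example datasets in R, along with the code for your plot. I think you can find the boundary line by finding the points where the fitted linear predictor (on the log-odds scale) is 0, so you can use some fairly basic algebra to work out an equation for the value of 'predictor2' that satisfies this for a given value of 'predictor1'. – Marius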
In fact, I just found that someone has written out the algebra here: https://stats.stackexchange.com/a/159977/5443 – Marius
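For reference, the algebra for the two-predictor mtcars model used in the answer can be sketched as follows. The symbols $\beta_0$, $\beta_1$, $\beta_2$ correspond to the `(Intercept)`, `hp` and `wt` coefficients, and the last step assumes $\beta_2 \neq 0$:

```latex
\begin{align*}
\operatorname{logit}(p) &= \beta_0 + \beta_1\,\mathrm{hp} + \beta_2\,\mathrm{wt} \\
p = 0.5 \;&\Longleftrightarrow\; \operatorname{logit}(p) = \log\frac{0.5}{1 - 0.5} = 0 \\
0 &= \beta_0 + \beta_1\,\mathrm{hp} + \beta_2\,\mathrm{wt} \\
\mathrm{wt} &= -\frac{\beta_0}{\beta_2} - \frac{\beta_1}{\beta_2}\,\mathrm{hp}
\end{align*}
```

So the boundary is a straight line in the (hp, wt) plane with intercept $-\beta_0/\beta_2$ and slope $-\beta_1/\beta_2$.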