2017-08-06

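Plotting survival curves from survreg predictions

I am fairly new to survival analysis and have been working with some standard telco customer churn data. A sample, referred to below as telco, looks like this: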

telco <- read.csv(text = "State,Account_Length,Area_Code,Intl_Plan,Day_Mins,Day_Calls,Day_Charge,Eve_Mins,Eve_Calls,Eve_Charge,Night_Mins,Night_Calls,Night_Charge,Intl_Mins,Intl_Calls,Intl_Charge,CustServ_Calls,Churn
IN,65,415,no,129.1,137,21.95,228.5,83,19.42,208.8,111,9.4,12.7,6,3.43,4,TRUE
RI,74,415,no,187.7,127,31.91,163.4,148,13.89,196,94,8.82,9.1,5,2.46,0,FALSE
IA,168,408,no,128.8,96,21.9,104.9,71,8.92,141.1,128,6.35,11.2,2,3.02,1,FALSE
MT,95,510,no,156.6,88,26.62,247.6,75,21.05,192.3,115,8.65,12.3,5,3.32,3,FALSE
IA,62,415,no,120.7,70,20.52,307.2,76,26.11,203,99,9.14,13.1,6,3.54,4,FALSE
NY,161,415,no,332.9,67,56.59,317.8,97,27.01,160.6,128,7.23,5.4,9,1.46,4,TRUE")

I have run:

library(survival) 

dependentvars = Surv(telco$Account_Length, telco$Churn) 

telcosurvreg = survreg(dependentvars ~ -Churn -Account_Length, dist="gaussian",data=telco) 

telcopred = predict(telcosurvreg, newdata=telco, type="quantile", p=.5) 

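...to get the predicted lifetime for each customer.

What I am struggling with is how to visualize a survival curve for this. Is there a way (preferably in ggplot2) to do this from the data I have?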

This post (https://rstudio-pubs-static.s3.amazonaws.com/5588_72eb65bfbe0a4cb7b655d2eee0751584.html) might help you. – G5W

That post makes the same mistake as this question. Don't build the Surv object outside the survreg function.

Answer

Here is a base R version that plots the predicted survival curves. I have changed the formula so that the curves differ for each row.

> # change setup so we have one covariate 
> telcosurvreg = survreg(
+ Surv(Account_Length, Churn) ~ Eve_Charge, dist = "gaussian", data = telco) 
> telcosurvreg # has more than an intercept 
Call: 
survreg(formula = Surv(Account_Length, Churn) ~ Eve_Charge, data = telco, 
    dist = "gaussian") 

Coefficients: 
(Intercept) Eve_Charge 
227.274695 -3.586121 

Scale= 56.9418 

Loglik(model)= -12.1 Loglik(intercept only)= -12.4 
    Chisq= 0.54 on 1 degrees of freedom, p= 0.46 
n= 6 
> 
> # find linear predictors 
> vals <- predict(telcosurvreg, newdata = telco, type = "lp") 
> 
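> # use the survreg.distributions object. See ?survreg.distributions
> # density() returns a matrix whose first column is the CDF F(z); because the
> # gaussian is symmetric, F((lp - t) / scale) equals the survival function S(t)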
> x_grid <- 1:400 
> sur_curves <- sapply(
+ vals, function(x) 
+  survreg.distributions[[telcosurvreg$dist]]$density(
+  (x - x_grid)/telcosurvreg$scale)[, 1]) 
> 
> # plot with base R 
> matplot(x_grid, sur_curves, type = "l", lty = 1) 

Here is the result:

[Figure: predicted survival curves over x_grid, one line per row of telco]
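Since the question asked for ggplot2, the same curves can probably be drawn that way too. The following is only a minimal sketch, not part of the original answer: it keeps just the three columns the model needs, assumes the gaussian model above so that S(t) = 1 - pnorm((t - lp) / scale), and the names fit, lp, curves and p are mine.

```r
library(survival)
library(ggplot2)

# Reduced copy of the question's sample: only the columns the model uses
telco <- read.csv(text = "Account_Length,Eve_Charge,Churn
65,19.42,TRUE
74,13.89,FALSE
168,8.92,FALSE
95,21.05,FALSE
62,26.11,FALSE
161,27.01,TRUE")

# Same model as in the answer, with Surv() built inside the formula
fit <- survreg(Surv(Account_Length, Churn) ~ Eve_Charge,
               dist = "gaussian", data = telco)

# Linear predictor (the fitted mean) for each customer
lp <- predict(fit, newdata = telco, type = "lp")
x_grid <- 1:400

# Long format: one row per (time, customer) pair, as ggplot2 expects
curves <- expand.grid(time = x_grid, customer = seq_along(lp))

# Assumption: gaussian errors, so survival is the upper tail of a normal
# distribution with mean lp and standard deviation fit$scale
curves$survival <- pnorm(curves$time, mean = lp[curves$customer],
                         sd = fit$scale, lower.tail = FALSE)
curves$customer <- factor(curves$customer)

p <- ggplot(curves, aes(x = time, y = survival, colour = customer)) +
  geom_line() +
  labs(x = "Account length", y = "Predicted survival probability")
print(p)
```

This relies on the gaussian choice of dist; for other distributions (for example a Weibull fit, which works on the log-time scale) the transformation from linear predictor to survival probability is different, so the pnorm line would need to change.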