## 6.3 Credit Cards

We have a dataset with information on ten thousand customers. The objective is to predict which customers will stop paying their credit card debt.

Source: James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer-Verlag, New York. www.StatLearning.com

Probability of Default: Data
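| Row | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| default | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
| balance | 729.53 | 817.18 | 1073.55 | 529.25 | 785.66 | 919.59 | 825.51 | 808.67 | 1161.06 | 0.00 | 0.00 | 1220.58 | 237.05 | 606.74 | 1112.97 | 286.23 | 0.00 | 527.54 | 485.94 | 1095.07 |

| Row | 137 | 174 | 202 | 207 | 210 | 242 | 244 | 264 | 342 | 346 | 350 | 358 | 407 | 440 | 441 | 488 | 541 | 546 | 577 | 582 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| default | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| balance | 1487.00 | 2205.80 | 1774.69 | 1889.60 | 1899.39 | 1572.86 | 1964.48 | 1530.35 | 1642.82 | 1991.65 | 1550.45 | 1328.89 | 1700.60 | 1118.70 | 1119.10 | 1981.45 | 1717.07 | 1465.21 | 1763.58 | 1770.97 |

| Row | 9981 | 9982 | 9983 | 9984 | 9985 | 9986 | 9987 | 9988 | 9989 | 9990 | 9991 | 9992 | 9993 | 9994 | 9995 | 9996 | 9997 | 9998 | 9999 | 10000 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| default | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
| balance | 770.02 | 739.42 | 623.53 | 506.63 | 875.24 | 842.95 | 401.33 | 1092.91 | 0.00 | 999.28 | 372.38 | 658.80 | 1111.65 | 938.84 | 172.41 | 711.56 | 757.96 | 845.41 | 1569.01 | 200.92 |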

Probability of Default: Data Summary
| No | Variable | Stats / Values | Freqs (% of Valid) | Missing |
|---|---|---|---|---|
| 1 | default [numeric] | Min: 0; Mean: 0; Max: 1 | 0: 9667 (96.7%); 1: 333 (3.3%) | 0 (0.0%) |
| 2 | balance [numeric] | Mean (sd): 835.4 (483.7); min < med < max: 0 < 823.6 < 2654.3; IQR (CV): 684.6 (0.6) | 9227 distinct values | 0 (0.0%) |

Using simple Linear Regression
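
A minimal sketch of why a straight-line fit is awkward for a yes/no outcome. Python is an assumption here, since the notes show no code. The sketch fits ordinary least squares by hand to only the 40 rows with a known outcome that are listed first in the data table above (rows 1 to 20 and the 20 defaulters), coding default as 0/1. Half of this subsample are defaulters, so the fitted numbers are not the full-data estimates; the point is only that a fitted line is not confined to the interval from 0 to 1.

```python
# Balances copied from the data table above (illustrative subsample only).
no_balances = [729.53, 817.18, 1073.55, 529.25, 785.66, 919.59, 825.51,
               808.67, 1161.06, 0.00, 0.00, 1220.58, 237.05, 606.74,
               1112.97, 286.23, 0.00, 527.54, 485.94, 1095.07]
yes_balances = [1487.00, 2205.80, 1774.69, 1889.60, 1899.39, 1572.86,
                1964.48, 1530.35, 1642.82, 1991.65, 1550.45, 1328.89,
                1700.60, 1118.70, 1119.10, 1981.45, 1717.07, 1465.21,
                1763.58, 1770.97]

x = no_balances + yes_balances
y = [0] * len(no_balances) + [1] * len(yes_balances)  # default coded 0/1

# Closed-form simple linear regression: slope = cov(x, y) / var(x).
n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n
slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
intercept = y_mean - slope * x_mean


def predict(balance):
    """Fitted value of the straight line at the given balance."""
    return intercept + slope * balance


# 0 and 2654.3 are the minimum and maximum balance in the data summary.
for balance in (0.0, 1000.0, 2654.3):
    print(f"balance={balance:8.1f}  fitted value={predict(balance):6.3f}")
```

On this subsample the fitted line is negative at a balance of 0 and exceeds 1 at the maximum observed balance of 2654.3, so neither value can be read as a probability.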

Using Logistic Regression

$\log\left[ \frac { \widehat{P}( \operatorname{default} = 1 ) }{ 1 - \widehat{P}( \operatorname{default} = 1 ) } \right] = -10.651 + 0.005(\operatorname{balance})$
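
The fitted log-odds can be turned into a predicted probability by inverting the logit. The sketch below is an illustration in Python (an assumption, since the notes show no code) and uses the estimates as printed in the coefficient table: intercept -10.651 and slope 0.005. The slope is rounded to three decimals there, and a small change in it moves the predictions a lot, so treat the outputs as rough. The function name `default_probability` is made up for this example.

```python
import math

INTERCEPT = -10.651  # estimate from the coefficient table
SLOPE = 0.005        # estimate for balance, rounded to three decimals


def default_probability(balance):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + b1 * balance)))."""
    log_odds = INTERCEPT + SLOPE * balance
    return 1.0 / (1.0 + math.exp(-log_odds))


for balance in (0, 1000, 2000):
    print(f"balance={balance:5d}  P(default)={default_probability(balance):.4f}")
```

With these rounded values the predicted probability is about 0.0035 at a balance of 1000 and about 0.34 at a balance of 2000. Unlike a straight line, the result always lies strictly between 0 and 1.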
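| Model information | Value |
|---|---|
| Observations | 10000 |
| Dependent variable | default |
| Type | Generalized linear model |
| Family | binomial |
| Link | logit |

| Fit statistic | Value |
|---|---|
| χ²(1) | 1324.2 |
| Pseudo-R² (Cragg-Uhler) | 0.49 |
| Pseudo-R² (McFadden) | 0.453 |
| AIC | 1600.45 |
| BIC | 1614.87 |

| | Est. | S.E. | z val. | p |
|---|---|---|---|---|
| (Intercept) | -10.651 | 0.361 | -29.492 | 0.000 |
| balance | 0.005 | 0.000 | 24.953 | 0.000 |

Standard errors: MLE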
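
The fit statistics reported above can be cross-checked from numbers already in these notes. The sketch below (Python, an assumption) uses the class counts from the data summary (333 defaults out of 10000) and the reported AIC. It assumes the standard definitions: for 0/1 data the residual deviance equals -2 times the log-likelihood, so it can be recovered as AIC minus 2k, with k = 2 estimated parameters. The null model is taken to be an intercept-only model that predicts the overall default rate for everyone.

```python
import math

n, n_default = 10000, 333  # counts from the data summary
aic, k = 1600.45, 2        # reported AIC; k = intercept + slope

# Intercept-only (null) model: every customer gets the overall default rate.
p_bar = n_default / n
ll_null = n_default * math.log(p_bar) + (n - n_default) * math.log(1 - p_bar)
null_deviance = -2 * ll_null

# Residual deviance of the fitted model, recovered from the AIC.
residual_deviance = aic - 2 * k

chi2 = null_deviance - residual_deviance          # likelihood-ratio statistic
mcfadden = 1 - residual_deviance / null_deviance  # McFadden pseudo-R²
cragg_uhler = ((1 - math.exp(-chi2 / n))
               / (1 - math.exp(-null_deviance / n)))  # Cragg-Uhler (Nagelkerke)
bic = residual_deviance + k * math.log(n)

print(f"chi2(1)     = {chi2:.1f}")
print(f"McFadden    = {mcfadden:.3f}")
print(f"Cragg-Uhler = {cragg_uhler:.2f}")
print(f"BIC         = {bic:.2f}")
```

The recomputed values agree with the reported χ²(1), both pseudo-R² figures, and the BIC to the printed precision, which suggests the summary uses these definitions.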

Exercise

• Interpret the estimated coefficients
• Is this a good model?