7.3 Parameter Estimation

If \(Y\) is a binary variable, a logistic model is:

\[ {P}(Y=1 \mid \mathbf{x})=\dfrac{\exp({\beta_{0}+\sum_{k=1}^{K} \beta_{k} x_{k}})}{1+\exp({\beta_{0}+\sum_{k=1}^{K} \beta_{k} x_{k}})} \]

  • \(\beta_k\) is the partial regression coefficient of the predictor \(x_k\).

  • It indicates the mean change in the logarithm of the odds when \(x_k\) increases by one unit, holding all other variables constant.

  • For each unit increase in \(x_k\), the odds are multiplied by \(\exp(\beta_k)\).
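The multiplicative effect on the odds can be checked numerically. The sketch below (with illustrative coefficient values, not from any fitted model) computes \(P(Y=1\mid\mathbf{x})\) from the formula above and verifies that increasing \(x_1\) by one unit multiplies the odds by \(\exp(\beta_1)\):

```python
import math

def logistic_prob(beta0, betas, x):
    """P(Y=1 | x) for a logistic model with intercept beta0 and coefficients betas."""
    eta = beta0 + sum(b * xi for b, xi in zip(betas, x))
    return math.exp(eta) / (1.0 + math.exp(eta))

def odds(p):
    """Odds p / (1 - p) associated with a probability p."""
    return p / (1.0 - p)

# Illustrative coefficients (hypothetical, for demonstration only)
beta0, betas = -1.0, [0.5, 0.3]
x      = [2.0, 1.0]
x_plus = [3.0, 1.0]  # same point with x_1 increased by one unit

p1 = logistic_prob(beta0, betas, x)
p2 = logistic_prob(beta0, betas, x_plus)

# The odds ratio equals exp(beta_1), regardless of the starting point
print(odds(p2) / odds(p1))  # equals exp(0.5)
```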

The coefficients \(\boldsymbol{\beta}=\{\beta_{0}, \beta_{1},\ldots, \beta_{K}\}\) are estimated using the Maximum Likelihood method.

That is, to estimate the coefficients of a logistic regression, numerical algorithms are used to maximize the function:

\[ \begin{aligned} L(y ;(\mathbf{x}, \boldsymbol{\beta})) &=\prod_{i=1}^{n}\left({P}\left(Y_{i}=1 \mid \mathbf{x}_{i}, \boldsymbol{\beta}\right)\right)^{y_{i}}\left(1-{P}\left(Y_{i}=1 \mid \mathbf{x}_{i}, \boldsymbol{\beta}\right)\right)^{1-y_{i}} \\ &=\prod_{i=1}^{n}\left(\frac{e^{\beta_{0}+\sum_{k=1}^{K} \beta_{k} x_{i, k}}}{1+e^{\beta_{0}+\sum_{k=1}^{K} \beta_{k} x_{i, k}}}\right)^{y_{i}}\left(\frac{1}{1+e^{\beta_{0}+\sum_{k=1}^{K} \beta_{k} x_{i, k}}}\right)^{1-y_{i}} \end{aligned} \]
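In practice one maximizes the logarithm of this likelihood. As a minimal sketch of the numerical maximization, the code below fits a one-predictor model by plain gradient ascent on the log-likelihood (the data and the learning rate are hypothetical; real software uses faster algorithms such as Newton–Raphson):

```python
import math

def sigmoid(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

# Tiny synthetic data set (hypothetical, overlapping classes so the MLE is finite)
xs = [0.5, 1.5, 2.0, 3.0, 3.5, 4.0]
ys = [0,   0,   1,   0,   1,   1]

def log_likelihood(b0, b1):
    """Log of the likelihood L(y; (x, beta)) for this data set."""
    return sum(y * math.log(sigmoid(b0 + b1 * x))
               + (1 - y) * math.log(1.0 - sigmoid(b0 + b1 * x))
               for x, y in zip(xs, ys))

# Gradient ascent; the gradient of the log-likelihood is
#   d logL / d beta_0 = sum_i (y_i - p_i)
#   d logL / d beta_1 = sum_i (y_i - p_i) x_i
b0, b1, lr = 0.0, 0.0, 0.05
for _ in range(10000):
    ps = [sigmoid(b0 + b1 * x) for x in xs]
    b0 += lr * sum(y - p for y, p in zip(ys, ps))
    b1 += lr * sum((y - p) * x for x, y, p in zip(xs, ys, ps))

print(b0, b1)  # estimated coefficients
```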

  • \(\beta_0\) is the expected value of the logarithm of the odds when all predictors are zero. It can be transformed to a probability with \(\exp(\beta_0)/(1+\exp(\beta_0))\). The result corresponds to the expected probability of belonging to class 1 when all predictors are zero.

  • The coefficients \(\beta_{k}\) indicate the change in the log odds, \(\log \left( \dfrac{\pi}{1-\pi} \right)\), caused by a one-unit change in the value of \(x_{k}\), while \(\exp(\beta_{k})\) is the factor by which the odds, \(\dfrac{\pi}{1-\pi}\), are multiplied when \(x_{k}\) increases by one unit.

  • If \(\beta_{k}\) is positive, \(\exp(\beta_{k})\) will be greater than 1, that is, \(\dfrac{\pi}{1-\pi}\) will increase.

  • If \(\beta_{k}\) is negative, \(\exp(\beta_{k})\) will be smaller than 1, and \(\dfrac{\pi}{1-\pi}\) will decrease.

  • The instantaneous change in the probability \(\pi\) per unit change in the value of \(x_{k}\) is \(\beta_{k}\,\pi(1-\pi)\), i.e., it depends not only on the coefficient, but also on the probability level from which the change is measured.
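The dependence of the marginal effect on the probability level can be illustrated with a short numerical check (coefficient values are illustrative): the analytic slope \(\beta_{k}\,\pi(1-\pi)\) is compared against a finite-difference approximation at several points on the logistic curve.

```python
import math

def sigmoid(t):
    """Logistic function exp(t) / (1 + exp(t))."""
    return 1.0 / (1.0 + math.exp(-t))

beta0, beta1 = 0.0, 1.0  # illustrative values

for x in (-4.0, 0.0, 4.0):
    p = sigmoid(beta0 + beta1 * x)
    # Analytic marginal effect: d pi / d x_k = beta_k * pi * (1 - pi)
    slope = beta1 * p * (1.0 - p)
    # Finite-difference approximation of the same derivative
    h = 1e-6
    fd = (sigmoid(beta0 + beta1 * (x + h)) - p) / h
    print(x, round(slope, 4), round(fd, 4))
```

The slope is largest near \(\pi = 0.5\) (where it equals \(\beta_1/4\)) and shrinks toward the flat tails of the curve, which is exactly the point made in the bullet above.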

Since the relationship between \({P}(Y=1)\) and \(\mathbf{x}\) is not linear, the regression coefficients \(\beta_k\) do not represent the change in the probability of \(Y\) associated with a one-unit increase in \(x_k\).

How much the probability of \(Y\) increases per unit of \(x_k\) depends on the value of \(x_k\), i.e., on the position on the logistic curve at which it lies.

This is a very important difference with respect to the interpretation of the coefficients of a linear regression model.