
Understanding Logistic Regression in R

Jake Jing
5 min read · Aug 18, 2022


1. Background

To be honest, logistic regression (LR) can be quite confusing, because it introduces many new terms at once: odds, odds ratio, log odds ratio, log-odds/logit, and the sigmoid function. People with different backgrounds also explain LR in different ways. In statistics, LR is presented as a regression model: odds and odds ratios are introduced first, and the log-odds/logit transformation is explained afterwards. In machine learning, by contrast, LR is mostly used for classification tasks (e.g., spam vs. not spam), and tutorials tend to highlight only the sigmoid activation function and the binary cross-entropy loss; they rarely explain what the odds ratio or log odds ratio means at all.
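To see how these terms fit together, here is a minimal sketch in R (my own illustrative example, with made-up numbers): the sigmoid function is simply the inverse of the logit, so applying it to the log-odds recovers the original probability.

```r
# Probability -> odds -> log-odds (logit), and back via the sigmoid.
p <- 0.8                     # an illustrative probability of success
odds <- p / (1 - p)          # odds = 4, i.e. success is 4x as likely as failure
logit <- log(odds)           # log-odds (the scale on which LR is linear)

sigmoid <- function(z) 1 / (1 + exp(-z))
sigmoid(logit)               # inverts the logit and recovers p = 0.8
```

This round trip is why the two perspectives describe the same model: statisticians talk about the logit link, machine learning tutorials talk about its inverse, the sigmoid.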

Before we get started, let me summarize some key points about LR. If you already understand them, feel free to skip to the next section.

  • The odds are the ratio of the probability of success to the probability of failure, p/(1-p);
  • The coefficients of an LR model measure changes in the log odds, or logit, log(p/(1-p));
  • The intercept of an LR model measures the log odds of y being 1 when all the other predictors x_i are fixed at 0;
  • The slope of an LR model measures the difference in the log odds for…
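The points above can be illustrated with a quick `glm()` fit in R. This is my own example on the built-in `mtcars` data (predicting engine shape `vs` from `mpg`), not one from the article, but the interpretation of the output is exactly as described: the coefficients live on the log-odds scale, and exponentiating them yields odds and odds ratios.

```r
# Fit a logistic regression: glm() with family = binomial uses the logit link.
fit <- glm(vs ~ mpg, data = mtcars, family = binomial)

coef(fit)        # (Intercept): log odds of vs = 1 when mpg = 0
                 # mpg: change in log odds per 1-unit increase in mpg
exp(coef(fit))   # exponentiated slope = odds ratio per 1-unit increase in mpg

# Back on the probability scale: predicted P(vs = 1) for a car with mpg = 20
predict(fit, newdata = data.frame(mpg = 20), type = "response")
```

Note that `predict(..., type = "response")` applies the sigmoid for you; `type = "link"` (the default) would return the raw log-odds instead.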

