“All models are wrong, but some are useful.”

George E. P. Box

After reading this chapter you will be able to:

  • Understand the concept of a model.
  • Describe two ways in which regression coefficients are derived.
  • Estimate and visualize a regression model using R.
  • Interpret regression coefficients and statistics in the context of real-world problems.
  • Use a regression model to make predictions.

Why fit statistical (regression) models?

You have some data \(X_1,\ldots,X_p,Y\): the variables \(X_1,\ldots,X_p\) are called predictors, and \(Y\) is called a response. You’re interested in the relationship that governs them.

So you posit that \(Y|X_1,\ldots,X_p \sim P_\theta\), where \(\theta\) represents some unknown parameters. This is called a regression model for \(Y\) given \(X_1,\ldots,X_p\). The goal is to estimate these parameters. Why?

  • To assess model validity and predictor importance (inference)
  • To predict future \(Y\)’s from future \(X_1,\ldots,X_p\)’s (prediction)
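
As a concrete illustration (an assumed example, not a model the text has committed to), \(P_\theta\) could be a normal linear model:

\[
Y \mid X_1,\ldots,X_p \sim N\!\left(\beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p,\ \sigma^2\right),
\qquad \theta = (\beta_0,\beta_1,\ldots,\beta_p,\sigma^2).
\]

The R sketch below shows both goals under that assumption, with \(p = 2\). The data are simulated, and the names `dat`, `x1`, `x2`, `fit`, and `new_dat`, along with the "true" coefficient values, are made up for the example.

```r
# Simulated data: every name and number here is an illustrative assumption.
set.seed(1)
n  <- 100
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Estimate the parameters of a linear regression model for y given x1 and x2.
fit <- lm(y ~ x1 + x2, data = dat)

# Inference: coefficient estimates with standard errors, t values, and p-values.
coef_table <- summary(fit)$coefficients
print(coef_table)

# Prediction: estimated responses at new (future) predictor values.
new_dat <- data.frame(x1 = c(0.2, 0.8), x2 = c(0.5, 0.5))
pred <- predict(fit, newdata = new_dat)
print(pred)
```

The coefficient table serves the inference goal (which predictors appear to matter, and how uncertain the estimates are), while `predict()` serves the prediction goal by plugging new predictor values into the fitted model.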