(2024/2025) Applied Statistics

The notes are taken from the books required for the course:

You can view/download the PDF here. In the notes folder, you can also see the source code.

To report an issue, use the appropriate section.

Course Syllabus

According to the official course syllabus:

  1. Introduction to statistical learning. The bias-variance tradeoff. Maximum Likelihood Estimation.
  2. Curse of dimensionality and dimension reduction. Principal Component Analysis and its probabilistic counterpart. PCA by singular value decomposition.
  3. Unsupervised classification. Hierarchical clustering, K-means clustering, Gaussian mixture models and the EM algorithm.
  4. Supervised classification. Linear and quadratic discriminant analysis.
  5. Linear Models. Simple and multiple linear regression. Fitting the model via ordinary least squares, assessing the accuracy of the coefficient estimates, assessing the accuracy of the model, prediction intervals. Qualitative predictors and interactions.
  6. Logistic regression.
  7. Model selection and regularization: subset selection, shrinkage methods (ridge regression and lasso), and dimension reduction methods.
  8. Resampling methods. Cross-validation. The bootstrap.
  9. Tree-based methods. Classification and regression trees. Bagging, random forests.