Statistical Inference

This course teaches how to estimate and draw conclusions about the characteristics of one or more populations from samples. It introduces students to the design, implementation (in Python), and interpretation of statistical tests, providing a foundation for selecting data analysis models.

Prerequisites

Prior knowledge of probability is required. It is therefore advisable to have a solid understanding of the following:

Educational Goals

This course is designed as a quick introduction to statistical inference for data engineers. As such, it pursues several educational objectives:

Course Program

The course is divided into several parts of varying length (each symbol corresponds to approximately 1.5 hours):

- Tutorial class on the concepts of statistical inference (Slides)
- Tutorial class on tests of conformity (Slides)
  • Principles of Mean, Variance, and Proportion Conformity Tests
  • Distributions: Student’s t-distribution, Chi-squared, Bernoulli
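As an illustration of a proportion conformity test, the sketch below checks whether an observed proportion is compatible with a reference value. It is not taken from the course slides: the function name and the data are hypothetical, and it assumes the usual normal approximation to a sum of Bernoulli trials, which is only reasonable for large enough samples.

```python
from math import sqrt
from statistics import NormalDist


def proportion_conformity_test(successes, n, p0):
    """Two-sided z-test of H0: the population proportion equals p0.

    Assumption: normal approximation to the binomial (a sum of Bernoulli
    trials), reasonable when n * p0 and n * (1 - p0) are both large enough.
    """
    p_hat = successes / n
    std_error = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    z = (p_hat - p0) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 58 successes out of 100 trials, reference proportion 0.5.
z, p_value = proportion_conformity_test(58, 100, 0.5)
print(f"z = {z:.3f}, p-value = {p_value:.3f}")
```

With these made-up numbers the p-value is above 0.05, so the observed proportion would not be judged incompatible with 0.5 at the 5% level.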

- Tutorial class on tests of homogeneity (Slides)
  • Principles of Mean, Variance, and Proportion Homogeneity Tests
  • Distributions: Student’s t-distribution, Fisher
  • Tests: Fisher, Levene, Student, Welch
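To make the mean homogeneity test concrete, here is a minimal sketch of Welch's statistic, which compares two independent samples without assuming equal variances. The helper name and the measurements are hypothetical and not part of the course material. The p-value would then be read from Student's t-distribution with the computed degrees of freedom; in practice a library call such as `scipy.stats.ttest_ind(a, b, equal_var=False)` does all of this in one step.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # variance() is the unbiased sample variance (it divides by n - 1).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, dof


# Hypothetical measurements from two independent groups.
group_a = [12.1, 11.8, 12.6, 12.3, 11.9, 12.4]
group_b = [11.2, 11.9, 10.8, 11.5, 11.1]
t, dof = welch_t(group_a, group_b)
print(f"t = {t:.3f}, degrees of freedom = {dof:.1f}")
```

A common workflow, assumed here rather than quoted from the slides, is to run a variance test (Fisher or Levene) first and fall back to Welch's test when the variances differ.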

- Tutorial exercises
- Lab exercise: The Italian Grand Prix (Jupyter notebook)
  • Descriptive statistics
  • Conformity tests
  • Data visualization

- Tutorial class on paired-samples tests (Slides)
  • Contingency tables, McNemar's test statistic
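As a rough illustration of McNemar's test statistic, the sketch below uses only the two discordant cells of a 2x2 paired contingency table. The function name and the counts are hypothetical; the version shown has no continuity correction and relies on the asymptotic chi-squared distribution with one degree of freedom, which is an assumption that breaks down for small counts.

```python
from math import erfc, sqrt


def mcnemar_statistic(b, c):
    """McNemar's statistic and asymptotic p-value from the discordant counts.

    b: pairs that changed from "yes" to "no"; c: pairs that changed from
    "no" to "yes". Concordant pairs carry no information for this test.
    """
    if b + c == 0:
        raise ValueError("at least one discordant pair is required")
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical paired data: 3 pairs switched one way, 12 the other way.
statistic, p_value = mcnemar_statistic(3, 12)
print(f"statistic = {statistic:.2f}, p-value = {p_value:.4f}")
```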

- Tutorial class on independence tests (Slides)
  • Chi-squared test of independence
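The chi-squared test of independence compares the observed counts of a contingency table with the counts expected if the two variables were independent. The sketch below computes the statistic and its degrees of freedom for a made-up table; the function name and the data are hypothetical, and no continuity correction is applied. The p-value would then come from the chi-squared distribution with those degrees of freedom, for example through `scipy.stats.chi2_contingency`.

```python
def chi2_independence(table):
    """Return the chi-squared statistic and degrees of freedom of a table.

    table: list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x3 contingency table of observed counts.
observed = [[10, 20, 30],
            [20, 20, 20]]
statistic, dof = chi2_independence(observed)
print(f"chi2 = {statistic:.3f}, degrees of freedom = {dof}")
```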

- Lab exercise: UFO Sightings (Jupyter notebook)
  • Extracting statistical traits
  • Independence tests
  • Data visualization

Course Materials

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.