Syllabus, Course Overview, and Diagnostic Test

Published

January 21, 2026

The plan for today is to review the syllabus, read the overview of the course below, and work on the diagnostic test. Please give yourself about 30 minutes to complete it. Of course, if you take longer we will not know, but you should think about why it is taking you longer, and review those topics that you think you might have forgotten.

Once you are done with the test, please upload it to Gradescope. Because it does not count towards your course grade, it is due on Thursday night and no extensions are possible. If you turn it in, you get \(0.25\%\) extra credit towards your final course average. If you don’t turn it in, that’s okay; these things happen. Don’t stress.

Course Overview

Probability vs Statistics

A schematic drawing showing the directions of probability and statistics.

When we study probability, we have some data-generating process (perhaps we are sampling from a known distribution, or sampling at random from a box with \(N\) items, etc.), and we want to know something about the random sample that we generate and its properties.

Now, when we study inference, we are given the sample, and we want to know more about the process that generated the data. That is, we want to infer something about the population, given the data that we observe.
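To make the two directions concrete, here is a minimal sketch using a box model. The numbers (a box of 100 items, 30 of them marked, a sample of 10) and the function name `prob_marked` are made up for illustration. In the probability direction the box is known and we compute the chance of each possible sample outcome; in the statistics direction we see one outcome and estimate the unknown contents of the box.

```python
from math import comb

def prob_marked(N, K, n, k):
    """P(exactly k marked items) in a simple random sample of size n,
    drawn without replacement from a box of N items, K of them marked."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Probability direction: the box is known (hypothetical values).
N, K, n = 100, 30, 10
pmf = [prob_marked(N, K, n, k) for k in range(n + 1)]
print("P(3 marked in sample) =", round(pmf[3], 4))

# Statistics direction: K is unknown, we observe k marked items in the sample
# and scale the sample proportion up to the whole box.
k_observed = 4
K_estimate = N * k_observed / n
print("Estimated number of marked items in the box:", K_estimate)
```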

Estimation

Simple Random Sampling

We might make no distributional assumptions about the population, and then try to estimate the population mean, variance, and other parameters. This is the material we will study in Chapter 7, estimates from survey sampling. In this chapter we treat the estimate as a random variable that has a sampling distribution. This just means that each time we take a sample (of some fixed size) we typically get a different observed value of our random variable. All such possible values, together with their probabilities, give us the distribution of the random variable (for example, the sample mean), which is called its sampling distribution.
While investigating these estimates, we will talk about the concepts of bias, standard error, mean squared error, and confidence intervals, and how we apply the central limit theorem (CLT). Here are two visualizations of the CLT, with the top row showing the sampling distribution of the sample mean of random samples of \(\mathrm{Unif}(0,1)\) random variables for various \(n\). The bottom row does the same for a \(\mathrm{Gamma}(2,1)\) distribution. You can see that even though we begin with distributions that are very far from a normal distribution, for larger \(n\) the distribution of the sample mean is approximately normal!
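The sampling distribution described above can be approximated by simulation. The following is a rough sketch using only the Python standard library; the sample sizes, the number of repetitions, and the helper names are choices made here for illustration, not part of the course materials. For each \(n\) it compares the simulated standard deviation of the sample mean with the theoretical standard error \(\sigma/\sqrt{n}\).

```python
import random
import statistics

def sample_means(draw, n, reps=2000, seed=0):
    """Simulate `reps` sample means, each from a sample of size n."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

def draw_unif(rng):
    return rng.random()                # Unif(0,1): mean 1/2, variance 1/12

def draw_gamma(rng):
    return rng.gammavariate(2.0, 1.0)  # Gamma(shape 2, scale 1): mean 2, variance 2

populations = [("Unif(0,1)", draw_unif, 1 / 12), ("Gamma(2,1)", draw_gamma, 2.0)]
for name, draw, var in populations:
    for n in (1, 5, 30):
        means = sample_means(draw, n)
        simulated_se = statistics.stdev(means)
        theoretical_se = (var / n) ** 0.5
        print(f"{name:10s} n={n:2d}  mean of means={statistics.fmean(means):.3f}  "
              f"simulated SE={simulated_se:.3f}  sigma/sqrt(n)={theoretical_se:.3f}")
```

A histogram of `means` for each \(n\) would show the shape becoming closer to a normal curve as \(n\) grows.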

Resampling methods: We will use the bootstrap to estimate confidence bounds for our parameters. When possible, we will compare these confidence intervals to the ones obtained by classical methods.
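As a rough illustration of the idea, here is a sketch of one common variant, the percentile bootstrap, for the population mean, next to the classical normal-approximation interval. The data are simulated here purely for illustration, and the function name `bootstrap_ci` is hypothetical. Other bootstrap intervals exist, and the course may use a different one.

```python
import random
import statistics

def bootstrap_ci(data, stat=statistics.fmean, level=0.95, reps=2000, seed=0):
    """Percentile bootstrap interval: resample the data with replacement,
    recompute the statistic, and take empirical quantiles of the results."""
    rng = random.Random(seed)
    n = len(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(reps))
    alpha = 1 - level
    lo = boot[int((alpha / 2) * reps)]
    hi = boot[int((1 - alpha / 2) * reps) - 1]
    return lo, hi

# Made-up sample of size 50 (simulated, not real data).
data_rng = random.Random(1)
data = [data_rng.gammavariate(2.0, 1.0) for _ in range(50)]

boot_lo, boot_hi = bootstrap_ci(data)

# Classical interval based on the CLT: mean +/- 1.96 * s / sqrt(n).
x_bar = statistics.fmean(data)
se = statistics.stdev(data) / len(data) ** 0.5
print("bootstrap CI:", (round(boot_lo, 3), round(boot_hi, 3)))
print("classical CI:", (round(x_bar - 1.96 * se, 3), round(x_bar + 1.96 * se, 3)))
```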

Estimating distributional parameters

On the other hand, depending on what we know about the population, we might build models with distributional assumptions: for example, the number of goals in the soccer World Cup could be modeled using the Poisson distribution.
Or perhaps we might consider classification problems, maybe whether an item is defective or not. In this case we might use the Bernoulli distribution.
If we assume some distribution, then we will want to estimate the parameters of that distribution. Chapter 8 is about parameter estimation, in which we have observed data and try to fit probability laws to this data. We will learn about maximum likelihood estimation and the method of moments, and see which method is preferred and why. We will develop the ideas of Bayesian inference, define what it means for an estimator to be efficient, and prove the famous Cramér-Rao inequality.
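As a small illustration of fitting a probability law to data, the sketch below finds the maximum likelihood estimate of a Poisson rate by a crude grid search and compares it with the sample mean. The goal counts are made up for illustration, and the grid search is only a teaching device. For the Poisson model the method of moments estimate is also the sample mean, so the two methods agree here.

```python
import math

# Hypothetical goals scored in 10 matches (made-up data).
goals = [2, 0, 3, 1, 1, 4, 2, 0, 1, 3]

def poisson_loglik(lam, ys):
    """Log-likelihood of i.i.d. Poisson(lam) observations ys."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in ys)

# Crude grid search over lambda in (0, 10] with step 0.001.
grid = [k / 1000 for k in range(1, 10001)]
lam_hat = max(grid, key=lambda lam: poisson_loglik(lam, goals))

sample_mean = sum(goals) / len(goals)
print("grid-search MLE:", lam_hat)
print("sample mean    :", sample_mean)
```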
We will also derive properties of the distribution of our maximum likelihood estimator \(\hat{\theta}_{\mathrm{mle}}\), \[ \hat{\theta}_{\mathrm{mle}} = \operatorname*{arg\,max}_{\theta \in \Theta} \hat{L}_n(\theta;\mathbf{y}) \]
where, under regularity conditions, \[ \sqrt{n}(\hat{\theta}_{\mathrm{mle}}-\theta_0) \xrightarrow{d} \mathcal{N}(0, \mathcal{I}^{-1}), \]

\(\theta_0\) is the true value of the parameter, and \(\mathcal{I}\) is the Fisher information of a single observation, evaluated at \(\theta_0\). All this is in Chapter 8 of our text, which is perhaps the most mathematical of the material we will cover.
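As a worked example of these ideas, consider the Poisson model, assuming \(Y_1, \dots, Y_n\) are i.i.d. \(\mathrm{Poisson}(\lambda_0)\). The symbol \(\ell_n\) for the log-likelihood is notation introduced here for convenience. The log-likelihood and its derivative are
\[ \ell_n(\lambda) = \sum_{i=1}^{n} \left( y_i \log \lambda - \lambda - \log y_i! \right), \qquad \ell_n'(\lambda) = \frac{1}{\lambda}\sum_{i=1}^{n} y_i - n . \]
Setting \(\ell_n'(\lambda) = 0\) gives \(\hat{\lambda}_{\mathrm{mle}} = \bar{y}\), and since \(\ell_n''(\lambda) = -\sum_i y_i / \lambda^2 < 0\) whenever some \(y_i > 0\), this is a maximum. The Fisher information of a single observation is
\[ \mathcal{I}(\lambda) = E\left[ -\frac{\partial^2}{\partial \lambda^2} \log f(Y;\lambda) \right] = E\left[ \frac{Y}{\lambda^2} \right] = \frac{1}{\lambda}, \]
so the limiting distribution is
\[ \sqrt{n}(\hat{\lambda}_{\mathrm{mle}} - \lambda_0) \xrightarrow{d} \mathcal{N}(0, \lambda_0). \]
This agrees with what the central limit theorem says about \(\bar{Y}\) directly, since \(\operatorname{Var}(Y_i) = \lambda_0\).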

Hypothesis Testing

This is the topic of Chapter 9. Hypothesis testing is a method of statistical inference in which we decide whether the observed data support a particular hypothesis. We will learn about the Neyman-Pearson paradigm. We will look at various tests, including goodness-of-fit tests, which we will revisit in Chapter 11. We will define the power of a test, and also look at the duality between hypothesis tests and confidence intervals.
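To give a feel for what the power of a test is, here is a minimal sketch for one simple case: a one-sided \(z\)-test of \(H_0: \mu = \mu_0\) against \(H_1: \mu > \mu_0\) with known \(\sigma\). The function name `z_test_power` and the numbers are chosen here for illustration. The power is the probability of rejecting \(H_0\) when the true mean is \(\mu_1\).

```python
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z-test that rejects H0: mu = mu0 when
    (x_bar - mu0) / (sigma / sqrt(n)) exceeds the upper-alpha normal quantile."""
    z = NormalDist()                      # standard normal
    critical = z.inv_cdf(1 - alpha)       # rejection threshold on the z scale
    shift = (mu1 - mu0) * n ** 0.5 / sigma
    return 1 - z.cdf(critical - shift)

# When the true mean equals mu0, the rejection probability is the level alpha.
print(z_test_power(mu0=0, mu1=0, sigma=1, n=25))

# Power grows with the sample size for a fixed alternative.
for n in (10, 25, 50, 100):
    print(n, round(z_test_power(mu0=0, mu1=0.5, sigma=1, n=n), 3))
```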
We will further develop the ideas of hypothesis testing in Chapter 11, when we look at classical and nonparametric methods for two-sample tests, and in Chapters 12 and 13 we will see how ANOVA is used to compare multiple groups. Chapters 9 and 13 deal with categorical data, so we will look at these together.
We will also look at more computationally intensive resampling methods such as permutation tests.
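A permutation test for a difference in two group means can be sketched as follows. This is a simplified Monte Carlo version with made-up data and a hypothetical function name. Under the null hypothesis that group labels are exchangeable, we repeatedly shuffle the pooled observations and count how often the shuffled difference is at least as extreme as the observed one.

```python
import random
import statistics

def permutation_test(x, y, reps=5000, seed=0):
    """Two-sided Monte Carlo permutation test for a difference in means.
    Returns the observed difference and an approximate p-value."""
    rng = random.Random(seed)
    observed = statistics.fmean(x) - statistics.fmean(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)               # random relabeling of the groups
        diff = statistics.fmean(pooled[:n_x]) - statistics.fmean(pooled[n_x:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add 1 to numerator and denominator so the observed labeling is counted.
    return observed, (extreme + 1) / (reps + 1)

# Made-up measurements for two groups.
group_a = [1, 2, 3, 4, 5]
group_b = [11, 12, 13, 14, 15]
obs_diff, p_value = permutation_test(group_a, group_b)
print("observed difference:", obs_diff)
print("approximate p-value:", round(p_value, 4))
```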

Linear Models

Chapter 14 is about linear least squares. We will go over the simple regression model in some detail, but only briefly cover the more general treatment.
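For the simple regression model \(y_i = \beta_0 + \beta_1 x_i + e_i\), the least squares estimates have a closed form: \(\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2\) and \(\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}\). A minimal sketch, with a hypothetical function name and made-up data:

```python
def fit_simple_regression(x, y):
    """Least squares intercept and slope for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Made-up data lying exactly on the line y = 1 + 2x.
xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]
b0, b1 = fit_simple_regression(xs, ys)
print("intercept:", b0, "slope:", b1)
```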

The Bayesian Paradigm

Most of the time, we will use a classical, frequentist approach to our analysis, in which parameters are fixed quantities, and probability statements are made only about sampled data and estimates.
In Bayesian statistics, the parameters are random, and we assume a distribution that reflects our beliefs about them. Then we update our beliefs given the observed data.
We compute posterior distributions (post-data) using Bayes’ Rule.

Image from Nieves “An Actual Introduction to Bayesian Statistics (2021)”, cantorsparadise.com
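The Bayesian update described above can be sketched with the simplest conjugate example, chosen here for illustration: a \(\mathrm{Beta}(a, b)\) prior on a success probability \(p\) combined with Bernoulli data. By Bayes' Rule the posterior is \(\mathrm{Beta}(a + \text{successes},\, b + \text{failures})\). The prior and the data below are made up.

```python
def beta_binomial_update(a, b, successes, failures):
    """Posterior Beta parameters after observing Bernoulli data
    under a Beta(a, b) prior on the success probability."""
    return a + successes, b + failures

# Uniform prior Beta(1, 1), then observe 7 successes and 3 failures (made up).
a_post, b_post = beta_binomial_update(1, 1, successes=7, failures=3)
posterior_mean = a_post / (a_post + b_post)

print("posterior: Beta(%d, %d)" % (a_post, b_post))
print("posterior mean of p:", round(posterior_mean, 4))
```

The posterior mean sits between the prior mean \(1/2\) and the sample proportion \(7/10\), which is the sense in which the data update the prior belief.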

References

Blitzstein, Joseph K., and Jessica Hwang. 2019. Introduction to Probability. CRC Press.

Pimentel, Sam. 2024. “STAT 135 Lecture Slides.” Lecture slides (shared privately).

Rice, John A. 2006. Mathematical Statistics and Data Analysis. 3rd ed. Duxbury Press.

Wasserman, Larry. 2004. All of Statistics: A Concise Course in Statistical Inference. New York: Springer.