

Sampling Distributions II
Megan Ayers
Math 141 | Spring 2026
Monday, Week 6
Midterm study guide posted on course website
We’ll work on the study guide in groups during lab this week. It would be helpful to look it over before lab, but you don’t need to prepare any work.
Discuss the framework for random sampling
Investigate properties of the “sampling distribution”
Week 4 feedback (if time)
As with regression, we need to distinguish between the population and the sample
We are interested in the value of a parameter in a population, and use a statistic from a sample to estimate the parameter.
Key question: How accurate is \(\widehat{p}\) as an estimate of \(p\)?
Sub-question: If we take many samples, how much will \(\widehat{p}\) vary?
The distribution of all possible values of \(\widehat{p}\) (for a fixed sample size) is called the Sampling Distribution
Steps to Construct a Sampling Distribution:
Decide on a sample size, \(n\).
Determine all\(^*\) possible samples of size \(n\) from the population.
Compute the sample statistic in each sample.
Steps to Construct an Approximate Sampling Distribution:
Decide on a sample size, \(n\).
Randomly select a sample of size \(n\) from the population.
Compute the sample statistic in that sample.
Put the sample back in.
Repeat Steps 2 - 4 many (1000+) times.
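The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the deck coding (ace = 1 through king = 13), the function name `approximate_sampling_distribution`, and the seed are hypothetical choices, not part of the course materials.

```python
import random
import statistics

# Assumed population: a 52-card deck with values 1 (ace) through 13 (king),
# four cards of each value. The value coding is an illustrative assumption.
population = [value for value in range(1, 14) for _ in range(4)]


def approximate_sampling_distribution(population, n, reps=1000, seed=141):
    """Return `reps` sample means, each from a random sample of size n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample_means = []
    for _ in range(reps):                              # Step 5: repeat many times
        sample = rng.sample(population, n)             # Step 2: draw n cards without replacement
        sample_means.append(statistics.mean(sample))   # Step 3: compute the statistic
        # Step 4: rng.sample leaves `population` unchanged, so the cards are
        # effectively "put back" before the next draw.
    return sample_means


sample_means = approximate_sampling_distribution(population, n=10)  # Step 1: n = 10
print("number of simulated samples:", len(sample_means))
print("mean of the sample means:", round(statistics.mean(sample_means), 2))
print("population mean:", statistics.mean(population))
```

A histogram of `sample_means` would be the approximate sampling distribution.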
Last Class: got a “snapshot” of the sampling distribution for the mean card value from a sample of 10 cards.

“Snapshot” of sampling distribution for mean card value from a sample of 10 cards:

Why just a “snapshot”?
Q: How many elements are there in the sampling distribution? \[{52 \choose 10} = 15,820,024,220 \ \text{Samples}\]
We only took \(119\) samples (combining both sections)…
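As a quick check of the count above, the number of distinct 10-card samples can be computed directly. The variable names are illustrative.

```python
import math

n_possible = math.comb(52, 10)  # number of distinct 10-card samples from 52 cards
n_taken = 119                   # samples collected in class (both sections)

print(f"{n_possible:,}")        # 15,820,024,220
print(f"fraction of possible samples observed: {n_taken / n_possible:.2e}")
```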
Consider a population of 4 students, where the (true) population proportion of March birthdays is \(p = \frac{1}{4} = 0.25\)
| name | bday_month |
|---|---|
| Joel | march |
| Maria | january |
| Arthur | april |
| Klaus | september |
All possible samples of size \(n = 2\), with the sample proportion \(\widehat{p}\) of March birthdays in each:
| sample | phat |
|---|---|
| Joel + Maria | 0.5 |
| Joel + Arthur | 0.5 |
| Joel + Klaus | 0.5 |
| Maria + Arthur | 0.0 |
| Maria + Klaus | 0.0 |
| Arthur + Klaus | 0.0 |
All possible samples of size \(n = 3\), with the sample proportion \(\widehat{p}\) of March birthdays in each:
| sample | phat |
|---|---|
| Joel + Maria + Arthur | 1/3 |
| Joel + Maria + Klaus | 1/3 |
| Joel + Arthur + Klaus | 1/3 |
| Maria + Arthur + Klaus | 0 |
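Because this population is tiny, the exact sampling distribution can be enumerated. The sketch below rebuilds the sample tables for \(n = 2\) and \(n = 3\); the function name `exact_sampling_distribution` is a hypothetical helper, not something from the course.

```python
from fractions import Fraction
from itertools import combinations

# The four-student population from the table above.
population = {
    "Joel": "march",
    "Maria": "january",
    "Arthur": "april",
    "Klaus": "september",
}


def exact_sampling_distribution(population, n):
    """List every possible sample of size n with its proportion of March birthdays."""
    rows = []
    for sample in combinations(population, n):
        march_count = sum(population[name] == "march" for name in sample)
        rows.append((" + ".join(sample), Fraction(march_count, n)))
    return rows


for n in (2, 3):
    rows = exact_sampling_distribution(population, n)
    print(f"n = {n}: {len(rows)} possible samples")
    for sample, phat in rows:
        print(f"  {sample}: phat = {phat}")
    # Average phat over all possible samples of this size.
    mean_phat = sum(phat for _, phat in rows) / len(rows)
    print(f"  mean of phat over all samples = {mean_phat}")
```

For both sample sizes the average of \(\widehat{p}\) over all possible samples works out to \(1/4\), the population proportion \(p\).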
For most sample statistics and sufficiently large sample sizes (\(n \geq 30\) is a rule of thumb), the sampling distribution will be approximately bell-shaped, even if the population is not
Both distributions will have the same center.
But, the sampling distribution will have lower variability than the population distribution.
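A small simulation can make the comparison concrete. Everything here is an assumption for illustration: a right-skewed population simulated from an exponential distribution, a sample size of 30, and the gap between mean and median as a crude asymmetry measure.

```python
import random
import statistics

rng = random.Random(141)  # fixed seed so the sketch is reproducible

# Assumed population: 10,000 right-skewed values (not bell-shaped).
population = [rng.expovariate(1.0) for _ in range(10_000)]

# Approximate sampling distribution of the sample mean for n = 30.
n, reps = 30, 2000
sample_means = [statistics.mean(rng.sample(population, n)) for _ in range(reps)]


def summarize(values):
    """Center, spread, and a crude asymmetry measure (mean minus median)."""
    mean = statistics.mean(values)
    return {
        "center": mean,
        "spread": statistics.stdev(values),
        "mean_minus_median": mean - statistics.median(values),
    }


pop_summary = summarize(population)
means_summary = summarize(sample_means)
print("population:           ", {k: round(v, 3) for k, v in pop_summary.items()})
print("sampling distribution:", {k: round(v, 3) for k, v in means_summary.items()})
```

The two centers should nearly agree, while the spread and the mean-median gap should be much smaller for the sample means.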


What we want to know:

What we have:

What sampling distributions tell us about what we have:

The standard error (\(se\)) of a sample statistic is the standard deviation of its sampling distribution: it measures how much the statistic varies from sample to sample.
For \(\approx\) bell-shaped sampling distributions, about 95% of sample statistics fall within two standard errors of the center of the distribution, which is the population parameter (e.g. the population mean \(\mu\)).
Very useful implication!


Even though we just have one sample, if we know \(se\), we know how far away from the parameter our statistic is likely to fall!
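The "two standard errors" claim can be checked by simulation. This sketch assumes a very large population with \(p = 0.25\) (so each draw is treated as independent) and \(n = 100\); both values, and the helper name `sample_proportion`, are illustrative.

```python
import random
import statistics

rng = random.Random(141)  # fixed seed so the sketch is reproducible
p, n, reps = 0.25, 100, 5000  # assumed parameter, sample size, and number of samples


def sample_proportion(rng, p, n):
    """Proportion of 'successes' in one simulated sample of size n."""
    return sum(rng.random() < p for _ in range(n)) / n


phats = [sample_proportion(rng, p, n) for _ in range(reps)]

# Standard error: the standard deviation of the simulated sample proportions.
se = statistics.stdev(phats)

# Share of samples whose phat lands within two standard errors of p.
within_two_se = sum(abs(phat - p) <= 2 * se for phat in phats) / reps

print("simulated se:", round(se, 4))
print("share of samples within 2 se of p:", round(within_two_se, 3))
```

The printed share should come out close to 0.95.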
Variability of the sampling distribution generally decreases as sample size increases
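To see this effect numerically, one can simulate the standard error at a few sample sizes. The sketch assumes a very large population with \(p = 0.25\); the sample sizes and the helper name `simulated_se` are illustrative.

```python
import random
import statistics


def simulated_se(n, p=0.25, reps=2000, seed=141):
    """Standard deviation of `reps` simulated sample proportions of size n."""
    rng = random.Random(seed)
    phats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]
    return statistics.stdev(phats)


se_by_n = {n: simulated_se(n) for n in (25, 100, 400)}
for n, se in se_by_n.items():
    print(f"n = {n:>3}: simulated se = {se:.4f}")
```

In this setup each fourfold increase in \(n\) should roughly halve the simulated standard error. That is consistent with the standard result that the standard error of a proportion shrinks like \(1/\sqrt{n}\), which these slides do not state.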


When \(n \geq 30\), the sampling distribution is usually bell-shaped. But sometimes, you need a larger sample size.
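One situation where \(n = 30\) is not enough is a proportion for a rare trait. The sketch below assumes a hypothetical \(p = 0.02\) and compares the skewness of simulated \(\widehat{p}\) values at \(n = 30\) and \(n = 1000\). Skewness near 0 suggests a symmetric shape; the helper names are illustrative.

```python
import random
import statistics


def skewness(values):
    """Average cubed z-score: about 0 for symmetric data, positive for a long right tail."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def simulate_phats(n, p=0.02, reps=2000, seed=141):
    """Simulated sample proportions for a rare trait (assumed p = 0.02)."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


skew_by_n = {n: skewness(simulate_phats(n)) for n in (30, 1000)}
for n, skew in skew_by_n.items():
    print(f"n = {n:>4}: skewness of phat = {skew:.2f}")
```

With \(n = 30\) most samples contain zero or one person with the trait, so the distribution piles up near 0 with a long right tail. The skewness should be much closer to 0 at \(n = 1000\).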




What did we learn about sampling distributions?
Centered around the true population parameter.
As the sample size increases, the standard error (SE) of the statistic decreases.
As the sample size increases, the shape of the sampling distribution becomes more bell-shaped and symmetric.
If I am estimating a parameter in a real example, why won’t I be able to construct the sampling distribution?
How can we learn from the sampling distribution if we only have one sample?

