

Hypothesis Testing II
Megan Ayers
Math 141 | Spring 2026
Monday, Week 8
Midterm revisions
Practice framing research questions in terms of hypotheses
Practice defining null distributions
One-sided vs. two-sided tests
If you scored less than 80% on the midterm, you have the opportunity to get some points back. Parameters:
Present research question and identify hypotheses
Describe Null distribution
Obtain data, calculate relevant Test Statistic
Calculate the P-value
Use the P-value to make a conclusion on the research question

Psychologists Bem and Honorton conducted extrasensory perception studies:
Out of 329 reported trials, the “receivers” correctly identified the object 106 times.
Discuss with neighbor(s):
1. What is the relevant null hypothesis (\(H_0\)) and alternative hypothesis (\(H_a\))?
2. Describe the null distribution. How might we simulate it? (Hint: Recall the card guessing example)
3. What is our relevant Test Statistic?
4. What does the P-value represent here? How might we calculate it, given step 2?
5. How would we use the P-value to make a conclusion on the research question?
Steps to simulate this:
library(tidyverse)
library(infer)  # provides rep_sample_n()

set.seed(123)
# Under H0, each guess is like rolling a 4-sided die with sides 0, 0, 0, 1
guesses <- c(0, 0, 0, 1)
null_stats <- data.frame(correct = guesses) %>%
  rep_sample_n(size = 329, replace = TRUE, reps = 10000) %>%
  group_by(replicate) %>%
  # Proportion of correct guesses in each simulated study
  summarize(p_hat = mean(correct))
ggplot(null_stats, aes(x = p_hat)) +
  geom_histogram(color = "white")
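With a null distribution in hand, the p-value is the share of simulated results at least as extreme as the observed 106 correct out of 329. The sketch below uses base R's `rbinom()` in place of the `rep_sample_n()` pipeline (an assumption made to keep it self-contained), so its simulated p-value will not match the pipeline's value exactly. The exact binomial tail probability is included for comparison.

```r
set.seed(123)

n_trials <- 329    # reported trials
n_correct <- 106   # correct identifications
p_null <- 0.25     # chance of a correct guess under H0 (the 4-sided die)

# Observed test statistic: sample proportion of correct guesses
p_hat_obs <- n_correct / n_trials

# Simulate 10,000 studies under H0; each gives a count of correct guesses
null_counts <- rbinom(10000, size = n_trials, prob = p_null)

# One-sided p-value: share of simulated counts at least as large as observed
# (comparing counts avoids floating-point ties between proportions)
p_value_sim <- mean(null_counts >= n_correct)

# Exact binomial tail probability P(X >= 106) under H0, for comparison
p_value_exact <- pbinom(n_correct - 1, size = n_trials, prob = p_null,
                        lower.tail = FALSE)

p_hat_obs
p_value_sim
p_value_exact
```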

We got a pretty small p-value (\(0.0013\)). Hooray! It is reasonable to reject the null hypothesis.
But really, do we believe that ESP is real?
Two important words in data analysis:
Reproducibility: If I give you the raw data and my write-up, you will get to the exact same final numbers that I did.
With Quarto documents, we are learning a reproducible workflow.
Replicability: If you follow my study design but collect new data (i.e. repeat my study on new subjects), you will come to the same conclusions that I did.
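One small ingredient of reproducibility in simulation-based work is fixing the random seed, which is what the `set.seed()` calls in these slides do. A minimal illustration, where `simulate_p_hat` is a hypothetical helper name introduced only for this example:

```r
# Hypothetical helper: proportion of correct guesses in one simulated study
simulate_p_hat <- function(seed) {
  set.seed(seed)
  mean(sample(c(0, 0, 0, 1), size = 329, replace = TRUE))
}

run_1 <- simulate_p_hat(123)
run_2 <- simulate_p_hat(123)

# Same seed and same code: the two runs agree exactly
identical(run_1, run_2)
```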
Say we’re flipping a coin 20 times, and (for some reason) we are worried that it’s not a “fair” coin
Let \(p=\) probability of getting heads with this coin
Q: What is the null hypothesis?
Possible alternative hypotheses:
\(H_a: p > 0.5\)
\(H_a: p < 0.5\)
\(H_a: p \neq 0.5\)
We’ve been thinking about the first two cases (one-sided tests). Let’s see what changes if \(H_a: p \neq 0.5\) (two-sided test).
Present research question and identify hypotheses (\(H_0\) and \(H_a\))
Describe “Null” distribution
Obtain data, calculate relevant “Test Statistic”
Calculate the “P-value”
Use the P-value to make a conclusion on the research question

Note the effect that our choice of \(H_a\) had on our p-values!
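To see this concretely, the sketch below compares one-sided and two-sided p-values for the 20-flip coin example. The observed count of 14 heads is a hypothetical value chosen for illustration, not data from these slides, and base R's `rbinom()` stands in for the simulation pipeline used earlier.

```r
set.seed(123)

n_flips <- 20
obs_heads <- 14   # hypothetical observed count, for illustration only

# Simulate 10,000 sets of 20 flips of a fair coin (H0: p = 0.5)
null_heads <- rbinom(10000, size = n_flips, prob = 0.5)

# One-sided (H_a: p > 0.5): at least as many heads as observed
p_right <- mean(null_heads >= obs_heads)

# Two-sided (H_a: p != 0.5): at least as far from the expected 10 heads
# as observed, in either direction
expected_heads <- n_flips * 0.5
p_two_sided <- mean(abs(null_heads - expected_heads) >=
                      abs(obs_heads - expected_heads))

p_right
p_two_sided
```

Because the null distribution is symmetric around 10 heads, the two-sided p-value comes out at roughly twice the one-sided one.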
Can you tell if a mouse is in pain by looking at its facial expression? A recent study created a “mouse grimace scale” and tested to see if there was a positive correlation between scores on that scale and the degree of pain (based on injections of a weak and mildly painful solution). The study’s authors believe that if the scale applies to other mammals as well, it could help veterinarians test how well painkillers and other medications work in animals.
Q: Write out \(H_0\) and \(H_a\) qualitatively (in words).
Q: Write out \(H_0\) and \(H_a\) in terms of population parameters. (Hint: recall that we use \(r\) for correlation)
Q: Describe how we’d expect the data to behave under \(H_0\).
Can a simple smile have an effect on punishment assigned following an infraction? In a 1995 study, Hecht and LaFrance examined the effect of a smile on the leniency of disciplinary action for wrongdoers. Participants in the experiment took on the role of members of a college disciplinary panel judging students accused of cheating. For each suspect, along with a description of the offense, a picture was provided with either a smile or neutral facial expression. A leniency score was calculated based on the disciplinary decisions made by the participants.
Q: Write out \(H_0\) and \(H_a\) qualitatively (in words).
Q: Write out \(H_0\) and \(H_a\) in terms of population parameters. (Hint: we can write the population mean leniency score for smilers as \(\mu_S\).)
Q: Describe how we’d expect the data to behave under \(H_0\).
Shuffle the grimace column of our data.
Calculate the correlation between grimace and pain in the shuffled data.
set.seed(111)
# Simulated placeholder data with 100 observations
df <- data.frame(pain = runif(100, 0, 10),
                 grimace = rnorm(100, 10, 2))
null_stats <- df %>%
  select(grimace) %>%
  # Sampling all 100 rows without replacement shuffles grimace
  rep_sample_n(size = 100, replace = FALSE, reps = 5000) %>%
  # Reattach pain in its original order to each shuffled replicate
  add_column(pain = rep(df$pain, times = 5000)) %>%
  group_by(replicate) %>%
  summarize(stat = cor(grimace, pain))
ggplot(null_stats, aes(x = stat)) +
  geom_histogram(color = "white", fill = "gray60") +
  geom_vline(aes(xintercept = 0), lty = 2, col = "black", lwd = 1) +
  theme_classic()
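Once the null distribution of correlations is simulated, a one-sided p-value for a positive correlation is the share of shuffled correlations at least as large as the observed one. The sketch below is a base R version of the shuffle on the same simulated placeholder data; using `sample()` and `replicate()` instead of `rep_sample_n()` is a simplifying assumption, so the numbers will not match the pipeline exactly.

```r
set.seed(111)

# Simulated placeholder data (not the study's real data)
df <- data.frame(pain = runif(100, 0, 10),
                 grimace = rnorm(100, 10, 2))

# Observed test statistic: sample correlation
r_obs <- cor(df$grimace, df$pain)

# Shuffle grimace 5,000 times, recomputing the correlation each time
null_r <- replicate(5000, cor(sample(df$grimace), df$pain))

# One-sided p-value for H_a: positive correlation
p_value <- mean(null_r >= r_obs)

r_obs
p_value
```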
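Shuffle the treatment column of our data.
Calculate the difference in mean score between the smile and neutral groups in the shuffled data.
set.seed(111)
# Simulated placeholder data: 30 smile and 30 neutral observations
df <- data.frame(treatment = rep(c("smile", "neutral"), times = 30),
                 score = round(runif(60, 0, 10)))
null_stats <- df %>%
  select(treatment) %>%
  # Sampling all 60 rows without replacement shuffles the treatment labels
  rep_sample_n(size = 60, replace = FALSE, reps = 5000) %>%
  # Reattach score in its original order to each shuffled replicate
  add_column(score = rep(df$score, times = 5000)) %>%
  group_by(replicate, treatment) %>%
  summarize(ybar = mean(score)) %>%
  pivot_wider(names_from = treatment, values_from = ybar) %>%
  mutate(stat = smile - neutral)
ggplot(null_stats, aes(x = stat)) +
  geom_histogram(color = "white", fill = "gray60") +
  geom_vline(aes(xintercept = 0), lty = 2, col = "black", lwd = 1) +
  theme_classic()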
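The same idea gives a p-value for the smile study. The sketch below is a base R version of the shuffle on the same simulated placeholder data. The two-sided alternative \(H_a: \mu_S \neq \mu_N\) and the name \(\mu_N\) for the neutral group's mean are assumptions, chosen because the research question asks only whether a smile has an effect; `diff_in_means` is a hypothetical helper name.

```r
set.seed(111)

# Simulated placeholder data (not the study's real data)
df <- data.frame(treatment = rep(c("smile", "neutral"), times = 30),
                 score = round(runif(60, 0, 10)))

# Hypothetical helper: mean score for smile minus mean score for neutral
diff_in_means <- function(treatment, score) {
  mean(score[treatment == "smile"]) - mean(score[treatment == "neutral"])
}

# Observed test statistic
stat_obs <- diff_in_means(df$treatment, df$score)

# Shuffle the treatment labels 5,000 times, recomputing the statistic
null_diffs <- replicate(5000, diff_in_means(sample(df$treatment), df$score))

# Two-sided p-value: shuffled differences at least as far from 0 as observed
# (the small tolerance guards against floating-point ties)
p_value <- mean(abs(null_diffs) >= abs(stat_obs) - 1e-12)

stat_obs
p_value
```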
