14B. \(\chi^2\) summary

Chapter summary

The χ² test statistic quantifies the difference between observed and expected counts of categorical variables. By comparing an observed \(\chi^2\) value to its null distribution (generated by simulation, permutation, or math), we can calculate p-values and conduct null hypothesis significance testing (NHST).

Chatbot tutor

Please interact with this custom chatbot (link here), which I have made to help you with this chapter. I suggest at least ten back-and-forths to ramp up, then stopping when you feel you have what you needed from it.

Practice Questions

Try these questions! By using the R environment you can work without leaving this “book”. I even pre-loaded all the packages you need!

SETUP: Sometimes female Latrodectus hasselti (redback spiders) eat their mates. Is there anything in it for the males? Maydianne Andrade tested the idea that eating a male might prevent the female from re-mating with a second male; that is, a female preoccupied with eating or digesting her mate may not be looking to mate again. She observed whether a female accepted a second male after the first male either escaped or was eaten. link.

I have these data:

In long format, with three columns (first_male, second_male, and count) and four rows, in a tibble called long_spider.
In wide format, as a contingency table with three columns (first_male, Accepted, and Rejected; the last two refer to the second male) and one row of counts each for eaten and escaped, in a tibble called wide_spider.
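For readers working outside the interactive environment, the two formats can be sketched as follows. The cell counts are an assumption: the data table is not reproduced on this page, so they were reconstructed to be consistent with the margins used in Q3 (9 of 32 first males eaten, 25 of 32 second males accepted) and with the \(\chi^2\) value reported further down. Check them against the original source before relying on them.

```r
library(tibble)
library(tidyr)

# ASSUMPTION: counts reconstructed from the margins and test output on this page,
# not copied from the original data table.
long_spider <- tribble(
  ~first_male, ~second_male, ~count,
  "eaten",     "Accepted",    3,
  "eaten",     "Rejected",    6,
  "escaped",   "Accepted",   22,
  "escaped",   "Rejected",    1
)

# Spread the second-male outcomes into columns to get the contingency table
wide_spider <- pivot_wider(long_spider,
                           names_from  = second_male,
                           values_from = count)

wide_spider
```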

Q1) This study is

Q2) Plot the data. Which trend is apparent?

There is no obvious trend
Following cannibalistic mating, females are more likely to accept a second suitor
Following cannibalistic mating, females are less likely to accept a second suitor

Q3) Assuming independence, how many cases do you expect when the first male is eaten and the second male is accepted?

\(P_\text{first male eaten} = 9/32\).
\(P_\text{second male accepted} = 25/32\).

\(\frac{9}{32} \times \frac{25}{32} \times 32 = 7.03\)
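The same logic gives the expected count for every cell, not just this one. The sketch below uses only the margins quoted above (9 of 32 first males eaten, 25 of 32 second males accepted); the object names are hypothetical.

```r
n <- 32

# Row and column totals implied by the proportions above
first_male_totals  <- c(eaten = 9, escaped = n - 9)
second_male_totals <- c(Accepted = 25, Rejected = n - 25)

# Under independence: expected count = (row total * column total) / n
expected <- outer(first_male_totals, second_male_totals) / n

round(expected, 2)
#         Accepted Rejected
# eaten       7.03     1.97
# escaped    17.97     5.03
```

Note that one expected count falls below 5, which is worth keeping in mind when reading the output of chisq.test().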

Q4) What is the (two-tailed) null hypothesis?

There is no association between being eaten & your mate rejecting a second male
Being eaten is not associated with a lower chance of your mate rejecting a second male
There is an association between being eaten & your mate rejecting a second male

Q5) What is the (two-tailed) alternative hypothesis?

wide_spider |>
  select(-1) |> # remove the label column, which does not hold counts
  chisq.test()

Warning in stats::chisq.test(x, y, ...): Chi-squared approximation may be
incorrect


    Pearson's Chi-squared test with Yates' continuity correction

data:  select(wide_spider, -1)
X-squared = 11.28, df = 1, p-value = 0.0007836
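To see where the reported statistic and the warning come from, the calculation can be sketched by hand. The cell counts below are an assumption, reconstructed to match the margins in Q3 and the output above rather than taken from the data table.

```r
# ASSUMPTION: reconstructed counts (rows: eaten, escaped; columns: Accepted, Rejected)
observed <- matrix(c( 3, 6,
                     22, 1),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(first_male  = c("eaten", "escaped"),
                                   second_male = c("Accepted", "Rejected")))

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Uncorrected Pearson statistic
chi2_plain <- sum((observed - expected)^2 / expected)

# Yates' continuity correction shrinks each |O - E| by 0.5 before squaring
chi2_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

round(chi2_yates, 2)  # should agree with X-squared in the output above
min(expected)         # chisq.test() warns when any expected count is below 5
```

The continuity correction is applied by default for 2 × 2 tables and makes the test more conservative, so the corrected statistic is smaller than the uncorrected one.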

Q6) Given our test,

We reject the null
We fail to reject the null
We accept the null
We fail to accept the null

📊 Glossary of Terms

Glossary: χ² (Chi-Squared) Tests

Chi-Squared (χ²) Statistic: A measure of how far observed counts differ from expected counts under a null model. Larger χ² values indicate greater deviation from expectation.
Chi-Squared Distribution: A probability distribution that describes the sampling distribution of the χ² statistic when the null hypothesis is true. It depends only on the number of degrees of freedom.
Degrees of Freedom (df): The number of independent values that can vary in a dataset after certain constraints are applied. For a χ² test, this typically equals the number of categories minus one (for goodness-of-fit) or (rows − 1) × (columns − 1) for contingency tables.
Goodness-of-Fit Test: A χ² test used to assess whether observed frequencies across categories differ significantly from expected frequencies based on a specific theoretical distribution.
Contingency (or Independence) Test: A χ² test used to evaluate whether two categorical variables are independent. Observed frequencies in a contingency table are compared to the frequencies expected if there were no association.
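The definitions above can be written compactly. The symbols are introduced here for illustration: \(O_i\) and \(E_i\) are the observed and expected counts in cell \(i\), \(k\) is the number of categories in a goodness-of-fit test, and \(r\) and \(c\) are the numbers of rows and columns in a contingency table.

\[
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
\]

\[
df_\text{goodness-of-fit} = k - 1, \qquad df_\text{contingency} = (r - 1)(c - 1)
\]

For the 2 × 2 spider table, \((2 - 1) \times (2 - 1) = 1\), which matches the df = 1 reported by chisq.test() in the practice questions.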

R Packages Introduced

There are no new packages, but we continue to use infer, this time for permutation.
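As a rough sketch of what a permutation-based \(\chi^2\) test might look like with infer, the code below shuffles the second-male outcome relative to the first-male fate and recomputes the statistic each time. The spider counts are an assumption (reconstructed from the margins and test output in the practice questions), and the object names, seed, and number of replicates are arbitrary choices.

```r
library(tibble)
library(tidyr)
library(infer)

set.seed(1)  # arbitrary seed so the permutations are reproducible

# ASSUMPTION: reconstructed counts; uncount() expands them to one row per female
spider_cases <- tribble(
  ~first_male, ~second_male, ~count,
  "eaten",     "Accepted",    3,
  "eaten",     "Rejected",    6,
  "escaped",   "Accepted",   22,
  "escaped",   "Rejected",    1
) |>
  uncount(count)

# Observed chi-squared statistic
obs_chi2 <- spider_cases |>
  specify(second_male ~ first_male) |>
  hypothesize(null = "independence") |>
  calculate(stat = "Chisq")

# Null distribution: permute the response, recompute the statistic each time
null_dist <- spider_cases |>
  specify(second_male ~ first_male) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "Chisq")

# Proportion of permuted statistics at least as large as the observed one
perm_p <- get_p_value(null_dist, obs_stat = obs_chi2, direction = "greater")
perm_p
```

Because larger \(\chi^2\) values indicate greater deviation from expectation in either direction, only the upper tail of the null distribution is used.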

🛠️ Key R Functions

Here’s the matching summary for the chi-square functions:

chisq.test(): Performs a chi-square test (either a goodness-of-fit or a contingency test). Returns the χ² statistic, degrees of freedom, and p-value. Example:

chisq.test(table(species, visited), correct = FALSE)

pchisq(): Looks up p-values or cumulative probabilities from the theoretical chi-square distribution. Used to compute a p-value from a calculated \(\chi^2\) value. Example:

pchisq(chi2_stat, df = 1, lower.tail = FALSE)
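As a small check of how this works, plugging in the rounded statistic reported in the practice questions (X-squared = 11.28, df = 1) should approximately recover the p-value printed there. The name chi2_stat simply follows the example above.

```r
chi2_stat <- 11.28  # rounded statistic from the spider output

# Upper-tail area: probability of a chi-squared value this large or larger
p_upper <- pchisq(chi2_stat, df = 1, lower.tail = FALSE)

# Equivalent, but less numerically precise for very small p-values
p_complement <- 1 - pchisq(chi2_stat, df = 1)

p_upper
```

Setting lower.tail = FALSE matters: the default returns the cumulative probability below the statistic, which is close to 1 here and is not the p-value.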

Additional resources

Videos:

Crash course statistics \(\chi^2\)