The χ² test statistic quantifies the difference between observed and expected counts of categorical variables. By comparing an observed \(\chi^2\) value to its null distribution (generated by simulation, permutation, or math), we can calculate p-values and conduct null hypothesis significance testing (NHST).
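In symbols, the usual Pearson form of the statistic can be written as below. The notation is introduced here for illustration: \(O_i\) is the observed count and \(E_i\) the expected count in category (or table cell) \(i\).

```latex
\[
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
\]
```

Each term is a squared deviation scaled by its expectation, so the statistic is zero when every observed count matches its expected count and grows as they diverge.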
Chatbot tutor
Please interact with this custom chatbot (link here), which I made to help you with this chapter. I suggest at least ten back-and-forths to ramp up, then stopping when you feel you have gotten what you need from it.
Practice Questions
Try these questions! By using the R environment you can work without leaving this "book". I even pre-loaded all the packages you need!
SETUP: Sometimes female Latrodectus hasselti (redback spiders) eat their mates. Is there anything in it for the males? Maydianne Andrade tested the idea that eating a male might prevent a female from re-mating with a second male (that is, she may be so preoccupied with eating and digesting her mate that she is not looking to mate again). She observed whether a female accepted a second male after the first male either escaped or was eaten. link.
I have these data:
In long format with three columns (first_male, second_male, and count) and four rows, in a tibble called long_spider.
In wide format as a contingency table with three columns (first_male, Accepted, and Rejected; the last two refer to the second male), with one row each for eaten and escaped and counts in the cells, in a tibble called wide_spider.
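As a rough sketch of what these two layouts look like, the code below builds both tibbles by hand. The counts are an assumption (the table values are not listed here); they were picked because they reproduce the X-squared = 11.28 in the chisq.test() output shown later in these questions. The pivot_wider() step shows how one layout maps onto the other.

```r
library(tibble)
library(tidyr)

# ASSUMED counts (not given in this section); chosen to match the
# X-squared = 11.28 output reported for these data.
long_spider <- tribble(
  ~first_male, ~second_male, ~count,
  "eaten",     "Accepted",    3,
  "eaten",     "Rejected",    6,
  "escaped",   "Accepted",   22,
  "escaped",   "Rejected",    1
)

# Long -> wide: one row per first_male outcome,
# one column per second_male outcome
wide_spider <- long_spider |>
  pivot_wider(names_from = second_male, values_from = count)

wide_spider
```

The long layout suits plotting and the infer workflow, while the wide layout is what chisq.test() expects once the label column is dropped.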
Q1) This study is
Q2) Plot the data. Which trend is apparent?
Q3) Assuming independence, how many cases do you expect when the first male is eaten and the second male is accepted?
Q5) What is the (two-tailed) alternative hypothesis?
wide_spider |>
  select(-1) |>   # remove the label column that does not have numbers
  chisq.test()
Warning in stats::chisq.test(x, y, ...): Chi-squared approximation may be
incorrect
Pearson's Chi-squared test with Yates' continuity correction
data: select(wide_spider, -1)
X-squared = 11.28, df = 1, p-value = 0.0007836
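The warning appears because at least one expected count is below 5, so the χ² distribution may be a poor approximation of the null distribution. One way to check the result, sketched below, is to let chisq.test() build the null distribution by Monte Carlo simulation instead. The counts are assumed (they are not listed in this section, but they reproduce X-squared = 11.28), and the seed and B = 10000 are arbitrary choices.

```r
# ASSUMED counts (not listed in this section); they reproduce X-squared = 11.28
spider_counts <- matrix(
  c( 3, 6,
    22, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(first_male  = c("eaten", "escaped"),
                  second_male = c("Accepted", "Rejected"))
)

# The default test (with Yates' correction), warning silenced for the sketch
default_test <- suppressWarnings(chisq.test(spider_counts))

# Expected counts under independence; R warns when any of these is below 5
default_test$expected

# Null distribution by simulation rather than the chi-squared approximation
set.seed(1)  # arbitrary seed so the sketch is repeatable
sim_test <- chisq.test(spider_counts, simulate.p.value = TRUE, B = 10000)
sim_test$p.value
```

If the simulated p-value leads to the same decision as the one in the output above, the warning is not changing the conclusion.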
Q6) Given our test,
Glossary of Terms
Glossary: χ² (Chi-Squared) Tests
Chi-Squared (χ²) Statistic: A measure of how far observed counts differ from expected counts under a null model. Larger χ² values indicate greater deviation from expectation.
Chi-Squared Distribution: A probability distribution that describes the sampling distribution of the χ² statistic when the null hypothesis is true. It depends only on the number of degrees of freedom.
Degrees of Freedom (df): The number of independent values that can vary in a dataset after certain constraints are applied. For a χ² test, this typically equals the number of categories minus one (for goodness-of-fit) or (rows − 1) × (columns − 1) for contingency tables.
Goodness-of-Fit Test: A χ² test used to assess whether observed frequencies across categories differ significantly from expected frequencies based on a specific theoretical distribution.
Contingency (or Independence) Test: A χ² test used to evaluate whether two categorical variables are independent. Observed frequencies in a contingency table are compared to the frequencies expected if there were no association.
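To tie these terms together, here is a minimal sketch that works through a contingency test by hand and then checks it against chisq.test(). The counts are made up for illustration and do not come from any study. The continuity correction is switched off (correct = FALSE) so the two statistics are comparable.

```r
# Hypothetical 2 x 2 table of counts, for illustration only
observed <- matrix(c(30, 10,
                     20, 40),
                   nrow = 2, byrow = TRUE)

# Expected counts if rows and columns were independent:
# (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Chi-squared statistic: squared deviations scaled by expectation, summed
chi2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom for a contingency table: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail area of the chi-squared distribution beyond the statistic
p_value <- pchisq(chi2, df = df, lower.tail = FALSE)

# Same test via chisq.test(), without the continuity correction
builtin <- chisq.test(observed, correct = FALSE)
c(by_hand = chi2, builtin = unname(builtin$statistic))

# Goodness-of-fit on hypothetical counts against a 3:1 expectation
gof <- chisq.test(c(70, 30), p = c(0.75, 0.25))
gof$parameter  # df = number of categories - 1
```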
R Packages Introduced
There are no new packages, but we continue to use infer, this time for permutation.
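A minimal sketch of what a permutation-based χ² test with infer might look like is below. It assumes one row per mating trial. The counts used to build those rows are assumed (chosen to match the X-squared = 11.28 output in the practice questions), and the seed and the 1000 permutations are arbitrary choices.

```r
library(infer)

# ASSUMED counts, expanded to one row per trial
spiders <- data.frame(
  first_male  = rep(c("eaten", "escaped"), times = c(9, 23)),
  second_male = c(rep(c("Accepted", "Rejected"), times = c(3, 6)),
                  rep(c("Accepted", "Rejected"), times = c(22, 1))),
  stringsAsFactors = TRUE
)

# Observed chi-squared statistic
obs_chisq <- spiders |>
  specify(second_male ~ first_male) |>
  hypothesize(null = "independence") |>
  calculate(stat = "Chisq")

# Null distribution: shuffle the pairing of first_male and second_male
set.seed(1)  # arbitrary seed so the sketch is repeatable
null_dist <- spiders |>
  specify(second_male ~ first_male) |>
  hypothesize(null = "independence") |>
  generate(reps = 1000, type = "permute") |>
  calculate(stat = "Chisq")

# Proportion of permuted statistics at least as large as the observed one
p_perm <- get_p_value(null_dist, obs_stat = obs_chisq, direction = "greater")
p_perm
```

The direction is "greater" because only large χ² values count as evidence against independence.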
Key R Functions
Here's the matching summary for the chi-square functions:
chisq.test(): Performs a chi-square test (either a goodness-of-fit or a contingency test). Returns the χ² statistic, degrees of freedom, and p-value.
pchisq(): Looks up p-values or cumulative probabilities from the theoretical chi-square distribution. Used to compute p-values from a \(\chi^2\) calculation.
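As a minimal sketch, the p-value in the chisq.test() output from the practice questions can be recovered from its rounded statistic (X-squared = 11.28, df = 1). Note lower.tail = FALSE, which asks for the area above the observed value. The default returns the area below it, which is not the p-value.

```r
# Statistic and df taken from the chisq.test() output in the practice questions
chi2_obs <- 11.28

# Upper-tail area: probability of a value this large or larger under the null
p_value <- pchisq(chi2_obs, df = 1, lower.tail = FALSE)
p_value  # close to the reported 0.0007836 (the statistic above is rounded)
```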