Welcome to Stats in the Wild! Most people think of statistics as a dry list of formulas, but in reality, it is the language of evidence. This semester, Professor Baker is challenging you to stop being a passive consumer of data and become an active researcher. You won't just be solving for $x$ in a vacuum; you will be conducting a complete statistical study to answer a question that actually matters to you.
The Four Pillars of Data Science
To ensure the integrity of your research, your project will move through four critical stages:
- Inquiry: Asking a question that can actually be measured (e.g., "Do athletes have faster reaction times?").
- Collection: Gathering data without "stacking the deck." You need to minimize bias to ensure your sample represents the population.
- Visualization: Transforming a messy list of numbers into a story using histograms and box plots.
- Inference: Using a hypothesis test to decide whether your results reflect a real effect or could plausibly be explained by chance alone.
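To make the Collection and Visualization stages concrete, here is a minimal Python sketch, not a required method. It draws a simple random sample (every member of the population has the same chance of being picked) and computes the five-number summary that a box plot displays. The roster of reaction times is synthetic, and the function names `simple_random_sample` and `five_number_summary` are hypothetical, chosen only for illustration.

```python
import random
import statistics


def simple_random_sample(population, n, seed=None):
    """Draw n distinct items; each member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the numbers a box plot draws."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


# Synthetic population: made-up reaction times (ms) for 200 students.
rng = random.Random(0)
roster = [round(rng.gauss(250, 30)) for _ in range(200)]

# Sample 30 students at random instead of, say, only asking your teammates.
sample = simple_random_sample(roster, 30, seed=1)
print(five_number_summary(sample))
```

Picking participants with a random number generator, rather than by convenience, is one simple way to avoid "stacking the deck." Different textbooks and calculators use slightly different quartile rules, so your Q1 and Q3 may differ a little from the values this sketch prints.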
The Mathematics of Evidence
Once you have collected your data (aim for a sample size of $n \geq 30$, a common rule of thumb for when the Central Limit Theorem's normal approximation works well), you will dive into the math. Your "Testing Checklist" includes:
- Descriptive Statistics: Calculate the mean ($\bar{x}$) and standard deviation ($s$).
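- The Test Statistic: Calculate a $t$-score (for a mean when the population standard deviation is unknown) or a $z$-score (for a proportion, or for a mean when the population standard deviation is known).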
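- The P-Value: Think of this as the "Fluke Factor." A P-value of $0.03$ means that, if the null hypothesis were true, there would be only a $3\%$ chance of seeing results at least as extreme as yours from random variation alone. It is not the probability that the null hypothesis is true.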
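The checklist above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions, not the required workflow: it runs a one-sample test of a mean using $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, and it approximates the two-sided P-value with the standard normal distribution, which is reasonable only for larger samples (a $t$-table or statistics software gives the exact $t$-based value). The data are synthetic, and the function name `one_sample_test`, the null value `mu0 = 250`, and the cutoff `alpha = 0.05` are assumptions made for illustration.

```python
import math
import random
import statistics


def one_sample_test(sample, mu0):
    """Return (x_bar, s, t, p) for the null hypothesis 'population mean == mu0'."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    t = (x_bar - mu0) / (s / math.sqrt(n))
    # Large-sample approximation: treat t as standard normal (assumption).
    p = 2 * (1 - statistics.NormalDist().cdf(abs(t)))
    return x_bar, s, t, p


# Synthetic data: 30 made-up reaction times in milliseconds.
rng = random.Random(42)
times = [rng.gauss(240, 30) for _ in range(30)]

x_bar, s, t, p = one_sample_test(times, mu0=250)
alpha = 0.05  # a common convention, assumed here
print(f"mean={x_bar:.1f}  s={s:.1f}  t={t:.2f}  p={p:.3f}")
print("reject H0" if p < alpha else "fail to reject H0")
```

The P-value printed here answers one narrow question: if the true mean really were `mu0`, how often would a sample land at least this far from it? Either outcome of the final line is a legitimate result to report.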
Ultimately, you will answer one of the most important questions in science: Is the effect I'm seeing statistically significant, or could it be just noise? Remember, in science, failing to reject the null hypothesis is just as valid an outcome as rejecting it. Rigor matters more than being "right"!