Introduction
When you begin studying inferential statistics, the first thing to understand is that we can almost never study EVERY individual in a population. It usually isn't possible. The whole idea of inferential statistics is to collect a small sample, ideally one that represents the population well, and then use calculations on that sample to estimate something about the population. That is why Topic 5.4 of AP Statistics is an important one: it introduces estimators. An estimator is a tool that helps us make an educated guess about a population.
This topic is crucial in AP Statistics because it forms the foundational thinking of all statistical inference. Get this topic down, and you’ll be ready to conquer the rest of AP Statistics!
What is an Estimator?
An estimator is a rule or formula that tells us how to calculate an estimate of a population parameter using sample data. Think of it as a machine: we feed in the input (the sample data), and the machine (the estimator) cranks out a product (the estimate).
For example, if we want to estimate the population mean $\mu$, we use the sample mean $\bar{x}$ as our estimator. Note the distinction: $\mu$ describes the entire population, while $\bar{x}$ is computed from the sample. The formula for $\bar{x}$ is the estimator, and the numerical value we get from our specific sample is the point estimate.
Point Estimates
A point estimate is a single numerical value that serves as our best guess for a population parameter. The key distinction is:
- Estimator: The general method or formula (e.g., $\bar{x}$, $\hat{p}$). Think of this as the THING that helps you look at the population.
- Point Estimate: The specific numerical result from applying that formula to sample data (e.g., the value of $\bar{x}$ or $\hat{p}$ computed from one sample, such as $\hat{p} = 0.39$). Think of this as the VALUE that helps you look at the population.
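One way to picture the distinction is as a function versus its return value. The sketch below is only an illustration: the function name `sample_mean` and the five quiz scores are made up for this example.

```python
def sample_mean(sample):
    """Estimator: a rule that turns any sample into an estimate of the population mean."""
    return sum(sample) / len(sample)

# Hypothetical sample of five quiz scores (made-up data for illustration).
scores = [72, 85, 90, 66, 87]

# Point estimate: the single number the estimator produces for this particular sample.
point_estimate = sample_mean(scores)
print(point_estimate)  # 80.0
```

The function stays the same no matter which sample we collect. Only the number it returns changes.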
Biased vs. Unbiased Estimators
One of the most important properties of an estimator is whether it is biased or unbiased.
Unbiased Estimators
An estimator is unbiased if, on average across all possible samples of the same size, the value of the estimator equals the population parameter. Mathematically, we say an estimator is unbiased if its expected value equals the parameter:
$$E(\text{estimator}) = \text{parameter}$$
This doesn't mean that any single sample will give us the exact parameter value. Rather, if we could take infinitely many samples and calculate the estimator for each one, the average of all those estimates would equal the true parameter.
Key insight: An unbiased estimator does not systematically underestimate or overestimate the population parameter. Some individual samples might produce estimates that are too high, others too low, but on average, they balance out to the true value.
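The "on average, they balance out" idea can be explored with a small simulation. The sketch below assumes a hypothetical population of 10,000 values drawn from a normal distribution. The population, the sample size of 25, and the 5,000 repetitions are all illustrative choices.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population (made up for illustration); in practice we never see all of it.
population = [random.gauss(50, 10) for _ in range(10_000)]
mu = mean(population)

# Draw many random samples of the same size and record each sample mean.
n = 25
sample_means = [mean(random.sample(population, n)) for _ in range(5_000)]

print(f"population mean:         {mu:.2f}")
print(f"average of sample means: {mean(sample_means):.2f}")
print(f"lowest sample mean:      {min(sample_means):.2f}")
print(f"highest sample mean:     {max(sample_means):.2f}")
```

Individual sample means land both above and below the population mean, while their average should sit very close to it.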
Examples of Unbiased Estimators
- Sample mean $\bar{x}$ is an unbiased estimator of population mean $\mu$
- Sample proportion $\hat{p}$ is an unbiased estimator of population proportion $p$
- Sample variance $s^2$ is an unbiased estimator of population variance $\sigma^2$ (note the $n - 1$ denominator!)
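The remark about the denominator can be checked with a simulation. The sketch below assumes a hypothetical normal population with variance 4 and compares dividing the sum of squared deviations by n - 1 with dividing by n. The function name and all settings are illustrative assumptions.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

SIGMA = 2.0    # hypothetical population standard deviation, so the variance is 4
N = 5          # sample size
REPS = 20_000  # number of simulated samples

def variance_with_denominator(sample, denominator):
    """Sum of squared deviations from the sample mean, divided by `denominator`."""
    x_bar = sum(sample) / len(sample)
    return sum((x - x_bar) ** 2 for x in sample) / denominator

total_n_minus_1 = 0.0
total_n = 0.0
for _ in range(REPS):
    sample = [random.gauss(0, SIGMA) for _ in range(N)]
    total_n_minus_1 += variance_with_denominator(sample, N - 1)
    total_n += variance_with_denominator(sample, N)

avg_n_minus_1 = total_n_minus_1 / REPS
avg_n = total_n / REPS
print(f"true variance:      {SIGMA ** 2:.2f}")
print(f"average with n - 1: {avg_n_minus_1:.2f}")
print(f"average with n:     {avg_n:.2f}")
```

With these settings, the n - 1 version should average close to 4, while the n version should average close to 4 × (5 - 1)/5 = 3.2. In other words, dividing by n underestimates the variance on average.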
Why Some Estimators Are Biased
An estimator is biased when its values, averaged over all possible samples, miss the parameter in a consistent direction. Computing the sample variance with a denominator of $n$ instead of $n - 1$ is the standard example: on average it underestimates $\sigma^2$.
In AP Statistics Units 5 and 6, we focus heavily on categorical data and proportions. The sample proportion $\hat{p}$ is our primary point estimator for the population proportion $p$.
Calculating Sample Proportion
The sample proportion is calculated as:
$$\hat{p} = \frac{x}{n}$$
where:
- $x$ = number of successes in the sample
- $n$ = sample size
Example: In a random sample of 200 students, 78 prefer online learning. The sample proportion is:
$$\hat{p} = \frac{78}{200} = 0.39$$
This value of $\hat{p} = 0.39$ is our point estimate for the true proportion $p$ of all students who prefer online learning.
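The same arithmetic can be written as a short function. This is a minimal sketch, and the name `sample_proportion` is a hypothetical choice for illustration.

```python
def sample_proportion(successes, n):
    """Estimator for a population proportion: successes divided by sample size."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return successes / n

# The online-learning example: 78 successes in a random sample of 200 students.
p_hat = sample_proportion(78, 200)
print(p_hat)           # 0.39
print(f"{p_hat:.0%}")  # 39%
```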
Why $\hat{p}$ Is Unbiased
The sample proportion $\hat{p}$ is an unbiased estimator of $p$ because:
$$E(\hat{p}) = p$$
If we repeatedly took random samples of the same size from the population and calculated $\hat{p}$ for each sample, the average of all those sample proportions would equal the true population proportion $p$.
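For readers who want to see the long-run average computed directly instead of simulated, the sketch below weights every possible value of the sample proportion by its probability. It assumes the number of successes follows a binomial model (independent observations, same success probability each time). The values n = 200 and p = 0.40 are hypothetical.

```python
from math import comb

def expected_p_hat(n, p):
    """Mean of the sample proportion k / n when the success count k is Binomial(n, p)."""
    return sum(
        (k / n) * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )

n = 200   # sample size (matches the example above)
p = 0.40  # hypothetical true population proportion

print(f"true proportion:           {p}")
print(f"expected sample proportion: {expected_p_hat(n, p):.6f}")
```

Under these assumptions, the weighted average should agree with the true proportion up to floating-point rounding, whatever value is chosen for the proportion.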
Interpreting Point Estimates in Context
When stating a point estimate, always include proper context:
Poor: "The estimate is 0.62"
Good: "Based on this sample, we estimate that 0.62 (or 62%) of all registered voters in the state support the proposition."
The interpretation should identify:
- What parameter is being estimated
- The population being studied
- The specific context of the problem
Understanding Variability in Estimates
An important concept is that point estimates vary from sample to sample. If we took a different random sample of the same size, we would likely get a different point estimate. This variability is natural and expected—it's why we need inference procedures (coming in later topics) to quantify our uncertainty.
For now, recognize that:
- A point estimate is our single best guess
- Different samples yield different estimates
- An unbiased estimator doesn't guarantee accuracy for any one sample, but rather accuracy on average across all possible samples
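To see sample-to-sample variability, the sketch below draws ten simulated samples of 200 from a population whose true proportion is assumed to be 0.40. That value, the seed, and the helper name `draw_p_hat` are all hypothetical choices for illustration.

```python
import random

random.seed(7)  # fixed seed so the run is repeatable

TRUE_P = 0.40  # hypothetical true proportion; unknown in a real study
N = 200        # sample size

def draw_p_hat(n, p):
    """Simulate one random sample of size n and return its sample proportion."""
    successes = sum(random.random() < p for _ in range(n))
    return successes / n

estimates = [draw_p_hat(N, TRUE_P) for _ in range(10)]
for i, estimate in enumerate(estimates, start=1):
    print(f"sample {i:2d}: p-hat = {estimate:.3f}")

average = sum(estimates) / len(estimates)
print(f"average of the ten estimates: {average:.3f}")
```

Each simulated sample gives its own point estimate. No single one is guaranteed to equal the true proportion, which is the uncertainty that later inference procedures quantify.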
