Stats, Sampling, and Research Design on the SAT

One of the biggest changes to the new SAT is the addition of a good number of statistics and research design questions. Although most SAT math is now algebra, statistics has surpassed geometry as the second most tested math topic.

The problem? Students learn very little about statistics in school. And, even students who take AP stats usually take it AFTER they take the SAT.

So, what do students need to know?

Critically, all students should be able to read all kinds of graphs and charts: pie charts, line charts, bar charts, scatterplots, stem-and-leaf plots, box-and-whisker plots, frequency tables, and two-way frequency tables.

Students should know that when they read a chart or graph, they should make sure to read the title and the column/row headers and/or axis labels. Often our assumptions about what's in a graph cloud our ability to see what is actually there.

Students should be able to extract data from these charts and graphs and then calculate and compare measures of central tendency from the data (mean, median, and mode) as well as the minimum, maximum, and range. Students do not need to be able to calculate standard deviation, but they should understand it conceptually (and be able to say whether a dataset has a large or small standard deviation).
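For anyone who wants to check these calculations by machine, here is a minimal sketch in Python using the standard library's statistics module. The scores are invented for illustration and do not come from any SAT question. The second half compares two made-up datasets with the same mean to show what "large" versus "small" standard deviation looks like.

```python
import statistics

# Hypothetical quiz scores, invented for illustration only.
scores = [70, 85, 85, 90, 100]

mean = statistics.mean(scores)          # sum divided by count: 430 / 5
median = statistics.median(scores)      # middle value of the sorted list
mode = statistics.mode(scores)          # most frequent value
data_range = max(scores) - min(scores)  # maximum minus minimum

print("mean:", mean, "median:", median, "mode:", mode, "range:", data_range)

# Two hypothetical datasets with the same mean (80) but different spread.
tight = [78, 79, 80, 81, 82]   # values cluster near the mean
wide = [60, 70, 80, 90, 100]   # values sit far from the mean

# pstdev treats the list as the whole population rather than a sample.
print("tight spread:", statistics.pstdev(tight))
print("wide spread:", statistics.pstdev(wide))
```

On the test itself, students only need the conceptual takeaway: the dataset whose values sit farther from the mean has the larger standard deviation.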

Students should also be able to calculate probability -- including probability based on information provided in a chart or table.
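As a rough illustration of probability from a table, here is a short Python sketch built on a two-way frequency table. The table (grade level by how students get to school) and every count in it are hypothetical, made up for this example. The key habit it shows is choosing the right denominator: the grand total for an "and" question, the row total for a "given" question.

```python
from fractions import Fraction

# Hypothetical two-way frequency table: grade level vs. how students get to school.
table = {
    "juniors": {"walk": 20, "bus": 30},
    "seniors": {"walk": 35, "bus": 15},
}

# Grand total of everyone in the table.
total = sum(sum(row.values()) for row in table.values())

# Row total for seniors only.
seniors_total = sum(table["seniors"].values())

# P(senior): seniors out of everyone.
p_senior = Fraction(seniors_total, total)

# P(senior and walks): one cell out of everyone.
p_senior_and_walk = Fraction(table["seniors"]["walk"], total)

# P(walks, given senior): one cell out of the senior row only.
p_walk_given_senior = Fraction(table["seniors"]["walk"], seniors_total)

print(p_senior, p_senior_and_walk, p_walk_given_senior)
```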

These questions are all clearly math. Where the SAT goes a little off the rails is that it also tests research design strategies. Now, it's true that the California middle school math curriculum includes a day or two on sampling: random sampling vs. convenience sampling vs. snowball sampling, etc. But almost none of our students really understands these lessons (perhaps because their teachers don't really understand them, or perhaps because the kids just don't know enough about research to get it, even with a great teacher). I didn't learn any of this information until graduate school (specifically, when I took, and then TAed, a graduate-level research design class!).

So, I, personally, welcomed the changes to the SAT. Understanding research is a good life skill. And, I feel comfortable teaching it. But, it's been interesting to watch SAT tutors as they assimilate the new information and figure out the best ways to teach it. Our next few BoostBlog posts will revolve around the stats questions on the SAT.

As food for thought, consider the following question from a released SAT test. Some students can figure it out using common sense (though they still flail as they try to explain why they know the answer), but it baffles others who don't have a sense of research design or sampling.

The members of a city council wanted to assess the opinions of all city residents about converting an open field into a dog park.  The council surveyed a sample of 500 city residents who own dogs. The survey showed that the majority of those sampled were in favor of the dog park. Which of the following is true about the city council's survey?

A. It shows that the majority of city residents are in favor of the dog park.
B. The survey sample should have included more residents who are dog owners.
C. The survey sample should have consisted entirely of residents who do not own dogs.
D. The survey sample is biased because it is not representative of all city residents.

What do you think?

When you pull a random sample from a population, you can generalize to the population you sampled from. So a random sample of 500 dog owners would allow the city to say that the majority of dog owners favor the dog park. But the sample in the question above was not random -- and even if it were, it could not generalize to all city residents (because it wasn't drawn from the population of all city residents). It's hard to say exactly who the sample should have included (you include people from the population you want to talk about), but it's definitely biased. And it's the worst kind of biased: the bias is related to the survey question, since dog owners are the residents most likely to want a dog park. This is not a good survey. So, the answer is D, the fourth choice.
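To see how much damage this kind of bias can do, here is a small simulation sketch in Python. Everything about the city is an assumption invented for illustration: 10,000 residents, 30% of whom own dogs, with 80% of dog owners and 35% of non-owners in favor of the park. None of these numbers come from the SAT question. The sketch compares a sample of 500 dog owners with a random sample of 500 residents drawn from the whole city.

```python
import random

random.seed(0)  # fixed seed so the sketch gives the same result each run

# Hypothetical city of 10,000 residents, stored as (owns_dog, favors_park).
# All counts are assumptions made up for this illustration.
population = (
    [(True, True)] * 2400      # dog owners in favor
    + [(True, False)] * 600    # dog owners opposed
    + [(False, True)] * 2450   # non-owners in favor
    + [(False, False)] * 4550  # non-owners opposed
)


def share_in_favor(people):
    """Return the fraction of the given people who favor the park."""
    return sum(favors for _, favors in people) / len(people)


# The true share across all residents: 4,850 out of 10,000.
true_share = share_in_favor(population)

# The council's approach: sample only from dog owners.
dog_owners = [person for person in population if person[0]]
owners_sample = random.sample(dog_owners, 500)
owners_share = share_in_favor(owners_sample)

# A representative approach: sample from all residents.
random_sample = random.sample(population, 500)
random_share = share_in_favor(random_sample)

print("true share in favor:", true_share)
print("dog-owner sample:", owners_share)
print("random sample of all residents:", random_share)
```

In this made-up city, fewer than half of all residents favor the park, yet the dog-owner sample reports a clear majority. The random sample of all residents should land much closer to the true share.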