Kalie's Test: A Workshop for Parents on Standardized Testing

What does a test score mean?

As mentioned in the introduction, a test score gives you a quick snapshot of what a student might know in a given subject area, usually presented in comparison to her peers. It doesn't tell you everything, of course. A standardized test in mathematics, like the one we're using in our example, doesn't tell you whether a child is smart, good at math, or likely to be good at math in the future. In fact, many factors can affect the score a child achieves on a given day.

Let's take another look at the interactive graph that we used in the last activity. You may have noticed that when you moved the 'School skills' slider up, Kalie's percentile score increased significantly. School skills (Kalie's ability to learn in math class) is one factor that the histogram takes into account when figuring out Kalie's score. Raising her school skills improves her performance on the test. Let's look at each of the factors, one at a time.

The sliders

First, you'll notice that all of the sliders are organized into groups. In the yellow box are factors related to Kalie; in the blue box are aspects of the test itself; in the purple box are attributes of the math curriculum Kalie's been exposed to; in the green box are standards; and in the gray box is the performance of all the other kids.

Factors related to Kalie's knowledge, preparation and motivation

  • Out-of-school: Kalie's exposure to math learning at home and outside of math class
  • School skills: Kalie's ability to learn math in class, which could be represented by her grade in math
  • Test taking skills: Kalie's ability to use the structure of test questions to find the right answer when she's not completely sure
  • Luck, health: Kalie's luck and health on the day of the test; did she eat breakfast, for example?
  • Mindset: How does Kalie feel about tests? Does she care how she does?

Aspects of the test itself

  • Scope: How much material does the test cover?
  • Level: Is the test created at the right level for the students at Kalie's grade level?
  • Cultural bias: While developers of national tests have sophisticated methods for isolating and removing questions that favor students from a particular background, not all standardized tests go through such exacting processes. Also, some would argue that these methods can never eliminate all bias. This control is set by choosing a culture (a color) that the test favors and setting the slider to indicate how much bias exists.
  • Reliability: Because students get different scores when they take the same test more than once, most test scores are considered a little unreliable. In fact, test makers calculate exactly how reliable their tests are. Kalie, for instance, answers 15 questions correctly when the graph loads. The black bar above her head indicates how well she might have done if she took a similarly difficult test another time. She might have gotten anywhere from 13 to 17. Would that have made a big difference in her percentile score?
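To see whether a swing from 13 to 17 correct answers matters, it can help to work through a small example. The sketch below is only an illustration: the list of classmates' scores is made up, and it uses one simple definition of percentile (the share of peers who scored strictly lower), which may differ from the one the interactive graph uses.

```python
def percentile_rank(score, peer_scores):
    """Percentage of peer scores that are strictly below the given score."""
    below = sum(1 for s in peer_scores if s < score)
    return 100.0 * below / len(peer_scores)


# Hypothetical raw scores for 20 classmates (not data from the workshop).
peers = [8, 9, 10, 11, 12, 12, 13, 13, 14, 14,
         15, 15, 15, 16, 16, 17, 17, 18, 19, 20]

# Kalie's score when the graph loads (15) and the ends of her range (13, 17).
for raw in (13, 15, 17):
    print(raw, "correct ->", percentile_rank(raw, peers), "percentile")
```

With these invented classmates, the same child could land well below or well above the middle of the group depending on which day she took the test.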

Curriculum

  • Scope: How much material is Kalie exposed to in the curriculum?
  • Standards-based: To what extent is the curriculum based on standards?
  • Teaching to the test: How closely is the curriculum tied to the test?

Standards

  • Scope: How much material do the standards cover?

Other kids

  • Other kids: On average, how do other students perform? If Kalie's performance improves in a year when everyone else also improves, her performance may still be average.

The overlapping squares

In the bottom right-hand corner, you'll notice four overlapping squares. When you change a slider, the sizes and positions of the squares change. Each square is affected by one group of sliders, and the squares are color coded to match. The yellow square represents Kalie's knowledge, so it's affected by all the sliders in yellow under 'Kalie.' How Kalie scores ultimately depends on how much her knowledge (the yellow square) overlaps with the knowledge represented in the test (the blue square). Keeping this in mind will help you understand why sliders have different types of effects.
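For readers who like to see the idea spelled out, here is a minimal sketch of how overlap between two squares could be measured. It assumes, purely for illustration, that the quantity of interest is the fraction of the test square covered by Kalie's square; the names Square, overlap_area and covered_fraction are hypothetical and are not taken from the actual model.

```python
from dataclasses import dataclass


@dataclass
class Square:
    x: float     # position of the left edge
    y: float     # position of the bottom edge
    side: float  # length of each side


def overlap_area(a, b):
    """Area shared by two axis-aligned squares (0 if they do not touch)."""
    width = min(a.x + a.side, b.x + b.side) - max(a.x, b.x)
    height = min(a.y + a.side, b.y + b.side) - max(a.y, b.y)
    return max(width, 0.0) * max(height, 0.0)


def covered_fraction(knowledge, test):
    """Share of the test square that the knowledge square covers."""
    return overlap_area(knowledge, test) / (test.side ** 2)


# Made-up positions: the yellow (Kalie) and blue (test) squares partly overlap.
kalie = Square(x=0.0, y=0.0, side=4.0)
test = Square(x=2.0, y=2.0, side=4.0)
print(covered_fraction(kalie, test))
```

In this picture, a slider that grows the yellow square or moves it toward the blue one raises the covered fraction, while one that shifts the blue square away lowers it.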

Weightings

Not all sliders are created equal. We designed this model so that sliders have differing power in determining Kalie's score. For instance, improving her school skills has much more effect than improving her test taking skills. These weightings come from interviews we conducted with testing experts. Here are the weightings:

  • Out-of-school: 0.7
  • School skills: 0.9
  • Test taking skills: 0.4
  • Luck, health: 0.2
  • Mindset: 0.2
  • Scope: 0.5
  • Level: 1.0
  • Cultural bias: 0.6
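One way to read these numbers is as multipliers on how far a slider moves. The sketch below assumes, only for illustration, that the effect of a change is simply the weighting times the size of the change; the model's real formula is not described here and may be different. The names WEIGHTS, effect_of_change and relative_effect are hypothetical.

```python
# Weightings copied from the list above.
WEIGHTS = {
    "Out-of-school": 0.7,
    "School skills": 0.9,
    "Test taking skills": 0.4,
    "Luck, health": 0.2,
    "Mindset": 0.2,
    "Scope": 0.5,
    "Level": 1.0,
    "Cultural bias": 0.6,
}


def effect_of_change(slider, delta):
    """Assumed effect of moving a slider by delta: weighting times delta."""
    return WEIGHTS[slider] * delta


def relative_effect(slider_a, slider_b):
    """How many times stronger slider_a is than slider_b for the same move."""
    return WEIGHTS[slider_a] / WEIGHTS[slider_b]


print(relative_effect("School skills", "Test taking skills"))
```

Under this assumption, the same nudge to 'School skills' counts a little more than twice as much as a nudge to 'Test taking skills', since 0.9 divided by 0.4 is 2.25.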

Problems with this prototype

This interactive graph is still a bit of a prototype. Because of this, not all features work exactly as intended. The most significant problems are these:

  • Returning sliders to their previous positions does not necessarily return Kalie's score to its previous value.
  • The order in which you change sliders matters, when it shouldn't.

To work around these problems, we recommend moving any slider only one time. If you want to reset a slider, it's best to refresh the whole page in your browser and start over. Future versions of the model will overcome these and other deficiencies. If you come across any other problems or strange behavior, please note it and let your workshop leader know; we intend to incorporate as much feedback as possible!

More information

For more information about the design of this model, please visit: http://ldt.stanford.edu/~migri/kalie/report/

     