In Defense of Standardized Testing

This series of articles is primarily concerned with standardized tests in compulsory education (Iowa Test of Basic Skills, PISA, TIMSS, PIRLS, NAEP). These tests differ from college entrance exams (ACT, SAT) in that, except for some state achievement tests, the tests tend to be low or no stakes for both the students and schools. 

Many educators have an aversion to standardized testing, and this is not without reason. Teachers spend an inordinate amount of time preparing their students for many of these tests, and beyond that, these tests have led to a narrowing of the curriculum. This happens in a misguided attempt to boost reading and math scores by reducing the time spent on science, social studies, art, etc. (sometimes drastically!). It is misguided because, while it seems sensible that more time on reading and math would raise those scores, cutting the other subjects actually reduces background knowledge, which, after decoding, is the key to comprehension.

But It Gets Worse

Standardized tests have been intentionally used by educators to exclude minorities. For one example, look into the case of Larry P., a black student in California who was wrongly placed in special education. You can also read this article from Time Magazine for an overview of the negatives.

Other times, the blind spots of the test writers caused them to discriminate against girls, as Garcia and Pearson (1994) note:

“When girls outscored boys on the 1916 version of the test, designers, apparently operating under the assumption that girls could not be more intelligent than boys, concluded that the test had serious faults. When they revised the 1937 version, they eliminated those items on which girls outperformed boys. By contrast, they did not revise or eliminate items that favored urban over rural children or children of professional fathers over children of day laborers (Mercer, 1989); these cultural differences apparently matched developers’ expectations of how intelligence and achievement ought to be distributed across groups (Kamin, 1974; Karier, 1973a, 1973b; Mercer, 1989).”

Whether these blind spots are willful or simply ignorant is irrelevant for our purposes. What is important is that we acknowledge that this type of discriminatory bias is still a possibility in standardized tests today. 

Content Bias

This is the type of bias that is most often pointed out in standardized tests. Content bias occurs when the content of a test favors one culture over another, typically the majority culture. This, by default, disadvantages minorities, so it is important to be able to counter content bias if we want standardized tests to be meaningful.

Thankfully, modern standardized test creators take bias seriously.

They “have used a variety of techniques to create unbiased tests (Cole & Moss, 1989; Linn, 1983; Oakland & Matuszek, 1977). Among others, they have examined item selection procedures, examiner characteristics, and language used on the tests as possible sources of bias. One of the most common methods used to control for test bias is that of examining the concurrent or predictive validity of individual tests for different groups through correlational or regression analysis.” (Garcia and Pearson, 1994).
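
To make the regression idea in that quote concrete, here is a minimal sketch of checking predictive validity separately for each group. It is only an illustration under stated assumptions, not the procedure any test publisher actually uses: the group labels, the scores, the outcome values, and the function names (`pearson_r`, `fit_line`, `validity_by_group`) are all made up for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def validity_by_group(records):
    """Summarize how well test scores predict a later outcome, per group.

    records: iterable of (group, test_score, later_outcome) tuples.
    """
    by_group = {}
    for group, score, outcome in records:
        xs, ys = by_group.setdefault(group, ([], []))
        xs.append(score)
        ys.append(outcome)

    report = {}
    for group, (xs, ys) in by_group.items():
        slope, intercept = fit_line(xs, ys)
        report[group] = {
            "r": pearson_r(xs, ys),
            "slope": slope,
            "intercept": intercept,
        }
    return report


# Hypothetical data: (group, test score, later outcome such as a course grade).
records = [
    ("A", 10, 2.0), ("A", 20, 3.0), ("A", 30, 4.0), ("A", 40, 5.0),
    ("B", 10, 2.5), ("B", 20, 3.5), ("B", 30, 4.5), ("B", 40, 5.5),
]

report = validity_by_group(records)
for group, stats in sorted(report.items()):
    print(group, stats)
```

In this invented data the slopes match but group B's intercept is higher, meaning that the same test score goes with a better later outcome for group B. A pattern like that would suggest the test under-predicts for that group, which is the kind of signal this sort of analysis is meant to surface.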

For more detail on what this looks like in practice, read this EdSurge article. Managing content bias will always be a challenge, even with knowledge of history, advanced statistical tools, and a good heart.

Perverse Incentives

Many standardized tests also suffer from the Campbell effect. This simply means that when tests are important (high-stakes) for students or teachers, the results are more likely to be corrupted by any number of means.

Think about it: when teachers and schools are assessed based on their students’ performance, they will do what they can to look good. And when your job is on the line, you may be driven to take certain... “shortcuts.”

This often leads to the aforementioned narrowing of the curriculum, which disproportionately affects students in impoverished areas. 

On top of this, there are numerous cases of outright illegal behavior. Some schools have engaged in the practice of “scrubbing”: unenrolling students or encouraging temporary truancy. There have also been cases of students being held back in grade 9 and then, after repeating the year, jumping to grade 11, conveniently skipping the standardized tests (Koretz, The Testing Charade, Ch 5).

And then there are the cases of traditional cheating. The most famous is the disaster in Atlanta, where 11 educators were given felony convictions and 22 other teachers reached plea agreements.

Unfortunately, we know that cheating is not an isolated problem. A conservative estimate is that at least 5% of these high-stakes standardized tests involve cheating in some fashion (Jacob & Levitt, 2003).

Discrepancies in Test Scores

Poor students tend to score lower than wealthy students. Minority students tend to score lower than white students. This certainly should raise some red flags because it shows that there are real problems somewhere, though not necessarily with the test itself. Once we control for other variables and compare students of different ethnicities who share a similar socioeconomic status and language level, the achievement gap is greatly decreased but still significant (Garcia & Pearson, 1994). That remaining gap shows that there is at least one other significant problem, and likely several, somewhere.
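
As a rough illustration of what comparing students with a similar socioeconomic status can look like, the sketch below compares a raw score gap between two groups with the gap inside each socioeconomic stratum. Everything in it is hypothetical: the groups "X" and "Y", the strata, the scores, and the helper names (`raw_gap`, `gap_within_strata`) are invented for the example and do not come from any real test.

```python
from collections import defaultdict
from statistics import mean


def raw_gap(rows, group_a, group_b):
    """Difference in mean score between two groups, ignoring all other variables."""
    a = [r["score"] for r in rows if r["group"] == group_a]
    b = [r["score"] for r in rows if r["group"] == group_b]
    return mean(a) - mean(b)


def gap_within_strata(rows, group_a, group_b, stratum_key):
    """Difference in mean score between two groups inside each stratum."""
    strata = defaultdict(lambda: {group_a: [], group_b: []})
    for r in rows:
        if r["group"] in (group_a, group_b):
            strata[r[stratum_key]][r["group"]].append(r["score"])

    gaps = {}
    for stratum, scores in strata.items():
        # Only compare strata that contain students from both groups.
        if scores[group_a] and scores[group_b]:
            gaps[stratum] = mean(scores[group_a]) - mean(scores[group_b])
    return gaps


# Hypothetical students: group X is mostly high-SES, group Y is mostly low-SES.
rows = [
    {"group": "X", "ses": "low", "score": 60},
    {"group": "X", "ses": "high", "score": 80},
    {"group": "X", "ses": "high", "score": 80},
    {"group": "X", "ses": "high", "score": 80},
    {"group": "Y", "ses": "low", "score": 55},
    {"group": "Y", "ses": "low", "score": 55},
    {"group": "Y", "ses": "low", "score": 55},
    {"group": "Y", "ses": "high", "score": 75},
]

overall = raw_gap(rows, "X", "Y")
by_ses = gap_within_strata(rows, "X", "Y", "ses")
print("raw gap:", overall)
print("gap within each SES stratum:", by_ses)
```

With these made-up numbers the raw gap is 15 points, while the gap inside each stratum is 5 points: smaller once socioeconomic status is held fixed, but not zero. A real analysis would adjust for several variables at once and test whether the remaining gap is statistically significant.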

The challenge here is two-fold. Is the primary problem with the standardized tests themselves or with unequal schools, differing home situations, etc.? Both?

The Importance of Standardization

In America, 80% of teachers are white (NCES, 2019). Even if you choose to assume the best, it is foolish to assume that the average teacher is knowledgeable about every culture and can adequately adjust for content bias.

Standardization allows for a level of control over the bias because you only need to provide oversight to one group, not millions of teachers. In addition, the makers of standardized tests are specifically trained to create them and to analyze them for bias. This doesn’t mean they are perfect, but they are certainly better at making tests and adjusting for bias than the average teacher.

The main value provided by standardized tests is that they give data. Without this data, we would not be aware of the discrepancies in performance based on race or income mentioned above.

Now, we tend to use the data in order to make excuses. “These disparities exist because of economic inequality; we really need to fix that.” And, true enough. But economic inequality does not change the job teachers have to do. Our job is to teach students as they are. We need to get results with the students we have in the schools we’re at. If you use a student’s social situation to excuse their lack of learning, get out of education. Social situations provide context, not excuses.

The data shows where teachers and schools are failing to educate their students. The data shows where problems are. We should use this to help schools help children. We should use this data as a tool to help us identify successful teaching methods. If we get rid of standardized assessments, we also get rid of this data. To do so is to choose to make ourselves blind, which is not a wise choice.

The scope of the problem is huge. Are there valid alternatives to standardized testing? (coming soon)

America fails too many of her students. There is a fair share of doom and gloom, but it isn’t all bad. Just take a look at how her students perform (coming soon).

Part 1: In Defense of Standardized Testing
Part 2: Alternatives to Standardized Testing
Part 3: Standardized Tests: NAEP, PIRLS, TIMSS, PARCC, PISA, ITBS, and CLT

García, G. E., & Pearson, P. D. (1994). Chapter 8: Assessment and Diversity. Review of Research in Education, 20(1), 337–391.
Jacob, B. A., & Levitt, S. D. (2003). Rotten Apples: An Investigation of the Prevalence and Predictors of Teacher Cheating. Quarterly Journal of Economics, 118(3), 843–878.
Koretz, D. (2017). The Testing Charade: Pretending to Make Schools Better. University of Chicago Press.