Why CSAP Sucks
By Christian Piatt

(Originally published in PULP)

If the ridiculous school supply and uniform bills weren’t enough to signal the beginning of school, there are plenty of other signs that the academic season is upon us: nervous-looking kids; slightly euphoric parents; bulging backpacks and the telltale crossing guards posted at strategic locations around town.

We also know it’s back-to-school time since we’re finally getting a glimpse of the CSAP test results from last year. The CSAP, which stands for “Colorado Student Assessment Program,” is given to most students in most grades throughout the state, supposedly to track student progress. A love child of George Bush and his No Child Left Behind legislation, the CSAPs and similar testing batteries across the nation have drawn mixed reviews.

In general, the sentiment toward the tests is negative, yet most folks agree we should have some sort of accountability for student achievement; the problem is that no one seems to have a clue about how to make the tests better.

For starters, the tests historically have compared apples to oranges, holding one third-grade class’ scores up against those of the third-graders who follow them the next year and so on from grade to grade. But aside from any kids who failed and had to repeat a grade, these are entirely different students, so it’s impossible to get much useful data this way.

Recently, the bureaucrats and administrators have wised up at least a little, and they’re now tracking cohorts. This means we get to see data from one group of students as they progress throughout their academic career. But this still has huge flaws, particularly in a highly mobile community like Pueblo. In some schools, where the mobility rate exceeds 100 percent, most of these aren’t the same kids from the beginning of the year to the end, let alone from one year to the next.

A more reasonable solution is to implement a longitudinal system that follows each individual student from kindergarten to graduation. This would require more consistency from state to state, but it’s really the only way to use the tests to tell if a particular student is where they need to be or not.

Another issue is the test’s sensitivity, on two levels. First, though some strides have been made to try to make the tests culturally sensitive, there are still issues surrounding the assumption of prior knowledge, much of which comes from a middle-class, primarily Anglo background. Simply put, middle-class kids have seen and done more than poorer kids, which gives them an advantage over kids who may have never left their hometown.

A second sensitivity problem is more technical, primarily regarding the higher and lower extremes of the scale. In general, all we hear about is whether or not a kid performs at or above the “proficient” level, which covers two of the four possible levels within which scores can fall. Each school can see scores in a bit more detail, but for a child who began as a non-English speaker, or as functionally illiterate, a gain of a year or more may not even create a blip on the score chart. Some concessions are made for “special needs” students, but this hardly addresses the fundamental flaw, which is that relying on this test is akin to taking a chainsaw into surgery.

Finally, there’s the problem of what the tests actually measure. The testing protocols, which are timed, try to tell if a child has mastered a set of skills necessary to solve a problem, whether it is a math proof or the questions at the end of a reading passage. For the kids who get the right answers, all is well, but for the rest, the tests really tell us nothing.

For example, say a child misses all five questions at the end of a story passage. Though we can see they got all the wrong answers, did they fail because they didn’t understand the story? Maybe they misunderstood the questions? Or perhaps the directions for what to do in the first place? Did they read too slowly to even get to the questions? Did they have so many words they could not decode in the story that they lost the story’s point? Did they lack the vocabulary to comprehend three dozen words in the first few paragraphs?

We have no idea.

That’s because these are achievement tests, which do just that: measure overall achievement. If, however, we really wanted to mine some valuable data from this effort, we should be conducting diagnostic assessments. These not only tell you where a child does well, but where, across the board, they are weak. That helps teachers target the low points so that the entire end result can come up, and so that some problems for which kids may compensate early in their school careers don’t suddenly blow up in their faces come junior high or high school.

Some are calling for the whole testing concept to be trashed, which would be a mistake. The problem isn’t that we’re testing our kids; it’s that we don’t know how or why. For now, though, the CSAPs and their counterparts in other states score well below proficient.