Evaluating the Evaluation: A look at the validity of 3-8 assessment testing

School Is In

In 2006, I was part of the group that conducted what we thought was the final grading of the New York State Grade 8 ELA Assessment Tests. The deputy commissioner in charge guaranteed that, even though he felt the scores might be too high for the “general public” to accept, they would not be lowered except “at the highest administrative level.”

When the scores were publicly released, they had been lowered. When I confronted the deputy commissioner with this fact, he told me (and some 25 other members of the New York State English Council) that they had not been lowered.

He then backed up that less-than-truthful statement with 10 minutes of explanation of the process that had resulted in lower scores that had “not been lowered.”

In the world of psychometrics (think test number-crunching), “validity” generally refers to the degree to which test data correspond accurately to the real world. The validity of a test, then, is the degree to which the test measures what it claims to measure.

With that in mind, the purpose of New York state assessment testing is to evaluate the progress of students as compared to the New York State Learning Standards (soon to be Common Core Standards) for specific subjects and grade levels. The tests were originally an assessment of student progress and potential, but have now taken on an additional role as a tool for evaluating teachers and principals and as a means to evaluate school performance over time.

To validly test for given subject matter over time, a test must be administered consistently, under standardized conditions.

Consider that each year:

the month of administration varies, so the amount of testable material covered varies

the difficulty of the test varies

