ATF: An Extended Response

I was unable to fully respond to the task force’s rationale for recommending adoption of the Smarter Balanced assessments in my written dissent for our report, so I am going to share some additional comments here.

The task force notes that the Smarter Balanced assessments were written specifically to address the Common Core. But the Next Generation Iowa Assessments were also written to assess the Common Core standards, and ITP would be able to develop questions to address the non-Common Core portions of the Iowa Core standards if requested to do so by the state.

The task force notes that the computerized assessments can make selected response results quickly available to educators, students, and parents. I am not sure there is much value in releasing partial scores, particularly to students and parents. It also isn’t clear how much faster the final scores will be released compared to the Iowa Assessments, as it is expected to take several weeks to score the performance task items.

The task force sees value in the performance task items for measuring higher-order thinking skills. There are questions about just how much higher-order thinking a standardized assessment can realistically measure. Consider this passage from page 15 of H. Wu’s Assessments for the Common Core Mathematics Standards:

Well-chosen constructed response items can assess sequential thinking, of course, but their presence on high-stakes tests is severely limited by the need to make the tests easily and cheaply gradable. Consequently, such constructed response items are constrained to be not too hard, and they are also usually broken down into smaller pieces so that each piece becomes easier to grade. This defeats the purpose of assessing sequential thinking.

Then consider, as an example, this sample Smarter Balanced assessments grade six math performance task item:


If we look at the scoring guide, we can see an example of what H. Wu has described. The students are led through the steps for solving the problem:

  • calculate the volume of the current box
  • label a net (diagram of a flattened box) with the dimensions of the current box
  • determine the surface area of the current box
  • determine whether proposed dimensions meet the company’s criteria (at this point, students would have to realize that they should do the first three steps of the problem again using the proposed dimensions, then compare the results to the criteria)
  • design a new box to meet the company’s criteria (at this point, students would have to realize that they should do the first three steps of the problem again using numbers that they have chosen until they come up with numbers that meet the company’s criteria)
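To make the scaffolding concrete, the steps listed above can be sketched in a few lines of Python. This is only an illustration: the box dimensions and the company's criteria below are hypothetical placeholders, not the values from the actual Smarter Balanced task, and the function names are my own.

```python
# All dimensions and criteria here are hypothetical, not taken from the actual task.

def volume(length, width, height):
    """Volume of a rectangular box."""
    return length * width * height


def surface_area(length, width, height):
    """Total area of the six faces, i.e. the area of the flattened net."""
    return 2 * (length * width + length * height + width * height)


def meets_criteria(dims, current_dims):
    """Assumed criteria: hold at least as much as the current box while using less material."""
    return (volume(*dims) >= volume(*current_dims)
            and surface_area(*dims) < surface_area(*current_dims))


current_box = (12, 8, 2)   # hypothetical dimensions, in inches
proposed_box = (8, 6, 4)   # hypothetical proposal

print(volume(*current_box))                       # 192 cubic inches
print(surface_area(*current_box))                 # 272 square inches
print(meets_criteria(proposed_box, current_box))  # True
```

Once the steps are laid out this way, each one is a single formula applied to new numbers, which is roughly what the scoring guide walks students through.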

This performance task requires less higher-order thinking of students than it would if they had to figure out the steps for themselves, or even how to calculate the surface area of the box without being provided the diagram. (See more sample performance task scoring guides and classroom activities here.) Are these performance task items worth the 4 to 4.5 hours they add to the administration time of the assessments (and subtract from instructional time)? These might be better than a selected response version of the same series of questions, but they fall short of more open-ended classroom assignments that could allow students to generate their own questions (perhaps on something other than boxes) and to do authentic research, writing, and presenting about them.

The task force asserts that having a shared system of formative assessment practices and interim assessments allows for powerful collaboration that has the potential to transform teaching and learning for our students. I think that the task force may be overestimating the value of standardization. First, we already have common standards. Second, it is hard to see why teachers can’t talk to each other about how to teach division with fractions or persuasive writing just because they use different classroom quizzes or tests. I will allow that using Smarter Balanced interim assessments might make it easier for teachers to collaborate on how to teach students how to satisfy Smarter Balanced scoring rubrics, but I’m not convinced that is a worthy goal in and of itself.

The task force relies on SBAC’s assertion that a six-hundred-student middle school would need only thirty computers to test all of its students. I think this underestimates both the technological challenges of the move to statewide online testing and the logistical challenges of scheduling a large number of students for lengthy, yet untimed, computer-based tests on a relatively small number of computers. I have written about this topic before, but the stories keep coming: “Colorado school districts debate move to online state tests” and “Minnesota schools hit glitches with online testing” (HT: Diane Ravitch).

Finally, I don’t share the task force’s optimism that the state will fund all of the necessary technology upgrades, the IT staff, the professional development, and the full suite of assessments, or that those funds would come without a reduction to supplemental state aid. I would love to be proved wrong on this point. Even if our school district were to receive this funding windfall, though, I’d still rather it were used to restore (and expand) programming (elementary orchestra, German language) lost in the last round of budget cuts.

Ultimately, for me, it comes back to the points I did make in my written dissent: we can’t know whether the Smarter Balanced assessments are worth the additional costs when we haven’t quantified those costs, and in any case, these assessments divert more time and money from instructional programming than necessary for accountability purposes.

Chris Liebig at A Blog About School has also made extensive comments about the task force recommendations in his Des Moines Register/Iowa City Press-Citizen guest opinion and additional notes.