This piece is part of a New York Times debate featuring eight experts on the merits of using value-added test data to evaluate teachers.
With students everywhere being tested annually for academic progress, it is perhaps no surprise that the data are now being used to evaluate the effectiveness of their teachers.
Michelle A. Rhee, the schools chancellor in Washington, fired about 25 teachers this summer based in part on their poor ratings from a "value-added" analysis of scores -- an increasingly popular and controversial method of rating teacher performance.
The Los Angeles Times recently published value-added ratings for 6,000 third-, fourth- and fifth-grade teachers based on students’ English and math test scores over seven years. The analysis looks at individual students' past test performance and projects how they should do the next year. The difference between the child's actual and projected results is the estimated "value" that the teacher added during the year.
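The arithmetic behind the method can be sketched in a few lines. This is only an illustration of the idea described above, not the statistical model the Los Angeles Times actually used, which fits a regression over seven years of scores; the projection rule, function names and all numbers here are invented for the example.

```python
# Illustrative sketch of the value-added idea -- NOT the actual model used
# by the Los Angeles Times analysis, which relies on a multi-year regression.
# Here the "projection" is simply last year's score plus the average gain
# for comparable students; every number below is hypothetical.

def projected_score(prior_score, avg_gain):
    """Project this year's score: last year's score plus the average
    gain observed for students with similar prior performance."""
    return prior_score + avg_gain

def value_added(students, avg_gain):
    """Average gap between actual and projected scores across a
    teacher's students -- the teacher's estimated 'value added'."""
    gaps = [s["actual"] - projected_score(s["prior"], avg_gain)
            for s in students]
    return sum(gaps) / len(gaps)

# Hypothetical roster: prior-year and current-year test scores.
roster = [
    {"prior": 420, "actual": 445},   # beat the projection by 5
    {"prior": 380, "actual": 395},   # fell short by 5
    {"prior": 450, "actual": 470},   # matched the projection
]

print(value_added(roster, avg_gain=20))  # → 0.0
```

A teacher whose students, on average, beat their projections gets a positive score; one whose students fall short gets a negative one. The critics' point is that this single number absorbs everything else affecting the gap, not just the teacher.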
Critics say this approach paints an unfair picture, and can be dependent on the kinds of students a teacher is assigned. But supporters say it is merely one factor that should be considered in evaluating teachers and identifying those who need training and help. How should this information be used? What are the strengths and pitfalls of this kind of measurement? If it has flaws, can it be improved and made into a worthwhile tool?
Teacher evaluation was a fly-by operation when I was a high school English teacher 30 years ago, and it has improved little in most districts since. So I understand why there is such enthusiasm for evaluating teachers based on their students' test score gains, now that such data are available.
Unfortunately, as useful as new value-added assessments are for large-scale research, studies repeatedly show that these measures are highly unstable for individual teachers. Among teachers who rank lowest in one year, fewer than a third remain at the bottom the next year, while just as many move to the top half. The top rankings are equally unstable. In fact, less than 20 percent of the variance in teachers' effectiveness ratings is predicted by their ratings the year before. This is why the National Research Council has said that this evaluation system "should not be used to make operational decisions because such estimates are far too unstable to be considered fair or reliable."
The reasons are simple. Test score gains are caused by many variables in addition to the teacher: students' learning and language background, attendance, supports at home, previous and current teachers, tutors, curriculum materials, class sizes and other school resources. Out-of-school time matters too. Summer learning loss accounts for more than half the achievement differential between high- and low-income students. Thus, researchers have found that the very same teacher looks more "effective" when she is teaching more advantaged students -- and less effective when she teaches more students who are low-income, new English learners, or who have special education needs.
Tragically, evaluating and rewarding teachers primarily on the basis of state test score gains creates disincentives for teachers to take on struggling students, just as accountability systems that rate doctors on their patients' mortality rates have caused surgeons to turn away patients who are very ill. While scores may play a role in teacher evaluation, they need to be viewed in context, along with other evidence of the teacher's practice.
Better systems exist -- like the career ladder evaluations in Denver and Rochester, the Teacher Advancement Program and the rigorous performance assessments used for National Board Certification, all of which link evidence of student learning to what teachers do in teaching curriculum to specific students. These systems also help teachers improve their practice -- accomplishing what evaluation, ultimately, should be designed to do.