There are enough smart people expressing their strong views about the causes and effects of the Chicago strike that I hesitate to clutter the blogosphere with additional commentary. But I do want to share a few thoughts:
1) My sense is that the strike was symptomatic of a deep problem in American education, which I would summarize as the demise of trust brought on by an accountability movement that may have run amok. For a slightly longer discussion, see my Education Week commentary, due out in next week’s edition and available online probably by Monday or Tuesday. It was written before the strike began and will appear after the strike has been settled; nevertheless, I think the issues I discuss there are relevant, not just to Chicago but to the ongoing debate about evaluation and accountability.
2) By a somewhat uncanny coincidence, just as the strike was ending, the Carnegie Foundation for the Advancement of Teaching was convening a group of education policy cognoscenti for a two-day conversation about teacher evaluation. In his opening to the conference, Tony Bryk presented some fascinating data, drawn from the work of Richard Ingersoll and colleagues at Penn and included in recent reports of the National Commission on Teaching and America’s Future (NCTAF). It seems that the modal number of years teachers remain in their jobs dropped from 15 in 1987-88 to 1 in 2007-08. That’s the national statistic; here in D.C., for example, it means that roughly 10 percent of teachers leave teaching after one year on the job and that fewer than 5 percent have more than five or six years of experience. For more on this issue, see the NCTAF report, “Who Will Teach? Experience Matters.”
3) Bryk presented another interesting statistic. Based on his calculations (I’m going to ask him for information about the data and methodology), roughly 40 percent of teachers deemed to be performing poorly according to so-called “value added” measures are misclassified. In other words, the “false positive” rate is about .4. Maybe the “false negative” rate is higher for other measures, e.g., standard teacher evaluations conducted by principals, in which case it could be argued that the value added approach is an improvement; but I didn’t hear Tony or others make that case explicitly. It led me to wonder whether a 40 percent error rate is tolerated in other classifications (e.g., would the FDA approve a new diagnostic test with a false positive rate that high?), and whether this kind of finding will become the basis for major legal battles.
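To make the arithmetic behind a figure like this concrete, here is a minimal sketch in Python. The counts are invented purely for illustration and are not Bryk’s data; the function name `error_rates` and its parameters are likewise hypothetical. It shows that the share of flagged teachers who are misclassified (the quantity described above, which statisticians often call the false discovery rate) is computed differently from the share of effective teachers who get flagged, and that the two can diverge widely.

```python
# Hypothetical counts, invented for illustration only (not Bryk's data).
# "Positive" here means "flagged as performing poorly" by some measure.

def error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return three error measures from a 2x2 classification table."""
    flagged = true_pos + false_pos                # everyone labeled "poor"
    actually_effective = false_pos + true_neg     # everyone who is in fact fine
    actually_ineffective = true_pos + false_neg   # everyone who is in fact poor
    return {
        # Of the teachers flagged as poor, what share were flagged wrongly?
        "share_of_flagged_misclassified": false_pos / flagged,
        # Of the teachers who are in fact effective, what share were flagged?
        "false_positive_rate": false_pos / actually_effective,
        # Of the teachers who are in fact ineffective, what share were missed?
        "false_negative_rate": false_neg / actually_ineffective,
    }

# An assumed district of 1,000 teachers, 100 of whom are truly ineffective.
rates = error_rates(true_pos=60, false_pos=40, true_neg=860, false_neg=40)
for name, value in rates.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, 40 percent of flagged teachers are misclassified even though fewer than 5 percent of effective teachers are flagged. Which of these quantities a reported “40 percent” refers to matters a great deal for any comparison with diagnostic tests in other fields.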
The turnover and misclassification statistics above underscore the long-term significance of continued research into the teaching profession. Among the questions I’d include, these seem most salient:
- What is the connection between high stakes test-based evaluation, teacher morale, turnover, and instructional effectiveness?
- What level of misclassification error might be defensible given the potential benefit to society of moving truly incompetent teachers out of the workforce?
- What is the experience in other professions with respect to formal evaluation methods and the tolerance for misclassification error?
- Do other countries – in particular those that seem to be “out-performing” us – rely on human capital strategies similar to those currently in vogue here? (I refer you to the excellent work by Dr. Laura Engel and Dr. Jim Williams in our GSEHD Working Paper 2.3, “The Global Context of Practice and Preaching: Do High-Scoring Countries Practice What U.S. Discourse Preaches?”)
- Is there an argument to be made for using test-based measures such as value-added only as a first step or early warning indicator, i.e., as a way to identify teachers who may require closer scrutiny, and then investing in a more thorough, ongoing, and valid assessment process? (I am grateful to my colleague Richard Atkinson for suggesting this idea; we may write more about it in the future.)
- Finally (at least for this list), I believe these questions reflect a fundamental challenge: evaluations designed to facilitate individual or systemic improvements in performance may not be useful for high stakes personnel decisions (merit increases, promotion, firing), and vice versa. The future of the accountability movement hinges on our willingness and ability to understand this challenge, and to shape the R&D agenda accordingly, if we are serious about using research to inform public policy.
September 21, 2012