Tag Archives: assessment

PISA: The Morning After

Yesterday was PISA Day, an opportunity for concerned educators and citizens to think about the latest round of results from this important international comparative assessment.

Not surprisingly, at least to those of us who follow the rhetoric and reality of comparative data, U.S. performance was basically unchanged from three years ago; some other countries (e.g., Poland, Germany) improved; some of the traditional “stars” (e.g., Finland) experienced a decline; and policy makers and commentators were quick to pronounce on the meaning of the results.

PISA is a remarkable program, in terms of both the breadth of its coverage (65 education systems, including selected states in the U.S. and Shanghai as separate from China) and the care taken to provide reliable estimates of the math, reading, and scientific literacy of samples of 15-year-olds. We have come a long way since the early days of international comparative assessment, in terms of sampling methods, psychometric quality, and reporting of results.

Interpretation, though, remains a challenge. For descriptive purposes, PISA provides a trove of interesting information, which, along with TIMSS, NAEP, and PIAAC (another OECD project), should be studied by anyone who cares about the ongoing pursuit of improved educational opportunity locally, nationally, and globally. The more complicated task, though, is deriving sound policy inferences from these descriptive data. No pattern of relationships is clear enough to support definitive inferences about the relative success of various reforms in the U.S. and elsewhere; about the relationship of test performance to national economic outcomes; or about what exactly we should do next as we struggle to expand access to high-quality educational opportunities for our students.

For example, it seems that Massachusetts again did better than the overall U.S. average and on par with some of the biggest international “winners.”  Florida fared more poorly.  So people who like what Massachusetts has been doing must be pleased, and would be inclined therefore to like what PISA measures.  People who like Florida’s hard-charging accountability reforms are surely disappointed, and some of them must now be skeptical about whether PISA is the right tool to gauge the effects of reform.  It is no small irony that some of the harshest critics of PISA (and testing generally) are willing to use the latest results to vindicate their claims about the success or failure of various reform initiatives. 

Similarly, there is the implicit (and in some cases explicit) attempt to tie PISA scores to our current or future economic stature. Here, too, the hazards of inferring cause and making predictions from purely correlational and descriptive data are profound. As I’ve written elsewhere, the U.S. performed near or at the bottom on the First International Mathematics assessment in 1964, and indeed our economic productivity growth declined in the subsequent decade (when many of those high school kids who had taken the test were in the labor market). But countries that significantly outperformed us on the math test, e.g., Britain and Japan, experienced an even more dramatic productivity growth slowdown, suggesting that those test results could not, alone, explain much about the economy or predict much about the future. Average annual labor productivity increased by about 4.2% in the U.S. between 1979 and 2011, compared to 2.5% in Germany, which outperforms us on PISA. Again, it’s not easy to infer simple relations from these data. Does academic achievement matter to economic outcomes? Yes, without a doubt. But so much else matters, and probably more, that to draw quick inferences – and again sound loud alarm bells about our impending economic doom and, worse yet, blame everything on schools and teachers – from results on one assessment of one age group is a recipe for the ultimate erosion of respect for what is otherwise a useful tool for comparative study. I’ve tried to make this argument elsewhere.

Finally, there is the question about what to do next. On this, I was inspired by the wise comment of John Jackson, of the Schott Foundation for Public Education, whom I met at a dinner with a number of education policy makers and researchers. John asked whether it was possible that we in the U.S. had essentially “maxed out” on the impact of reform in terms of its effects on PISA scores, and if so what should guide us as we continue to work on educational improvement. That’s the right question, and though I believe there is useful information in PISA, I also believe that answering it will require considerably more nuance in our understanding of the results. (To get a sense of the complexity of these issues, see the recent work of Martin Carnoy and Richard Rothstein.)

For me, the most important issue to focus on is not where we stand on average, but rather how to cope with the ravaging effects of growing economic inequality on educational opportunity and the life chances of our youth. In other words, we need to work on the variance more than the mean, to acknowledge the effects of poverty, and to concentrate on policies and programs that can restore opportunity (my colleagues Richard Murnane and Greg Duncan are co-authors of a book with that title, due out early next year). A good place to start would be with a sustained program of research and policy that builds on the foundations of work such as Whither Opportunity? If we care about the American dream – and really want to reaffirm the nation’s commitment to upward mobility and improved quality of life for all our people – we should not distract ourselves with foolish attempts to use a single assessment as the guide to policy and practice. PISA poses important questions; the answers aren’t as obvious.

December 4, 2013

Filed under Uncategorized

Proceed with Caution: New Report Falls Short in Complex Task of Evaluating Teacher Education

A report by the National Council on Teacher Quality (NCTQ), released today, raises questions and offers judgments about selected teacher education programs in the US. Although perhaps intended as a tool to guide program improvement and, ultimately, the quality of teaching in the nation’s elementary and secondary schools, the report is deeply flawed and its findings need to be viewed with caution.

A few examples of the report’s errors are enough to cause concern. First, the results are based on reviews of course requirements and course syllabi, which are not necessarily an accurate reflection of what is taught in teacher preparation programs; available literature on differences between intended and enacted curricula seems to have escaped NCTQ’s attention. Furthermore, NCTQ does not link these proxy variables to observations of actual performance by teachers in elementary and secondary schools. The NCTQ report relies heavily on intuition about these issues, but our children’s education deserves better: we would not want to rate medical education based on a review of published course requirements in medical schools, without examining the content actually delivered in university classrooms or including evidence of practice in clinical settings.

Second, only about 10% of the programs that were rated actually provided NCTQ with the requested data, and there is no explanation of how institutions that did not provide data were treated.  At GW, for example, we chose to not participate in the project, largely because we were uncertain about whether the methodology was attuned to the subtle differences between teacher preparation at the undergraduate and graduate levels.  One question, then, is what data did NCTQ use to rate our programs and how did they obtain those data?   On the other hand, NCTQ didn’t acknowledge the existence of one of our biggest programs, in Special Education.  Until these mysteries of commission and omission are solved, it is difficult for us to decide whether and how the report’s findings might contribute to our program improvement efforts.

Third, there are many errors of fact and interpretation.  For example, on the “selectivity” standard, NCTQ gives GW’s programs two stars (out of four), stating that “the program … does not require a grade point average of 3.0 or higher overall or in the last two years of undergraduate coursework that provides assurance that candidates have the requisite academic talent.”  There is little evidence that a 3.0 GPA is a proxy for such talent and sufficient to predict performance either in graduate school or, more importantly, in the workplace after graduation.  Still, at GW, 22 of the 24 students in our incoming 2013 cohort in secondary education (English, mathematics, physics, social studies, English Language Learners) had an average of 3.4.  For students who do not meet the 3.0, we offer provisional status but then require them to earn a 3.0 or better during the first nine credit hours of course work.  NCTQ obviously did not have this data, and apparently either did not read our bulletin or chose to ignore these subtleties.  As Linda Darling-Hammond has noted, Harvard, Stanford, and Columbia got low marks on this standard too, because they don’t require either a minimum GPA or a minimum GRE. Reports of other errors are starting to come in.  Teachers College (Columbia), for example, was rated poorly for two programs they don’t offer.

At GW we have much to be proud of in our teacher preparation programs: our alumni have been named “teacher of the year” in a number of school systems and have received many other awards and honors, our faculty are widely recognized for their dedication and skill, and our accreditors routinely praise us for the quality of our instructional programs based on assessments that include considerable attention to actual performance.

Evaluation of teacher education is at least as complex as the evaluation of teaching, and is worthy of the best and most rigorous methods.  Reports like this one ultimately trivialize the task and undermine efforts to ensure that our future teachers acquire the skills and knowledge needed for their lives in classrooms.  There is surely room for improvement in the world of teacher preparation – as there is in all professions – but the NCTQ report provides an inadequate basis upon which to design and implement positive reforms.

June 18, 2013


EdWeek Commentary Condemns Misdirected Blame in Atlanta Cheating Scandal; Garners Mixed Reaction from Readers

Reactions to my recent Education Week commentary on the Atlanta cheating scandal have been interesting (see also this Washington Post blog post by Valerie Strauss that highlights my commentary). They mostly relate to the question I raised: whether the system is to blame or whether individuals, even when faced with strong pressure and incentives for opportunistic behavior, should act morally and legally. And, of course, several of the commentaries rehearse the well-known criticisms of testing.

As best I can tell though, readers didn’t take up the issue I mentioned regarding the National Assessment of Educational Progress (NAEP) scores, which apparently improved during Beverly Hall’s tenure in Atlanta.  If student performance was really getting better, then there was arguably less reason to engage in the alleged tampering.  We won’t know whether the people accused of the cheating considered any of this, or indeed if most of them knew about the NAEP results in the first place.  Meanwhile, suspicions have been raised about whether there had also been mischief in the NAEP sampling and scoring. For an especially lucid analysis of the NAEP results and how they relate to the Atlanta situation, I refer you to this excellent article by our friend Marshall (Mike) Smith, former Under Secretary of Education and Dean of the Stanford Graduate School of Education.

April 15, 2013

Pulling Rank

U.S. News and World Report has decided to reclassify GW as “unranked” in light of the University’s disclosure of an error in the way one statistic, percentage of incoming freshmen who graduated in the top 10% of their high school class, had been reported.  This unfortunate move on the part of USNWR is explained by its director of data research here.

I’d like to offer a few thoughts about this story to our students and their families, and to our faculty, staff, national council members, current and future employers of our graduates, alumni, and other friends in the community.

First, we can all be proud of the way President Knapp, Provost Lerman, and Vice Provost Maltzman handled the discovery that data had been reported incorrectly: they came right out and said so, voluntarily, quickly, and with complete transparency.  We should applaud this decision, which included a full independent audit and restructuring of internal oversight procedures, because it makes clear that for GW ethics precedes expediency.  As President Knapp notes in his public statement to the community, “we [disclosed the mistake] without regard to any possible action that U.S. News might take as a result…”


Some Random Thoughts Post November 6

Regardless of your ideological or political preferences, this year’s election results were a powerful reminder of what’s special and perhaps unique about the American system.  Just over the bridge from here, “swing-state” Virginians swung toward President Obama and elected Democrat Tim Kaine to be their Senator, while in the Richmond area they overwhelmingly voted for Republican (majority leader) Eric Cantor.  Up the road, our friends in Maryland approved gay marriage, rights for immigrant children – and casino gambling.  Californians, seemingly fed up with years of education cuts, supported Governor Brown’s plan to raise taxes to support needed reinvestment in what was once among the best university systems in the world; and they rejected the proposal to abolish capital punishment.  In Florida, where we still are waiting to see where the state’s swing will land when the final votes are counted, an initiative to overturn the prohibition on public funding of religious organizations failed; as did a proposed amendment that would have banned use of state funds for abortion.  (Maybe I’m out of touch, but I don’t think I would have predicted these outcomes from Floridians!) Voters in DC expressed their concerns for good government by approving draconian measures authorizing the Council to expel members for “gross misconduct,” requiring the mayor or a council member to resign immediately if convicted of a felony, and prohibiting elected officials convicted of felonies from ever holding office again.

Some Thoughts about the Chicago Teachers’ Strike

There are enough smart people expressing their strong views about the causes and effects of the Chicago strike that I hesitate to clutter the blogosphere with additional commentary.  But I do want to share a few thoughts:

1) My sense is that the strike was symptomatic of a deep problem in American education, which I would summarize as the demise of trust brought on by the accountability movement that may have run amok.  For a slightly longer discussion, see my Education Week commentary, due out in next week’s edition and available online probably by Monday or Tuesday.  It was written before the strike began, and will appear after the strike was settled; nevertheless, I think the issues I discuss there are relevant – not just to Chicago but to the ongoing debate about evaluation and accountability.

2) With somewhat uncanny coincidence, just as the strike was ending, the Carnegie Foundation for the Advancement of Teaching was convening a group of education policy cognoscenti for a two-day conversation about teacher evaluation. In Tony Bryk’s opening to the conference he presented some fascinating data, drawn from the work of Richard Ingersoll and colleagues at Penn and included in recent reports of the National Commission on Teaching and America’s Future (NCTAF). It seems that the modal number of years teachers remain in their jobs dropped from 15 in 1987-88 to 1 in 2007-08. That’s the national statistic; here in D.C., for example, it means that roughly 10 percent of teachers leave teaching after one year on the job and that fewer than 5 percent have more than five or six years of experience. For more on this issue, see the NCTAF report, “Who Will Teach? Experience Matters.”
