Seeing the lumber for the trees
(Sure, the title of this post is a bit forced, but if I’m not going to be bothered to get out of my bathrobe to write it, I’m not going to go to any great length to be clever.)
I just finished a conversation with my landlord about standardized testing and the recent government push for such tests to be written by undergraduates at American universities. She saw this as due to a fundamental misunderstanding of how learning occurs (which, of course, is dependent on how it’s measured). The SAT, she claims, better measures how well people can prepare to write the SAT than it does how knowledgeable and capable they are in the areas of knowledge represented in the test. Standardized testing won’t allow for student performance to be compared fairly across universities. It will instead pervert instruction so that students will perform better on the tests—especially if a university’s federal funding and prestige is tied to test results.
I wouldn’t argue with her conclusions. (Anyone who’s read Freakonomics or has decent common sense—that covers just about everyone, right?—knows that tying funding to standardized testing gives teachers incentives to teach to the test, if not cheat and change students’ answers.) I do, however, think that the motivation for such testing is not caused primarily by a misunderstanding of how learning occurs (or is best measured). No, it’s borne of a more pragmatic kind of thinking: management.
Having a verifiable, quantitative means of supporting decisions is much emphasized in prescriptive high-level management, economics, and game theory. If you can show that Shrek 7 is likely to result in a greater return on investment than Clooney Political Thriller 4, you can justify choosing Shrek. This is particularly important in cases where decisions are not individual, but an institutional process. The director may love the script for Clooney Political Thriller 4, the CFO’s cousin might owe Clooney a favour, but that’s not likely to convince the stockholders. They’ll want to know about the film’s ROI. Ultimately, there needs to be a measure, and it should be standard and accepted. Good managers act on key performance indicators.
This is exactly what wide-spread, all-encompassing standardized testing provides: a key performance indicator. Already, SAT and GRE scores are used to judge applicants to universities, as are grades. Why not extend the system so that they are standard, more consistent and reliable? The prestige of big-name institutions like Harvard, Stanford, and the like can be justified or destroyed by how well their students perform on the tests. Incoming hires can be judged by their scores, perhaps weighted by the average of scores for the university they came from. Scores can be tied to dollar values, correlated with other measures of performance. Standardized testing isn’t about measuring learning or capacity, but assigning value. It’s not about education. It’s a management initiative.
As I’m certain you’re all well aware, measures can be toyed with. There are thousands of ways of easing a decision through, clever ways of presenting stats, or assessing risk and making decisions using uncertain data, of course. Oversights, elision, optimism, bias, inaccuracy, time pressures: these can change how well a measure represents what it should. Scores are relied upon, used (and abused), even when they’re juked, they’re only as good as the means through which they’re obtained. They can be warped by not only a sneaky statistician or lazy inventory clerk, but by an incomplete form or poorly written question.
As powerful as it may seem, testing is an awful way of measuring understanding or expertise. A comprehensive multiple-choice test can tell you how well students performed on that comprehensive multiple-choice test on the day they were tested. You can only hope that this score is somehow indicative of that student’s greater, longer-term understanding of the topic. Further, the results of that test are confounded by those who studied more (not necessarily the same students who know more or better), who have good short-term recall, whose teacher used the same terminology used in the test, who cheated and got a copy of the test off the web, who are simply better at multiple-choice than those who are better in an oral exam, etc. Test scores are test scores, not knowledge or expertise scores. Managers who live and die by such measures may not see that. If scores change, so do decisions. The scores become what they measure.
Blaming managers is easy (and profitable). We all have a limited capacity for information and have to use summary measures and other generalities to get things done. Common currency is a good example. I don’t pay the taxi driver by driving her around and I don’t barter: she translates her services into a dollar value using a accepted equation and I give her that much money. There are cases where this form of exchange should not be applied, where the measures can’t be ascertained or agreed upon. Are divorce settlements accurate measures of the emotional investment spouses made in each other? Should the plight of ten million starving in another country be attended to ten times more intensely than the million hungry homeless in ours? No, these cases are blurrier and more difficult than paying for a cab.
Yet many of us confuse the measure for what it measures. Education is an SAT score. Health is BMI. Prosperity is GDP. Government is voting. Worth is salary.
No, they’re not. These are complicated and complex things that have had systems and measures forced upon them to make them, well, manageable. They’re worthwhile only as generalities, and we should be aware of their limitations and distance from reality. Education, health, these are all things greatly affected by high-level, institutional decisions. Those in positions of power should reason about their decisions using the effects they have, and to know and to judge these effects is important. It is just as important is not to overdo it, to think that numbers and indicators are the reality they’re supposed to measure. They’re faulty and incomplete, and approximations at best. Having every American undergraduate write an exam in third year isn’t going to tell much about how educated they are, and it should not be used as if it would.
And, please, don’t make me write any more exams. They’re so damn stressful.
Update: I’ve heard that, in business, people often say “what gets measured gets done.”
Post a Comment