The Replication Crisis comes for Stereotype Threat

Social psychologist Michael Inzlicht apologizes for promoting the turn-of-the-century fad explanation for racial disparities.

Dec 19, 2024


On his Substack, U. of Toronto social psychologist Michael Inzlicht writes:

Revisiting Stereotype Threat: A Reckoning for Social Psychology
Michael Inzlicht
Dec 18, 2024
Another day, another idol falls.
This one has been teetering for years, so the collapse didn’t come as a shock. But that doesn’t make it any less painful.
I’m talking about stereotype threat, a once-revolutionary idea that shaped how social psychologists thought about identity, achievement, and inequality. For decades, it inspired research, drove interventions, and promised insights into the invisible forces that constrain human potential.
I still remember seeing its most eloquent advocate, Stanford University’s Claude Steele, deliver a keynote address in 1999 at the annual convention of what was then called the American Psychological Society. … Steele was nothing short of magnetic. Charismatic and at the height of his powers, he commanded the stage like no academic I had ever seen. He delivered his message with the kind of confidence that makes you believe science can change the world. … Professor Steele was a rock star, and I was as giddy seeing him on stage as I was seeing Kurt Cobain on stage a few years earlier.
What is Stereotype Threat?
The concept of stereotype threat, first proposed by Claude Steele in the early 1990s, posited that individuals who are part of a negatively stereotyped group can, in certain situations, experience anxiety about confirming those stereotypes, leading paradoxically to underperformance, thus confirming the disparaging stereotype. The initial research was groundbreaking.
In 1995, Steele and his student Joshua Aronson—who went on to become my postdoc supervisor years later—demonstrated that the notorious Black-white gap in academic performance could be partially closed when negative stereotypes impugning Black people’s intelligence were made irrelevant. When Black students at Stanford University were told that a test was diagnostic of intellectual ability, they performed worse than their white counterparts. However, when this stereotype threat was ostensibly removed—by simply framing the test as a measure of problem-solving rather than intelligence—the performance gap between Black and white students nearly vanished.
Suddenly, here was an explanation for why certain groups didn't perform as well in academic settings. And it wasn’t just race; follow-up studies looked at women in math and science. Women, who dominate men in most academic disciplines, underperform in STEM fields because they were regularly, albeit subtly, reminded of the stereotype that women aren’t good at math, or so the story goes. The idea felt revolutionary, hopeful even, because it suggested that these vexing performance gaps could be addressed by changing people’s immediate environments rather than accepting them as fixed outcomes, inherent to the groups themselves.
These findings were exhilarating. Before long, stereotype threat was not only the darling of social psychology, but it also became the darling of the political left who now had an answer to prevailing views of group differences held by the political right. This is partly because shortly before stereotype threat took its turn in the spotlight, Charles Murray and Richard Herrnstein published The Bell Curve, which resulted in a media firestorm that has had repercussions to this day. Not only did the book discuss racial differences in intelligence as real and consequential—and not mere products of culturally biased IQ tests—it suggested that a non-negligible factor in this gap was due to biological differences. This thesis was so toxic that the octogenarian Murray is still considered a pariah, shouted down and deplatformed from talks he tries to deliver at respectable colleges to this day.

For example, here was a stereotypical 2012 opinion piece in the New York Times:

In a 1995 article in The Journal of Personality and Social Psychology, Professors Steele and Aronson found that black students performed comparably with white students when told that the test they were taking was “a laboratory problem-solving task.” Black students scored much lower, however, when they were instructed that the test was meant to measure their intellectual ability. In effect, the prospect of social evaluation suppressed these students’ intelligence.
Minorities aren’t the only ones vulnerable to stereotype threat. We all are. A group of people notably confident about their mathematical abilities — white male math and engineering majors who received high scores on the math portion of the SAT — did worse on a math test when told that the experiment was intended to investigate “why Asians appear to outperform other students on tests of math ability.”
And in a study published earlier this year in the journal Learning and Individual Differences, high school students did worse on a test of spatial skills when told that males are better at solving spatial problems because of genetic differences between males and females. The girls were anxious about confirming assumptions about their gender, while the boys were anxious about living up to them.
The evolving literature on stereotype threat shows that performance is always social in nature. Even alone in an exam room, we hear a chorus of voices appraising, evaluating, passing judgment. And as social creatures, humans are strongly affected by what these voices say.

Back to Inzlicht:

Stereotype threat, in contrast, was a breath of fresh air. It promised that group differences were malleable, not fixed. They could be explained as momentary apprehension, akin to the nerves that might cause an elite athlete to choke on competition day.

Are black athletes notorious for choking on competition day? I suspect there might have once been such a stereotype but, if so, it faded pretty quickly, in part because of Stereotype Motivation: guys like Joe Louis, Jackie Robinson, and Bill Russell wanted to disprove demeaning stereotypes in order to show that blacks could win at the highest levels.

Yes, these group differences still have consequences, but now we have a remedy—change the situation so that stereotypes are less likely to be in the air and watch as all the Black students and female mathematicians rise to the top.

I’ve been writing about stereotype threat for 20 years. Back in 2004, I wrote for VDARE:

A little experiment Claude [Steele, identical twin of Shelby Steele] performed on some Stanford sophomores almost a decade ago has become wildly popular among liberals. They see it as the Rosetta Stone explaining the mystery of racial inequality. It supposedly proved that on standardized tests like the SAT college entrance exam, blacks would score the same as whites on average if only mean people like me wouldn't ever mention the fact that they, uh, don't score the same.
What Steele found was that when he told his black subjects that the little custom-made verbal test he was giving them would measure their intellectual ability, they scored worse than when he provided a less threatening description of the exam.
Here's the logic behind this extrapolation: At some point back in the mists of time, a stereotype somehow emerged that blacks do less well on the SAT. So, now, blacks are seized by panic over the possibility they might mess up and score so poorly that they validate this stereotype.
And, indeed, this nervousness makes them score exactly as badly as the stereotype predicted they would.
It's really a lovely theory. In its solipsistic circularity, it's practically unfalsifiable.
Still, you might object that Occam's Razor suggests a simpler explanation—that the arrow of causation runs in the opposite direction, with the stereotype being the result, not the cause, of decades of poor black performance on the SAT.
But that just shows you are a mean person, too.
If you were a nice person, then you would know that if we all just believe that everybody will score the same, then everybody will score the same!
Just like when we were children and all clapped at a performance of Peter Pan to show we had faith that Tinkerbell would recover.
Of course, to me as a former marketing executive, there's an obvious alternative explanation of Steele's findings: the students figured out what this prominent professor wanted to see, and, being nice kids, they delivered the results he longed for. This happens all the time in market research. After all, this was just a meaningless little test, unlike a real SAT where the students would all want to do as well as possible.

I doubt that any Institutional Review Board would approve a study in which professors attempted to badger black students into performing badly on a high-stakes test like the LSAT. That’s because the notion isn’t completely implausible; it’s just highly implausible as the panacea explanation for all racial inequality in average performance.

Off to the paywall:
