David Rozado, a professor in New Zealand, has been a leading proponent of big data analyses of media bias. He’s got a new Substack out on Wikipedia’s political leanings:
To study political bias in Wikipedia content, I first gathered from external sources a set of target terms (N=1,628) with political connotations (i.e. names of recent U.S. presidents, U.S. congressmembers, U.S. Supreme Court Justices or Prime Ministers of Western countries). I did not cherry pick the set of terms to be included in the analysis but instead used publicly available pre-existing lists of terms from Wikipedia and other sources.
Rozado is a big believer, like me, in the utility of using somebody else’s list made up for some other purpose than your own. He’s very wary of cherrypicking.
For example, he recently published an analysis of how often the New York Times and similar outlets used a list of politically loaded woke terms he’d been following since the late 2010s. It suggested that wokeness in the prestige press had peaked in the early 2020s and was recently in decline. I suggested that while that may well be true, one problem with his analysis was that wokeness, being heavily fashion-driven, was constantly developing neologisms to replace terms that had fallen out of fashion. For instance, his list did not include “equity,” a previously obscure term that has skyrocketed in popularity among the woke.
Rozado agreed that was an issue, but asked how he could find an objective way to add terms without risking putting his thumb on the scale. And that left me stumped.
I looked up some lists of woke terms, but they tended toward the satirical.
Any suggestions?
I then identified all mentions in English Wikipedia of those terms. This was followed by extraction of the textual context around those terms and feeding a random sample of those text snippets to a machine learning model for annotation of the sentiment/emotion with which the target term is used in the snippet (in total, I generated 175,205 sentiment annotations).
Results show a mild to moderate tendency in Wikipedia articles to associate public figures ideologically aligned right-of-center with more negative sentiment than public figures ideologically aligned left-of-center. These results suggest that Wikipedia’s neutral point of view policy (NPOV) is not achieving its stated goal of viewpoint impartiality in Wikipedia articles.
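For readers who want to see the shape of that pipeline, here is a minimal Python sketch of the steps Rozado describes: find mentions of a target term, pull the surrounding text, sample the snippets, and average a sentiment score. It is not his code. The function names, the window size, and the tiny word lists are all my own hypothetical choices, and the keyword scorer is only a toy stand-in for the machine learning model he used for annotation.

```python
import random
import re

# Hypothetical word lists: a toy stand-in for the machine learning annotator.
POSITIVE = {"praised", "acclaimed", "respected"}
NEGATIVE = {"scandal", "criticized", "corrupt"}


def extract_snippets(text, term, window=60):
    """Return the text around each case-insensitive mention of `term`."""
    pattern = r"\b" + re.escape(term) + r"\b"
    snippets = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        snippets.append(text[start:end])
    return snippets


def toy_sentiment(snippet):
    """Score a snippet as +1, 0, or -1 by counting listed words."""
    words = re.findall(r"[a-z']+", snippet.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def mean_sentiment(texts, term, sample_size=100, seed=0):
    """Average the toy score over a random sample of snippets mentioning `term`."""
    snippets = [s for t in texts for s in extract_snippets(t, term)]
    if not snippets:
        return None  # the term never appears
    rng = random.Random(seed)
    sample = rng.sample(snippets, min(sample_size, len(snippets)))
    return sum(toy_sentiment(s) for s in sample) / len(sample)
```

Calling `mean_sentiment(articles, "Josh Hawley")` on a list of article texts would give one number per target term, which is roughly what each bar in his charts represents. A real replication would swap `toy_sentiment` for a proper sentiment model.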
For example, here is a list of journalists with Wikipedia’s sentiment toward each (I rotated the image 90 degrees to make the names easier to read). On the horizontal axis, left represents negative sentiment, right is positive:
Wikipedia’s most beloved journalist is New Yorker humorist Andy Borowitz.
I didn’t make the list, but I’d have to imagine by how hostile my Wikipedia entry is toward me that I’d score closer to Ann Coulter than Andy Borowitz.
Here’s Rozado’s graph (rotated) of recent US Senators:
Wikipedia hates Josh Hawley worst of all and adores former Senator Kamala Harris.
There’s no accounting for taste … except, in this case, it’s really easy to account for Wikipedia’s taste.
I find it odd that Rozado didn't combine what are obviously references to the same person, e.g. 'Senator McConnell' and 'Mitch McConnell'.
Perhaps this shows that calling him 'Mitch' tones down the rhetoric a bit -- maybe it helps hostile Wikipedia info mavens think of the ageing senator as a softer, more potentially-cuddly elderly man, rather than as a cold-blooded fascist ideological-assault turtle.
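Combining such duplicates would not be hard. A minimal Python sketch, assuming each term comes with a mean sentiment and a count of annotations (the alias table and the function name are hypothetical, not anything from Rozado's analysis):

```python
from collections import defaultdict

# Hypothetical alias table mapping variant terms to one canonical name.
ALIASES = {"Senator McConnell": "Mitch McConnell"}


def merge_aliases(scores, aliases):
    """Merge {term: (mean_sentiment, n_annotations)} by canonical name.

    The merged mean is weighted by the number of annotations per term.
    """
    totals = defaultdict(lambda: [0.0, 0])
    for term, (mean, n) in scores.items():
        name = aliases.get(term, term)
        totals[name][0] += mean * n
        totals[name][1] += n
    return {name: (total / n, n) for name, (total, n) in totals.items() if n > 0}
```

The weighting matters: a variant with only a handful of mentions should not pull the combined score as hard as one with thousands.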
Peggy Noonan shows up as positive, but her Wikipedia entry is most unflattering (and this is just the start of the “Criticism” section):
“While Noonan's speechwriting has been praised, her books and Wall Street Journal columns have been the source of criticism and mockery. Critics have singled out her reliance on personal anecdotes to make broad assertions about current events and changes in American politics and society.
During Hurricane Katrina, she called for looters in New Orleans to be shot. Henry Giroux called it a "barely coded rationale to shoot low-income Black people."”
https://en.m.wikipedia.org/wiki/Peggy_Noonan
How did she get classed as having a positive entry?
Greg Gutfeld also has a lengthy negative Criticism section in Wikipedia.
Andrea Tantaros (never heard of her) is shown as highly negative, yet Wikipedia has nothing bad to say about her.
I don’t think this is ready for prime time.