Randomized controlled experiments hint at Wikipedia’s huge real-world impact

I became a Wikipedian because of a belief that knowledge — and access to knowledge — matters. Wikipedia, more than anything else I could point to, offered a way to bring together and make sense of the sheer, overwhelming accumulation of human knowledge. Library stacks full of more books and journals than anyone could read in a hundred lifetimes! Surely this kind of intellectual connective tissue makes a difference! Until recently that was a matter of faith to me; no longer.

Three well-designed experiments from the last few years show some specific ways that Wikipedia has large, measurable effects in the real world — and hint at what I’ve long believed. When you improve Wikipedia, you can be confident that it’s reaching people, affecting what they think, what they write, and how they behave. The juice is worth the squeeze.


“I sometimes think that general and popular treatises are almost as important for the progress of science as original work.” – Charles Darwin, 1865.

The first of these experiments — focused on chemistry — is a dramatic illustration that Wikipedia matters not just for the general public, but for experts working in their own areas of specialization as well. Neil Thompson and his colleagues added a set of new chemistry articles to Wikipedia, choosing topics that appeared in graduate-level syllabi. After posting these new articles, they analyzed new chemistry research papers that were published in the months that followed. Even with a relatively small sample size — they posted just over 20 articles, and composed that many more as controls — the effect these Wikipedia articles had on new scientific literature was large and significant: “each Wikipedia article is influencing ~ 250 scientific articles”, and the references used in the Wikipedia articles saw on average a 91% boost in citations. Put another way, a single Wikipedia article has about half the impact of a published review article — packing far more punch on a per-word basis. Darwin — writing to T. H. Huxley in the quote above — was righter than he could have known, and today the feedback loop between “popular treatises” and original work is much tighter than he imagined.


“[Courts] should be bound down by strict rules and precedent […and] the records of those precedents must unavoidably swell to a very considerable bulk, and must demand long and laborious study to acquire a competent knowledge of them.” – Alexander Hamilton, Federalist No. 78 (1788)

In a recent follow-up experiment, “Trial by Internet”, Thompson worked with law scholars to show Wikipedia’s similar role in a very different context. This time the experimenters examined the invisible flows of information from Wikipedia to the Irish legal system, posting a set of 77 new Wikipedia articles about cases decided by the Supreme Court of Ireland. They analyzed subsequent decisions out of Ireland’s lower courts, finding that the cases with Wikipedia articles were 21% more likely to be cited as precedents and that lower court decisions drew on the Wikipedia articles in framing these precedents and their meaning. With more than two centuries running the kind of court system Hamilton was describing, we’re far past the point where even the longest and most laborious study is enough to overcome the challenge of taming that bulk of precedent. It seems that Wikipedia plays a big (if unacknowledged) role in keeping this knowledge system running. (The same is surely true of every field of serious intellectual endeavor.)


Chemists and judges, of course, aren’t the only people who use Wikipedia. Another elegant randomized controlled experiment shows what you might have already guessed: Wikipedia content can have a big effect on the behavior of the general public. In their paper “Wikipedia Matters”, a team of economists explored this idea in the realm of tourism. Taking advantage of unusually detailed public data available for hotels in Spain, these experimenters translated Wikipedia coverage of small Spanish cities into the corresponding pages on German, Italian, French and Dutch Wikipedias. For each city, they improved only two language versions of its article, leaving the other two as controls. Then they tallied how many tourists from Germany, Italy, France, or the Netherlands stayed in hotels in each city. Despite a relatively small intervention (typically adding just a few paragraphs), they found an average 9% boost in hotel stays caused by Wikipedia improvements. For example, improving a city article on German Wikipedia resulted in more tourists from Germany visiting that Spanish city. For cities with very little Wikipedia coverage to begin with, the effect was even larger: as much as 33%. The researchers estimated that a good Wikipedia article about one of these cities is worth about €160,000 per year in tourism revenue. (Unfortunately, marketers came to recognize the economic value of Wikipedia coverage long before the public sector, and the Wikipedia community devotes a lot of its energy to keeping puffery and advertising out.)
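
For readers who like to see the logic spelled out, the tourism experiment's design (two language editions improved per city, the other two held back as controls) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the language codes, and the hotel-stay numbers are hypothetical, and the simple comparison of averages stands in for whatever estimation method the economists actually used.

```python
import random

# The four language editions named in the article (codes are illustrative).
LANGUAGES = ["de", "it", "fr", "nl"]


def assign_treatment(cities, rng):
    """For each city, randomly pick two editions to improve.

    Returns {city: {language: True if improved, False if control}}.
    """
    assignment = {}
    for city in cities:
        treated = set(rng.sample(LANGUAGES, 2))
        assignment[city] = {lang: lang in treated for lang in LANGUAGES}
    return assignment


def relative_lift(stays, assignment):
    """Relative difference in mean hotel stays, treated vs. control.

    `stays` maps (city, language) to a count of hotel stays by tourists
    from the country matching that language edition.
    """
    treated, control = [], []
    for city, langs in assignment.items():
        for lang, is_treated in langs.items():
            (treated if is_treated else control).append(stays[(city, lang)])
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return mean_treated / mean_control - 1


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the assignment is reproducible
    cities = ["City A", "City B", "City C"]  # placeholder names
    assignment = assign_treatment(cities, rng)

    # Made-up outcome data: control pairs get 100 stays, improved pairs 109.
    stays = {
        (city, lang): 109 if is_treated else 100
        for city, langs in assignment.items()
        for lang, is_treated in langs.items()
    }
    print(f"Estimated lift: {relative_lift(stays, assignment):.1%}")
```

Because the coin flip happens within each city, every city serves as its own control: whatever makes a town attractive in general affects all four tourist groups, so any remaining gap between the improved and unimproved editions can be attributed to the Wikipedia edits.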

These three experiments are special in how they provide causal evidence for Wikipedia’s influence on specific domains, but the same invisible causal mechanisms are at work for almost any topic area that Wikipedia covers (or could cover). Experiments like these take advantage of content gaps on Wikipedia — topics with so little existing coverage that modest interventions provide a high “signal” compared with the baseline of what’s already on Wikipedia. But extant Wikipedia content is having ubiquitous real-world effects as well; we just can’t measure them as easily. When Wikipedia content is bad or missing, the world is tangibly worse for it. And like so many public goods, Wikipedia is systematically under-provisioned. But conversely — as these experiments have shown — a little bit of investing in Wikipedia can go a long way. (We can help.)

