A new server for Authorship Highlighting data

We quietly rolled out a change to Wiki Education Dashboard’s Authorship Highlighting feature last week: it now gets its data from a new server on Wikimedia Cloud, the Wikimedia community’s cloud computing system. Nothing should look any different, except that according to our tests, the authorship data should load substantially faster.

The Authorship Highlighting feature can show which parts of a live Wikipedia article were contributed by the students in one of our Wikipedia Student Program courses or by participants in our Scholars & Scientists courses. It is built on top of a system called WikiWho, which was designed to analyze the entire history of Wikipedia articles and determine which edit and which editor originally added each word in a given article. The WikiWho system was initially created by a team of researchers led by Fabian Flöck, based on an algorithm developed by Flöck and Maribel Acosta, and was hosted by the GESIS research institute. Last year, Flöck announced that he would be leaving GESIS and that the data service would be shut down sometime in 2022. So I teamed up with Wikimedia Foundation software engineer (and sometime Dashboard contributor) MusikAnimal to get a replacement server ready to provide uninterrupted WikiWho data service.
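
To give a rough feel for what word-level attribution means, here is a toy Python sketch. It is not the WikiWho algorithm, which is considerably more sophisticated; it simply diffs each revision against the previous one with Python's standard difflib module and credits every newly appearing word to the editor of that revision. The function name attribute_tokens and the sample revisions are made up for illustration.

```python
from difflib import SequenceMatcher


def attribute_tokens(revisions):
    """Toy word-level attribution (an illustration, NOT the WikiWho algorithm).

    revisions: list of (editor, text) tuples in chronological order.
    Returns a list of (token, editor) pairs for the latest revision.
    """
    attributed = []  # (token, editor) pairs for the previous revision
    for editor, text in revisions:
        new_tokens = text.split()
        old_tokens = [token for token, _ in attributed]
        matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged words keep whoever was credited before.
                updated.extend(attributed[i1:i2])
            elif tag in ("replace", "insert"):
                # Words that are new in this revision go to its editor.
                updated.extend((token, editor) for token in new_tokens[j1:j2])
            # "delete": removed words simply drop out of the result.
        attributed = updated
    return attributed


# Hypothetical two-revision history of a one-sentence article.
history = [
    ("Alice", "the cat sat"),
    ("Bob", "the black cat sat down"),
]
for token, editor in attribute_tokens(history):
    print(f"{token}: {editor}")
```

In this sketch, Bob is credited only with the words he added, while the words Alice wrote stay attributed to her even though they now sit in Bob's revision. That is the kind of per-word answer the highlighting feature needs from its data source.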

Over the last few months, MusikAnimal and I worked through the process of learning about the WikiWho API’s Python codebase, provisioning and configuring a fresh server, and importing and processing the entire history of the five language versions of Wikipedia that WikiWho supports so far. We also coordinated with the WikiWho team to release the server software under an open source license and establish a new code repository for it that we can maintain. For now, this means the Dashboard’s Authorship Highlighting, along with the Wikimedia Community Tech team’s “Who Wrote That” tool, will continue working after the GESIS server shuts down. With the potential to add more storage space on Wikimedia Cloud, we’re also hopeful that we can expand support for these tools to more languages, which will be especially relevant for Programs & Events Dashboard users. (To date, only editors working on English, German, Spanish, Turkish, and Basque Wikipedias have been able to use the Authorship Highlighting feature. If you’d like us to prioritize support for your language, the best place to ask is on Wikimedia’s Phabricator bug tracker.)

I want to say thanks to Fabian, Maribel, Kenan Erdogan, Roberto Ulloa, Olga Zagovora, and everyone else who has been part of the WikiWho project.
