Wikidata is 8! Let’s celebrate! In eight short years Wikidata has grown from a central repository for all language versions of Wikipedia into an increasingly essential fixture of the internet.
Wikidata endeavors to represent the world’s knowledge as linked data. Linked data is the connective tissue that brings life and context to concepts, web pages, research, and anything else you can imagine across the internet. Many museums, libraries, and other cultural institutions are growing more interested in linked data because it is the future of data on the internet. It stands in contrast to tabular data or relational databases in many ways, but even if you’re unfamiliar with information architecture, there is a lot you can already appreciate about Wikidata. For your reading pleasure, here are eight characteristics to celebrate Wikidata:
First: Wikidata is multilingual. Having started as a project to connect all language versions of Wikipedia, Wikidata makes it easy to view and edit in any language that exists on Wikipedia. This is great not only for consuming data, but also for contributing in the language of your preference (and, hey, if you can speak multiple languages, we have projects for you!).
Second: Wikidata is a hub of identifiers. What does this mean? Well, if you’re talking about a specific person or topic, there may be a unique number associated with it in a local collection somewhere (if you deal with identities, you’ll know this as authority control). Wikidata gathers as many as possible of the unique numbers that refer to the same thing and associates them with that thing on Wikidata. In tracking thousands of identifiers, Wikidata helps with clarity and disambiguation, internet-wide.
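For the curious, here is a minimal Python sketch of what "a hub of identifiers" can look like in practice. It is a toy model, not Wikidata's actual storage. Q42 is the Wikidata item for Douglas Adams, P214 is the VIAF ID property, and P244 is the Library of Congress authority ID property; the identifier values and the function names are placeholders invented for this illustration.

```python
# Toy model of an identifier hub: each item lists the external identifiers
# that refer to it. The identifier VALUES below are placeholders, not real
# catalog numbers.
ITEM_IDENTIFIERS = {
    "Q42": {                    # Q42 = Douglas Adams on Wikidata
        "P214": "viaf-0001",    # P214 = VIAF ID (placeholder value)
        "P244": "loc-0001",     # P244 = Library of Congress authority ID (placeholder value)
    },
}


def build_reverse_index(items):
    """Map each (property, identifier value) pair back to its item ID."""
    index = {}
    for qid, identifiers in items.items():
        for prop, value in identifiers.items():
            index[(prop, value)] = qid
    return index


def same_entity(index, first, second):
    """Return True if two external identifiers resolve to the same item."""
    qid_first = index.get(first)
    qid_second = index.get(second)
    return qid_first is not None and qid_first == qid_second


index = build_reverse_index(ITEM_IDENTIFIERS)
print(same_entity(index, ("P214", "viaf-0001"), ("P244", "loc-0001")))
```

Two catalogs that have never heard of each other can both point at the same item, and that shared item is what makes disambiguation possible.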
Third: Wikidata supports an emergent ontology (ontology = the rules the database follows). If there’s a relationship missing on Wikidata, you can propose its inclusion! Many databases may adhere to very formal structures. Wikidata is adaptable, which is a trait that makes Wikidata expressive.
Fourth: Wikidata is expressive. Between multiple values, qualifiers, and ranks, Wikidata can express conflicting statements, vague representations, even incorrect data, and still not break or breed confusion. Having a system that is this expressive can help make information more explicit and improve data quality system-wide.
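To make multiple values, qualifiers, and ranks a little more concrete, here is a small Python sketch. Wikidata's three ranks are preferred, normal, and deprecated; the `Statement` class, the `best_statements` helper, and the population figures are all made up for illustration and are not Wikidata's real data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Statement:
    """A simplified statement: a value, a rank, and optional qualifiers."""
    value: str
    rank: str = "normal"  # one of "preferred", "normal", "deprecated"
    qualifiers: dict = field(default_factory=dict)


def best_statements(statements):
    """Return preferred statements if any exist, otherwise the normal ones.

    Deprecated statements are never returned, but they stay in the data.
    """
    preferred = [s for s in statements if s.rank == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.rank == "normal"]


# Hypothetical population figures for an imaginary town.
population = [
    Statement("1000", "normal", {"point in time": "2010"}),
    Statement("1200", "preferred", {"point in time": "2020"}),
    Statement("9999", "deprecated", {"reason": "transcription error"}),
]

for statement in best_statements(population):
    print(statement.value, statement.qualifiers)
```

The older figure and the known-wrong figure both remain on record. Ranks simply tell a reader (or a program) which value to reach for first.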
Fifth: Wikidata is free and open. Is your collection incomplete? Check Wikidata to see if it can fill in the gaps. Download all of it. Or some of it. Or none of it. Use the Wikidata query service to get exactly what you want from Wikidata. You can do as much as you want. Data in Wikidata is licensed CC0, which means you can use it as you see fit…for free!
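If you would like to see what asking the Query Service for "exactly what you want" can look like, here is a hedged Python sketch that only builds a request URL and does not send it. The endpoint `https://query.wikidata.org/sparql` is the public SPARQL endpoint; the sample query (ten items that are an instance of house cat, using P31 and Q146) and the `build_query_url` helper are illustrative choices, not part of the original post.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

# Sample query: ten items that are an instance of (P31) house cat (Q146),
# with English labels.
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
"""


def build_query_url(sparql, endpoint=ENDPOINT):
    """Build a GET URL that asks the endpoint for JSON results."""
    params = {"query": sparql.strip(), "format": "json"}
    return endpoint + "?" + urlencode(params)


print(build_query_url(QUERY))
```

Actually fetching that URL needs network access, and it is good practice to send a descriptive User-Agent header with the request; both are left out here to keep the sketch short.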
Sixth: Wikidata is visual! Using the Query Service, you can represent parts of Wikidata using images, maps, graphs, all with a quick click. Enamored with a data visualization? You can also embed it on a webpage for all to see!
Seventh: Wikidata plays well with machines (and humans!). Being machine readable has implications for teaching artificial intelligence, testing algorithms, and doing data science analyses, but it also means batch edits are possible. If you know what you’re doing, you can make some substantial changes to many things pretty quickly. And most important…
Eighth: Wikidata is changing. Underrepresentation on Wikidata is a real issue. Systemic bias is real. Wikidata is incomplete. Wikidata needs a diverse, well-represented community to begin to address these systemic issues. There’s room for it to grow and improve. We should feel an urgent obligation to improve these shortcomings and engage with these issues that pervade our historical and current representation of knowledge. Even though it’s flawed in its current state, there’s no better time to start contributing and join this community. Every item, every collection helps. Everyone makes Wikidata better.
Happy birthday, Wikidata. Looking forward to everything you’ll learn this year.
Interested in learning more? Join an upcoming virtual Wikidata course!