Ongoing: Wikidata Cleanup 2024 - Romaine continues his initiative, "Wikidata Cleanup," to coordinate community efforts in addressing the problem of items missing basic properties during the last ten days of 2024, when many users have extra time due to holidays. The aim is to improve data quality by focusing on ensuring all items have essential properties like "instance of" (P31) or "subclass of" (P279), adding relevant country and location data, and maintaining consistency within item series.
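For anyone joining the cleanup with their own scripts, the check described above (does an item have "instance of" (P31) or "subclass of" (P279)?) can be sketched in a few lines. This is only a minimal illustration, not the tooling the initiative itself uses: it assumes each entity is a dict in the shape of Wikidata's entity JSON, with a `claims` mapping keyed by property ID, and the function names and sample data are hypothetical.

```python
from typing import Any

# P31 = "instance of", P279 = "subclass of" (the two properties named above).
BASIC_PROPERTIES = ("P31", "P279")


def lacks_basic_properties(entity: dict[str, Any]) -> bool:
    """Return True if the entity has no P31 and no P279 statements."""
    claims = entity.get("claims", {})
    return not any(claims.get(pid) for pid in BASIC_PROPERTIES)


def items_needing_cleanup(entities: list[dict[str, Any]]) -> list[str]:
    """Collect the IDs of entities that lack both basic properties."""
    return [e["id"] for e in entities if lacks_basic_properties(e)]


# Hypothetical sample data; the IDs and statements are placeholders.
sample = [
    {"id": "Q1", "claims": {"P31": [{"rank": "normal"}]}},
    {"id": "Q2", "claims": {"P17": [{"rank": "normal"}]}},
    {"id": "Q3", "claims": {}},
]
print(items_needing_cleanup(sample))
```

The sketch only inspects data that has already been loaded; fetching the entities (from a dump or an API) is left out on purpose.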
Upcoming events: Data Reuse Days - online event focusing on projects using Wikidata's data, 18-27 February 2025. You can submit a proposal for the program on the talk page until January 12th.
Press, articles, blog posts, videos
Blogs
Exploring YouTube Channels Via Wikidata, by Tara Calishain. "This time I'm playing with a way to browse YouTube channels while using Wikidata as context. And you can try it too, because it doesn't need any API keys!"
Flying Dehyphenator is an Ordia game. Given the start part of a word, use the spacebar to move the word and hit the next part of the word. Only hyphenations described with the Unicode hyphenation character work.
Want a wrap of your Wikidata activities in 2024? Wiki Year In Review has it for you! (use www.wikidata.org for the project URL)
Other Noteworthy Stuff
Wikibase/Suite-Contributing-Guide: Wikibase Suite's contributing guide has been published. This guide aims to help anyone who wants to contribute and make sure they are equipped with all the relevant information to do so.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
The best way to reach consensus on the article's eligibility is to provide reliable, independent secondary sources. If you cannot find such sources, the article is probably not eligible.
Keep in mind that Wikipedia's founding principles do not guarantee any right to have an article on Wikipedia.
Here's your quick overview of what has been happening around Wikidata in the week leading up to 2024-12-30. Missed the previous one? See issue #659
Welcome to 2024’s Final Weekly Summary!
A huge thank you to everyone who contributed to the newsletter this year! 🎉 Each of your contributions, whether big or small, has made a difference and has helped us create a vibrant and informative resource for the Wikidata community. 🙏 Let's continue building and sharing knowledge together in the coming year! 🙌✨
Discussions
Open request for oversight: Ameisenigel (RfP scheduled to end on 6 January 2025 at 21:52 UTC)
Press, articles, blog posts, videos
Papers
Library Data in Wikimedia Projects: Case Study from the Czech Republic by Jansová, L., Maixnerová, L., & Šťastná, P. (2024). "The paper outlines the collaboration between the National Library of the Czech Republic and Wikimedia since 2006, focusing on linking authority records with Wikipedia articles and training librarians and users. By 2023, the National Library provided most of its databases under a CC0 license, launched a "Wikimedians in Residence" program, and collaborated on projects involving linked data and using authority records in Wikidata. This partnership has enhanced their cooperation for mutual benefit, identifying key factors for their successful long-term collaboration."
How have you modelled my gender? Reconstructing the history of gender representation in Wikidata by Melis, B., Fioravanti, M., Paolini, C., & Metilli, D. (2024). "The paper traces the evolution of gender representation in Wikidata, showing how the community has moved from a binary interpretation of gender to a more inclusive model for trans and non-binary identities. The Wikidata Gender Diversity project (WiGeDi) timeline highlights the significant changes influenced by external historical events and the community's increased understanding of gender complexity."
Videos: Arabic Wikidata Days 2024 - Data Science Course - First Practical Session: Wikibase-CLI Tool (part 1, part 2) by Saeed Habishan. "The Wikibase-CLI enables command-based interaction with Wikidata using shell scripts and JavaScript. The tool runs on NodeJS and enables automatic reading and editing of Wikidata."
Tool of the week
WikiORA - is a tool designed for gene over-representation analysis. It integrates data from Wikidata, Wikipedia, Gene Ontology, and PanglaoDB to help researchers identify significantly enriched gene sets in their data.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
Newest WikiProjects: Uganda - aims to be a central hub for the curation of all items (biographical, cultural, geographical, organisational, etc.) relating to Uganda (Q1036)
WikiProject Highlights:
Narration/Folktales - creation of Items for motifs described in Thompson's motif index completed
Austria - concerns itself with improving data from nonprofit organizations in Austria
Showcase Lexemes: ਲੇਟਣ (L750580) - in Punjabi (pa) and "لیٹݨ" in Punjabi Shahmukhi (pnb) transliterate to "Leṭaṇ," which means "to lie down" or "to rest" in English.
Development
Most of the development team staff are still taking a break, so no development happened.
Here's your quick overview of what has been happening around Wikidata in the week leading up to 2025-01-06. Missed the previous one? See issue #660
Discussions
New request for comments: Constraints for Germanies - Following from a property discussion on P17 (German non-states), this RfC aims to find consensus on how to apply constraints that exclude items of historical periods in German history.
Please submit your proposals for the Data Reuse Days online event by January 12th. See the current proposals on the talk page. Here are some ideas to inspire you: presentations or demos of tools using Wikidata's data (10-minute lightning talks), discussions and presentations connecting Wikidata editors with reusers, and explanations or demos of how to use a specific part of the technical infrastructure to reuse Wikidata's data (APIs, dumps, etc.).
Talk to the Search Platform / Query Service Team - January 8, 2025. The Search Platform Team holds monthly meetings to discuss anything related to Wikimedia search, Wikidata Query Service (WDQS), Wikimedia Commons Query Service (WCQS), etc. Time: 16:00-17:00 UTC / 08:00 PST / 11:00 EST / 17:00 CET
The next Wikidata+Wikibase office hours will take place on Wednesday, 17:00 UTC, 15th January 2025 (18:00 Berlin time) in the Wikidata Telegram group. The Wikidata and Wikibase office hours are online events where the development team presents what they have been working on over the past quarter, and the community is welcome to ask questions and discuss important issues related to the development of Wikidata and Wikibase.
Blogs: (fr) female authors with male pseudonyms, blog post by Le Deuxième Texte including SPARQL queries to find female authors with male pseudonyms.
Websites: Global Dementia and Risk Factors, a website by students at the Maastricht Science Programme, includes data visualizations of the prevalence and current treatments of dementia across the world. It uses data retrieved from Wikidata through its SPARQL endpoint.
Papers
Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema - This paper proposes an ontology-driven approach to KG construction using LLMs where competency questions guide ontology creation and relation extraction, leveraging Wikidata for semantic consistency. A scalable pipeline minimizes human effort while producing high-quality, interpretable KGs interoperable with Wikidata for knowledge base expansion. By Xiaohan Feng, Xixin Wu & Helen Meng (2024).
Knowledge Incorporated Image Question Answering Using Wikidata Repository - Proposes a Visual Question Answering (VQA) model that integrates external knowledge from Wikidata to address complex open-domain questions by combining image, question, and knowledge modalities. Evaluated on the VQAv2 dataset, the model outperforms prior state-of-the-art approaches, demonstrating improved reasoning and accuracy (Koshti et al., 2024).
Videos: (Arabic) Part 6: SPARQL Demo Session: connecting external services - The SPARQL SERVICE clause gives access to additional data such as labels via wikibase:label, interaction with MediaWiki APIs using wikibase:mwapi, and integration of data from subgraphs (such as the main graph and the scholarly articles graph). The session also covers integrating data from external SPARQL endpoints such as DBpedia.
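To make the SERVICE clause mentioned in this session more concrete, here is a small sketch that assembles a query using the wikibase:label service. It is not taken from the video: the helper name `build_label_query`, its parameters, and the choice of wdt:P31 as the filter are assumptions made for illustration, and the sketch only builds the query text without sending it to any endpoint.

```python
import re

# Entity IDs look like Q42; language codes like "en" or "pt-br".
_QID = re.compile(r"Q[1-9]\d*")
_LANG = re.compile(r"[A-Za-z]+(-[A-Za-z0-9]+)*")


def build_label_query(class_qid: str, language: str = "en", limit: int = 10) -> str:
    """Build a SPARQL query listing instances of a class with their labels.

    Hypothetical helper: inputs are validated so that only well-formed
    IDs and language codes are interpolated into the query text.
    """
    if not _QID.fullmatch(class_qid):
        raise ValueError(f"not a valid item ID: {class_qid!r}")
    if not _LANG.fullmatch(language):
        raise ValueError(f"not a valid language code: {language!r}")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f"  ?item wdt:P31 wd:{class_qid} .\n"
        # The label service fills ?itemLabel in the requested language.
        f'  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


print(build_label_query("Q5", language="ar", limit=5))
```

The printed text can be pasted into the Wikidata Query Service, where the wd:, wdt:, wikibase: and bd: prefixes are predefined.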
Tool of the week
Wikidata Entity Linker - is a Microsoft Edge browser extension that creates web links for matching inner HTML text based on a regex format of Q\d+ which is the format of a Wikidata Entity ID. (email)
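The idea behind the extension, matching text against Q\d+ and turning each match into a link, can be illustrated with a short sketch. This is not the extension's code (which runs in the browser); it is a Python approximation under stated assumptions: the function name `link_entity_ids` is hypothetical, word boundaries are added so that strings like "FAQ42" are not linked, and the link target is assumed to be the usual https://www.wikidata.org/wiki/ page for the ID.

```python
import html
import re

# Q followed by digits, as a whole word (the \b anchors are an addition
# to the plain Q\d+ format so that "FAQ42" is left alone).
ENTITY_ID = re.compile(r"\bQ\d+\b")


def link_entity_ids(text: str) -> str:
    """Escape plain text for HTML and wrap each entity ID in a link."""
    escaped = html.escape(text)
    return ENTITY_ID.sub(
        lambda m: f'<a href="https://www.wikidata.org/wiki/{m.group(0)}">{m.group(0)}</a>',
        escaped,
    )


print(link_entity_ids("Uganda is Q1036; see also FAQ42."))
```

Escaping before substitution keeps the inserted anchor tags intact while neutralising any markup in the input text.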
Other Noteworthy Stuff
Vacancy: Research Software Engineer / Wikibase-Expert - The Technische Informationsbibliothek (TIB) located in Hannover has a research position open for someone interested in the deployment, administration and maintenance of open source knowledge management software such as Mediawiki, Wikibase and OpenRefine as part of the NFDI4Culture partnership within the OSL.
January 1, 2025, marked Public Domain Day, with hundreds of 1929 films entering the public domain. Sandra has shared helpful notes to assist in making these films discoverable via WikiFlix by adding video files to Wikimedia Commons and linking them on Wikidata. Join the effort!
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
Join the Wikidata Training Event 2025 organised by Wikimedia Botswana UG for Wikidata enthusiasts of all levels. Starts 18 Jan 10:00am CAT (UTC+2), registration required.
Wikidata module for the Hidden Figures CURE - The newly published Wikidata module for the Hidden Figures CURE teaches undergraduates to use Wikidata for uncovering and highlighting the contributions of hidden figures in natural history, such as women, people of color, and Indigenous peoples.
Memory of the World: Ways forward - Efforts to improve the representation of UNESCO's Memory of the World (MOW) international register on Wikidata include new articles, enhanced data quality, and training on creating structured data. Key contributions involve updating Wikipedia and Wikidata entries, addressing data inconsistencies, and expanding the visibility of MOW inscriptions across languages.
Public domain visibility on Wikidata (in Catalan). The article discusses how Wikidata is being used to enhance the visibility of public domain works by integrating copyright information and making it easily accessible.
Presentations: Wikibase e Wikidata per lo studio dell'epigrafia greca (in Italian, i.e. Wikibase and Wikidata for the study of Greek epigraphy), presentation at SAEG (Advanced Seminar of Greek Epigraphy) IX in Rome, 10 January 2025, by Pietro Ortimini, Anna Clara Maniero Azzolini, Epìdosis - slides
Tool of the week
Dungeon Of Knowledge - is a roguelike game with Items generated from Wikidata that lets you crawl through the Dungeon of Knowledge in a classic ASCII interface. (toot) (blog)
VIAF (cf. Q54919 and P214) underwent a significant interface change on January 10. The way clusters are presented in JSON format no longer matches the current OCLC documentation, and URLs such as http://viaf.org/viaf/102333412/viaf.json no longer work. This broke most or all Wikidata gadgets that use VIAF data. In the absence of official communication from OCLC, developers are trying to determine whether the new VIAF interface is stable before changing their gadgets accordingly.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage Site: its type (Cultural, Natural, Mixed))
Entry height (Height of the entrance above ground level for boarding public transport vehicles.)
location code (the location code of the location item. Should be used with the qualifier property P459 to specify which location code system is being used.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
Edit-A-Thon for Black History Month: 12 February, 13:00-15:00 MST (UTC-7), is an onsite event at the University of Colorado Boulder, with a theme of adding or expanding items on Black and African-American comics creators.
Data Reuse Days 2025 is from February 18 to 27, 2025! This is an online event focusing on how people and organizations use Wikidata's data to build interesting applications and tools. Don't forget to register so we can know you are coming.
Past: Missed the Q1 Wikidata+Wikibase office hour? You can catch up by reading the session log here: 2025-01-15 (Q1 2025)
Press, articles, blog posts, videos
Blogs: Cleaning up legacy Wikipedia links in Open Library: The blog post discusses cleaning up outdated Wikipedia links to improve article accuracy and navigation, while highlighting the importance of integrating Wikidata for better data management.
Tracking Looted Art with Wikidata Queries - As part of Art History Loves Wiki 25, Laurel Zuckerman will show how Wikidata SPARQL queries can help provenance researchers and historians find, identify and track looted art.
OpenStreetMap and Wikidata in Disaster Times: Ormat Murat Yilmaz will speak on how Wikidata and OSM help coordinate relief efforts by offering a collaborative platform for data about affected areas. Part of WM CEE meeting 2024 Istanbul.
Serbian Novels on Wikidata: Presented by Filip Maljkovič on the progress and process of adding Serbian literature into Wikidata, using OCR methods to map pages and assign Properties.
(German) Wikidata for NGOs: Use and network open data sensibly: Johan Hoelderle discusses how nonprofits can benefit from the largest free knowledge base and shows what potential open data offers for non-profit projects.
Data partnerships and Libraries combating misinformation: WMDE's Alan Ang delivers a speech on how GLAM institutions can help prevent the spread of dis- and misinformation, whether it stems from AI hallucinations or malicious intent, as part of the Wikimedia+Libraries International Convention 2025.
Product Manager: Wikibase Suite: Wikimedia Deutschland is looking for a PM to lead Wikibase Suite, empowering institutions like GLAMs and research groups to build customizable linked knowledge bases and contribute to the world’s largest open data graph.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage Site: its type (Cultural, Natural, Mixed))
location code (the location code of the location item. Should be used with the qualifier property P459 to specify which location code system is being used.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
We’re making good progress on checking format constraints more efficiently and with fewer errors (T380751)
We’re working on making distinct-values constraint checks work with the split Query Service (T369079)
EntitySchemas: We’re working on making the heading on EntitySchema pages apply language fallback (T228423)
Search: We’ve started working on the new search UI component which will let you search for additional entity types from the main search bar and not just Items anymore (T338483)
Wikibase REST API: We're working on adding search to the API (T383209)