Ongoing: Wikidata Cleanup 2024 - Romaine continues his initiative, "Wikidata Cleanup," to coordinate community efforts in addressing the problem of items missing basic properties during the last ten days of 2024, when many users have extra time due to holidays. The aim is to improve data quality by focusing on ensuring all items have essential properties like "instance of" (P31) or "subclass of" (P279), adding relevant country and location data, and maintaining consistency within item series.
Upcoming events: Data Reuse Days - online event focusing on projects using Wikidata's data, 18-27 February 2025. You can submit a proposal for the program on the talk page until January 12th.
Press, articles, blog posts, videos
Blogs
Exploring YouTube Channels Via Wikidata, by Tara Calishain. "This time I'm playing with a way to browse YouTube channels while using Wikidata as context. And you can try it too, because it doesn't need any API keys!"
Flying Dehyphenator is an Ordia game. Given the start part of a word, use the spacebar to move the word and hit the next part of the word. Only hyphenations described with the Unicode hyphenation character work.
Want a wrap of your Wikidata activities in 2024? Wiki Year In Review has it for you! (use www.wikidata.org for the project URL)
Other Noteworthy Stuff
Wikibase/Suite-Contributing-Guide: Wikibase Suite's contributing guide has been published. This guide aims to help anyone who wants to contribute and make sure they are equipped with all the relevant information to do so.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
Here's your quick overview of what has been happening around Wikidata in the week leading up to 2024-12-30. Missed the previous one? See issue #659
Welcome to 2024’s Final Weekly Summary!
A huge thank you to everyone who contributed to the newsletter this year! 🎉 Each of your contributions, whether big or small, has made a difference and has helped us create a vibrant and informative resource for the Wikidata community. 🙏 Let's continue building and sharing knowledge together in the coming year! 🙌✨
Discussions
Open request for oversight: Ameisenigel (RfP scheduled to end at 6 January 2025 21:52 UTC)
Press, articles, blog posts, videos
Papers
Library Data in Wikimedia Projects: Case Study from the Czech Republic by Jansová, L., Maixnerová, L., & Šťastná, P. (2024). "The paper outlines the collaboration between the National Library of the Czech Republic and Wikimedia since 2006, focusing on linking authority records with Wikipedia articles and training librarians and users. By 2023, the National Library provided most of its databases under a CC0 license, launched a "Wikimedians in Residence" program, and collaborated on projects involving linked data and using authority records in Wikidata. This partnership has enhanced their cooperation for mutual benefit, identifying key factors for their successful long-term collaboration."
How have you modelled my gender? Reconstructing the history of gender representation in Wikidata by Melis, B., Fioravanti, M., Paolini, C., & Metilli, D. (2024). "The paper traces the evolution of gender representation in Wikidata, showing how the community has moved from a binary interpretation of gender to a more inclusive model for trans and non-binary identities. The Wikidata Gender Diversity project (WiGeDi) timeline highlights the significant changes influenced by external historical events and the community's increased understanding of gender complexity."
Videos: Arabic Wikidata Days 2024 - Data Science Course - First Practical Session: Wikibase-CLI Tool (part 1, part 2) by Saeed Habishan. "The Wikibase-CLI enables command-based interaction with Wikidata using shell scripts and JavaScript. The tool runs on NodeJS and enables automatic reading and editing of Wikidata."
Tool of the week
WikiORA - is a tool designed for gene over-representation analysis. It integrates data from Wikidata, Wikipedia, Gene Ontology, and PanglaoDB to help researchers identify significantly enriched gene sets in their data.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
Newest WikiProjects: Uganda - aims to be a central hub for the curation of any and all items (biographical, cultural, geographical, organisational, etc...) relating to Uganda (Q1036)
WikiProject Highlights:
Narration/Folktales - creation of Items for motifs described in Thompson's motif index completed
Austria - concerns itself with improving data from nonprofit organizations in Austria
Showcase Lexemes: ਲੇਟਣ (L750580) - "ਲੇਟਣ" in Punjabi (pa) and "لیٹݨ" in Punjabi Shahmukhi (pnb) both transliterate to "Leṭaṇ", which means "to lie down" or "to rest" in English.
Development
Most of the development team staff are still taking a break, so no development happened.
Here's your quick overview of what has been happening around Wikidata in the week leading up to 2025-01-06. Missed the previous one? See issue #660
Discussions
New request for comments: Constraints for Germanies - Following from a property discussion on P17 (German non-states), this RfC aims to find consensus on how to apply constraints that exclude items of historical periods in German history.
Please submit your proposals for the Data Reuse Days online event by January 12th. See current proposals on the talk page. Here are some ideas to inspire you: presentations or demos of tools using Wikidata's data (10-minute lightning talks), discussions and presentations connecting Wikidata editors with reusers, and explanations or demos of how to use a specific part of the technical infrastructure to reuse Wikidata's data (APIs, dumps, etc.).
Talk to the Search Platform / Query Service Team - January 8, 2025. The Search Platform Team holds monthly meetings to discuss anything related to Wikimedia search, Wikidata Query Service (WDQS), Wikimedia Commons Query Service (WCQS), etc. Time: 16:00-17:00 UTC / 08:00 PST / 11:00 EST / 17:00 CET
The next Wikidata+Wikibase office hours will take place on Wednesday, 17:00 UTC, 15th January 2025 (18:00 Berlin time) in the Wikidata Telegram group. The Wikidata and Wikibase office hours are online events where the development team presents what they have been working on over the past quarter, and the community is welcome to ask questions and discuss important issues related to the development of Wikidata and Wikibase.
Blogs: (fr) female authors with male pseudonyms, blog post by Le Deuxième Texte including SPARQL queries to find female authors with male pseudonyms.
Websites: Global Dementia and Risk Factors, a website by students at the Maastricht Science Programme, includes data visualizations of the prevalence and current treatments of dementia across the world. It uses data retrieved from Wikidata through its SPARQL endpoint.
Papers
Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema - This paper proposes an ontology-driven approach to KG construction using LLMs where competency questions guide ontology creation and relation extraction, leveraging Wikidata for semantic consistency. A scalable pipeline minimizes human effort while producing high-quality, interpretable KGs interoperable with Wikidata for knowledge base expansion. By Xiaohan Feng, Xixin Wu & Helen Meng (2024).
Knowledge Incorporated Image Question Answering Using Wikidata Repository - Proposes a Visual Question Answering (VQA) model that integrates external knowledge from Wikidata to address complex open-domain questions by combining image, question, and knowledge modalities. Evaluated on the VQAv2 dataset, the model outperforms prior state-of-the-art approaches, demonstrating improved reasoning and accuracy (Koshti et al., 2024).
Videos: (Arabic) Part 6: SPARQL Demo Session: connecting external services - The SPARQL SERVICE clause gives access to additional data such as labels via wikibase:label, interaction with MediaWiki APIs via wikibase:mwapi, integration of data from subgraphs (such as the main graph and the scholarly articles graph), and integration of data from external SPARQL endpoints such as DBpedia.
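To make the SERVICE patterns from the session a little more concrete, here is a minimal sketch that builds (but does not send) a query combining the label service with a federated call to DBpedia. The helper names `build_query` and `build_url`, the example item, and the DBpedia triple patterns are assumptions chosen for illustration; they are not taken from the video, and the query has not been verified against the live endpoints.

```python
import re
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(item_id: str, language: str = "en") -> str:
    """Return a SPARQL query string that uses two SERVICE clauses."""
    # Reject anything that is not a plain Q-identifier before interpolating it.
    if not re.fullmatch(r"Q\d+", item_id):
        raise ValueError(f"not a Wikidata item ID: {item_id!r}")
    return f"""
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?item ?itemLabel ?abstract WHERE {{
  BIND(wd:{item_id} AS ?item)
  # Label lookup: fills ?itemLabel in the requested language.
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}". }}
  # Federated call to an external endpoint (assumed patterns for DBpedia).
  OPTIONAL {{
    SERVICE <https://dbpedia.org/sparql> {{
      ?resource owl:sameAs ?item ;
                dbo:abstract ?abstract .
      FILTER(LANG(?abstract) = "{language}")
    }}
  }}
}}
""".strip()


def build_url(query: str) -> str:
    """Return a GET URL for the query, asking for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

Calling `build_url(build_query("Q42"))` yields a URL that can be pasted into a browser or passed to an HTTP client of your choice.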
Tool of the week
Wikidata Entity Linker - is a Microsoft Edge browser extension that creates web links for matching inner HTML text based on a regex format of Q\d+ which is the format of a Wikidata Entity ID. (email)
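The matching idea behind such a linker can be sketched in a few lines. This is not the extension's code: the function name `link_entity_ids` is hypothetical, and the word boundaries (`\b`) are an addition on top of the plain `Q\d+` pattern described above, so that strings like "FAQ42" are left alone.

```python
import re

# \b keeps the pattern from matching inside longer tokens such as "FAQ42".
ENTITY_ID = re.compile(r"\bQ\d+\b")


def link_entity_ids(text: str) -> str:
    """Wrap every Q-identifier in an HTML link to its Wikidata page."""
    def to_link(match: re.Match) -> str:
        qid = match.group(0)
        return f'<a href="https://www.wikidata.org/wiki/{qid}">{qid}</a>'

    return ENTITY_ID.sub(to_link, text)
```

A real browser extension would apply this to text nodes only, to avoid rewriting IDs that already sit inside links or attributes.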
Other Noteworthy Stuff
Vacancy: Research Software Engineer / Wikibase-Expert - The Technische Informationsbibliothek (TIB) located in Hannover has a research position open for someone interested in the deployment, administration and maintenance of open source knowledge management software such as MediaWiki, Wikibase and OpenRefine as part of the NFDI4Culture partnership within the OSL.
January 1, 2025, marked Public Domain Day, with hundreds of 1929 films entering the public domain. Sandra has shared helpful notes to assist in making these films discoverable via WikiFlix, by adding video files to Wikimedia Commons and Wikidata. Join the effort!
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nonprofit tax status (country specific tax status of organisations like non-profits)
The next Wikidata+Wikibase office hours will take place on Wednesday, 17:00 UTC, 15th January 2025 (18:00 Berlin time) in the Wikidata Telegram group. The Wikidata and Wikibase office hours are online events where the development team presents what they have been working on over the past quarter, and the community is welcome to ask questions and discuss important issues related to the development of Wikidata and Wikibase.
Join the Wikidata Training Event 2025 organised by Wikimedia Botswana UG for Wikidata enthusiasts of all levels. Starts 18 Jan 10:00am CAT (UTC+2), registration required.
Wikidata module for the Hidden Figures CURE - The newly published Wikidata module for the Hidden Figures CURE teaches undergraduates to use Wikidata for uncovering and highlighting the contributions of hidden figures in natural history, such as women, people of color, and Indigenous peoples.
Memory of the World: Ways forward - Efforts to improve the representation of UNESCO's Memory of the World (MOW) international register on Wikidata include new articles, enhanced data quality, and training on creating structured data. Key contributions involve updating Wikipedia and Wikidata entries, addressing data inconsistencies, and expanding the visibility of MOW inscriptions across languages.
Public domain visibility on Wikidata (in Catalan). The article discusses how Wikidata is being used to enhance the visibility of public domain works by integrating copyright information and making it easily accessible.
Presentations: Wikibase e Wikidata per lo studio dell'epigrafia greca (in Italian, i.e. Wikibase and Wikidata for the study of Greek epigraphy), presentation at SAEG (Advanced Seminar of Greek Epigraphy) IX in Rome, 10 January 2025, by Pietro Ortimini, Anna Clara Maniero Azzolini, Epìdosis - slides
Tool of the week
Dungeon Of Knowledge - is a roguelike game with Items generated from Wikidata that lets you crawl through the Dungeon of Knowledge in a classic ASCII interface. (toot) (blog)
VIAF (cf. Q54919 and P214) underwent a significant interface change on January 10. The way clusters are presented in JSON format no longer matches the current OCLC documentation, and e.g. http://viaf.org/viaf/102333412/viaf.json no longer works. This broke most or all Wikidata gadgets using VIAF data. In the absence of official communication from OCLC, developers are trying to determine whether the new VIAF interface is stable before changing their gadgets accordingly.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage site: the type (Cultural, Natural, Mixed))
Entry height (Height of the entrance above ground level for boarding public transport vehicles.)
location code (the location code of the location item. Should be used with qualifier property P459 to specify which location code system is being used.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
Edit-A-Thon for Black History Month: 12 February 1300 - 1500 MST (UTC-7) is an onsite event at the University of Colorado Boulder, with a theme to add or expand items on Black and African-American comics creators.
Data Reuse Days 2025 is from February 18 to 27, 2025! This is an online event focusing on how people and organizations use Wikidata's data to build interesting applications and tools. Don't forget to register so we can know you are coming.
Past: Missed the Q1 Wikidata+Wikibase office hour? You can catch up by reading the session log here: 2025-01-15 (Q1 2025)
Press, articles, blog posts, videos
Blogs: Cleaning up legacy Wikipedia links in Open Library: The blog post discusses cleaning up outdated Wikipedia links to improve article accuracy and navigation, while highlighting the importance of integrating Wikidata for better data management.
Tracking Looted Art with Wikidata Queries - As part of Art History Loves Wiki 25, Laurel Zuckerman will show how Wikidata SPARQL queries can help provenance researchers and historians find, identify and track looted art.
OpenStreetMap and Wikidata in Disaster Times: Ormat Murat Yilmaz will speak on how Wikidata and OSM help coordinate relief efforts by offering a collaborative platform for data about affected areas. Part of WM CEE meeting 2024 Istanbul.
Serbian Novels on Wikidata: Presented by Filip Maljkovič on the progress and process of adding Serbian literature into Wikidata, using OCR methods to map pages and assign Properties.
(German) Wikidata for NGOs: Use and network open data sensibly: Johan Hoelderle discusses how nonprofits can benefit from the largest free knowledge base and shows what potential open data offers for non-profit projects.
Data partnerships and Libraries combating misinformation: WMDE's Alan Ang delivers a speech on how GLAM institutions can help prevent the spread of dis- and misinformation, whether it comes from hallucinating AI or from malicious actors. Part of the Wikimedia+Libraries International Convention 2025.
Product Manager: Wikibase Suite: Wikimedia Deutschland is looking for a PM to lead Wikibase Suite, empowering institutions like GLAMs and research groups to build customizable linked knowledge bases and contribute to the world’s largest open data graph.
New General datatypes property proposals to review:
About box (Screenshot of the About Box of the respective software (contains important information such as authors, license, version number and year(s) and is included in almost every software))
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage site: the type (Cultural, Natural, Mixed))
location code (the location code of the location item. Should be used with qualifier property P459 to specify which location code system is being used.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
We’re making good progress on checking format constraints more efficiently and with fewer errors (T380751)
We’re working on making distinct-values constraint checks work with the split Query Service (T369079)
EntitySchemas: We’re working on making the heading on EntitySchema pages apply language fallback (T228423)
Search: We’ve started working on the new search UI component which will let you search for additional entity types from the main search bar and not just Items anymore (T338483)
Wikibase REST API: We're working on adding search to the API (T383209)
Call for Proposals: IslandoraCon 2025. "IslandoraCon brings together a community of librarians, archivists, cultural heritage collections managers, technologists, developers, project managers, and open source project enthusiasts in support of the Islandora framework for digital curation and asset management." Deadline for session proposals: February 14, 2025.
PhotoNearby.js - a user script that checks Wikimedia Commons for a nearby photo when an Item has a coordinate location (P625) statement but no image (P18) statement. The result is displayed above the Statements heading, together with a link to WikiShootMe. The search radius defaults to 500 meters.
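The condition described above, plus a radius check, might look roughly like this. It is a sketch of the general idea only, not the user script's code: the function names, the shape of the `claims` dictionary, and the use of the haversine formula are all assumptions made for this example.

```python
import math

DEFAULT_RADIUS_M = 500  # default radius mentioned in the description


def should_search_nearby(claims: dict) -> bool:
    """True when the item has coordinates (P625) but no image (P18)."""
    return "P625" in claims and "P18" not in claims


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters, using the haversine formula."""
    earth_radius_m = 6_371_000  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def is_nearby(item_coord, photo_coord, radius_m=DEFAULT_RADIUS_M) -> bool:
    """True when the photo lies within radius_m of the item."""
    return distance_m(*item_coord, *photo_coord) <= radius_m
```

Coordinates are passed as `(latitude, longitude)` tuples in decimal degrees.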
Other Noteworthy Stuff
As part of an effort to benchmark open source SPARQL engines on Wikidata, the page Wikidata:Scaling Wikidata/Benchmarking/Existing Benchmarks contains some initial results and analyses of benchmarking Blazegraph, MillenniumDB, QLever, and Virtuoso on several existing SPARQL query benchmarks for Wikidata. There are some surprising results there, particularly related to different answers produced by different engines. Suggestions on how to improve the effort or provide deeper explanations of the results are particularly welcome on the discussion page.
New General datatypes property proposals to review:
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage site: the type (Cultural, Natural, Mixed))
location code (the location code of the location item. Should be used with qualifier property P459 to specify which location code system is being used.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
Newest WikiProjects: No Longer at the Margins - aims to highlight and document the contributions of women in science, ensuring their visibility and recognition in the historical and archival record by addressing biases and gaps in representation.
Storage growth: We are making some changes to the terms-related database table in order to scale better (phab:T351802)
Constraint violations: We’re working on making distinct-values constraint checks work with the split Query Service (phab:T369079)
EntitySchemas: We’re working on making the heading on EntitySchema pages apply language fallback (phab:T228423)
Search: We are working on the new search UI component which will let you search for additional entity types from the main search bar and not just Items anymore (phab:T338483)
Wikibase REST API: We're continuing the work on adding search to the API (phab:T383209)
Lua: We are investigating if we can increase the Entity Usage Limit on client pages (phab:T381098)
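Developers who use the MediaWiki History dumps, please note: the Data Platform Engineering team has added several fields to these dumps to support temporary accounts. If you maintain software that reads these dumps, please check your code and consult the updated documentation, because the order of the fields within each row will change. In addition, in the mediawiki_user_history dump, the anonymous field will be renamed to is_anonymous. These changes will take effect in the February release.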
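One way to make a reader robust against this kind of change is to address fields by name instead of by position, and to accept both the old and the new field name during the transition. The sketch below assumes tab-separated rows and boolean values written as "true"/"false"; the column lists are hypothetical, shortened stand-ins (including the made-up `is_temporary` entry), so take the real field order from the updated documentation.

```python
import csv
import io

# Hypothetical, shortened column lists for illustration only.
OLD_COLUMNS = ["user_id", "user_text", "anonymous"]
NEW_COLUMNS = ["user_id", "user_text", "is_temporary", "is_anonymous"]


def read_rows(tsv_text: str, columns: list):
    """Yield each tab-separated row as a dict keyed by column name."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for row in reader:
        yield dict(zip(columns, row))


def is_anonymous(record: dict) -> bool:
    """Read the flag under its new name, falling back to the old one."""
    value = record.get("is_anonymous", record.get("anonymous"))
    return value == "true"
```

With this approach, switching to the new release means swapping the column list; code such as `is_anonymous(record)` keeps working for both layouts.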
We are excited to reveal that WikidataCon will be returning in 2025. Keep an eye on the project page for more details to come, and block your calendar for October 31 - November 2.
New Linked Data for Libraries LD4 Wikidata Affinity Group project series! The next session takes place on Tuesday, 4 February 2025 at 9am PT / 12pm ET / 17:00 UTC / 6pm CET. Eric Willey will be facilitating a series of four sessions focused on starting a Wikidata project from the foundation up at your institution. The first session will focus on selling your project to administrators.
Wikidata Indonesia is holding a Datathon (February 5 - 7) and Quiz (January 31 - February 7), take part!
OpenStreetMap X Wikidata Meetup #73 February 10 Time: 19:30-21:00 UTC+8 at Taipei 摩茲工寮 (Q61752245)
Data Reuse Days, February 18-27: online event dedicated to the applications using Wikidata's data and their technical setup. A first version of the program is now available. Make sure to register to receive the event's access links.
Why Wikidata? and edit-a-thon hosted by Illinois State University on February 4, 1400 - 1600 CST (UTC-6). Eric Willey and Rebecca Fitzsimmons will hold a hands-on demonstration of Wikidata, at the Milner Library, ISU (Room 165).
Past Events
Wikidata Workshop Jan 2025 - Hosted by Wikimedia Canada, this workshop offered 2 sessions for English and French-speaking attendees. Subjects covered include the basics of Wikidata, intro to editing, linking photos to Commons and how to query Wikidata. The workshop took place 30 January 01:00 - 03:00 UTC.
Press, articles, blog posts, videos
Blogs
Bob DuCharme, author of Learning SPARQL, has posted a blog entry on filtering (only) foreign labels from a SPARQL query, using WDQS for the example.
Towards a Sustainable Community-Driven Documentation of Semantic Web Tools - A Wikidata-based toolkit to help knowledge engineers and developers find and document semantic web tools by categorizing them into a taxonomy and integrating GitHub metadata to track their maintenance status. By A. Reiz, F.J. Ekaputra & N. Mihindukulasooriya (2025).
(Arabic) OpenRefine and QuickStatements - This 2nd session of the Arabic Wikidata Days 2024 covers advanced OpenRefine (OR) skills such as improving and importing tabular data. It also demonstrates QuickStatements (QS) and how it simplifies adding and editing Wikidata. Presented by Professor Qais Shraideh.
Resource, Description & Access & STA - Michaela Edelmann introduces the cataloging platform that runs on Wikibase for the German-speaking DACH countries.
New developments of Wikibase-as-a-Service at the Open Science Lab (part of NFDI4Culture). Presented at the Art History Loves Wiki conference, it shows developments to the Wikibase software suite.
Tool of the week
Holonet Galactic Map - Explore information and facts of the planets that inhabit the Star Wars universe, powered by Wikidata.
Other Noteworthy Stuff
⚠️ Wikidata Query Service graph split: The graph split is about 2 months away. If you run queries that involve scholarly articles, or if you have an application that does, you will be affected. Please check d:Wikidata:SPARQL query service/WDQS graph split for details.
We (Peter F. Patel-Schneider and Egezort) want to run a course on the Wikidata Ontology for a limited number of participants. Designed for those already familiar with Wikidata, it will present information about ontologies and how they form the core of Wikidata, incorporating several exercises on analyses of and fixes to the Wikidata ontology. Upon successful completion (ending with a group project in consultation with us), participants will receive certificates. Please give feedback and suggestions to improve the structure and course content (found in more detail at WikiProject:Ontology Course), which will be incorporated into our Wikimedia rapid grant application to support the effort. Interested in helping or want to share your thoughts? Let us know.
Several database changes will impact Wikidata in the coming months, including the migration of the term store (wbt_ tables) to a dedicated cluster to improve performance and enable future growth. This move will speed up most Wikidata SQL queries but prevent direct joins between term store data and other Wikidata tables. Additionally, the wb_type table will be removed, with its mapping hardcoded in Wikibase, simplifying the codebase. More details.
Call for projects and mentors for Google Summer of Code 2025! Deadline: February 28th. More info!
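Regarding the term store migration noted above: once terms live on a separate cluster, a query that used to join term data with other tables has to be split into two queries whose results are combined in application code. The sketch below shows that pattern with two in-memory SQLite connections standing in for the two clusters. The table names `item_page` and `item_label` and their columns are simplified, hypothetical stand-ins, not the actual Wikibase schema.

```python
import sqlite3


def labels_for_titles(main_db, terms_db, titles):
    """Look up labels for page titles without a cross-database join."""
    if not titles:
        return {}
    # Step 1: ask the main database for the item IDs behind the titles.
    marks = ",".join("?" * len(titles))
    ids = [row[0] for row in main_db.execute(
        f"SELECT item_id FROM item_page WHERE title IN ({marks})", titles)]
    if not ids:
        return {}
    # Step 2: ask the term store for the labels of those IDs.
    marks = ",".join("?" * len(ids))
    rows = terms_db.execute(
        f"SELECT item_id, label FROM item_label WHERE item_id IN ({marks})",
        ids)
    # Step 3: combine the results in application code.
    return dict(rows)
```

For large ID lists, the second query would normally be sent in batches rather than as one long IN clause.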
nomenclatural type of (taxon item of which this item is the taxonomic type (name-bearing type), e.g. the family for which this genus is the type, the genus for which this species is the type, the taxon for which this type specimen is the type, etc.)
World Heritage type (Property of a World Heritage site: the type (Cultural, Natural, Mixed))
location code (the location code of the location item. Should either be used with qualifier property P459 to specify which location code system is being used, or be used as the qualifier of P31.)
DIF historia player ID (Identifier for a sportsperson connected to Djurgårdens IF on difhistoria.se (official site))
directs readers to (document or class of documents to which this item or class directs readers (aliases: is citation of | links to | refers to | target))
items classified (class of items that this classification system classifies (aliases: items categorized | classifies | categorizes))
WikiProject: Ontology Course - as mentioned above, this WikiProject plans to be a certified course to teach participants about proper Wikidata ontologies.
Storage growth: We are continuing to make some changes to the terms-related database table in order to scale better (phab:T351802)
Wikibase REST API: We are continuing to work on bringing search to the REST API (phab:T383126)
mul language code: Support for the language code has been rolled out fully
EntitySchemas: We finished adding language fallback to the heading of EntitySchema pages (phab:T228423)
Sitelinks: Fixed a bug that prevented linking Wikidata Items from Wikipedias (phab:T385261)
Scoped search: We continued working on improving the main search field on Wikidata in order to allow you to search for Properties, Lexemes, etc more easily with it (phab:T321543)