This is an archive of past discussions with User:James Allison. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Data Modelling Days, from November 30th to December 2nd: 3 days of online events to address data modelling challenges, discuss how to improve the way we structure data together, and discover the point of view of external reusers. Feel free to have a look at the program (under construction) and to sign up as a participant.
Linked Data for Libraries LD4 Wikidata Affinity Group Working Hour November 20th, 2023: Over the summer and into the fall the LD4 Wikidata Affinity Group will be offering a series of Wikidata Working Hours to give folks an opportunity to try out various Wikidata-related skills and tools by assembling a data set of diverse library and information science (LIS) materials (articles, conference proceedings, books) and adding it to Wikidata. Wikidata Working Hours provide hands-on Wikidata experience in a supportive space. We hope you will join us if you are interested in learning more about Wikidata, exploring LIS literature, and have been looking for a fun Wikidata project to contribute to. The seventh Wikidata Working Hour will cover the Author Disambiguator tool, which helps users assign authors to articles. During the session we will demonstrate how to use the tool on an author who was created during a previous working hour, and another who doesn't exist in Wikidata yet. After the demonstration, participants are encouraged to try the tool themselves during the rest of the working hour. This session will build on the work done in previous Working Hours by connecting authors to the articles they have written. This session will be recorded and the recording shared on the event page.
ItWikiCon '23 (Italian) was hosted in Bari, Italy between the 17th - 19th November. Check the Programme for details on sessions and check for recordings or slide decks of presentations.
GLAM Wiki 2023 took place in Montevideo, Uruguay. There were several Wikidata-related sessions some of which are linked in the Videos section.
Can you trust Wikidata? - is a paper exploring Wikidata's veracity and trustability for providing values to Knowledge Graphs. Written by V. Santos et al.
User-level gender statistics for Wikipedia - a tool that computes the number of articles created by gender has been repaired after some months of unavailability. It relies on xtools and the P21 property.
Luthor - tool for finding usage examples from Wikisource and adding them to lexemes on Wikidata.
WikiProject Manuscripts - This WikiProject coordinates efforts on Wikidata to gather and curate structured data on manuscripts.
WikiProject Grove Hall Black Women Lead - aims to shed light on the lives and stories of Black women leaders who have shaped Boston’s history from the colonial era to the present day.
Newest database reports: User:Pasleim/projectmerge/enwiki-svwiki - 3875 merge candidates in English Wikipedia and Swedish Wikipedia based on same sitelink name.
Next Linked Data for Libraries LD4 Wikidata Affinity Group call November 28, 2023: As a satellite event for Data Modeling Days, we will facilitate community discussion around data modeling in Wikidata for library collections in a variety of formats, including people, books, serials, scholarly articles, rare materials, music, media, and realia. Agenda
Linked Open Data and Wikidata < Alan Ang (WMDE) talks about the importance of Linked Open Data and forming mutually beneficial partnerships between the Foundation and Institutions.
Notebooks
Explore new ways of visualising your data with a Circular Dendrogram, illustrated here with Association Football players broken down by Country and Team.
Tool of the week
Harvest Templates - is a tool that helps transfer data from Wikimedia projects to Wikidata.
User:MichaelSchoenitzer/Updown - is a userscript used for faster navigation. If there are a lot of values for one property it will add arrows that allow you to jump to the first/last value.
Other Noteworthy Stuff
WMDE is researching ways to improve the editing experience in different languages and would love to hear your feedback. We would like to talk to a few of you in online interviews to learn about your experiences, expectations, and concerns. Please let us know in this sign-up form if you are interested in taking part.
The Research team at WMF is running a second labeling campaign to evaluate the Revert Risk model for Wikidata. This is part of ongoing work on creating a new generation of Machine Learning models to support patrolling work on Wikimedia projects. Please help by going to this link, and labeling each revision in one of these three categories: Keep, Not Sure, Revert. Note that "Not Sure" should be used in all cases where the Keep or Revert labels are not clear to you.
COR SEM (The Danish central word registry identifier)
UPOC (a unique ID for identifying organizations/shipments/... at a service provider)
WikiProject Heritage Collections - The aim of the present project is to create the world’s most comprehensive high quality database of archival fonds and heritage collections (including contemporary scientific collections or documentation holdings) and to ensure the interlinking of respective catalogues, finding aids, or collection databases with Wikidata.
WikiProject Events and Role Frames - The primary aims of WikiProject Events and Role Frames are to define a set of properties that consistently model event occurrences and their participants; to fill gaps in Wikidata regarding items for events and actions; and to encourage use of the proposed model and newly introduced items across Wikidata.
We continued to make a lot more language codes available (phab:T341409)
EntitySchemas: We are experimenting with how to work around some technical blockers for the new datatype
Wikibase REST API: We've been working on the ability to remove an Item's label in a specific language and modify the descriptions on a Property (phab:T335841, phab:T342981)
Hello! Voting in the 2023 Arbitration Committee elections is now open until 23:59 (UTC) on Monday, 11 December 2023. All eligible users are allowed to vote. Users with alternate accounts may only vote once.
The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.
USPS - Delivering for America (further revised version)
Before I respond to Jonathan on their most recent draft, I would appreciate your input. Feel free to ignore this or respond on the appropriate article talk section. Cheers. DN (talk) 02:20, 2 December 2023 (UTC)
Around 250 war-threatened architectural monuments documented (German) - Wikidata, Wikibase and Commons are helping preserve and plan the restoration of culturally-significant Monuments damaged or destroyed by the Russian invasion of Ukraine.
ZotWb < export records in a Zotero group library to a custom Wikibase, prepare datasets to send to OpenRefine, feed OpenRefine reconciliation results back to the Wikibase. Wikidata is involved in the entity reconciliation. Here's a short explanation and demo video. The tool is written and provided by David Lindermann with support from a WMF Rapid Grant.
Montana Plant Life URL (URL for a plant family, genus, or species on the Montana Plant Life website)
event role (item that describes a role in an event class)
role in event (event class for which the item describes a role)
selectional preference ((to be used only with the subclasses of Q_event_role) an item that plays this role in an event instance should descend from this item via a combination of P31 and P279)
event arguments and types (item that plays a role in an event instance; used with a qualifier "argument type")
BnF archives and manuscripts ID (identifier for a manuscript in the archives and manuscripts catalogue of the Bibliothèque nationale de France (BnF). Do not include the initial "cc")
clerked for (this person has held a clerkship with the judge)
battery life (the length of time a device can continue to work before it needs its battery to be recharged)
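The "selectional preference" description above amounts to a reachability check over P31 (instance of) and P279 (subclass of). The following is a minimal sketch on a toy in-memory graph, assuming the intended path is an optional P31 hop followed by any number of P279 hops. The function name `descends_from` and the toy identifiers are hypothetical and are not part of any Wikidata tooling.

```python
from collections import deque


def descends_from(item, ancestor, instance_of, subclass_of):
    """Return True if `item` reaches `ancestor` via an optional P31 hop
    followed by zero or more P279 hops (an assumed reading of the rule)."""
    # Start from the item itself plus every class it is an instance of.
    frontier = deque([item, *instance_of.get(item, ())])
    seen = set(frontier)
    while frontier:
        current = frontier.popleft()
        if current == ancestor:
            return True
        # Climb the subclass hierarchy (P279) one level at a time.
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                frontier.append(parent)
    return False


# Toy data with made-up identifiers, not real Wikidata items.
INSTANCE_OF = {"alice": ["judge"]}
SUBCLASS_OF = {"judge": ["person"], "person": ["agent"]}
```

With this toy data, `descends_from("alice", "agent", INSTANCE_OF, SUBCLASS_OF)` follows alice, judge, person, agent and succeeds, while a check against an unrelated class fails. A stricter or looser reading of "combination of P31 and P279" would change only how the frontier is seeded and extended.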
Showcase Lexemes: läsa - 'read' about this Swedish word with many pronunciations and grammatical features.
Feel free to suggest next week's Showcase Item and Lexeme!
Development
Wikibase REST API:
We finished the endpoint for removing an Item's label in a specific language (phab:T335841) and the endpoint for modifying descriptions on a Property (phab:T342981)
We are working on the endpoint for adding aliases in a given language for a Property (phab:T343721) and removing a Property's label in a given language (phab:T342983)
Mismatch Finder: We are continuing the work on moving the tool over to the new design system Codex
We adjusted the styling for the values of monolingual text statements to make the language easier to distinguish from the value (phab:T280774)
mul language code: We made some final adjustments to get it ready for testing.
Lexemes: We are adding a license note for anon users when editing a Lexeme’s lemma, a Form or Sense (phab:T343999)
Here's your quick overview of what has been happening around Wikidata over the last week.
Discussions
New requests for permissions/Bot: LccnBot (Task: Adds P244 to bibliographic entities based on library authority records.)
New request for comments: Duplicate References Data Model and UI < During Data Modelling Days '23, 2 proposals emerged trying to answer the question of how to handle duplicate References on Wikidata Items.
Next Linked Data for Libraries LD4 Wikidata Affinity Group call December 12, 2023: Several members of the Chinese Culture and Heritage Wikidata group will provide an overview of the group's Wikidata projects as well as the challenges they have encountered. Agenda
Data-SHS Bordeaux Week: Processing and Analyzing Quantitative Data in Human and Social Sciences 2023. Dec. 11 - 15, Bordeaux, FR.
OpenRefine - an open-source tool for working with data < This session explores the advantages of using OR to wrangle, clean, transform and standardise data for Wikidata. Presented by Jinoy Tom Jacob at the IndiaFOSS3.0 Conference.
QLever SPARQL Engine < If you attended Data Modeling Days '23, you may have seen an extraordinary session given by Hannah Bast and Johannes Kalmbach showcasing the power and advantages of the QLever engine. QLever can handle queries that cause the WDQS to time out, and it supports federated and geospatial queries!
(QLever has already featured in Tool of the Week but we wanted to showcase it again after experiencing it at DMD '23)
counterexample (qualifier for deprecated P279 statements; example instance or subclass of the item class for which a "subclass of" statement does not hold)
WikiProject Heritage Collections: aims to create a comprehensive, high-quality database of archival fonds and heritage collections (including contemporary scientific collections or documentation holdings) and to ensure the interlinking of respective catalogues, finding aids, or collection databases with Wikidata.
WikiProject Source Reliability: is an effort to identify and aggregate online sources of assessments of the reliability and credibility of sources.
Wikibase REST API: We continued work on the routes for adding aliases in a given language for a Property (phab:T343721) and removing a Property's label in a given language (phab:T342983)
Monolingual text values can now use many more languages than before. We’re still working on doing the same for Lexemes. (phab:T341409)
Other discussions: How to handle concepts of trans people on Wikidata? Should {privacy at wikidata.org} be redirected to {privacy at wikimedia.org} or should it be monitored by Wikidata volunteers? Join the discussion!
Upcoming: Next Linked Data for Libraries LD4 Wikidata Affinity Group Working Hour December 18th, 2023: Over the summer and into the fall the LD4 Wikidata Affinity Group will be offering a series of Wikidata Working Hours to give folks an opportunity to try out various Wikidata-related skills and tools by assembling a data set of diverse library and information science (LIS) materials (articles, conference proceedings, books) and adding it to Wikidata. Wikidata Working Hours provide hands-on Wikidata experience in a supportive space. We hope you will join us if you are interested in learning more about Wikidata, exploring LIS literature, and have been looking for a fun Wikidata project to contribute to. The ninth and final Wikidata Working Hour in the series will be using SPARQL and Scholia to query and visualize the data we’ve added to Wikidata during our series. This session will be recorded and the recording shared on the event page.
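For readers who want to try the kind of SPARQL query this working hour describes, here is a minimal sketch that builds a query for scholarly articles linked to an author and the matching request URL for the Wikidata Query Service. It uses P31 (instance of), Q13442814 (scholarly article) and P50 (author). The function names are hypothetical, and the query is only one plausible shape, not the one used in the session.

```python
from urllib.parse import urlencode

WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def articles_by_author_query(author_qid, limit=50):
    """Build a SPARQL query listing scholarly articles linked to an author."""
    # Reject anything that is not a plain item ID such as "Q42".
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError(f"not an item ID: {author_qid!r}")
    return (
        "SELECT ?article ?articleLabel WHERE {\n"
        "  ?article wdt:P31 wd:Q13442814 ;\n"  # instance of: scholarly article
        f"           wdt:P50 wd:{author_qid} .\n"  # author
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {int(limit)}"
    )


def request_url(query):
    """Return a GET URL asking the query service for JSON results."""
    return f"{WDQS_ENDPOINT}?{urlencode({'query': query, 'format': 'json'})}"
```

Calling `request_url(articles_by_author_query("Q42", 10))` yields a URL that can be opened in a browser or fetched with any HTTP client. The sketch deliberately stops short of sending the request so it can be read and adapted offline.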
Blogs: #LD42023. Part I: The Future of Wikidata + Libraries (A Workshop) - This blog series explores how libraries engage with Wikidata and Linked Data in the face of AI challenges. Led by Silvia Gutiérrez and Giovanna Fontenelle from the Wikimedia Foundation, the series summarizes insights from a collaborative session at the 2023 LD4 Conference, using Design Thinking strategies to connect the Library-Wikidata community with WMF, focusing on Wikidata, Wikibase, and Structured Data on Commons (SDC) in libraries. By Silvia Gutiérrez & Giovanna Fontenelle
Papers
Wikipedia gender gap: a scoping review - This review analyzes Wikipedia's gender gap from 2007 to 2022, revealing a slight majority of female authors, addressing key themes, and exploring strategies to mitigate the gap, providing valuable insights into the research landscape in this domain. By Núria Ferran-Ferrer, Juan-José Boté-Vericad and Julia Minguillón.
Ten years of Wikidata: A bibliometric study - This research delves into scholarly publications about Wikidata from its inception in 2012 to late 2022, revealing 945 relevant papers, primarily from conferences. The analysis highlights a concentration of experts and contributors from the Global North, as well as governmental institutions as predominant funders. The study calls for enhanced networking and outreach to promote diversity and inclusion within the Wikidata research community. Emphasizing computer science perspectives, the research focuses on methods for developing and utilizing open knowledge graphs, notably Wikidata, with a narrower but significant interest in application-oriented studies in digital humanities, biology, and healthcare. (Turki, et al)
Videos
Duplicating Everywhere All at Once | Cebuano Wikipedia - Five years ago, Lsjbot's Wikipedia articles caused duplicate Wikidata items, notably impacting geographic places on Cebuano Wikipedia. This video by User:Canley at Wikimania 2023 delves into the history, visualizes the issue, and suggests cleanup strategies for Wikidata and Wikipedia, emphasizing Aotearoa New Zealand and parts of Australia, with implications for the global challenge of bot-created duplicates.
Useful Authorities for Data-Driven Collection Research with Alicia Fagerving - Alicia Fagerving, Wikimedia Sverige, introduces the project "Useful Authorities for Data-Driven Collection Research" and Wikidata. The project, spanning 2021-2023, links vocabularies from the databases of Nationalmuseum and Statens historiska museer to Wikidata, exploring it as a platform for semantic interoperability among cultural heritage institutions and providing tools and visualizations for similar projects.
2023: OSM-Wikidata Map Framework. Combining OpenStreetMap and Wikidata makes it possible to leverage the strengths of the two projects to create richer maps. This talk explores how OSM-Wikidata Map Framework simplifies this process. By Daniele Santini
It's not bad! Measuring Gérard Depardieu's mark on French cinema (in French) - The analysis centers on Gérard Depardieu's impact on French cinema amid legal issues and sexual assault allegations. Despite difficulties in addressing these accusations, the author leverages Wikidata to measure Depardieu's influence by querying films from directors born after 1930 to assess his involvement.
How to Become a Billionaire: A Billionaire's Occupations Network Analysis - This network analysis investigates billionaires’ primary sources of income with a network graph—based on their occupations—connecting billionaires from all over the world and uncovering some of the biggest industries in the world.
Drama Corpora Project (DraCor) is a digital database of plays, primarily from Europe. It collects and organizes texts of plays in a way that allows researchers and others to extract and analyze information from those texts. This could include details about the characters, the dialogue, the stage directions, and more. The data is being pulled from Wikidata.
We finished adding the endpoints for adding aliases in a given language for a Property (phab:T343721) and removing a Property's label in a given language (phab:T342983)
We started working on the endpoint for removing a Property's description in a given language (phab:T342985)
We are fixing an issue with incorrect handling of lowercase statement IDs in edit requests (phab:T352644)
Special:PrefixIndex now shows label/lemma for Properties and Lexemes (phab:T343115)
Language codes: We changed where Wikidata is getting its languages from for Lexemes and Monolingual text statements and thereby resolved many tasks requesting another language being added to them (phab:T341409)
Here's your quick overview of what has been happening around Wikidata over the last week.
Discussions
New request for comments: Community request for the development team to access inverse properties on client wikis. (Summary: We currently cannot access inverse property values on Wikipedia. This can be a data management issue on Wikipedia as we must always ask ourselves whether to introduce an inverse property for cases where we need one. So I think it’s useful to gather the use cases the community would want and draft a request for an API to the devteam to do that.)
Upcoming: The next Wikidata+Wikibase office hours will take place at 16:00 UTC on Wednesday, 17th January 2024 (18:00 Berlin time) in the Wikidata Telegram group. The Wikidata and Wikibase office hours are online events where the development team presents what they have been working on over the past quarter, and the community is welcome to ask questions and discuss important issues related to the development of Wikidata and Wikibase.
Past
Provenance Loves Wiki (PLW24), Jan 12th - 14th, research and data on the origin of artworks and cultural heritage and how Wikibase and Wikidata can support this.
WikiLovesWomen #SheSaid campaign wrapped up the 2023 campaign by visiting Kinshasa and Kisangani, where local Wikimedians improved quotes from women on FR Wikipedia and Wikidata.
QLever: a new way to query OpenStreetMap --> Discussion of the new opportunities offered by QLever to query OpenStreetMap and to run federated queries with Wikidata
Wikidata for authority control: 3 years of work --> The three-year Wikidata for authority control project, a collaboration between Wikimedia Sverige and Swedish museums, concluded in December 2023. It equipped museum staff with tools and skills to integrate their authority databases with Wikidata, resulting in added identifiers, SPARQL query proficiency, and enhanced knowledge sharing within the GLAM sector.
Go-ahead for Wikidata Project of GLAM institutions from Baden-Württemberg --> The GLAM-BW project, under "GLAM goes OpenData," connects major collections in Baden-Württemberg, focusing on the württembergische Kunstkammer. With over 3,000 objects, the project integrates information on collectors, histories, and objects into a knowledge graph for semantic searches, contributing to the broader realm of linked open data, akin to Wikidata.
Swiss GLAM Programme --> Wikimedia CH imported the Museum of Natural History of Neuchâtel's urchin fossil casts to Wikimedia Commons, connecting structured data on Wikidata. The project involved data cleaning, adding missing elements, and file imports via OpenRefine, highlighting seamless integration between Wikidata and Commons.
Papers
Reflections on the PCC Wikidata Pilot at UCLA Library: --> Undertaking the PCC Learning Objectives. Discusses the 14-month Pilot programme for cooperative cataloguing of UCLA Library and Museum Collections. By E. Zhang, P. Biswas & I. Dagher.
SMWCon 2023: Semantics, Wikis, and AI --> Day 1, Keynote by Prof. Markus Krötzsch who explores origins and principles of semantic wikis and key challenges that lie ahead in managing knowledge.
Brian M Sperlongano released US boundary QA checker, a quality assurance tool for finding issues with boundary data in the United States by using Wikidata, OpenStreetMap, and US Census Bureau data.
The Surrounding Ocean (available at vrandezo.github.io/TheSurroundingOcean) - is a tool that allows you to browse lexicographical data. You can use the tool to explore words and their meanings, translations, and synonyms. The tool is currently under development, and the developer, Danny, would appreciate feedback to fix any issues with the tool. More info: Wikidata:The Surrounding Ocean.
WikiProject Highlights: Ontology Cleaning Task Force: A group of people have started a task force to discuss problems with the Wikidata ontology and how to clean them up. Anyone interested in participating is welcome. The task force maintains Wikidata:WikiProject Ontology/Cleaning Task Force as a record of its activities. You can add yourself to the participants list there and find out how to join group meetings or otherwise participate in the group. (Got something noteworthy happening in your WikiProject? Share it in the upcoming issue!)
IP masking: We are working on adjusting Wikibase to handle the upcoming introduction of IP masking, which will give editors who are not logged in a temporary account name instead of using their IP to attribute edits to (phab:T351968)
Lexicographical data: We are changing how empty Senses and Forms are represented in the dumps (phab:T305660)
mul language code: We are doing user testing for the current implementation to see if it is understandable for people.
Mismatch Finder: We are continuing the work on migrating it to the Codex design system.
REST API:
We improved the handling of lower-case statement IDs (phab:T354262)
We are working on getting a sitelink for a given wiki (phab:T344039)