T-8 days! Much obliged to @Turtlecrown for cleaning up the rules. One thing I noticed from this is that the 'Popular articles' section is a little wonky. It also needs to be updated daily - is there someone willing to commit to this? I'm hesitant to do so only because I couldn't get the WMCS link to work at first pass, which is somewhat concerning. Kazamzam (talk) 16:51, 23 October 2024 (UTC)[reply]
Assuming it is fixed before the drive, I can help with that section. It's fairly simple, just copy-paste the 50 (I think?) articles in. The wikitext would need to be added, but that's easily done with some code or just ChatGPT. ARandomName123 (talk) Ping me! 04:07, 24 October 2024 (UTC)
@ARandomName123 The list of "Popular articles" is very popular among our Backlog Drive participants. Let me know if you need help updating it daily – if so, please let me know how best to go about grabbing the list. It seems we tend to get stuck at the top part of the alphabet, which is unfortunate. It would be great if we could mix it up a bit – either just wipe the list and start over completely fresh once a day and/or grab a more random selection from throughout the alphabet on a daily basis. Cielquiparle (talk) 02:33, 2 November 2024 (UTC)[reply]
@Cielquiparle: Yea, sorry about the delay, just updated it a few minutes ago. I'm not sure how we'd mix it up in terms of the alphabet, but you could change the timeframe the view count is counted. The default looks like the past 3 weeks.
If you see it hasn't been updated by around 00:10 UTC, feel free to do it. My usual workflow is just to let it run, then copy and paste the first 50 entries, including the misc. stuff so there's no need to copy-paste one by one, into ChatGPT (I'm too lazy to write a script lol) with a prompt along the lines of "Please format the following list in a code block in the format *[[article title]]". ChatGPT will then add the required wikitext and remove the unnecessary pageview numbers/rankings. Then just click the Copy Code button that pops up, and paste it into the section. Should be fairly quick, but if you find a method that works better for you, feel free to use it. ARandomName123 (talk) Ping me! 02:41, 2 November 2024 (UTC)
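For anyone who would rather not go through ChatGPT, the formatting step above can be sketched as a short script. This is only a sketch under assumptions: it supposes the pasted rows are tab-separated with the rank in the first column and the article title in the second, which may not match what the tool actually gives you when copied. The function name `to_wikilinks` and the sample rows are made up for illustration.

```python
def to_wikilinks(pasted: str, limit: int = 50, title_column: int = 1) -> str:
    """Turn pasted ranking rows into '*[[Title]]' wikitext lines.

    Assumed (hypothetical) row layout: rank <TAB> title <TAB> pageviews ...
    Rows that do not start with a numeric rank (headers, misc. text) are skipped.
    """
    items = []
    for line in pasted.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if not fields[0].isdigit():
            continue  # header row or other non-data line
        if len(fields) <= title_column or not fields[title_column]:
            continue  # no title in the expected column
        items.append(f"*[[{fields[title_column]}]]")
        if len(items) == limit:
            break  # keep only the first `limit` entries
    return "\n".join(items)


# Made-up sample rows, not real pageview data.
sample = "Rank\tTitle\tViews\n1\tExample article\t12,345\n2\tAnother example\t9,876\n"
print(to_wikilinks(sample))
```

If the copied text uses a different separator or column order, `title_column` and the `split("\t")` call would need adjusting.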
@ARandomName123 Thanks for the instructions. Re: Alphabet bias, I think what's happening is that we're trying to run the Massviews analysis on too large of a category (exceeding 20k items), so it's always taking only the first 20k items in alphabetical order. Without writing a script, the best bet may be to narrow by sub-categories...like everyone's favorite "Articles without references from December 2009", for example. So I just went ahead and added a handful to the top of the list (also because I had to delete several from the "new" list you refreshed which put back a bunch of articles that are no longer Unreferenced because of the WMCS lag and because #TeamBacklogDrive is so incredibly fast). Cielquiparle (talk) 03:31, 2 November 2024 (UTC)
@Cielquiparle @ARandomName123 In case it is a bit useful, to work around the 20K alphabetical limit in Massviews, you can do a Petscan against the category, sorting the output by number of incoming links in descending order (as incoming links is a sort of proxy for popularity) and then taking the top 1000 into a Pagepile, and running Massviews against that. For example:
@Cielquiparle no problem, very glad to help. The combination of Petscan and Massviews is really useful. In a similar vein, I added a section Popular biographies below popular articles that uses this Petscan 29822085 Pagepile and Massviews in a similar way. If you think the section is overkill, please do remove it - I won't be at all offended. SunloungerFrog (talk) 11:37, 13 November 2024 (UTC) P.S. @ARandomName123 your ChatGPT hack is awesome. Best use of AI yet![reply]
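The ranking step of the workaround described earlier in this thread (sort by incoming links, keep the top 1000) can be sketched in a few lines. This is a minimal illustration only: it assumes the Petscan results have already been exported into (title, incoming link count) pairs, and the function name and sample values are hypothetical.

```python
def top_by_incoming_links(pages, limit=1000):
    """Return the titles of the `limit` most linked-to pages.

    `pages` is an iterable of (title, incoming_link_count) pairs; incoming
    links are used here only as a rough proxy for popularity.
    """
    ranked = sorted(pages, key=lambda page: page[1], reverse=True)
    return [title for title, _count in ranked[:limit]]


# Made-up sample data for illustration.
pages = [("Page A", 5), ("Page B", 50), ("Page C", 20)]
print(top_by_incoming_links(pages, limit=2))
```

The resulting shortlist is what would then go into a Pagepile for Massviews, keeping the input well under the 20K limit.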
Watchlist notice and invitations
I would like to post a watchlist notice (a text ad at the top of every editor's watchlist) with the words "The November 2024 Unreferenced Articles Backlog Drive has begun." How is it done? Cielquiparle (talk) 08:27, 2 November 2024 (UTC)
Cielquiparle: If you'd like to request a watchlist notice, you can do so at MediaWiki talk:Watchlist-messages. Just a quick note: we can only display a watchlist notice once per event, and it stays up for seven days. Since we already displayed one for this drive from 20 October to 28 October, we won't be able to request it again. – DreamRimmer (talk) 09:36, 2 November 2024 (UTC)[reply]
For info: I put a short invitation note on the Talk pages of four of the large WikiProjects from here (Germany, Military history, Football, and Albums) with a link to bambots' /bycat/[WikiProject]#Cites no sources. Turtlecrown (talk) 13:05, 2 November 2024 (UTC)[reply]
Review page instructions
Again, very late here. But I'd like to change the review page instructions to the following, which:
presents the process as ordered steps,
tells reviewers what to check,
distinguishes between 'submission' and 'reference',
gives more prominence to the Tool (which many missed at the start),
and (optionally) covers a couple of FAQs (well, once-asked, but still good to know).
Part of the reason it's so late is that I was slow to understand the process at the start of the drive.
Welcome to the reviews page. Your role on the drive is extremely important and we are glad that you dedicated your time to do this grunt work. Each reviewer should create their own dedicated section for reviewing.
It is mandatory that you don't review your own submissions (obviously) and it is recommended that you spread out your reviews to many participants. Here is the procedure for reviewing a submission.
Find a submission from the drive. A tool is available to locate unreviewed submissions (credit: ARandomName123). Submissions can also be found via the "Tally" link for a participant in the Leaderboard.
Review the submission. Check that the article was unreferenced and that the submission contains an inline citation from a reliable source that verifies the immediately preceding sentence.
Use of primary sources is permitted, so long as it follows WP:PRIMARY.
If the citation only verifies part of a paragraph, a {{citation needed}} should be added to the prior part.
Duplicate #NOV24 tags will be caught by the bot and don't need to be flagged in review.[1]
In a submission with multiple references, a Y may be given if one citation passes these criteria, as long as any other errors are addressed.[2]
(recommended) Cleanup and refinement. If you need to make any adjustment to the article, you can prefix your edit summary with #NOV24REVIEW. Warn editors about their mistakes and sloppy work; if needed, you can raise concerns at Wikipedia talk:WikiProject Unreferenced articles. Consider re-reviewing user corrections to submissions made within the drive period.[3]
Document your findings in this page. Use the Wikitext format:
This is likely somewhat controversial, but I would be inclined to say: If it was already tagged as "Unreferenced" despite having the list of general references, and you actually add an inline citation, it should count. Cielquiparle (talk) 08:11, 1 November 2024 (UTC)[reply]
I would disagree with that; an article having {{no footnotes}} is a separate issue from one being {{unreferenced}}, even if the former is erroneously tagged as the latter. When I came across such articles in the FEB24 drive I just corrected the tags.
The point of the drive is to cite information that is uncited, and if there is a non-inline reference, there's a good chance that it is backed up by a citation (though ofc ideally it should be made inline so it's possible to verify efficiently). novov talk edits 09:38, 1 November 2024 (UTC)
I agree with the first sentence as a general rule.
I just think in the case of a Backlog Drive, we should reward good-faith addition of inline citations, rather than splitting hairs over whether an external URL or book title which already appeared on the page qualified as viable references or not, etc., etc. It gets slippery and is a buzzkill.
Most important thing is that in-line citations are added, referencing reliable sources...as we bring down the overall backlog by at least 10,000. Let's do it. Cielquiparle (talk) 13:39, 1 November 2024 (UTC)[reply]
This is what we did with the last drive. If an article had general references and the unreferenced tag, but no reflist or in-line citations, and the user added at least one in-line cite, it would count towards the drive. The sentiment was it's better this happens than just moving it to the {{no footnotes}} queue. As one of the users who did a lot of this type of thing last drive, it didn't boost my numbers by any means since there was still a lot of editing being done to add the in-line cites, as appropriate. --Engineerchange (talk) 21:20, 1 November 2024 (UTC)[reply]
Hello everyone, I have created a user script called User:DreamRimmer/DriveEditSummary.js. This script adds a +NOV24 button right above the Save, Preview, and Changes buttons in the source editor. Clicking this button automatically inserts a preloaded edit summary with the #NOV24 hashtag. Please note that it only works in the source editor, not in the visual editor. I hope you find it useful! – DreamRimmer (talk) 13:41, 1 November 2024 (UTC)[reply]
Great, thanks for this tool. Just did an extra page to try it. (Success!) Just a thought: "Adding reference(s) ... using" does make it sound like the script was used to add the references rather than the edit summary. Turtlecrown (talk) 03:45, 2 November 2024 (UTC)[reply]
I'm not sure if these ones should also be AfD'd - sourcing seems to be pretty sparse on the ground, but I'm wondering if it's a transliteration/spelling issue? They're notable figures and I can find oblique references to them here and there, and there are more recent figures in this royal family. So I'm not sure what the best way to proceed is here. Thoughts? Smallangryplanet (talk) 11:47, 4 November 2024 (UTC)[reply]
We love to see it! @Smallangryplanet, just a note, can you be sure to post questions like this on the talk page for the drive itself? That will hopefully help more participants see any issues as they pop up. Kazamzam (talk) 20:28, 4 November 2024 (UTC)
@Kazamzam Oops, I thought this was the talk page for the drive itself! I think I got turned around because the "Discussion" tab on the drive links to this page, not the drive's talk. Is that correct or should I change it? Smallangryplanet (talk) 20:31, 4 November 2024 (UTC)[reply]
Why is the drive only rewarding editors for adding inline citations? General references are a legitimate citation style according to policy, preferred by many for short articles (and preferred for everything on some sister projects, e.g. dewiki), and have the same result of removing the article from the "unreferenced" backlog. – Joe (talk) 09:38, 21 November 2024 (UTC)[reply]
Well, with in-text or parenthetical citations. But more to the point, it is not a policy requirement that we attribute specific statements to their source except in limited circumstances. This is "WikiProject Unreferenced articles" not "WikiProject Articles with no footnotes". – Joe (talk) 10:23, 21 November 2024 (UTC)[reply]
@Joe Roe - that's a great question and a very interesting point that I have wondered about myself. I have definitely added footnotes in addition to (and sometimes instead of) inline citations, but I do then add the 'no inline' clean up tag. The following are my personal thoughts and not necessarily those of the project. Happy to discuss further and take this as feedback for the URA and future drives going forward!
Short answer, per WP:INTEGRITY and WP:GENREF: The point of an inline citation is to allow readers and other editors to see which part of the material is supported by the citation. The disadvantage of general references is that text–source integrity is lost, although this is sometimes mitigated if the article is very short and the general references are very direct. General references are frequently reworked by later editors into inline citations; it saves time to just do it now.
Longer answer: Looking over the WP:MINREF guidelines, I think the URA is holding articles to a higher standard than might be strictly necessary by focusing on inline citations. And mulling it over in the pre-caffeine morning, I am largely fine with this. The guidelines note that "it is typical for editors to voluntarily exceed these minimum standards", and I'm pleased that we are going above and beyond with this drive. Editors both in and out of the project can help add citations to articles as they see fit but we are choosing to exceed that minimum and reward that effort for this drive; there's no prescription against general references, in-text, or parenthetical citations. More important (in my opinion) is the following statement: "any material lacking an inline citation to a reliable source that directly supports the material may be removed and should not be restored without an inline citation to a reliable source". Yes, general references and non-inline citations might be sufficient in some cases and maybe would solve a lot of headaches around tiny stubs of German streams and rivers, but I think we are a) holding ourselves as a project to a higher standard that is ultimately beneficial to the encyclopedia and b) mitigating the need for future work by providing inline citations to statements that will hold up when editors are inclined to delete statements or articles that are unreferenced or only ("only") have footnotes.
Beyond the perspective of the drive, because the URA will still be here with tens of thousands of articles when this is over, adding general references is quite common but from a clean-up perspective, it punts the article from "unreferenced" to "lacking inline citations" and this, given the existence and widespread use of that specific tag, seems to be considered insufficient by the community. I think many if not most of us are here to improve Wikipedia, not shuffle articles from our preferred category to someone else's. If we're going to improve the encyclopedia by cleaning up these articles, we should actually do that rather than shuffling things around and hoping no one notices the level of untidiness is actually unchanged.
There's more to say on this and doubtless other people will have different opinions but those are my two cents (or with inflation, 12 cents). Again, this is a great question and thank you for raising it. I hope we have some really excellent discussion as a result. Cheers, Kazamzam (talk) 13:10, 21 November 2024 (UTC)[reply]
I don't think "shuffling things around" is an accurate description: you're moving something from one of our most pressing cleanup categories ({{unreferenced}}) to a less severe one ({{no footnotes}}). That is an improvement to the encyclopaedia. It's also not a given that the article with general references will or should be tagged with {{no footnotes}}: a stub with no quotes or potentially controversial claims, for example, will not benefit from inline citations unless and until it is expanded.
URA participants are of course welcome to pursue a higher standard than that required by policy or what a literal reading of the name of the project would imply. It just seems inefficient to me. One of the things I really like about this project is that it is tightly focused on one problem, rather than trying to 'tidy' everything about an article at once. – Joe (talk) 13:26, 21 November 2024 (UTC)[reply]
That's a wrap on the November 2024 drive! Thanks to all participants for their help; details on progress and benchmarks reached can be found on the drive page. Help is still needed on reviewing, so please sign up if you are interested! This update covers the changes since October so the totals are a little different.
Headline: We cleared 10,489 articles and are now officially below 72,000! For yourself and your fellow editors, please clap.
Minutiae: For anyone interested in a more detailed breakdown of the numbers - average was 114.8 articles; median 40; mode 39.
Highlights: the entire year 2008 is in the dustbin of history, along with January and February 2009! With the heroic work of editor in arms, @Turtlecrown, the stubborn, German river-centric category of September 2020 saw a 25.3% drop. And everyone's littlest friend, December 2023, further decreased from 48 to 37, a 22.9% decline.
Low-hanging fruit: Other than December '23, June 2024 is a skinny 115 articles, dangling precariously like a sinner in the hands of an angry God. Give it a nudge.
High-hanging fruit: Everyone's favourite BFC (Big Friendly Category), December 2009, is a lean, mean 9,568 articles as of this writing - bringing all categories to under 5 digits. The other high-hanging fruit are, still, the Frustrating Five (name open for revision): January 2013 (1,039), April 2019 (875), May 2019 (1,822), June 2019 (3,915), and September 2020 (1,020). This time, January 2013 had the lowest percentage of change between updates. Godspeed to anyone working on these.
Challenge results: February 2024 beat January 2024 264 to 270, and December 2015 sneaked a win from January 2016, 286 to 289. Live sports are just riveting, aren't they.
New challenge: No ties this time. Have a wonderful end of 2024 and I hope to see you all in 2025!
Announcements: Something discussed prior to the drive was building a regular, semiannual schedule for drives, which would put us on target for May 2025. I see this as a reasonable amount of time to prepare and make adjustments based on editor feedback about the strengths, weaknesses, and areas of opportunity for improvement. However, I see no reason to rush this as we are still tallying this drive's results. If this date is amenable to people, we can get started on a draft page in January. I would especially like to address the points raised by @Joe Roe about the prioritization of inline citations and how we can address this in the future.
I had a great time working on citations during this drive. Thanks to everyone working to organize these events. See you all in the field between now and the next drive! I will defeat December 2023!! Gnisacc (talk) 19:03, 5 December 2024 (UTC)[reply]
I take it back. Let's have as many backlog drives as we can in 2025 and clear the backlog. Why not schedule 3–4 next year? February, May, August, November. Cielquiparle (talk) 13:09, 7 December 2024 (UTC)[reply]
@Cielquiparle - I LOVE this attitude. Personally I'm open to it but I wonder if, by having (too) many drives, we run the risk of diminishing returns via lower editor engagement/participation. We cleared fewer articles in the November drive compared to the February drive but we also had a much larger backlog. February started with 111,643 and cleared 14,300 (12.8%); November started with 80,645 and cleared 8,511 (10.5%) - not much of a difference proportion-wise, plus we are probably clearing out the easier articles sooner and so what's left are the tougher nuts to crack. My concern is that if we did a drive every few months, would the proportion drop to more like 5-6%? How do we make sure to encourage and sustain engagement throughout the drive? We definitely lost momentum by the halfway point of both drives, so I think making sure that we keep that up (somehow) for future drives is a concern to prioritize and I'm worried that might be difficult if we do 4+ drives a year. I think it's doable, but we have to be smart about it. Setting smaller goals that are likely to be met through the drive to keep momentum up is probably a good idea; at the same time, we can't have a party every time we go down 100 articles.
Maybe this is the point when we start organizing off-wiki. I also agree with your point that May might not be a good idea for exam reasons (excellent foresight!). Personally April is a wash for me but I could be game for June. Cheers, Kazamzam (talk) 16:52, 8 December 2024 (UTC)[reply]
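The proportions quoted above for the two drives can be rechecked with a couple of lines of arithmetic. The starting backlogs and cleared counts are the ones given in the comment; the function name is just illustrative. By this calculation February comes out at about 12.8% and November at about 10.55%, so the two drives are indeed close proportion-wise.

```python
def cleared_share(start: int, cleared: int) -> float:
    """Percentage of the starting backlog that was cleared during a drive."""
    return 100 * cleared / start


# (starting backlog, articles cleared), as quoted in the discussion above.
drives = {"February": (111_643, 14_300), "November": (80_645, 8_511)}

for name, (start, cleared) in drives.items():
    print(f"{name}: {cleared_share(start, cleared):.2f}% of the starting backlog")
```

The same function could be reused to sanity-check any interim goal, for example what share a target of a few thousand articles would represent for a given starting backlog.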
Updates like these are the type of stats I love to see and make these things much more fun to participate in. Any chance that you have stats on the biggest decrease for categories by topic? Wozal (talk) 16:18, 8 December 2024 (UTC)[reply]
@Wozal - I wish :/ the way that the search function assigns categories by "topic" is pretty hit or miss. For example, if you try to filter the All articles lacking sources category for Biography, the first five results (by "relevance"): Fourth Dynasty of Egypt family tree, List of monarchs of Pontus, Single suiter, Quantitative notrump bids, and Entry-shifting squeeze. Not until you get past the first 20 hits do you get an actual biographical article of an actual person (Avgust Tsivolko, for anyone curious). So providing the stats on decrease by topic, let alone 200+ categories by topic, would be an incredibly time-consuming task. But this raises the larger issue of how the articletopic search function is basically useless; I think this is detrimental to potential WP members who might be really passionate about, say, Ancient Roman history or wetlands or Judaism or whatever but get frustrated and turned off by this filter that is supposed to be helpful. It's something I'm hoping we can address before the next drive, either to improve or replace it with something better.
I enjoyed the update very much, and always do. Thanks for all the hard work. And well done on the backlog drive, everyone!
My two cents on inline citations is that they make sense as a backlog drive task because of a need for what I'll call "meta-verifiability" - the ability to check each other's work and see if it does what it says it does. I've also seen a fair few articles in the wild that use only general references in which the article topic is not mentioned, and hunting down these resources, where possible, is time-consuming, and I don't want to assume intent but it can make content seem more reliable than it is. Maybe it pays in general to emphasise the difference between {{no footnotes}} and {{unreferenced}} though. Turtlecrown (talk) 23:10, 8 December 2024 (UTC)[reply]
Kazamzam thank you for the update. As a new editor, I really enjoyed working on the drive, and I learned a lot of things by reading through and adding references to a fairly broad selection of articles about different topics. I was always pleasantly surprised when it turned out that there was an interesting twist on an otherwise unprepossessing-looking article, and it was very satisfying to be able to keep them in the encyclopedia. On the flip side, I was rather surprised to come across a few dreadful articles of almost no merit that had hung around for a long time; it was equally satisfying to get those into the various deletion processes and bid farewell to them.
I was particularly pleased that we reduced the known backlog of unreferenced BLPs to zero, and I firmly think that, in time, (and maybe it is not as far off as we might fear), we can do it with the general unreferenced article base too.
On the issue of how the articletopic search function is basically useless, I note that our sibling project Unreferenced BLP Rescue have set up various Petscan searches to do this for Unreferenced BLPs. Maybe that is an approach that we could consider adopting, and I would be happy to do what I can to help set it up.
Would there be interest in organizing a year-round "marathon" for Unreferenced articles in 2025? It could run in parallel with the June 2025 Backlog drive (or take a break in June).
What would make it different from general project participation would be: We'd have a leaderboard and tally – just so we have visibility into volunteers who are active and what they have been up to. I think it would just help our loose community of year-round contributors feel a bit more plugged in...so you're not just having to constantly check Bambots to work out what articles are getting referenced.
It might also encourage more discussion or coordinated efforts to focus on specific categories or topics. Or is it too much overhead to set up and run...? Cielquiparle (talk) 04:02, 13 December 2024 (UTC)[reply]
I would definitely be game for that. Seeing monthly updates would be great motivation. It would raise the visibility of the project as well, especially if we are able to bring back the hashtags to include in edit descriptions. JTtheOG (talk) 07:04, 13 December 2024 (UTC)[reply]
@JTtheOG @Cielquiparle - the edit description that I've been using for a few years now is the following: WikiProject Unreferenced articles; you can help!
I think it's pretty straightforward and to the point, plus it directs people to the project webpage. We could make something specific for the "marathon" but I would want something that casual editors can use to go directly to the URA page rather than just a hashtag for points tracking purposes. There's also a talk page banner that someone created way back when but I don't think it gets a lot of use - maybe we want to change that? Cheers, Kazamzam (talk) 18:20, 15 December 2024 (UTC)[reply]
It would definitely be helpful for me. I wasn't able to participate as much as I would have liked in the recent drive because of other commitments. So a more generalized tracker that I can do any time would be more beneficial. SilverserenC 17:31, 13 December 2024 (UTC)
I like the idea of a year long marathon, with a leaderboard etc. I also saw that the Guild of Copy Editors hold week-long blitzes which might be worth considering as a complementary activity, for example based on a theme or other subset of unreferenced articles. Maybe we could more explicitly tie those in with other WikiProjects, for example a blitz on unreferenced women's biographies in tandem with WP:WIRED. Cheers, SunloungerFrog (talk) 08:24, 14 December 2024 (UTC)[reply]
@ARandomName123 Could the monthly leaderboard exist somewhere on the main URA page so as not to detract from casual participants (but also encourage such participants to sign up)? Or would it need to be separate? Cielquiparle (talk) 20:48, 16 December 2024 (UTC)[reply]
Hi all! I was doing some work on January 2010 and found this article, Puccio Pucci (politician), which was edited as part of our November drive. I'm concerned about two things in this edit - one, the use of Geni as a reference and two, the lack of a references section. I'm less concerned about the points of the drive and more concerned about ensuring that what we do as a Project is a net positive - not that this article is worse off for the edits, but imho it wasn't really improved. I'm wondering if it's possible to make a filter for reviewers to ensure that people are not getting points for adding "references" from sites like Geni, Goodreads, IMDb, etc. I think we can work this into the next drive (June 2025?) but I'm wondering if it's a bot-problem or something that needs human eyes and reviewers. Thanks, Kazamzam (talk) 16:40, 18 December 2024 (UTC)[reply]
@Kazamzam: Do you mean something like a bot that checks all submissions against RSP (or whatever list we decide), and provides a feed of only potentially-problematic references for reviewers to look through? ARandomName123 (talk) Ping me! 20:35, 18 December 2024 (UTC)
Well, I could modify the review tool to highlight or arrange separately edits with unreliable sources, though it might take the tool longer to load. Alternatively, we could just have a bot periodically post a list somewhere on-wiki. ARandomName123 (talk) Ping me! 23:20, 18 December 2024 (UTC)
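As a rough illustration of the kind of check being discussed, here is a minimal sketch that scans the wikitext added in a submission for URLs on a flagged-domain list. It is not how the review tool or any existing bot works; the domain list, function name, and sample wikitext are all assumptions, and a real list would have to come from RSP or whatever the project agrees on.

```python
import re
from urllib.parse import urlparse

# Illustrative list only; sites named in this discussion, not an agreed policy list.
FLAGGED_DOMAINS = {"geni.com", "goodreads.com", "imdb.com", "findagrave.com"}

# Rough URL matcher for wikitext: stops at whitespace and common template/link delimiters.
URL_PATTERN = re.compile(r'https?://[^\s\]|<>"}]+')


def flagged_urls(wikitext, flagged=FLAGGED_DOMAINS):
    """Return URLs in `wikitext` whose host is a flagged domain or a subdomain of one."""
    hits = []
    for url in URL_PATTERN.findall(wikitext):
        try:
            host = (urlparse(url).hostname or "").lower()
        except ValueError:
            continue  # malformed URL; leave it for a human reviewer
        if any(host == domain or host.endswith("." + domain) for domain in flagged):
            hits.append(url)
    return hits


# Made-up example of text added in a submission.
added = (
    "<ref>{{cite web |url=https://www.geni.com/people/Example/123 |title=Example}}</ref> "
    "and <ref>[https://example.org/page Example]</ref>"
)
print(flagged_urls(added))
```

A check like this would only produce a feed of candidates; whether a flagged source is actually a problem in context would still need human eyes.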
@Kazamzam: Another suggestion, if this has not already been addressed, would be clear and well-publicized reminders about Wikipedia policies on WP:UGC. I edit many biography articles, and many of these articles still have citations to genealogy web sites and Find-A-Grave in support of facts about birth and death. I don't know how many active editors still cite UGC, but with the prevalence of UGC cited, it's easy to see how many editors would perceive this as an editing norm. Thanks to everyone herding us cats. Best regards, Oldsanfelipe2 (talk) 22:58, 18 December 2024 (UTC)[reply]
Does anyone know if there is a project oriented around removing citations of UGC? (A quick search only turned up an AI cleanup project.) I could be interested in that. Relatedly, I know that the editor will stop you from using some banned URLs and such; does that not catch IMDb/Geni/Goodreads? Gnisacc (talk) 23:39, 18 December 2024 (UTC)
Maybe something to consider as we get closer to clearing the backlog (i.e. under 40k) would be to run a filter sweep like this and see how many articles are referenced solely with UGC. The reward for finishing the work? It’s more work. Kazamzam (talk) 04:54, 19 December 2024 (UTC)