This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
The bot drew the first date it saw on a webpage and decided that that was the date of publication. In reality, this website has no date of publication, but instead collects UK singles chart information for a particular band. The date it added (April 20, 1966) is actually the UK chart debut of the single "Daydream" by the Lovin' Spoonful.
Speaking of that diff, I notice Citation bot is modifying a {{cite book}} citation, which contains a |journal= parameter, where one of |title= and |journal= is a strict substring of the other, which both also match /[Pp]roceedings/ or /[Mm]eeting/ or /[Ss]ymposi/. I would think with these characteristics, Citation bot could confidently alter the template type to {{cite conference}}, removing the article from Category:CS1 errors: periodical ignored (22,730) (assuming it is the only erroneous citation of this type present in the article). Folly Mox (talk) 21:36, 30 December 2023 (UTC)
Publishing platforms often omit standard citation information (like "paper presented at Some Expensive Conference") in their treatment of conference proceedings
Publishing platforms almost always omit standard bibliographic information (like editorial contributions) in their treatment of conference proceedings
There's no other way to cite something that both has chapters and is a journal issue
Altering {{cite journal}} to {{cite book}} where an ISBN is present has contributed to tens of thousands of template errors
I'm not sure why you'd characterise this particular citation template as garbage (perhaps a link could be dropped if you're not feeling like explaining), but in my efforts to contract the maintenance category Category:CS1 errors: periodical ignored (22,730), I've had cause to use {{cite conference}} in dozens of cases, and there are likely thousands more that could be identified without too much difficulty. Folly Mox (talk) 22:24, 30 December 2023 (UTC)
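The heuristic described earlier in this thread (a {{cite book}} with |journal=, where one of |title= and |journal= is a strict substring of the other and both match the conference patterns) can be sketched roughly as below. This is only an illustration, not Citation bot's actual code: the function name `looks_like_conference` and the plain dict of parameters are assumptions made for the example.

```python
import re

# The three patterns quoted in the thread, combined into one expression.
CONFERENCE_HINT = re.compile(r"[Pp]roceedings|[Mm]eeting|[Ss]ymposi")


def looks_like_conference(template_name: str, params: dict) -> bool:
    """Return True when a {{cite book}} with |journal= fits the heuristic.

    Hypothetical helper: `params` maps parameter names to their values.
    """
    if template_name.strip().lower() != "cite book":
        return False
    title = params.get("title", "").strip()
    journal = params.get("journal", "").strip()
    if not title or not journal:
        return False
    # Strict substring: one value contains the other, but they are not equal.
    strict_substring = title != journal and (title in journal or journal in title)
    # Both values must also look like conference proceedings.
    return bool(
        strict_substring
        and CONFERENCE_HINT.search(title)
        and CONFERENCE_HINT.search(journal)
    )
```

A caller would still have to decide what to do with the redundant parameter after switching the template name; the sketch only answers whether the citation is a candidate.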
add a time component to some free DOIs
DOI prefix 10.1155's registrant is Hindawi, an open access publisher. However, Hindawi became open access in 2007, and some (rare) DOIs from prior to 2007 are not free, e.g.
On a related tangent, a substantial fraction of "dead" DOIs on Wikipedia are DOIs that are owned by a different company than the current journal owner. Medknow is another big cause. AManWithNoPlan (talk) 18:21, 1 January 2024 (UTC)
Hindawi is the registrant for 10.1155 DOIs. Who currently legally owns that particular article is irrelevant. Headbomb {t · c · p · b}23:20, 1 January 2024 (UTC)
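One way to picture the requested "time component" is a lookup that pairs a DOI prefix with the first year from which its content is presumed free. The sketch below is an assumption-laden illustration, not the bot's implementation: the table `FREE_FROM_YEAR`, the function `doi_is_presumed_free`, and the idea of passing the citation's publication year are all made up for the example. Only the 10.1155 / 2007 pairing comes from the request above.

```python
from typing import Optional

# Hypothetical table: DOI prefix -> first publication year treated as free.
# The single entry reflects the request above (Hindawi, open access from 2007).
FREE_FROM_YEAR = {"10.1155": 2007}


def doi_is_presumed_free(doi: str, year: Optional[int]) -> bool:
    """Return True only when the prefix is listed and the year is late enough."""
    prefix = doi.split("/", 1)[0]
    cutoff = FREE_FROM_YEAR.get(prefix)
    if cutoff is None or year is None:
        # Unknown prefix or unknown year: make no claim about free access.
        return False
    return year >= cutoff
```

Returning False when the year is unknown is a conservative choice here: it avoids adding |doi-access=free to the rare pre-2007 articles at the cost of missing some that are free.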
Citation bot removes functioning links to pdfs.semanticscholar.org from the URL parameter.
What should happen
Only remove PDF links if they're actually duplicative of the identifier, i.e. if they were redirected to the landing page. While most pdfs.semanticscholar.org URLs got broken at some point, some are still working. The presence of the identifier gives no information about the availability of the full text, so it does not convey the same amount of information. (At least until an s2cid-access=free parameter is introduced, but I'm not aware of such a proposal.)
Relevant diffs/links
special:diff/1192041925 (though in this case there's also a doi-access=free link so the title remains linked)
That's the only ISBN in the article. How would Citation bot know the intended style for the article was a hyphenated ISBN? This doesn't seem like a bug to me. Folly Mox (talk) 21:19, 31 December 2023 (UTC)
Another alternative is to replace the unhyphenated ISBN with {{Format ISBN|<10- or 13-digit ISBN>}}. That template is set to auto-subst so AnomieBOT will take care of the substing. Because Citation bot is a bot, it may be necessary to add {{Format ISBN}} to User:AnomieBOT/TemplateSubster force.
That not being acceptable, Module:Format ISBN/data holds a list of ISBN ranges and the number of digits in each of the three center digit groups of a 13-digit ISBN. Perhaps the bot can read that module.
Normally the bot is able to figure out the DOI from the URL, but here I had to give the DOI before the bot processed that citation. Headbomb {t · c · p · b} 21:07, 2 January 2024 (UTC)
Thanks for checking. Too bad this publisher is so confusing, there might be some geographical restrictions involved. It's probably safer to only link PubMedCentral when available. Nemo14:35, 31 December 2023 (UTC)
Indeed, both links are free. They want your email for the PDF download, but the HTML version is displayed by default. There's nothing weird about how Cureus works. Headbomb {t · c · p · b}16:47, 31 December 2023 (UTC)
The result is the same. Their website doesn't offer anything useful, so there's no point sending people there with false hopes of finding what an open access article would usually provide. Nemo16:41, 1 January 2024 (UTC)
Can you provide a screenshot or something? Your personal browser, based on which you make decisions about what automated edits should be made across wikipedia, seems consistently significantly different than everyone else's browser. –jacobolus(t)03:36, 5 January 2024 (UTC)
Maybe you can also give more information about your browser settings / location / ...?
Every browser on my machines across 2 operating systems, including when I try to access these via a different IP address, shows the full text of the articles. –jacobolus(t)03:43, 5 January 2024 (UTC)
True, if I don't click anything I instead get a blank screen with a banner: phabricator:F41651606. (Granted, that's partly because my browser is configured to reject, by default, the various surveillance systems installed on this website.) Nemo15:27, 5 January 2024 (UTC)
{{fixed}} Ironically, it was "extra text" that caused the problem. There was an invisible unicode character in the source code from copying and pasting your lists. AManWithNoPlan (talk) 19:23, 11 January 2024 (UTC)
Any online source should use cite web. On the jazz project Cleanup Listing, I have fixed many errors due to people using cite book and cite magazine instead of cite web. On the Steve Oliver page here, Citationbot changed the Billboard reference from cite web to cite magazine. Why? Nearly always, the citation is from an online source (an online version of Billboard), not the physical copy of the magazine. I'm not a fan of Citationbot's changes.—Vmavanti (talk) 03:41, 9 January 2024 (UTC)
Agree. Online sources that are books should use cite book. Online sources that are magazine articles should use cite magazine. Online sources that are journal articles should use cite journal. It is simply false that "Any online source should use cite web.". These are not errors and should not be "fixed". —David Eppstein (talk) 06:32, 9 January 2024 (UTC)
I disagree. A web source should use cite web. Take a look sometime at the kinds of errors found on the Jazz Cleanup Listing. I didn't change them BECAUSE they used cite news. I changed them because the cite news usages were creating error messages as found in the Cleanup Listing.Vmavanti (talk) 15:33, 9 January 2024 (UTC)
I think it's useful to distinguish when a reference is for a print magazine, though, since the parameters will be different (presence of page numbers, date of publication, quite often the same article has different titles in print vs online). WP:SAYWHEREYOUGOTIT, if you got it from the website that’s different from a print magazine — especially for older articles which might have digitization errors from OCR or if an online version has (sometimes silently) made emendations.
Billboard is also an online database, and references to that as a magazine I think also confuses things.
This is different of course from digital facsimiles of magazines and books which are identical in all respects to the paper versions including pagination. Umimmak (talk) 06:43, 9 January 2024 (UTC)
I'm basing my judgment on 1) common sense (a web source uses cite web); 2) it's an easier template for contributors to use, based on the number of errors I have seen over eight years of editing when it comes to using cite news or cite magazine. Ask a member of the public. I have spoken to many of them over the years. I have seen their successes and their mistakes. Plenty.Vmavanti (talk) 15:33, 9 January 2024 (UTC)
Disagree. Vmavanti has pointed out the flaw in that reasoning. Templates like cite magazine were never designed to be used for simple web pages. Hence its having parameters like page number. Only if it is a digital copy (e.g. scan) should such a template be used for an online source. Otherwise web pages should use cite web. Tvx1 16:59, 11 January 2024 (UTC)
What if someone is trying to verify the citation who has access to a library with a physical copy of the magazine but not proxy access to the institutional subscription to the magazine? Why not include all necessary parameters to verify a source by either physical or digital means, if the information is available? Folly Mox (talk) 17:16, 11 January 2024 (UTC)
The lack of page numbers is irrelevant. Lots of academic journals no longer refer to articles by page numbers. They are still journals and should be cited using {{cite journal}}. Same for magazines. The type of citation format describes the type of publication that is being cited, its editorial or organizational structure, not the format by which some editor happened to find it, because that is more important for understanding the nature of the source. —David Eppstein (talk) 17:57, 11 January 2024 (UTC)
This is the incorrect usage of Via. Without a URL, there can be no "via". There are also often many copies of the same book on Google, and they come and go. Saying "via Google Books" is no more helpful than "I googled it, so trust me". AManWithNoPlan (talk) 15:39, 9 January 2024 (UTC)
I don't think removal of |via= where |url= is not present constitutes an error, but I did notice in this diff Citation bot modifying a {{cite report}} (Ewert et al 2018) by reparameterising |title= to |chapter=, then adding a |title= that duplicated |series=, which I just fixed. Folly Mox (talk) 18:53, 10 January 2024 (UTC)
IMO, via=Google Books makes as much sense as via=the Faculty library, i.e., none, it is just noise. Ditto via=Internet Archive and via=JSTOR. Can someone give an example of a sensible via that does not have a url because I can't think of any.
remove publisher = NLM for cite journal
Status
Won't fix -- too many books, etc. I did most of them by hand
Factiva links are being replaced, as in this edit. The new links do not work. The original https://global-factiva-com.bris.idm.oclc.org links are created in Factiva when I use the "Share" function and select "In other accounts". So, despite the URL containing an identifier for my institution, it seems like these links are suitable for wider sharing. The new link format introduced by the bot does not let me access the news articles I've cited.
Wontfix addition of mojibake is an invitation for {{bots|deny}}, or stronger measures if it is more widespread. What do you think we would do to a human editor who did this? —David Eppstein (talk) 02:05, 4 February 2024 (UTC)
Reduce cosmetic edits by Citation bot
Citation bot is a useful tool that adds missing data and fixes formatting errors in citation templates. However, sometimes it makes minor edits that do not affect the appearance or content of the article, such as removing a space between a quotation mark and a reference tag. These edits are considered cosmetic and are discouraged by the Wikipedia policy on bot usage. For example, see this: [11].
Cosmetic edits by bots can clutter the page history, the watchlist, and the recent changes, making it harder for editors to track the actual changes to the article. They can also trigger the abuse filter and lead to the account being blocked, as it happened to me: [12]. Therefore, I propose that Citation bot should avoid making cosmetic edits when checking multiple pages, unless they are accompanied by other significant edits.
To implement this feature, Citation bot could keep track of the number and type of edits it makes to each page before saving it. Then, it could compare the number of edits with a configurable threshold value, which would determine the minimum number of edits required for the bot to save the page. For example, the bot could save the page only if it makes at least 3 "fast" edits (such as adding a URL or an access date) or 1 "slow" edit (such as retrieving a bibcode or a DOI) per page. The default threshold value could be 1, to preserve the current behavior of the bot.
This way, Citation bot could reduce the number of cosmetic edits and comply with the Wikipedia policy, while still improving the quality and consistency of the citations. I think this would benefit both the bot operators and the Wikipedia community. What do you think? Maxim Masiutin (talk) 19:35, 25 January 2024 (UTC)
The bot is pretty good about avoiding minor edits already. Can you point to some specific minor edits? None of the above count as minor. AManWithNoPlan (talk) 14:38, 26 January 2024 (UTC)
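The threshold idea proposed above can be sketched as follows. This is a minimal illustration of the proposal only, not existing bot behaviour: the `EditTally` class, the `should_save` function and its parameter names are hypothetical, and the split into "fast" and "slow" edits simply follows the wording of the proposal.

```python
from dataclasses import dataclass


@dataclass
class EditTally:
    """Counts of changes made to one page before saving (hypothetical)."""
    fast: int = 0  # e.g. adding a URL or an access date
    slow: int = 0  # e.g. retrieving a bibcode or a DOI


def should_save(tally: EditTally, min_fast: int = 1, min_slow: int = 1) -> bool:
    """Save only if enough fast edits OR enough slow edits were made.

    The defaults of 1 mirror the proposal's default threshold, which is
    meant to preserve the current behaviour (any change gets saved).
    """
    return tally.slow >= min_slow or tally.fast >= min_fast
```

With the example values from the proposal, `should_save(EditTally(fast=2), min_fast=3)` would skip the page, while a single slow edit would still be saved.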
Not that. "Economics Working Paper Archive" is not a journal.
We can't proceed until
Feedback from maintainers
In attempting to reinstantiate the same bad edit [13], the bot made a different bad edit that mashed together two differently titled versions of the same news story [14][15]. The version with the doi is the better version to use, but the bot should not have mashed that into the existing citation of the other version. —David Eppstein (talk) 20:43, 30 January 2024 (UTC)
Could you make it so that the free DOI checking based on prefixes is done before the broken check? It was a bit of a pain in the ass to deal with all the broken Medknow DOIs that should have been flagged as free but weren't because they were broken. Headbomb {t · c · p · b} 02:29, 24 January 2024 (UTC)
Ignoring the fact that this source took its content from Wikipedia, bot converted more-or-less correct {{cite web}} to a wholly incorrect {{cite journal}}. This error caught because the bot included html numeric entities for [[ and ]] around whatever it was that it thought to be the journal name.
Idk, can't people click through to the publisher? Seems like potentially a lot of processing to save a click for extra low effort editors.... And does google scholar have reliably complete citations? Or is it more like Citation bot would have to be programmed to follow the link and parse the target? Folly Mox (talk) 03:01, 27 December 2023 (UTC)
This request is to parse the Google Scholar information since it's given, and fill the template accordingly. IDC what happens to the original link. Headbomb {t · c · p · b} 03:13, 27 December 2023 (UTC)
The bot would have to follow the link and expand based upon that. The other problem is that a lot of the links are intended to be to scholar and not the article itself. AManWithNoPlan (talk) 16:03, 27 December 2023 (UTC)
I think the key part is view_op=view_citation in the url. If that's there, seems to be a metadata page, rather than something more useful. Headbomb {t · c · p · b}22:12, 27 December 2023 (UTC)
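Detecting the view_op=view_citation case mentioned above is straightforward with a URL parser. The sketch below is only an assumption about how such a check might look: the function name `is_scholar_citation_page` and the host test are invented for the example, and nothing here reflects how the bot actually handles these links.

```python
from urllib.parse import urlparse, parse_qs


def is_scholar_citation_page(url: str) -> bool:
    """True for Google Scholar URLs whose query has view_op=view_citation."""
    parts = urlparse(url)
    host = parts.hostname or ""
    # Assumption: any scholar.google.* host counts (country domains included).
    if not host.startswith("scholar.google."):
        return False
    query = parse_qs(parts.query)
    return "view_citation" in query.get("view_op", [])
```

Parsing the query string instead of searching the raw URL avoids false positives when the text happens to appear in another parameter's value.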
Adding today's date as publication date for a book
Status
{{fixed}} by adding some code to detect super new dates on google books. Weird. I cannot replicate this.
Attempts to create a last/first split name from a non-name: Iowa. (Ter.) (from Google books metadata?) → | last1=) | first1=Iowa. (Ter
What should happen
values assigned to |authorn=, |firstn=, |lastn= should never be composed solely of punctuation and/or digits; this applies to the other namelists as well
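The rule stated above (no name value made up solely of punctuation and/or digits) amounts to requiring at least one letter. A minimal sketch, with hypothetical function names and no claim about how the bot implements its own name handling:

```python
def is_plausible_name(value: str) -> bool:
    """A name value must contain at least one letter (any script)."""
    return any(ch.isalpha() for ch in value)


def valid_name_split(last: str, first: str) -> bool:
    """Reject a last/first split if either half fails the letter test."""
    return is_plausible_name(last) and is_plausible_name(first)
```

Applied to the reported case, `valid_name_split(")", "Iowa. (Ter")` is rejected because |last1= would hold only a closing parenthesis. The test does not decide whether "Iowa. (Ter.)" is a person at all; it only blocks the degenerate split.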
What is the bot doing here? It takes out the pages and puts in "gigabyte" and some numbers. I undid it and it repeated the next day, so it's not some transient thing. Abductive (reasoning)04:46, 10 February 2024 (UTC)
This is totally wrong. The bot is replacing |page(s)= with the suffix from |doi=. The bot should not be doing that. Ever.
|pmcid=PMC5528981 (pmcid is not a valid parameter; PMC5528981 is not a valid |pmc= value) is wrongly converted to |s2cid=PMC5528981 where PMC5528981 is not a valid |s2cid= value
I can see there are some issues with the links in an article I edited on Wikipedia. The bot might find it unreliable, but the links I have used, which I feel might be the issue, are 100% legitimate. Actually, that is the website we check to confirm the results of chess matches and tournaments.
So please help me remove the warning from the top of the page. I am hoping to present it to the person, on his birthday, which is in less than 24 hours now.
Thanks
Adds Interstate Commerce Commission as report author
Status
{{fixed}} - added bad author detection to DX.doi.org code.
"United States. Interstate Commerce Commission" added as author of NTSB reports
What should happen
no author should be added for reports with no named authors, as long as the report publisher is the entity that actually authored the report. If any author is added, it should be the publisher itself, in this case the NTSB, not the Interstate Commerce Commission, who had no involvement with the report.
>Consult APIs to expand templates
>Checking that DOI 10.1111/een.13011 is operational...
!CrossRef title did not match existing title: doi:10.1111/een.13011
> Possible new title: Lava crickets (Caconemobius spp.) on Hawai'i Island: first colonisers or persisters in extreme habitats?
> Existing old title: Lava crickets (''Caconemobius'' spp.) on Hawai'i Island: first colonisers or persisters in extreme habitats?
What should happen
Bot should ignore the italics/bold/etc, or at least attempt a re-match without the formatting... or is this just the destination spitting out bad data and the bot is doing the correct recheck?
We can't proceed until
Feedback from maintainers
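The re-match without formatting suggested in this report could look roughly like the sketch below. It is an illustration under assumptions: the helper names are invented, only wiki italic/bold apostrophe markup is stripped, and the comparison also ignores case and whitespace runs, which goes slightly beyond what the report asks for.

```python
import re

# Runs of two to five apostrophes are wiki italic/bold markup;
# a single apostrophe (as in "Hawai'i") is left alone.
_WIKI_EMPHASIS = re.compile(r"'{2,5}")


def _normalise(title: str) -> str:
    """Drop italic/bold markup, collapse whitespace, ignore case."""
    without_markup = _WIKI_EMPHASIS.sub("", title)
    return " ".join(without_markup.split()).casefold()


def titles_match(existing: str, candidate: str) -> bool:
    """Compare two titles after removing wiki emphasis markup."""
    return _normalise(existing) == _normalise(candidate)
```

With the two titles from the log above, the existing title with ''Caconemobius'' and the CrossRef title without markup compare as equal.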
Single and the only author expanded from first/last to first1/last1
@AManWithNoPlan: I raised an issue in the past, but it is now archived as User_talk:Citation_bot/Archive_37, see there: "Bug? The bot should not replace first/last to first1/last1 when there is just one author"
There was a reference:
<ref>{{Cite book |last=Handy |first=E. S. Craighill |url={{google books|plainurl=y|id=PoXQAgAAQBAJ|page=120}}|title=Ancient Hawaiian Civilization: A Series of Lectures Delivered at THE KAMEHAMEHA SCHOOLS |last2=Davis |date=2012-12-21 |publisher=Tuttle Publishing |isbn=978-1-4629-0438-9 |language=en}}</ref>
The bot changed it to the following:
<ref>{{Cite book |last1=Handy |first1=E. S. Craighill |url={{google books|plainurl=y|id=PoXQAgAAQBAJ|page=120}}|title=Ancient Hawaiian Civilization: A Series of Lectures Delivered at THE KAMEHAMEHA SCHOOLS |last2=Davis |date=2012-12-21 |publisher=Tuttle Publishing |isbn=978-1-4629-0438-9 |language=en}}</ref>
To reproduce it, copy the example to your sandbox and click the "Citations" button (you should have this button enabled as a gadget in your Wikipedia preferences).
@AManWithNoPlan: You wrote {{tl|wontfix}}, since the complexity of going back and changing them will just make the bot's author handling that much more insane, and it is already complicated enough.
How did you come to the conclusion that it is already complicated enough? This issue, and issues like it, are very important, because an edit in which the bot only changes first/last to first1/last1 is not only questionable but also triggers a WP:COSMETICBOT violation.
Unnecessary volume parameter
Status
{{fixed}} - added International Journal of the Sociology of Language to list of journals that crossref calls issues volumes
The bot added |volume= with the value of 134, which actually pertains to the |issue=, and is already included. The actual volume of the source is the year-related value of 1998, and can be confusing; it is redundant.
No matter what article I run Citation Bot on, it reports !Operation timed out after 20001 milliseconds with 0 bytes received or >Could not resolve URL for every URL it tries to process, before eventually !Giving up on URL expansion for a while. It appears that at least some of the specialized URL expanders (for journals and such) are still working, but not the general-purpose one. :Jay8g [V•T•E] 01:04, 27 February 2024 (UTC)
Please remove claim that editors may use the bot to check a single article, because it consistently fails
Every time I have tried to use the bot to check a single article, it fails with Error: Citations request failed. The bot has always suffered intermittently from this problem, but at least BrownHairedGirl used to challenge those few editors who saturated the service. Since she was blocked, there has been no one with the expertise to call out the abuse, so presumably it has become endemic.
Regardless of the reasons, it is surely time to declare in all honesty that the bot is not a practical option for single shot use. Better still, create another instance of the bot that cannot be used by any editor for more than one article in a 24-hour period. 𝕁𝕄𝔽 (talk) 12:22, 3 March 2024 (UTC)
Example of such a failure? Because the bot hasn't had any availability issues in several years for me. Headbomb {t · c · p · b}14:36, 3 March 2024 (UTC)
I always have this issue when using the edit window button that the gadget adds, but using the toolbar button (in read mode) or the Toolforge page almost always works. I've just given up on the edit window button, assuming it was something wrong with my configuration (perhaps Firefox or one of my extensions is blocking it), since the other methods don't give me problems. :Jay8g [V•T•E] 00:13, 4 March 2024 (UTC)
Thanks. I've been using it from the edit window button and had the problems. Trying it from the toolbar link worked fine on a couple of articles. Looks like the problem is with the edit window button rather than the bot itself. --John B123 (talk) 01:00, 4 March 2024 (UTC)
I use it from the toolbar button on the edit window, I'm not aware of alternative methods. I tried to use it on Robert Hooke four times on three widely [by hours, then days] separated occasions and each time it failed. This is precisely the behaviour that I had come to expect and so had given up on using the tool. But a GA award and an upcoming DYK made me feel I should try it again. The outcome is as I expected: failure. --𝕁𝕄𝔽 (talk) 08:57, 4 March 2024 (UTC)
And just to take the wind out of my sails, I tried again just now and it went through in less than a second (no changes required, which is all I wanted to know). But my concern still stands: is anybody monitoring these failures? --𝕁𝕄𝔽 (talk) 11:17, 4 March 2024 (UTC)
That generates an error something like "no response from server". In this case, the response is Error: Citations request failed, which can only come from the bot, surely? --𝕁𝕄𝔽 (talk) 16:55, 4 March 2024 (UTC)
I feel like there's a bit of confusion about the different options here. There are three different ways to activate the bot:
Through the "expand citations" link in read mode (in the tool area on the right side of the page if you're using the default skin) - this almost always works unless the bot is down
Through the "✓ Citations" button in edit mode -- this almost never works and gives the "Error: Citations request failed" message described by users above
The first two sometimes give a Wikimedia Error message but the bot still runs, as AManWithNoPlan describes above. The last option is the one that I believe John B123 and JMF are describing. This is the one that puts the edits into the edit window to be saved under the user's account (rather than under the Citation Bot account), so it's not something that can still work in the background even if the browser gives up. :Jay8g [V•T•E] 19:05, 4 March 2024 (UTC)
But the browser is not giving up, because the response in that case would be "no response from server". It has to be the widget.
So if the honest appraisal is that the ✓ Citations button is consistently unreliable (which is true), then it is time to remove it and stop advertising it. 𝕁𝕄𝔽 (talk) 08:51, 5 March 2024 (UTC)
The button is still useful and works in the vast majority of cases. That it fails on very large pages and on select citations is not a reason to remove the option. Headbomb {t · c · p · b}12:34, 5 March 2024 (UTC)
In that case, can the Error: Citations request failed message be improved? Like "try again using the Expand citations option in the tools column (left)"? Because in all honesty, I would have to say that the toolbar icon works in the vast minority of cases. I have been complaining here for years and it has taken until now to find the solution. --𝕁𝕄𝔽 (talk) 12:45, 5 March 2024 (UTC)
Is there data to show that it "works in the vast majority of cases"? There are three of us here who can't seem to ever get it to work. It's possible that it doesn't work in some browsers/configurations -- I'm using Firefox. :Jay8g [V•T•E] 19:09, 5 March 2024 (UTC)
Bot added |chapter= to {{cite journal}}. {{cite journal}} and the other periodical or periodical-like cs1 templates ({{cite magazine}}, {{cite news}}, {{cite periodical}}, {{cite web}}) do not support |chapter= (and aliases |contribution=, |entry=, |article=, |section=)
This happens every couple of days. They are logged, and I go back and manually fix them (if someone else does not get to them first). Almost always, the citation is broken before the chapter is added. AManWithNoPlan (talk) 15:03, 20 February 2024 (UTC)
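A guard against the reported edit could be a simple membership check run before |chapter= or one of its aliases is added. The sketch below is hypothetical (the function name and the dict representation of a template are assumptions); the template and parameter lists are the ones given in the report above.

```python
# Templates and parameters exactly as listed in the bug report.
PERIODICAL_TEMPLATES = {
    "cite journal", "cite magazine", "cite news", "cite periodical", "cite web",
}
CHAPTER_ALIASES = {"chapter", "contribution", "entry", "article", "section"}


def unsupported_chapter_params(template_name: str, params: dict) -> list:
    """Return the chapter-like parameters a periodical template cannot use."""
    if template_name.strip().lower() not in PERIODICAL_TEMPLATES:
        return []
    return sorted(name for name in params if name in CHAPTER_ALIASES)
```

An empty list means the template is either not a periodical template or carries no chapter-like parameter, so nothing needs to be logged or refused.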
The bot is probably trying to address all those broken reFill edits, but in some rare cases it's actually correct |title=Wayback Machine. Not sure how to address. -- GreenC23:45, 28 February 2024 (UTC)
Should not remove via=The Wikipedia Library on cite encyclopedia
Not that. The via link on this citation is a necessary part of the citation, as it describes how a copy of the citation was and can be obtained. Without that information, the citation fails to describe how to find the reference.
We can't proceed until
Feedback from maintainers
I think I agree with Citation bot on this one. I think the parameter value should be |via=EBSCO Literary Reference Center Plus. It wasn't obvious to me, a Wikipedia Library user, that I was supposed to use the default search bar at the top that is powered by EBSCO. They have pretty poor coverage of my usual topic areas. After forgetting to place my search term, "Baker & Taylor Author Biographies", in quotes for literal string matching, I actually went to Taylor & Francis next on a misguided hunch, before just asking google which publishing platform licensed the reference work, after which I was able to verify that TWL does provide access to it.
If that was my experience, what about the experience of a reader without TWL access who tries to verify that citation? What about our experience when someone sets |via=Inaccessible University Undergraduate Library System? Folly Mox (talk) 07:29, 14 December 2023 (UTC)
The fact that you found this specific instantiation of WP:SAYWHEREYOUGOTIT difficult to follow might be a reason for making an easier-to-follow recipe for finding the information. It is not an excuse for blanking that information. Also, although that source happens to be in the EBSCO source, I think the default search bar uses a combination of sources. I have found plenty of non-EBSCO material that way. I agree the search is not good in general, but in this case searching the title as a quoted string found it easily. —David Eppstein (talk) 07:33, 14 December 2023 (UTC)
I agree that had I formatted my initial search properly, I would have found the source without false starts and getting lost. I think what I'm trying to communicate is that |via=(membership in something with an institutional subscription) is never going to be helpful for people outside that membership, and even for the members it's more of a starting point (yes, I should be able to access this content) than a way (via) to access the content. Just some sleepy thoughts. Folly Mox (talk) 08:09, 14 December 2023 (UTC)
For the same reason you think we should remove all paywalled doi links on journal articles because they are never going to be helpful to someone without a subscription? Maybe just remove non-free-to-read references altogether? No. —David Eppstein (talk) 15:23, 14 December 2023 (UTC)
I'm sorry I communicated so poorly. That is not at all what I intended, and after having slept I do agree with you that no |via= parameter is a disimprovement in this case over TWL, but EBSCO would be more helpful (since other institutional subscriptions have access to it). Folly Mox (talk) 17:18, 14 December 2023 (UTC)
The distinction I'm attempting to draw here is between publishing platforms (accessible to many different groups, host the actual content, material of sufficient interest to wealthy outgroup folk can be purchased for an exorbitant sum) and access systems (TWL, SomeUniversity.edu, "I have access to ProQuest because I'm a journalist or whatever"). Access systems are generally entirely closed and invite-only, and typically don't offer a means to help specify which work is cited except by proxy links that only work when logged in to the access system. In this case, if we take "The Wikipedia Library" to mean "The Wikipedia Library search bar", that does sufficiently identify the source, but it still only works for us. Even if we take it at face value, like I did, it gives us a starting point for verification in the way that Citation bot's removal doesn't. EBSCO would let any reader know which publisher they or their institution needs a subscription with (or to hand over money to) in order to verify. Folly Mox (talk) 19:03, 14 December 2023 (UTC)
I do not know how to make an EBSCO link to EBSCO content obtained through the Wikipedia Library that will remain permanently valid and will allow both Wikipedia Library subscribers and other EBSCO subscribers to access the content. If I did know how to provide such a link, I would have used it instead of just saying that you can find the content through the Wikipedia Library. Maybe you can educate me on how to provide such links instead of continuing to harangue me on how the access method I described was somehow so useless that bot-removal was an improvement. —David Eppstein (talk) 20:03, 14 December 2023 (UTC)
David Eppstein, I'm legitimately deeply sorry I've made you feel harangued. I've been trying to explain myself, because I was feeling misunderstood entirely (which is likely my fault due to poor wording). I did say above that I have come round to the feeling that Citation bot's edit was a disimprovement on your original |via=. As to creating an EBSCO link, that's also not what I intended to mean. My position is that the most useful value of |via= for this citation is "EBSCO Literary Reference Center Plus" as I said in my original comment. That's all. Sorry again. Folly Mox (talk) 21:51, 14 December 2023 (UTC)
I, for one, would not have any idea how to access "EBSCO Literary Reference Center Plus" (except maybe after seeing this thread), despite regularly using The Wikipedia Library. —David Eppstein (talk) 22:19, 14 December 2023 (UTC)
I also agree with Citation bot. Inclusion of |via=Wikipedia Library is cruft of very low value. The specific library system through which someone accessed a source (or even, gasp, Sci-hub) does not need documentation. Ifly6 (talk) 15:51, 14 December 2023 (UTC)
We need some way of identifying how to find the citation. In this case my judgement as an editor was that the title and name of work alone were inadequate, and that the via= provided that identification. This is not the sort of judgement Citation bot should be automatically reversing. Your opinion as another human agreeing with the removal is not relevant to the question of whether this is the sort of edit a bot should be making. —David Eppstein (talk) 17:56, 14 December 2023 (UTC)
In this particular case, the via= parameter is rather helpful; the citation is bare enough without it that improving it was on my list of things to fix about the article. The only bot edit I could imagine being good here would be to wiki-link all occurrences of The Wikipedia Library in the via= parameter, because it's probably unfamiliar to readers who aren't themselves fairly serious Wikipedia editors. XOR'easter (talk) 18:14, 14 December 2023 (UTC)
From the citation I'm not quite sure what "Baker & Taylor Author Biographies" is. It would help to specify that Baker & Taylor is the publisher and what format the work is in. It seems to be some kind of database, so people would know to search it in the usual places like Worldcat. Given the date, it's most likely based on a previously published book which the publisher has acquired, so the best solution would be to cite the original authors and source. Nemo20:58, 14 December 2023 (UTC)
I don't know exactly what it is either. It is what The Wikipedia Library told me the citation was from. The suggested AMA-format citation provided by EBSCO / The Wikipedia Library is:
You will notice the useless login-page url and the total lack of publisher and format information. Given that information, it's not obvious how a human editor could reasonably have been expected to produce anything better. But we are not here to talk about that, we are here to talk about how a bot editor can be prevented from making a not-very-good citation even worse. —David Eppstein (talk) 22:24, 14 December 2023 (UTC)
Incidentally, by some web searching I found a different way to link EBSCO content: if you use the "permalink" function on the right toolbar you will get a link that demands a Wikipedia Library login rather than an EBSCO login. So I guess it can only be read by other Wikipedia library users? How helpful. —David Eppstein (talk) 22:36, 14 December 2023 (UTC)
I'd recommend replacing the "Via" with "Literary Reference Center Plus", since the source is not really the Wikipedia Library per se. I see using the latter for "via" as something akin to putting "via=My local librarian printed it out for me", which is frankly not very useful to anyone who has a different local librarian. –jacobolus(t)00:45, 15 December 2023 (UTC)
The intended meaning of the "via" was that to access this source, assuming you have Wikipedia Library access, you should go to the Wikipedia Library and type the title into the search bar across the top of the screen. The search bar is not labeled "Literary Reference Center Plus". I do not know what "Literary Reference Center Plus" is. Searching the Wikipedia Library page for the string "Literary Reference Center Plus" finds nothing. Putting via="Literary Reference Center Plus" would, for me, be as useless as leaving it blank. Not everyone has the same local librarian but all established Wikipedia editors (you know, the people who might want to verify a reference, for instance to see what it says in the context of an AfD discussion or to use it to expand the article) have the same Wikipedia Library. It would be better to have a link that readers and not just editors could access, but we don't. And again, you're missing the point: it should not be whether someone else might have come up with a better description of how to access the reference, it should be whether it is appropriate for a bot to be blanking this deliberately-included information. —David Eppstein (talk) 01:57, 15 December 2023 (UTC)
It is indeed unfortunate though that EBSCO and Baker & Taylor are apparently really bad at providing meaningful links or information about their various published documents.
There is at least a little bit more relevant metadata which might help someone locate this document: Baker & Taylor Author Biographies is OCLC 877175691, and apparently at Literary Reference Center Plus (the name of the EBSCO database providing the document, accessible from a wide variety of public and university libraries, which should definitely be mentioned somewhere in this citation), this particular record is apparently Accession Number 49334395.
You're probably right that the bot shouldn't blank the via parameter in this kind of case. I wouldn't be surprised to see a human editor blanking it though. –jacobolus(t)03:26, 15 December 2023 (UTC)
Finally through some more searching I find that the correct solution (I think?) should be to use {{EBSCOhost}} with the id as a parameter. I say "should be" because it doesn't actually work. The example in the EBSCOhost template documentation leads to a document, but the one in the citation above just sends me to a search page that tells me nothing by that id was found in the "Academic Search Complete" database. To make it work I also have to include the magic incantation dbcode=lkh: "Anne Sigismund Huff". Baker & Taylor Author Biographies. January 2000. EBSCOhost 49334395. Now wouldn't it be nice if a bot could figure all that out instead of just blanking things. —David Eppstein (talk) 06:28, 15 December 2023 (UTC)
Add links to Internet Archive Scholar archived copies, where available and found by DOI, if Unpaywall and PMC have none.
We can't proceed until
Feedback from maintainers
This should be relatively fast with the API; Google Scholar is doing the same and shows those OA links, which were generally archived due to being public domain or CC-licensed. You can see the docs at https://scholar.archive.org/api/redoc but here's an example:
If by "a lot" you mean about 2 million out of 25 million: yes, I'd expect the entire arxiv to be archived by IA scholar. There's no need to link these if there's already an arxiv identifier. (Though it's sad that the arxiv identifier doesn't auto-link.) Nemo22:31, 4 December 2023 (UTC)
I am curious which type of url is best. I am always a bit leery of PDF links that do not end in PDF (option 3). I wonder if the first method would ever provide multiple options. AManWithNoPlan (talk) 22:13, 7 December 2023 (UTC)
Recommend the /download/ link, because it has the .pdf extension, it's more standard than the scholar.archive.org URLs, the URL is shorter and less complex, it's more aligned with where the content is actually located. scholar.archive.org is basically an index, not a repository. The data is hosted at //archive.org (that seems confusing since it's the same site but they are different servers). -- GreenC01:21, 8 December 2023 (UTC)
As GreenC says, the archive.org/download/ links are usually preferred. In this case I'd prefer the scholar.archive.org resolver because 1) the edits will look more consistent, using the same domain name whether the PDF is under web.archive.org or archive.org, 2) some of these items might be split and relocated in the future, in which case the scholar.archive.org links will probably still work somewhat but the archive.org/download/ links may break. These are just aesthetic or very rare issues though.
I recommend using scholar.archive.org for the works which are linked to web.archive.org though, because bots and the cite templates themselves often complain about web.archive.org being in the url parameter, so you'd be forced to add all of url, archive-url, url-status=unfit and the entire family of parameters. Nemo09:19, 8 December 2023 (UTC)
From valid reference to unrecognizable junk in three Citation bot edits
Not reporting as a bug, because I think the original bug that started this chain of garbage is long fixed, but: Special:Diff/924930722 (2019): adds the dois for the reviewed items to two references to reviews of the item; Special:Diff/958243206 (2022): expands one of the references with more metadata from the doi; Special:Diff/1196114785 (2024): piles on even more metadata creating a broken citation template because of incompatible parameters (|chapter=, added in this edit, and |journal=, present in the original reference).
This is a phenomenon I have frequently complained about, to little avail: when Citation bot takes a single pass over an article, the results are often (but not always) improvements. But when Citation bot and the other bots take pass after pass after pass over an article, any mistakes are amplified, to the point where eventually they overwhelm the improvements.
Given that the last of these was "suggested by Grimes2": User:Grimes2, you are ultimately responsible for these bad edits. Please take more care in checking that the results of your suggestions are actually improvements. —David Eppstein (talk) 18:06, 16 January 2024 (UTC)
Tarantello, Gabriella (2009). "On Some Elliptic Problems in the Study of Selfdual Chern-Simons Vortices". Geometric Analysis and PDEs. Vol. 1977. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-01674-5_4. ISBN 978-3-642-01673-8.
That is not my version. My version is Esposito, Pierpaolo (2010), Mathematical Reviews, MR 2500525. It is intended as a reference to Esposito's review of Tarantello's book, not as a reference to the book. —David Eppstein (talk) 19:17, 16 January 2024 (UTC)
The problem is that the DOI is wrong. Once a citation script gets ahold of an incorrect DOI, there's really no stopping the compounding errors. Folly Mox (talk) 19:48, 16 January 2024 (UTC)
The problem is that the wrong DOI was added by Citation bot (an old fixed bug). And then over the course of two more edits Citation bot took that wrong DOI as an excuse to add more and more wrong metadata until the citation was totally trashed. The problem is that this sort of bot edit can amplify earlier bot mistakes into bigger mistakes, and over time that noise comes to dominate any signal. —David Eppstein (talk) 20:29, 16 January 2024 (UTC)
The review is subscription-only content under the MR. If you are not a subscriber you will only see the metadata for the book that it reviews. If you are a subscriber you will see the review: a paragraph of text, beginning "The author gives a nice and very clear survey on some planar elliptic problems ..." —David Eppstein (talk) 20:27, 16 January 2024 (UTC)
I'm all for people activating Citation bot taking more care in making sure that Citation bot's edits aren't creating template errors or garbling references. I've been slowly gnoming away errors via Special:RandomInCategory/CS1 errors: periodical ignored, and low key recording which user activated Citation bot where Citation bot is responsible for the error and I bothered to check the history. Results can be seen by searching my recent contributions for "activator" (changed from "suggested by"). Most of the major Citation bot users make appearances there, and the sample size is still really low, but the vibe seems to be that the people who use Citation bot to run over a whole category don't appear to check in on its output after a run, which they should definitely be doing. That said, I'm not sure how anyone could have detected the problem under discussion in this thread after the most recent run. Even with the stable identifier MR 2500525, I don't see any mention of the reviewer Pierpaolo Esposito, and haven't been able to find this review with a manual search of the AMS website or with google scholar. At least on my device, the linked review has no information that it's a review at all, and just gives the information about the reviewed work. David Eppstein, is there more information there for subscribers? As it stands, I don't know if there's any way a script could have avoided this error. Folly Mox (talk) 20:18, 16 January 2024 (UTC)
Grimes2, I haven't seen you leaving Citation bot errors unaddressed during my maintenance category repair work. Thanks for your diligence. The error under discussion here seems it would have been difficult to spot without an AMS subscription. Maybe for citations to this source going forward, the citation template syntax should include a string instructing Citation bot to ignore it. Bypass comments could also be added to existing citations to Mathematical Reviews in an AWB run, to prevent any further garbling. Folly Mox (talk) 22:13, 16 January 2024 (UTC)
The specific error of adding a DOI to an MR review citation was fixed long ago. My concern here is more general: that repeated bot runs tend to amplify earlier bot errors, so that the more times the bot is run on the same citations, the less likely it is to be a good citation. We need some mechanism to cut the bad feedback loop early. —David Eppstein (talk) 22:42, 16 January 2024 (UTC)
It's not so bad to run citation bot again on earlier bot errors, because the error becomes obvious (red error message/error category). Could citation bot detect those red error messages? Grimes2 (talk) 22:57, 16 January 2024 (UTC)
It would be pretty convenient if Citation bot could read the categories of the pages it edits before and after, and if it adds a page to a maintenance category, notify the activator on their talkpage. Folly Mox (talk) 00:19, 17 January 2024 (UTC)
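A minimal sketch of the before/after category comparison suggested above, in Python. The function name and the category prefixes are illustrative assumptions, not part of Citation bot; fetching the category lists and notifying the activator are left out.

```python
def new_maintenance_categories(before, after,
                               prefixes=("CS1 errors:", "CS1 maint:")):
    """Return maintenance categories present after an edit but not before.

    `before` and `after` are iterables of category names for the same page.
    The prefixes are an assumption about how maintenance categories are named.
    """
    added = set(after) - set(before)
    return sorted(c for c in added if c.startswith(prefixes))


# Hypothetical category lists for one page, before and after a bot edit.
before = {"Living people", "CS1 maint: untitled periodical"}
after = before | {"CS1 errors: periodical ignored"}

for category in new_maintenance_categories(before, after):
    print("Edit added page to:", category)
```

A non-empty result would be the signal to leave a note for whoever activated the run.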
|work=[[American Political Science Association]] 2010 Annual Meeting Paper -> |journal=American Political Science Association 2010 Annual Meeting Paper
What should happen
|work=[[American Political Science Association]] 2010 Annual Meeting Paper -> |journal=[[American Political Science Association]] 2010 Annual Meeting Paper (wikilink should be retained)
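One way to get the behaviour described under "What should happen" is to rename only the parameter name and never touch the value. The Python sketch below assumes the citation is available as a plain wikitext string; `rename_param` is a hypothetical helper, not the bot's actual code.

```python
import re


def rename_param(template, old, new):
    """Rename |old= to |new= in a template string, leaving the value as-is."""
    pattern = re.compile(r"(\|\s*)" + re.escape(old) + r"(\s*=)")
    # A function replacement avoids any backslash handling in `new`.
    return pattern.sub(lambda m: m.group(1) + new + m.group(2),
                       template, count=1)


source = ("{{cite journal |work=[[American Political Science Association]] "
          "2010 Annual Meeting Paper}}")
print(rename_param(source, "work", "journal"))
```

Because the value is never parsed or rebuilt, the wikilink brackets survive unchanged.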
In Special:Diff/1212531849, the bot borked yet another citation forcing manual cleanup. The citation in question goes to the 1969 2nd Dover edition of Neugebauer's book The Exact Sciences in Antiquity. Instead, the bot added metadata from the 1957 first edition (matching the "orig-year" parameter), published as a book in Acta Historica Scientiarum Naturalium et Medicinalium. By adding this using a |journal= parameter (invalid for cite book), instead of the correct |series= parameter, the bot broke the citation, causing it to emit an error message. The pmid that appears to have triggered this bad edit was also previously added by Citation bot, in 2020, in Special:Diff/985257358.
What should happen
Not that. At a bare minimum, the bot should never add |journal= to {{cite book}}.
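The bare-minimum rule above could be expressed as a guard that runs before any parameter is added. This Python sketch is illustrative only: the two tables contain just the combinations named on this page (|journal= in {{cite book}}, and |chapter= together with |journal=), and `may_add_param` is a hypothetical name.

```python
# Parameters that must never be added to a given template (illustrative).
FORBIDDEN = {"cite book": {"journal"}}

# Pairs of parameters that produce an error when both are present.
CONFLICTING_PAIRS = [frozenset({"chapter", "journal"})]


def may_add_param(template_name, existing_params, new_param):
    """Return False if adding `new_param` would create a known-bad template."""
    name = template_name.strip().lower()
    if new_param in FORBIDDEN.get(name, set()):
        return False
    for pair in CONFLICTING_PAIRS:
        # Adding one half of a pair is refused when the other half exists.
        if new_param in pair and (pair - {new_param}) & set(existing_params):
            return False
    return True


print(may_add_param("Cite book", ["title", "isbn"], "journal"))     # False
print(may_add_param("cite journal", ["journal", "doi"], "chapter"))  # False
print(may_add_param("cite book", ["title"], "series"))               # True
```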
Semantic scholar links continue to mostly consist of spam
Can Citation bot please stop littering every s2cid it can find wherever it can possibly fit? The vast majority of these links contain zero useful information beyond a (redundant) link to the publisher's website (typically paywalled), and putting them on every citation in Wikipedia is more or less spam. It's a distracting waste of space with no redeeming benefits.
The easiest solution here would be to deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it.
Next best, probably my personal recommendation, would be that only humans should ever add s2cid links (and ideally the ones which were added by a bot in the past should be removed), or barring that that a human should manually review any s2cid that gets added by any bot. At the very very least, the bot should try to check them for meaningful content and skip the vast majority of totally useless ones going forward. –jacobolus(t)18:13, 20 October 2023 (UTC)
Agreed as well! They only got added because of someone who works for Semantic Scholar (Help talk:Citation Style 1/Archive 66#Request to add Semantic Scholar IDs to the citation template). If there is truly a consensus among editors working on a page that it would improve the citation to include an |s2cid= … fine I guess, but a vast majority of the time someone who has never edited a given article runs the prompt and the bot clutters up all the citations with a spammy parameter without any human editors actively wanting it there. Umimmak (talk) 18:59, 20 October 2023 (UTC)
Although I don't agree that s2cid is spam, still, the point is not whether it is spam or not, but how to tell the bot not to add this attribute.
One option could have been via a template. For example, in cs1 config we may add an attribute s2cid=disabled (or any other boolean value that means no or false or zero). Another option is to use "bots" template. For example, on my user page I can specify {{bots|optout=cs1-errors}}. We may add an attribute such as {{bots|optout=s2cid}}
Whichever option you prefer, we need a consensus. With a consensus, I can ask the citation bot developers to accept this feature via my source code pull request. Maxim Masiutin (talk) 00:29, 4 January 2024 (UTC)
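As a rough illustration of the second option, the Python sketch below checks page wikitext for a {{bots|optout=...}} value. Treating `s2cid` as an opt-out value is the proposal made here, not existing bot behaviour, and `opted_out` is a hypothetical helper.

```python
import re

# Matches a simple {{bots|...}} template with no nested templates inside.
BOTS_RE = re.compile(r"\{\{\s*bots\s*\|([^{}]*)\}\}", re.IGNORECASE)


def opted_out(wikitext, option):
    """Return True if a {{bots}} template lists `option` under optout=."""
    for match in BOTS_RE.finditer(wikitext):
        for part in match.group(1).split("|"):
            key, _, value = part.partition("=")
            if key.strip().lower() != "optout":
                continue
            values = {v.strip().lower() for v in value.split(",")}
            if option.lower() in values:
                return True
    return False


page = "Some article text. {{bots|optout=s2cid}}"
print(opted_out(page, "s2cid"))
```

A per-page flag like this still has to be added to every page that wants it, which is the cost of this approach compared with a change on the bot side.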
In my opinion the bot should never add this template parameter, and should remove every existing one that was ever added by a bot. In theory, the parameter would be okay in cases where it adds a new unique access to the full text which was not otherwise available. I have literally never seen this happen in practice. –jacobolus(t)04:47, 4 January 2024 (UTC)
I would also be supportive of "deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it", along with stopping the bot from adding them. Unlike most of the other codes we use, I cannot remember ever seeing a case where these were useful. Stopping the bot is on-topic here but the other stuff should probably be discussed on Help talk:Citation Style 1, which is the centralized discussion point for all the citation and cite templates. —David Eppstein (talk) 20:22, 20 October 2023 (UTC)
I think this depends on which articles you are reviewing. There are plenty of useful places like S2CID16831869. Citation bot already avoids adding s2cid where there are no sources. — Chris Capoccia💬19:08, 25 October 2023 (UTC)
The example you cited is a poor example, because the publisher's page is open access; this citation should use doi-access=free and not include an s2cid. Citation bot already avoids adding s2cid where there are no sources – This is nowhere close to accurate. Citation bot adds tons of completely vacuous s2cids that provide no information beyond a link to the publisher page, more or less analogous to blogspam. –jacobolus(t)19:14, 25 October 2023 (UTC)
You're not paying attention to what I wrote. Yes it adds s2cid where only link is publishers and same as DOI. But it does not add s2cid where there are no sources. — Chris Capoccia💬15:30, 26 October 2023 (UTC)
What do you think the point is of adding an S2CID containing no meaningful content beyond a link to the publisher's website which was also already included in the citation template? From my perspective, such S2CIDs are spam with zero redeeming value. –jacobolus(t)15:57, 26 October 2023 (UTC)
I don't agree with you. The bot is a tool, and there is nothing wrong with the tool adding s2cid if it is a legitimate attribute in the cs1 module. If you think that s2cid should not be used in Wikipedia, ask for its removal from the cs1 module, not from the bot. Maxim Masiutin (talk) 17:41, 16 February 2024 (UTC)
Came across some more S2CID spam today which led me to this conversation. Is there an actual way to have an RfC or something for this? It's fine if humans want to add it, but for something with a DOI already there, having a bot add something that is pretty useless doesn't help. Why? I Ask (talk) 05:15, 12 November 2023 (UTC)
I personally like Semantic Scholar but I never use s2cid links from English Wikipedia citation templates. It's one of those IDs which are useful sometimes when everything else fails, but should probably be hidden by the citation templates in most cases. I don't know whether it's realistic to get such a change implemented in the citation templates though. Nemo13:24, 4 January 2024 (UTC)
Citation bot continually adds back particular S2CIDs even when they have been explicitly manually removed by humans for being useless. Can whoever controls this bot please stop such behavior? (Or ideally just get rid of S2CIDs altogether?) Otherwise I am encouraged to ban Citation bot from editing particular article pages altogether. –jacobolus(t)15:59, 16 February 2024 (UTC)
The bot does not edit the pages by itself; it is asked by people to edit the pages or is used as a gadget. Talk to the people who ask the bot to edit the pages you are referring to. Maxim Masiutin (talk) 16:04, 16 February 2024 (UTC)
People invoking Citation bot never check the edits, and should not be responsible for doing so if you want to have Citation bot continue to operate as it currently does. They certainly aren't going to be manually double checking every bit of metadata spam to check whether it's useful or not. –jacobolus(t)16:30, 16 February 2024 (UTC)
You raised a good point. People are responsible for their edits, whether made by tools or by bots. As for whether Citation bot should add the S2CIDs or not, there is no consensus against adding them, therefore you don't have grounds for banning the bot. Maxim Masiutin (talk) 17:39, 16 February 2024 (UTC)
The S2CIDs in question contain literally zero useful information. They typically have a subset of the metadata already included in the Wikipedia article, and are nothing more than a redundant spam wrapper around the DOI, which typically points at a paywalled publisher website. Including them in Wikipedia serves no encyclopedic purpose. Readers do not benefit from these links, and any reader who cares to find a semantic scholar link can trivially find it for themself. The CS1/CS2 templates, Citation bot, and ultimately Wikipedia are abusing readers' attention for strictly marketing purposes, which violates core Wikipedia principles.
If we're going to include these marketing links we might just as well also include links to every other citation index (Google scholar, Microsoft Academic, Scopus, and so on), book selling website (Amazon, etc.), etc. These are typically more valuable than S2CIDs, and if spam marketing is fair game, the more the merrier right?
If Citation Bot insists on adding these to pages even after they have been deliberately manually removed by human editors, it should be blocked from those pages by adding {{bots|deny=Citation bot}} to the pages. But a better solution would be for the Citation bot developers to stop spamming Wikipedia with these links. –jacobolus(t)18:08, 16 February 2024 (UTC)
Yeah I’ve been so frustrated by this and other changes that I tend to just block the bot… maybe once in a blue moon it’s useful to have a semantic scholar link, but it certainly shouldn’t be part of a standard citation. And it’s frustrating when editors who have never worked on a page before just run citation bot, can’t preview their edits to see if they make sense, and don’t check to see if the edit which was made was an improvement. The bot keeps adding it to more and more pages, creating the impression of a growing consensus, unless editors are on constant vigilance to block/revert the bot. Umimmak (talk) 20:27, 16 February 2024 (UTC)
+1 I've accepted it as a part of my digital existence here to routinely have to play s2cid bot-revert pong, but I dream of a different future. Esculenta (talk) 20:51, 16 February 2024 (UTC)
In the meantime, I can make a script that looks for all {{cite}} templates with s2cid parameters and, when it finds one, does one of the following two things:
delete the s2cid parameter from that template and put <!-- Deny Citation Bot--> on the whole template;
remove the value from the s2cid parameter and use <!-- Deny Citation Bot--> for this value only to keep it empty.
Thank you, I understood your question. You can search requests for approval and look at the contents of the requests. Is your question perhaps whether the bot has been modified since its approval? Maxim Masiutin (talk) 23:22, 16 February 2024 (UTC)
The bot's most recent approval predates the implementation of this parameter by almost a decade, so if there was a parameter list associated with its approvals it could not possibly have included this. Nikkimaria (talk) 23:56, 16 February 2024 (UTC)
@Jacobolus I guess the bot just supports all supported attributes that are within its scope. The bot supports excluding a page, a citation or a particular attribute from expansion. Maxim Masiutin (talk) 00:01, 17 February 2024 (UTC)
@Esculenta we can make a tag for a page to tell bots not to insert this attribute. I can implement this tag and submit a pull request to the developers of the bot, but for now there are tags to tell the bot not to expand citations at all, per individual citation or per page, or not to add the id for a particular citation. Maxim Masiutin (talk) 18:50, 3 March 2024 (UTC)
That's not a good solution in my opinion. We shouldn't have to litter every page's markup with instructions telling bots not to litter. –jacobolus(t)18:58, 3 March 2024 (UTC)
I think a better solution is to disable it completely, and have whoever wants this added as a permanent feature go through the bot approval process and request community feedback on the implementation of new identifiers. Esculenta (talk) 19:34, 3 March 2024 (UTC)
It will only add the link if there is not already a free link, the s2cid is licensed, and (this is the big one) they have a link to a PDF. That last one will stop almost all of the links, other than ones that actually are useful. AManWithNoPlan (talk) 14:46, 15 March 2024 (UTC)
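Read as a decision rule, the change described above has three conditions that must all hold. The Python sketch below only restates that rule; the record type and field names are assumptions made for illustration and do not reflect the bot's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class S2Record:
    """Hypothetical summary of what is known about one citation."""
    citation_has_free_link: bool  # citation already links to free full text
    is_licensed: bool             # Semantic Scholar entry is licensed
    has_pdf_link: bool            # Semantic Scholar entry links to a PDF


def should_add_s2cid(record):
    """Add an s2cid only when it contributes a new, licensed PDF link."""
    return (not record.citation_has_free_link
            and record.is_licensed
            and record.has_pdf_link)


print(should_add_s2cid(S2Record(False, True, True)))   # True
print(should_add_s2cid(S2Record(False, True, False)))  # False
```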
I will archive, but note that existing runs will not see the change, so long category runs might take a while. I will now archive. AManWithNoPlan (talk) 15:20, 15 March 2024 (UTC)
Two errors where the bot has added parameters in error:
In Beulé Gate, given the citation {{cite journal|last1=Billard | first1=Yves| last2=Chandezon| first2=Christophe| year=2012| title=Ernest Beulé (1826–1874). Archéologie classique, histoire romaine et politique sous Napoléon III| trans-title=Ernest Beulé (1826–1874). Classical Archaeology, Roman History and Politics under Napoleon III| journal=Liame| volume=24| url=http://journals.openedition.org/liame/277| access-date=2024-02-09| doi=10.4000/liame.277| lang=fr|issn=2264-623X| doi-access=free}}, it mistakenly added an additional |issue=24 (diff)
In PY Ta 641, given the citation {{cite book|last=Judson|first=Anna P.|year=2020|title=The Undeciphered Signs of Linear B: Interpretation and Scribal Practices|publisher=Cambridge University Press|isbn=9781108859745|doi=10.1017/9781108859745}}, it mistakenly added |url=https://www.repository.cam.ac.uk/handle/1810/265630. This URL links not to the book but to Judson's PhD thesis by the same name. (diff)
In the meantime, I've marked the citations with comments so that the bot doesn't get to them.
In two consecutive edits last summer, Citation bot modified a journal citation with an incorrect date and doi, but otherwise correct metadata, by adding an incorrect isbn and s2cid pointing to the 1989 conference version of the same paper (Special:Diff/1169023860), and then relied on the bogus metadata it had just added to convert the citation to a book citation, adding the book title but leaving the journal title in place and creating a borked citation template (Special:Diff/1171957733). This error in a BLP went unfixed until I found it just now.
The reference in question was a book style reference to a journal. The bot guessed wrong. I have since cleaned up the reference by hand some. AManWithNoPlan (talk) 14:31, 16 March 2024 (UTC)
fails
Status
Not a bug of the bot, but general finickiness of the gadget
Afaik [?], Grove Online is routinely cited inline using Template:Cite web, which (unlike Template:Cite Grove) allows for inclusion of actual author information. Afaik, this is correct and therefore does not require automated correction. 86.177.202.175 (talk) 18:50, 27 December 2023 (UTC)
I've tried with Cite Grove, like this (though to my eyes it looks a bit 'busy'). Fwiw, in the paaast, I think I've seen refs like this changed to Cite web (perhaps because of it being the 'Online' version?) 86.177.202.175 (talk) 22:01, 27 December 2023 (UTC)
In Special:Diff/1138690565, the bot removed the |chapter= parameter from a reference to the chapter "Eulerian Numbers" in the book Eulerian Numbers, possibly out of confusion because of the fact that the chapter and the book have the same title. This left the reference in a state where it cited the whole book but didn't name the chapter within it that its doi and page numbers pointed to. Then, in Special:Diff/1211621780, the bot decided to use the doi to fill in the chapter parameter once more, but in doing so it removed the |title= parameter of the reference, again likely out of confusion from the equality of titles. The combination of these two edits left the reference in a broken state without a book title. This had already happened twice before, in Special:Diff/984898465, Special:Diff/1068697840, so the bot has broken the same reference in the same way at least three times, going back at least to 2020.
What should happen
None of those things
That is an interesting problem (I know someone whose first and last names are the same, and it causes similar confusion with people). I will look into ways to detect that. I have added comments to both parts to fix the specific page. AManWithNoPlan (talk) 15:14, 6 March 2024 (UTC)
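One possible detection, sketched in Python: treat a matching |chapter= and |title= as a legitimate case, not a duplicate, and refuse to drop either one. The helper names are hypothetical and the normalization (case and whitespace only) is an assumption.

```python
def normalize(text):
    """Lower-case and collapse whitespace for a loose title comparison."""
    return " ".join(text.lower().split())


def titles_collide(params):
    """True when |chapter= and |title= are both set and effectively equal."""
    chapter, title = params.get("chapter"), params.get("title")
    return bool(chapter) and bool(title) and normalize(chapter) == normalize(title)


def may_remove(params, name):
    """Refuse to remove |chapter= or |title= merely because the two match.

    Returning True only means this particular check has no objection.
    """
    if name in ("chapter", "title") and titles_collide(params):
        return False
    return True


citation = {"chapter": "Eulerian Numbers", "title": "Eulerian Numbers"}
print(may_remove(citation, "chapter"))
```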
Removed proxy/dead URL that duplicated identifier
There is a message in the edit summary "Removed proxy/dead URL that duplicated identifier".
I find this message frightening and even misleading, because the word "dead" is frightening in itself; however, in most cases the URL is not dead but just duplicates an identifier such as a PMID or DOI, for example doi=10.15347/WJM/2023.003|url=https://doi.org/10.15347/WJM/2023.003 or pmid=35987379|url=https://pubmed.ncbi.nlm.nih.gov/35987379/. These URLs are not dead, and they also cannot be considered "proxy" in a classical sense. Please consider removing "proxy/dead" from the message so it will be just "Removed URL that duplicated identifier", for the following reasons:
The term "proxy/dead" might be confusing for users who are not familiar with the terminology; also, these messages are read not only by the users of the citation bots but by all Wikipedia editors; if the bot modified an article those editors were working on, they would see this frightening message without ever using the bot. A more straightforward message "Removed URL that duplicated identifier" would be easier to understand and more accurate.
The term "dead" is often associated with broken or inaccessible links, which is not the case here. The URLs are functional and simply duplicate the identifier. Therefore, using "dead" might lead to misunderstandings. The word "dead" can have negative connotations and might cause unnecessary alarm. Using neutral language would contribute to a better user experience.
The term "proxy" in a classical sense refers to a server that acts as an intermediary for requests from clients seeking resources from other servers. In this context, it might not be the most appropriate term to use.
Please remember that these messages go to the edit summaries, which are kept forever; these are not just log messages that only one user will see. Therefore, we should be very cautious about the edit summaries that we leave. Maxim Masiutin (talk) 09:19, 27 March 2024 (UTC)
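To make the distinction concrete, the Python sketch below tests whether a URL merely duplicates a DOI or PMID, using the two examples given above, and only then uses the neutral summary. The function name and the list of hosts are assumptions; the bot's real check may differ.

```python
from urllib.parse import urlparse


def duplicates_identifier(url, doi=None, pmid=None):
    """Return True if `url` is just the resolver link for the DOI or PMID."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    path = parsed.path.strip("/")
    # DOIs are case-insensitive, so compare them in lower case.
    if doi and host in ("doi.org", "dx.doi.org") and path.lower() == doi.lower():
        return True
    if pmid and host == "pubmed.ncbi.nlm.nih.gov" and path == str(pmid):
        return True
    return False


url = "https://doi.org/10.15347/WJM/2023.003"
if duplicates_identifier(url, doi="10.15347/WJM/2023.003"):
    print("Removed URL that duplicated identifier")
```

A URL that is dead or that goes through a proxy would need a separate check and could keep a separate, more specific summary.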
I'm guessing the bug is due to the page being a download page. A way to resolve this, I think, is to convert the download page into the original page. Spinixster(trout me!)07:42, 26 March 2024 (UTC)
Since when is it ok and not a violation of WP:CITEVAR for the bot to reformat manually-formatted citations to use the citation templates? Despite the edit summary "Changed bare reference to CS1/2", this was not a bare-url reference; it was formatted, but manually formatted. There are many reasons to use manual formatting for references, among the most salient being not wanting the bots to mess with the citations. —David Eppstein (talk) 19:24, 27 March 2024 (UTC)
Error with adding issue param of a letter when none is needed
Hello, I have noticed the bot has made the same error on the Kingsman (franchise) page each of the three times it has been run there (first by me while I was cleaning the page up, and then by two other editors in standard use). The edits in question are the same: here and here. It appears the bot is looking for an |issue= use in an archived dead CBR citation which has a quote in it, and the bot is pulling the "C" from the "U.N.C.L.E." bit of the quote as an instance of |issue= when it is not. It removes the "N.C." from the word, thus breaking the link as a result. Trailblazer101 (talk) 01:53, 2 April 2024 (UTC)
If there is no URL, the access date is pointless and will throw off an error. It's often pointless even when there is a url. Headbomb {t · c · p · b}05:47, 31 March 2024 (UTC)
Page shows the bot's changes after it finishes analyzing
Relevant diffs/links
"Error: citations request failed" box shows up
Replication instructions
Enable the citations expander gadget in Wikipedia. Press the ✔ Citations button on pages. After a short wait, this error box will show up around 85% of the time. See image
If for some reason, you want to edit just part of a page and the button fails, then copy the section (maybe the whole page???) to a sandbox and use the Expand citations option. AManWithNoPlan (talk) 14:11, 27 March 2024 (UTC)
>Expand individual templates by API calls.. nothing found.. no record retrieved.
>Remedial work to clean up templates.
>Writing to Draft:Scottish mother's day...
!API call failed: Edit conflict.. Will sleep and move on.
!Unhandled write error. Please copy this output and report a bug. There is no need to report the database being locked unless it continues to be a problem. .
!Possible edit conflict detected. Aborting.
diff
Twice today at 2022 Glasgow City Council election, the bot has added |date=14 April 2022 to a citation which can't possibly have been published then. The election took place in May 2022 so the information in the source can't have been published before that.
I don't mean the edit summary, but maybe you could implement a query to request the current version, or a URL showing which version is currently in use. Maxim Masiutin (talk) 00:25, 25 April 2024 (UTC)
OK, thank you! I didn't know that the version behind the "Citations" button and on citations.toolforge.org is the most current; I thought there was a time lag between GitHub and the Wikipedia/toolforge use. Maxim Masiutin (talk) 01:06, 25 April 2024 (UTC)
The delay is usually in the range of minutes. Although, once a run starts the bot version for that run does not change. Won't fix. AManWithNoPlan (talk) 01:24, 25 April 2024 (UTC)
Submit a pipe-separated list to a queue instead of immediate processing for particular users
Can you please implement an option to submit a pipe-separated list of articles to expand via https://citations.toolforge.org/ to a queue, instead of immediate processing, for particular users who have a legitimate interest in it, such as me hunting for NULL DOIs? Currently, when I submit a list that cannot be processed quickly, my web browser shows a timeout error and either no pages or only a few are processed, so I don't know where it stopped. I'd like to ensure that all pages are processed sooner or later if I am authorized to submit such requests.
That was implemented, and then people got all mad: the cite newspaper template was radically different from cite news, and they had chosen it on purpose. AManWithNoPlan (talk) 00:37, 25 April 2024 (UTC)
Bot tries to convert extra text to date when it shouldn't
It seems like whenever there is a string of numbers that looks sort of like a year in extra text floating around in a citation template, the bot decides that's a year -- even when those numbers are part of a longer string. This seems like a bad idea. Jay8g [V•T•E] 06:51, 23 April 2024 (UTC)
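One way to avoid the failure described above is to accept a four-digit run as a year only when it is not embedded in a longer run of digits. The following is a minimal Python sketch of that idea, not the bot's actual PHP code; the function name `standalone_years` and the accepted range (1500 to 2099) are assumptions made for illustration.

```python
import re

# Assumed range for plausible publication years: 1500-2099.
# The lookarounds reject matches that have a digit immediately before or after,
# so digits inside a longer number (an ISBN, an ID) are not treated as a year.
YEAR = re.compile(r"(?<!\d)(1[5-9]\d{2}|20\d{2})(?!\d)")

def standalone_years(text):
    """Return year-like tokens that are not part of a longer digit string."""
    return YEAR.findall(text)
```

This only guards against years buried inside longer digit runs; a hyphenated identifier such as "ID-2019-5542" would still yield a match, so a real fix would need more context than a single pattern.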
When I run the bot for a citation template where I have entered a doi about a chapter, the bot does not add the author name to the template.
Thank you for letting me know. I also tried using the book information URL, but it doesn't seem to work. However, the author's name seems to be written in the Citation Tools on the book information page.
Special:Diff/1218002845 changes a {{cite book}} that reads Demaine, E. D. (2001). Folding and Unfolding. Doctoral Thesis (PDF) (Doctoral Thesis). University of Waterloo, Canada into a {{cite thesis}} that reads Demaine, E. D. (2001). Folding and Unfolding. Doctoral Thesis (PDF) (Doctoral Thesis). University of Waterloo, Canada. The change to template type may be correct, but the repeated words may not be. The repeated words may not be. Please please, when making changes to improve metadata, do not do not make the visible data worse. That is the wrong wrong tradeoff to be making. Meanwhile, when adding "type=", the word "thesis" should be lowercase (look! an actual easy to fix bug!) and the bot didn't even notice that the url should really be an hdl, nor properly split the author name into first and last.
Run the bot, if cbignore (placed after, as I didn't realize it would do this) is removed.
We can't proceed until
Feedback from maintainers
This isn't so much a bug, I don't expect, as just the assumption that the lowercase "dan" would be the shift key not being held. In this instance, the title of the journal is in Indonesian (it's an Indonesian journal being cited): Jurnal Kependidikan dan Kemasyarakatan, which is "Journal of Society and Education".
The article is in English, so the language=id parameter was not included in the citation. I added the cbignore template to that single citation to keep the problem from recurring, loath as I am to do so because bots search databases and find other means of accessing an article than the means I found, which may be useful in the future should a doi become broken.
Regardless of whether this is considered a "bug", I wanted those developing the bot to be aware of the instance. OIM20 (talk) 20:16, 27 April 2024 (UTC)
The cbignore template does nothing to citation bot. While the name would lead someone to think so, it is not our template. I have added " dan Kemasyarakatan ", " dan Bisnis ", and " dan Sastra " to the list of "special non-english" phrases. AManWithNoPlan (talk) 22:13, 27 April 2024 (UTC)
I also tested it and for me, neither link works (actually neither the urls in the citations, nor the doi links). The reason is that the entire digital.ucd.ie domain is restricted. Rontombontom (talk) 08:40, 29 April 2024 (UTC)
Upon looking at it further, I think the mistake was to provide |url tags where the doi already does the link, and the |url tags had non-permanent links. (That, and the second reference was cite web instead of cite book.) I edited the page; please check if it works now for you too. Rontombontom (talk) 12:19, 29 April 2024 (UTC)
Just FYI, to explain the edit summary: the bot's usage is dominated by certain editors who initiate batch runs over categories to trawl through Wikipedia looking for articles in need of improvement. One class of articles that they trawl over is ones in need of some other kind of improvement. For some reason, every time you say "as of [YEAR]" in a Wikipedia article (using the template for that), it tags that article as needing improvement. Those articles were the ones trawled over this time. —David Eppstein (talk) 18:00, 3 May 2024 (UTC)
Not that. Every time a Citation bot changes an article without CS1 errors into an article with CS1 errors, something is wrong. This one is doubly bad, since it also added a broken doi that it knows is a broken doi.
We can't proceed until
Feedback from maintainers
That's not really a bug. It's been published, it's the correct DOI. The DOI is just borked and the bot can't figure out the details, but a human can. The error is an improvement. Headbomb {t · c · p · b} 04:31, 2 May 2024 (UTC)
When a borked doi causes the bot to bork a citation, it is a bug. Really. The fact that there might have been something a more intelligent editor could do to make a less-borked citation does not prevent it from being a bug. —David Eppstein (talk) 05:53, 2 May 2024 (UTC)
Not dead dois that look dead
Status
{{fixed}} - no idea, seems to have been a hiccup. I have looked a couple of times and it works now
The bot adds a volume number for a journal without volume numbers. From the looks of JSTOR and elsewhere, Past & Present doesn't use volume numbers, only issue numbers. This seems to be corroborated here and here.
Citation bot has found yet another way to break harv/sfn citations. In this one, on top of the damage reported long ago of removing the author and thus breaking the link between the sfn/harv in the text and the source, it also adds a "{{cite book}}: Unknown parameter |agency= ignored " error. DuncanHill (talk) 11:19, 5 May 2024 (UTC)
I see that the AdsAbs API limits the bot at https://citations.toolforge.org to 25000 requests per day, a quota that is exhausted by the end of the day. I would like to supply my own AdsAbs API key so that my requests do not use up the quota shared with other users, and so that, if the shared quota is already exhausted by the time I make my request, the bot can still use the API with my key.
The env.php file contains @putenv('PHP_ADSABSAPIKEY=xxxxx'); but we should probably change it to
if (!getenv('PHP_ADSABSAPIKEY')) {
@putenv('PHP_ADSABSAPIKEY=xxxxx'); // https://ui.adsabs.harvard.edu/help/api/
}
so that we could supply the API key to https://citations.toolforge.org, for example via a hidden form parameter that is passed to this environment variable; still, we should sanitize the value properly, allowing only hexadecimal characters and checking the length, to avoid threats such as injection. Maxim Masiutin (talk) 10:07, 22 April 2024 (UTC)
I am occasionally running the bot via citations.toolforge.org to check the pages which were known to contain NULL dois, so this should be helpful in rechecking the DOIs. Hope it helps. Please let me know whether I can continue running the bot from time to time on NULL-doi-high-probability pages at citations.toolforge.org, or whether this is not needed.
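The sanitization proposed above (hexadecimal characters only, plus a length check, with a fallback to the shared key) could look roughly like the following. This is a language-neutral illustration written in Python rather than the bot's PHP, and it is only a sketch: the function names and the 16 to 64 character limits are hypothetical, and the hexadecimal-only rule comes from the proposal itself rather than from any documented AdsAbs token format.

```python
import os
import re

# Hypothetical limits: the thread does not state the real key length.
KEY_PATTERN = re.compile(r"[0-9A-Fa-f]{16,64}")

def sanitize_api_key(raw):
    """Return the key if it is purely hexadecimal and of plausible length, else None."""
    if not isinstance(raw, str):
        return None
    candidate = raw.strip()
    # fullmatch rejects anything with extra characters, e.g. shell or header injection.
    return candidate if KEY_PATTERN.fullmatch(candidate) else None

def resolve_api_key(user_supplied, env=os.environ):
    """Prefer a valid user-supplied key; otherwise fall back to the shared environment key."""
    return sanitize_api_key(user_supplied) or env.get("PHP_ADSABSAPIKEY")
```

Rejecting an invalid key outright, rather than trying to strip bad characters from it, keeps the behaviour predictable: a malformed value simply falls back to the shared key.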
Presumably intended to be PhD (doctorate_philosophy). Probably not enough value to add anything like this. Izno (talk) 20:43, 9 May 2024 (UTC)
Yes. The fact that this was a doctoral dissertation was already spelled out in the series= parameter but maybe type= would have been a better choice. —David Eppstein (talk) 20:51, 9 May 2024 (UTC)