This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
This bot replaces "first=" and "last=" in author names with "first1=" and "last1=", even when there is only one author. It is considered poor style to use a 1 when not also using at least a 2. Can somebody fix this problem? Antinoos69 (talk) 16:53, 20 December 2020 (UTC)
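The behaviour requested above can be sketched as follows. This is only an illustration, not Citation bot's actual code: the function name `number_first_author` and the representation of a citation as a plain dict are assumptions made for the example.

```python
def number_first_author(params):
    """Rename last/first to last1/first1 only when a second author exists.

    `params` is a hypothetical dict of citation template parameters.
    """
    has_second = any(key in params for key in ("last2", "first2", "author2"))
    if not has_second:
        # Single author: leave the unnumbered names alone.
        return dict(params)
    renamed = {}
    for key, value in params.items():
        if key in ("last", "first"):
            renamed[key + "1"] = value
        else:
            renamed[key] = value
    return renamed
```

With a single author the dict comes back unchanged; once a |last2= or |first2= is present, the unnumbered pair is numbered to match.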
*Don't replace publication-place
Hi Abductive, in this edit ([2]) you changed a |publication-place= parameter into a |location= parameter in a citation. Please don't do that, they are not the same. By changing the parameter you are invalidating the information in the citation. |publication-place= is, obviously, for the publication place, and |location= is for the written-at-place. (The mixup is likely because in the past |location= was a parameter used for both.) Thanks. --Matthiaspaul (talk) 09:13, 5 December 2020 (UTC)
Location is not for the "written at" place; location is the location of the publisher. "Written at" is not bibliographical information ever presented in reference lists in any style guide. |location= and |publication-place= are aliases of each other, and the only place there's a distinction is in cite conference to indicate the location of the conference vs the location of the publisher. Headbomb {t · c · p · b} 13:18, 5 December 2020 (UTC)
This is not true. |location= and |place= are aliases of each other, but |publication-place= is not. The publication place and written-at-place are both bibliographical information and they are presented in citations when relevant, that's why we have parameters and code to distinguish between them where necessary. --Matthiaspaul (talk) 15:15, 7 December 2020 (UTC)
While Matthias is correct, we have a category tracking where both are used because we are entertaining deprecating the separate behavior. Its size is ~300 pages. Removal of the parameter is in the domain of Help talk:CS1, but I'd recommend ignoring the complaint. --Izno (talk) 14:13, 5 December 2020 (UTC)
There is no consensus to deprecate or remove the parameter at all. There are cases where the two places are not the same and both are relevant to include in a citation, but even in cases where only one of them is given or relevant, it is important to distinguish between them in the parameters. What might happen in the future is that we will have a dedicated parameter for the written-at-place (like |written-place=) to stop the (historical) mess.
If someone explicitly used |publication-place= to indicate that the given place is in fact a publication place, because s/he knows that it is a publication place rather than a written-at-place, this is a good thing helping to improve the quality of information in a citation and not something anyone should override. Ideally, the opposite should happen, move |location= information into |publication-place= if it is known to be a publication place, but this is something that no bot could do, because it needs a human checking the source if the provided information is a publication place or a written-at-place.
Citation bot replacing |publication-place= by |location= is an obvious bug that needs to be fixed. It is the same as if the bot replaced |publication-date= by |date= or |editor= by |author= or similar: it invalidates the information in the citation.
If you want to discuss it, Help talk:CS1 is over there. I assume that the bot only treats them as synonyms when there is only one, which is exactly what the module does. It is not a bug accordingly. --Izno (talk) 15:29, 7 December 2020 (UTC)
It has been discussed. What's relevant for the discussion here is that |publication-place= and |location= are not aliases of each other. The only alias for |publication-place= is |publicationplace=; the only alias for |location= is |place=. The bot should treat parameters according to their purposes, not try to emulate historical quirks in citation templates which only exist to maintain compatibility with legacy citations. The reason we have |location= and |publication-place= rather than (something like) |written-place= and |publication-place= lies in the often odd development history of our citation templates, which used ambiguous parameter names: mistakes we have learnt from in the past and are trying hard to correct and not to repeat. We are moving forward, not backward.
Replacing a parameter dedicated to a particular purpose, like the publication place, by a parameter used for a different purpose is an error. An editor doing this would be reverted for it. If these parameters were aliases, this would be a pointless cosmetic edit; since they are not, it even creates damage. It destroys valuable machine-readable information carefully researched and provided by an editor. No good.
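The alias relationships as described in the comment above can be written down as explicit groups. This is a minimal sketch of that description only; it is not taken from the CS1 module or from Citation bot, and the names `ALIAS_GROUPS` and `are_aliases` are invented for the example. Other participants in this thread dispute whether the parameters should be treated this way.

```python
# Alias groups as stated in the comment above (an assumption for this sketch,
# not a copy of the CS1 module's configuration).
ALIAS_GROUPS = [
    {"publication-place", "publicationplace"},
    {"location", "place"},
]


def are_aliases(a, b):
    """Return True if parameters a and b belong to the same alias group."""
    return any(a in group and b in group for group in ALIAS_GROUPS)
```

Under this model a rename from |place= to |location= is cosmetic, while a rename from |publication-place= to |location= changes meaning and would be skipped.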
Linking to fractions of journals goes against style guides and is almost always the wrong thing. In this case, the link should be changed to a publisher. AManWithNoPlan (talk) 00:44, 11 December 2020 (UTC)
I have activated citation bot in my gadgets and have clicked save, but when I click the citations box while editing all it gives me is a diff with no changes made. This is the case whether there is a template
It'll still do cleanup, find DOI identifiers (and others) and expand stuff based on database lookups. The bot doesn't only work with URLs. Headbomb {t · c · p · b}01:11, 17 December 2020 (UTC)
In Special:Diff/975926604, Citation bot added an |osti= link to the preprint version of a published paper. The full reference to which it added the link was something like "<ref>Author biography from {{cite journal| ...}}</ref>". The osti preprint did not contain the author biography from the published paper, so it was a useless addition to the reference. More harmfully, that link (or possibly a similar preprint link added later as a url by OABOT) caused another editor to think the reference was wrong and remove it from one of the claims it referenced.
What should happen
Preprint versions are not always the same as the final published version, and need manual checking when links are added to make sure that they accurately represent the part of the source needed for the reference. OABOT can do this, because it works in conjunction with a human editor who is responsible for checking the edit (whether that actually happens is a different question, but there's at least someone to blame when something goes wrong). Citation bot cannot perform this kind of check. In this case I have added a comment to the osti parameter which I hope should prevent a recurrence (unless some bot decides that fields with only comments in them are empty and should be removed), but more generally I think Citation bot should not be adding osti links, and instead leave that task to OABOT.
You miss the point. OABOT credits its edits to human editors, who can be expected to take responsibility for their bad edits, learn from their mistakes, and check more carefully in future. Citation bot operates automatically and cannot be made to have the proper level of checking. So it should not be making edits that require checking. —David Eppstein (talk) 20:44, 6 December 2020 (UTC)
But the fact that X errs should not be used as an "excuse" for Y erring as well. We are always striving for the potentially best possible behaviour one can think of, nothing less, and in the case of bots, only error-free behaviour is acceptable at all (save the occasional programming bug (bug = error) caused by the human programming the bot), firstly, because programs can be made to be error-free deterministically, and secondly, because of the speed of automated edits making it possible to cause large-scale corruption in no time.
There are more than enough tasks where bots can be very helpful and where we can be sure that the result is correct, so let's not use them for tasks where they can only "guess" and will be wrong in some cases. Using them for the latter is like using a hammer for a screw driver -- wrong tool for a task. Citations are a playing field too delicate to have them messed up by bots.
David's remark pointed to the fact that preprints should not be treated like final published versions, unless it can be checked and verified that the relevant information to support a statement in the article is present in both of them. If no one who actually checks this is involved, the change should simply not be carried out.
Basically, this would apply to any preprint citation where another ID would be added. (It would also apply to preprint IDs to be added to final publication citations, but we (hopefully) don't do that automatically, anyway.)
Perhaps a good solution for this and similar cases would be for Citation bot to add an HTML comment like
<!-- CB: @Editors, please check: |osti=123 -->
or
|osti=<!-- CB: @Editors, please check: OSTI 1234-->
instead of actually adding the live parameter, and leave it to subsequent editors to check the relevance and either include the parameter or remove the HTML comment. This way, the bot could still be helpful by providing a potentially useful hint, but without risking to mess up the citation. Best of both worlds?
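A rough sketch of how such a hint could be generated, following the second comment form proposed above. The function name `suggest_identifier` is hypothetical and nothing here reflects how Citation bot is actually implemented.

```python
def suggest_identifier(name, value):
    """Build a parameter that carries only an HTML comment, not a live value.

    Follows the "|osti=<!-- CB: ... -->" form proposed in the thread.
    """
    return f"|{name}=<!-- CB: @Editors, please check: {name.upper()} {value} -->"
```

For example, `suggest_identifier("osti", "1234")` yields `|osti=<!-- CB: @Editors, please check: OSTI 1234 -->`, which leaves the decision to include the identifier with a human editor.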
The addition of pmid, pmc, bibcodes, hdl, licensed s2cid, arxiv, etc. based upon journal articles are tasks automatically done by several bots. These styles of additions have been done for over a decade. This reference is a very rare exception to the rule. The addition of the comments would generate a firestorm like none ever seen before about the citation bot. AManWithNoPlan (talk) 16:10, 8 December 2020 (UTC)
I think the things that make this particular case rare are that (1) an editor actually checked the preprint link against what it was supposed to source and noticed that it didn't source what it was supposed to, unlike many Wikipedia references which are never checked, and (2) another editor (me) found that the problem was a mismatch between published and preprint versions. I think that variation between published and preprint versions, while maybe not the majority of cases, is much less rare, and that this behavior, repeated widely by bots for over a decade, has probably led to a large undiscovered pool of bad sourcing in Wikipedia. The fact that it's widespread and long-term doesn't make it a good thing and is not a justification for continuing to do it once problems have been discovered. --David Eppstein (talk) 18:52, 8 December 2020 (UTC)
Unwanted activity
This bot doesn't seem to take into account the fact that a journal or magazine may have a number within a volume AND an overall issue number. In such situations, one may wish to cite a reference using the format "vol. 21, no. 5 (No. 347)", which unfortunately isn't readily accommodated by existing "cite" templates (at least as far as I have been able to ascertain). In the example given, it would therefore be necessary to set "number=5 (No. 347)", which this bot would then change to "number=347", thus discarding some useful information. If there is no alternative method of including both numbers in a citation, could the owner please consider preventing the bot from making such changes, thanks?
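One conservative way to avoid this loss is to normalise only values that are a single number, optionally prefixed with "No.", and to leave anything more complex untouched. The sketch below is an assumption about how that could look; `normalize_number` is a made-up name and this is not the bot's real logic.

```python
import re

# Matches only a bare number, optionally prefixed with "No." or "no".
_SIMPLE_NUMBER = re.compile(r"^(?:no\.?\s*)?(\d+)$", re.IGNORECASE)


def normalize_number(value):
    """Strip a leading "No." from simple values; keep composite values as-is."""
    match = _SIMPLE_NUMBER.match(value.strip())
    return match.group(1) if match else value
```

A value such as "No. 347" is reduced to "347", whereas "5 (No. 347)" does not match the simple pattern and is returned unchanged.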
In Special:Diff/986754608 from early November, Citation bot added a pmid and in Special:Diff/990603801 from late November, it came back and added a pmc, to a citation to the paper "Quasi-random graphs" by Chung, Graham, and Wilson. There is in fact a paper with that title and authors with the pmc and pmid that were added, but it is a different paper than the one in the citation. The pmc and pmid go to a very short overview in PNAS from 1988. The citation to which these ids were added was a much longer paper in Combinatorica in 1989. Citation bot should be capable of distinguishing these two citations, noticing that the publication year and journal do not match, and not adding bogus ids of other papers to citations.
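The check asked for above could look roughly like this: identifiers from a database record are accepted only when the title matches and neither the year nor the journal contradicts the existing citation. The function name, the dict fields, and the exact matching rules are assumptions for illustration, not the bot's implementation.

```python
def _norm(text):
    """Case-insensitive comparison key with collapsed whitespace."""
    return " ".join(str(text).casefold().split())


def safe_to_add_ids(citation, candidate):
    """Return True only if the candidate record does not contradict the citation."""
    if _norm(citation.get("title", "")) != _norm(candidate.get("title", "")):
        return False
    for field in ("year", "journal"):
        ours, theirs = citation.get(field), candidate.get(field)
        # A field present on both sides must agree; a missing field is not a conflict.
        if ours and theirs and _norm(ours) != _norm(theirs):
            return False
    return True
```

With the case reported here (Combinatorica, 1989 in the citation; PNAS, 1988 in the database record) the titles agree but the year and journal do not, so the pmid and pmc would not be added.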
Citation to book "Rays, Waves, and Scattering" is mangled by adding journal, doi, bibcode, volume, and issue information for a book review of the same book in Am. J. Physics
This is a larger undertaking than I first thought, as I work out the details in my mind and try to avoid harming citations. How common is this? AManWithNoPlan (talk) 17:16, 6 December 2020 (UTC)
It is not removing page numbers. Google has two page number keywords, the bot just removes the one that is not being used in the urls, since it is both unused by software and misleading to humans. AManWithNoPlan (talk) 13:21, 6 January 2021 (UTC)
Citation bot removed archive-link and archive-date tags
What should happen
Citation bot should not automatically remove archive-links for citations that have url-status live, because sometimes a specific version of a page is desired
{{cite news |url=https://www.washingtonpost.com/investigations/federal-government-spent-millions-to-ramp-up-mask-readiness-but-that-isnt-helping-now/2020/04/03/d62dda5c-74fa-11ea-a9bd-9f8b593300d0_story.html |title=Federal government spent millions to ramp up mask readiness, but that isn't helping now |last=Swaine |first=Jon |date=April 3, 2020 |work=[[The Washington Post]] |url-status=live |archive-url=https://www.washingtonpost.com/investigations/federal-government-spent-millions-to-ramp-up-mask-readiness-but-that-isnt-helping-now/2020/04/03/d62dda5c-74fa-11ea-a9bd-9f8b593300d0_story.html |archive-date=December 21, 2020}}
|url= holds the same value as |archive-url= so |archive-url= does not point to an archived snapshot of |url=. When |url= dies, in this case, |archive-url= will also die. What is the point of that?
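The condition described here is simple to state in code. A minimal sketch, assuming the citation is available as a dict of parameters; `archive_is_redundant` is a hypothetical helper name, not part of Citation bot.

```python
def archive_is_redundant(params):
    """True when |archive-url= merely repeats |url= and so archives nothing."""
    url = params.get("url", "").strip()
    archive_url = params.get("archive-url", "").strip()
    return bool(url) and url == archive_url
```

Only citations where the two values are identical are flagged; a genuine snapshot URL on an archive site differs from |url= and is left in place regardless of |url-status=.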
@AManWithNoPlan: Our thread has been archived but I'd like to bring this up again. You said that the issue I described was not a bug. I do understand that someone purposefully implemented this removal, but I still think that it should be de-implemented. I've seen the bot do the described change multiple times since the last discussion, most recently here. The quote-title format is not uncommon and I do not believe that there are more titles that misuse the quote marks than there are actually quoted titles. Your proposed workaround of tagging every single such title seems like more unnecessary maintenance work. In the rare cases of quote marks actually being misused, they can still be fixed by hand, no workaround required. IceWelder [✉] 22:09, 5 January 2021 (UTC)
I keep getting a "Nonce already used" error message, such as: !API call failed: The authorization headers in your request are not valid: Nonce already used: 1ff9ed985e5606960cae0e173a525eb4 !Unhandled write error.
bot adds |pmc-embargo-date= for expired embargos so cs1|2 adds Category:CS1 maint: PMC embargo expired which will cause gnomes to delete |pmc-embargo-date= so the bot will add |pmc-embargo-date= ... You see where this goes...
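One way to break the loop described above would be to add the parameter only while the embargo is still in the future. A small sketch under that assumption; the function name and the use of `datetime.date` values are choices made for the example, not the bot's code.

```python
from datetime import date


def should_add_embargo_date(embargo_date, today):
    """Add |pmc-embargo-date= only if the embargo has not yet expired."""
    return embargo_date > today
```

An expired embargo is then never written, so the maintenance category is not triggered and there is nothing for gnomes to remove.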
that edit is good. it removed an archive-url that is the same as the non-archive url. it is no different than saying "go to X, but if they are closed, then go to X instead". AManWithNoPlan (talk) 12:34, 6 February 2021 (UTC)
Certain newspapers/websites are correctly moved from |publisher= to |work= but left in an incorrect form, such as [[New York Times]] or New York Times.com instead of [[The New York Times]]. Also, |agency=''(Boston Globe)'' was corrected to |agency=(Boston Globe), but should be further corrected to |agency=The Boston Globe, with "The" and without parentheses.
|work=New York Times.com most definitely is an error: there is no such website, newspaper, agency, organization, or other entity. The fact that Izno thinks that other citations near the one that includes |agency=(Boston Globe) look like garbage has no bearing on the need for such markup to be corrected to |agency=The Boston Globe. Cheers! —Anomalocaris (talk) 23:22, 25 December 2020 (UTC)
The second citation's in page data is "Boston Globe". So, requiring the "The" is a matter of taste. As I said, the citations are garbage.
As for |work=New York Times.com most definitely is an error: there is no such website, newspaper, agency, organization, or other entity, we routinely have |work=New York Times. You may think that is suboptimal, but it also is not an error. The .com after the end is a natural extension.
Please don't ping me again on this page. You could also lose the snide attitude on the point. You can disagree that it is not in fact a GIGO situation as you wish without deciding that my opinion is particularly special (or the opposite, disinteresting or irrelevant). --Izno (talk) 23:27, 25 December 2020 (UTC)
Izno (without a link, as requested): It is wrong to just slap .com willy-nilly. The website of The New York Times is at nytimes.com. The website of the Republican National Committee is at gop.com; we would never use anything like "Republican National Committee.com", following your model. It is Wikipedia style to name newspapers as they name themselves, viz, The New York Times, The Wall Street Journal, Los Angeles Times. The word "The" is often omitted when the name of the newspaper is used before another noun to modify it, e.g. "in a New York Times editorial from January 1, 1900". But it's not supposed to be omitted in a reference. This is not a matter of taste. I am sorry if you found any of this snide. It might be fair to say that the tone is snide, but if so, it was unintentional. You can't read my mind, so it is not fair to say that I have a snide attitude. Kindly keep my attitude out of this discussion, and let's stick to the facts. I have no opinion on whether or not citations near the one involving a story credited to (The) Boston Globe are garbage. I haven't inspected them. Their quality has no bearing on the correctness of the Boston Globe citation. (Here I'm using Boston Globe to modify "citation", so it's best to omit the definite article.) Cheers! --Anomalocaris (talk) 00:46, 26 December 2020 (UTC)
The The is a matter of taste. Having half the newspapers |journal=The New York Times while the other half as |newspaper=Los Angeles Times is pretty silly. Normalizing to all The ... , or all The ... is perfectly acceptable. Headbomb {t · c · p · b}04:14, 26 December 2020 (UTC)
Headbomb: (1) Even though |journal= is a synonym for |work= it's probably best to avoid using it except in {{Cite journal}}, which is, of course, reserved for peer-reviewed academic journals. (2) It is wrong to include the word "The" in publications that don't use them. There is no such newspaper as The Los Angeles Times. When I am editing articles for other reasons, I generally correct any missing The in The New York Times and other papers, and remove the spurious The in Los Angeles Times and other papers. There are other editors who do likewise. It's not a matter of taste. Cheers! --Anomalocaris (talk) 05:59, 27 December 2020 (UTC)
It is. You like the The in The New York Times. That doesn't mean your view is the only legitimate one. There are several style guides, such as MLA, that require to omit the leading 'The'. Headbomb {t · c · p · b}14:07, 27 December 2020 (UTC)
I happen to agree that it would be good to normalize these names of these works where they appear in a citation, I just don't think it's an error when the 'The' is omitted. Suboptimal, but not an error. --Izno (talk) 19:32, 27 December 2020 (UTC)
I agree with Headbomb that many style guides call for omitting the word "The" from some (but not all) publications that start with the word "The". (Publications such as The Nation or The Hindu just won't work without the "The".) However, I don't believe any style guide allows inserting "The" into the name of a publication that doesn't already have it. For example, I don't think you'll find a style guide that allows The Los Angeles Times. I thought Wikipedia had something calling for publications to be called by their real names, but I can't find it now, so maybe it doesn't exist. Anyway, I didn't come here to debate the word "The". I came to raise an eyebrow over running a bot to make changes to numerous Wikipedia articles tweaking citation parameters for little benefit while missing much larger errors in those same parameters. —Anomalocaris (talk) 11:11, 28 December 2020 (UTC)
|last=LucknowDecember 1|first=India Today Web Desk|last2=December 1|first2=2020UPDATED:|last3=Ist|first3=2020 11:35
was carefully modified to
|last1=LucknowDecember 1|first1=India Today Web Desk|last2=December 1|first2=2020UPDATED|last3=Ist|first3=2020 11:35
fixing the non-issue of numbering the unnumbered |last= and |first=, while ignoring the real issue that all of these parameters are completely wrong. It should be
|author=India Today Web Desk |date=December 1, 2020
Headbomb: My point is that there is something very strange that AManWithNoPlan finds it worthwhile to run a bot to make a meticulous change from valid markup to other valid markup, with no display difference and no error reduction, while the parameter names are changed to valid synonyms, the parameter values are completely bogus. It would be much more beneficial to Wikipedia to unbollix the citation. —Anomalocaris (talk) 05:59, 27 December 2020 (UTC)
The bot is not that smart in this context. I have seen similar with many of the Indian websites, I suspect as a result of a bad Zotero translator or in fact no Zotero translator being used in Citoid. The bot could reasonably be changed to make these citations better but not best anyway since there is inevitably some information not captured (best still would be for someone to make a Zotero translator for the Indian websites so that we don't have to deal with the garbage!).
As for last/last1, I think that's a good change regardless of any other changes (though I believe it is also considered to be one of the cosmetic edits that the bot will only make with another non-cosmetic edit, so it isn't the point of the particular change). --Izno (talk) 19:32, 27 December 2020 (UTC)
The bot (correctly) used Google Books to add a year to a citation, changing it to:
* {{citation
|last=Franklin|first=Alfred|title=Histoire de la bibliotheque mazarine|year=1969|url=https://books.google.com/books?id=uZst3Cw62qIC&pg=PA249|accessdate=6 November 2019|publisher=Slatkine |id=GGKEY:ZXAXTFKG8NF}}
The first step here is to actually use the parameters meant for the information. Date and volume go in |date= and |volume=, not |journal=. Headbomb {t · c · p · b} 04:14, 11 December 2020 (UTC)
Does citation bot have a feature to add a |title= when one is missing ie. determine a reasonable title for a given URL? -- GreenC 15:27, 23 August 2020 (UTC)
Pings and emails to Smith unanswered. If there is no response in a few days, will ask admins on the IRC to adopt the tool, or possibly someone would volunteer to jig a restart. -- GreenC 02:09, 25 August 2020 (UTC)
Bot changes BBC News from a publisher to a work, causing it to be italicised. This is not correct. BBC News is a division of a broadcasting company and is no more a "work" than ABC News, any radio or TV station identified by call letters, or companies such as News Corp.
You are changing |publisher= to |work= for a variety of business organizations that are actually publishers, not websites or newspapers or magazines or works. According to ABC News, BBC News, CBS News, NBC News, Reuters these are all businesses! They are not websites! They are not magazines! They are not TV or radio programs! Note: You seem to be correctly leaving Fox News as a |publisher=. Thank you for that.
What should happen
You should be flipping these the other way, changing |work= or any of its aliases to |publisher= for ABC News, BBC News, CBS News, NBC News. When the news item is on Reuters' website, it's |publisher=Reuters, otherwise it's |agency=Reuters. I'm not sure if "you" (the bot) are sophisticated enough to deal with this.
I'm sorry, but that's ridiculous. Look at the output: BBC News. That is not a website, it's the news division of the corporation, the employer of the reporters. Particular stories also appear under the heading "BBC Future", for example. You speak of a body of work, but italics indicate a publication; I would italicize a particular program(me) on the BBC or any other broadcaster (such as Today or All Things Considered, both news shows I have cited as "work="). They do not indicate a company or a division of a company; this makes Wikipedia look stupid (or like advertising). And @AManWithNoPlan: these objections make the bot run contentious, this should be discussed centrally before you resume what at least two of us have now objected to as degrading the encyclop(a)edia using automated tools. Yngvadottir (talk) 17:06, 25 December 2020 (UTC)
Indeed, this is implementing Trappist the monk's nonsensical view that something hosted at springer.com is part of a body of work called Springer, rather than reflect the fact that it's something published by Springer. Headbomb {t · c · p · b}17:10, 25 December 2020 (UTC)
Springer being a work is odd to say the least. Drawing the distinction between "BBC" vs "BBC News" vs "BBC HARDTalk" seems to be a contentious issue. Where does the publisher start and the work begin? I have had the list of "works" on the bot reduced now. AManWithNoPlan (talk) 17:37, 25 December 2020 (UTC)
I am not Trappist. Please do not ascribe his views to anyone but him. His view does match mine, and multiple others in the community. BBC News is a work composed of multiple segments (which may be lesser or greater works themselves ofc.) Especially, lesser works available at bbc.com/news are such of the website BBC.com. (I suppose an alternative might be to italicize BBC or BBC.com with BBC News appearing as the publisher, but I suppose you and most others would find that more confusing rather than less. I think I include myself among those who would be confused by such a practice.)
Springer is indeed a fallacy. Almost always they are republishing a paper published in another work elsewhere. Works originally published at springer.com (of which I imagine there are few but non-zero quantities) or as part of one of their journals with their name should indeed receive some work of interest with the Springer name.
As for whether Wikipedia is made to look stupid, please take that opinion clearly your own elsewhere. I doubt anyone other than Wikipedians care, though I'm sure you could convince colleagues or friends of your own opinion without hearing from others with an opposing and currently consensus view of the matter.
Regarding "actual" works like All Things Considered, I agree those also are works (though of lesser size than the body of work named NPR). I am happy to agree such should be present as the work where applicable and trivially known. This bot is not that smart (but I guess could be if the likes of NPR make sane URLs). --Izno (talk) 18:16, 25 December 2020 (UTC)
Fundamentally though, your position is inconsistent with printed news works. You see no issue with italicizing The New York Times, but as soon as the work becomes broadcast or digital, now it's an issue? No, I reject that inconsistency. --Izno (talk) 18:49, 25 December 2020 (UTC)
Regarding Reuters and other agencies, those are works themselves when published by the entity called Reuters (or similar other agency). They should be reflected in the work field accordingly in such cases. --Izno (talk) 18:19, 25 December 2020 (UTC)
The distinction between publishers and publications is ancient, and the existence of the Internet and websites does not change anything. Harvard University is an organization or business, and if it issues a press release, we would use {{Cite press release}} with |publisher=Harvard University — even if the press release is found at harvard.edu. The Harvard Gazette is a former newspaper and now just a website from Harvard, and for anything there, we would use {{Cite news}} with |newspaper=The Harvard Gazette. CBS News was founded September 18, 1927, long before the Internet. It is an organization, and the fact that its official website at cbsnews.com has a very similar name does not change anything. What's on cbsnews.com is presumptively published by CBS News, just as what's on harvard.edu is presumptively published by Harvard University.
Until this is settled, I beseech AManWithNoPlan to stop using Citation bot to make mass changes of the publishers in question: ABC News, BBC News, CBS News, NBC News, and Reuters. Cheers! —Anomalocaris (talk) 21:56, 25 December 2020 (UTC)
I think separating out the underlying question of what italicization behavior we want here may be helpful. Once we've settled that, we can then figure out how to get the internal data structured in a way that produces that behavior. There's enough of a question here that I agree mass editing should stop until we figure it out. {{u|Sdkb}}talk10:29, 27 December 2020 (UTC)
In the case of BBC News, the publisher is the BBC (British Broadcasting Corporation): BBC News is the work in our parlance. I suspect ABC is the same. So (a) putting BBC News in italics is consistent and reasonable and (b) at the risk of looking a bit silly by having both, it is conceivable that the bot could extract a publisher= in these cases and set it to be BBC. In the case of Reuters, the publisher is Reuters and the work is reuters.com. But I admit that the model starts to break down at e.g. Harvard, because undeniably the publisher is Harvard University but to say that harvard.edu is the work does stretch credulity. I suppose what I am saying is that the bot shouldn't change all instances of publisher= to work= but it could flag up anomalies for attention. And maybe it could bifurcate the major sources like ABC, BBC and CBC? Which all goes to underline Sdkb's point that we need a consensus on what italicisation we want: my starting point would be to ask if there is a house style in major journals like Nature that we should emulate? --John Maynard Friedman (talk) 10:58, 27 December 2020 (UTC)
Indeed today the bot limits its activities on this point to major news sources like those in this thread (I don't know about CBC offhand). --Izno (talk) 19:24, 27 December 2020 (UTC)
We've been over this many times before, and the answer is always the same. See WP:CITALICSRFC in particular. I'll just copy-paste what I said at an essentially duplicate thread at WT:CS1#Italics 2: It's not an either/or, "use the one I like better" matter. The |work= is always required (|website= and |newspaper=, etc., are aliases of it); Wikipedia only cites published works (see WP:V and WP:CITE); it does not cite companies, persons, or other entities, only works by them. The |publisher= should be added, as additional source-identification information, only if significantly different from the title of the work (do |work=The New York Times not |work=The New York Times|publisher=The New York Times Company). If the name of the website is ABC News then that is in fact the title of the work, despite that also being part of the name of publisher. (It's also harmless to do |work=ABCNews.Go.com, though that's a bit sloppy.) The actual publisher is ABC News Internet Ventures, a division of ABC News Network, a division of American Broadcasting Company, a division of Walt Disney Television, a division of the Walt Disney Company (most or all of which also have corporate postfixes like "Inc." in their full names). None of these names need appear in a citation, because they are either redundant with the |work= at the lower levels, or too lost in financial-holdings arrangements, at the upper levels, to be meaningful to the reader in relation to a citation. (In most contexts, anyway. In a WP article about Disney or one of its other properties, it might in fact be pertinent to indicate that Disney is the ultimate publisher, either with that parameter or with a free-form note, so the reader has a clear indication of the source's lack of complete independence from the subject.) — SMcCandlish☏¢ 😼 17:48, 1 January 2021 (UTC)
You are using Citation bot to change |publisher= to |work= for a variety of business organizations that are actually publishers, not websites or newspapers or magazines or works. According to ABC News, BBC News, CBS News, NBC News, and Reuters these are all businesses! They are not websites! They are not magazines! They are not TV or radio programs! Note: You seem to be correctly leaving Fox News as a |publisher=. Thank you for that.
You should be flipping these the other way, changing |work= or any of its aliases to |publisher= for ABC News, BBC News, CBS News, NBC News. When the news item is on Reuters' website, it's |publisher=Reuters, otherwise it's |agency=Reuters.
Also, you are using Citation bot to change certain newspapers/websites correctly from |publisher= to |work=, but in some cases leaving them in an incorrect form, such as [[New York Times]] or New York Times.com instead of [[The New York Times]]. Also, |agency=''(Boston Globe)'' was corrected to |agency=(Boston Globe), but should be further corrected to |agency=The Boston Globe, with "The" and without parentheses.
Next, the purpose of |agency= is being completely misunderstood here. It is only for newswires, and only when they are acting as such in the context of this specific citation. Reuters and Associated Press and Agence France-Presse are often agencies for other publications, but they also publish material under their own names, so whether one of these is an agency in a particular citation depends on the details of that citation; it is not a blanket matter. While it's correct that |agency=(Boston Globe) is misformatted, |agency=The Boston Globe is also wrong, because that is a newspaper (|work=The Boston Globe), not a content-syndicating news agency. If you've got a situation where the original publisher was The Boston Globe but you found the content somewhere else, e.g. a newspaper archives site, then the way to WP:SAYWHEREYOUGOTIT is |work=The Boston Globe|via=NameOfArchiveSite. Please, just actually read the citation template documentation and Help:CS1, and do what it says instead of trying to come up with ways to avoid doing what it says. (Same applies, really, to all policy, guideline, process, and documentation matters.)
If this bot started changing |work= to |publisher= as Anomalocaris suggests, then I and several others would move to shut the bot down as doing difficult-to-fix, mass-level harm to citation data. PS: Yes, Springer is a publisher; if we had to cite their website (e.g. for WP:ABOUTSELF basics about the company), that's probably best done as |work=Springer.com. It's not something we would normally cite otherwise, since it is not a news source, journal, or other such publication in the more usual sense. If the bot is blanket-changing all publishers to works that would obviously be a mistake, but in any of the cases highlighted above (ABC News, BBC News, etc.), such a change is correct. If there are cases of |work=ABC News|publisher=ABC News, those should be reduced to |work=ABC News (especially since the publisher name is not actually "ABC News" to begin with). Another side point that's been covered before: When any website is cited by WP, it is cited as a published work (by definition), not as a shop or server or corporate entity or whatever else the same name might refer to outside of a citation-to-published-work context, where it gets italicized, even if it would not be italicized in running text as a service or company or whatever. — SMcCandlish☏¢ 😼 19:14, 1 January 2021 (UTC)
The |work= is always required (|website= is an alias of it), this is plain false. Work is not always required, as many things are not published as part of larger works. Headbomb {t · c · p · b}18:00, 1 January 2021 (UTC)
Let's not be silly. Work is always required when it is applicable; nothing could possibly ever be required in cases in which it cannot even apply. This discussion is about swapping work for publisher when work is applicable. When the |work= parameter (or one of its aliases) does not apply, then |title= is the work. So, yes, the work is always required, just not necessarily in the form of the parameter by that name. — SMcCandlish☏¢ 😼 19:14, 1 January 2021 (UTC)
Although I totally agree with your general point that work= is not required and that publisher= with no work= is a perfectly valid combination of parameters, that particular example would be better cited as Caldwell, Robert R.; Gubser, Steven S. (March 2013). "Brief history of curvature". Physical Review D. 87 (6). 063523. arXiv:1302.1201. doi:10.1103/physrevd.87.063523. (I am omitting its publisher, the American Physical Society, because that's usually not helpful for publications in well-known journals.) —David Eppstein (talk) 20:14, 1 January 2021 (UTC)
That's the domain name, not the title. The title is clearly given at the page, both visually and (along with a typical marketing tagline) in the <title>...</title> element, and it is ABC News. It is true that in general Wikipedians really don't care if you use |work=ABCNews.Go.com instead of |work=ABC News, that's completely immaterial to this discussion. The confusion you are having is that ABC News is also the name of the news division of American Broadcasting Company (a division in turn of Walt Disney Television, a division of Walt Disney Company). Exact or close-enough correspondence between the work and publisher name is pretty common, and it simply doesn't matter. It is not a magically special case. In such cases, we omit the publisher as redundant, because what we are citing is the work; we are not citing an entity (we only provide the publishing entity as additional information to help correctly identify the source). An argument could be made in this case to do |work=ABC News|publisher=American Broadcasting Company (or |work=ABCNews.Go.com|publisher=American Broadcasting Company, if you really really wanna), since American Broadcasting Company is an actual legal entity, while it's not clear that ABC News, the division, remains one at all (it may well simply be a property/trademark at this point). — SMcCandlish☏¢ 😼 20:53, 3 January 2021 (UTC)
place: For news stories with a dateline, that is, the location where the story was written. In earlier versions of the template this was the publication place, and for compatibility, will be treated as the publication place if the publication-place parameter is absent; see that parameter for further information. Alias: location
publication-place: Geographical place of publication; generally not wikilinked; omit when the name of the work includes the publication place; examples: The Boston Globe, The Times of India. Displays after the title. If only one of publication-place, place, or location is defined, it will be treated as the publication place and will show after the title; if publication-place and place or location are defined, then place or location is shown before the title prefixed with "written at" and publication-place is shown after the title.
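The display rule quoted in the documentation above can be sketched in a few lines. This is only an illustration of the quoted wording, not the actual Module:Citation/CS1 code; the function name `resolve_places` and its return shape are assumptions made for the example.

```python
from typing import Optional, Tuple


def resolve_places(
    publication_place: Optional[str] = None,
    place: Optional[str] = None,
    location: Optional[str] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (written_at, published_at) per the quoted documentation.

    Hypothetical helper: 'place' and 'location' are aliases, so this
    sketch simply takes whichever one is set.
    """
    written = place or location
    if publication_place and written:
        # Both groups given: place/location is shown before the title
        # as "written at", publication-place after the title.
        return written, publication_place
    # Only one group given: it is treated as the publication place.
    return None, publication_place or written
```

With only `location="London"` the sketch returns `(None, "London")`, which is why the two parameter groups look interchangeable in rendered output until both are supplied.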
This one is going to be a pig to fix. In all the other {{cite}}s, "location" is the place of publication and we may assume that many (most?) instances of its use are intended to have that meaning. But equally, for stories from "war-torn X" or "famine-stricken Y", it must be probable that the wiki-editor would have used location=X and location=Y in this case without spotting the implicit error. It seems to me that to introduce "publication-location=" is definitely the wrong solution because it is inconsistent with the other cite templates. Maybe "dateline=" for where the story was filed? [though it is rather an Americanism, I don't know how international it is?] --John Maynard Friedman (talk) 11:15, 27 December 2020 (UTC)
Actually, |location= is the parameter to specify the written-at-place whereas |publication-place= is the correct parameter to specify the publication place. This applies to all CS1/CS2 templates.
If an editor explicitly used |publication-place= s/he actually meant to specify the publication place whereas if we find |location= in a citation, this is the dedicated parameter to specify the written-at-place but for quirky reasons buried in the historical development of the citation templates (trying to masquerade the underlying problem), the visible output of the templates differs only if both parameters are given. Ideally, we would have a semantically more meaningful parameter name for the written-at-place parameter as well (I suggested something like |write-place=, |writing-place= or |written-place=), but it won't be possible to automatically convert |location= to that new parameter because of the misleading use in historical citations. So every citation will have to be changed manually. However, given that it is difficult to fix, the bot should stop replacing the correct parameter |publication-place= by the potentially incorrect parameter |location=, as it removes vital information, weakens the quality of a citation and its machine-readability, and adds citations to the pool of those that need to be manually fixed eventually.
This has been recently discussed on this talk page, please review the archive. (I need to get around to the proposal to deprecate the one parameter over in Help talk:CS1.) --Izno (talk) 19:33, 27 December 2020 (UTC)
Is it feasible to temporarily leave location alone, change all of the citation templates to support |publication-date= and |publication-place=, then revise the bot to parse all of the parameters before making changes? Shmuel (Seymour J.) Metz Username:Chatul (talk) 02:30, 28 December 2020 (UTC)
Among many other places, this has been discussed here:
This is simply not true. |publicationplace= is an alias of |publication-place=, like |location= is an alias of |place=, but these two groups of parameters are not aliases of each other, and they shouldn't because they are for two different properties of a source.
For historical reasons the two parameters issue the same display output unless both parameters are being given at the same time, but this is not the same as being aliases (in fact, it would be impossible to give both parameters at the same time if they were aliases - the template implementation does not allow this for alias parameters). So, please have a look at the source code before you spread such falsehoods, as this causes confusion among editors and even leads to inappropriate bot tasks such as this one, weakening correct information in citations and invalidating reliable machine-readability. That's harmful.
It is fine. The issue with MonkBot 18 was that it was only doing cosmetic edits on a massive scale. Having general improvements to templates is fine. Headbomb {t · c · p · b}15:57, 13 February 2021 (UTC)
First off, the bot is continuing to do cosmetic-only edits - sample. Second, as I said, there is not consensus at this point to continue doing these edits by bot, unless you can point to a discussion or bot approval that says otherwise? Nikkimaria (talk) 16:06, 13 February 2021 (UTC)
Short of xkcd's metahumour (I have that book, it also numbers pages in base 3), do you have an actual example of an actual journal's actual volume number actually being 0? Headbomb {t · c · p · b}00:01, 16 February 2021 (UTC)
The fact that some authors put jokes into their metadata is not a valid reason to make that metadata unciteable. And this incessant push to make citation templates and the bots that manage them as rigid and doctrinaire as possible is making the citations they manage unusable by humans. Anyway, one of the better-known examples is Conway's On Numbers and Games, which numbers chapters starting with zero. I'm skeptical that this should count as a legitimate journal, but apparently Smarandache's "International Journal of Neutrosophic Science" started with volume zero: [34]. There are also plenty of books describing themselves on their covers as "Volume 0", whether as a way to attach a preamble to a series or for some other reason I'm not sure [35][36][37]. —David Eppstein (talk) 03:05, 16 February 2021 (UTC)
I know someone who had version 1, 1.0, 1.00, ... 1.00000000000000000000 of a report, until the document control department said "NEVER!" AManWithNoPlan (talk) 03:08, 16 February 2021 (UTC)
If the metadata returns |volume=0, and |volume=0 was there, TNT'ing it and re-filling it won't cause any change. I've manually inspected over 100 of those today and so far I've yet to see an instance where |volume=0 was anything but outdated metadata. Typically, this happens when articles are in press, and don't yet have volume/issue numbering assigned to them. Headbomb {t · c · p · b}03:10, 16 February 2021 (UTC)
"Removed parameters. Some additions/deletions were actually parameter name changes." could probably be shortened to "Removed/renamed parameters." Otherwise, I don't really see an issue with the current edit summaries. Headbomb {t · c · p · b}14:57, 16 February 2021 (UTC)
OAuth callback URL not found in cache. This is probably an error in how the application makes requests to the server.
Hi Headbomb,
In order to complete your request, Citation bot needs permission to perform the following actions on your behalf on all projects of this site:
Interact with pages
Edit existing pages
cs1|2 can't simultaneously apply external link created from |pmc= and internal wikilink from |title= to |title=; default when this happens is that cs1|2 used the external link but shows |title= with wikilink markup and URL–wikilink conflict error message
What should happen
there are those who argue that wikilinking to the en.wiki article should have precedence; I disagree because we are citing the source, not an en.wiki article (which is not allowed anyway because en.wiki is not WP:RS)
This seems like a bug in the module. The title's link is a convenience link, not the thing we are citing itself. --Izno (talk) 23:09, 21 February 2021 (UTC)
The inability to handle citations with internal links on titles and pmcs because the module is too stupid to avoid overriding the courtesy-pmc-link with an explicit link is definitely a module bug, not a bot bug. But I strongly disagree with Trappist's contention that internal links are to be avoided, and I would even more strongly object to a workaround for this bug that caused the bot to strip the internal links. If we have an internal link, going into detail about the source, that is highly useful information for readers wanting to know about the validity of the source, and should be included; readers wanting the source itself can still easily find it from the PMC link that is still right there in the citation. —David Eppstein (talk) 00:42, 22 February 2021 (UTC)
If there's enough to say about a source that it itself is notable and has an article, then we should point to that article. How positively was a book reviewed? Was it so successful that it ran to many editions? Does it have a trans-generational reputation for difficulty? A link in the title doesn't actually imply that the Wikipedia article is being used as a source any more than a link on the author's name does. XOR'easter (talk) 19:39, 22 February 2021 (UTC)
Untitled_new_bug
Status
{{notabug}} - transcluded references should not have names
@AManWithNoPlan: A bot run of 40,000+ articles? Really? If there's a limit on mortals for how many articles you can request at once (which seems to be around ~1000), it should apply equally to everyone. Headbomb {t · c · p · b}05:25, 23 February 2021 (UTC)
It was supposed to be much smaller. I have been waiting for a chance to reboot the bot to stop it, but I did not want to kill other people's jobs. AManWithNoPlan (talk) 12:10, 23 February 2021 (UTC)
I have upped the limit to 2000. The limit was removed for me so that I could let the bot run until it crashed and diagnose the crash. The crashes seem to be non-existent at this point. AManWithNoPlan (talk) 13:20, 23 February 2021 (UTC)
I was more referring to the specific use case above which was a bit of permanently commented out text. In most cases - but not all - comments are clearer and more easily understood. AManWithNoPlan (talk) 13:46, 24 February 2021 (UTC)
JSTOR links
Is the intention to add a JSTOR link to citations that already have them? [41][42][43] The result is two identical links in the same note, which seems unnecessary. I'm not sure what the solution is, though, since the title linking to the article is pretty standard, if having the identifier visible (rather than just as part of the URL) is desirable. I hope I can be forgiven, though, if I'm not sure I see why the link alone isn't sufficient. Regardless, just wanted to bring this redundancy to your attention. Thanks. blameless 01:08, 24 February 2021 (UTC)
This is by design: identifiers are added. We would like to remove the redundant links, but the general feeling of Wikipedians stopped us (title links are magic). That does not stop you from removing the URLs, though. AManWithNoPlan (talk) 01:25, 24 February 2021 (UTC)
Playing around on toolforge, I stumbled onto the fact that several references labeled as dead are actually a syntax error of having "http://%5b" added to the front of the good link.
One example here: Special:Diff/1008791334
It also has a few cousins which I can provide if this is something you want to explore further, if CitationBot is capable of seeing these and correcting. Slywriter (talk) 02:41, 25 February 2021 (UTC)
IABot won't fix these. At best it could be WaybackMedic but it's unclear any bot should attempt it due to the complexity and number of cases. -- GreenC 14:45, 25 February 2021 (UTC)
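For the simplest variant reported above (a good link with "http://%5b" glued to the front), the repair can be sketched as follows. This is a minimal illustration only, assuming the remainder is itself a complete http or https URL; the function name is hypothetical, and the "cousins" mentioned in the report are not covered.

```python
import re

# "%5b" is the percent-encoded "[" character. Only strip the prefix when
# what follows is itself a full http(s) URL, so ordinary links that merely
# contain "%5b" somewhere are left alone.
_BAD_PREFIX = re.compile(r"^http://%5b(?=https?://)", re.IGNORECASE)


def strip_encoded_bracket_prefix(url: str) -> str:
    """Remove a stray 'http://%5b' prefix from an otherwise valid URL."""
    return _BAD_PREFIX.sub("", url, count=1)
```

A bot applying this would still need to confirm that the repaired link is live before clearing any dead-link tag, which is part of the complexity noted in the reply above.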
In publications with far more than 30 authors, the bot thinks that 29 authors should be displayed (display-authors = 29).
What should happen
Show only the collaboration when set, or only the first author (display-authors = 1), both are common conventions. The author lists are typically sorted by alphabet, author 2 is no more or less important than author 1463 and we clearly don't want to display all of them. The first example doesn't look like it would be sorted by alphabet, but it is sorted by alphabet of the country, institute and then name of the authors.
I will write code to look at the existing templates on the page and see if there is a general trend and follow that. Sorting is impossible, since the meta-data does what it does. AManWithNoPlan (talk) 19:17, 25 February 2021 (UTC)
The general trend is probably bot-generated at this point. I don't think many humans set display-authors = 29. There are 994 articles using 29, but only 1 using 28, 4 using 30, and none using 31 or 27. There are 6300 articles using displayauthors = 1. No need to sort anything. I just mentioned that the collaborations sort their author lists. --mfb (talk) 08:53, 26 February 2021 (UTC)
29 may come from somewhere in the dark ages but it didn't come from the templates. The wikitext versions of the templates were constrained to nine author and four editor names; the ninth author and fourth editor names were automatically replaced with 'et al'; any other author and editor names were ignored. The very earliest versions of Module:Citation (now defunct) continued that practice until this edit which removed the constraints. 29 comes from somewhere and somewhen but wherever and whenever that is, it is not the templates.
This message from User:DMBanks1 is too lacking in context to be helpful for anything, but it appears to be referring to Special:Diff/1001877789, in which DMBanks1 insists on using the parameter |p= instead of |page= in a {{cite web}} template, and reverts the bot's change to the more readable parameter name. I was surprised to see that the template documentation does not actually deprecate |p=; I think it probably should be deprecated, but maybe that is more a matter for Help talk:Citation style 1 than for here. —David Eppstein (talk) 08:44, 29 January 2021 (UTC)
@David Eppstein: Thank you for your response. Being somewhat unfamiliar with this whole area, I was uncertain of how to query the issue. That being said, I am unclear of what is the "it" that should be deprecated. My understanding of literary citation style is to use "p" in the singular or "pp" in the plural, which is how the template is interpreted on Wikipedia pages. Therefore, I am assuming you are suggesting that the use of "page" in the template should be deprecated and not the converse. DMBanks1 (talk) 16:10, 29 January 2021 (UTC)
|p= and |pp= are short-hand. The bot normalizes them to the easier to understand standard forms |page= and |pages=. Similar to converting |accessdate= to the standard form |access-date=. AManWithNoPlan (talk) 12:57, 30 January 2021 (UTC)
@AManWithNoPlan: Can you help me out? I am unable to find any authority that states "pages" as opposed to "pp" is the standard form, or that it is shorthand as opposed to a common abbreviation. DMBanks1 (talk) 15:30, 30 January 2021 (UTC)
There's no 'authority', it's simply the canonical name for that parameter, and the one used by 95-99%+ of people. page is clearer than p. pages is clearer than pp. Headbomb {t · c · p · b}17:13, 30 January 2021 (UTC)
The last time this came up, people liked the ease of typing one-letter parameters, and also liked them being converted into human-readable forms. AManWithNoPlan (talk) 18:28, 11 February 2021 (UTC)
Some comments:
Changing the parameter name alone is cosmetic.
The only changes made in the edit were this kind, which means the edit was cosmetic, which the bot should not do.
These parameters are not deprecated. That is indeed a discussion for another forum (I would not support deprecation).
I don't think the bot should make this change at all either, because the output is nearly equivalent to the use. That is not hard to grok. Moreover, we have other templates used for citation that share these parameter names as well.
This discussion moved to a bot discussion page, and it was agreed that this type of change improved readability and was {{notabug}}, but should be treated as cosmetic. AManWithNoPlan (talk) 22:14, 4 March 2021 (UTC)
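The alias normalization discussed in this thread (|p= to |page=, |pp= to |pages=, |accessdate= to |access-date=) can be sketched as below. This is not Citation bot's code; the mapping contains only the three pairs named in the thread, and the function name and regex are assumptions for illustration. Given the outcome above, such a rename would count as cosmetic and should only accompany a substantive edit.

```python
import re

# Alias -> canonical name, limited to the pairs mentioned in the discussion.
CANONICAL = {"p": "page", "pp": "pages", "accessdate": "access-date"}

# Matches "|name=" while keeping any whitespace around the name.
_PARAM = re.compile(r"\|(\s*)([A-Za-z-]+)(\s*)=")


def normalize_param_names(template_text: str) -> str:
    """Rename short-hand parameters to their canonical forms.

    Naive sketch: it does not parse nested templates or piped wikilinks,
    so text such as [[A|p=b]] inside a value would also be rewritten.
    """

    def repl(match: re.Match) -> str:
        name = match.group(2)
        return "|" + match.group(1) + CANONICAL.get(name, name) + match.group(3) + "="

    return _PARAM.sub(repl, template_text)
```

For example, `normalize_param_names("{{cite web |p=12 |accessdate=2021-01-29}}")` yields `{{cite web |page=12 |access-date=2021-01-29}}`, leaving the rendered citation unchanged.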
Sciamvs is just Latin for something like "let us know" or "that we may know", correct? That is, it is a word, not an initialism. (The v is the same as a u — the Romans did not distinguish those two letters from each other.) So we should format it as a word regardless of how the journal likes to style it. —David Eppstein (talk) 00:08, 4 March 2021 (UTC)
Unless you are linking to the actual image on the front cover, I think that new link better reflects the reference. And, with the new google books, the old link no longer works that way anyway. AManWithNoPlan (talk) 16:28, 10 December 2020 (UTC)
Secondly, the bot actually fixes the links so that they work with the new google books, and they no longer depend upon javascript, and finally it removes user specific parts. AManWithNoPlan (talk) 16:30, 10 December 2020 (UTC)
I always link to the images on the front covers or title pages of books on Google Books where Google permits this. That page often contains the book's complete title, as well as the actual names of the author(s), editor(s), publisher and publisher's location(s). The main Google Books page does not always report these completely or accurately. The bot needs to retain this important option.
The bot also removes links to snippets of text in Google books when it deletes parts of URL's that follow the symbol "&". When the bot does this, readers can no longer verify information that editors have cited. Readers also cannot determine the context in which the cited information appeared in the book. Corker1 (talk) 21:57, 16 December 2020 (UTC)
There is no such thing as a reliable google book link. And, links often include multiple search parts and that means that different people will see different things when they click the link, which is bad. The bot reduces this, although a person should go through and remove all search terms for links to pages, and all pages for links to search terms. AManWithNoPlan (talk) 13:06, 26 December 2020 (UTC)
I agree with AManWithNoPlan. Furthermore, if you need to cite specific pages, the last thing you want is a bot changing the link to some pretty frontispiece. To take a specific example, the Statutes at Large has many many pages and we really should not ask readers to plough through it looking for a reference that we as editors have already found. (For detailed examples, see Calendar (New Style) Act 1750.) --John Maynard Friedman (talk) 13:19, 26 December 2020 (UTC)
TNT the volume/issue, and only fill the relevant one (which I believe is |issue=). This is PhytoKeys specific, and a run against all existing PhytoKeys citations is needed to fix these errors.
We can't proceed until
Feedback from maintainers
I'm already running it against existing citations btw, so no need for anyone else to do a bot run for this. Headbomb {t · c · p · b}15:16, 12 March 2021 (UTC)
Addition of doi to cite arxiv creates bad cite journal with no journal
Status
{{fixed}} - will now not check for title match when adding a DOI based upon an arXiv ID, since titles often shift around a bit. If the DOI has already been added, then the title check will still be done.
If you're going to add a doi, the whole citation should be changed to match the doi: we need the rest of the journal metadata, not just slapping "cite journal" on it and calling it done.
The bug is that it changes a cite arxiv template to cite journal, but does not find and add the journal data. So your excuse for not fixing it is invalid — you can merely not change the template type in this case. Also doi:10.1112/blms.12460 resolves just fine and curl -LH "Accept: application/x-bibtex" http://dx.doi.org/10.1112/blms.12460 retrieves a valid-looking CrossRef record. —David Eppstein (talk) 19:35, 18 March 2021 (UTC)
In History of group theory a cite paper ref (English translation: Klein, Felix C. (2008) [1892].) with arxiv, translator & editor parameters was converted to cite arxiv for which the translator and editor parameters aren't valid.
What should happen
Conversions to cite arxiv should exclude cases with parameters that aren't supported in cite arxiv
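A guard of the kind requested here could look roughly like the following. It is a sketch only: the blocked prefixes come solely from this report (translator and editor parameters), not from the cite arxiv documentation, and the function name and dictionary representation of a citation are hypothetical.

```python
# Parameter-name prefixes that this report says are not valid in cite arxiv.
# Assumption: a prefix match also catches forms like editor1-last or
# translator-first; the real list of unsupported parameters may be longer.
BLOCKING_PREFIXES = ("translator", "editor")


def can_convert_to_cite_arxiv(params: dict) -> bool:
    """Return False if any non-empty parameter would be lost on conversion."""
    for name, value in params.items():
        if not value.strip():
            continue  # empty parameters carry no information to lose
        if name.lower().startswith(BLOCKING_PREFIXES):
            return False
    return True
```

Under this check the Klein citation above would be left as it is, because its translator and editor parameters are filled in.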
Violation of MOS:PAGERANGE for page ranges. For example, |page=832–834 is replaced with |pages=832–4 instead of |pages=832–834. The former exception from the general rule in MOS:NUMRANGE for page ranges for specific citation styles ("may be used ... where a citation style formally requires it") was removed some time ago (after this discussion), but even before that it was not formally required by the {{Citation}} template, so this sort of replacement was at least questionable even then. I would say that the bot should rather do the opposite.
What should happen
"...number ranges in general, such as page ranges, should state the full value of both the beginning and end of the range, with an en dash between, e.g. pp. 1902–1911 or entries 342–349. Except in quotations, avoid abbreviated forms such as 1901–11 and 342–9 as they are not understood universally, are sometimes ambiguous, and can cause inconsistent metadata to be created in citations."