This is an archive of past discussions with User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Is it always the case that a user who enters one of these hard spaces truly wished to include a regular space? I think this is an instance where we would do well to respect manual input. I'm not aware of the bot introducing hard spaces from any of its own data sources. Martin(Smith609 – Talk)12:03, 28 September 2018 (UTC)
See MOS/Text formatting: The only invisible characters in the editable text should be spaces and tabs. However, other invisible characters are often inserted inadvertently by pasting from a word processor. – Jonesey95 (talk) 14:22, 28 September 2018 (UTC)
Any hard-coded non-regular spaces should be changed and normalized to regular spaces. If someone writes &nbsp; explicitly, sure, respect that, but hard-coded ones should be converted to normal spaces. Headbomb {t · c · p · b}14:26, 28 September 2018 (UTC)
Doable, but more complicated than it looks. Will need to create a do_the_url($url, $param) function that is called, where $param=FALSE for new ones. AManWithNoPlan (talk) 13:34, 21 September 2018 (UTC)
I don't have a problem with some user-activated limited cosmetic edits. The real problem with this edit is that |url= is arguably a parameter we likely want to be used down the road, so removing it doesn't prevent bad usage/encourage standard usage. Unlike, say, removing an empty |page= when |pages= is set. Headbomb {t · c · p · b}11:18, 29 September 2018 (UTC)
Run bot against, for example AM Herculis, and it consistently claims no match for three of the bibcodes (2000A&A...361..952H, 1995A&AS..114..269D, and 1977S&T....53..351L). Several others are consistently found.
What should happen
The bibcodes exist, they should be located, checked, and expanded if necessary. Presumably there is something about them, and a number of other bibcodes, that is not valid for the method the bot uses to look them up.
Some more info: the problem is not specific to the bibcodes, but something related to the internal workings of the bot. See [2] for an example where bibcode 1995A&AS..114..269D was expanded without a problem. The bot wrote
For the AM Herculis case, it wrote - oh dear, it wrote something else that I've now lost. A rerun gives:
> Expanding from BibCodes via AdsAbs API
> AdsAbs 'big-query' request 26/1000:
> Found match for bibcode 1977ApJ...216L..45K
> Found match for bibcode 1977ApJ...212L.125T
> Found match for bibcode 1924AN....220..249H
! No match for bibcode identifier: 2000A&A...361..952H; 1995A&AS..114..269D; 1977S&T....53..351L
> Checking that DOI 10.1002/asna.19232201505 is operational... DOI ok.
The bigquery API accepts CSV-style form data in a POST request. However the bot is urlencoding it and I don't think this is correct. If so, it would explain the ampersand issue. Lithopsian (talk) 18:16, 3 October 2018 (UTC)
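The ampersand hypothesis above can be illustrated with a minimal Python sketch (the bot itself is written in PHP). The helper name `build_bigquery_body` and the `bibcode` header line are assumptions made for illustration, based only on the "CSV-style form data" description in the comment; this is not the bot's actual request code.

```python
from urllib.parse import quote

def build_bigquery_body(bibcodes):
    """Return a CSV-style body: a header line, then one bibcode per line.

    Hypothetical helper; the header name "bibcode" is an assumption.
    """
    return "bibcode\n" + "\n".join(bibcodes)

# The three bibcodes the bot reported as "No match"; each contains "&".
bibcodes = ["2000A&A...361..952H", "1995A&AS..114..269D", "1977S&T....53..351L"]

raw_body = build_bigquery_body(bibcodes)
encoded_body = quote(raw_body)  # what URL-encoding the same body would produce

print(raw_body)      # ampersands survive untouched
print(encoded_body)  # every "&" has become "%26"
```

If the server compares the posted lines literally, an identifier that arrives with `%26` in place of `&` would not match any stored bibcode, which is consistent with only the ampersand-bearing bibcodes failing.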
|title=Anti-inflammatory and analgesic effects of egg yolk: Mahmoudi M1 et.al., Eur Rev Med Pharmacol Sci. 2013 Feb;17(4):472-6
What should happen
|title=Anti-inflammatory and analgesic effects of egg yolk: A comparison between organic and machine made |journal=European Review for Medical and Pharmacological Sciences |volume=17 |issue=4 |pages=472–6
Is there anyway to detect this and have to bot "recreate" the title and remove all that metadata from the |title= when expanding? (t) Josve05a (c)19:40, 6 October 2018 (UTC)
That happens when there is a comment before the parameter name. parameters.php needs to put comments into the white space, not the parameter name. Currently parameters with comments before the name are ignored completely. AManWithNoPlan (talk) 13:26, 9 October 2018 (UTC)
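A rough Python sketch of the idea described above: treat any HTML comments in front of the name as part of the leading white space, so the name itself stays clean. The regular expression and the function name `split_parameter` are illustrative assumptions and do not reflect how parameters.php is actually written.

```python
import re

# Leading white space and HTML comments are captured together in "pre",
# so they never leak into the parameter name.
PARAM_RE = re.compile(
    r"^(?P<pre>(?:\s|<!--.*?-->)*)"  # white space and comments before the name
    r"(?P<name>[^=<]*?)"             # the parameter name itself
    r"\s*=\s*"                       # the equals sign with optional padding
    r"(?P<value>.*)$",               # everything after the first "="
    re.DOTALL,
)

def split_parameter(text):
    """Split one template parameter into (leading, name, value), or None."""
    match = PARAM_RE.match(text)
    if match is None:
        return None
    return match.group("pre"), match.group("name"), match.group("value")

print(split_parameter("<!-- checked 2018 -->title=Egg yolk"))
```

With this split, a parameter such as `<!-- checked 2018 -->title=Egg yolk` is still recognized as `title`, and the comment can be written back unchanged as part of the leading text.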
The bot does not recognize the 'encyclopedia=' option in cite encyclopedia. I removed the 'journal=' option from a reference because its contents were duplicate to the contents of 'encyclopedia=', but the bot added the title of the encyclopedia back as a 'journal=' option
What should happen
The bot shouldn't have to create an additional field with a duplicate parameter
I don't entirely understand why your first diff is using cite encyclopedia. Is that work actually an encyclopedia? --Izno (talk) 21:45, 10 October 2018 (UTC)
editor-firstn parameter replaced by editorn-first, and editor-lastn parameter replaced by editorn-last. Might also affect other parameters following this pattern (like editor-linkn or author-first/last/linkn, but not in this example)
What should happen
These parameters should be left alone. (It is okay to insert the hyphen into parameters (editorlast -> editor-last etc.). It is also okay to expand the old parameters last and first to author-last/first or editor-last/first, when it is known, that the person was either an author or an editor.)
If removing |url= from a cite template, also remove {{subscription required}} if it is the only other content of the <ref></ref>.
If not removing |url= from a cite template, replace {{subscription required}} with |subscription=yes (from outside the cite template to be included in the cite template) if it is the only other content of the <ref></ref>.
Not really junk as the publisher is not specified. [s.n.] is for Sine nomine ie "without a name". Perhaps better to either omit - or include the text "Publisher not specified". - Aa77zz (talk) 18:48, 12 October 2018 (UTC)
This will require 19 arrays: one for each month and day of the week, padded with spaces. Each one will include a bunch of non-English words. Then, using a Unicode-aware case-insensitive regex search and replace, we would pad punctuation and the string itself with spaces, then search and replace on the arrays, then de-pad, and lastly call our date handler and pray. AManWithNoPlan (talk) 13:00, 14 October 2018 (UTC)
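The approach outlined above can be sketched in Python as follows. The tiny `MONTHS` table and the function name `anglicize_months` are made up for illustration, and the sketch uses regex lookarounds instead of the space padding described in the comment, which is a simplification rather than the bot's method.

```python
import re

# Tiny illustrative table; a real one would cover all months and weekdays.
MONTHS = {
    "janvier": "January", "enero": "January", "januar": "January",
    "mars": "March", "marzo": "March", "märz": "March",
    "octobre": "October", "octubre": "October", "oktober": "October",
    "décembre": "December", "diciembre": "December", "dezember": "December",
}

# Longest alternatives first; lookarounds stand in for the space padding.
_PATTERN = re.compile(
    r"(?<!\w)("
    + "|".join(sorted(map(re.escape, MONTHS), key=len, reverse=True))
    + r")(?!\w)",
    re.IGNORECASE,
)

def anglicize_months(date_text):
    """Replace known non-English month names with their English equivalents."""
    return _PATTERN.sub(lambda m: MONTHS[m.group(1).lower()], date_text)

print(anglicize_months("14 Octobre 2018"))
print(anglicize_months("3. März 2017"))
```

The result would then be handed to the ordinary date handler; words that merely contain a month name, such as "marsupial", are left alone because a match must not be adjacent to another word character.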
The bot is choking on essentially every page I try for a couple of days now. I don't see any obvious theme in the diagnostic output. Here's a couple of examples:
Despite the presence of a df=mdy-all card, the citation Bot incorrectly added a date in YYYY-MM-DD format
What should happen
Ideally, date should be added in the required format. This may be known from the {{use}} card, or the |df= in the citation. If the Citation Bot cannot determine what format the date should be in, then it should not add it.
|df= makes whatever date is present display in the format in question--it is not a requirement on that date to be X or Y format. This is not an incorrect behavior. --Izno (talk) 19:52, 15 October 2018 (UTC)
Also, this parameter was created specifically for bots and other automated tools so that they would not have to worry about date formatting. IABot, last time I looked, even provides a blank |df= parameter for editors to use.
The |df= parameter does not do that - it displays wrongly, as you can see.
You are going to have to prove that. Here are the templates that you modified in your edit immediately subsequent to the bot's edit; here these templates are as the bot left them:
(legend: ✓ – template has |df=mdy-all; ✗ – template does not have |df=):
✓ "Profile of John Glenn". NASA. December 5, 2016. Archived from the original on December 20, 2016. Retrieved January 28, 2017.
✓ "Glenn Orbits the Earth". NASA. February 20, 2015. Archived from the original on April 20, 2008. Retrieved June 10, 2008.
✓ Christopher Hodapp (December 10, 2016). "Illus. Brother John H. Glenn Jr". FreemasonsForDummies.com. Archived from the original on December 21, 2016. Retrieved December 15, 2016.
✗ "Traditions". Ohio State University Marching and Athletic Bands. 2015-07-23. Retrieved September 10, 2017.
✓ "Traditions". Ohio State University. July 23, 2015. Archived from the original on December 16, 2016. Retrieved December 8, 2016.
✓ "Glenn Research Center". NASA. February 13, 2015. Archived from the original on January 21, 2017. Retrieved January 28, 2017.
The three templates where you changed |date= to |year= are excluded here as not relevant to this discussion.
Three of them were not archives, so there was no |df= parameter. (I've forgotten what the special meaning of |df= without a parameter is.) There was a {{Use mdy dates}} card which the Bot should have honoured. If the choice comes down to adding |df= to every citation, or a {{bots}} card to every article, then the latter wins hands down. Hawkeye7(discuss)01:57, 16 October 2018 (UTC)
That [three] of them were not archives does not prove your claim that the |df= parameter ... displays wrongly.
|df= has nothing to do with archives per se, just dates. When |df= is included in a cs1|2 template without a value, it has the same meaning as when the parameter is omitted entirely. Perhaps you are thinking of |dead-url=, which when empty means |dead-url=yes, the default state when |dead-url= is omitted from the cs1|2 template. Because people forget this stuff, there is documentation at the template page. When you forget how a template parameter works, consult the documentation.
And I've spent a great deal of time updating template documentation that was missing or incorrect. In this case, the documentation doesn't say what the meaning is when the |df= parameter is omitted entirely. It should default to the value of the {{use dmy dates}} or {{use mdy dates}} card, if present. The documentation implies that it does this, because it says: Use same format as other publication dates in the citations. I'm not going to update the documentation without confirmation from you. Hawkeye7(discuss)22:36, 16 October 2018 (UTC)
The only cs1|2 parameter that has meaning when empty or omitted is |dead-url= as I described above. cs1|2 templates cannot see what is outside of their bounding {{ and }}; for them, {{use dmy dates}} and {{use mdy dates}} do not exist. Use same format as other publication dates in the citations is a directive to the user, not an indication of what the template does.
Where does the term 'card' come from? You have used card in this discussion as a synonym for 'template' and 'parameter'.
Okay, I am withdrawing this. The title is wrong. Am raising two new bug reports. The bug report was wrong; I expected the Bot to use the correct date format and not rely on the df card, which is not normally present, or on its default behaviour when it is not, which is undocumented. But the CS template is correctly reformatting the date. Hawkeye7(discuss)22:58, 16 October 2018 (UTC)
Two references to books with google scans were changed to cite journal. The books were Klein 1795 and Stark (& Sclater) 1900. The bot confused the books themselves with reviews of the books published in the journal Nature. Aa77zz (talk) 08:18, 27 September 2018 (UTC)
{{cite web|url=https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1134094 |title=The Problem of False Confessions in the Post-DNA World| work=www.papers.ssrn.com}} That is really hard to fix. That will take some thought. AManWithNoPlan (talk) 16:04, 16 October 2018 (UTC)
When it is an edit assisted with Citation bot (i.e. using my account), those do not show up on the bots contrib-page. (t) Josve05a (c)13:12, 28 September 2018 (UTC)
Three steps. A tag would need to be created (no idea how). The bot could then tag its own edits (useless, and duplicates contributions). To make it useful, another bot would then find all edits with our "assisted by cite bot" text in the summary and tag them (no idea who would do that, but I would think it would not be very hard). AManWithNoPlan (talk) 03:22, 11 October 2018 (UTC)
In step 3, it should be the user who ‘adds the tag’ automatically when making the edit (see the first link for how that tool automatically adds the tag if their (JavaScript?) tool was used when editing). Not a bot that tags the edits after the fact, but the users themselves in real time. (t) Josve05a (c)15:08, 11 October 2018 (UTC)
How does one find out which editor is actually running the bot for a particular edit? I cannot believe that in a transparent and collegiate editing environment such as we (presumably) have, it is possible to edit truly anonymously via Citationbot? I note that the page reminds editors that they are responsible for every edit they make with the bot, but I do not see how that can be enforced if no-one knows who it actually is! Hopefully, I'm missing something blindingly obvious. Any thoughts? —SerialNumber54129 paranoia /cheap sh*t room14:48, 2 October 2018 (UTC)
Indeed it is; and as you can probably imagine, it is those that do not that I am interested in :) viz, they that merely say (...You can use this bot yourself. Report bugs here.|User-activated.) (undo), if you see what I mean... —SerialNumber54129 paranoia /cheap sh*t room16:10, 2 October 2018 (UTC)
Yeah, those are when a user runs the bot without &user=USERNAME in the URL. One could make that a prerequisite, but you can type whatever username you want...so not stopping anybody pretending to be "user foo". (t) Josve05a (c)16:25, 2 October 2018 (UTC)
While it is probably unlikely to happen, If I understand correctly from the explanation above it is also possible to put in a username of another user, which may be prone to abuse and could be seen as unintended behavior. Redalert2fan (talk) 13:35, 4 October 2018 (UTC)
In this edit the bot capitalized the first letter of each non-trivial word in a journal title. However, some citation styles use sentence case capitalization for titles. Also, the case in a citation is independent of how the source chooses to write the title, so grabbing it from some database is invalid. The citation style for templates does not specify whether titles should be so-called title case or sentence case. So why is the bot making this change? Jc3s5h (talk) 14:53, 15 October 2018 (UTC)
This has been the standard for over a decade -- even back when the bot ran automatically without a human requesting it. Others can chime in on this topic -- and we know that they will. AManWithNoPlan (talk) 15:53, 15 October 2018 (UTC)
They are. The reason why dates are singled out is because the CS1 templates will throw out errors when dates are badly presented, and the templates aren't smart enough to throw out errors when things aren't capitalized properly, so there's less of a need for explanations. MOS still applies though. Headbomb {t · c · p · b}22:55, 15 October 2018 (UTC)
Looking at Help:CS1 more closely, I see this passage for title case:
Use title case unless the cited source covers a scientific, legal or other technical topic and sentence case is the predominant style in journals on that topic. Use either title case or sentence case consistently throughout the article.
WP:Citing sources § Citation style permits the use of pre-defined, off-Wikipedia citation styles within Wikipedia, and some of these expect sentence case for certain titles (usually article and chapter titles). Title case should not be imposed on such titles under such a citation style when that style is the one consistently used in an article.
That's chapter titles, not work titles. And yes the bot should change them, per longstanding consensus to do so and other bots that do similar things. Headbomb {t · c · p · b}00:38, 16 October 2018 (UTC)
The type of work covered by the passage in Wikipedia:Manual of Style/Titles#Capital letters is journals. Journals don't have chapters, they have articles. It is common for titles of journal articles to be rendered in sentence case; the titles of the journals are typically title case. And bots designed to edit citation templates should obey the documentation for those citation templates. Jc3s5h (talk) 01:01, 16 October 2018 (UTC)
Same thing, chapter = article for journals, and bots follow both template docs and the MOS. And on Wikipedia, journal titles are capitalized in title case. See WP:JCW/Target1 for typical usage. Leaving obvious typos out, I count about 50ish cases out of 507819 citations (or <0.01%). And most of those were added by external tools by mistake. Headbomb {t · c · p · b}01:10, 16 October 2018 (UTC)
Directly linked Twitter posts (/status in Twitter link) get the full tweet posted in "title=". Tweets which have an external link in them cause the "external link in title" error because of this. Images included in tweets are also posted in the "title=" parameter (pic.twitter).
When something last was updated should not matter in most cases if they were accessed prior to the last update. What matters is when it was published. If it was published after the accessdate, then something is wrong with either the date or access-date and the bot should disengage due to GiGo causing more GiGo. (t) Josve05a (c)20:37, 3 October 2018 (UTC)
Occasionally a work published near the end of a year will be assigned a publication date of the next calendar year. I've only seen this with printed books; traditionally access dates are not put in a citation for a book, and also, access dates are not used when there is no URL. But there could be other kinds of work where the publisher gives a publication date later than the date the work is actually available, and is of a type where an access date would be appropriate. Jc3s5h (talk) 22:55, 16 October 2018 (UTC)
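The "disengage" rule proposed in this thread could be sketched in Python as below. The function name `dates_are_plausible` is hypothetical, and the check is only a signal to skip a citation; as noted above, some publishers legitimately assign a publication date later than the date a work first becomes available.

```python
from datetime import date

def dates_are_plausible(published, accessed):
    """Return False only when a source claims publication after it was accessed.

    Missing dates are treated as plausible, since there is nothing to compare.
    """
    if published is None or accessed is None:
        return True
    return published <= accessed

# A citation "accessed" months before its stated publication date is suspect.
print(dates_are_plausible(date(2018, 10, 3), date(2018, 1, 15)))
print(dates_are_plausible(date(2016, 12, 5), date(2017, 1, 28)))
```

A caller would leave the citation untouched when the function returns False, rather than trying to guess which of the two dates is wrong.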
Since all dates are added in add_if_new() it should not be too hard. I should note that when editing part of a page it will ignore the use template if it is not within the area being edited, but that is on user. AManWithNoPlan (talk) 03:35, 17 October 2018 (UTC)
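As an illustration of the behaviour requested in this thread, the Python sketch below looks for a {{use dmy dates}} or {{use mdy dates}} template in the text being edited and formats a new date accordingly. The function names and the ISO fallback are assumptions for illustration; this is not the code in add_if_new().

```python
import re
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

def detect_date_style(wikitext):
    """Return "dmy", "mdy", or None depending on the {{use ... dates}} template."""
    if re.search(r"\{\{\s*use\s+dmy\s+dates", wikitext, re.IGNORECASE):
        return "dmy"
    if re.search(r"\{\{\s*use\s+mdy\s+dates", wikitext, re.IGNORECASE):
        return "mdy"
    return None

def format_date(value, style):
    """Format a date in the detected style; fall back to ISO when unknown."""
    month = MONTH_NAMES[value.month - 1]
    if style == "dmy":
        return f"{value.day} {month} {value.year}"
    if style == "mdy":
        return f"{month} {value.day}, {value.year}"
    return value.isoformat()

page = "{{Use mdy dates|date=October 2018}}\nJohn Glenn was an astronaut."
print(format_date(date(2015, 7, 23), detect_date_style(page)))
```

As the comment above points out, the detection only sees the text it is given, so a section edit that does not include the template would fall back to the default format.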
Bourke, Richard Michael. "Edible indigenous nuts in Papua New Guinea". In Stevens, M.L.; Bourke, Richard Michael; Evans, Barry R. (eds.). South Pacific Indigenous Nuts. Proceedings of a workshop held from 31 October to 4 November 1994 at Le Lagon Resort, Port Vila, Vanuatu. Australian Centre for International Agricultural Research Proceedings. Vol. 69. Canberra: Australian Centre for International Agricultural Research. pp. 45–55. ISBN 1 86320 485 7. OCLC 38390455. Retrieved 27 September 2018. {{cite book}}: |format= requires |url= (help)
Is it possible to detect how big (in bytes) a page is or if there is visible content (or any content besides HTML tags) on it? (t) Josve05a (c)21:24, 20 October 2018 (UTC)
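One rough way to answer both halves of the question, sketched in Python with only the standard library: measure the byte length of the fetched HTML and check whether any text remains once tags, scripts, and styles are ignored. The class and function names are hypothetical, and the sketch does not fetch anything itself; it assumes the HTML has already been downloaded.

```python
from html.parser import HTMLParser

class _TextCollector(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def page_summary(html_text):
    """Return (size in bytes as UTF-8, whether any non-tag text exists)."""
    parser = _TextCollector()
    parser.feed(html_text)
    parser.close()
    visible = "".join(parser.chunks).strip()
    return len(html_text.encode("utf-8")), bool(visible)

print(page_summary("<p>Hello</p>"))
print(page_summary("<html><body><script>var x = 1;</script></body></html>"))
```

This is deliberately crude: text inside `<title>` counts as content, and pages that render everything with JavaScript would look empty.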
! Unsupported response for URL https://websites.pmc.ucsc.edu/~fnimmo/website/White_Pluto.pdf: {"url":"https://websites.pmc.ucsc.edu/~fnimmo/website/White_Pluto.pdf","session":"bZzvBT3v3FjS5bA","items":{"10.1016/j.apal.2014.04.005":"Definable functions continuous on curves in o-minimal structures","10.1016/j.icarus.2017.01.011":"Geological mapping of Sputnik Planitia on Pluto"}}
{{Cite journal |last1=Todd |first1=Peter M |year=1994 |title=Music and Connectionism |journal=Acoustical Society of America Journal |volume=96 |issue=2 |pages=1218 |bibcode=1994ASAJ...96.1218T |doi=10.1121/1.410341}}
It should expand it to include the other authors
{{Cite journal |last1=Todd |first1=Peter M |last2=Loy |first2=D. Gareth |last3=Dipalma |first3=Louis P |last4=Hamilton |first4=David J |year=1994 |title=Music and Connectionism |journal=Acoustical Society of America Journal |volume=96 |issue=2 |pages=1218 |bibcode=1994ASAJ...96.1218T |doi=10.1121/1.410341}}
However, if |display-authors=etal is set, it shouldn't expand the authors. And if |display-authors=n is set, then it should only expand up to |lastn= and |firstn=.
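The rule stated above can be sketched in Python. The function `authors_to_add` and its arguments are hypothetical, and leaving the citation alone when |display-authors= holds an unrecognized value is an added assumption, not something requested in the thread.

```python
def authors_to_add(existing_count, available_authors, display_authors=None):
    """Return the authors that may be appended to a citation.

    existing_count    -- number of authors already present in the template
    available_authors -- full author list from the data source, in order
    display_authors   -- value of |display-authors=, or None when absent
    """
    if display_authors is None:
        limit = len(available_authors)   # no restriction: fill in everyone
    else:
        value = str(display_authors).strip().lower()
        if value == "etal":
            return []                    # editor chose "et al.": add nobody
        if not value.isdigit():
            return []                    # unknown value: leave it alone (assumption)
        limit = int(value)               # expand only up to |lastn=/|firstn=
    return available_authors[existing_count:limit]

authors = ["Todd", "Loy", "Dipalma", "Hamilton"]
print(authors_to_add(1, authors))          # all missing authors
print(authors_to_add(1, authors, "etal"))  # nothing added
print(authors_to_add(1, authors, "2"))     # only up to the second author
```

Because the slice starts at the number of authors already present, a citation that already lists as many authors as |display-authors=n allows receives nothing.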
Likewise, if it comes across
{{Cite journal |last1=Todd |first1=Peter M |last2=Loy |first2=D. Gareth |last3=Dipalma |first3=Louis P |last4=Hamilton |first4=David J |last5= |first5= |last6= |first6= |last7= |first7= |year=1994 |title=Music and Connectionism |journal=Acoustical Society of America Journal |volume=96 |issue=2 |pages=1218 |bibcode=1994ASAJ...96.1218T |doi=10.1121/1.410341}}
Past complaints about the bot's behaviour with respect to authors were because it messed with style, and added authors when |display-authors=etal was specified, or beyond |lastn/firstn/authorn= when |display-authors=n was specified. I should know, I was one of those making those complaints. Headbomb {t · c · p · b}21:29, 18 October 2018 (UTC)
// If we already have name parameters for author, don't add more
if ($this->initial_author_params && in_array($param_name, FLATTENED_AUTHOR_PARAMETERS)) {
    return FALSE;
}
We have to write quite a bit of code to deal with all the crazy existing data possibilities, such as pages with last1, last2, and last3, and authors 4-7 all in last4 separated by commas. AManWithNoPlan (talk) 02:12, 19 October 2018 (UTC)
In the edit below, an isbn was spotted and a cite journal was translated to a cite book. The "Journal" parameter in that instance could have been converted to a "series" parameter. It's not clear whether this will always be the desired behaviour, but I'm putting it out there as a possibility; thoughts welcome.
I cited a source, Summary of World Broadcasts: Non-Arab Africa and put under the publisher parameter "BBC Monitoring", the division of the British Broadcasting Corporation that compiled the radio transcripts and published them in the journal. The citation bot eliminated the publisher parameter and attempted to move the info to the author name parameters: as "last1 = Monitoring Service| first1 = British Broadcasting Corporation". This is quite incorrect, as BBC Monitoring should not be interpreted as an author in this case and, even if it was, it quite clearly doesn't divide according to a naming scheme for persons.
That's mostly because the wrong citation template was used: cite journal instead of news or book. The bot is a little over-trusting of humans at times. AManWithNoPlan (talk) 17:26, 26 October 2018 (UTC)
Treat <ref>[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/ https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/]</ref> the same as <ref>https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/</ref>
Treat {{cite foo|url=https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/ |title=https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/}} the same as {{cite foo|url=https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5474099/ |title=Empty}}
Do not change the PDF url (which is open source) to a mostly useless PMC link.
We can't proceed until
Feedback from maintainers
That's horrible of them. Other than checking for metadata or scraping a webpage, I cannot see any way to tell. Am I missing an obvious clue? AManWithNoPlan (talk) 22:38, 22 October 2018 (UTC)
changes |doi=10.1002/1097-0142(19920315)69:6+<1578::AID-CNCR2820691312>3.0.CO;2-K to |doi=10.1002/1097-0142(19920315)69:6 <1578::AID-CNCR2820691312>3.0.CO;2-K
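The report does not say why the plus sign was lost. One plausible cause, offered here purely as an assumption, is form-style URL decoding, which treats "+" as an encoded space. The Python sketch below shows the difference between the two standard decoders using the DOI from the report.

```python
from urllib.parse import quote, unquote, unquote_plus

doi = "10.1002/1097-0142(19920315)69:6+<1578::AID-CNCR2820691312>3.0.CO;2-K"

# Form-style decoding turns "+" into a space, reproducing the reported damage.
print(unquote_plus(doi))

# Plain percent-decoding leaves a literal "+" alone.
print(unquote(doi))

# Encoding the DOI first (so "+" becomes "%2B") makes either decoder safe.
encoded = quote(doi, safe="")
print(unquote_plus(encoded))
```

If this is indeed the cause, percent-encoding the DOI before it passes through any form-style decoding step, or using the plain decoder on values that are not form data, would preserve the "+".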
Valid printed ISBN-10 are replaced by calculated ISBN-13 for book published before 2000-01-01.
What should happen
In citations we should use the ISBN as printed on the book in front of us, not some ISBN found in the net, not some calculated ISBN. ISBN-13s were introduced on 2000-01-01, older books only contained an ISBN-10. Therefore the bot should not automatically replace an ISBN-10 by a calculated ISBN-13 for books published before 2000 (unless there would be a reprint edition actually using the ISBN-13). For books published after this date, it is okay to replace an ISBN-10 by an ISBN-13, but only if the ISBN-13 was found printed on the book as well. If both ISBNs are given on the book, we use the ISBN-13. While both ISBNs correlate with each other, using the "wrong" ISBN makes it difficult for humans to search for matches (not a problem for machines, which can calculate the ISBN).
The agreement is to cite the actually used source per WP:SAYWHERE. There is no agreement to systematically change ISBN-10s to ISBN-13s unless the source provided an ISBN-13 as well, in particular not automatically. It wouldn't be a problem if ISBN-13s were a super-set of ISBN-10s, but the somewhat odd application of the checksum causes it to be different enough from the original number to no longer match searches - thereby making it more difficult for humans to look up and verify information. (You can't expect them to understand the inner semantics of an ISBN number or use an ISBN calculator.) This problem does not occur when SBNs are zero-expanded to ISBN-10s. --Matthiaspaul (talk) 16:04, 17 October 2018 (UTC)
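The checksum point made above can be shown with a short Python sketch of the standard ISBN-10 to ISBN-13 conversion: prefix "978", drop the old check digit, and recompute a new one with alternating weights 1 and 3. The function name is illustrative and the sketch is not the bot's code.

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to its ISBN-13 form (digits only, no separators)."""
    chars = isbn10.replace("-", "").replace(" ", "")
    if len(chars) != 10:
        raise ValueError("expected 10 characters after removing separators")
    core = "978" + chars[:9]          # the old check digit is discarded
    if not core.isdigit():
        raise ValueError("the first nine characters must be digits")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over 12 digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)

# The ISBN from the Bourke citation quoted earlier on this page.
print(isbn10_to_isbn13("1 86320 485 7"))
```

For that ISBN the final digit changes from 7 to 9, so a plain text search for the printed number no longer finds the converted one, which is the difficulty for human readers described above.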
During Featured Article nominations, we are always asked to change to ISBN 13, which indicates there is a consensus for that. If the bot can do this for us, it's a bonus. So this proposal needs to be made at MOS level, not here. Only if a new consensus is reached at MOS, the bot should be changed. FunkMonk (talk) 11:25, 19 October 2018 (UTC)
It's not necessarily a feature proposal, but more a request to refrain from doing something that's causing inconvenience to readers and editors (not to machines, because they can easily convert between the two schemes), and therefore is undesirable. WP:SAYWHERE is a guideline, and I can't find anything in the MOS which would override it.
AFAIR we also have a policy for bots not to carry out unnecessary edits, and while there are cases where switching out ISBNs is perfectly fine (within the parameters given above), systematically changing ISBN-10s into ISBN-13s (without even knowing if they can be actually found by humans printed on the book) is neither necessary nor an improvement. After all, the project is for humans, not machines.
It is not as if ISBN-10s would lack some vital information. So, if the bot cannot adhere to a ruleset similar to that suggested above, it should better just leave it alone and only add a known ISBN when a reference is lacking one (because that's an improvement). --Matthiaspaul (talk) 12:19, 20 October 2018 (UTC)
If the book is reprinted, it will have the isbn 13. Converting the isbn is like adding the area code to a phone number - sadly the last number might change or might not. ISBN organization does want people to use the 13 everywhere. AManWithNoPlan (talk) 13:24, 20 October 2018 (UTC)
Regarding "reprint edition", yes, they will very likely have ISBN-13s (however, I am also aware of a few examples, where this has not been the case). If so, and if the editor actually cites from the reprint edition, using the ISBN-13 is fine. I'm also fine with using the ISBN-13 from a reprint edition even if the editor cites from the original edition, for as long as the reprint is really a 1:1 reproduction of the original including all errata etc. - many reprints, however, have known errata corrected (sometimes even "silently"), so it is not identical and therefore the ISBN from the actually cited source should be used.
Regarding "ISBN organization", while they are not authoritative for us, can you point to anything official from them saying so? Most probably, they just mean that new books should use the ISBN-13 (obvious). After all, they can't change the fact that books used shorter ISBNs for decades, and those books don't disappear or somehow magically change, so ISBN-10s will have to be supported ad infinitum. As an encyclopedia, we have the duty to not rewrite history either. --Matthiaspaul (talk) 21:53, 20 October 2018 (UTC)
When checking citations on the page shown on the link, the bot changed the template for the BBC link from website to news. However, it then listed the BBC as a newspaper and it isn't?
Handle treated as if it were a journal, which they normally are not
What should happen
Treat as a web site
Relevant diffs/links
See below
We can't proceed until
Feedback from maintainers
PS: I don't even know what {{Cite journal|url=https://kb.osu.edu/dspace/handle/1811/50348|title=John Glenn standing beside his F-86 Sabre|journal=John Glenn Archives, the Ohio State University. Original Photo, 4 X 5 Inches|access-date=January 28, 2017|deadurl=no|archiveurl=https://web.archive.org/web/20170202120226/https://kb.osu.edu/dspace/handle/1811/50348|archivedate=February 2, 2017|df=mdy-all|year=1953}} is about. Why does it think that is a journal? Hawkeye7(discuss)01:57, 16 October 2018 (UTC)
Apparently hdls are not used for journals, but for ephemeral web sites. I'm not sure whether they should be used in cite web templates. In any case, the reviewers want access-dates so the sites can be retrieved from archive, and that requires URLs. Hawkeye7(discuss)11:06, 16 October 2018 (UTC)
many of the options being discussed are actually the same template, just aliases. Since journal is not an alias, it was the choice made. AManWithNoPlan (talk) 19:53, 17 October 2018 (UTC)
I have no objections regarding the usage of {{cite document}} if it supports all provided parameters. However, {{cite journal}} is wrong, because "John Glenn Archives" is no journal. (I would probably use |work=John Glenn Archives |publisher=Ohio State University.) If the bot actually changed {{cite web}} to {{cite journal}} because of the existence of a handle, then that's wrong as well, because {{cite web}} might not have been the best possible choice, but it is not a wrong choice.
In cases where there is no 100% clear solution (or it is not known), the best solution for a bot is to just leave it alone because of the high risk of causing much damage in little time if it doesn't work properly. I mean, there certainly are clear-cut cases and it is a relief if a bot can fix them for us, however, it is counter-productive if we cannot trust in a near-perfect behaviour of a bot and have to monitor and clean up after it. I'm somewhat shocked by the large number of reported issues recently. --Matthiaspaul (talk) 21:42, 17 October 2018 (UTC)
Now that I think about it, what is even the point of adding |type=Submitted manuscript? I've seen no WP:MOS describing that this should be done, and nobody but this bot has ever added such comments about URLs. (t) Josve05a (c)00:12, 21 October 2018 (UTC)
Issue: Citation bot seems to be mistaking a single page as an error with the dash/etc and then changes that single page to a page range.
Issue: Citation bot is deleting the names of publishers in this article and in at least some of the cases I *know* - because I did the initial research - that the previous form of the publisher was not incorrect.
For instance:
journal=[[Slate (magazine)|Slate]], January 18, 2006 was changed to journal=Slate, January 18, 2006,
|publisher=Omohundro Institute of Early American History and Culture -> deleted,
|publisher=Presidential Studies Quarterly; Center for the Study of the Presidency and Congress -> deleted,
If you feel that a journal is obscure enough that people need publisher information, then the correct solution is to create a wikipage for the journal and wikilink to it and fix the problem once and for all globally. AManWithNoPlan (talk) 22:31, 23 October 2018 (UTC)
No, that's the wrong attitude. While an article about a journal is always appreciated, this is not a solution to the problem. By default, the publisher information belongs into a reference as much as the journal info. The solution is that your bot should simply refrain from performing actions, which remove info from citations humans felt useful or necessary to add in the first place. Your bot is not entitled to perform any actions overruling humans, except for correcting obvious errors. --Matthiaspaul (talk) 10:46, 24 October 2018 (UTC)
This is suboptimal. If he has a specific page, that is not only sufficient but preferred. The bot should not be making a change here. --Izno (talk) 04:17, 24 October 2018 (UTC)
Please disable the removal of publishers/publication locations in journals. I've been silently watching and I think we're at the point where if that is the way the bot should operate, that consensus should be assessed by an RFC or similar. I'm willing to walk over to WP:BOTN to see the bot blocked over this issue given how many complaints have come up here. --Izno (talk) 04:15, 24 October 2018 (UTC)
I second this, it is a bug, not a "feature". The whole idea of removing parameters with valid contents is silly, and it becomes outright dangerous if it is performed by a bot. Seeing edit summaries in articles and the wall of complaints on this page, it seems as if this bot is causing more damage to the project than doing good stuff - it is in no time destroying the work of human editors, who spent a lot of time to research proper references. In the case of rare references or less frequently visited articles, it means that it is causing damage which is likely to remain permanent. This disruption is not acceptable.
Regarding publishers, some users feel that the publisher is redundant info if it is named almost identical to the name of the journal, but other users don't agree with it. The template parameters exist not only for display purposes, but also to populate meta data, and if there really would be consensus (which I don't think it is) that the publisher name should not show up in rendered citations when it is identical to the journal, it is the citation template that should suppress it in the rendered display output, not a bot to remove the information from the reference at all. It's trying to fix a (perceived) problem at the wrong level.
(edit conflict) I don't understand why the deletion of the publisher parameter is a feature. And being "obscure" has nothing to do with it, I thought the whole point of cites was to give readers as much information about the source as possible, to make it easy for readers to verify asserted facts. Why does the bot delete the publisher? If that is a clearly-approved part of that particular citation template why does Citation bot over-write and remove editors' valid contributions? I don't understand the logic of the deletion.
The past content/edit was "page=" and the bot changed that parameter to "pages=". Template: Cite journal/Template:Cite journal#In-source locations states "page=" "The number of a single page in the source that supports the content" and doesn't say "at=" is preferred. Also, the "Templates" option in the editing window's toolbar only gives editors "page=", there is no "at=" included for "cite journal"... Maybe that's one reason why "at=" doesn't appear within these cites.
AManWithNoPlan As an aside, I was posting here because I didn't understand why something was happening, I thought it might be a bug in the bot. I understand that you might get a lot of queries about issues that possibly seem self-evident to you but people ask questions or post about a possible problem because they don't understand, because they want to know and want to learn. None of us came to Wikipedia knowing everything there is to know about it, even the most experienced editor around here was a complete Wikibaby at some point and there is so much Wikicoding and so many areas to edit in, we all continue to be Wikibabies to some degree. Shearonink (talk) 05:48, 24 October 2018 (UTC)
I appreciate your complaint. If I had a dollar for every time someone said that they had seen this bug for years and were only now reporting it...... AManWithNoPlan (talk) 13:13, 24 October 2018 (UTC)
What style guide out there requires/recommends putting the publisher for a journal citation? None. So that's why the bot does what it does concerning publishers in journal citations. For the other thing, that's due to parameter misuse. Put the date in |date= and the bot will behave. Headbomb {t · c · p · b}13:21, 24 October 2018 (UTC)
What style guide out there requires/recommends putting the publisher for a journal citation? Irrelevant. If there is evidence of non-consensus regarding some action of the bot, WP:BOTPOL is clear. --Izno (talk) 13:37, 24 October 2018 (UTC)
I'd ask for consensus to include that information in the first place. No style guide out there recommends that. No mainstream professional publication includes them in citations. Not even our own Wikipedia:Citing sources#Journal articles mentions including publishers (see also CS1 documentation). The only people who want to include it are people under the misguided impression that just because a parameter exists, it must be used, and that citations need maximal information. By that logic, we'd include author emails, author addresses, ... just because this too is information. But it's not pertinent information. No one goes to a library and asks "I need Tattoli et al (2012) 'Bacterial autophagy'... I don't know the journal, but at the time, it was published by Landes Bioscience, who was acquired by Taylor & Francis." Headbomb {t · c · p · b}14:44, 24 October 2018 (UTC)
I'd ask for consensus to include that information in the first place No, that's not how BOTPOL works. Do I actually need to recommend a block on the bot at BOTN? @AManWithNoPlan: --Izno (talk) 14:45, 24 October 2018 (UTC)
You're the one that wants to change longstanding behaviour, I'd argue the onus is on you to show that consensus changed. Headbomb {t · c · p · b}14:53, 24 October 2018 (UTC)
As for blocking the bot, it does not make edits on its own. It is always user initiated. It is authorized to run unattended, but we do not do that at this time. AManWithNoPlan (talk) 15:22, 24 October 2018 (UTC)
That doesn't answer the question. Will you disable the specific functionality related to removal or will I need to go to BOTN? --Izno (talk) 16:00, 24 October 2018 (UTC)
Still irrelevant. "Longstanding behavior" is actually "Headbomb made this request solo within the past month or 3" and since that time several people have objected to it. That means it clearly does not have consensus at this time. BOTPOL is clear on the point. --Izno (talk) 16:00, 24 October 2018 (UTC)
I did not make that request, and, as AManWithNoPlan said, the bot is user-activated. Whoever activates it is responsible for its edits. If they want to have a special snowflake article that violates every style manual out there, that's on them. Headbomb {t · c · p · b}16:31, 24 October 2018 (UTC)
It is only now that I have seen this bot removing publisher information and doing all kind of other questionable things, and I'm around for much longer than a decade. So, either its behaviour has changed or it is used much more than in the past, or it is now used by people, who do use it to get rid of publisher info because that's their preferred style. Either case, fact is that there are now several complaints regarding the removal of publisher information on this page, indicating that this behaviour is not wanted. Therefore, remove this behaviour. --Matthiaspaul (talk) 22:39, 24 October 2018 (UTC)
Scholarly journals do NOT include the publisher of journals in their footnotes. Style manuals like the CHICAGO MANUAL do not recommend including publishers for journals. One big problem is that publishers change very often and the current publisher had nothing to do with the article in question. Rjensen (talk) 00:13, 25 October 2018 (UTC)
Who cares about Chicago style? We are Wikipedia and have our own style(s), which allow such info to be included because it is useful to build the web (inside and outside of WP) and helps further research and reverse lookup. We are electronic, we are machine readable, space is no issue.
While it is true that publishers often change, even this is important information for historical research. There have been several cases already where knowing a publisher helped me to locate historical journal articles I would not have been able to identify without this information because of abbreviations and liberal spelling changes. And since we cannot predict the future, what might seem redundant info now might help future readers in a couple of decades to locate present sources. So, by default, publisher info is definitely useful and must not be removed.
Nobody can force you to add it if you just don't want to include it, but it is nothing but hubris to remove publisher info added by another editor because you don't find it useful. The other editor obviously did.
the publisher of a journal article is not useful info in any way for Wiki readers or editors and no one here has claimed it to be useful. When it comes to books the publisher is useful and important information because the publisher makes the decision on the publication and content of the book. In Scholarly journals, on the other hand, the publisher only handles subscriptions, printing, and mailing and online distribution of current issues. They were in no way responsible for issues before they became publisher and it is seriously misleading to suggest that to readers. Editorial decisions about the content are not made by the publisher but by an entirely separate organization called the editorial board of the Journal. Rjensen (talk) 05:11, 25 October 2018 (UTC)
If a journal is obscure enough that you need publisher information to find it then please create a page for that journal and help the world. The wiki style guides state even ISBNs are of questionable usefulness, so there is certainly precedent for not adding every citation parameter. AManWithNoPlan (talk) 13:29, 25 October 2018 (UTC)
It is shocking to see that someone operating a bot has this attitude to problem solving - you are thereby serving your bot, but not the project.
Not every editor citing from a journal source is prepared to create an article about the journal, and why should s/he, anyway? As much as I appreciate it when someone writes an article, it is not necessary. --Matthiaspaul (talk) 21:32, 25 October 2018 (UTC)
It happens that I am one of those editors who find them useful for research, including the research necessary to further improve Wikipedia. I even gave examples. You will simply have to accept that different people have different expectations and needs. If you remove (correct) publisher info added by other editors, this is disruptive.
There is one exception: If the publisher name is identical to the journal name, this looks a bit odd in a citation (although it is technically correct and not redundant). Only in this case the publisher info can be suppressed, but this is something that should happen in the code of the citation template, not by removing the parameter value itself (and thereby losing the information that they are identical). --Matthiaspaul (talk) 21:32, 25 October 2018 (UTC)
The real solution is wikilinking to a page about the journal. The publisher is relevant to the journal itself, not the page it is referenced on. AManWithNoPlan (talk) 02:10, 26 October 2018 (UTC)
Didn't really want to get involved here, but I think it is quite clear there is disagreement over the cite bot's ability to alter publisher info and it should be suspended pending further discussion. I don't think wikilinking every journal name is the solution, especially when there are journals that have no apparent notability. As for the comments about the limited usefulness of ISBNs (as if to say one style guide comment represents full consensus on a matter), I'd like to point out that every time I've brought an article through FA or A-class review (at the MilHist project) I've always been asked to provide a number identifier for books and journals. I would also like to note that the citation bot is removing publication location info too, and in my experience my peers have also preferred it when I include this info. -Indy beetle (talk) 07:01, 26 October 2018 (UTC)
Solution: Don't use the bot if you want to have an article that violates every style guide out there. The bot only removes locations for journals, since that too is useless. It leaves them where style guides recommend them (e.g. books). Headbomb {t · c · p · b}11:12, 26 October 2018 (UTC)
Then talk to that editor and gain consensus for having non-standard citations that violate style guides. Or use {{nobots}} or equivalent. Headbomb {t · c · p · b}01:15, 28 October 2018 (UTC)
Because it's still a {{cite journal}}, and that information is still useless for journals. What people said was to wikilink the journal (i.e. Warship International) to have readers find information about the publication if they want to know who the publisher is. Alternatively, you could {{nobots}} or {{cite journal<!-- Deny Citation Bot-->|...}}, or use {{cite magazine}} to cite it as a magazine. Headbomb {t · c · p · b}12:46, 28 October 2018 (UTC)
"It is shocking to see that someone operating a bot...." odd comment considering that the operator is not involved in this conversation. AManWithNoPlan (talk) 01:31, 28 October 2018 (UTC)
"And linking the publisher makes no difference; it still gets deleted" that was never suggested by anyone that I saw. Linking the journal was. AManWithNoPlan (talk) 16:16, 28 October 2018 (UTC)
There are two points. In addition to removing via if url is empty, also remove via (and the url) if the url points to the same page as pmid. Those are not the same thing. Is citation bot already doing the latter? Boghog (talk) 16:38, 28 October 2018 (UTC)
No, not always. |asin= always links to Amazon.com, not Amazon.co.uk. They may differ and sometimes do not carry the same titles, which might make a former .co.uk link to a |asin= become a dead link. (t) Josve05a (c)20:37, 20 October 2018 (UTC)
If you must link to amazon and you must link to the uk amazon then you should use |asin= and set |asin-tld=co.uk. But, in this:
{{Cite book|title=Molecular Modelling: Principles and Applications|last=Leach|first=Dr Andrew|date=30 January 2001|publisher=Prentice Hall|isbn=9780582382107|edition= 2nd|location=Harlow|language=English|id= {{ASIN|0582382106|country=uk}}}}
we have |isbn=9780582382107 which links to Special:BookSources where there are links to all of the amazon tlds and which holds the first 9 digits of the value in {{asin}} so {{asin}} can and should be deleted (we are not here to feed prospective customers to amazon or to any other book monger).
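The overlap between the |isbn= value and the {{ASIN}} value in the example above can be checked mechanically. The following is only a rough sketch in Python, not the bot's code: the function name `isbn13_to_isbn10` is hypothetical, and it relies on the standard ISBN-10 check-digit rule (weights 10 down to 2, modulo 11). For books, the Amazon identifier is commonly the ISBN-10, which is what happens in this citation.

```python
def isbn13_to_isbn10(isbn13: str) -> str:
    """Convert a 978-prefixed ISBN-13 to its ISBN-10 form (hypothetical helper)."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit() or not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13 values have an ISBN-10 form")
    # The nine digits after the 978 prefix are shared by both forms.
    core = digits[3:12]
    # ISBN-10 check digit: weights 10..2 over the nine core digits, modulo 11.
    total = sum(int(d) * w for d, w in zip(core, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


# The ISBN from the citation above yields the same value as its {{ASIN}}.
print(isbn13_to_isbn10("9780582382107"))  # 0582382106
```

A tool could use a comparison like this to decide that an {{ASIN}} is redundant to an existing |isbn=, but whether it should then remove it is a question for the discussion above, not for the sketch.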
I keep removing this DOI (since a free PDF version is available and linked) and Citation Bot keeps putting it back. I first reported the broken DOI to the publisher in 2016; clearly, they're not going to fix it! MeegsC (talk) 13:35, 22 October 2018 (UTC)
<ref>[https://web.archive.org/web/20060503182230/http://www.britannica.com/eb/article-9015241]</ref> to <ref>{{cite web | url=https://web.archive.org/web/20060503182230/http://www.britannica.com/eb/article-9015241 | title=Bell Laboratories -- Encyclopædia Britannica| date=2006-05-03}}</ref>
and
<ref>[https://web.archive.org/web/19970116221538/http://www.bell-labs.com/project/dali/]</ref> to <ref>{{cite web | url=https://web.archive.org/web/19970116221538/http://www.bell-labs.com/project/dali/ | title=The Dali Home Page| date=1997-01-16}}</ref>
What should happen
If the URL is https://web.archive.org/web/, add it in |archive-url= and add the original URL as the URL, etc.
Archive.org is the most common, but there is also webarchive.org and archive.is and others -- see WP:WEBARCHIVES for domain name particulars. Also they should have |dead-url=yes -- GreenC19:36, 8 October 2018 (UTC)
No need to add |dead-url=yes because it does nothing; yes is the default state when |dead-url= is empty or omitted.
Agreed the date is wrong and non-fixable by bot. IABot and WaybackMedic will do the rest but no guarantees if or when they get to it, they don't seek them out, it's incidental. It is involved to get it right due to the many archive services and URL patterns to extract the source URL and identify an archive URL. Should have a standard library for web archives, I have one but it's in a language no one else on Wikipedia uses. Some day I should learn PHP to port it for wider use. -- GreenC14:44, 1 November 2018 (UTC)
It was said ---- No bot corrects |date=1997-01-16 to |archive-date=1997-01-16, so we should at least not add |date= for such URLs.---- I am curious about your rationale. In this case, the date and archive-date should be the same. It is the date that the URL is from. AManWithNoPlan (talk) 21:21, 1 November 2018 (UTC)
Think "a book published in March 1912, but added on Google Books in 2017". In |date= we would add "March 1912" not "2017". Same with archive dates. Just because archive.org archived it on a specific date, that is not the date the document/page/news article was published. (t) Josve05a (c)21:35, 1 November 2018 (UTC)
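The requested handling of Wayback Machine links can be sketched as follows. This is an illustration in Python rather than the bot's own code; the names `WAYBACK` and `split_wayback_url` are hypothetical, and only the web.archive.org URL pattern is covered (other archive services listed at WP:WEBARCHIVES use different patterns and would need their own rules). In line with the objection above, the sketch fills |archive-date= from the snapshot timestamp and deliberately leaves |date= alone.

```python
import re
from typing import Dict, Optional

# Wayback URLs look like https://web.archive.org/web/<14-digit timestamp>/<original URL>;
# an optional flag such as "id_" may follow the timestamp.
WAYBACK = re.compile(r"^https?://web\.archive\.org/web/(\d{14})[a-z_]*/(https?://.+)$")


def split_wayback_url(url: str) -> Optional[Dict[str, str]]:
    """Return citation parameters for a Wayback URL, or None if it is not one."""
    match = WAYBACK.match(url)
    if match is None:
        return None
    stamp, original = match.groups()
    # The timestamp is YYYYMMDDhhmmss; only the date part is needed.
    archive_date = f"{stamp[0:4]}-{stamp[4:6]}-{stamp[6:8]}"
    # No "date" key: the snapshot date is not the publication date.
    return {"url": original, "archive-url": url, "archive-date": archive_date}


params = split_wayback_url(
    "https://web.archive.org/web/19970116221538/http://www.bell-labs.com/project/dali/"
)
print(params["url"])           # http://www.bell-labs.com/project/dali/
print(params["archive-date"])  # 1997-01-16
```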
updating a citation on page 2018 World Series from here, thought author name was "Hunter Felt at Fenway Park" (last1=Park, first1=Hunter Felt at Fenway)
<p class="byline" data-link-name="byline" data-component="meta-byline"><span itemscope="" itemtype="http://schema.org/Person" itemprop="author"><a rel="author" class="tone-colour" itemprop="sameAs" data-link-name="auto tag link" href="https://www.theguardian.com/profile/hunter-felt"><span itemprop="name">Hunter Felt</span></a></span> at Fenway Park</p>
I should note that the above HTML is irrelevant since the code in question uses meta-data and that is sadly "byline":"Hunter Felt at Fenway Park". AManWithNoPlan (talk) 16:25, 30 October 2018 (UTC)
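Since the metadata delivers the byline as one string, any fix has to be a heuristic applied to that string. One possible approach, sketched in Python and not taken from the bot, is to drop a trailing "at <Place>" or "in <Place>" before splitting the name. The function name `clean_byline` and the pattern are assumptions for illustration, and a rule this simple could misfire on unusual names.

```python
import re

# Hypothetical heuristic: a lowercase "at" or "in" followed by a capitalised
# word is treated as a dateline-style location suffix, not part of the name.
_LOCATION_SUFFIX = re.compile(r"\s+(?:at|in)\s+[A-Z].*$")


def clean_byline(byline: str) -> str:
    """Strip a trailing location phrase from a byline string."""
    return _LOCATION_SUFFIX.sub("", byline.strip())


print(clean_byline("Hunter Felt at Fenway Park"))  # Hunter Felt
```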
{{cite web |title=Estados de acordo com a percentagem dos negros em 2009. |url=https://pt.wikipedia.org/wiki/Afro-brasileiros#/media/File:Pretos_no_Brasil_2009.png |website=Wikipedia |accessdate=10/28/2018|date=2018-10-18 }}
What should happen
{{cite web |title=Estados de acordo com a percentagem dos negros em 2009. |url=https://pt.wikipedia.org/wiki/Afro-brasileiros#/media/File:Pretos_no_Brasil_2009.png |website=Wikipedia |accessdate=10/28/2018}}
We can't proceed until
Feedback from maintainers
Do not add |date= to Wikipedia links, since, as we know, Wikipedia may be updated daily. What counts is the |accessdate=. (t) Josve05a (c)23:17, 31 October 2018 (UTC)
A, why are we citing Wikipedia, and B, why is the correct fix not to point to a permanent version of the page instead, if there is some specific reason to cite Wikipedia? Citation bot shouldn't make a specific change regarding Wikipedia. --Izno (talk) 23:38, 31 October 2018 (UTC)
A & B: Ask the writers of the articles with Wikipedia references (there are a lot). The bot does not touch most references; however, these are formatted as a cite template without dates, and that is a common parameter which should otherwise always be added. In this case, however, it will not work. (t) Josve05a (c)23:40, 31 October 2018 (UTC)
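If the request to skip |date= for Wikipedia links were adopted, the check itself would be small. The following Python sketch only illustrates the idea; `is_wikipedia_url` is a hypothetical name, and whether the bot should special-case Wikipedia at all is disputed in this thread.

```python
from urllib.parse import urlsplit


def is_wikipedia_url(url: str) -> bool:
    """True for wikipedia.org and any language subdomain such as pt.wikipedia.org."""
    host = (urlsplit(url).hostname or "").lower()
    return host == "wikipedia.org" or host.endswith(".wikipedia.org")


url = "https://pt.wikipedia.org/wiki/Afro-brasileiros#/media/File:Pretos_no_Brasil_2009.png"
# A caller would skip filling |date= when this returns True.
print(is_wikipedia_url(url))  # True
```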