Past AI-generated content debacle in Wikiproject Video games
Back in August, there was an event where an editor over at WP:VG generated 24 articles entirely with AI. Some of these were deleted entirely, but the majority were redirected with still accessible page histories, and around two articles still stand now (though trimmed). Only one article has been completely rewritten and repaired, and that's Cybermania '94. The editor in question was also blocked.
This incident may be something worth noting somewhere in this project, whether to have more examples of AI generated content, to reconstruct articles that formerly used AI from the ground up, or whatever other reason. NegativeMP1 01:58, 7 December 2023 (UTC)
Added as well, thanks! I've noticed they seem to use "stunning" a lot when describing places, but that by itself contains too many false positives. ARandomName123 (talk) Ping me! 14:56, 6 December 2023 (UTC)
Agreed. "In conclusion..." is a similar tell to this I feel - lots of false positives for the phrase, but when a GPTed section appears, it really sticks out like a sore thumb. Andrew Gray (talk) 01:15, 7 December 2023 (UTC)
Yes, I think the tell for the final paragraph isn't any particular phrase so much as it is "Conclusion phrase, followed by a brief paragraph." You know it when you see it. Looks like an undergraduate exam paper. -- asilvering (talk) 02:09, 8 December 2023 (UTC)
Hello, I'm part of a research project at Stanford's OVAL. We are studying how to build tools that are factually grounded, which I'm sure you can imagine is quite a challenge. We have built a model that appears to be relatively accurate and are hoping for Wikipedia collaborators to participate in its evaluation. We have built a UI tool that displays a human-written article alongside an article from our model, and participants would score both. The UI tool has been built to streamline the evaluation process, even including the relevant snippets of cited sources. We have monetary compensation available for participants.
Thanks a lot for this project! This sounds very interesting indeed, and we would be glad to collaborate with your project if needed. ChaotıċEnby (t · c) 20:47, 9 December 2023 (UTC)
I am suspicious of the many offline references and further reading. But the author has been blocked so I suppose no point asking them. I don’t know much about Chat GPT etc. Is there a formal investigation process to look at all the other stuff created by User:Torshavn1337 and their sockpuppets? I only intend to fix Poverty in Turkey (no need to delete article as subject is notable) not any other articles such as Foreign relations of Turkey. Wikipedia:WikiProject Turkey seems pretty moribund so I think I would be wasting my time asking them anything. Any ideas? Chidgk1 (talk) 11:14, 24 December 2023 (UTC)
Driveby comments: The tone of this article strikes me as awkward, but not AI-generated; if it was AI-generated it wasn't a major LLM. Courtesy ping: 3df, who is more experienced on this. I don't have time to check the references. There also isn't an official investigation process (yet™) but here works fine. QueenofHearts ❤️ (she/they 🎄 🏳️⚧️) 23:33, 24 December 2023 (UTC)
I think this was actually originally a copyright violation of this report, but with the sources scrambled in some random order. The article is likely too early to be AI, which wouldn't have been that coherent at the time. 3df (talk) 01:41, 25 December 2023 (UTC)
Ah I see thanks. I was wondering why all the sources were from 2016 and before when the article was created in 2020. Chidgk1 (talk) 12:57, 26 December 2023 (UTC)
I'm inclined to agree with QoH, but I agree that the article is suspicious nonetheless. The sources certainly need to be checked. sawyer * he/they * talk 23:53, 24 December 2023 (UTC)
Can these phrases really be used to identify AI-generated content?
I have some doubts that most of the phrases at Wikipedia:WikiProject_AI_Cleanup/AI_Catchphrases are useful for identifying AI-generated content. As a test, I clicked on the first link (stand as a testament) and opened the first 3 pages (Domenico Selvo, Chifley Research Centre, and Apollo (dog)). In each case, the catchphrase was already present in 2021 (see [2], [3], and [4]), i.e. before the official release of all the main LLMs today. So it is very unlikely that the phrases in these articles were created using AI.
Another reason for doubt is that AI output is based on the frequency of formulations used in the training set. Since Wikipedia is a big part of the training set, any phrases that are frequently used on Wikipedia may also be frequently used in AI output.
There may be some rather obvious phrases useful to identify AI content, such as "As a large language model, I...", "As an AI language model, I...", and the like. But most of the phrases listed here do not fall into that category. Phlsph7 (talk) 08:28, 25 December 2023 (UTC)
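The spot check described above (looking up whether a catchphrase was already in an article before the main LLMs were released) could be sketched roughly as follows. This is only an illustration: the function names, the `(date, text)` revision format, and the sample history are hypothetical, and the default cutoff assumes ChatGPT's public release on 30 November 2022 as the reference date.

```python
from datetime import date

# Assumed cutoff: ChatGPT's public release (30 November 2022). Adjust as needed.
LLM_CUTOFF = date(2022, 11, 30)


def first_appearance(revisions, phrase):
    """Return the date of the earliest revision containing the phrase, or None."""
    needle = phrase.lower()
    # Sort by date so the first hit is the earliest one.
    for rev_date, text in sorted(revisions, key=lambda rev: rev[0]):
        if needle in text.lower():
            return rev_date
    return None


def predates_llms(revisions, phrase, cutoff=LLM_CUTOFF):
    """True if the phrase was present before the cutoff, False if it first
    appears on or after it, None if it never appears at all."""
    first = first_appearance(revisions, phrase)
    if first is None:
        return None
    return first < cutoff


# Hypothetical revision history: (date, wikitext) pairs, not real article data.
history = [
    (date(2019, 5, 2), "The basilica stands as a testament to his rule."),
    (date(2023, 6, 1), "The basilica stands as a testament to his rule and legacy."),
]

print(predates_llms(history, "stands as a testament"))
```

A `True` result only rules out an LLM as the origin of that particular phrase in that article; a `False` result says nothing on its own, since a human editor may well have added the phrase after the cutoff.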
There were far more good examples in these search results a month ago, but everyone's been doing a great job of cleaning it all up and leaving the acceptable stuff. Those searches might not have any problematic results left. 3df (talk) 16:49, 25 December 2023 (UTC)
In that case, it might be best to remove the phrases. The page gives the impression that these phrases can be used as an easy and reliable way to identify AI-generated contents. Since the great majority of the search results are false positives, this is likely to do more harm than good. Except for the obvious phrases mentioned before, I don't think there are any catchphrases that could be used to reliably identify AI-generated contents. Phlsph7 (talk) 17:03, 25 December 2023 (UTC)
Yes, I think it's time to put these away. A written guide to finding AI content would be better. I'll get a start on it. 3df (talk) 20:04, 25 December 2023 (UTC)
That sounds like a good idea. You should probably mention made-up references and obvious hallucinations, like events that never took place. Editor behavior could be another factor, such as when a high number of substantial content additions are made in significantly less time than it would take to type them. But generally speaking, I think AI involvement is very difficult to detect and online detectors are far too unreliable to be of use. Phlsph7 (talk) 21:01, 25 December 2023 (UTC)
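The behavioral signal mentioned above (large additions arriving faster than they could be typed) might be approximated along these lines. This is a rough sketch under stated assumptions: the function name, the `(timestamp, chars_added)` input format, the 400 characters-per-minute ceiling, and the 1000-character minimum are all illustrative choices, not measured or agreed-upon values.

```python
from datetime import datetime

# Assumed ceiling for sustained typing, in characters per minute (illustrative only).
MAX_CHARS_PER_MINUTE = 400


def suspiciously_fast(edits, max_rate=MAX_CHARS_PER_MINUTE, min_chars=1000):
    """Flag substantial additions that exceed what could be typed since the previous edit.

    edits: list of (timestamp, chars_added) tuples sorted by time.
    Returns the flagged (timestamp, chars_added) tuples; the first edit is never
    flagged because there is no earlier edit to measure against.
    """
    flagged = []
    for (prev_time, _), (time, added) in zip(edits, edits[1:]):
        minutes = (time - prev_time).total_seconds() / 60
        if added >= min_chars and added > max_rate * minutes:
            flagged.append((time, added))
    return flagged


# Hypothetical contribution log for one editor.
edits = [
    (datetime(2024, 1, 1, 12, 0), 200),
    (datetime(2024, 1, 1, 12, 3), 6000),   # 6000 characters three minutes later
    (datetime(2024, 1, 1, 13, 30), 1500),  # 1500 characters after 87 minutes
]

print(suspiciously_fast(edits))
```

A flag here is only a prompt for a closer look: text drafted offline and pasted in, or translated from another wiki, would trip the same check.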
I appreciate the effort in trying to help editors identify ChatGPT responses but I'm not sure that the recent adjustments solve the problem. Depending on the prompt used, the responses can have all kinds of linguistic problems or none at all. For example, I used the prompt write a wikipedia article on the topic "Metaphysics" and got the following result:
ChatGPT response
Metaphysics is a branch of philosophy that explores the fundamental nature of reality, including the relationship between mind and matter, substance and attribute, potentiality and actuality. The word "metaphysics" comes from two Greek words that, together, literally mean "after physics". The name was given c.70 B.C.E. by Andronicus Rhodus, the editor of the works of Aristotle, because in his list of Aristotle's works, the Physics comes before the works dealing with metaphysics.
Overview
Metaphysics attempts to answer two basic questions in the broadest possible terms:
1. "What is there?"
2. "What is it like?"
A person who studies metaphysics is called a metaphysicist or a metaphysician. The metaphysician tries to clarify the fundamental notions by which people understand the world, including existence, objects and their properties, space and time, cause and effect, and possibility.
A central branch of metaphysics is ontology, the investigation into what types of things there are in the world and what relations these things bear to one another. The metaphysician also attempts to clarify the notions by which people understand the world, including existence, objecthood, property, space, time, causality,
After a first initial look at the response, I don't think it has any of the "typical" problems discussed here. My suggestion would be to be very careful with any concrete guides on how to identify AI output. It might also be a good idea to follow reliable sources concerning how to identify it rather than presenting our personal research as a definite guide. I assume many editors have very little background knowledge on LLMs so we should not give them the false impression that there are generally accepted methods for identifying LLM output. Phlsph7 (talk) 08:57, 26 December 2023 (UTC)
Yeah, there isn't any definite method to identify LLM output, and the best detectors will always lag months or years behind the LLMs themselves (in a very crude way, it can be seen as similar to how GANs work). Of course, there are a few words that make it 100% certain that an LLM wrote it (e.g. As of my last knowledge update in January 2022), but there isn't any criterion or tool that can reliably decide both ways (and, since LLMs can get closer to human speech than the variance inside each group, and text can't be easily watermarked like images, it's likely there won't be anytime soon). ChaotıċEnby (t · c) 10:22, 26 December 2023 (UTC)
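The one-directional nature of such tells can be made concrete with a small sketch. The pattern list below is an assumption built only from the phrases quoted in this discussion, and the function name is hypothetical; a match is strong evidence, while an empty result is inconclusive rather than a sign of human authorship.

```python
import re

# Patterns built from phrases quoted in the discussion; the list is illustrative, not exhaustive.
SELF_DISCLOSURES = [
    r"as an? (?:ai|large) language model",
    r"as of my last knowledge update",
]


def self_disclosures(text):
    """Return the self-disclosure phrases found in the text.

    An empty list means "inconclusive", not "written by a human".
    """
    found = []
    for pattern in SELF_DISCLOSURES:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            found.append(match.group(0))
    return found


print(self_disclosures("As of my last knowledge update in January 2022, the town had 500 residents."))
print(self_disclosures("Metaphysics is a branch of philosophy."))
```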
The stuff I'd written about so far are problems we keep seeing exhaustively in practice. The list is turning out more like a "what do AI edits usually do incorrectly that need to be fixed" than a "how can you tell if text was written by AI" guide. I can add wording to clarify that, and also that we can't trust those detectors. Several examples for each section would be very helpful, but I'm really not looking forward to sifting through the hundreds of AI diffs for them. 3df (talk) 20:41, 26 December 2023 (UTC)
I think it's a good idea to have a guide on what editors are supposed to do once they have identified AI-generated text even if the instructions cannot be used to identify whether a text is AI-generated.
I like the idea of keeping things coherent and avoiding duplication. One possible concern would be that the purposes of WP:LLM and WikiProject AI Cleanup are not identical. The purpose of the cleanup project is more narrow since it is mainly concerned with cleaning up problems created by AI-assisted contributions. The purpose of the essay is wider since, in addition to that, it contains advice on how LLMs can be used productively and how to avoid some of its pitfalls in the process. Phlsph7 (talk) 09:41, 10 January 2024 (UTC)
I am concerned that some things like Every edit that incorporates LLM output should be marked as LLM-assisted by identifying the name and, if possible, version of the AI in the edit summary. This applies to all namespaces. is worded as if it was policy, but it is not. And In biographies of living persons, such content should be removed immediately—without waiting for discussion, or for someone else to resolve the tagged issue. is actually not supported by policy. If you are reverting content exclusively because you think it is AI-generated and you have no specific concern about accuracy, sourcing, or copyright violations, then that revert goes against policy. MarioGom (talk) 11:12, 11 January 2024 (UTC)
Yes, actually, that paragraph was intended to mean that non-policy compliant LLM-generated BLP content should be removed, specifically, not just any LLM-originated content, which I have clarified in this edit.—Alalch E.17:54, 12 January 2024 (UTC)
"Conclusion" sections in AI generated content - one caught in the wild here?
Hi all,
First of all: I am waaaaay out of my depth there, and my apologies if this goes nowhere - fine with that. Please see pretty much any of my contributions where I poke fun at myself for being a "Sysop" who doesn't actually understand how the internet works.
It would appear to me that there are any number of AI "conclusions" or "summary" generators out there in the wild.
Yep, the whole draft you point to appears to be very ChatGPT-like. The key things are the "Book Title: Subtitle" style in the first section, which ChatGPT nearly invariably generates, but also having a plan-like structure with many short subsections restating their title in one or two fluffy sentences (a product of formatting to Wikipedia the bullet lists of "key points" that ChatGPT generates), and of course the "Conclusion: blahblah" last part which you aptly found. Unfortunately, tools to detect whether a text is AI or not are often less than reliable (if not completely unreliable), as they lag months or even years behind the generative LLMs themselves. ChaotıċEnby(talk · contribs) 10:13, 15 January 2024 (UTC)
Would be great if there were a reliable tool to check these with; I use this GPT-2 Output Detector Demo, and it must be an AI shill because it always thinks everything is fine and nothing is AI-generated.
Unfortunately, GPT-2 tools aren't too reliable given that most stuff generated from GPT today is from GPT-3.5 (including ChatGPT) or even GPT-4 (a completely different model). The sad reality is that, for now, LLM detectors have had to play catch-up with generative LLMs, in a way reminiscent of what happens inside generative adversarial networks (although I don't think generative LLMs use LLM detectors in their training, but their rate of improvement is nonetheless high enough for the effect to be similar). And this is one of the reasons we're here as a project – to build such a tool where none existed before (at least in the more specific, and likely much easier, Wikipedia use case), to assist us with this in the future! ChaotıċEnby (talk · contribs) 12:07, 15 January 2024 (UTC)
(by the way, if there's a better place to bring things like this to attention, please let me know; this is the first wikiproject i've been a part of and i am inexperienced.) EspWikiped (talk) 15:44, 20 December 2023 (UTC)
One issue I can think of is that of the edge cases, like human-generated images that are later enhanced by AI tools. What do you propose for these? To look at the much bigger picture, a strong categorization of human vs AI images on Commons could achieve the same results as what you suggest without the need for a redundant project, and better handle edge cases than having the whole thing divided into two different projects. We already have various kinds of media (images, sounds, videos, etc.) on Commons, why can't we deal with having both human and AI-generated media if they are explicitly distinguished as such? Another (small) issue: I don't think you can have a domain name in .ai.org as the second-level domain appears to have already been registered. ChaotıċEnby (talk · contribs) 09:41, 20 January 2024 (UTC)
It depends on the case, there are lots of articles where an illustration made using AI could be very valuable and appropriate, given that it doesn't have misgeneration issues and is clearly labeled as made using AI.
Once there is a better image it can still be replaced, and it shouldn't replace but complement existing images. If there was no image showing what the art style cubism looked like, an AI-made image would be useful and better than no image. It's a tool, and people are also adding images made or modified using the tool Photoshop to articles sometimes when that's due. Prototyperspective (talk) 17:32, 21 January 2024 (UTC)
Prototyperspective, we have artists in the Wikimedia community. Why not just ask them to make an image instead of using software trained on copyrighted material (especially when the holders of said works were not compensated and/or have not given explicit permission for their works to be used in such a manner)? — Davest3r08 >:) (talk) 17:40, 21 January 2024 (UTC)
I know that very well since I even created the Wikimedia Commons category for that. Illustrations and artworks are very much missing. Those two things are not mutually exclusive. I would very much support and welcome better interfacing between editors / people who know which images are missing and people who have the artistic skills to implement any of the requested illustrations. This I have tried to do earlier by listing many science-related images that are missing even in very popular articles of major subjects. AI software are very useful tools to close visualization gaps and they can be replaced with better ones. They can also serve to make people become aware which images are currently missing so they see an AI image and think "conceptually that image was missing but it isn't an illustration as good as it could or should be, so I'll replace it". There could be a project that seeks to replace AI images with better images made manually (or add missing illustrations) such as via asking artists to license an identified relevant image under CCBY per mail. If you'd like to I could give a long list of science-related articles in need of illustrations that is not close to being exhaustive that I posted to a Wikipedia community earlier. Human artists are also inspired by and learn from copyrighted works which they usually can't and don't all list. I'm interested in how things are and can be done in the real world in practice – if you have an idea how to get more illustrators onboard or how to better engage artists, please go ahead and if possible let me know about it since I always come across lots of articles in need of illustrations (often where a visualization/illustration would be particularly useful). Prototyperspective (talk) 17:55, 21 January 2024 (UTC)
AI-upscaling image cleanup template
Should there be an equivalent of {{AI-generated}} for images, flagging that an article has multiple upscaled historical images that should, per MOS:IMAGES, be replaced with their originals? Either a separate template or an option on {{AI-generated}} that changes the message.
I spent the last 15 minutes or so trying to figure out how to boldly reintroduce the collapsible feature of the marquee that was removed in this edit in December, but I couldn't figure out a way that preserved its "look". I'm bringing this up rather than just abandoning the idea of it being (re)hidden because it seems to just be present for "fun" (i.e. unless I'm missing something it doesn't seem to serve a clear or unique purpose in the context of the WikiProject) and something about it caused some rather immediate nausea for me (maybe the way it's moving, but I usually need more like 15 to 30 minutes for that kind of motion sensitivity, not three seconds :-/). Is there any way for collapsibility to be reintroduced by someone who has more of an idea of what could be done to collapse the marquee without compromising the way it looks when unhidden (or compromising the ability to re-hide the content, as {{show}} would do)? Or no, and then my recourse is to hide it in my own user CSS? - Purplewowies (talk) 23:21, 8 February 2024 (UTC)
Sorry for that, unfortunately the collapsible feature broke the marquee on some devices. I'm thinking of ways to have it work while being able to hide it, I'll update you! (I'll remove it in the meanwhile as accessibility is more of a priority than marquees) Chaotıċ Enby (talk · contribs) 23:24, 8 February 2024 (UTC)
Wow, that was fast! I had considered just removing it myself, honestly, but that solution felt too no-fun-allowed for me to do boldly instead of asking about what to do instead. :P Thanks for the quick response, and I hope you manage to find a way for it to work! - Purplewowies (talk) 23:42, 8 February 2024 (UTC)
Looking at their articles, the "is clearly using low-quality WP:LLM to quickly generate Wikipedia articles" claim seems false to me, they had only created six articles (although I might be missing some articles created from redirects) in January before this post, none of which look like AI. Now, the "and even using to generate [sic] robotic rationales to nominate Wikipedia articles (i.e. Wikipedia:Articles for deletion/Sher Afzal Marwat (2nd nomination))" claim. The AfD you linked does read as AI, but their articles do not, and either way, we can't really do anything about behavioral issues. The accused also has not nominated an AfD since, so I'd just drop it. Queen of Hearts (chat • stalk • they/she) 01:47, 16 February 2024 (UTC)
Reporting page?
There's a bit of a discussion on Bluesky of statements in Wikipedia being sourced to LLMs. One reader asks for "Advice on how to report AI-Generated rubbish to Wikipedia so it can be purged."
I've said to just edit it, noting that you removed a claim sourced to LLM output. But unsure not-yet-editors are perennial.
This is a bit of a related topic that I haven't seen many people touch on so far. There's been a rise in websites like BNN Breaking (which is on the WP spam list) that simply reword existing news articles or make up fake news entirely (as opposed to established sources like CNET that have some articles written by AI). Some cases even involve cybersquatting on domains owned by defunct news sources. Should we keep track of the use of these sources in articles (likely by good faith editors who believe the site is legitimate)?
I was looking through the notice board and I saw the project, and I was a bit interested in joining. Can you give a bit of an introduction: what are the criteria to be a participant, what do you expect a participant to know or be good at, is there any fixed goal to stay in the project, and am I eligible? I have gone through the page lightly but was interested in getting some basic understanding so I can decide whether to join or not.
Hello! Like any WikiProject, there are no eligibility criteria for participants, you are free to participate whether or not you put your name on the list :). Cheers! Remsense诉 13:27, 15 March 2024 (UTC)
The goal is to help spot articles that have been generated by AI without human verification, and verify if they are accurate and conform to our policies (which they very, very often don't—you'll likely see peacock words and other non-encyclopedic language sprinkled around ChatGPT-made "articles"). Chaotıċ Enby (talk · contribs) 13:30, 17 March 2024 (UTC)
Hiya! I got pointed toward this project when I asked about declaration of AI-generated media in an external group. I noticed that the article for Kemonā uses a Stable Diffusion-generated image, which has not been declared. I noticed it, as the file has previously been up for deletion-discussion on Commons, but was kept as it was "in use". If used, shouldn't AI-generated media be declared in its description / image legend? EdoAug (talk) 23:24, 10 April 2024 (UTC)
@EdoAug I don't know that there's a guideline about this in specific but I'd say so. The copyright of Stable Diffusion images is still in the courts afaik, so we might end up having to remove all of those images in the future. -- asilvering (talk) 02:50, 28 April 2024 (UTC)
Possible use of AI to engage in Wikipedia content dispute discussions
It was suggested to me that this may be a good place to ask. A response seemed particularly hollow at Talk:Canadian_AIDS_Society so I checked on GPTZero and ZeroGPT. The first says 100% AI, and the latter says about 25% likely. Quillbot says ~75% likely. So, the results vary widely based on the checker used. Is it actually likely that 100% manually written content would get tagged as 100% AI on GPTZero? Do any of the human observers here feel the response in question could be 100% human written? Graywalls (talk) 00:19, 28 April 2024 (UTC)
These detectors are really unreliable, but from looking at the linked comment (and only this comment), I'm certain that it is AI generated. 3df (talk) 02:47, 28 April 2024 (UTC)
You mean the one that starts "I appreciate your third-party perspective and the insights you provided...", right? There's almost no way an actual human wrote that. -- asilvering (talk) 02:49, 28 April 2024 (UTC)
I really recommend not caring about the detectors. A broken clock saying it's midnight isn't more convincing to me than saying it's 4:30. Remsense诉 16:12, 28 April 2024 (UTC)
Yeah, I recommend just eschewing the detectors entirely. Point being, "if it quacks like a duck", and all that. Remsense诉 03:34, 28 April 2024 (UTC)
There are quite a lot of citations on that section, though, so the best action here is simply to see if they verify the text. -- asilvering (talk) 23:17, 28 April 2024 (UTC)
Wikipedia policy on AI generated images
I found an article about a historical individual that contained a fully AI generated image. I mentioned this on the Teapot page and the image eventually got removed because it was original research. I tried to find some Wikipedia guideline or rule about the use of AI images but I couldn't find any. Since this WikiProject is about AI content, I came here to ask about the official Wikipedia policy on AI images, if there is any. Are AI images supposed to be removed simply because they're original research or is there something specific regarding AI images that warrants their removal? I'm looking for details regarding the use of AI images on Wikipedia and when are AI images acceptable to use. Thank you all in advance for your responses. Broadhead Arrow (talk) 15:19, 5 May 2024 (UTC)
On its own, the presence of these phrases does not necessarily indicate that the text is likely to be AI-generated. However, if multiple catchphrases are found together, there is a far greater likelihood of the text being AI-generated. For example:
Panel on Wikipedia & Gen AI at WikiConference North America?
Hi, I'm working on putting together a roundtable discussion for WikiConference North America this year about generative AI and Wikipedia. If any participants in this WikiProject are planning to be there, I'd love to have your voice! Program (and scholarship) submissions are due Friday (May 31), so if you are interested, please reach out to me by Thursday (May 30), ideally at liannawikiedu.org so I can share the draft of what we're proposing and see if you want to participate. --LiAnna (Wiki Ed) (talk) 18:25, 28 May 2024 (UTC)
How can I check big additions to an article please?
Is there a tool I or their tutor or @Ian (Wiki Ed): can use to check whether the new text was AI generated please? If not what are your opinions please? Chidgk1 (talk) 16:18, 14 May 2024 (UTC)
@Chidgk1 and Ian (Wiki Ed): Yes, I'm pretty confident that text was generated by AI. It has a lot of the key indicators I'd look for. It's probably too late to do anything about it, but I've reverted it to the prior version. The Wordsmith Talk to me 00:01, 30 May 2024 (UTC)
@The Wordsmith I agree, it reads like LLM writing. @Chidgk1 I've had some success with ZeroGPT, and also by asking ChatGPT to create the article in question and look at how the tool words it. I'm seeing more this term, but I suspect it's because I'm developing more of an eye for it. Ian (Wiki Ed) (talk) 20:31, 30 May 2024 (UTC)
Tracking of removed content and/or users who added chatgpt/AI content?
Is there any desire to track which articles had AI-generated content removed from them, or who the offending users were? I recently did my first removal of AI content, in this edit. That content was added in this edit on 9 Dec 2023 by a new user User:NuclearDesignEngineer who apparently tried this on 4-5 other articles, got promptly reverted on many (but not all). Hasn't edited since. I'm not sure if I should complain, or just quietly revert, or what. 67.198.37.16 (talk) 04:33, 1 June 2024 (UTC)
I was going through userscripts today when I found User:Phlsph7/WikiChatbot. It seems to use ChatGPT to embed a chatbot into Wikipedia pages, which can give editing advice. I'm not sure if there should be a wider discussion on whether this sort of thing should be allowed to be installed, but figured I'd raise it here first. The Wordsmith Talk to me 02:42, 1 June 2024 (UTC)
I definitely see how it can be useful in the right hands; I use generative AI in my personal and professional lives all the time, mostly to give myself ideas, summarize things or edit documents/emails for tone. Never for text that gets submitted on Wikipedia, that just seems too dangerous even if I know what I'm doing. There should probably be some safeguards around its use.
Is there a way that we can monitor the pages it is used on? Something like how Twinkle or SPIhelper can log activity to a file in userspace, but ideally it would be automatic rather than toggling it on/off. I know we can use Special:WhatLinksHere/User:Phlsph7/WikiChatbot.js to see who has it installed, but that doesn't tell us where it's being used. A mandatory edit summary tag or edit filter entry might also be ideas, or limiting it to certain usergroups. Courtesy ping to @JPxG: who has it installed and is also a member here, maybe he can give some insight on how it can be used or suggestions on safeguards. The Wordsmith Talk to me 18:30, 2 June 2024 (UTC)
I think an edit filter entry would probably be the best solution, if that can be implemented. I'm also a bit concerned about some non-editor-facing features, like the chatbot giving quizzes to readers (apparently with no independent verification of the quiz contents). Chaotic Enby (talk · contribs) 21:32, 2 June 2024 (UTC)
Thanks for all the suggestions. I removed the quiz-button (the quiz content was based on the article text selected by the user).
Twinkle and spihelper directly perform edits to wikipedia pages: roughly simplified, you press a button and then the script makes an edit on your behalf. Since the edits are directly managed by these scripts, they can add tags and adjust the edit summary. This function is absent from WikiChatbot: it does not make any edits for the user, it only shows them messages. All edits have to be made manually by the user without assistance from the script (the documentation tells editors to mention in their edit summaries if they include output from the script in their edits). In this regard, the script is similar to Microsoft Copilot, which is an LLM directly integrated into the Edge browser to talk about the webpage one is currently visiting without making changes to it.
Another safeguard is that WikiChatbot keeps warning the user. Every time it is started, it shows the following message to the user:
Bot: How can I assist you? (Please scrutinize all my responses before making changes to the article. See [[WP:LLM]] for more information.)
It also shows more specific warning messages for certain queries. For example, when asking for expansion suggestions, its response always starts with
Bot: (Please consult reliable sources to verify the following information) ...
Good points, and the safeguards look pretty neat! Regarding the edit summary, I know that some helpers like Wikipedia:ProveIt add default edit summaries when they're invoked (which can be edited by the user), even if they don't make the whole edit by themselves, so that could be something to look into! Chaotic Enby (talk · contribs) 09:23, 3 June 2024 (UTC)
The comparison with Proveit is helpful, I'll look into it. One possibly relevant difference may be that the purpose of Proveit is to change wikitext in the edit area. When this text is changed, it automatically adds an edit summary remark. WikiChatbot is intended for interaction with the regular article view (the rendered HTML code) and does not make changes to the wikitext in the edit area. Phlsph7 (talk) 07:31, 4 June 2024 (UTC)
Regarding user groups, it would be possible to limit the script to autoconfirmed users. In that case, if the user is not autoconfirmed, they get an error message. I checked a few of its current users and they are all autoconfirmed so, on a practical level, this would make little to no difference. The hurdles to using this script are high since each user has to obtain their personal OpenAI API key, without which no responses from the LLM model can be obtained. So the script is unlikely to attract many inexperienced casual users. Phlsph7 (talk) 09:15, 3 June 2024 (UTC)
I am genuinely confused about some of the functions provided by this chatbot, such as "Ask quiz question: Asks the reader a quiz question about the selected text." (how is this encyclopedic?) Also, functions such as "Suggest expansion: Suggest ideas how the selected text could be expanded.", or "Write new article outline: Writes a general outline of the topic of this article. Ignores the content of the article and the selected text." appear to be the kind of generative use of LLMs that is usually frowned upon. While the documentation mentions that editors using the chatbot should take care not to add hallucinations it can generate into the article, the fact that the chatbot is explicitly also intended for readers makes it even more worrying, as there would be no human verification of the answers it gives to the reader. Chaotic Enby (talk · contribs) 15:32, 1 June 2024 (UTC)
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
User:Davecorbray is a new editor adding a lot of AI-generated text to articles about 19th century British prime ministers. I happened to have one, Spencer Perceval, on my watchlist as I had done a lot of work on the article some years ago. I thought there was something odd about the additions and eventually went through each paragraph checking the text against the sources and deleting the paragraphs where the sources did not support the text. That turned out to be all of them. I only thought of ChatGPT at that stage and the editor admitted on their talk page to using it, although rather downplayed their use of it. I replied with what I see as the problems [5]. As for the other articles - I have done a few spot checks and the additions seem likewise to be ChatGPT, with inappropriate "sources". I have never come across this before, and I wondered if someone with more experience could take a look at it. Southdevonian (talk) 22:22, 13 June 2024 (UTC)
Thanks a lot for signaling this! Yeah, adding false information and/or false references is just as much of a problem when it's done with ChatGPT (even more, as the person can do it at scale much easier). If they keep doing it after what you told them, best to formally give them something like {{uw-ai3}}, which looks like this:
@Chaotic Enby So does that mean I can’t edit Wikipedia? Not to be rude, but I think that you’re taking this a step up. I only used ChatGPT fairly recently (around a week from now). I only used it to help me with writing and researching rather than using it to spread falsehoods. I followed up on @Southdevonian your suggestion that ChatGPT can be tricky to use in terms of research and writing, as a machine it could be inconsistent and inaccurate sometimes to some degree. If any information or sources was false or misleading, I accept the responsibility for it and I apologise sincerely. Also I would remove information that is indeed irrelevant and not use further AI-generated content. But you should know that all the edits I have made since last month are all written by me and they have been fact-checked earlier beforehand, I only used ChatGPT only to help me out with paraphrasing long sentences and conducting certain research to accurately confirm some sources (which I accepted above as being incorrect and wrong). It isn’t that simple undoing edits that are frustratingly hard for the reader to understand and yes it is also similarly frustrating sometimes to turn up in dead ends when doing research on these topics. So that’s why I used ChatGPT and I didn’t intentionally use it to make misleading statements or anything else. Again, I apologise for any grievances caused by my edits. Davecorbray (talk) 23:29, 13 June 2024 (UTC)
If you are relying on ChatGPT's information for conducting certain research when you turn up in dead ends when doing research on these topics, and you didn't realize ChatGPT often gave you inaccurate or fully incorrect information, it's a mistake – but don't worry, we all make mistakes, and Southdevonian explained the situation to you. Now, you shouldn't do it, and write your Wikipedia edits in your own words without relying on information given by ChatGPT. That doesn't mean you can't edit Wikipedia, only that you shouldn't use ChatGPT for it. Not just "it's tricky so I should be careful", no, it spreads enough subtle falsehoods and fake references to basically be net zero information. However, if you continued doing it after it has been explained to you, then it would not be a mistake but actively disruptive, and that is why I mentioned ANI. Also, when you mention that your edits have been fact-checked earlier beforehand, was it with ChatGPT or by doing your own research and verifying inside the sources? ChatGPT is often known to make up sources that just don't exist, or to quote sources that don't say anything it claims. Chaotic Enby (talk · contribs) 00:24, 14 June 2024 (UTC)
@Chaotic Enby Thank you for your support and advice. Now I understand the negative impact this has had on the articles themselves and the need to fact-check any source that does not support the research. To answer your question “was it with ChatGPT or by doing your own research and verifying inside the sources”: yes, I do verify sources before using them in any form of reports, articles, essays or say summaries. But as I have noted in my previous statement, I only used ChatGPT about 2/1 weeks ago from now. That means that I simply wasn’t using it before that time, and again I only used it to either paraphrase or simplify sentences and words that might be unclear. It might have gotten quite mixed up in the end, I presume, but I don’t use ChatGPT in every one of my edits. Sources in this case have similarly been misused. For instance, I asked ChatGPT for sources on Spencer Perceval’s tenure as Attorney General and it returned sources that I mistakenly believed were real because of assurances of its accuracy. But now I know that they were not. So I am indeed very wrong in this aspect of the situation. So I will discontinue using ChatGPT for that matter. Davecorbray (talk) 01:06, 15 June 2024 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
Help with AI-written articles
An editor admitted to using AI to write two aircraft articles, Caproni Ca.104 and Focke-Wulf W 4, and has agreed to stop using AI to write more. Both articles have been determined to be largely inaccurate, but I am unsure about the proper course of action for dealing with such cases. My first instinct is to nominate them for CSD G3, but given the unfamiliar circumstances, I thought I'd bring it up here first. - ZLEA T\C 00:08, 1 July 2024 (UTC)
FYI: While investigating the CSD tag on the Caproni Ca.104 image as a copyvio (and subsequently deleting it), I looked at the Caproni Ca.104 article, which was tagged as a possible hoax. Because of the discussion on the talk page and the discussion at User talk:Sir MemeGod, I tagged and deleted the article as a G3 hoax. If the Focke-Wulf W 4 article has some valid text, I suggest deleting everything else and leaving what can be salvaged. Otherwise, ZLEA, I agree that the article should be tagged G3 as an AI-generated hoax. Afterwards it can be created from scratch using valid sources. — CactusWriter (talk) 01:21, 1 July 2024 (UTC)
Adding a category to users warned with the user templates
Hi all,
I was looking at the list of people suspected of using AI, and it seems a bit outdated. Couldn't we just make the AI warning templates automatically add the users to a category? Acebulf (talk | contribs) 01:35, 17 June 2024 (UTC)
Is it possible to specifically tell LLM-written text from encyclopedically written articles?
The WikiProject page says "Automatic AI detectors like GPTZero are unreliable and should not be used." However, those detectors are full of false positives because LLM-written text stylistically overlaps with human-written text. But Wikipedia doesn't seek to cover the whole breadth of human writing, only a very narrow strand (encyclopedic writing) that is very far from natural conversation. Is it possible to specifically train a model on (high-quality) Wikipedia text vs. average LLM output? Any false positive would likely be unencyclopedic and in need of fixing regardless. MatriceJacobine (talk) 13:29, 10 October 2024 (UTC)
That would definitely be a possibility, as the two output styles are stylistically different enough to be reliably distinguished most of the time. If we can make a good corpus of both (from output of the most common LLMs on Wikipedia-related prompts on one side, and Wikipedia articles on the other), which should definitely be feasible, we could indeed train such a detector. I'd be more than happy to help work on this! Chaotic Enby (talk · contribs) 14:50, 10 October 2024 (UTC)
That is entirely possible, a corpus of both "Genuine" Articles and articles generated by LLMs would be better though, as the writing style of for example ChatGPT can still vary depending on prompting. Someone should collect/archive articles found to be certainly generated by Language Models and open-source it so the community can contribute. 92.105.144.184 (talk) 15:10, 10 October 2024 (UTC)
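To make the idea in this thread a bit more concrete, here is a minimal sketch of a two-corpus detector, assuming a plain bag-of-words Naive Bayes model with add-one smoothing. This is only one possible approach and not something anyone above proposed specifically. The class name `NaiveBayesDetector`, the helper `build_toy_detector`, the labels "wiki" and "llm", and all of the sample sentences are hypothetical, invented purely for illustration; a real detector would need the large, community-collected corpus discussed above and a proper evaluation of its false-positive rate.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (digits are dropped)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesDetector:
    """Hypothetical bag-of-words Naive Bayes classifier over labelled corpora."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token frequencies
        self.doc_counts = Counter()  # label -> number of training documents
        self.vocab = set()           # every token seen during training

    def train(self, label, text):
        tokens = tokenize(text)
        self.doc_counts[label] += 1
        self.word_counts.setdefault(label, Counter()).update(tokens)
        self.vocab.update(tokens)

    def log_scores(self, text):
        """Return an unnormalised log-probability for each label."""
        total_docs = sum(self.doc_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]  # skip unseen words
        scores = {}
        for label, counts in self.word_counts.items():
            score = math.log(self.doc_counts[label] / total_docs)  # class prior
            denom = sum(counts.values()) + len(self.vocab)         # add-one smoothing
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def build_toy_detector():
    """Train on four invented sentences; far too little data for real use."""
    detector = NaiveBayesDetector()
    detector.train("wiki", "The aircraft first flew in 1930 and was built in small numbers.")
    detector.train("wiki", "The village is located in the north of the county and had a population of 412.")
    detector.train("llm", "In conclusion, the stunning aircraft stands as a testament to innovation and a rich legacy.")
    detector.train("llm", "Nestled in a stunning landscape, the village boasts a vibrant tapestry of culture and a rich heritage.")
    return detector
```

Calling `build_toy_detector().classify(some_text)` returns whichever label scores higher. With so little training data the result mostly reflects a handful of words like "stunning" or "conclusion", which is exactly the false-positive problem raised earlier on this page, so the sketch shows the mechanics only and says nothing about how reliable such a detector would be.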