Wikipedia talk:Link rot/Archive 1
Suggestion
This project is a really cool idea! Just one suggestion: the listings would be more useful for collaboration if they were simply posted to the wiki. That way, people could "click to test" and have handy links right to the broken pages. (Of course, the bot would need to exclude these listings from its next run.) There are too many listings to post all at once (or at least all on one page), but we can certainly start working on a chunk of them. I'm sure we could clear out some of the less-populated categories entirely. By the way, anyone can do this; you don't have to be Marumari. (I would do it myself if I weren't busy updating other such reports.) -- Beland 03:23, 5 October 2005 (UTC)
#!/bin/bash
# Format each tab-separated record as a wiki list item; the description field is optional.
AWKPROG='{desc=" "$3;code=$4;rpt=$5;if(NF==4){desc="";code=$3;rpt=$4;} printf "#[[%s]], [%s%s], %s %s\n",$1,$2,desc,code,rpt}'
for i in a b c d e f g h i j k l m n o p q r s t u v w x y z; do
    zgrep -i "^$i" 404-links.txt.gz | awk -F '\t' "$AWKPROG" > "404s/$i"
done
# Entries that do not start with a letter go to 404s/misc.
zgrep -i -v '^[a-zA-Z]' 404-links.txt.gz | awk -F '\t' "$AWKPROG" > 404s/misc

301 redirects
I was trying to clean up 301 redirects. I noticed a number of pages were posting links to the Google cache, which sometimes failed with a 404. In some cases the link to the cache is used to convert from doc/pdf to HTML. Does anyone know the official wiki policy on using Google cache links? I feel that since Google cache links are often not valid, they should be discouraged. Pointing to the direct webpage is a better idea.

301 redirects bot
I am planning to ask for permission to fix some of the 301 redirects. Obtaining the pages to be changed will be a manual process; after that, the bot will perform the changes. An example of this is the 114 instances of http://www.ex.ac.uk/trol/scol/ccleng.htm . Comments? Khivi 09:13, 13 October 2005 (UTC)
Refreshes
How often do folks think I should update the listings? Perhaps I should just wait until Khivi creates his 301 bot?

Jumping about
Khivi, I appreciate your edits (to the 301 section). Can you please try to work in a "range" of links, instead of jumping about like that? It does make it much more difficult to read. Thanks!

404 Fixing Policy
What is the policy on fixing the 404 links? Should they just be deleted? Also, what is the policy for linking to the Internet Archive? Should one link to a specific version? e.g. which is better
Probably a lot of the linking to the Internet Archive can be (semi-)automated with some kind of bot?
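As a rough sketch of what such a bot would need for the "specific version" question above: the Wayback Machine accepts either a `*` (list all captures) or a 14-digit `YYYYMMDDhhmmss` timestamp in the URL path. The function names below are made up for illustration, and the example does not check whether a capture actually exists.

```python
from datetime import datetime

WAYBACK_PREFIX = "http://web.archive.org/web"


def all_captures_url(url: str) -> str:
    """URL of the Wayback Machine page listing every capture of `url`."""
    return f"{WAYBACK_PREFIX}/*/{url}"


def snapshot_url(url: str, when: datetime) -> str:
    """URL requesting the capture of `url` at (or nearest to) `when`."""
    return f"{WAYBACK_PREFIX}/{when:%Y%m%d%H%M%S}/{url}"


if __name__ == "__main__":
    target = "http://www.wikipedia.org/"
    print(all_captures_url(target))
    print(snapshot_url(target, datetime(2005, 10, 5, 3, 23, 0)))
```

Linking to a dated snapshot pins the citation to the content the editor actually saw, while the `*` form leaves the choice of version to the reader.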
404 cleanup question
Should we update the individual pages where the 404 linkrot entries are listed? With strikeout text? I am doing so, and see a couple of others have as well, but don't see where there are specific directions on 404 errors, unless I missed it somewhere. SailorfromNH 23:36, 26 November 2005 (UTC)
Time for an update? Again?
We're working on the list of dead links generated from the September 13th database dump here. A lot of the ones in this list will have been repaired by someone by now and also new dead links will have appeared. --Spondoolicks 18:05, 29 November 2005 (UTC)
I am not clear where we are in the database dump cycle, but these lists are feeling a little stale. Is it time for a new list? Open2universe 13:07, 24 February 2006 (UTC)

Question re link checker
I've noticed that quite a few of the links listed as 404 errors are of the form that goes to a section of the target page, e.g. http://www.hostkingdom.net/Holyland.html#Samaria (I'm sure there's some technical term for this type of link which I ought to know). I've tried a dozen or so of these and they all seem to be working fine, so I was wondering if the link checker had not managed to process these correctly. --Spondoolicks 14:18, 11 January 2006 (UTC)
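The part after `#` is a URL fragment. Browsers resolve it locally and never send it to the server, so one plausible explanation (only a guess, since the checker's code is not shown here) is that the checker requested the fragment as part of the path. A minimal sketch of stripping it before checking, with a hypothetical function name:

```python
from urllib.parse import urldefrag


def url_for_request(url: str) -> str:
    """Return `url` without its #fragment, i.e. the part a server actually sees."""
    base, _fragment = urldefrag(url)
    return base


if __name__ == "__main__":
    print(url_for_request("http://www.hostkingdom.net/Holyland.html#Samaria"))
```

A checker could request the stripped URL and still report the original link text in its listings.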
Bot request to fix malformed links
I've just put a request on Wikipedia:Bot requests for a bot to fix those links which are not working because someone used the pipe symbol (|) thinking it was used the same as for internal links, e.g. [http://www.bbc.co.uk|BBC website]. At a rough estimate based on a small sample I'd say about 2% of the 404 errors are due to this mistake. --Spondoolicks 17:26, 16 January 2006 (UTC)
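The repair described here can be sketched with a regular expression. This is only an illustration, not the bot that was requested: the pattern and function name are assumptions, and a URL that legitimately contains a `|` would be rewritten incorrectly, so a real bot would want human review.

```python
import re

# Single-bracket external link whose URL is followed by "|" instead of a space.
# Internal links such as [[BBC|the BBC]] do not match: "[" must be followed by http(s)://.
PIPED_EXTERNAL_LINK = re.compile(r"\[(https?://[^\s\[\]|]+)\|([^\[\]]*)\]")


def fix_piped_links(wikitext: str) -> str:
    """Replace the pipe between URL and caption with the space that external links need."""
    return PIPED_EXTERNAL_LINK.sub(r"[\1 \2]", wikitext)


if __name__ == "__main__":
    print(fix_piped_links("See the [http://www.bbc.co.uk|BBC website] and [[BBC|the BBC]]."))
```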
What should we do with [captionless url] links?
Is there any guideline, tradition, or advice for dealing with dead URLs in square brackets by themselves without any caption text? I'm just replacing them with their Internet Archive link, when it yields results, without any further commentary such as is produced by {{dlw}} and {{dlw-inline}}. --James S. 06:23, 17 January 2006 (UTC)
403 code
Am I blind...or does the page mention nothing about code 403 (Forbidden)? Bloodshedder 05:44, 4 February 2006 (UTC)
How should I repair dead news article links?
A lot of news sites don't keep their articles available for long and don't allow the Internet Archive to capture them, which results in a lot of non-repairable dead links. However, if the article was from a news agency like AP or Reuters then it will be available from many other places, some of which might have either permanently accessible articles or will be available from the Internet Archive. I've just replaced a dead link to an article in the Washington Post with a link to the same article which is still available at ABC News but I don't know how permanent that is. Does anyone know of a major news site which uses AP/Reuters reports and has permanent links? --Spondoolicks 17:43, 6 February 2006 (UTC)
301 code count
The main page indicates that there are >30,000 of the 301-type errors. However, on opening the page I only get a count of slightly less than 12,000. Is this because the page is partitioned, with the first one (/301) running through the H's? Thanks. User:Ceyockey (talk to me)
Tips from experienced user
Hi, I was just hopping around for a project to work on when I found this Link rot page. I am not sure if it is supposed to be too simple, but would it be possible for experienced contributors to list efficiency tips for newbies for this project? For example, this guide from Wikipedia:Disambiguation pages with links was quite useful when I started that (until I got bored). Ashish G 00:47, 1 March 2006 (UTC)

Re:Dead external links
Hello there, Do you think you would have time to regenerate the files for the dead external links project? Not that we finished them all, but they are feeling a little stale. Thanks so much. Open2universe 12:21, 1 March 2006 (UTC)
Pages too large
I came here to remove a link to an AFD deleted page, but the page was over a megabyte long. I couldn't find the entry I wanted when I went into edit mode (Firefox doesn't search within text boxes) so I left it. == way too long. --kingboyk 10:28, 21 March 2006 (UTC)
Dead links cited as sources
I was asked not to delete dead links to pages that listed population figures. I am wondering what folks think is the best way to handle this when the website cannot be found in the archive. I understand that the website was the original source, but I am reluctant to leave broken links. Should I leave the URL but not as a link? Or should I simply state that the link is unavailable? Any guidance is appreciated. Open2universe 14:57, 27 March 2006 (UTC)

Internet Archive mentioned but not WebCite
I made some edits on Dec 7th to the effect of mentioning WebCite [1] alongside the Internet Archive as a means to recover broken links, in particular if they were prospectively archived with WebCite; unfortunately, these changes were reverted by another user as "spam/self-promotion". I will not re-revert these changes to avoid an edit-war, but I do request that this matter be given some serious consideration, and I am seeking some support through the Wiki community. Internet Archive and WebCite are not competitors, but complement each other, and both are non-profit. I do think that WebCite could help Wikipedia a lot to avoid broken links in the first place (or to cache cited material so that it is recoverable).
BEFORE:
MY SUGGESTED EDITED VERSION
There were other edits I made (which can be seen in the history) to include hints to the effect of avoiding 404s in the first place if all cited links were cached prospectively using WebCite, for which somebody could write a bot (see Wikipedia:Bot_requests). I hope Wikipedians will support the proposal to include hints to WebCite as well, or help in rephrasing how this should be done, and hopefully put these edits back in. I will withdraw myself from further discussions on this (except perhaps correcting factual errors in the subsequent discussion), but just want to throw the suggestion out there. --Eysen 14:39, 8 December 2006 (UTC)
Database Update
Okay, the entire page has been updated with new information from the November 6, 2006 database. Take all the numbers on the page and double or triple them. :( Sorry for being so neglectful - I rarely have an opportunity to monopolize my internet connection for the week or so that it takes to check all these links. --Marumari 17:54, 13 January 2007 (UTC)

Edit summaries
Some cleanup projects have had generic edit summaries to use when editing. They're useful for saving time when making many edits of a similar nature, as well as for advertising the project. For example: Stubsensor cleanup project; you can help!
Hello
Just thought I'd post a quick hello message. Although not a newbie to Wikipedia, I am a newly registered user. I've decided to pitch in by getting involved with this effort. I've been working my way through the 404s. I'd appreciate it if one of the regulars here would review some of my work to make sure that I'm following guidelines. I'm a little confused by the dl templates and am not completely clear on when/how to use them. --The preceding unsigned comment was added by Sanfranman59 (talk • contribs).

US House of Reps Clerk URLs
Many of the articles about various sessions of the US Congress have links to "Rules of the House" pages on clerk.house.gov. The URL has changed from clerk.house.gov/legisAct/legisProc/etc to clerk.house.gov/legislative/etc. Is there a way to do a global replace to fix this on every Wiki page? Sanfranman59 23:29, 2 February 2007 (UTC)
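For a single page's wikitext, the replacement itself is a plain prefix substitution; applying it across every page would need an approved bot. The sketch below assumes the rest of each path is unchanged after the prefix, which the comment above implies but does not confirm. The page name in the sample is made up.

```python
# Prefixes as given in the comment above; everything after them is assumed unchanged.
OLD_PREFIX = "clerk.house.gov/legisAct/legisProc/"
NEW_PREFIX = "clerk.house.gov/legislative/"


def rewrite_clerk_links(wikitext: str) -> str:
    """Rewrite old clerk.house.gov link prefixes to the new location."""
    return wikitext.replace(OLD_PREFIX, NEW_PREFIX)


if __name__ == "__main__":
    # "example.html" is a hypothetical page name, used only for illustration.
    sample = "[http://clerk.house.gov/legisAct/legisProc/example.html Rules of the House]"
    print(rewrite_clerk_links(sample))
```

Each rewritten link should still be checked, since a moved page may also have been renamed.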
300 Links
I have resolved most of these. Matthew Hill, Steve Godsey and Jerome Cochran reference www.timesnews.net, a subscription-only news site. I cannot verify the links and none are archived. London Buses route 106 contains a reference to a document which seems to have been inadvertently renamed by the site admin to "xxx.pdf". Somehow I don't think it's going to stay that way. Phatom87 22:20, 8 February 2007 (UTC)

Sorting by link host
It would be useful to have an alternative view of broken links sorted by host. This would help make it apparent when a host of links to a popular external source break due to some large-scale changes on the host; for example, when Pitchfork decides to move its articles around (again), breaking tons of links. Once situations like this are recognized, they can possibly be handled by bots, allowing editors to focus their linkrot efforts on more unique cases. Pimlottc 14:11, 20 February 2007 (UTC)
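A by-host view like this could be produced from any flat list of broken URLs. The sketch below is an illustration, not part of the project's tooling: the function name is made up and the sample URLs use placeholder hosts.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_host(links):
    """Group URLs by host, largest group first (ties broken alphabetically)."""
    groups = defaultdict(list)
    for url in links:
        # hostname is lower-cased by urlsplit; URLs without a host go under "".
        host = urlsplit(url).hostname or ""
        groups[host].append(url)
    return sorted(groups.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    broken = [
        "http://a.example/1",
        "http://b.example/x",
        "http://A.example/2",
    ]
    for host, urls in group_by_host(broken):
        print(host, len(urls))
```

Hosts at the top of the list are the candidates for a bulk fix by bot.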
404 bot
Hello, I've just completed my work on PDFbot and during its run I noted that a lot of the links 404ed. Most of them are fairly easy to find, just very tedious to fix. My bot code is in a state where it should be fairly easy to adapt for this. Here are some ideas on link resolution:
Is there anything against this, or any more ideas on what this bot should be doing? PS: This project's visibility is extremely low; none of its templates are linked here. --Dispenser 03:48, 12 March 2007 (UTC)

What bot is used?
A question: presumably some kind of bot actually calls up the links in question to check them? How does this bot identify itself to the websites? I ask because I've come across several links which are marked as 404 but which definitely exist; the websites they are on have a very strict policy towards bots and might be giving (unwillingly) false results. --The preceding unsigned comment was added by 84.190.114.119 (talk) 22:06, 13 March 2007 (UTC).
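How the project's checker identifies itself is not stated on this page. As a general illustration, a checker announces itself through the HTTP `User-Agent` header. The sketch below only builds such a request with Python's standard library and does not send it; the agent string and function name are hypothetical.

```python
import urllib.request

# Hypothetical identification string; a real bot would name itself and give a contact page.
USER_AGENT = "ExampleLinkChecker/0.1 (hypothetical; contact: User:Example)"


def build_check_request(url: str) -> urllib.request.Request:
    """Build a HEAD request that carries an explicit User-Agent header."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": USER_AGENT},
        method="HEAD",
    )


if __name__ == "__main__":
    req = build_check_request("http://www.wikipedia.org/")
    print(req.get_method(), req.get_header("User-agent"))
```

Some servers answer HEAD requests or unfamiliar agents with an error even when the page exists, so a flagged link may deserve a second check with a normal GET before it is listed as dead.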
This project and my bot
Within the past two weeks I've programmed a bot (User:Ocobot) that detects broken links at Wikipedia. It has just been approved for trials. Now I just stumbled across this site, and that makes me basically feel like an idiot, like I wasted a lot of time and that Wikipedia is even much bigger than I had thought. Ah, it's a pity I didn't find this earlier. This project's bot is not listed at the bot status page, is it? I am considering withdrawing my request for approval now. But I'm not sure yet, because my idea is slightly different. Users can put areas of their interest on the bot's schedule, for example, and the bot will then check these articles. Could someone from this project please browse the bot's user page and maybe the request for approval to see if it can be of use despite the existence of this project? Shall I run my bot, change it or drop it? Thanks. -- Ocolon 17:52, 18 March 2007 (UTC)
WikiProject External links
Every day I discover something new. Today it was WikiProject External links. The project seems to be rather inactive, unfortunately. However, I hope it can be revived. Would you like to contribute? Wikipedia:Dead external links fits very well into this project; it should have a central role actually. I think the WikiProject could be used to coordinate all external links discussion and activity. And having an active WikiProject External links would also strengthen the efforts of this site here. Your support will be appreciated a lot! You can make suggestions or comments on this matter on the WikiProject's talk page. -- Ocolon 18:35, 20 March 2007 (UTC)

Tool
I've programmed a little tool for Ocobot that might be helpful for you, too. You can also suggest how to make it more useful for you! It's a very short script that checks if a URL can be retrieved by the Wayback Machine. The basic difference between directly using the Wayback Machine and my tool is that the tool outputs Wikipedia tags, which might make editing a little faster :-). Here's an example of how to use it to retrieve Wikipedia at the Wayback Machine: http://ocobot.kaor.in/url.php?http://www.wikipedia.org/ But you can also go to http://ocobot.kaor.in/url.php and use the form to check a URL.

What should a bot do?
I'm currently writing a bot which will search for dead links; however, I'm not quite sure what the bot should do once it's actually found one. I was going to simply list the dead pages on a user subpage of the bot, but I'm not sure if this is the right thing to do now. Any suggestions? PeteMarsh 22:23, 29 May 2007 (UTC)

Some ideas
Some dead links may still be good links and should not simply be deleted. But as more and more links accumulate, a page may one day have too many links. Gaia2767spm 13:43, 7 June 2007 (UTC)

Wikipedia:Requests for verification
Please see: Wikipedia:Requests for verification. A proposal designed as a process similar to {{prod}} to delete articles without sources if no sources are provided in 30 days. It reads: Some editors see this as necessary to improve Wikipedia as a whole and assert that this idea is supported by policy, and others see this as a negative thing for the project with the potential of loss of articles that could be easily sourced. I would encourage your comments on that page's talk page or in the mailing list thread on this proposal, WikiEN-l: Proposed "prod" for articles with no sources. Signed Jeepday (talk) 14:09, 18 July 2007 (UTC)

404s
Just removed a big block of repaired (struck-out) links from the U list. (Copied to its Discussion page in case anyone minds.) They seemed to date from a while ago and I had to keep scrolling past them! thisisace 22:53, 8 August 2007 (UTC)

Independent online newspaper dead links
http://en.wikipedia.org/w/index.php?title=Special%3ALinksearch&target=enjoyment.independent.co.uk Many of these dead links could be recovered by substituting "enjoyment.independent.co.uk" with "arts.independent.co.uk".