This is an archive of past discussions on Wikipedia:AutoWikiBrowser. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
I would love to try out the new AutoWikiBrowser and provide feedback on it. Send an email to floresg2 AT gmail.com. Thank you. GfloresTalk22:20, 8 December 2005 (UTC)
Your auto-thingy wasn't just doing a recat in this edit [1]. You can see that it also added a space after the initial asterisk in lists (which is unnecessary), and made some other changes that I couldn't figure out (removing spaces?). BlankVerse22:54, 10 December 2005 (UTC)
I removed some white space, and added some spaces to the bullet points to make the wiki text more readable at the same time as re-catting; what's weird about that? Martin23:05, 10 December 2005 (UTC)
Since I'd never seen your edits before yesterday, I decided to look at one today. Instead of the expected re-cat, the diff shows a whole bunch of changes plus the re-cat. That's why it looked weird to me. BlankVerse23:35, 10 December 2005 (UTC)
Yeah, among other improvements I am making more descriptive edit summaries easier ;) Also I think I won't bother with the very very minor edits from now on anyway. Martin23:48, 10 December 2005 (UTC)
Works well
The bot works quite well. The only issue I have had is when a blank page loads, i.e. an image with no tag. The bot carries the information from the previous edit and tries to fill in the blank page. — KaiserB23:31, 11 December 2005 (UTC)
I suspect that this is a problem I have encountered before, whereby I can't see the stuff right at the bottom of the AWB window; there's no scroll bar so I simply can't get at it.
My windows toolbar is double-height, to allow more stuff to be displayed in the "clock" area, and to allow more window buttons.
You can see a sliver of the top of my toolbar right at the bottom of the screenshot.
I suspect that AWB positions its buttons either relative to the top of the window, or based on incorrect information as to where the bottom of the window actually lies.
It's really frustrating, since this looks like an excellent tool…I'd like to see and use all of it
In the meantime, I've a couple of questions:
There are no access keys: "Make list" could be "Make list" maybe…
Tools→Options does nothing for me: should it?
There's a new tab: what does "messaging" do?
What are the "general fixes"? Can we pick and choose?
It can be resized by "grabbing" the middle horizontal border between the buttons area and the browser, hopefully I'll fix it properly soon.
Yup, I still need to add shortcut keys and stuff (admittedly I had forgotten, as I never use them).
The "Options" should not do anything, I was fiddling and forgot to remove it ;)
Messaging is for appending a message to the bottom of a page, i.e. for spamming user talk pages (someone asked me for this feature). I have tried to make the tooltip text (you know, the popup help text that you get when you hold the pointer over something) as explanatory as possible.
General fixes mainly correct errors in the "See also" and "External links" sections (e.g. mis-capitalisation, or a non-standard name) and remove excess white space; I need to add more to these general fixes.
Let me know if there are other features that I could add; I have some that I haven't enabled yet as well.
I came across a user's signature which had been broken by the brief outage of HTML Tidy a while back, and I thought that it would provide an ideal exercise for me to get to know AWB.
However, AWB doesn't seem to be able to find the broken text, even here when I can see it right there in the textbox!
The text in question is:
(note that in the wikitext I have had to double-escape the first version: the second version is actually the broken one which is rendering just fine here :-).
Are there some settings I should be tweaking?
Should I be double-escaping the HTML entities like I had to just above?
—Phil | Talk14:30, 14 December 2005 (UTC)
hmmm it was because it has an \ in it, which the AWB interpreted as an escape sequence I think, so I escaped the escape sequence by sticking another \ into the "find" and it was OK. I'll work out how to fix it, hopefully it won't be a problem in the mean time. Martin14:59, 14 December 2005 (UTC)
Actually, on second thought it's pretty good needing to escape some characters; it means the search can be much more advanced. Martin16:01, 14 December 2005 (UTC)
More escaping
I just discovered you have to take great care sometimes…
I was searching for:
[[Box of Delights]]
which I had just turned into a REDIRECT, so I could short-circuit it.
I fetched the list of "links to" articles and told AWB to replace that with:
[[The Box of Delights]]
which is the proper article.
Imagine my horrified interest when the first article appeared with every single instance of "?]]" replaced with [[The Box of Delights]]:
it was obviously treating the [ ] like a character range.
But I didn't have "Is regex" selected: should it really be doing this?
If that option is switched off, shouldn't it treat the string as completely raw?
It works fine, BTW, if you use
Yeah, I found that as well. I didn't realise that it would treat it like a range, but I guess I could call it a new feature! Martin16:37, 14 December 2005 (UTC)
Whoops, I just realised that what I had done initially was correct (I was starting to get worried), but I was passing the wrong parameter to my find and replace method, and as such regex was permanently on (as you suggested). It's fixed now (simple find and replace really is simple now!) Martin16:56, 14 December 2005 (UTC)
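The difference between raw and regex matching described above can be reproduced in a few lines. This is only a minimal Python sketch, not AWB's actual code (AWB is a .NET program); the helper name `find_and_replace` and its `is_regex` flag are hypothetical stand-ins for the "Is regex" checkbox.

```python
import re
import warnings


def find_and_replace(text, find, replacement, is_regex=False):
    """Hypothetical helper: raw substring replace unless is_regex is set."""
    if not is_regex:
        # Raw mode: brackets and backslashes are ordinary characters.
        return text.replace(find, replacement)
    # Regex mode: "[...]" is a character class, "\" starts an escape.
    return re.sub(find, replacement, text)


text = "Read [[Box of Delights]] today."
old = "[[Box of Delights]]"
new = "[[The Box of Delights]]"

plain = find_and_replace(text, old, new)

with warnings.catch_warnings():
    # Recent Python versions warn about "[[" inside a character class.
    warnings.simplefilter("ignore")
    as_regex = find_and_replace(text, old, new, is_regex=True)

print(plain)     # the link is replaced as intended
print(as_regex)  # only "one character from the class, then ]" is replaced
```

When a literal string has to go through a regex engine anyway, `re.escape(old)` produces a pattern that matches it character for character.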
Question
For the option, 'Ignore if contains:', is that only for the title of the page or for anything in the article? Also, is there a way to add several entries (like User: Wikipedia: talk:), basically I just want the main article namespace.
Secondly, I think this tool would be great for disambiguating links, but I really don't know how that would function in the program. Perhaps using the what links here option, and somehow getting a list of links from the term's disambiguation page (or the person could just create this in notepad), then the person could enter some link to disambiguate in the search field and AWB will search for it in the small textarea and perhaps highlight it. Upon which, the user will change the link to point to its appropriate location. I don't know that may be difficult to implement. If I'm not being clear, let me know. I'm way too tired right now. BTW, can't wait for the local text file support. Thanks. GfloresTalk04:37, 15 December 2005 (UTC)
It ignores based on the article text. To separate out the main space, sort the list alphabetically and then remove the other namespaces from it. Martin08:58, 15 December 2005 (UTC)
I will add regex to the "ignore if contains" as well so you can search for as much or as little as you like. I am also working on a filter to leave only main space articles, but it is easy to do it manually so it's not a big priority. Martin09:02, 15 December 2005 (UTC)
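A main-space filter of the kind mentioned here could look roughly like the following Python sketch. The function name and the prefix list are assumptions for illustration only; the list is deliberately incomplete and is not taken from AWB.

```python
# Illustrative, incomplete list of non-article namespace prefixes (an assumption).
NON_ARTICLE_PREFIXES = (
    "User:", "User talk:", "Wikipedia:", "Wikipedia talk:",
    "Talk:", "Template:", "Category:", "Image:",
)


def main_namespace_only(titles):
    """Keep only titles that do not start with a known namespace prefix."""
    # str.startswith accepts a tuple and checks every prefix in it.
    return [title for title in titles if not title.startswith(NON_ARTICLE_PREFIXES)]


titles = ["Plzen", "User:Martin", "Talk:Plzen", "Wikipedia:Typos", "The Box of Delights"]
articles = main_namespace_only(titles)
print(articles)
```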
Previewing manual changes
Is it possible to preview the effect of manual changes?
I'm not that confident in my typing to be certain I always get it right first time…
HTH HAND —Phil | Talk08:55, 15 December 2005 (UTC)
So I haven't a bot account, but I would like to have AWB sit in the background, touching a whole list of articles, fixing "links to" issues for updated templates.
Can I set it to go off at 1 minute intervals, just wandering down the list of articles?
What's the best way to make sure it just touches, doing nothing substantive at all?
HTH HAND —Phil | Talk09:56, 15 December 2005 (UTC)
I would like to delink solitary years e.g. [[2005]]. But I need to avoid years if they are not solitary. I have been searching for 'in [[2005]]' and ' of [[2005]]' but that is laborious. Presumably I can make use of regex to avoid ']] [[2005]]' and ']], [[2005]]' etc.
I know that you are not a regex helpdesk but could you let me know how regex find square brackets and how it does ignore? Bobblewik10:05, 15 December 2005 (UTC)
Add to Watch-list
I would like the option to be able to inhibit adding articles to my watchlist.
If it were possible to preserve the current situation (i.e. only "add" them if I am already watching), that would be great.
In the meantime, I'm hoping that, despite the abjurations not to click in the browser pane, if I untick "add to watchlist" it will be respected
HTH HAND Phil | Talk10:12, 15 December 2005 (UTC)
I'd like AWB to take care of it, so that I could just set it off ticking away in the background: maybe it could be an option in the menu? Of course, this issue of side-effects in other applications might need some attention before I can do that :-( HTH HAND Phil | Talk12:57, 19 December 2005 (UTC)
Interacting with the webpage is difficult, and no one I asked had any idea how to do it. Getting text from the edit form is not too bad, and "clicking" the save and diff buttons is a bit of a hack, but I have yet to master how to check and uncheck the tickboxes. Martin13:06, 19 December 2005 (UTC)
It's a while since I managed any serious VBA programming, and I stopped doing it for money shortly before VB.NET arrived, but I might be able to help (if only to make some really obvious observations and make sure every base has been covered :-). Is the source code anywhere I could take a peek? How do you get the text out of the edit form? Could that be modified to access the other "controls" on the web-page? —Phil | Talk15:11, 19 December 2005 (UTC)
Getting text from a form is quite easy when you know how; something like webBrowser.Document.GetElementById("wpTextbox1").InnerText. To be honest I haven't looked into this issue that much, as I had other priorities (such as getting the "preview" button etc.); now that's done, I'll give it some more thought. I doubt anyone other than myself could understand the source, as it needs to be tidied up a lot and commented properly ;( If there is anything specific you want to know, just ask though! I know that when I find the answer it will be simple, like the code above. thanks Martin15:29, 19 December 2005 (UTC)
I assume that it would be something along the lines of (very roughly) webBrowser.Document.GetElementById("wpWatchthis").Checked=False. How wrong am I? Phil | Talk16:37, 19 December 2005 (UTC)
Along the right lines. C# sees it as an element with properties, so you have to set the properties like webBrowser.Document.GetElementById("wpMinoredit").SetAttribute(string attributeName, string value); Martin16:47, 19 December 2005 (UTC)
Inhibit adding
I note that there is now an option to "add to watchlist".
Actually what I need is the opposite:
I already have this option set in my Preferences.
What I want is the option that the pages I deal with through AWB not show up in my watchlist unless I specifically say so.
Sorry to be a pest, but would this be difficult now you've figured out how to control that check-box?
—Phil | Talk10:48, 20 December 2005 (UTC)
I think for the moment the best way might be to temporarily change one's preferences, to remove the automatic "add to watchlist" while using AWB. Then it will presumably not do so. If editing manually in a different window at the same time you would need to remember to set this, as it would not be auto-set. Then after closing AWB, re-set the preference. DES(talk)21:10, 21 December 2005 (UTC)
Startling drop-out when attempting to cut/paste
Yikes!
I think I just discovered that the editing box doesn't use the standard "Ctrl-X = cut" shortcut:
I tried "cutting" out some text to move it down a line and AWB dropped out.
What was really disturbing is the fade effect you seem to have applied: for a significant number of seconds it gave the impression that this machine had blown several fuses.
Maybe you could put something into those "Options" you left blank, and inhibiting the fade-out could be the first…
So what is permissible for editing in that box?
—Phil | Talk11:36, 15 December 2005 (UTC)
I'll change quit to Ctrl-Q instead. Like I said, I never use the keyboard shortcuts so I didn't know it was cut as well. thanks Martin15:53, 15 December 2005 (UTC)
That's odd anyway: I wasn't aware that you didn't need an actual newline after a section heading. Looks like AWB is cleaning out the "extraneous" white-space after the "="s but failing to replace it with a newline. Maybe it should always stick a newline in?—Phil | Talk14:49, 15 December 2005 (UTC)
Erm, I am not responsible for cleaning up after you, go and fix it. The way that page was done is dumb anyway; there is no point in me fixing that problem, as it is extremely rare, and you are actually supposed to check what you're saving. Martin15:21, 15 December 2005 (UTC)
OK. I did not know that it was extremely rare. Now I see that the article originally had a single space character instead of a new line. It seems that Wikipedia accepts either a single space character or a newline as an 'end of section heading'. So a possible solution would be to add a single space character. That would be unnecessary in almost all cases. If, as you say, the problem is rarely encountered, then that is ok by me.
I imagine it's probably more to do with the wiki server side of things than the program. Not that I couldn't do something to avoid it, but unfortunately it is impossible to replicate the problem. I'll get around to making it re-load or skip the page if it thinks it's blank. But you really should check what you are saving! Martin00:35, 16 December 2005 (UTC)
Hence my request for a preview option. This is especially important when manual changes are added in the edit box. It would also be helpful to be able to see if someone has been messing with the templates you might be using (grrr!) whilst you're in the middle of using them (GRRRRR!) —Phil | Talk09:46, 16 December 2005 (UTC)
AWB seems to be affecting other applications when it's working.
If I send it off to fetch a page for processing, and switch to another application during the pause, I seem to be getting an "enter" keystroke being spontaneously generated which is making my other application do stuff.
Is this a side-effect of how AWB works?
If so, can it be stopped?
Please?
—Phil | Talk09:54, 16 December 2005 (UTC)
If you add more entries to the list of articles, AWB should check whether an entry is already there and not add it twice.
(As a side-comment, I've had some interesting phenomena when removing items from the list, but I'm unable to reproduce them: removed items staying put, selections becoming multiple, random stuff. I'll let you know if it recurs.)
HTH HAND —Phil | Talk13:59, 16 December 2005 (UTC)
Whenever I use this program I get a weird bug. See the screencap. BrokenS 19:27, 17 December 2005 (UTC) By the way this program is really sweet (I am still using it even though I have to click through the error messages [3 errors per page fixed]). BrokenS19:30, 17 December 2005 (UTC)
That is a problem with Internet Explorer (hence the error "Internet explorer script error" ;) ), there is something I can do to avoid it though (I think). thanks Martin20:03, 17 December 2005 (UTC)
While doing a touch run (doing null edits) to update the list of used templates in articles I noticed that AWB on redirects opens the redirect page and not the target page of the redirect (Example Plzen). For this specific kind of run I would have needed that AWB opens the target page, not the redirect page. I made the list from the "What links here" of template:if. – Adrian | Talk13:53, 18 December 2005 (UTC)
Ok, what features can I add to make your lives easier? Are there any specific jobs that the program could be adapted to better suit? Martin00:36, 19 December 2005 (UTC)
How about a way to tag images? I'm not exactly sure how I would work it. Some should have {{unverified|~~~~~}}, others might need {{unknown}}. And for others you might be able to decide fairuse, or maybe they forgot to tag it gfdl (but wrote it down in plain text). Maybe a drop down box? BrokenS00:42, 19 December 2005 (UTC)
I would love a way to import a list from a text file. Has this feature been scrapped? This would be very helpful for fixing typos, since most people in WP:Typos use the more effective Google search to find pages with typos. GfloresTalk00:59, 19 December 2005 (UTC)
Also, I'm still getting the thumbs error from above in the new version (.90). A redirect fixer would be nice. You tell it to fix links pointing at a redirect. You can sort of approximate that now, but it can be confused by piped links (if "link" is moved to "linkname" and I use your program to fix the redirects, I get [[link|linkname]] being changed to [[linkname|linkname]]). BrokenS01:32, 19 December 2005 (UTC)
What is the "thumbs error"? If you mean the script error, then I am 99% sure that is IE's fault, make sure you have the newest version. An automatic link fixer is a bit difficult, but I'll think about it! Martin09:31, 19 December 2005 (UTC)
How about being able to plug into a webservice or something like that which then corrects the fixes (and also stores the list)? Indeed, then it would be User:Humanbot!
Seriously though, the User:Humanbot script needs a major rewrite and if your program could be a better interface, perhaps integrating them would be the way to go. :) r3m0ttalk10:58, 19 December 2005 (UTC)
(sorry for delayed response) I think it would be difficult to integrate them; my program would probably work more easily from a list of articles with typos generated from the database, with the spelling correction code built in to the program. While on the subject, are you planning on running Humanbot again soon? Martin21:18, 20 December 2005 (UTC)
There's no point - a recent Greasemonkey release did good things in general, but broke my script. I could modify it but the interface wouldn't be very good. I seriously think that Humanbot and this could go well together - didn't you know that Humanbot worked with a list of articles on a central server? :) Perhaps the correcting function can be at the client-side, but that isn't always a good idea. Consider, for example, automatically adding links (which requires a lot of data), and... well... maybe that's about it. But I would like it anyway! ;) r3m0ttalk21:34, 20 December 2005 (UTC)
Just some random feedback
The 'Filter' or 'Sort' buttons are no longer there. I can't find them in the menu. So I can't get a list that is sorted and has no user or talk pages in it.
Could it identify the 'Inuse' tag or other pages that should not or cannot be edited?
Something odd happens when I reach the last page in the list. I think it just comes to a halt and will not edit it. I may be mistaken.
Aha. I found them. Thanks. A general usability recommendation (I haven't got a reference) is that contextual menus are a convenience but not a replacement for drop down menus. This is so that users can explore all functions without having to right click in all parts of the interface. If you get time, could you add them to the drop down menus too?
Also, can I suggest 'Filter out duplicates' -> 'Remove duplicates', 'Sort alphebetically' -> 'Sort alphabetically'.
It would be useful (for me anyway) if it could do an initial 'Remove duplicates' just after the list is loaded. Links on page often produce duplicates. I cannot imagine any benefit in a non-alphabetical order so an initial 'Sort alphabetically' would be useful too. I know this will add complexity and duration so I understand if you don't want to add it to the wishlist.
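The "remove duplicates, then sort" step requested here is small enough to sketch. The Python below is an illustration only, with a hypothetical `tidy_list` helper; it says nothing about how AWB stores its list internally.

```python
def tidy_list(titles):
    """Drop duplicate titles, then sort the rest alphabetically, ignoring case."""
    # dict.fromkeys keeps one entry per title, preserving first-seen order.
    unique = dict.fromkeys(titles)
    return sorted(unique, key=str.casefold)


loaded = [
    "Plzen",
    "Indianapolis Colts",
    "Plzen",
    "The Box of Delights",
    "Indianapolis Colts",
]
tidy = tidy_list(loaded)
print(tidy)
```

Running the same helper each time entries are appended would also cover the earlier request that an article never be added to the list twice.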
Small request... can you add shortcuts to some of the functions, mainly Save and Ignore, please. BTW, thanks for the textfile support, it sure makes fixing typos a lot easier. Appreciate it. GfloresTalk00:58, 20 December 2005 (UTC)
I'd love to be able to preview selected articles in the browser screen to see if I need to remove them from my list. I hate to switch between my regular browser and the program so much. - Mgm|(talk)22:46, 27 December 2005 (UTC)
It has an option in the menu to go to the preview rather than diff to start with, if that's what you mean. Martin22:49, 27 December 2005 (UTC)
I know you said somewhere else that AWB only works with the en wikipedia, but I was wondering if you could add the feature that converts old-style characters like &12345; to the appropriate unicode symbols. See this edit for an example of why this is useful. (You don't need to be able to read kanji to understand that the post-edit is much easier to edit than the pre-edit) Neier08:57, 31 December 2005 (UTC)
Also, instead of removing the year links, maybe it would be better to offer the choice to change them to something more appropriate. On the Indianapolis Colts page, someone took out the links but it struck me as a good idea to link to XXXX NFL Season where appropriate. So, 1995 -> 1995 NFL season (or 1995 in politics etc.) could be set as a default replacement by the tool. In context, editors should be able to determine which year links are worth keeping. Neier08:57, 31 December 2005 (UTC)
In regard to the first suggestion, I would love to do that but I just don't know how I would find out what all the old form and new unicode characters are. I wonder if anyone knows how the pywikibots do it? In regard to the second suggestion, it wouldn't really be practical; it would have to be at the editor's discretion. Martin19:55, 31 December 2005 (UTC)
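On the first suggestion, no lookup table should be needed if the old-style characters are decimal numeric character references of the form `&#NNNNN;`, because the number is the Unicode code point itself. The following Python sketch rests on that assumption; the function name is hypothetical, and this is not how AWB or the pywikibots are known to do it.

```python
import re

# Decimal numeric character references such as "&#26085;".
NUMERIC_REFERENCE = re.compile(r"&#(\d+);")


def decode_numeric_references(wikitext):
    """Replace each decimal reference with the character it encodes."""
    # Named entities such as "&amp;" are left alone on purpose.
    return NUMERIC_REFERENCE.sub(lambda m: chr(int(m.group(1))), wikitext)


sample = "&#26085;&#26412; &amp; more"
decoded = decode_numeric_references(sample)
print(decoded)
```

A real implementation would also need to guard against numbers outside the Unicode range, for which `chr` raises `ValueError`.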
release the source code?
You should really consider releasing the source code to the AutoWikiBrowser. That way, other people can help improve it! :) --Ixfd6410:47, 19 December 2005 (UTC)
Information for users (Martin feel free to copy and use this info any way you want):
Martin has been kind enough to use my regex in the date delinking section. I am not as good at regex as I would like to be. Here are the concepts and the details:
The idea is to delink date elements that fail the date preference test. There are exceptions (see below). I do not use date preferences (personally: I tolerate any sequence when the month is non-numeric). Nor do I know how the date preference code works. But some info is at:
Any day of the week: (Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday|Mondays|Tuesdays|Wednesdays|Thursdays|Fridays|Saturdays|Sundays)
Any month: (January|February|March|April|May|June|July|August|September|October|November|December)
Any decade: ([0-9]{4}s)
Any three digit or four digit year: ([0-9]{3}|[0-9]{4})
Any century: ([0-9][a-z]{2} [Cc]entury|[0-9]{2}[a-z]{2} [Cc]entury|)
Any month/year combination such as 'February 2002'
I also try to avoid pages that discuss calendars and the origins of week/month names. My crude way is to search for the word 'calendar' and 'god'. But that could be tightened.
The part for three or four digit years could be combined to [0-9]{3,4} which'd make the expression a tad bit shorter. --Mairi22:02, 19 December 2005 (UTC)
Thanks. Very useful. Perhaps you also know a way to shorten the search for centuries. Would ([0-9]{1,2}[a-z]{2} [Cc]entury) work instead of the one I suggested above? Could we do a similar thing for days of the week?
I am particularly keen on finding out how to avoid adjacent valid elements. The piece ([^\]]{4}) is supposed to trap 11 January2005 by looking for the square brackets. Unfortunately it also traps London2005, January2005 and 1990-1995. Furthermore, there appears to be no limit to the number of spaces a valid date can have and it will not trap [[January 11]], [[2005]] because it has 5 consecutive character spaces. A similar problem applies when the year is the first element i.e. 2005-January 11. Bobblewik23:10, 19 December 2005 (UTC)
Discussion of possible new regex (century, decade, year, month, day):
That would work for centuries. Although you might want to use [_ ] instead of the space, as links with underscores work just as well. Another way of shortening things would be to replace all the [0-9] with \d (they function pretty much the same way).
There's also three digit decades, but I suspect links to them are far less common.
As far as false positives, you might want to avoid changing anything in articles with titles that are dates (such as 1010s), as they seem to have a lot of [desirable] year, decade and century links.
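The shortenings suggested in this exchange can be checked against a few sample strings. The sketch below uses Python's `re` module rather than the .NET engine AWB runs on, so it is a rough sanity check under that assumption, not a guarantee of identical behaviour.

```python
import re

long_year = re.compile(r"([0-9]{3}|[0-9]{4})")          # original form
short_year = re.compile(r"\d{3,4}")                     # shortened form
century = re.compile(r"\d{1,2}[a-z]{2}[_ ][Cc]entury")  # space or underscore

# The two year patterns accept exactly the same whole strings.
year_samples = ["476", "2005", "20005", "12", "19th"]
years_agree = all(
    bool(long_year.fullmatch(s)) == bool(short_year.fullmatch(s))
    for s in year_samples
)

# The century pattern accepts both link spellings and rejects a bare word.
century_hits = [
    s for s in ["19th century", "1st_Century", "century"]
    if century.fullmatch(s)
]

print(years_agree, century_hits)
```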
I have amended the century bit as you suggest. I replaced the space with an underscore as you suggest. I had not thought of 3 digit decades, that is now included but it could come out again. I do not know how to trap date titles but I simply do not list them for processing. Bobblewik00:53, 20 December 2005 (UTC)
You don't want to use .* for anything intended to be within [[ ]] as it will continue "eating" characters past the first set of ]] (and probably continue til the last set of ]] on the page). It'd also end up matching things like [[Monday Night Football]]. Is it just, say, Monday and Mondays that you want to match? --Mairi01:49, 20 December 2005 (UTC)
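The "eating" behaviour of `.*` is easy to see on a one-line sample. This is a small Python sketch; the sample sentence is made up, and the bounded pattern is the day-of-week group that appears later in this discussion.

```python
import re

text = "On [[Monday]] we watched [[Monday Night Football]]."

# Greedy: ".*" runs to the end of the line and backs up to the LAST "]]".
greedy = re.search(r"\[\[Monday.*\]\]", text).group(0)

# Bounded: only a day name, with an optional plural "s", may sit inside the link.
bounded = re.findall(
    r"\[\[((?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?)\]\]", text
)

print(greedy)
print(bounded)
```

The `(?: )` group only groups the alternatives; it does not capture, so the outer parentheses remain group 1 for the replacement.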
Yes. I just want to match Monday and Mondays. I definitely don't want an unlimited match. Thanks for the warning. Feel free to amend it anyway you think would work. Bobblewik01:53, 20 December 2005 (UTC)
That ought to work. It's a little non-intuitive, tho. It won't change the group numbering for replacement either, which is why it uses (?: ) instead of ( ).
It might also be a good idea to make the whole expression non-case-sensitive (if you can do that easily). --Mairi02:10, 20 December 2005 (UTC)
Thank you very much. I bow to your superior knowledge on this. Feel free to do whatever you think will improve it. As long as it works, I don't care how. You are a great help. Bobblewik02:27, 20 December 2005 (UTC)
Martin has just told me that it can do multiple passes. Here is my latest proposed regex:
First pass looks for century, decade, month, day:
Search: \[\[(\d*.. [Cc]entury|\d{3,4}s|January|February|March|April|May|June|July|August|September|October|November|December|(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?)\]\]
Replace: $1
Second pass looks for year
Search: ([^\]]{4})\[\[(\d{3,4})\]\]([^-\[])
Replace: $1$2$3
Ought to have fewer misses. (I'm not as sure about the part in the first set of parens. if it doesn't work, just switch it to what was there before and use $1$2). Do you want it to remove links for things like [[1992]]-[[1993]] also? I also made a subpage of my userpage with a bunch of different dates for testing. Feel free to use it and add other cases to it. --Mairi05:19, 21 December 2005 (UTC)
Yes, it should remove anything that is not valid for date preferences. So with a pair of dates such as [[1992]]-[[1993]], both should be delinked. That has been a particularly frustrating 'miss' and it looks bad to others. I was thinking that we should simply make it do a second pass to solve that.
I have also been wondering if our preceding link detection should actually look inside the preceding link for 'y]]' or '\d]]'. How about (?<![yhletr\d]\]\]\s*,?\s*) That could catch a lot of things like London2005. We could do a similar thing when we check the following link e.g. (?!-|\s*,?\s*\[\[[JFMASONDjfmasond\d]) Bobblewik08:40, 21 December 2005 (UTC)
I forgot to add anything about solitary month/year combinations. We would just add something like \d* (?:January|February|March|April|May|June|July|August|September|October|November|December
Latest proposed regex:
First pass delinks century, decade, month, day:
Search: \[\[(\d*.. [Cc]entury|\d{3,4}s|January|February|March|April|May|June|July|August|September|October|November|December|(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?)\]\]
Replace: $1
Second pass delinks month/year combination like [[January 2002]]:
Search: \[\[((?:January|February|March|April|May|June|July|August|September|October|November|December) [\d]{3,4})\]\]
\]\]
Replace: $1
Third pass delinks years that are not part of a date preference target. Should be no false positives and only a few misses:
Search: (?<![yhletr\d]\]\]\s*,?\s*)\[\[(\d{3,4})\]\](?!-|\s*,?\s*\[\[[JFMASONDjfmasond\d])
Replace: $1$2$3
Fourth pass is a repeat of the regex in the third pass. This is to delink the second link of [[2002]]-[[2005]]. Should be no false positives and only a few misses:
Search: (?<![yhletr\d]\]\]\s*,?\s*)\[\[(\d{3,4})\]\](?!-|\s*,?\s*\[\[[JFMASONDjfmasond\d])
\[\[(January|February|March|April|May|June|July|August|September|October|November|December) ([\d]{3,4})\]\]
Think you want to change (?!-|\s*,?\s*\[\[[JFMASONDjfmasond\d]) to (?!-\[\[\d\d-\d\d\]\]|\s*,?\s*\[\[[JFMASONDjfmasond\d]) which ought to get rid of all the misses for dates followed by a hyphen. And I think the 4th pass ought not to be necessary, but I'm not sure. --Mairi19:49, 21 December 2005 (UTC)
I will take your word for it. I don't fully understand the regex. I think the second pass (month/year) combinations can be merged into the first. If the 4th pass is not needed, that is good too. Change the proposed regex in any way you think best and most efficient. We should test it and then perhaps it will be time to ask Martin if he will adopt it. Thanks. Bobblewik18:43, 22 December 2005 (UTC)
I tried this yesterday and it appeared that the fourth pass was needed; at least it was catching things not found on the third. DES(talk)20:36, 22 December 2005 (UTC)
Have you tried the most recent set of regexes? I can't get any results from them; the older ones seem better though. Martin22:38, 22 December 2005 (UTC)
The ones I have been using are the 4-pass set listed above under "Latest proposed regex", as they were when first posted to this page. I haven't checked for any edits in place after I copied them to a text file for easy access. I typically get hits on passes 1, 3, and 4; rarely if ever have I seen a hit on pass 2 so far. I am doing a larger run now and will report on my results. I note that I have had to check the "remove all date links" option on the beta tab for these to work. So far, based on insufficient testing, I need both the regex in the set options tab and the checkbox on the beta tab enabled. I will confirm that with a specially constructed test page later today or tomorrow; I have to log off in a minute. DES(talk)22:47, 22 December 2005 (UTC)
I paste them into the Find field in AutoWikiBrowser and run it on: User:Mairi/Date formatting. The first, third and fourth passes seem to work for me. The second pass does not work because it is faulty. Replace the second pass with:
Second pass delinks month/year combination like [[January 2002]]:
Ah, thanks. Could you just clarify what the best regexes are, as the 4th pass doesn't have a "replace" now, and do you use Mairi's change? thanks Martin00:21, 23 December 2005 (UTC)
Yes, I use Mairi's change. The 4th replace is identical to the 3rd. To be explicit, here it is:
First pass:
Search: (?i)\[\[(\d*.. century|\d{3,4}s|January|February|March|April|May|June|July|August|September|October|November|December|(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?)\]\]
Replace: $1
Second pass:
Search: (?i)\[\[(January|February|March|April|May|June|July|August|September|October|November|December) (\d{3,4})\]\]
Replace (there is a space character before the second dollar character): $1 $2
Third pass:
Search: (?i)(?<![yhletr\d]\]\]\s*,?\s*)\[\[(\d{3,4})\]\](?!-|\s*,?\s*\[\[[jfmasond\d])
Replace: $1
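To see what the first two passes do, they can be tried outside AWB. The sketch below ports them to Python under two assumptions: the .NET replacement tokens `$1` and `$2` are written as `\1` and `\2`, and the sample sentence is invented for the test. The third pass is left out because its lookbehind contains `\s*`, and Python's built-in `re` module only accepts fixed-width lookbehinds, so it cannot run there unchanged.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# First pass: centuries, decades, bare months, days of the week.
PASS_1 = (r"(?i)\[\[(\d*.. century|\d{3,4}s|" + MONTHS +
          r"|(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)days?)\]\]")

# Second pass: month/year combinations such as [[January 2002]].
PASS_2 = r"(?i)\[\[(" + MONTHS + r") (\d{3,4})\]\]"

text = ("In the [[19th century]], on a [[Monday]] in [[January 2002]], "
        "during the [[1990s]]; see [[January 11]], [[2005]].")

after_pass_1 = re.sub(PASS_1, r"\1", text)
after_pass_2 = re.sub(PASS_2, r"\1 \2", after_pass_1)

print(after_pass_2)
```

The full date `[[January 11]], [[2005]]` is left linked by both passes, which is the behaviour date preferences rely on.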
Please do not use your tool in order to enforce style decisions that do not have broad consensus. Some authors (me included) like to link years in order to point readers to background about the discussed period; other authors do not. This is an issue similar to British English vs. American English: it should be left to individual editors; marching in with overpowering technology is out of place. Thanks, AxelBoldt18:22, 23 December 2005 (UTC)
After further tests with the latest version of the date regex in the find/replace box, as shown above, I find there are some interesting limits. Here is a diff showing the net effect of 4 passes: diff.
The points I note are:
There seem to have been no false positives -- that is, no dates were unlinked where date preferences would have functioned.
[[19th century]] was not unlinked.
Single years followed by hyphens without spaces were not unlinked, while single years followed by endashes or emdashes implemented as html entities were.
Years with fewer than 4 digits were not unlinked.
Day of week abbreviations [[Mon]], [[Tues]], [[Wed]], [[Thurs]], [[Fri]], [[Sat]], and [[Sun]] were not unlinked.
Thanks for the extensive testing. You provided an important benefit. Taking your points in turn:
The absence of false positives is very welcome.
The failure to delink [[19th century]] is because that regex is case sensitive. The proposed new regex is case insensitive. That is what the (?i) piece at the beginning of the regex does.
Single years followed by hyphens are a known 'miss'. A hyphen is used in ISO format dates ([[2005]]-[[02-30]]). So it avoids dates with a hyphen. The new regex will reduce the number of misses a bit further. The ndash and mdash are not part of an ISO date so it does not need to avoid them.
The first regex only acted on 4 digit years. The proposed new regex will act on 3 or 4 digit years.
There are no plans to act on abbreviations of days of the week. If they are common, more work could be done. The proposed new regex will increase the scope to include plurals ([[Mondays]]) and be case insensitive to act on ([[monday]] etc).
Many thanks. That was useful. A test like that of the new one when it comes out will be very welcome. The more people testing and using it, the better. Bobblewik23:53, 23 December 2005 (UTC)
First impressions are that it is much better. However, with year pairs separated by a hyphen ([[1990]]-[[1995]]), it still only captures one of them. The 4th pass was supposed to capture the second of the pair. Hmm. Bobblewik00:09, 24 December 2005 (UTC)
Ah, I see about the ISO format dates. Note that the guideline now recommends single-part ISO format, such as [[2005-02-30]], but the two-part form is still valid and must be allowed for. However, in an ISO date the year is always followed by a hyphen, then a 2-digit number, then another hyphen -- [[1990]]-[[1995]] is not a valid ISO format. Whether the regex can reliably be tweaked to make this distinction I'm not sure -- note that the problem was with the year followed by the hyphen, i.e. the first year in the range. This is far more likely to come up in biography articles than the reverse: birthdates known only to a year while death dates are exact are common, while the reverse is rather less common. DES(talk)00:51, 24 December 2005 (UTC)
It is possible to improve the regex. It could look for [[1990]]-[[1995]] but it would have to avoid [[1990]]-[[1995]]-[[02-30]]. Determining the topic of an article (biography), or of the link (birth) is unlikely to be part of an efficient solution. There are a *lot* of permutations of false positives and misses. Having a pair of dates with only one linked looks very bad. A solution must be found. I don't think it will be that difficult, but I just do not know how to do it. I am hoping somebody else does. Bobblewik13:32, 24 December 2005 (UTC)
I wasn't suggesting that looking for biographies should be part of the solution, just that that was a context in which the issue commonly arises. DES(talk)01:07, 26 December 2005 (UTC)
Additional false negative cases:
Piped dates, such as [[1923]]—[[1927|7]] or [[1923]]—[[1934|34]] are not converted. This is not infrequently done to express a date range.
Lists of dates in serial comma form, such as [[1987]], [[1992]], [[1995]], and [[2002]] -- only the last date is delinked. I presume this is due to the logic for detecting [[1995]], [[23 March]].
You can see my latest test with version 0.9.9.5 in this edit
Anyone may feel free to use User:DESiegel/Date Test for tests, please revert to a fully linked version after performing any tests. DES(talk)01:07, 26 December 2005 (UTC)
Proposed improvements (tackles years in one pass instead of two, includes consecutive linked years, includes piped years): The third pass would become:
Search: (?i)(?<!(?:january|february|march|april|may|june|july|august|september|october|november|december| \d{1,2})\]\]\s*,?\s*)\[\[(?:(\d{3,4})|\d{3,4}\|(\d{1,2}))\]\](?!-\[\[\d\d-|\s*,?\s*\[\[(?:january|february|march|april|may|june|july|august|september|october|november|december|\d{1,2} ))
Replace: $1$2
Note that the search regex contains two space characters.
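The proposed third pass can be tried against the cases raised in this thread with the following sketch. It is written in Python as an assumption, not as a description of what AWB does internally. Because Python's re module rejects the variable-length lookbehind used above, the sketch moves that check into a second pattern (DATE_BEFORE, a name invented here) that is tested against the text preceding each match; the link pattern and the lookahead are otherwise as proposed.

```python
import re

MONTHS = ("january|february|march|april|may|june|july|august|"
          "september|october|november|december")

# A linked (optionally piped) year plus the negative lookahead, as proposed.
YEAR_LINK = re.compile(
    r"\[\[(?:(\d{3,4})|\d{3,4}\|(\d{1,2}))\]\]"
    r"(?!-\[\[\d\d-|\s*,?\s*\[\[(?:" + MONTHS + r"|\d{1,2} ))",
    re.IGNORECASE,
)

# Stand-in for the lookbehind: does the text before the match end with a
# linked month or day, optionally followed by whitespace and a comma?
DATE_BEFORE = re.compile(
    r"(?:" + MONTHS + r"| \d{1,2})\]\]\s*,?\s*\Z", re.IGNORECASE
)


def delink_years(wikitext):
    """Delink solitary years; leave years that belong to a full date."""
    def replace(match):
        if DATE_BEFORE.search(wikitext[:match.start()]):
            return match.group(0)  # preceded by a linked day/month: keep
        # Equivalent of "$1$2": exactly one of the two groups is set.
        return match.group(1) or match.group(2)
    return YEAR_LINK.sub(replace, wikitext)
```

With this sketch, [[1990]]-[[1995]] loses both links, the piped and serial-comma cases are delinked, and the two-part ISO form [[2005]]-[[02-30]] is left alone.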
Martin, can you confirm if this is the regex version in AutoWikiBrowser? I just tested 0.9.9.5 and it does not appear to be. Bobblewik20:39, 26 December 2005 (UTC)
I could not before. Perhaps it was something to do with page caching. I have version 1.0 now and from a quick check, it seems to be working as expected. Bobblewik01:31, 27 December 2005 (UTC)
User approved list
I think it is letting me edit even if my username isn't on the approved list (I checked it using another one of my accounts). BrokenS22:04, 19 December 2005 (UTC)
This is an awesome tool. I have one request for instruction (or a feature request...): I tried setting the "New category" to a blank box, but it's too clever. Is there a way I can carry out a blanket category removal without replace? -Splashtalk22:35, 19 December 2005 (UTC)
It's an annoying problem; so far I haven't been able to narrow down the problem, or reproduce it at all in such a way as to make the problem visible. Thankfully it only affects very few articles. Martin19:24, 20 December 2005 (UTC)
which is clearly wrong. I don't know why it does this; seeing as Firefox doesn't have the problem, maybe it's a bug in IE. Martin12:29, 21 December 2005 (UTC)
Not in my version of IE (IE6.0.2900.2180) it doesn't. I shift-clicked on that first edit URL and the correct page came up just fine: interestingly the URL in the address bar remains the same. The second URL does the same (although obviously showing the different URL in the address bar). So there's obviously something cronky going down :-( Phil | Talk13:25, 21 December 2005 (UTC)
Try typing the URLs into the browser (or copy and pasting at least); that has a different result to clicking on the link, which further indicates that something fishy is going on. It doesn't affect all pages with unusual fonts in the title so it isn't a massive problem. Martin13:42, 21 December 2005 (UTC)
I am not sure how AutoWikiBrowser works. Can you advise?
The date delinking regex has a lot of 'misses'. It is difficult to distinguish between [[11 January]] [[2005]] and [[January]] [[2005]], so I simply avoid consecutive links. Thus it will delink the January but it leaves the 2005 intact. There are many other permutations that I miss with the huge regex. Unless I do all articles twice, there are lots of misses.
If it operated sequentially, I could do a lot more. For example, it could tackle day, month, decade, century links first. I do not have to check for consecutive links in those cases. Then if it did another search of the same article, a search for year links could be more focussed and effective.
I expect that AutoWikiBrowser simply has a huge regex. But I don't actually know what it does. Does it, or could it, go through the article more than once? Bobblewik23:45, 20 December 2005 (UTC)
There is no huge regex; the one I copied from you is the biggest one. It can run through an article as many times as you want. If you create the regexes you want to use then I can implement them reasonably easily. Martin23:51, 20 December 2005 (UTC)
Would it be a bad idea to make a "lite" version (with limited features) for people without authorization? --Ixfd6402:36, 21 December 2005 (UTC)
If you can cycle through articles quickly it is a good tool to vandalise with, and if I disabled that then it wouldn't really have any benefit. Thanks Martin09:49, 21 December 2005 (UTC)
BUG: another limitation
The article In The End: Live & Rare just came up for processing, but AWB can't deal with it.
It keeps trying to load the article In The End: Live which obviously gives it gyp.
The list contains "In The End: Live & Rare" which doesn't necessarily help: fixing it did nothing.
Another encoding problem?
HTH HAND Phil | Talk15:56, 21 December 2005 (UTC)
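One possible explanation for the truncated title, offered only as an assumption since the thread does not confirm the cause: if the title is placed into a URL query string without escaping, the "&" is read as a parameter separator and everything after it is lost. The Python sketch below illustrates the effect with a hypothetical query string (the parameter names title and action are chosen for illustration) and shows percent-encoding as the usual remedy.

```python
from urllib.parse import parse_qs, quote

title = "In The End: Live & Rare"

# Hypothetical query strings for an edit request; only the title differs.
raw_query = "title=" + title + "&action=edit"
escaped_query = "title=" + quote(title, safe="") + "&action=edit"


def title_received(query):
    """Return the title a server would read from the query string."""
    return parse_qs(query)["title"][0]
```

With the raw query, title_received returns only the part before the ampersand, which matches the symptom described above; with the escaped query the full title survives.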
How difficult would it be to arrange that we could search/replace more than one thing at a time?
I'm thinking that it would be nice if we could do these at the same time, maybe as a side-effect of other stuff:
I think what would be nice would be if the "general fixes" could be presented as an option list, available from the "Options" menu, so that we could pick/choose which we wanted to apply at any given time: I've been wondering exactly what was involved in this, and I generally simply switch it off. You could have a little table of "entity replacements" like the above to which we could add our favourites: it should be a snip to include a tick-box for each saying "yes, do this one". HTH HAND —Phil | Talk10:15, 22 December 2005 (UTC)
The idea of the general fixes is that they are minor things that can always be applied to main namespace articles, being able to turn them off individually would just add coding and usability complications. I have added the above conversions in now. Martin10:54, 22 December 2005 (UTC)
It would, however, be nice if there was a list of the general fixes available somewhere, so we knew what we are potentially doing. Also, if no change is found for the primary task (such as date unlinking) but a general fix is found, the edit comment will be misleading at best. Perhaps no general changes should be applied if a specific change is specified and not found? That may be too much work to be worthwhile; the user can always click "skip". DES(talk)20:39, 22 December 2005 (UTC)
Logging in
When I first started to use AWB yesterday, it somehow operated as User:205.210.232.62 (my usual IP when not logged in) until I realized this and clicked on the log-in link in the AWB browser window. Must one normally log in separately from AWB even if already logged in on another browser? Your demo video does not show this. Or did I do something incorrect? DES(talk)20:42, 22 December 2005 (UTC)
It only checks that you are logged in when you first start it, so if it became logged out it would carry on working, I'll change this at some point in the future. Also, it uses the Internet Explorer core, so if you are logged in in that then you will remain logged in in this program. Martin20:52, 22 December 2005 (UTC)
I use IE. I was logged in on an instance of IE when I started the program. As far as I know (but I did not verify absolutely) I did not become logged out in my regular IE sessions. But the first edit done using AWB was done logged out. It is possible that while I figured out how to do things my cookie expired, but when I switched back to an IE window I still seemed to be logged in -- it was an absence of the AWB edits in "My contributions" that prompted me to check for non-logged-in-ness of AWB. I report this for what it is worth, if anything. If it recurs repeatably I will let you know -- I know how hard an unrepeatable issue is to address, and I know we are all volunteers. Again, MANY thanks for AWB -- just downloaded ver 0.99 and I'm about to try it. DES(talk)21:11, 22 December 2005 (UTC)
At the moment the verification for logging in is pretty basic, I'll change it soon so it checks cookies, possibly on every edit. thanks. Martin21:15, 22 December 2005 (UTC)
I am getting problems with login. I am not at my usual computer. I was logged in to Wikipedia in my normal browser, then I started AutoWikiBrowser and tried to process a page. It complained that I was not logged in and showed the Wikipedia login page. I went to Wikipedia in my normal browser, logged out then logged in again. I closed AutoWikiBrowser and launched it again. But it gave the same symptoms. Bobblewik11:30, 29 December 2005 (UTC)
What is your normal browser? When it showed the Wikipedia login page, were you actually logged in then (i.e. did it have all the "Sign in / create account")? And did you try entering your login details in the window it loaded for you? Thanks Martin11:40, 29 December 2005 (UTC)
I use Firefox. When AutoWikiBrowser showed the login page, I was not logged in and it showed the "Sign in/create account" page. I entered my login details in AutoWikiBrowser and it worked. Bobblewik12:36, 29 December 2005 (UTC)
Really? It is the first time I have had to do it. It had me fooled for a while because I have followed the instructions not to click in the window. Well, at least I know the exception to the rule now. Thanks. Bobblewik13:01, 29 December 2005 (UTC)
When trying out AWB for the first time yesterday (I made a date-delink run on Category:Biography) I attempted to make a manual change on one article (correcting an incorrect category) in the AWB browser. It appeared as if this change was undone when I saved the article, but I am not sure. Are manual changes supposed to be saved? I gather that there is no way to do a diff or preview for manual changes at the moment, is this correct? DES(talk)20:45, 22 December 2005 (UTC)
It saves all manual changes, if you click the "Diff" button (or "preview") it will show you the extra changes you have made, but it will always save them. Martin20:50, 22 December 2005 (UTC)
Thanks. I was confused by the "but" in your wording. I now realize that I incorrectly made changes via the browser window above, not the edit box on the right, and that is why they were not saved. Sorry for the false alarm. DES(talk)23:05, 22 December 2005 (UTC)
Was it worth it?
I noticed that B/AWB recently made the earth-shattering change from
A truly Delphic response, Martin. It's not the outer limits represented by "a lot more than that" that bother me, however, but the inner pickiness of a routine that goes to all the bother of removing two spaces with the result, as far as I can see, of affecting what appears on screen not one jot... Is it not legitimate to comment upon that? -- Picapica00:49, 23 December 2005 (UTC)
It removes the spaces to make other changes easier, not specifically because it is directly a good thing to do. It is just as easy to click ignore as it is save. Martin00:53, 23 December 2005 (UTC)
Many thanks for the speedy response, Martin, but this "click ignore" is a new thing to me, even though I've been editing Wikipedia for what seems like yonks now. How does it work? And how precisely does your having removed spaces which make not one jot of difference to what appears on screen (see "weird edits" above) "make other changes easier"? -- Picapica01:06, 23 December 2005 (UTC)
The person running the script should tell the program to skip a page (click ignore) if the only change is something as minor as you described. Standard style is not to use the spaces, so if the program is doing something on the page it might as well fix small nitpicky things that aren't really worth making edits for. Have you even used the program? I think you are being a bit unfair; the program is really quite good. BrokenS05:29, 23 December 2005 (UTC)
Not really "unfair" (I hope), BrokenSeque, just plain ignorant: I don't even know what "running the script" means. Clearly I've stumbled into a parallel wikipedia world involving some kind of automated editing. Me, I just do the old-fashioned one-man look-think-and-if-necessary edit routine: I haven't come across anything in the "advice to editors" introductory pages dealing with "using programs" which "remove spaces to make other changes easier". I was just wondering why anyone/anything would go to the bother of carrying out makes-no-difference changes to articles when there is so much else that needs to be checked. I shall have to investigate further... Merry Christmas, anyway. -- Picapica16:16, 25 December 2005 (UTC)
OK, I'll just clarify a few things:
"Running the script" means running the program (it isn't a script, but it's just a technical difference).
The software is new and still being developed, so it's not a surprise you haven't heard about it.
We still do the old fashioned way of editing too!
It isn't designed for making trivial changes; it is designed to make repetitive tasks easier (e.g. stub sorting, re-categorisation...) and thus leave more time to do editing the normal way!
"Ignore if contains" does not appear to work if the text is in the title.
It would be nice to have the ability to have two fields: one to match the title text; and another field (as now) to match body text. [unsigned comment by Bobblewik]
A reasonable assumption but not true for all articles. This page contains User talk:Bluemoose in the title but not the body. Not a very good example, but it disproves the assumption. The "ignore if contains" field is a way of avoiding false positives. Processing a page that has relevant text in the title would be a false positive. It would be rare though. Bobblewik00:38, 23 December 2005 (UTC)
I asked User:Bobblewik about this on his talk page, and he directed me here. In removing standalone year links, there are some which should not be removed because the article on the year contains a factoid about the original article, and (especially in earlier years) it helps frame the current article with other events of the same year (century links would apply, also). The example I cited is the link to 1117 from Mii-dera.
So, since the Special:Whatlinkshere/Mii-dera shows the reverse link from 1117, if AWB can limit its date regexp matches to anything NOT found in the Whatlinkshere pages, it would prevent the wrongful removal of years. If it can't, then anyone using AWB for date link removal needs to be careful not to remove anything important.
It is possible, but overly complicated and beyond the scope of this software, it would also use an enormous amount of bandwidth. sorry Martin09:14, 23 December 2005 (UTC)
So, basically we have several projects that are busily linking dates, and then this one de-linking them. Why? Somebody just delinked 1967 as the year of the Six-Day War. Not useful!
What we really need is making sure that any year near any month and day is linked. Please don't de-link dates.
This isn't a project to de-link dates. It is a piece of software that can be used to help with some tasks. Also, I don't know of any project to link dates; if there is, it is going against guidelines. Martin12:45, 30 December 2005 (UTC)
Extend functionality of replace specification
I'm floundering a bit for the actual name for this, but it'll probably come to me just after I click "Save".
What I'd like is to be able to specify sub-expressions in the Search box and refer to them in the Replace box, like in MS Word. For example:
Search for : {{sodium}}<sub>([01-9]*)</sub>
Replace with: {{sodium|\1}}
Regular expressions can handle this type of thing. Try:
Search for : {{sodium}}<sub>([01-9]*)</sub>
Replace with: {{sodium|$1}}
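The same capture-and-reuse idea, sketched in Python for readers who want to experiment outside AWB (an assumption; AWB uses the .NET engine, where the backreference is written $1, while Python writes it \1). The function name fold_subscript is made up for this example, and the braces are escaped in the pattern to keep them unambiguous.

```python
import re

# The parenthesised part is the sub-expression; \1 in the replacement
# inserts whatever it matched.
SODIUM_SUB = re.compile(r"\{\{sodium\}\}<sub>([0-9]*)</sub>")


def fold_subscript(wikitext):
    """Move the subscript that follows {{sodium}} into the template call."""
    return SODIUM_SUB.sub(r"{{sodium|\1}}", wikitext)
```

For instance, fold_subscript("{{sodium}}<sub>2</sub>") gives "{{sodium|2}}".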
Template replacement
Do you think it would ever be possible for AWB to be taught to understand templates and parameters? This would make it much easier to mass-change templates which have parameters renamed, or which have been moved or replaced.
So given a template name, it would look for
{{template name
and then search for the matching "}}".
If you told it how many parameters and what the new name for each one should be, it could match up the number of "|" symbols.
Obviously this is a bit blue sky right now but I thought I'd set it down for consideration.
HTH HAND —Phil | Talk10:58, 23 December 2005 (UTC)
Certainly worth thinking about. The problem with this type of thing is reliability; I'll look into it though. Martin11:32, 23 December 2005 (UTC)
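To make the suggestion concrete, here is a rough Python sketch of the scan described above: find the opening "{{template name", walk forward to the matching "}}" while tracking nesting, and split the parameters on top-level "|" symbols. This is an illustration under stated assumptions, not how AWB works; the function names are invented, matching is by simple prefix (so "{{infobox2" would also be found when searching for "infobox"), and constructs such as triple braces or nowiki sections are ignored, which is part of the reliability concern.

```python
def find_template(text, name):
    """Return (start, end, parts) for the first {{name...}} call, or None.

    parts[0] is the template name; the rest are the raw parameters.
    """
    start = text.find("{{" + name)
    if start == -1:
        return None
    braces = links = 0
    parts = []
    piece_start = start + 2
    i = start
    while i < len(text):
        pair = text[i:i + 2]
        if pair == "{{":
            braces += 1
            i += 2
        elif pair == "}}":
            braces -= 1
            if braces == 0:  # this closes the template we started in
                parts.append(text[piece_start:i])
                return start, i + 2, parts
            i += 2
        elif pair == "[[":
            links += 1
            i += 2
        elif pair == "]]":
            links = max(links - 1, 0)
            i += 2
        else:
            # Only a "|" outside nested templates and links separates parameters.
            if text[i] == "|" and braces == 1 and links == 0:
                parts.append(text[piece_start:i])
                piece_start = i + 1
            i += 1
    return None  # no matching "}}" found


def rename_template(text, old, new):
    """Replace the template name, keeping the parameters unchanged."""
    found = find_template(text, old)
    if found is None:
        return text
    start, end, parts = found
    return text[:start] + "{{" + "|".join([new] + parts[1:]) + "}}" + text[end:]
```

Renaming individual parameters would work the same way: edit the entries of parts before joining them back together.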
License, source code?
Under what license is this software published? I take it from discussion on this page that the source code is not available? AxelBoldt18:19, 23 December 2005 (UTC)
Normally, IME such things are published under some version of shareware or freeware -- that is, the creator retains copyright but allows others to freely use the software. Something like GPL in intent but less formal. Often just "Copyright by PersonX, permission to use is granted, provided..." with whatever restrictions the creator wants -- non-removal of copyright notice, non-commercial use, whatever. In this case I would suggest limiting use to within-policy edits by approved users on Wikipedia, and making the license revocable on notice. That gives you all the protection you might plausibly need. But it is up to you, as the creator, to set your terms. DES(talk)21:34, 23 December 2005 (UTC)
If you want to release it into the public domain, you have to actively do it (with a note to that effect somewhere on the AutoWikiBrowser page for instance); by default you retain all copyrights and nobody but you is allowed to copy the program. AxelBoldt19:02, 24 December 2005 (UTC)
en dash to hyphen change?
Comment moved from User talk:Bobblewik Your bot is changing birth and death dates from use of an en-dash to a hyphen. Not only is this change grammatically incorrect, it seems like a rather controversial change to assign to a bot. Where did the decision to do this gain consensus?--Alabamaboy21:00, 23 December 2005 (UTC)
Aha. So there was no change. He must be assuming that the ndash he sees on the screen is a hyphen. It is an easy mistake to make, both characters look the same to me too. Bobblewik21:41, 23 December 2005 (UTC)
He just went through and changed all the &ndash;s and &mdash;s in Butter, too. I realize either style is allowed, but I prefer the former and find this change slightly disruptive. —Bunchofgrapes (talk)
I don't see how that is anything other than helpful; if you could explain otherwise it would be good (p.s. "it" is a "he") Martin22:29, 23 December 2005 (UTC)
It seems to be changing the html entities &ndash; and &mdash; to the equivalent actual characters. This change has no effect on display, but does have an effect on later editing. Using the html entity makes it clear to an editor which kind of dash has been used; using the actual character makes this much harder for a later editor to determine. DES(talk)22:32, 23 December 2005 (UTC)
I'll remove this option from the next version then, although I wish the devs would create some wiki markup to make these characters, as the html looks terrible, and 90% of people would have no idea what it is. Martin11:56, 24 December 2005 (UTC)
Yes I agree that html versions are difficult to read and should be made easier if possible. I have long thought that this should be applied to superscript characters. There was a discussion in the Manual of Style (now archived) about this. My main concern was that it should be browser independent and accessible (e.g. to screen readers). I think there was agreement for editing superscript html to something else but it got a bit too technical for me. Bobblewik13:58, 24 December 2005 (UTC)
This is a problem with the regex used to find the dates, see the above section "Publishing the date delinking regex". thanks Martin10:29, 26 December 2005 (UTC)
Thank you very much for the bug report. Just to be more explicit: We are aware of the problem and can replicate it easily in test articles. So we don't need those particular articles for debugging, feel free to delink them properly if you wish. The problem is solvable and we are working on it. But we have not yet solved it. Join in the "Publishing the date delinking regex" discussion. Bobblewik16:34, 26 December 2005 (UTC)
Minor bug in .995 Template alpha sort
Looks like it's taking out template links when it alpha sorts during cleanups. See: [6]
It didn't remove them, but put them at the bottom of the page, because it thought they were stubs. I'll get it to ignore stub templates that are of the form {{tl|mil-vehicle-stub}}. Thanks Martin09:47, 27 December 2005 (UTC)
In addition to the ordering and the alphabetical listing of categories and language links, it would be good if the same could be done with {{Link FA}} templates. Even though a lot of articles have none, or so few that it isn't worth it, there are some articles that have more than enough to make it worthwhile, especially as it will fall into wider usage. JtkieferT | C | @ ---- 09:59, 27 December 2005 (UTC)
One place to choose binary options. For example "Apply general fixes" and "Remove all date links" are binary choices.
One place to set "Search in namespace:" options. The current options would presumably be "Template" and "Main". Instead of removing unwanted articles after the search, the user would simply modify the default options before the search.
Arrangement of widgets:
The 'Save' and 'Ignore' buttons are like 'OK' and 'Cancel'. It might be nice to arrange the buttons so that they are aligned along the bottom *after* the 'Summary'. OK should be to the left of Cancel, so 'Save' would be to the left of 'Ignore'.
The 'Category' field seems to me to be the first logical thing and I think it should be above the 'Make from' field (which seems to me to be the second thing).
The 'Make list' button seems to me to be just like a 'Search' button. Perhaps that might be a more obvious label.
The 'Diff' button seems to have the same function as the 'Show changes' button in the Wikipedia editor. The latter may be a more obvious label.
The 'Start!' button seems to me to be just like an 'Edit' button. Perhaps that might be a more obvious label, even though it also does other things.
The 'Messaging' bit fooled me when we first started. If it deals with the talk page, perhaps we should say that explicitly like 'Edit talk page'.
The list can get very long, it would be nice to be able to increase its length and width, perhaps with scroll bars and/or by actually increasing the size.
Thanks for the suggestions, the problem with a few of them is simply not having the space (e.g. "Show changes" is just too long to put on the button). Also the list box does have scroll bars. I'll see what I can do though! Martin12:30, 27 December 2005 (UTC)
I agree there is limited space. That is partly why I suggested copying the Wikipedia edit layout and putting the buttons horizontally. The whole thing might look more immediately recognisable if it copied elements of the Wikipedia edit design. You could put search functions at the top and/or left (perhaps using the whole vertical space) and edit functions at the bottom. Bobblewik13:33, 27 December 2005 (UTC)
Thanks. Nice improvements of detail. I also just noticed the standard summaries, I was going to suggest that after somebody complained to me about using a generic summary. If we could do it all without tabs, that would be nice too. Bobblewik17:44, 27 December 2005 (UTC)
another minor bug
I've only tested it when changing over lists of articles from one category to another, but when using make list with a category with an ampersand (&) in it, it doesn't give any results despite the fact there are articles in the category. JtkieferT | C | @ ---- 22:52, 27 December 2005 (UTC)
One of my favorite cleanups is to remove all occurrences of a link after the first one. Is this straightforward enough to do? I could write it in VFP (and I still might), but if it were included in AWB, it would make things easier for me.
That would be a great feature. A complication would be that it might have to avoid double links used for date preferences. Bobblewik11:30, 28 December 2005 (UTC)
Good point, to start with I'll make it so it tells you where the extra links are, rather than actually removing them automatically. Martin11:36, 28 December 2005 (UTC)
A crude way to avoid valid configurable dates would be to avoid any link that contains either a digit or a complete month name. If you publish the regex, we could work together to make it less crude. Bobblewik16:32, 28 December 2005 (UTC)
I feel that in a long article it is often valid to link to the same destination again after several paragraphs, particularly after more than one screen-full. So I think that "removal of redundant links" would be better as a pointer than an automated tool. DES(talk)16:46, 28 December 2005 (UTC)
The regexes do not seem to be using "classic" basic regular expression syntax, as described in Regular expression, but some extension (in particular "(?i)" for case insensitive does not appear in any of the versions in our article). Exactly which version of regular expressions does AWB use, and is the syntax documented anywhere online? DES(talk)16:52, 28 December 2005 (UTC)
According to MS it is very similar to the Perl 5 implementation. I have used this page for a few tips, but I'm not really knowledgeable on them. There is an extra parameter in C# that can specify the regex to be case insensitive, if that helps. Martin17:00, 28 December 2005 (UTC)
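A sketch of the "pointer" approach discussed in this thread, in Python and purely illustrative (the names below are invented, and this is not AWB's code): it reports where a link target is repeated instead of removing anything, and it applies the crude safeguard suggested above by skipping any target that contains a digit or a complete month name. It assumes that the first letter of a link target is case-insensitive, as it is for article titles on the English Wikipedia.

```python
import re

MONTH = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\b",
    re.IGNORECASE,
)

# Group 1 is the link target, up to any "|" (pipe) or "#" (section).
LINK = re.compile(r"\[\[([^\[\]|#]+)[^\[\]]*\]\]")


def repeated_links(wikitext):
    """Return (offset, target) for each link whose target was linked before."""
    seen = set()
    repeats = []
    for match in LINK.finditer(wikitext):
        target = match.group(1).strip()
        if re.search(r"\d", target) or MONTH.search(target):
            continue  # may be a date link used for date preferences
        key = target[:1].upper() + target[1:]
        if key in seen:
            repeats.append((match.start(), target))
        else:
            seen.add(key)
    return repeats
```

The offsets let a human editor jump to each repeat and decide whether a second link is justified, for example in a long article.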
I am seeing quite a few misses related to split dates like: [[January]] [[18th]], [[January]] [[18]] and [[18th]] and [[19th century|19th centuries]]. I may want to update the regex to cope with these if that is ok. Bobblewik14:53, 29 December 2005 (UTC)
Martin, the regex section on this talk page is quite complicated now. Would you be kind enough to confirm what is actually being used? Also, do you think that there would be any benefit in dividing the current single date delink option into multiple options (e.g. delink solitary months, delink centuries, delink solitary days etc)? Bobblewik18:34, 1 January 2006 (UTC)
Template subst'ing
AWB's description says it will add auto template subst'ing in the future. Using WP:SUB, I have come up with this regexp:
Replace:
{{(bio-cats)}}|{{(clear)}}|{{(clearleft)}}|{{(clearright)}}|{{(copyvio)}}|{{(lived)}}|{{(Lifetime)}}|{{(lifespan)}}|{{(Prettytable)}}|{{(sub)}}|{{(sup)}}|{{(moved)}}|{{(moved-n)}}|{{(tmfrom)}}|{{(tmto)}}|{{(unsigned)}}|{{(unsigned2)}}|{{(3RR)}}|{{(3RR2)}}|{{(3RR3)}}|{{(nn-warn)}}|{{(nothanks)}}|{{(nothanks-sd)}}|{{(obscene)}}|{{(selftest)}}|{{(test-n)}}|{{(test2-n)}}|{{(test2a)}}|{{(test2a-n)}}|{{(test2b)}}|{{(test3-n)}}|{{(test4a)}}|{{(test4-n)}}|{{(test)}}|{{(test0)}}|{{(test1)}}|{{(test2)}}|{{(test2a)}}|{{(test3)}}|{{(test4)}}|{{(test5)}}|{{(test6)}}|{{(blatantvandal)}}|{{(bv)}}|{{(attack)}}|{{(No personal attacks)}}|{{(Npa)}}|{{(Npa2)}}|{{(Npa3)}}|{{(Npa4)}}|{{(blanking1)}}|{{(blanking2)}}|{{(blanking3)}}|{{(blanking4)}}|{{(drmafd)}}|{{(drmafd2)}}|{{(drmafd3)}}|{{(drmafd4)}}|{{(drmafd5)}}|{{(MIPblock)}}|{{(multipleIPs)}}|{{(spam)}}|{{(spam2)}}|{{(spam2a)}}|{{(spam3)}}|{{(spam4)}}|{{(vanity)}}|{{(vblock)}}|{{(verror)}}|{{(verror2)}}|{{(verror3)}}|{{(verror4)}}|{{(Edit summary personal)}}|{{(Editsummarynew)}}|{{(sofixit)}}|{{(Summary)}}|{{(Edit summary)}}|{{(name your images)}}|{{(image source)}}|{{(image copyright)}}|{{(subst)}}|{{(SharedIP)}}|{{(ISP)}}|{{(AOL)}}|{{(repeat vandal)}}|{{(anon vandal)}}|{{(vw)}}|{{(Award)}}|{{(newvoter)}}|{{(welcome)}}|{{(welcome2)}}|{{(welcome3)}}|{{(welcome4)}}|{{(welcomeip)}}|{{(anon)}}|{{(Album Image)}}|{{(afd)}}|{{(afd2)}}|{{(afd3)}}|{{(afd|bottom)}}|{{(afd|top)}}|{{(tfd2)}}|{{(tfdnotice)}}|{{(ifd)}}|{{(ifd2)}}|{{(idw)}}|{{(idw-uo)}}|{{(idw-pui)}}|{{(idw-cp)}}|{{(cfd)}}|{{(cfd2)}}|{{(cfdu)}}|{{(cfr)}}|{{(cfr2)}}|{{(cfru)}}|{{(cfm)}}|{{(cfd-article)}}|{{(cfr-speedy)}}|{{(tfd-keep)}}|{{(Actinium)}}|{{(Aluminum)}}|{{(Americium)}}|{{(Antimony)}}|{{(Argon)}}|{{(Arsenic)}}|{{(Astatine)}}|{{(Barium)}}|{{(Berkelium)}}|{{(Beryllium)}}|{{(Bismuth)}}|{{(Bohrium)}}|{{(Boron)}}|{{(Bromine)}}|{{(Cadmium)}}|{{(Caesium)}}|{{(Calcium)}}|{{(Californium)}}|{{(Carbon)}}|{{(Cerium)}}|{{(Chlorine)}}|{{(Chromium)}}|{{(Cobalt)}}|{{(Copper)}}|{{(Curium)}}|{{(Darmstadtium)}}|{{(Dubnium)}}|{{(Dysprosium)}}|{{(Einsteinium)}}|{{(Erbium)}}|{{(Europium)}}|{{(Fermium)}}|{{(Fluorine)}}|{{(Francium)}}|{{(Gadolinium)}}|{{(Gallium)}}|{{(Germanium)}}|{{(Gold)}}|{{(Hafnium)}}|{{(Hassium)}}|{{(Helium)}}|{{(Holmium)}}|{{(Hydrogen)}}|{{(Indium)}}|{{(Iodine)}}|{{(Iridium)}}|{{(Iron)}}|{{(Lanthanum)}}|{{(Lawrencium)}}|{{(Lead)}}|{{(Lithium)}}|{{(Lutetium)}}|{{(Magnesium)}}|{{(Manganese)}}|{{(Meitnerium)}}|{{(Mendelevium)}}|{{(Mercury)}}|{{(Molybdenum)}}|{{(Neodymium)}}|{{(Neon)}}|{{(Neptunium)}}|{{(Niobium)}}|{{(Nitrogen)}}|{{(Nobelium)}}|{{(Osmium)}}|{{(Oxygen)}}|{{(Palladium)}}|{{(Phosphorus)}}|{{(Platinum)}}|{{(Plutonium)}}|{{(Polonium)}}|{{(Potassium)}}|{{(Praseodymium)}}|{{(Promethium)}}|{{(Protactinium)}}|{{(Radium)}}|{{(Radon)}}|{{(Rhenium)}}|{{(Rhodium)}}|{{(Roentgenium)}}|{{(Rubidium)}}|{{(Ruthenium)}}|{{(Rutherfordium)}}|{{(Samarium)}}|{{(Scandium)}}|{{(Seaborgium)}}|{{(Selenium)}}|{{(Silicon)}}|{{(Silver)}}|{{(Sodium)}}|{{(Strontium)}}|{{(Sulfur)}}|{{(Tantalum)}}|{{(Technetium)}}|{{(Tellurium)}}|{{(Terbium)}}|{{(Thallium)}}|{{(Thorium)}}|{{(Thulium)}}|{{(Tin)}}|{{(Titanium)}}|{{(Tungsten)}}|{{(Ununbium)}}|{{(Ununhexium)}}|{{(Ununoctium)}}|{{(Ununpentium)}}|{{(Ununquadium)}}|{{(Ununseptium)}}|{{(Ununtrium)}}|{{(Uranium)}}|{{(Vanadium)}}|{{(Xenon)}}|{{(Ytterbium)}}|{{(Yttrium)}}|{{(Zinc)}}|{{(Zirconium)}}|{{(WP:RM)}}|{{(Move2)}}|{{(TFAfooter)}}|{{(article)}}|{{(See also)}}|{{(ll)}}|{{(language link)}}|{{(ed)}}|{{(doctl)}}
I just tried it. First, you have to use the g and i flags. Second of all, you can't use $1. $1 only works with the first subexpression, $2 with the second, etc. Is there some way to do an arbitrary "$x", where x changes? — MATHWIZ2020TALK | CONTRIBS20:22, 29 December 2005 (UTC)
Don't worry about it, I have already made it do substing, just not on the release version, as there isn't much need for it yet, and I haven't tested it completely. Martin21:01, 29 December 2005 (UTC)
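One possible way around the "$x" problem, sketched here in Python rather than in AWB's own C#: instead of giving every template name its own subexpression, wrap the whole alternation in a single group, so that the first backreference always holds whichever name matched. The template list below is a small hypothetical subset of the long list above, and the `{{subst:...}}` replacement is an assumption about what the substing is meant to produce.

```python
import re

# Hypothetical subset of the template names from the long list above.
TEMPLATES = ["afd", "afd2", "cfd", "Hydrogen", "See also"]

# One capturing group around the whole alternation, so group 1 is always
# the template name that matched, whichever alternative it was.
PATTERN = re.compile(
    r"\{\{(" + "|".join(re.escape(name) for name in TEMPLATES) + r")\}\}"
)

def subst_templates(text):
    """Turn {{name}} into {{subst:name}} for the listed templates only."""
    return PATTERN.sub(r"{{subst:\1}}", text)

print(subst_templates("{{afd}} some text {{Hydrogen}} {{other}}"))
```

Templates that are not in the list, such as `{{other}}` here, are left untouched.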
Interwiki link sorting
Hello. I see you've been sorting interwiki links by their language codes. That's not good, because English Wikipedia uses a different sorting order where links are sorted alphabetically, based on local language (for the correct order, see this page, second option).--Jyril21:12, 29 December 2005 (UTC)
The whole reason that page exists is to show the different options. There has been no agreement that we should use any particular one of them, and it has been discussed somewhere. Why do you say that we should use the second option?
In fact, if you follow the link in the See also section on that page to Wikipedia:Language order poll, you will see that there has been no consensus on English Wikipedia, and that the second option (of the Meta page, the 5th option on the language order poll) is not the most preferred one. Gene Nygaard21:53, 29 December 2005 (UTC)
Alphabetical was the most popular choice, and at the moment most pages are just in random order, so any order is better. Martin22:19, 29 December 2005 (UTC)
But by two letter code—that's what was most popular—and that's the way Bobblewik was doing it, not the way Jyril was telling him he was supposed to be doing it. Gene Nygaard22:47, 29 December 2005 (UTC)
I just say that the two letter code sort is really awkward if your language happens to be in a completely wrong place. I thought that there was an agreement about the issue, since most wikilink lists I've seen seem to follow the order I recommended. I checked a few articles, and saw that Bobblewik's edits changed language name sorting to two-letter code sort. If AWB did this, I beg that this feature is removed.--Jyril00:36, 30 December 2005 (UTC)
Yes, but what if the language doesn't use the Roman alphabet? Seems to me that sorting on the 2-letter code is the only really practical solution.--SarekOfVulcan00:39, 30 December 2005 (UTC)
Use the transliterated name (for example, Nihongo for ja:), or check the list. And if you're not sure, use the 2-letter code and let bots handle the sorting.--Jyril00:55, 30 December 2005 (UTC)
When I coded it, I had to choose one order, so I just chose the order that was most popular, I know there is no consensus, but I don't follow the logic that this means I should use the second most popular choice. Martin10:45, 30 December 2005 (UTC)
While this isn't the place to make this debate, I think it's better to say it here than in a poll that hasn't been updated in four months. If AWB works on other language wikis (I don't know -- AWB doesn't work on my computer), then I don't think there is any choice but to order by 2-letter code. Sometimes, I edit in the Japanese wiki; and, I don't want to have to remember that in Japanese English (or, eigo) comes after Italian (Itariago). (Japanese ordering starts out a i u e o and progresses on). I doubt very much that Martin wants to make AWB aware of all the different language ordering options either. The 2-letter code was developed for a reason, and I think putting things in a globally accepted abcdefg order is not a large sacrifice. (While the aiueo order is standard for Japanese syllables, they still order the roman letters in the same way as everyone else). Neier11:11, 30 December 2005 (UTC)
(Just a clarification -- my comments above are about editing the source of an article. The different interwikis all sort the language links as they feel appropriate when they are displayed; and that is the way it should be.) Neier11:14, 30 December 2005 (UTC)
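To make the two orderings in this thread concrete, here is a minimal Python sketch (not AWB's actual code) that sorts the same interwiki links once by language code and once by local language name. The `LOCAL_NAMES` table is a tiny hypothetical stand-in for the full list on the Meta page, and the link titles are only illustrative.

```python
import re

# Matches a bare interwiki link such as [[es:Israel]] and captures the code.
INTERWIKI = re.compile(r"^\[\[([a-z]{2,3}(?:-[a-z]+)*):([^\]]+)\]\]$")

# Hypothetical, deliberately tiny table of local language names.
LOCAL_NAMES = {
    "de": "Deutsch",
    "eo": "Esperanto",
    "es": "Español",
    "fi": "Suomi",
    "ja": "Nihongo",
}

def sort_by_code(links):
    """Order links alphabetically by their language code."""
    return sorted(links, key=lambda link: INTERWIKI.match(link).group(1))

def sort_by_local_name(links):
    """Order links by the (transliterated) local name of each language."""
    def key(link):
        code = INTERWIKI.match(link).group(1)
        # Fall back to the code itself when the name is not in the table.
        return LOCAL_NAMES.get(code, code).lower()
    return sorted(links, key=key)

links = ["[[ja:イスラエル]]", "[[es:Israel]]", "[[fi:Israel]]",
         "[[eo:Israelo]]", "[[de:Israel]]"]
print(sort_by_code(links))        # de, eo, es, fi, ja
print(sort_by_local_name(links))  # de, es, eo, ja, fi
```

The two results differ exactly where the discussion says they do: by code, eo precedes es and fi precedes ja; by local name, Español precedes Esperanto and Nihongo precedes Suomi.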
I note that AWB is sorting stub tags to the very end. I thought that WP:STUB recommended stub tags after all text, but before non-stub category links. DES(talk)22:10, 29 December 2005 (UTC)
I have always thought they looked best at the very bottom, out of the actual article and further from the text. It's easy to change, but I don't see how it would be any better. thanks Martin22:16, 29 December 2005 (UTC)
I believe the thought was that the tag was easier to edit/remove when appropriate if it came before category links. This would not change the display appearance in any way. DES(talk)06:14, 31 December 2005 (UTC)
Minor option
There is an option to mark all edits as minor. I have set this option as true - however, every time I close and reopen AWB, it resets my options. Can you fix this by saving all options in a .dat file that is loaded upon opening the program? Thanks. — MATHWIZ2020TALK | CONTRIBS19:56, 30 December 2005 (UTC)
I have already made it so it can save settings, I just haven't enabled it yet because it's not completely finished. It won't use a .dat file though, but a config file based on XML, just to keep you all at the forefront of technology! Martin22:28, 30 December 2005 (UTC)
Login issues
A problem occurred yesterday when my login cookie expired. Please increase the priority for re-checking the cookie more often, perhaps on every page edited. In the meantime I advise users to double check that they remain logged in -- the display will show the difference if you look. DES(talk)22:13, 30 December 2005 (UTC)
Yeah I noticed that, and I have fixed it already, it checks every edit now. I'll upload it tomorrow (will be version 1.4), as I have got a couple of other things to do as well. thanks Martin22:28, 30 December 2005 (UTC)
Headings and spacing
I've been asked twice now about a possible problem with AutoWikiBrowser's "general fixes" function. The AWB removes the leading and trailing spaces in headings, such as this one, but MediaWiki automatically generates the sections with one heading. I haven't seen it cause any problems, but Help:Editing has it that way too. Just so you know. Titoxd(?!? - help us)22:24, 30 December 2005 (UTC)
Yeah, I have stopped that now, it still will for "see also" and "external links" sections, as it makes it easier to check for common problems in those headings, but hopefully I'll get round to making that better. thanks Martin22:31, 30 December 2005 (UTC)
Can you provide a full list of the "general fixes" sometime? I am reluctant to check that box without knowing what it will do in some detail. DES(talk)22:35, 30 December 2005 (UTC)
Most of them are listed on the main page, however I'll update it with a few changes I have made, p.s. I have uploaded 1.4 now. Martin22:46, 30 December 2005 (UTC)
But if you edit an article with any arabic fonts in it will screw them up. There is one thing I might try though that may fix it, I will let you know when its ready if that's ok. thanks Martin16:00, 31 December 2005 (UTC)
Version 1.5.1.0 error
After downloading version 1.5, I opened the AWB and clicked on Help>About to make sure it was the right version. It said it was version 1.5.1.0, not 1.5. I closed the about window and then immediately got this error:
See the end of this message for details on invoking
just-in-time (JIT) debugging instead of this dialog box.
************** Exception Text **************
System.NullReferenceException: Object reference not set to an instance of an object.
at AutoWikiBrowser.AboutBox.okButton_Click(Object sender, EventArgs e)
at System.Windows.Forms.Control.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnClick(EventArgs e)
at System.Windows.Forms.Button.OnMouseUp(MouseEventArgs mevent)
at System.Windows.Forms.Control.WmMouseUp(Message& m, MouseButtons button, Int32 clicks)
at System.Windows.Forms.Control.WndProc(Message& m)
at System.Windows.Forms.ButtonBase.WndProc(Message& m)
at System.Windows.Forms.Button.WndProc(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
************** Loaded Assemblies **************
mscorlib
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.42 (RTM.050727-4200)
CodeBase: file:///C:/WINDOWS/Microsoft.NET/Framework/v2.0.50727/mscorlib.dll
----------------------------------------
AutoWikiBrowser
Assembly Version: 1.5.1.0
Win32 Version: 1.5.1.0
CodeBase: file:///C:/Documents%20and%20Settings/Jacob/Start%20Menu/Programs/Wikipedia/AutoWikiBrowser.exe
----------------------------------------
System.Windows.Forms
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.42 (RTM.050727-4200)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Windows.Forms/2.0.0.0__b77a5c561934e089/System.Windows.Forms.dll
----------------------------------------
System
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.42 (RTM.050727-4200)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System/2.0.0.0__b77a5c561934e089/System.dll
----------------------------------------
System.Drawing
Assembly Version: 2.0.0.0
Win32 Version: 2.0.50727.42 (RTM.050727-4200)
CodeBase: file:///C:/WINDOWS/assembly/GAC_MSIL/System.Drawing/2.0.0.0__b03f5f7f11d50a3a/System.Drawing.dll
----------------------------------------
************** JIT Debugging **************
To enable just-in-time (JIT) debugging, the .config file for this
application or computer (machine.config) must have the
jitDebugging value set in the system.windows.forms section.
The application must also be compiled with debugging
enabled.
For example:
<configuration>
<system.windows.forms jitDebugging="true" />
</configuration>
When JIT debugging is enabled, any unhandled exception
will be sent to the JIT debugger registered on the computer
rather than be handled by this dialog box.
Would it be possible to allow AWB to convert Unicode characters from their HTML codes to their proper symbols? See Curpsbot-unicodify if you're not sure what I'm talking about. --Ixfd6401:16, 2 January 2006 (UTC)
I would love to do that but I wouldn't really know how, maybe someone has some idea of how to go about this? Martin01:18, 2 January 2006 (UTC)
I noticed that Unicode conversion appears to be possible with the find-and-replace function. However, the feature only converts one string at once. Would a possible solution be to add the ability to do multiple conversions? It will still be very tedious to convert every Unicode symbol, but we could save common conversions in the find-and-replace list. --Ixfd6401:35, 2 January 2006 (UTC)
If we had a list of all the html and unicode symbols, we could make a simple list of find and replace routines, it wouldn't be pretty, but it would work and be reliable. p.s. I just put up a note saying that I am perfectly willing to share the source with anyone who wants to help develop it ; ) 01:41, 2 January 2006 (UTC)
AWB text reformatting clutters diffs
Hi there AWB developers. I've noticed that AWB-assisted edits have a habit of reformatting wikitext in addition to the noted changes in the edit summaries. It would be nice if the reformatting were performed as a separate edit before the intended change (with an edit summary like "reformatting wikitext"). This would lead to much clearer diffs and a better reflection of what was actually done to the article. None of this applies, of course, if this behavior has changed in more recent versions or if what I have seen is a result of the AWB user's actions and not the software itself. Mike Dillon16:34, 5 January 2006 (UTC)
The edit summary is always "AWB-assisted" followed by a phrase of the user's choice. Therefore, it is the user who wrote the incorrect edit summary, not the AWB itself. — MATHWIZ2020TALK | CONTRIBS22:48, 5 January 2006 (UTC)
How is it "incorrect" if the user doesn't know it's happening or doing it intentionally? I don't have access to AWB since I have no Windows machine to run it, so I don't know if they see a diff before saving or if they have to request it just like on the primary web interface. Does a user really know that AWB realphabetized the categories? Is that an automatic behavior, or did the editor I observed do this intentionally and neglect to note it? Not to sound like I'm on a witch hunt or something, as the diff issue is a pretty minor inconvenience. I would say that AWB should strive toward being neutral on the original wikitext formatting, as far as possible. Reformatting is an excellent functionality to expose to the user by choice, but not if they aren't aware of it. Mike Dillon03:59, 6 January 2006 (UTC)
I noticed that tweak when I was reviewing the code for the first time, but I didn't know that was new recently. In addition, when I was reviewing the code, you seemed to use * and ? differently than listed at Regex. The article says ? matches 0 or 1 recurrences of the character, and * 0 or more, but somewhere, you used a *?. This leads me to believe that, in C#, * means 1 or more, the equivalent of + in most systems. Is this correct? — MATHWIZ2020TALK | CONTRIBS20:48, 6 January 2006 (UTC)
The *? is a single regex atom. It means 0 or more, but it says to use stingy matching instead of the default greedy matching of *. Unfortunately, the Regex article doesn't address greediness, but basically, a greedy regex will match as many characters as possible until it fails, while a stingy regex will match only until the atom that follows can match. There is a better explanation of greediness in Chapter 4 of the canonical Mastering Regular Expressions (search for "greedy" in the text). Mike Dillon03:23, 7 January 2006 (UTC)
P.S. Search for "laziness" and "non-greedy" to get the explanation of lazy/stingy regexes, or better yet, read the whole thing ;) Mike Dillon03:31, 7 January 2006 (UTC)
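A short demonstration of the difference, written in Python for convenience; `*` and `*?` behave the same way in the .NET regex engine that C# uses. The sample text is made up.

```python
import re

text = "[[Apple]] and [[Banana]]"

# Greedy: .* grabs as much as it can, so the match runs from the first
# "[[" all the way to the last "]]".
greedy = re.findall(r"\[\[(.*)\]\]", text)

# Lazy (stingy): .*? stops as soon as the following "]]" can match,
# so each link is matched separately.
lazy = re.findall(r"\[\[(.*?)\]\]", text)

print(greedy)  # ['Apple]] and [[Banana']
print(lazy)    # ['Apple', 'Banana']
```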
Another thing that should be added is the ability to change categories with a modifier after them, for example {{Category:Wikipedians in the United States|Jtkiefer}} The way AWB currently handles them if you tried to change them over to say {{Category:Wikipedians}} or {{Category:Wikipedians|Jtkiefer}} I'd end up with something like {{category|Jtkiefer}} which causes problems and which is an unusable category. JtkieferT | C | @ ---- 23:41, 5 January 2006 (UTC)
The key is the bit after the pipe " | ", if a category has a key it is sorted alphabetically by its key and not by its name. Martin00:13, 6 January 2006 (UTC)
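A minimal sketch of how a re-categorisation could keep the sort key, assuming the usual `[[Category:Name|key]]` wikitext syntax; this is an illustration in Python, not AWB's implementation, and `recategorise` is a hypothetical helper name.

```python
import re

def recategorise(text, old, new):
    """Rename a category link, preserving any |sort key after the name."""
    pattern = re.compile(
        r"\[\[Category:" + re.escape(old) + r"(\|[^\]]*)?\]\]"
    )
    def replace(match):
        key = match.group(1) or ""   # "|Jtkiefer", or "" when there is no key
        return "[[Category:" + new + key + "]]"
    return pattern.sub(replace, text)

before = "[[Category:Wikipedians in the United States|Jtkiefer]]"
print(recategorise(before, "Wikipedians in the United States", "Wikipedians"))
# [[Category:Wikipedians|Jtkiefer]]
```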
Due to the recent success of Firefox, OpenOffice, and other open source programs, I was wondering what the general consensus would be on making the AWB open source. I could release the source code and then make a page, maybe Wikipedia talk:AutoWikiBrowser/Open source, where anyone could request features, and developers could post code. If I implemented such a plan, I would also make available to developers an extensive list of the changes in each version of the AWB. Any ideas? — MATHWIZ2020TALK | CONTRIBS23:49, 5 January 2006 (UTC)
At the moment we make people register to avoid anyone abusing the software, being completely open would make that impossible, any features can be requested here, it would be cool to have more people developing it though. Martin23:52, 5 January 2006 (UTC)
Abuse how? It's software for editing a wiki. Hardly revolutionary. If anyone really wanted to make their own then they could do so. This thing should be open source as should anything to do with the wiki (MediaWiki, etc).
Hmm. Just a very rough idea: would it be possible to keep a core part closed and make the rest open source? That core part would be released in binary as we have now and the rest would be open. After all, it's .NET :-) --Adrian Buehlmann22:29, 29 January 2006 (UTC)
Of course it's up to the authors, but my vote is to open up the source.. security through obscurity really isn't that effective, and it's not that hard for someone to write a vandalbot on their own. It is also not that difficult to decompile and modify MSIL code, so someone with some .NET knowledge could bypass the registration check. Great software by the way, I just used it in order to subst a template which has been deleted.. now I just need to re-learn regexes, it's been a while since I've used them. Rhobite20:14, 12 February 2006 (UTC)
The point is that this software would make high speed vandalism available to all, yes it is obviously possible that someone with a lot of spare time and decent programming skills could make their own vandal bot, but it is unrealistic. Also, consider that 1 person has already been removed from the check page for being reckless with it. Plus I am - prepare to be shocked - not a big fan of open source software in general, and the system that we have at the moment has been very successful in rapidly developing the program. Martin21:35, 12 February 2006 (UTC)
Not a fan of open source? The Washington Post had an interesting article today, which basically stated that flaws in open source software are fixed 60% more quickly on average than those in closed source software. (With the AWB, all fixes are quick. The article was about professional software, for example, Firefox, all of Microsoft's programs, etc.) Anyways, if anyone wants the closed source code, all you have to do is ask Martin and he'll happily give it to you. Unless you become some crazed lunatic vandal, of course. The only reason it's not open source is because there are crazed lunatic vandals out there, and keeping the code confined to people whom Martin trusts prevents them from getting their hands on it. E-mail Martin and ask for him to e-mail you the code, and then just e-mail him the code back if you make any changes. It's as simple as that. Martin's always quick to respond to his e-mails (no matter what the time zone discrepancy may be). --M@thwiz202022:33, 12 February 2006 (UTC)
Feature requests
I have two requests. Every now and then I notice the alert about a long article having stub status. Could there be a button (or something) that would quickly remove all the stub templates. I hate scrolling through the text and finding it.
Secondly, this tool works wonderfully with fixing typos. I imagine it could do the same for disambiguating pages, but I'm really not sure how it would work. The idea I have is that somehow the program receives the different terms from the user (which he collected from the disambiguation page). After getting the list of pages linking to the DAB page, the user goes through each one. If it's the first term (say, pop music), he clicks on it (or maybe presses a hotkey) and the link changes to that term. Say pop -> pop. Clicking on option two (pop art) results in changing pop -> pop. I hope I've explained myself alright, it's difficult to describe. Let me know if you have questions or any suggestions. Oh, and welcome back! :) GfloresTalk07:48, 6 January 2006 (UTC)
Tools for disambiguation is a really good idea, at the moment I am working on scanning the database dump, but this will probably be my next target. (that and introducing a spell checker, but I am waiting on Microsoft for that). Martin13:17, 6 January 2006 (UTC)
I would like to point out this python bot which does something similar to what I described. [7]. I think if AWB can make it a bit more user friendly than using the python bot, that would be fantastic. GfloresTalk05:56, 9 January 2006 (UTC)
If you use navigation popups, you can access a similar feature via the popup. If you hover over a link to a disambig page, the bottom of the popup lists all the links on the page. Clicking on one replaces the link Pop with, e.g., Pop (it adds the correct page while keeping the text seen the same). — MATHWIZ2020TALK | CONTRIBS20:48, 6 January 2006 (UTC)
Minor regex request: I'm looking for a regular expression to fix bad links. Essentially, it needs to look for links in this form... [[http://www.abc.com]] and change it to this [http://www.abc.com]. Same with [[http://www.abc.com link]]. Sometimes, links are like this [[http://www.abc.com|link]]. This needs to be changed accordingly to [http://www.abc.com link]. Any help is appreciated. I read a little about regex and came up with this, which I have used in AWB... \[\[([Hh]ttp:[^\]\]]+)]] However, it doesn't handle the latter case (the '|') and may find false positives. If you have time. Thanks. GfloresTalk18:04, 6 January 2006 (UTC)
I just wanted to say thanks for working on this item specifically. Currently the bad link cleanup process takes many hours for each dump and this would speed up the task considerably. --PS2pcGAMER (talk) 22:30, 6 January 2006 (UTC)
Note to Martin - try:
replace \\[\\[http:\\/\\/(.*)\\]\\] with [http://$1]
replace \\[http:\\/\\/(.*)\\|(.*)\\] with [http://$1 $2]
This removes the double [ from http links, and changes the pipe to a space. It finds all links beginning with http://, which means it will also do this to links to articles such as [[http://]]. In addition, it will not fix links beginning with hTTP - I did this since, if you try [8], Wikipedia does not recognize it as a link. Wikipedia only recognizes external links that begin with an all-lowercase http, but the regex could be easily tweaked to fix any case. — MATHWIZ2020TALK | CONTRIBS17:59, 7 January 2006 (UTC)
Another note - you have to have the two regex replaces listed in the order above. For example, if you have [[9]] and do regex replace one and then two, you get Google - in the reverse order, you still have [10]. — MATHWIZ2020TALK | CONTRIBS22:57, 7 January 2006 (UTC)
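For comparison, here is one way the three bad-link forms could be handled in a single pass, sketched in Python rather than as AWB find-and-replace rules. It is an alternative to the two regexes above, not a transcription of them: it uses character classes instead of the greedy (.*), on the assumption that a line may contain more than one link, and it requires at least one character after http:// so that [[http://]] is left alone.

```python
import re

# [[http://url]], [[http://url label]] or [[http://url|label]].
# The URL may not contain "]", whitespace or "|"; the label may not contain "]".
BAD_LINK = re.compile(r"\[\[(http://[^\]\s|]+)(?:[\s|]+([^\]]+))?\]\]")

def fix_external_links(text):
    """Rewrite double-bracketed http links as single-bracketed ones."""
    def replace(match):
        url = match.group(1)
        label = (match.group(2) or "").strip()
        return "[" + url + " " + label + "]" if label else "[" + url + "]"
    return BAD_LINK.sub(replace, text)

print(fix_external_links("[[http://www.abc.com]]"))       # [http://www.abc.com]
print(fix_external_links("[[http://www.abc.com link]]"))  # [http://www.abc.com link]
print(fix_external_links("[[http://www.abc.com|link]]"))  # [http://www.abc.com link]
```

Like the regexes above, this only touches an all-lowercase http prefix.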
Wikipedia:Bad links also has some bad characters for internal links. I have developed these regexs to fix them:
fixes double space: replace \\[\\[(.*) (.*)\\]\\] with [[$1 $2]]
fixes space at beginning: replace \\[\\[ (.*)\\]\\] with [[$1]]
fixes space before "#": replace \\[\\[(.*) #(.*)\\]\\] with [[$1#$2]]
fixes double underscore: replace \\[\\[(.*)__(.*)\\]\\] with [[$1_$2]]
fixes underscore at beginning: replace \\[\\[_(.*)\\]\\] with [[$1]]
fixes underscore before "#": replace \\[\\[(.*)_#(.*)\\]\\] with [[$1#$2]]
I just tested them, putting them after the two lines above. The reason why I have separate regexs for spaces and underscores is because I don't want to change a link such as January__1#External_links to January 1#External_links - I want the link to use all spaces or all underscores. — MATHWIZ2020TALK | CONTRIBS23:06, 7 January 2006 (UTC)
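The six internal-link fixes above could also be applied to the link target alone, which avoids touching the displayed label. The following Python sketch is an assumption about one way to do that, not the tested AWB rules; it keeps spaces and underscores separate, as described above, so an underscore-style link stays underscore-style.

```python
import re

# An internal link: a target without brackets or pipes, then an optional |label.
LINK = re.compile(r"\[\[([^\[\]|]+)(\|[^\[\]]*)?\]\]")

def tidy_target(target):
    """Apply the bad-character fixes to a link target."""
    target = re.sub(r"^[ _]+", "", target)     # space/underscore at beginning
    target = re.sub(r" {2,}", " ", target)     # double space
    target = re.sub(r"_{2,}", "_", target)     # double underscore
    target = re.sub(r"[ _]+#", "#", target)    # space/underscore before "#"
    return target

def fix_internal_links(text):
    """Tidy the target of every internal link, leaving labels untouched."""
    def replace(match):
        return "[[" + tidy_target(match.group(1)) + (match.group(2) or "") + "]]"
    return LINK.sub(replace, text)

print(fix_internal_links("[[ January  1 #Events]]"))
# [[January 1#Events]]
print(fix_internal_links("[[_January__1_#External_links|Jan 1]]"))
# [[January_1#External_links|Jan 1]]
```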
I know the AWB separates the categories, language links, FA templates, and Persondata templates and puts them in the correct order - could you do the same with the above sections, i.e., could you write code to separate them and then put them in the correct order? Thanks. — MATHWIZ2020TALK | CONTRIBS21:09, 7 January 2006 (UTC)
1.6.2
FYI: There is a 1.6.2 listed under the list of changes, but the check page doesn't show this version as enabled. — MATHWIZ2020TALK | CONTRIBS17:28, 7 January 2006 (UTC)
In addition, I was looking through the code of 1.6, and, in AboutBox.cs, on line 131, there is a typo: guidlines should be guidelines. In AssemblyInfo.cs, the copyright date should be 2006 on line 13. Can I have the source for 1.6.2? — MATHWIZ2020TALK | CONTRIBS19:08, 7 January 2006 (UTC)
Sure, I'm busy at the moment, but I'll get all of the above sorted tomorrow evening, thanks for the regexs! Martin22:21, 7 January 2006 (UTC)
Martin - why'd you remove "When using this software, check every single edit and try to avoid making extremely minor edits such as adding or removing a single space" from the notice? — MATHWIZ2020TALK | CONTRIBS23:05, 8 January 2006 (UTC)
Martin says (above) that AWB "is not broken, it's not even specifically designed for this task as you seem to suggest."
We have no way of knowing the intent of the author that it's specifically designed for any particular task.
Unfortunately, the actual effect of AWB is to mass de-link dates, and to mass de-alphabetize inter-wiki links. For example, see breakage of Wikipedia:Disambiguation and breakage of Israel. That's just two very high profile examples.
Therefore, we should assume that it's misused because of poor quality control by the program author (regardless of intent), and prohibit further use.
Huh? If you want to know the intent of the person who made those edits, since anyone using the AWB has to approve every edit they make, you could ask him (trust me, Ian is very kind and will respond to any inquiries you have), and I don't see how the cited diffs could be interpreted as "breaking" those pages.--Sean|Black08:41, 1 January 2006 (UTC)
Thanks Sean. William, there is an option in the program that I was asked to implement that removes excess date links (I didn't even make the logic behind it), users have to consciously turn this option on for it to work. Plus every edit has to be accepted by the user. The software can be used for a range of tasks, it is designed for no individual task in particular. Martin11:12, 1 January 2006 (UTC)
Also, if you want to know what intentions were, examples include; Me stub sorting about 50% of "Artist" stubs in just a few hours, Kbdank71 (who pretty much single-handedly takes care of WP:CFD) is now able to re-categorise the articles himself and User:Gflores has been correcting typos, many others have been using it as well for a variety of things. Martin13:18, 1 January 2006 (UTC)
"Mass de-alphabetize inter-wiki links" is NOT an effect of the program. Spanish has the ISO code es. Esperanto has the ISO code eo. I see nothing about the behaviour of the program that indicates that it puts eo after es in an alphabet-based sort. The fact that you don't, or won't, understand how it is sorting things is another matter entirely. David Newton20:53, 14 January 2006 (UTC)
I believe the confusion is that the original poster was referring to "alphabetization" of the resulting list of languages, not the underlying codes. The examples cited (Wikipedia:Disambiguation and Israel) used a sort based on alphabetization of the Latin alphabet transliteration of the local language names, but they were changed by an editor using AWB to be alphabetical based on the ISO-esque Interwiki language code. Since there is no official policy on interwiki link ordering, it is up to the editors of an article to agree on one scheme or another. They are both culturally biased, but the faux alphabetization of local names is not really any more internationally savvy because the sort order is still derived from Western languages. There is no inherent reason that the sound "A" should come before "Z" and that isn't even the case for all the languages that actually have those sounds. Mike Dillon21:08, 14 January 2006 (UTC)
I beg to differ:
AWB does not require every edit to be approved by the user. It makes dozens or hundreds of changes with only one approval.
For those who have difficulty reading the diffs that I provided, concentrate on a few things:
removing (previously correct) date links for 1948 and 1967 (and many others) in Israel.
re-sorting (previously correct) interwiki links so that "Esperanto" is alphabetized before "Español" (in both diffs).
There are many such bad edits, prompting a firestorm of complaints.
If you made the software, you are responsible for its output, not the poor sap that used it assuming that the rules it followed conformed to consensus. There is no disclaimer of warranty in this venue.
Finally, the argument concerning intent is pretty standard in the legal domain, and I'm sorry that's too esoteric for many to understand. Here's the short version:
The intent doesn't matter, and is not an element to prove.
The actual results are enough to indict.
Please cease and desist using AWB until its results are tested and proven to conform to consensus.
That's simply not true. The AWB does require all edits to be accepted by those using it. Why would Martin lie about that? A couple of incorrectly alphabetized interwiki links do not "break" pages.--Sean|Black23:53, 1 January 2006 (UTC)
(Thanks again Sean) William, you are simply wrong; every edit is checked by the user, if you have a problem with people removing dates then take it up with them, as long as it is a guideline it will be an option in the software. I don't know where you get these ideas from, but I hope you stop these slanderous accusations. Martin00:00, 2 January 2006 (UTC)
Rather, the user is prompted to check every edit. This is a very different thing; you can't enforce an actual check in code. And if someone's making a change every two or three seconds, he's spending much more time waiting for the page to load than looking at what he's actually doing. I recall one article where an AWB-assisted edit changed "May 9th, 1955" into "May 9, 1955" [11], correctly fixing and linking the non-functional date, but incorrectly unlinking the year; it's hard to argue that this edit was sufficiently "checked".
My own bot has a function to convert old cut-and-pasted tables into invocations of Template:Album infobox. Because of all the crazy things folks have done with the formatting between the paste and my conversion, the function can't be 100% accurate, so I check every edit (in raw wikicode) before it's posted. Regardless, I still run it at the normal 30-second throttle, because the bot, not me, is doing most of the work. AWB should enforce a timeout as well. —Cryptic(talk)03:46, 2 January 2006 (UTC)
True. But if someone is sloppily checking their edits, then it's the fault of the user who's, well, sloppily checking their edits, not the software.--Sean|Black04:51, 2 January 2006 (UTC)
Plus, any user abusing the software should be removed from the list so they can no longer use it. I have also disabled the date removal thing because I am tired of defending what people do with it. Martin10:42, 2 January 2006 (UTC)
Thank you for disabling the date removal "thing", please:
disable the interwiki sort "thing"; and
enforce a prompt for each and every change on a page, not a blanket acceptance of dozens of changes on the same page; and
enforce a 30 second PER CHANGE timeout for speed of page edits!
Sorting interwiki links has been carefully and conscientiously done by many international editors, and this one program damaged thousands of such pages, sorting by ISO code instead of alphabetically. That's broken by any definition of the term!
Moreover, there is no reason for us to have to go to each user of your software to chide them for making the mistakes programmed into the software. Many folks went to your talk page and the AWB talk page and complained, and you did nothing about it while thousands of pages were damaged! Now you (Martin) accuse us of slander?
It's apparent to those of us who have been both professional programmers and professional editors that your code was insufficiently tested, and did not conform to any standard of Professional Responsibility.
Please cease and desist using AWB until its results are tested and proven to conform to consensus.
It has been released as a development version. Plus, the only information I could find on inter language order was the Wikipedia:Language order poll, in which the alphabetical listing was the most popular choice, hence that's why it sorts like that. Martin11:39, 2 January 2006 (UTC)
Will you (William) please stop going on about "blanket acceptance of lots of edits". It simply is not true. Every edit has to be accepted individually by the user and it is their fault if the edit is wrong because they didn't check it. Also, I think that sorting interwikis is very worthwhile because it makes it easier to add more in the future. --Celestianpowerháblame18:09, 10 January 2006 (UTC)
Incorrect edit
Just to let you know, on this page, NTL, it incorrectly moves the line to the bottom (it thinks it's a translation link?). It begins with...
ntl:hell following shortly after. Devalued and struggling with debts of around $18bn NTL was forced to seek Chapter 11 bankruptcy protection in May 2002 in order to organise a refinancing deal. The company did not emerge from protection until January 2003,
The AutoWikiBrowser page does not answer: Are there any plans to port AutoWikiBrowser to compile on Linux? (There's the Mono C# compiler and the MonoDevelop IDE available, though this page is grim about how well WinForms apps work on Linux. It sounds like Mono's WinForms support isn't up to the same quality level as its other toolkits, like Gtk# and Qt#.) --Unforgettableid | talk to me19:44, 13 January 2006 (UTC)
I don't know much about Mono, but I highly doubt it will be portable, as it uses the Internet Explorer core; if they manage to port that then maybe. Martin19:48, 13 January 2006 (UTC)
Since there is a portable, free (as in beer & freedom) C#/.NET runtime (Mono), and since there are portable and free C# graphics bindings (Gtk#, Qt#), and since there are portable and free HTML rendering engines (Gecko), wouldn't it be better to focus on them instead of using highly platform-restricted technologies? Even if you really only want to stay on Windows, using WinFX will make your application available only on Windows XP and Vista. I can't understand the reasons for this choice. Is there some compelling reason for it? Please note that a Linux or MacOS developer who restricts their applications to their favourite OS in the same way would get the same criticism from me. --Cyclopia17:26, 30 January 2006 (UTC)
Well, I am a C# developer who uses VS2005 for a start, plus this highly restricted platform is the one that the vast majority of people use. I have never used Mono, and quite frankly while I have more pressing issues I don't intend to. As for WinFX, it will allow me to easily implement spell checking; I will provide a non-WinFX version as well. Martin17:35, 30 January 2006 (UTC)
Ok. I find it quite sad, but you're the developer. I'm sorry if I bothered you too much, I just wanted to let you know that there are ways to implement what you want to do without restricting your users to a (non-free) platform (no matter how widely used it is). For example, GNU Aspell can probably help you with spell checking in a multiplatform, free way. Thank you anyway for the good project - I'm just unhappy about being unable to use it. --Cyclopia22:26, 30 January 2006 (UTC)
Interwiki bots, when adding new links, routinely sort the interwiki links. Thus, it's probably preferable to disable the option in AWB as it makes diffs more difficult to read. -- User:Docu
Hey there. Just a quick request. Is it possible to add a date to when each new version is/was released? This just makes it easier if one has been away and they wish to see the changes that have taken place since their last version.
Allows for openings into other areas of space and time.
Okay. I did just that! (Well, except for the fact that I haven't yet come up with a way for the AWB to engage in hyperdimensional travel, but that's on its way...) --M@thwiz202015:01, 16 January 2006 (UTC)
Would it be possible to make a feature that would help with AfD articles? When I'm going through the articles needing cleanup I find the need to AfD some of them, but I don't want to stop very long or switch browsers to do that. I'd imagine that it would (after pressing a button) subst in the template, then take you to the next page where you could write the summary, and finally take you to the daily page to subst in the deletion page. BrokenS15:12, 16 January 2006 (UTC)
He is. WP:AWB states, "The author is perfectly willing to share the source code with anyone who wants to help in development." When Martin left Wikipedia on 2 January, I asked him for the source code so I could take over. However, he came back to Wikipedia so now he is doing the majority of the coding. Every once in a while, though, I send him some code that he includes in the AWB. For example, if you look at the versions table, I contributed to 1.6.3 and 1.6.5. --M@thwiz202015:40, 16 January 2006 (UTC)
Interesting idea, not sure how feasible though. In the meantime there is an option on the textbox context menu to open the page in your normal browser. thanks Martin16:38, 16 January 2006 (UTC)
Disambiguation tools
I just started using AWB today to do some disambiguation link repair. I don't have a lot of suggestions yet but figured I would add them here as I think of them:
Allowing creation of a custom list of edit summaries that can be changed for each page without having to re-click start. I like to include what I actually disambiguated in the edit summary so that a) someone who looks at the history does not have to do a diff to see what I did and b) I can look at the work for a single disambiguation page and get a feel for how many links are going to which articles. I usually do something like disambiguation link repair (You can help!) United Provinces to Dutch Republic but with a different target article depending on what I did. It would also be good if you could apply the text from the dropdown and then edit it for the given instance, for example to add two links. In any event the edit summary is, I think, a good first step for doing DAB work with AWB.
A lot harder to implement would be allowing the find and replace to specify multiple replace-with strings, for which the user would be prompted at each instance. Dalf | Talk04:38, 17 January 2006 (UTC)
Add a filter to the list like the one that removes links outside of the main namespace to remove or select links via redirects. Dalf | Talk05:15, 17 January 2006 (UTC)
No, some people like to sort them one way, some another; that's what happens when there is no policy. AWB does it the most popular way, plus it has an option in the menu to disable it. thanks Martin19:59, 18 January 2006 (UTC)
Martin, one of the rules for the AWB is "Don't edit too fast." What do you consider "too fast"? I noticed that you routinely crank out edits at five or so per minute, yet User:Talrias blocked User:Bobblewik on 29 Dec for editing the same amount, saying it was not possible. Is that speed acceptable? Thanks. --M@thwiz202021:42, 18 January 2006 (UTC)
I would be interested in the answer to this. User:Talrias blocked me again today <sigh>. So if I go quiet again, either it has happened again or I have given up in disgust again. This problem is inherent in being a janitor.
If people with blocking powers are using speed as justification then artificial speed reduction might deal with the symptom, if not the disease. I do not know how I would target a particular edit speed. Is it possible to add a speed-brake at an acceptable rate, or a rate that could be modified by negotiation?
Incidentally, I have been accused of being a bot in the past (e.g. on my talk page and elsewhere) because my manual edits are often fast. I think I have got up to 4 edits per minute.
In addition, a suggestion for increasing overall speed while keeping individual speed low might be to share tasks. bobblewik22:01, 18 January 2006 (UTC)
"Too fast" depends on what you are doing, so some kind of throttle becomes useless, plus people doing disambig'ing etc. frequently edit just as fast. I haven't been using my bot account because it is flagged and won't appear in recent changes; I probably should make a separate bot account though. Bobblewik was blocked because he was making controversial edits frequently, not specifically because they were frequent. Martin22:16, 18 January 2006 (UTC)
(originally a reply to bobblewik, now down here after EC) In deleting categories that are about to be deleted from articles, it is not uncommon for me to open 10 or 20 tabs so I can go up and down the line with my cuts. While I did a bunch of stuff this morning with little heed to my speed, I am not sure that awb made me any faster than my multitabbed method. It just made me more efficient (well, until I broke someone's complex italicizing on an article or two). In fact I just checked and here is my multitabs where I was doing 6 per minute, which is comparable to my awb speed this morning (on the first page or two of my most recent contributions). --Syrthiss22:24, 18 January 2006 (UTC)
I just looked at bobblewik's contribs, and all his edit summaries say "'x percent' -> 'x %' in accordance with Manual of Style" even if they don't change the percentage at all. Is this a bug or something?
This is very annoying; we have a check page so this software is only used responsibly. Please check your edits. Martin23:14, 18 January 2006 (UTC)
Actually I am going to remove your name from the list Bobblewik, you have already received multiple complaints from just a few hours work. sorry Martin23:17, 18 January 2006 (UTC)
Huh? I do not understand. jeffthejiff wanted me to check a particular edit. No matter how carefully I check that edit, I cannot see anything wrong with it. It removed a blank line. How is that controversial? It is bizarre to be criticised for edits that are actually inherent in AWB.
If the problem is that the summary implies my own edits and all that happens is AWB-inherent edits, then that is fine. I can easily incorporate that as a constraint. Alternatively, I can modify the wording to add "possibly". But, until now, I was not aware that it was a problem. I still don't quite see the big deal. What other complaints do you think are valid? Sigh. bobblewik23:52, 18 January 2006 (UTC)
The problems are: you haven't been checking edits properly, your edit summaries are sometimes misleading, and you have made very minor edits (which the main page specifically says not to do). If you come across a page where the task you are performing isn't necessary (i.e. you are fixing %s, but the page contains none) and/or the only edit it is making is very minor, then just ignore it. Also, err on the side of caution. May I also recommend that you generate a list of pages that definitely need something fixed, such as a common typo (I can do that for you from the data dump if you want), because that way you won't come across too many pages that should be ignored. If you are happy with all that then I am happy for you to continue using the software. I am signing off now so someone else can re-add your name before I come back if they see fit. Martin00:22, 19 January 2006 (UTC)
I'm not sure the verification is working though since for some odd reason when I forget to log in it still lets me edit using AWB so you'll probably want to recheck and fix if needed the verification code. JtkieferT | C | @ ---- 23:57, 18 January 2006 (UTC)
Bobblewik, on the main page, under "rules", it says: "Avoid making extremely minor edits such as adding or removing a single space." In the edit described above, that's exactly what you did. Therefore, I am in full agreement with Martin's decision to remove you from the enabled users list.
I see that. I read it a while back but it clearly did not sink in. That must be because I find it easier to remember things that seem rational and harder to remember things that do not. My simple view of the world has been that a small improvement is better than no improvement. Unless my memory is playing tricks on me, there is encouragement for editors to improve articles in any way they can. I do recall seeing instances whereby my chosen edit did not appear but I chose to go ahead on that basis. I could easily have turned off the 'general fix option'. Believe me, I wish I had. It would even have made my work faster. I still do not understand why small improvements are a bad thing but I won't forget the constraint now.
As far as misleading summaries are concerned, I usually attribute that term to a claim of Doing X and Y when it does A and B, or X and A. In my case, the summary said Doing X. Y. and the complaint was that it merely did X. Actually, it sometimes did X only, sometimes Y only, sometimes X+Y together. So perhaps it should have said Doing X and/or Y. Ironically, the very specific summary was added in response to a request for more detail by Talrias. I do not like being blocked by him so I took his request quite literally and probably gave too much detail.
A while back I was going to ask for an option of ignoring pages if my 'Find' string was unsuccessful but still applying the general fixes if it was successful. That would mean I could be sure all my edits are a targeted 'hit' and may also have the benefit of the general fixes. The current situation means I now have a very strong incentive to turn general fixes off for all edits whereas I have only a very weak incentive, if any, to keep general fixes on.
Now, enough of that negative stuff and onto the positive. My aim was to bring the many instances of 'x percent' and 'x per cent' into line with MoS guidance which says digits should be paired with '%'. I used Google to find them, but if there is a way to probe a database dump, that would be much better. Just to confirm my knowledge of the rules: small edits are bad; summaries should not exceed the actual edits. If I am still persona non-grata, then that would be a shame but there is not much I can do about it. bobblewik02:45, 19 January 2006 (UTC)
Your tone has grown slightly caustic. The point is, all your edits have some cost. Server time, wasted time of people checking RC (shouldn't that edit have been marked 'minor'?), filling up the edit history, etc. Removing that line changed the article in no way. You're supposed to ignore trivial cases because they are trivial (and should be done in conjunction with worthwhile edits, not as their own entities). If you haven't the time to change the edit summary while editing you are going too fast (unregistered bots, even manual ones, are supposed to go slower than one edit every 30 sec) and I doubt you are actually reviewing the edits sufficiently. I agree, you shouldn't be using the bot for a while. BrokenS04:20, 19 January 2006 (UTC)
It is difficult for anyone to detect tone in text. To the extent that 'tone' is apparent, it frequently looks worse than intended. So do not judge me please. I have no idea about server time or what RC is. All I know about is that articles have a lot of rubbish in them. I was trying to fix percentages in line with MoS guidance. I succeeded in that task and thousands of instances of percent are better for it. That seems to me to be a good thing, not a bad thing.
I was not tackling excess lines but it came up as part of general fixes and it seemed a reasonable suggestion to me. My limited understanding of computers is that extra code on a page is wasteful of something. I have already explained the reason for the same summary but if you missed it, the reason it remained as Doing X. Y. is because it was indeed doing X or doing Y, or both.
If you are determined to believe that I am a bad person, or that I am damaging Wikipedia, then there is not much I can do about your opinion. If you look for defects in people you will find them. If you look for merits in people you will find them. I am not your enemy. bobblewik05:16, 19 January 2006 (UTC)
We're not being your enemy, just trying to help wikipedia and help you help wikipedia. Extra code on a page is wasteful to some extent, but because each and every edit is saved in the history, it will just use space on Wikipedia's servers anyway. No idea how much they've actually got, but it must be some huge amount. A larger problem is the server time though. By that we mean the time the Wikipedia servers take to make the page up and send it to you - something that is costly in such large quantities (as one of the most popular sites on the internet), and the main reason why Wikipedia asks for donations. And by RC, brokensegue meant Recent Changes, a page which lists all the recent changes made to wikipedia. People check it and might look at the edits only to find that a space has been added in a place that makes no difference to the actual article. Same goes for the Watchlist.
So essentially really minor edits that make no difference to the article should be avoided because they just waste space in history lists - the actual space saved in the code is negligible. Thanks for spending a lot of time improving wikipedia though. -- jeffthejiff(talk)08:22, 19 January 2006 (UTC)
(indent) The point of the "general fixes" is simply that while you are fixing something such as a common typo, re-categorising or something, it also makes other minor changes at the same time. As a rule of thumb, if a change doesn't actually affect the look of an article, then don't make it. Also, you will be pleased to hear that the software does have the feature you mention: the "ignore if doesn't contain" option is very helpful, as it will automatically skip any pages that do not contain whatever problem you are fixing. As for your rules: small edits are good, but not if the total edit is insignificant; edits can exceed the edit summary (but not by anything significant). This might sound strange to write, but the vast majority of edits ever made exceed the summary in some way; just don't let them be misleading. (I think this problem will be solved naturally if you skip pages that don't contain the mistake you are fixing.) thanks Martin10:28, 19 January 2006 (UTC)
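The "ignore if doesn't contain" behaviour described here, combined with bobblewik's wish that general fixes only ride along with a real hit, can be sketched roughly as follows. This is a Python illustration under assumptions, not AWB's code: `process_page`, `collapse_blank_lines` and the percent pattern are all hypothetical names and choices.

```python
import re


def process_page(text, find_pattern, replacement, general_fixes=()):
    """Return the edited text, or None if the page should be skipped."""
    if not re.search(find_pattern, text):
        # Nothing to fix: skip the page instead of saving general fixes alone.
        return None
    new_text = re.sub(find_pattern, replacement, text)
    for fix in general_fixes:
        new_text = fix(new_text)
    return new_text


def collapse_blank_lines(text):
    """Example general fix: squeeze runs of blank lines down to one."""
    return re.sub(r"\n{3,}", "\n\n", text)


# Assumed pattern for the 'x percent' -> 'x %' task discussed above.
PERCENT = r"(\d) per ?cent\b"

edited = process_page("About 5 percent.\n\n\n\nEnd.", PERCENT, r"\1 %",
                      [collapse_blank_lines])
skipped = process_page("No figures here.\n\n\n\nEnd.", PERCENT, r"\1 %",
                       [collapse_blank_lines])
```

The second call returns `None`, so a page whose only possible change is a removed blank line never reaches the save step.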
Thanks for explaining it, I understand better now. I am delighted to hear that you have added an 'ignore if doesn't contain' feature. I appreciate the time you take in this. I would be happy to fix more of the same percent problems with a modified edit summary and avoiding insignificant edits. bobblewik13:04, 19 January 2006 (UTC)
Bobblewik, above, you said "I used Google to find them, but if there is a way to probe a database dump, that would be much better." Well, Martin recently developed a database dump search tool. Either you can download and run it or ask Martin to scan the dump for the regex "per ?cent". That way, your fixes would be targeted to articles which (as of the last dump) contained percent or per cent, not all articles returned by a Google search of percent. --M@thwiz202021:52, 19 January 2006 (UTC)
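As a rough illustration of what scanning for the regex "per ?cent" means, here is a short Python sketch. It is not the dump search tool itself: `titles_matching` is a hypothetical helper, and how pages are read out of a dump is left outside the sketch.

```python
import re

# The regex quoted above: "percent" or "per cent".
PERCENT_RE = re.compile(r"per ?cent")


def titles_matching(pages, pattern=PERCENT_RE):
    """Yield titles of pages whose text matches the pattern.

    `pages` is any iterable of (title, text) pairs.
    """
    for title, text in pages:
        if pattern.search(text):
            yield title


sample = [
    ("A", "Turnout rose by 12 percent."),
    ("B", "Roughly 40 per cent agreed."),
    ("C", "No figures are given here."),
]
hits = list(titles_matching(sample))
```

Note that this pattern also matches inside longer words such as "percentage", so a resulting list would still need the per-edit review discussed in this thread.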
NoAutoBrowser tag?
Hi. Is there any way of telling the AWB not to change (by default) a given piece of text? In two different revisions, two editors have removed a lone underscore that was in fact necessary [13][14]. Another editor has found the solution of replacing it with the HTML code "&#95;", but I am not convinced that the AWB will not delete it again. Having some way of telling the AWB not to change a given piece of text (or, at least, making it alert the user that that piece of text is not to be changed automatically) would solve the problem. Thanks. - Liberatore(T) 13:33, 19 January 2006 (UTC)
It does have a way of telling the editor what not to change, as it shows them exactly what it has done before it commits anything, unfortunately, as discussed above, Bobblewik has been less than thorough in checking, and for that reason is not able to use the software at the moment. Martin13:40, 19 January 2006 (UTC)
For this particular change, AWB should know not to remove underscores from the right-hand side of the pipe. In other words, given [[A_B|C_D]], AWB should transform this to [[A B|C_D]]. HTH HAND —Phil | Talk15:11, 20 January 2006 (UTC)
Sorry about this bug - I wrote the code for the bad link fixer. I'll relook at the regexs and then get back to you with an answer and, hopefully, a new regex. Thanks for bringing this to my attention! --M@thwiz202020:26, 20 January 2006 (UTC)
While I did write the bad link fixer, Martin added his own underscore fixer as mine had some flaws to it. I went over his code and I think I fixed it - I'm currently testing it out on ASCII. I'll post my results here and, if it works, send the code to Martin. --M@thwiz202018:51, 21 January 2006 (UTC)
What the script does is it takes every link and, if it does not contain ":" and it does not begin with "[[_" (my script lacked these components, which is why it wasn't used), then it replaces all "_" with " ". I tried to get it to split the link at "|" but I couldn't get it to work. So, I guess, there is no fix. (It also tries to make [[Bracket|<nowiki>[]]</nowiki> into [[Bracket|<nowiki>[[]]</nowiki>.) Editors will just be forced to double check every edit before saving - then again, since everyone is supposed to be doing that anyways, there is no problem here.
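One possible way to get the split at "|" that is described above as not working is to capture the target and the label separately. The sketch below is Python rather than AWB's own code, and `LINK_RE` and `fix_link_underscores` are illustrative names; it applies the two skip rules given in this thread (links containing ":" and links starting with "_").

```python
import re

# [[target]] or [[target|label]]; the target may not contain brackets or a pipe.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(\|[^\[\]]*)?\]\]")


def fix_link_underscores(text):
    """Replace underscores with spaces in link targets, leaving labels alone."""
    def repl(match):
        target, label = match.group(1), match.group(2) or ""
        # Rules from the thread: skip links containing ":" or starting with "_".
        if ":" in target or target.startswith("_"):
            return match.group(0)
        return "[[" + target.replace("_", " ") + label + "]]"
    return LINK_RE.sub(repl, text)
```

Given `[[A_B|C_D]]` this returns `[[A B|C_D]]`, the result asked for earlier in this thread, while `[[_NSAKEY]]` and namespaced links are left untouched.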
Just a side question: As Netoholic points out to me on my talk this seems to be a common case to be carefully considered at least in respect to template calls. Is there any consensus how to write template calls? Is it normal that calls to templates are written at random with underscores or spaces? --Adrian Buehlmann22:06, 26 January 2006 (UTC)
Underscores are seen as spaces, so replacing them with spaces makes no difference to how templates etc. are displayed. They are removed simply because they are unnecessary mess. It does say on the project page not to make very minor edits for this reason. thanks Martin22:17, 26 January 2006 (UTC)
Ok thanks. I had that "unnecessary mess" feeling too and thought that while I'm cycling through the calls of Infobox Film I could clean that up in the same run. I have clarified on the project page that this also counts as an "extremely minor edit". Thanks. --Adrian Buehlmann22:39, 26 January 2006 (UTC)
Several pages on standards and punctuation syntax in fact: [15]. Maybe there should be a rule which checks whether a link in the page in question leads to the underscore page. At least the odds of encountering such a problem again would be slashed. Although, any page which relies on underscores (any page which deals with game map topics springs to mind) will be an issue in the future. Being a statistical guy, right now there are 930,000 articles on Wikipedia, and there are probably at least 1000 pages which contain links which rely on underscores. Therefore, in an even edit spread, there's a 1/1000 probability that AWB edits a page incorrectly. Realistic odds are much longer but the 1/1000 is still too high to ignore.--Dan(Talk)|@21:51, 21 January 2006 (UTC)
Virtually all those links to underscore do not actually contain an underscore; I'll make it so it ignores links that actually contain the word "underscore" anyway, though. Plus no links rely on underscores, as they are seen as whitespace. A few links start with an underscore, but it ignores them already (such as _NSAKEY, but as you can see the underscore isn't actually part of the name for technical reasons). Plus AWB is not automatic, so it shouldn't matter anyway. Martin22:38, 21 January 2006 (UTC)
Actually, I had already seen the discussion on the AWB in the village pump, and I agree that the user is responsible for all changes. I suggested that the AWB could alert the user to be more careful than usual when changing some pieces of text, but I consider that a suggested improvement for the AWB, rather than a fix. Not having any idea of how long that would take to implement, however, I don't know if the effort of realizing this improvement would be worthwhile. - Liberatore(T) 11:43, 22 January 2006 (UTC)
no category bug
AWB fails to recognize {{1911}} as adding a category. I suspect it does not catch them from any template, as that would be hard to do. It's still probably notable, as articles with only 1911 Britannica categories probably need more. Dalf | Talk08:25, 20 January 2006 (UTC)
That's correct, it doesn't see the category inside a template, but the 1911 one doesn't count as a proper category anyway. Martin09:44, 20 January 2006 (UTC)
True enough. I was wondering though if you could add a feature to add a category to articles if they do not already have it. I was using AWB today to add Category:Cities in Romania to all of the cities listed in List of cities in Romania (alphabetical) and there did not seem to be an easy way to do it. Some of them needed other categories added too, so just pasting over and over did not work. For that task it looks like we can just add the category to Template:Romanian cities infobox, but for the future it might be a worthwhile feature. Dalf | Talk09:51, 20 January 2006 (UTC)
Multiple regex replaces in the same run?
I know I'm evil (and greedy and possibly stupid :-): Could AWB be extended so that multiple regex replace actions could be entered and executed in the same run (a list of search/replace fields)?
Maybe I'm on a totally wrong track, so I'll try to explain what I want to do: The underlying problem I'm trying to solve is changing calls to templates that need renames in parameters. Specific example job I'm about to do: template:Web reference supports an old variant that used uppercase/lowercase in parameter names. I'm thinking about changing these to the new lowercase-only parameters variant.
Reason for this (sorry for my verbosity): Web reference is currently a meta template and I'm trying to convert that to using the new CSS-Trick of Netoholic. I have problems to do that supporting both kinds of parameter sets, so I'm thinking about switching calls to the lowercase only parameters variant.
Sorry for nagging with this whole chain of reasoning and many thanks for any help and ideas in advance. And thanks again to Martin for providing this great tool. (And please don't bother to tell me if I'm asking too much or the wrong thing!). --Adrian Buehlmann12:41, 20 January 2006 (UTC)
It's a good idea, I'll get around to it one day, part of the problem is simply organising the interface. Martin21:21, 20 January 2006 (UTC)
Yuha! Great. Thanks a lot. Maybe you could consider a config file with a list of exchange rules for simplicity. A menu point to load the rules (for the more experienced users) or so — Just an idea to make it as easy for you as possible. UI programming can get laborious. --Adrian Buehlmann21:34, 20 January 2006 (UTC)
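The "list of exchange rules" idea amounts to applying an ordered list of search/replace pairs in one pass over a page. Here is a minimal Python sketch of that idea; it is not a proposed AWB implementation, and the parameter names `URL` and `Title` in `RULES` are hypothetical stand-ins for whatever the old mixed-case names actually were.

```python
import re

# Hypothetical rule list: (pattern, replacement) pairs applied in order.
RULES = [
    (r"\|\s*URL\s*=", "|url="),
    (r"\|\s*Title\s*=", "|title="),
]


def apply_rules(text, rules):
    """Apply each (pattern, replacement) rule in turn and return the result."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text


before = "{{Web reference | URL=http://example.org | Title=Example}}"
after = apply_rules(before, RULES)
```

Because the rules run in sequence, a later rule sees the output of earlier ones, so the order of the list matters whenever two patterns could overlap.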
Use with other MediaWikis
This is a great tool. Is it possible to configure it to work with other wikis, like other languages, or meta.wikimedia.org? Elonka03:25, 23 January 2006 (UTC)
At the moment I am mainly working on making the code more robust. This will make using it on other sites easier, but then there are probably problems that I haven't foreseen. Martin09:51, 23 January 2006 (UTC)
Date delinking
I thought that the date delinking feature was removed. However, Bobblewik seems to be doing a lot of date delinking. For example, see this edit. Has it been removed, or has it been added back? --M@thwiz202002:21, 24 January 2006 (UTC)
You can certainly run the regex replacements manually, without using the delink checkbox. This will be true as long as replacements using regex notation are supported. DES(talk)03:14, 24 January 2006 (UTC)
However it happened, it seems that the changes are still not being reviewed in the latest round of the war on date links. In the Google edit cited above, [[January]] [[18]], [[2006]] got stripped of the brackets, rather than fixing it for the date prefs to work (manually removing the two middle brackets is all that is required). The malformed Dec 22, [[2005]] was handled in the same way. I know that reviewing changes like this is tedious, but my opinion is that quality shouldn't be sacrificed just for speed. Neier06:05, 24 January 2006 (UTC)
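The manual fix described here (removing the two middle brackets so that `[[January]] [[18]], [[2006]]` becomes `[[January 18]], [[2006]]`) could in principle be expressed as a regex replacement. The Python sketch below is only an illustration of that single case; `merge_split_date_links` is a hypothetical name and it does not attempt other malformed forms such as the "Dec 22" example.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# A month link immediately followed by a separately linked day number.
SPLIT_DATE_RE = re.compile(r"\[\[(" + MONTHS + r")\]\] \[\[(\d{1,2})\]\]")


def merge_split_date_links(text):
    """Join [[Month]] [[day]] into a single [[Month day]] link."""
    return SPLIT_DATE_RE.sub(r"[[\1 \2]]", text)
```

The year link is left as it is, so the result keeps the linked form that date preferences rely on rather than stripping the brackets.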
My name (Eagle 101) is in the list, but AWB tells me that I am not eligible to use it. What am I doing wrong? (Yes, I am logged in, on Windows SP v2, and have broadband internet.) What is wrong?
The user-checking script was changed a bit in 1.7.2. However, as far as I know, it was only changed to make sure the cookie is not empty. --M@thwiz202022:42, 24 January 2006 (UTC)
I used version 1.7.4.0 and had Czestochowa in the list. When "Bypass redirects" is checkmarked (default), AWB loops forever on that redirect ("Browser status is {Loading|Complete}"). I had to click the ignore button (BTW a stop button might be a good idea? Don't know). Same happens on Nowy Sacz, Znin, Elblag and others (from what links here of list of Template:Infobox Poland). Low priority problem for me. Just wanted to report it. --Adrian Buehlmann14:38, 25 January 2006 (UTC)
This one is for the wish list: it would be nice if the text edit window (lower right window) of AWB could use a fixed width font (like courier or so?). I recently noticed how helpful this edit window really is (I was a bit reluctant to edit there until recently but it works just great). But this is just a "nice to have" one (not that important, no clue how nasty to implement). Thanks! --Adrian Buehlmann15:42, 26 January 2006 (UTC)
Is there a way to have an option to turn off the loading of images? Some days, like today, I notice a huge delay after I click save, and it's mainly due to waiting for the images to load in the page. I wind up clicking Start the Process again to skip to the next article. --Kbdank7120:52, 26 January 2006 (UTC)
I just turned off images in IE's preferences, which isn't a problem normally as I use Mozilla for RealBrowsing (tm). Since AWB basically uses IE, that stops the loading of images. --Syrthiss20:56, 26 January 2006 (UTC)
AWB isn't letting me do any work since it keeps saying that I'm not logged in even though I'm logging in and I can even check a special page in the browser (which get pulled dynamically) and get the fact that I'm logged in. I'm using 1.7.4.0 btw. JtkieferT | C | @ ---- 22:58, 26 January 2006 (UTC)
Yes, how did you change the way that logins are confirmed? I never had this problem before the login checking was changed. JtkieferT | C | @ ---- 00:02, 27 January 2006 (UTC)
Bug with anchored links
The general cleanup function replaces underscores in link anchors with spaces. This is incorrect behavior; the part of the link after the '#' character should not be altered. Kelly Martin (talk) 06:26, 27 January 2006 (UTC)
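The behaviour this report asks for (clean the page name, leave everything after "#" alone) can be sketched by capturing the anchor as its own group. This is a Python illustration only, not the cleanup function's real code; `fix_underscores_keep_anchor` is a hypothetical name.

```python
import re

# Groups: page name, optional "#anchor", optional "|label".
LINK_RE = re.compile(r"\[\[([^\[\]|#]*)(#[^\[\]|]*)?(\|[^\[\]]*)?\]\]")


def fix_underscores_keep_anchor(text):
    """Replace underscores in the page name only; keep anchor and label as-is."""
    def repl(match):
        page = match.group(1)
        anchor = match.group(2) or ""
        label = match.group(3) or ""
        return "[[" + page.replace("_", " ") + anchor + label + "]]"
    return LINK_RE.sub(repl, text)
```

For example, `[[Unix_shell#Job_control|jobs]]` would become `[[Unix shell#Job_control|jobs]]`, and a same-page link such as `[[#See_also]]` would be left unchanged.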
I just registered and AWB isn't letting me do any work. It keeps telling me to log in, I keep logging in, and it keeps saying I'm not logged in. BTW, I've got version 1.7.5.0. Alr15:55, 28 January 2006 (UTC)
In the IE window it creates at the top of the page, does it show you logged in up there? I was switching to my bot account earlier and thought all I had to do was open up IE and log in there, but AWB didn't have me showing in the upper panel. Even though it says 'don't click in the top panel', you can click to do stuff like log in and look at your watchlist. --Syrthiss16:03, 28 January 2006 (UTC)
Do you usually browse wikipedia using Microsoft Internet Explorer? As far as I can tell, AWB relies on MSIE being configured correctly. --Netizen16:36, 28 January 2006 (UTC)
I use Firefox and only go with IE when absolutely necessary. I did try logging in with IE and then with AWB in various combinations. Didn't work once. Alr20:06, 28 January 2006 (UTC)
Under "alerts" when it says multiple wiki-links, if you double click a repeated link, it will highlight that link in the text, but will not highlight the last two square brackets: ]]. So for example if you double click on a link saying Spain, it will highlight the [[Spain part in the text box to the right, but not the final brackets. I hope I have made myself clear. Thanks, — FireFox • T • 17:58, 28 January 2006
First off, thanks :)
Making it so I can remove categories is a big help. I was doing it by hand (with awb opening the articles for me).
There appears to be a bug with the category removal though... if the category is listed like [[Category:Child prodigies|*]] (with the asterisk) then it wants to wipe out everything after that as well (other cats, interwiki links). I've just been ignoring those articles where that happens and I'll do them by hand. --Syrthiss04:07, 29 January 2006 (UTC)
Sorry, just confirmed that it will do it no matter what the cat construction syntax is in some cases. It just wanted to wipe out all the interwiki links in Jeremy Bentham when I was removing the child prodigy cat. --Syrthiss04:09, 29 January 2006 (UTC)
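The cause of this bug is not stated in the thread. One common way such over-deletion arises is a pattern that is allowed to run past the closing "]]"; the Python sketch below shows a removal pattern that cannot do so. It is an assumption-laden illustration, and `remove_category` is a hypothetical helper, not AWB's function.

```python
import re


def remove_category(text, name):
    """Remove [[Category:name]] or [[Category:name|sortkey]], and nothing else."""
    pattern = re.compile(
        # The sort key may contain anything except "]", so the match always
        # stops at the first closing "]]" and cannot swallow later links.
        r"\[\[Category:" + re.escape(name) + r"(\|[^\]]*)?\]\][ \t]*\n?"
    )
    return pattern.sub("", text)


page = ("Text.\n"
        "[[Category:Child prodigies|*]]\n"
        "[[Category:Philosophers]]\n"
        "[[de:Jeremy Bentham]]\n")
cleaned = remove_category(page, "Child prodigies")
```

Here the other category and the interwiki link survive, which is the outcome the report expected.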
I'm not good at filing bug reports, so I'll just describe step by step how to reproduce this bug and leave the brilliant prose for featured articles... :-)
Opened AWB, started process.
While waiting for pages to load, opened MSN messenger, started to chat.
At the same moment when page finished loading in AWB, a carriage return ("enter") would be sent to my chat window.
Closed MSN, but then went to IRC using XChat. The same "carriage return" event happened, cutting sentences half-way.
If pages finished loading in AWB while Firefox was in focus, the last tab I opened with middle click would re-open in a new tab.
The above happened enough times as to lead me to believe it wasn't just something I was doing wrong. It seems that AWB just wouldn't sit quiet in the background. I'm using the latest version of AWB (just downloaded it today) and I have no memory of this error happening before, so it might be interesting to have a look at this. Thanks. -- RuneWelsh | ταλκ22:36, 29 January 2006 (UTC)
Hmm. Think this has nothing to do with AWB. That category is really strange. Seems never ending presenting the same entries over and over again (next is always enabled though there seems no next). --Adrian Buehlmann09:21, 31 January 2006 (UTC)
Is it possible, if "Apply general fixes" is checked, to not move categories enclosed within noinclude tags to the end of the article? I've come across a few templates that someone had edited with AWB, and the category was moved outside of the tags, causing a number of articles that had that template to be miscategorized. --Kbdank7119:01, 31 January 2006 (UTC)
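One way the requested behaviour could work, sketched in Python under the assumption that the general fixes operate on plain wikitext: split the page on noinclude blocks and only collect categories from the parts outside them. The names `move_categories_to_end`, `NOINCLUDE` and `CATEGORY` are hypothetical and this is not how AWB is known to implement it.

```python
import re

NOINCLUDE = re.compile(r"(<noinclude>.*?</noinclude>)", re.DOTALL | re.IGNORECASE)
CATEGORY = re.compile(r"\[\[\s*Category\s*:[^\]\n]*\]\]\n?", re.IGNORECASE)

def move_categories_to_end(wikitext: str) -> str:
    """Move categories to the end, leaving those inside <noinclude> untouched."""
    parts = NOINCLUDE.split(wikitext)
    moved = []
    # With a capturing group, even indexes are the text outside <noinclude>.
    for i in range(0, len(parts), 2):
        moved += [m.group(0).strip() for m in CATEGORY.finditer(parts[i])]
        parts[i] = CATEGORY.sub("", parts[i])
    body = "".join(parts).rstrip("\n")
    return body + "".join("\n" + cat for cat in moved) + "\n"
```

A template whose only category sits inside noinclude tags comes back unchanged, so pages transcluding it are not recategorized.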
Is it possible to put a timer under the save button, such that the timer would reset every time the save button is pressed? It would be nice to have, if it does not take that long to program. Thanks!!!!! Eagle (talk) (desk) 22:39, 31 January 2006 (UTC)
To avoid editing so rapidly that other users claim that a bot flag is required. I have seen several complaints that any edits made at a rate of more than 2 or 3 per minute by an account without a bot flag using AWB should be considered bot-edits. I don't agree with this position, but avoiding clamor may be worthwhile to some. DES(talk)23:52, 31 January 2006 (UTC)
You have my intentions exactly, time goes reeeally slow when you are stub sorting, especially by regex. I want to make dead certain that I do not edit faster than 2 or 3 edits a minute. What I am doing is replacing ((stub-here)) with ((another related stub here)), using regex to narrow the field, and then double check that the regex did the right thing. More information on what I am doing can be found here. Eagle (talk) (desk) 00:09, 1 February 2006 (UTC) (the posted time is out of order due to edit conflict)
Really? When I use the AWB (which I haven't, unfortunately, been doing as much recently), I edit at about five pages per minute. To me, as long as I check each edit before submitting, I can go this fast. Each page loads in about five to eight seconds, so I can then spend seven to ten seconds scrolling through the changes window. If I see anything suspicious, I'll stop and investigate - otherwise, I'll submit. I see no reason why this technique should be criticized - sure, it's a lot of edits per minute, but they are each checked! --M@thwiz202000:12, 1 February 2006 (UTC)
Are you sure?? I will do what you stated above. Thanks!! Because most of my time now is spent watching the clock on my computer not editing or checking...are you sure that is ok.. according to the rules..., oh just tell me if it is ok!!!EagleDES(talk) 23:18, 1 February 2006 (UTC)(talk) (desk) 00:27, 1 February 2006 (UTC)
As I read the rules, if an edit is manually checked it is ok no matter how quick, but if someone objects to particular edits, they have been known to denounce them as "bot-edits" if faster than 2-3 per minute. Read WP:BOT, where things are less clear than they might be.
Is it possible to make this an option and not a mandatory timer (if it goes in at all)? I can edit faster than 2-3 per minute just by using firefox. --Kbdank7114:16, 1 February 2006 (UTC)
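A rough sketch of how an optional save timer along these lines might behave, written in Python for illustration only. The class name `SaveThrottle` and its parameters are assumptions, not anything in AWB. Leaving `max_per_minute` unset disables the limit, which covers the request to keep it optional.

```python
import time

class SaveThrottle:
    """Tracks the time since the last save; the limit is optional."""

    def __init__(self, max_per_minute=None, clock=time.monotonic):
        # No limit configured means no waiting at all.
        self.min_gap = 60.0 / max_per_minute if max_per_minute else 0.0
        self.clock = clock
        self.last_save = None

    def seconds_until_allowed(self) -> float:
        """How long the user should still wait before the next save."""
        if self.last_save is None:
            return 0.0
        elapsed = self.clock() - self.last_save
        return max(0.0, self.min_gap - elapsed)

    def record_save(self) -> None:
        """Reset the timer; call this whenever the save button is pressed."""
        self.last_save = self.clock()
```

A countdown shown under the save button would just display `seconds_until_allowed()` and could leave the button enabled, since the human check matters more than the exact rate.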
Some categories are in a particular order. However AWB goes through and sorts them into alphabetical order. The result is I constantly have to partially revert AWB changes made by users. Is there a way around this? Can new rules for category sorting be added? Jdorje05:50, 1 February 2006 (UTC)
I could do, but when pywiki bots changes categories they alpha sort them in a totally automatic fashion, so I dont understand how they get away with it but I don't. Martin09:47, 1 February 2006 (UTC)
If I see a wikibot do automatic sorting, I would complain about that too. But so far I've only noticed human users doing it (though I might have missed it). Jdorje16:40, 1 February 2006 (UTC)
Here's an example: [16]: (AWB assisted living people category -- I guess that's a person using it) and (Robot: Changing category United States soccer players)
This would be pointless, there is no reason why the specific order of the category listings matters since they're still classified the same way no matter what their order. Bluemoose, I suggest you leave this feature the way it currently is since there is no good reason for turning it off even if it is manually enableable. JtkieferT | C | @ ---- 19:51, 1 February 2006 (UTC)
It affects the order of categories as listed in the bottom of the article. These are in a specific order. Jdorje19:56, 1 February 2006 (UTC)
But what I'm saying is that other than the aesthetic tastes of a few there's no reason why they need to be done that way and there's no advantage to having them sorted that way, I will continue to use the category sort feature on every edit I make using AWB. JtkieferT | C | @ ---- 19:59, 1 February 2006 (UTC)
Your argument is circular. If there is no purpose to sorting them any way, why do you want to sort them at all? Obviously there *is* a purpose to sorting. It also helps editors to have a consistent sorting: for instance, Tropical cyclones are categorized by basin, season, strength, and location, in that order. Keeping the order consistent lets editors see at a glance that there are the correct number of each category applied (1 basin, 1 season, 1 strength, 0 or more locations). Putting them into alphabetical order provides no additional benefit whatsoever. Jdorje20:05, 1 February 2006 (UTC)
A consistent sorting would mean sorting every article the same way and since there is no consensus on which way is best alphabetically works as well as any. JtkieferT | C | @ ---- 20:16, 1 February 2006 (UTC)
The hurricane articles are sorted consistently. Of the 550+ articles, only a small fraction (those which have been changed by AWB or a bot) do not follow the same sorting rules. You are, of course, free to continue using the AWB sort feature - in most cases it is probably appropriate to do so. I am also free to keep fixing incorrect sortings, and to keep complaining. However I think it is not right for you to go out of your way to "fix" all tropical cyclone category sorting specifically, as you appear to be doing. Jdorje20:19, 1 February 2006 (UTC)
Might I suggest, if this is an important issue, that you try creating a template which adds an article to these multiple categories simultaneously in the correct order: this will also aid in making sure the sort key is consistent. HTH HAND —Phil | Talk10:25, 3 February 2006 (UTC)
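To make the "new rules for category sorting" idea concrete, here is a small Python sketch of a group-based sort. It is only an illustration: the function names, the keyword rules in `group_of` and the category names are all invented for the example, and a real rule set would have to come from the editors of the articles concerned.

```python
def sort_categories(categories, group_of, group_order):
    """Order categories by group; ties keep their existing relative order."""
    rank = {group: i for i, group in enumerate(group_order)}
    # sorted() is stable, so categories in the same group are not reshuffled.
    return sorted(categories, key=lambda c: rank.get(group_of(c), len(rank)))

def group_of(name):
    """Hypothetical classifier for tropical cyclone categories."""
    if "season" in name:
        return "season"
    if name.startswith("Category "):
        return "strength"
    if " in " in name:
        return "location"
    return "basin"

alphabetical = [
    "2005 Atlantic hurricane season",
    "Atlantic hurricanes",
    "Category 5 Atlantic hurricanes",
    "Hurricanes in Florida",
]
ordered = sort_categories(
    alphabetical, group_of, ["basin", "season", "strength", "location"]
)
```

With these made-up names, `ordered` starts with the basin category, followed by season, strength and location.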
Not sure if I am doing this correctly so I'll just describe what I did.
I wanted to fix all the double redirects to a page I just moved.
I generated a list using the what links here feature and told it to replace article name with newname.
When the program went to edit a redirect it followed the redirects to the page (the one I just moved) and did the search and replace. Of course there was nothing to replace since the article didn't link to itself.
Is there a way to make it not follow redirects (or do what I'm trying to do)? Oh, by the way could you have it give out a warning if someone types in the category name as "Category:This category" because then it will look for "Category:Category:this category" which is almost never what is intended. It's not that important though. BrokenSegue20:50, 1 February 2006 (UTC)
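As an alternative to a warning, the doubled prefix could simply be normalized away. A minimal Python sketch, assuming the category box takes free text; the function name `normalize_category_input` is hypothetical.

```python
import re

def normalize_category_input(user_input: str) -> str:
    """Return exactly one 'Category:' prefix, whatever the user typed."""
    # Drop any leading "Category:" prefixes (even repeated ones), then add one.
    name = re.sub(
        r"^\s*(?:Category\s*:\s*)+", "", user_input, flags=re.IGNORECASE
    )
    return "Category:" + name.strip()
```

Both "This category" and "Category:This category" then resolve to the same page title.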
It downloads ok but it appears to be corrupted: "no files to extract" when I try and extract it (even though it shows the files in the manifest). Dalf | Talk02:55, 3 February 2006 (UTC)
Option request
I wouldn't call this high up on the priority list, but I think it would be nice if there was an option to turn off the "List Complete" and the "No articles in list, you need to use the make list" messages. --Kbdank7114:09, 2 February 2006 (UTC)
The status bar would be fine, thanks. I was just thinking it would be two less things to click on (I'm extremely lazy). --Kbdank7114:54, 2 February 2006 (UTC)
Not updating after manual edit
Changes I make in the editing pane are not being reflected when I click "Preview" or "Show changes" again, although they are passed through when I click "Save".
This makes it all too easy to make mistakes, I fear: could someone fix it fast, please.
HTH HAND —Phil | Talk10:26, 3 February 2006 (UTC)
I introduced this bug recently by accident, it should have been fixed in 1.8, do you have this version? thanks Martin16:32, 3 February 2006 (UTC)
Excellent. This would also be useful in doing a certain very routine type of stub-sorting task: where a stub category is being split into a number of sub-categories, where each of those is already tagged by another stub-category (so the object is basically merger of (sets of) two existing patterns, e.g. {{writer-stub}} & {{US-bio-stub}} => {{US-writer-stub}}).
This is already doable in two passes, but that's obviously not ideal (more work, and even if set to auto-run, more server load and more RC spam). Another way would be if add/remove stub template support existed as with category at present (indeed, it's conceptually almost exactly the same thing, just inconveniently different in wiki-syntax).
I should say, superb job on this tool BM. It's sweeping my watchlist like wildfire, and it's a great hope for many of those mountainous cleanup task backlogs. Alai04:38, 4 February 2006 (UTC)
Is there a reason we can't save or load our settings? Specifically our regex and comment field settings? This would be a nice feature. Thanks. Eagle (talk) (desk) 22:16, 4 February 2006 (UTC)
When I run AWB on Wikipedia:AutoWikiBrowser, the preview shows that it is trying to empty out the contents of 2 <nowiki> sections (<nowiki>{{wikify}}</nowiki> and <nowiki>{{stub}}</nowiki>). Please point AWB at that page and take a look. --kingboyk23:34, 5 February 2006 (UTC)
It would be great if AWB could recognise a subst'd AFD tag, and not suggest moving the Pages for deletion category to the bottom. That would save some time for me, as a lot of my minor edits are tweaking of deletion-listed pages. Another time saver would be if it didn't suggest changes which are no more than the insertion or removal of line breaks. --kingboyk00:06, 6 February 2006 (UTC)
The AWB adds and removes empty lines just to "clean up" the page. If the only edits are adding or removing one line, then don't save. --M@thwiz202021:00, 6 February 2006 (UTC)
Yes, of course :-) But if AWB kept a count of how many changes it has made, and how many were 'trivial', it could skip to the next page if those numbers are low and equal - thereby saving me the time it takes to review. --kingboyk00:33, 7 February 2006 (UTC)
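A simple way to decide that an edit is "trivial" in this sense is to compare the two versions with blank lines and trailing spaces ignored. The Python sketch below is only an assumption about how such a check could look; `is_trivial_change` is a made-up name. Leading whitespace is kept in the comparison because it is significant in wikitext.

```python
def is_trivial_change(before: str, after: str) -> bool:
    """True if the texts differ only in blank lines or trailing whitespace."""
    def significant(text):
        # Drop blank lines and trailing spaces; keep leading whitespace.
        return [line.rstrip() for line in text.splitlines() if line.strip()]
    return before != after and significant(before) == significant(after)
```

A tool could skip to the next page automatically whenever this returns True, leaving only substantive diffs for review.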
It seems to be common usage to annotate an Oscar category with the name of the actor involved, in an HTML comment, and I assume this happens for other categories also.
However AWB unhooks these comments and leaves them dangling in mid-air.
Could the category-sorting code be fixed so as to preserve these comments?
HTH HAND —Phil | Talk10:05, 6 February 2006 (UTC)
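One possible fix, sketched in Python rather than in AWB's own code: treat a category link together with any HTML comment that follows it on the same line as a single unit, and sort the units. The names `CAT_UNIT` and `sort_category_block` and the example categories are invented for illustration.

```python
import re

# A category link plus an optional trailing comment on the same line.
CAT_UNIT = re.compile(r"\[\[Category:[^\]\n]*\]\][ \t]*(?:<!--.*?-->)?")

def sort_category_block(block: str) -> str:
    """Sort category lines alphabetically, keeping trailing comments attached."""
    units = [m.group(0) for m in CAT_UNIT.finditer(block)]
    return "\n".join(sorted(units, key=str.casefold))

block = (
    "[[Category:Best Actor Oscar winners]] <!-- actor name here -->\n"
    "[[Category:1994 films]]"
)
sorted_block = sort_category_block(block)
```

The comment travels with its category instead of being left behind at the old position.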
Request
It's nice to see the implementation of multiple find and replaces. Would it be possible to create multiline find and replaces too? For example, when you set AWB to find a sub-heading and replace it with nothing, it normally leaves an extra unwanted gap (example). This could probably be fixed with a multiline find and replace, if it is possible to do so. — FireFox • T • 20:57, 6 February 2006
An alternative: Bluemoose, program the AWB to replace multiple blank lines with one and remove blank lines after headings after finding and replacing. --M@thwiz202021:01, 6 February 2006 (UTC)
Ah ok. Thanks. — FireFox • T • 21:16, 6 February 2006
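The clean-up suggested above (collapse runs of blank lines, then drop blank lines directly after a heading) can be sketched with two regular expressions. This is an illustrative Python version with a hypothetical function name, not the rule AWB ships with.

```python
import re

def tidy_blank_lines(wikitext: str) -> str:
    """Collapse multiple blank lines and remove blank lines after headings."""
    # Two or more blank lines in a row become a single blank line.
    text = re.sub(r"\n{3,}", "\n\n", wikitext)
    # A heading line such as "== See also ==" followed by blank lines
    # keeps only the single newline that ends the heading.
    text = re.sub(
        r"^(={2,}.+?={2,})[ \t]*\n{2,}", r"\1\n", text, flags=re.MULTILINE
    )
    return text
```

Run after a find and replace that deletes a sub-heading, this would close the gap the deletion leaves behind.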
It's because it bullets external links on a new line after the ==External links== header. This was an unusual situation as the was inside a template and on a new line. Martin23:34, 6 February 2006 (UTC)
AWB 1.8.1: If I have set "Bypass redirects" AWB does not edit the redirect page, it instead edits the target page of the redirect (superb so far). I have now set in "Skip articles" (on the "Set options tab") a regex expression in "Skip if doesn't contain". It seems to me that the skip expression is applied on the redirect page. I would rather expect that the skip expression is applied on the page that AWB edits, i.e. the target of the redirect if "Bypass redirects" is set. BTW an option that would "click ignore for me" if it is a null edit would be fine ("skip null edits"?). (Sorry for not using the newest version, I have just loaded a bunch of multi-regexes right now into a 1.8.1 instance of AWB and I am too lazy to reenter them into the newest version :-). And sorry for being so greedy on features. AWB is just such an wonderful thing :-) --Adrian Buehlmann19:59, 7 February 2006 (UTC)
It does check the article it is redirected to to see if it should "ignore if does/doesnt contain", and this seems to be working fine as far as I can tell. I'll add the skip null edits thing to list of things I will do. thanks Martin20:28, 7 February 2006 (UTC)
You are right. Sorry for nagging. I'm an idiot. I was distracted by the short blink of the redirect page in the browser window. Thanks for picking up the "skip null edits". BTW do you have an example settings xml file somewhere (or a doc)? The version note for 1.82 says that AWB can load it (not yet save). --Adrian Buehlmann21:45, 7 February 2006 (UTC)
No problem, I have just uploaded version 1.84, this can save the settings as well. Hopefully you will be able to keep the same settings.xml file for any future versions, meaning you wont need to keep re-entering your settings. thanks Martin23:40, 7 February 2006 (UTC)
It's that problem IE has with certain unicode fonts in the URL, I am fixing them as I find them, so it is useful to know which ones it has a problem with. I have also added the option to skip pages that it hasn't made an edit on. (version 1.85). thanks Martin13:47, 8 February 2006 (UTC)
Oh, that's bad. This must be a real pain for developing. I had no clue about that font problem. Would it be helpful to start a list of these problem redirects somewhere? I could then add them there without noising up this page here. Or I could simply email them to you. Or we could just make a special section here. Just some ideas. Re 1.85: Many thanks, I'm hurrying to download... --Adrian Buehlmann14:20, 8 February 2006 (UTC)
Oh, dear. The new "skip articles with no changes" works like a rocket. Incredible. So far for reducing my wiki editing.... BTW a pause button (or ESC key?) would be something nice to have (very very low prio this). I'm a bit ashamed for constantly asking new stuff and frankly baffled by your responsiveness. Many thanks. I'll try to shut up my chatter a bit now. --Adrian Buehlmann14:42, 8 February 2006 (UTC)
Calm down, our kid! If it's skipping articles, then it isn't making edits. If you're blasting through your list at a rate of knots not changing anything, nothing is lost except your opportunity to make a cup of tea . HTH HAND —Phil | Talk09:15, 9 February 2006 (UTC)
OH. Just to make this clear: "skip articles with no changes" is an extremely helpful feature to me. It's great to see AWB rush automatically to the stuff I want to change. The idea behind having a stop/pause button is that I do not want to go away from my computer while AWB is running. But sometimes, I do have to do so :-). --Adrian Buehlmann09:48, 9 February 2006 (UTC)
thanks Adrian, btw if you want to reproduce the problem for yourself try copying and pasting http://en.wikipedia.org/w/index.php?title=Xagħra_Stone_Circle&action=edit into internet explorer, it doesnt load up properly. If you click the link it works, if you paste it into firefox it works. therefore i am certain it is simply a bug in ie. Martin22:29, 9 February 2006 (UTC)
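For anyone wanting to sidestep the pasted-URL problem, one general workaround is to percent-encode the title so the URL is pure ASCII. The Python sketch below shows the encoding only; whether this is what AWB ended up doing is not stated here, and the helper name `edit_url` is made up.

```python
from urllib.parse import quote

def edit_url(title: str) -> str:
    """Build an edit URL with the title percent-encoded as UTF-8."""
    encoded = quote(title.replace(" ", "_"), safe=":/")
    return "http://en.wikipedia.org/w/index.php?title=" + encoded + "&action=edit"

url = edit_url("Xagħra Stone Circle")
```

Here the letter ħ becomes the two escaped bytes %C4%A7, so nothing in the URL depends on how the browser handles non-ASCII characters.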
If you have "Skip articles with no changes" set, then the "Preview" button acts like "Ignore".
I can only assume that the dingus which checks for "no changes" is failing to spot any changes because these are not displayed when you do a "Preview".
I note that this feature is disabled when you set "Preview instead of diff".
Might I suggest that the "Skip articles with no changes" should only actually perform the skip when you first land on the article and thereafter be disabled until the next article?
HTH HAND —Phil | Talk17:37, 8 February 2006 (UTC)
It also skips incorrectly if you suffer a failure loading the article up. This is not so helpful when you then have to go back and figure out which article you just skipped so you can try it again. HTH HAND —Phil | Talk09:12, 9 February 2006 (UTC)
HTML entities bug
When making a list, article titles containing HTML entities such as an ampersand (&) appear in the list as the HTML code for them (&amp;). Simple fix I guess... BigBlueFish10:43, 9 February 2006 (UTC)
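The fix amounts to decoding entities once when titles are read from the page HTML. In Python the standard library does this with `html.unescape`; the wrapper name `clean_title` is just for the example, and AWB itself would use whatever equivalent its own platform provides.

```python
import html

def clean_title(raw_title: str) -> str:
    """Turn entities such as &amp; back into the characters they stand for."""
    return html.unescape(raw_title)

title = clean_title("Simon &amp; Garfunkel")
```

Titles without entities pass through unchanged, so the call is safe to apply to every list entry.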
Due to only the most recent version of the software being enabled, and being updated every few days, is it possible to enable multiple versions of the software, so that a particular download of the program lasts more than a few days, and then only require an update when a "major" update is made? The constant un-enabling of the different versions makes things quite difficult, as using AWB seems to require a new download practically every single time I use it.
A self-update feature in the program whereby it updates itself to the most recent version might make things less difficult as well.
Not that I would want to speak for Martin, but you might consider that AWB is still a development version and as such we – as early users – help to develop it by using it. As such I think the process that Martin uses (release often, release early) fits quite well. This also includes the early deprecation of older versions (it is good to use and test the new versions as they come out, as this helps improve the software too). This helps to make AWB strong and keep the workload for Martin as low as possible. I understand that this is a bit ugly for the users (me included!), but given the fact that the installation of AWB is trivial (just copy it in a directory) and that the settings can now be saved/loaded, I think we do Martin (and thus ourselves and Wikipedia) a favor if we do take the time to download the newer version before letting it go over articles. A new release each day is not that bad. --Adrian Buehlmann07:35, 10 February 2006 (UTC)
I don't mind that older versions are disabled, but I would prefer that the check occurred at the "Make list" point and not after the "Start the process" button press. I know I can save the results and reload them, but it would be easier to switch to the new version if I didn't have to repopulate the article list. That's a small annoyance that I can live with though. -- JLaTondre15:17, 11 February 2006 (UTC)
It's because it bolds the first occurrence of the title if it is near the very beginning of the article, other bold text does not occur at the beginning and the title is not bolded anywhere else in the article. The problems you point out are because I made a silly error in the code, I have uploaded 1.881 now, which fixes this. thanks Martin19:57, 11 February 2006 (UTC)
Minor edit setting not working?
I just started work on Wikipedia:Bad links with AWB. AWB is such a great tool for this project and saves quite a bit of time. However, for some reason, the setting "mark all edits as minor" isn't working. All my edits are being marked as major edits even though the setting is checked. I've also tried closing AWB and opening it again with no luck. Is this a bug or user error? PS2pcGAMER (talk) 22:34, 11 February 2006 (UTC)
Nevermind, it is working now. If you want to investigate this further, let me know. Otherwise I will just mark this as a goof by me. --PS2pcGAMER (talk) 22:38, 11 February 2006 (UTC)
I'll tweak it so it sets the checkbox on load as well, at the moment it only sets it on save, but I can't see anything wrong with it. Martin22:50, 11 February 2006 (UTC)
While going through Wikipedia:Bad links, I've been getting some weird results every 1/20 pages that I (luckily) catch in the "show diff" and fix manually. I'll review the code and let you know if I make any progress as to the cause of this. I can't give you a sample edit, though, since I fix them manually. --M@thwiz202000:40, 12 February 2006 (UTC)
Thanks! I'm going to stop going through the bad links for the night pretty soon, so just upload it whenever you can, and I'll download it tomorrow. I have other work to do now (although whether or not I'll do it now is another question - I'm a bit of a procrastinator.) --M@thwiz202000:58, 12 February 2006 (UTC)
I've uploaded 1.89 now, with the above fixes and numerous others, including stopping links being clicked in the browser window. I probably did other stuff too, I'm just too tired to remember now. Martin01:09, 12 February 2006 (UTC)
Does one need to have a bot flag on their account to use "bot mode" - I'm listed in the access as a bot. I have a 8000 entry list to do and I rather automate over clicking 8000 times :) Tawker12:22, 12 February 2006 (UTC)
I couldn't say if you need a bot flag or not, but the auto mode is virtually untested at the moment, once it is reliable then I have no problem with you using it. Martin12:31, 12 February 2006 (UTC)
From testing, it doesn't work if I have my bot username in the bots field, it still won't give me the field. Not sure if I want to use Auto on adding subst: anyways (I don't know if I'd trust the python bot to do it). Tawker23:41, 12 February 2006 (UTC)
Adding it to the bots isnt meant to enable the automode, When it is more thoroughly tested I will enable it for some users. Martin23:48, 12 February 2006 (UTC)
Martin, just as you have the "enabledusersbeginshere" tag, you could have a "botslistbeginshere" tag that tells the AWB if a user is a bot. --M@thwiz202023:56, 12 February 2006 (UTC)
I used to be able to use the "In Template Rule" to replace parameters and/or text within nested params (e.g. {{B}} is inside of {{A}}, and I need to change something in {{B}}). However, recently I can only affect the top-level template (e.g. I can edit stuff in A but not B). Is this something implemented in the most recent version, or am I doing something stupid? Primefac (talk) 19:07, 26 December 2016 (UTC)
I filed an admin task on this but then saw this user is CU-blocked so I closed as invalid. Task for anyone interested is phab:T257146 in case they want to chase it for some reason. Archiving otherwise. --Izno (talk) 17:27, 5 July 2020 (UTC)
Module bug
When I used AutoWikiBrowser to add Category:Module to Module pages, it looked like a bug: it shows red text and the pages can't be added to the category. What should I do? Is it an AutoWikiBrowser bug? (I'm a Wikia, aka Fandom, administrator.) — Preceding unsigned comment added by HansJie (talk • contribs) 11:12, 29 July 2018 (UTC)
You would need to agree with Reedy to create a new release of AWB, which we'd only normally do soon after a previous release if the fix is for a severe/significant issue. Rjwilmsi23:07, 30 November 2019 (UTC)
This is phab:T247694. Archiving this shortly even though it is not currently in a release version in an effort to consolidate bug reporting. --Izno (talk) 17:34, 5 July 2020 (UTC)
TypeInitializationException
I am getting the following error on AWB:
TypeInitializationException
Exception:
TypeInitializationException
Message:
The type initializer for 'WikiFunctions.Parse.Parsers' threw an exception.
Call stack:
at WikiFunctions.Parse.Parsers..ctor(Int32 stubWordCount, Boolean addHumanKey)
at AutoWikiBrowser.MainForm..ctor()
Inner exception:
TypeInitializationException
Message:
The type initializer for 'WikiFunctions.Parse.SiteMatrix' threw an exception.
Call stack:
at WikiFunctions.Parse.SiteMatrix.GetProjectLanguages(ProjectEnum project)
at WikiFunctions.Parse.MetaDataSorter.set_InterWikiOrder(InterWikiOrderEnum value)
at WikiFunctions.Parse.MetaDataSorter..ctor()
at WikiFunctions.Parse.Parsers..cctor()
Inner exception:
XmlException
Message:
The 'img' start tag on line 21 position 38 does not match the end tag of 'a'. Line 22, position 3.
Call stack:
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String[] args)
at System.Xml.XmlTextReaderImpl.ThrowTagMismatch(NodeData startTag)
at System.Xml.XmlTextReaderImpl.ParseEndElement()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlLoader.LoadNode(Boolean skipOverWhitespace)
at System.Xml.XmlLoader.LoadDocSequence(XmlDocument parentDoc)
at System.Xml.XmlLoader.Load(XmlDocument doc, XmlReader reader, Boolean preserveWhitespace)
at System.Xml.XmlDocument.Load(XmlReader reader)
at System.Xml.XmlDocument.LoadXml(String xml)
at WikiFunctions.Parse.SiteMatrix.LoadFromNetwork()
at WikiFunctions.Parse.SiteMatrix..cctor()
Comment:@Magioladitis: I can report similar errors and my default wiki is Wikisource. Presumably we have something coming out of the most recent Mediawiki update. I had an existing login that worked last weekend, and not working today. :-( — billinghurstsDrewth08:44, 12 December 2019 (UTC)
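The innermost exception in the trace above is a tag mismatch between an 'img' start tag and an 'a' end tag, which is what a strict XML parser reports when it is handed ordinary HTML. That the server returned an HTML page instead of the expected XML is an assumption, not something confirmed in this thread. The Python sketch below only reproduces the general failure mode; `parse_strict` and the sample payloads are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_strict(payload: str):
    """Return (root, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(payload), None
    except ET.ParseError as err:
        # Unclosed HTML tags such as <img> are not well-formed XML.
        return None, str(err)

ok_root, ok_err = parse_strict("<sitematrix count='1'/>")
bad_root, bad_err = parse_strict("<a><img></a>")
```

Catching the parse error at load time, rather than inside a static initializer, would let a program report the bad response instead of failing at startup.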
Using Win10, AWB 6.1.0.1, when I preview an article being edited, it appears that it does so using an IE11 browser object (Internet Explorer version: 11.0.18362.836 / .NET version: 4.0.30319.42000 / Windows version: 6.2). I'm seeing a number of problems using it. Perhaps someone can tell me if these are reported yet (and if they are, what I should search for on phab since I couldn't find anything relevant ):
Previewing 2018 VP1, in the infobox, |mp_category= uses {{Hlist}} to create a bullet-separated list. In the browser, this renders "ApolloNEO", with no space or bullet between the two. There are other styles that don't render quite right either. It renders correctly in Firefox and also when I browse the page from a separate instance of IE11 (though see below).
Clicking on links to sections from the table of contents does nothing, nor do links to references.
If I scroll down to the reference list and (left-)click on a linked reference, the link turns purple as though it has been visited, but nothing happens. If I right-click on the link and choose "Open", nothing happens. If I right-click and choose "Open in new window", it opens a large IE11 window (~ 1700×900 px – almost the full size of my 1920×1080 px display). The website being browsed, though, ends up in a separate, borderless (1000×460 px) window near the upper left with extremely small text. The image map for its menus is also confused. I get the "DATA" menu highlighted when the mouse pointer is down/over by "Read MPEC 2018-V41". The vertical scroll bar doesn't work if you try to grab it where it appears. It does work, though, if you grab what would be the original (large) browser window's scroll bar (near the right edge of the screen). If I reduce the size of the "large" browser window, the smaller window reduces in size also. The place you have to grab the scrollbar moves also, remaining at the right side of the large browser window. Here's a (cropped) image of the original (before I resized it):
Yes, the webbrowser implementation we use in AWB that's part of C# WinForms has a number of limitations that I don't think we can do much about. Rjwilmsi18:32, 26 May 2020 (UTC)
Option: Do not apply WP:MOS fixes
The option shown above when ticked still applies the MOS:ORDER fixes which itself is a bug on the current release. If this option was working it would be a workaround, but it doesn't work with MOS:ORDER either. Sun Creator(talk)17:27, 11 December 2019 (UTC)
This general fix for the incorrect accessdate parameter is a false positive as the accessdate for that reference was not "2016-10-08". It could have been "2016-10-18" or "2016-10-28" but in this case, it was none of them so I reverted it. However, why is AWB putting a random number in front of "2016-10-8" without knowing what the real accessdate is? Pkbwcgs (talk) 13:13, 2 January 2020 (UTC)
This seems to be another strange edit. Why is AWB assuming that if the date parameter is set at "2017-8-3", it means "2017-08-03"? Pkbwcgs (talk) 13:31, 2 January 2020 (UTC)
I don't see the issue; if it was supposed to be -18, then regardless of whether it's -08 or -8 it's wrong. However, the accessdate param requires it to be in full YYYY-MM-DD form, so -8 is definitely incorrect. I'd rather be wrong and formatted correctly than wrong and formatted incorrectly. Primefac (talk) 16:27, 2 January 2020 (UTC)
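To show what the formatting fix does and does not do, here is a small Python sketch of zero-padding a numeric date. It is an illustration under the assumption that the fix is a pure reformat; `pad_iso_date` is a made-up name and not the AWB routine. It never guesses a missing digit: a day of 8 becomes 08, and anything that is not already year-month-day in digits is returned untouched.

```python
import re

def pad_iso_date(value: str) -> str:
    """Zero-pad month and day of a Y-M-D date; leave other text unchanged."""
    match = re.fullmatch(r"(\d{4})-(\d{1,2})-(\d{1,2})", value.strip())
    if not match:
        return value
    year, month, day = match.groups()
    return f"{year}-{int(month):02d}-{int(day):02d}"
```

So "2017-8-3" becomes "2017-08-03" and "2016-10-8" becomes "2016-10-08". Whether the original day was mistyped is a separate question that no reformatting can answer.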
Genfixes wants to insert Reflist when references tag is already on the page
AWB wants to insert {{Reflist}} on 2019–20 coronavirus pandemic despite the code already containing <div class="reflist columns references-column-width" style="-moz-column-width: 30em; -webkit-column-width: 30em; column-width: 30em; list-style-type: decimal;"><references responsive="0"> and </references></div>. --Ahecht (TALK PAGE) 19:35, 23 April 2020 (UTC)
This is not a bug. The editors of that page deliberately substituted templates. Archiving shortly. --Izno (talk) 17:57, 5 July 2020 (UTC)
Edit window syntax coloring all gray after reverting a suggested edit
If, in the diff section, I double click on a line to revert a change that was made automatically (by general fixes, custom regex, etc.), all the text in the edit window changes to be highlighted in dark gray (#D3D3D3) instead of the normal syntax coloring. Is this an existing reported bug (I couldn't find it in phab)?
Also, is there an existing feature (or feature request) to show the line number the cursor is on in the edit window? When trying to find in the edit window an edit shown in the diff window as being on line number nnn, it would be far easier if I could see those line numbers in the edit window. Or is there perhaps a better way to get to a place in the edit window relevant to a change in the diff window? Thanks. —[AlanM1 (talk)]—01:00, 22 May 2020 (UTC)