This is an archive of past discussions on Wikipedia:AutoWikiBrowser. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
I'm trying to make a list from Special:Log/Newusers, but I'm not getting any users whose talk pages don't yet exist, even if I uncheck "Ignore existing pages" in the "Skip articles" section. I deduce that this is because the code added or tweaked per the request at Wikipedia talk:AutoWikiBrowser/Archive 4#suggested functionality addition assumes that one would only want users with live talk pages. But I want to find users without talk pages, and I suspect that the unchecked "Ignore" option never comes into play because the generated list must first include the desired pages. If so, could this be fixed so that all users in the desired portion of the log are represented? If this is done, the default behavior of skipping non-existing pages should automatically provide the current functionality, and folks in my situation will be accommodated as well. Thanks. ~ Jeff Q (talk) 00:29, 12 September 2006 (UTC)
I would like to use AWB's excellent mechanisms for fetching pages and examining their content to generate a list of pages with challenging editing problems. The idea is that AWB can find problem pages matching a specific pattern, but the fix to each page may take some research, so it would be nice to simply generate a list for offline work. However, I haven't come up with a decent way to do this. The "Make list" filter only works on page names, as I understand it. The skip articles can identify target articles (or filter out non-targets), but only to perform an operation on them — they toss the page off the page list whether or not they perform the operation. (Tagging the articles for attention is an option, but I'd prefer to create an offline list rather than edit each article twice, once to tag and once to fix.) Nor can I see how to use the "Find and Replace" options, even the "Advanced" rules, to manipulate either the page list or a separate file (like a log). Do the experienced AWB users here have any advice for this AWB newbie? Thanks. ~ Jeff Q(talk)00:51, 12 September 2006 (UTC)
If you can program in C# or VB.NET your best bet would be to make a plugin. It would be very simple to implement. You'd build your list, AWB would send the text of each article to the plugin, the plugin would analyse the content and write it out to a log and just tell AWB to skip the page (so AWB wouldn't actually do any edits). You wouldn't need a fancy user interface or anything so you could do that with a few lines of code and some regular expressions. --kingboyk10:10, 12 September 2006 (UTC)
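The scan-and-log approach described above can be sketched outside AWB as well. The following is only an illustration in Python, not the AWB plugin interface: the names `find_problem_pages` and `format_log`, and the idea of passing in a `get_text` callable, are assumptions made for this example.

```python
import re
from typing import Callable, Iterable, List


def find_problem_pages(titles: Iterable[str],
                       get_text: Callable[[str], str],
                       pattern: str) -> List[str]:
    """Return the titles whose text matches the pattern; nothing is edited."""
    rx = re.compile(pattern)
    return [title for title in titles if rx.search(get_text(title))]


def format_log(titles: Iterable[str]) -> str:
    """Format matching titles as a wikitext numbered list for offline work."""
    return "".join(f"# [[{title}]]\n" for title in titles)


# Hypothetical in-memory pages standing in for fetched article text.
sample_pages = {
    "Article one": "This sentence contains teh typo.",
    "Article two": "This sentence is clean.",
}
problem_titles = find_problem_pages(sample_pages, sample_pages.__getitem__, r"\bteh\b")
print(format_log(problem_titles), end="")
```

Every page is examined and only the matching titles are written out, which mirrors the "analyse, log, then skip" behaviour suggested for the plugin.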
Sounds like fun. You don't happen to know of any cheap (and legal!) C# or VB.NET programming tools, do you? I can't even afford to upgrade my Windows OS with Microsoft's monopoly-enabled fees. ~ Jeff Q(talk)21:12, 12 September 2006 (UTC)
Microsoft Visual Studio Express. The bees knees. AWB is developed in the C# version. My plugin uses VB.NET (which, of course, all the best programmers use - isn't that right Martin? ;)) --kingboyk 21:17, 12 September 2006 (UTC) PS There are Java and C++ versions too, but I can't vouch for either of them as I haven't used them. --kingboyk21:19, 12 September 2006 (UTC)
Cool! I've been wanting to try out C# after having read an article about it a few years back that made it look better designed for OOP than C++'s grafting of OO onto C. (Ugh, what geeky alphabet soup.) Thanks for the info. ~ Jeff Q (talk) 00:38, 13 September 2006 (UTC)
I'm as happy to bash MS as the next guy (my first PC had Linux on it over 10 years ago), but dotnet is OOP heaven. When I first read a massive tome on it every page was "wow, it does that?" and "that's clever". It's first rate. Definitely as a C++ programmer you want to use C#. I'm using VB.NET as I have a lot of experience with VBA in Access, and VB6. They all compile to the same Intermediate Language so, with a very small number of exceptions, they all do pretty much the same thing. Good luck and let us know how you get on! --kingboyk09:31, 13 September 2006 (UTC)
Oh noez! A programming language thread ;) ! C++ does have its merits. But not for those using old-fashioned C programming paradigms (read a decent book that explains things like RAII). I admit, the average Joe programmer is quick at achieving progress in C#; as such it isn't a bad language. It's also cool for rapid prototyping. --A C++ freak ;-) 09:51, 13 September 2006 (UTC)
Have you tried C++.NET? Is it any good? Or are they incompatible bedfellows?
Horses for courses. I'm into rapid application development. I have no desire to write device drivers, no ability with art so no interest in creating fancy graphics etc etc. I also think there's a certain amount of snobbery about low vs high level languages. Indeed take C# vs VB.NET - VB can do almost everything that C# can do, but it's a higher level language. Surely that makes it better? (unless coming from a C background). --kingboyk10:16, 13 September 2006 (UTC)
bypass wikilinks while scanning database
1) Could there be an option "ignore wikilinks" (in the wiki database scanner)?
2) Feature request: automatically checking a list of articles for a MISTAKE. Lists built from the database go out of date quickly, so many of the listed articles have already been fixed. To eliminate those fixed articles I load new settings containing a single regex/string that matches the MISTAKE, set "skip when no change/replacement made" and press "start the process". Articles in which no change/replacement is made are removed from the list (I don't want general or other fixes applied to them if the MISTAKE is no longer present), but the process stops as soon as a MISTAKE is found. What I'd like is for all articles to be checked automatically (similar to "auto save"): remove the article from the list if there's no MISTAKE, leave it on the list if a MISTAKE is found, and move on to the next article. Could this be implemented in a future version? gregul
I'm not sure what you mean by ignore wikilinks? As for the second idea, if I understand correctly, this has been suggested before, but I refused on the grounds that it would be a large drain on servers to have people crawling through thousands of pages. Martin13:55, 12 September 2006 (UTC)
2) I will be doing this by switching articles by hand anyway; this is to avoid making redundant edits to articles that are still included in the database
1) Not to search inside [[pl: ]] [[se: ]] etc. (it's called "ignore interwiki links" in "find and replace") --gregul
Right now, if I search through the database, sometimes there's nothing to change because I'm ignoring interwiki links in "find and replace". gregul
404 on startup with nonstandard Default.xml
Once again, this is using AWB with en.wikinews, I've overwritten the default config .xml file with that detail, plus setting the EnableRegexTypoFix option. Now, whenever I start up AWB I get a 404 error, my guess it is perhaps looking for a page of regexes on Wikinews. If this is the case, can you let me know what I'd need to create on wikinews, and where I'd need to copy from? I'd love to be able to include fixes to change quotes from MS Office into plain quotes - they break our PDF/print edition.
Steps to reproduce are, File->User and project preferences, set project to Wikinews, select make from Category, enter a recent date (eg September 1, 2006), click on the More options tag and select Enable RegexTypo Fix, uncheck Skip article, click Make list, select File->Save settings, overwrite Default.xml, quit AWB, restart and observe the error, should be: The remote server returned an error: (404) Not Found. --Brianmc17:35, 12 September 2006 (UTC)
Thank you for this, I've copied the typo list from wikipedia and added it to my watchlist so I spot updates. I really appreciate this tool and have made some significant changes on Wikinews with its help. --Brianmc20:14, 12 September 2006 (UTC)
No. It's quite usual, alas, for it to crash after a thousand or more edits, but I've never had it crash after one or two. --kingboyk 22:33, 12 September 2006 (UTC)
Hmm.. I also have the same problem, I tried both 3.0.2.9 and 3.0.3.0 and they crashed on my first and second edit. Dunno why though. --WinHunter(talk)01:25, 15 September 2006 (UTC)
XML settings bug?
Loading these settings (make list from category) I get an error at
if (reader.MoveToAttribute("index"))
listMaker1.SelectedSource = (WikiFunctions.Lists.SourceType)int.Parse(reader.Value);
in UserSettings.cs.
Settings (tested with plugin deleted, it's not a plugin issue) -
On further inspection I think the settings are getting saved incorrectly, and "selectsource index" should be "0", not "category"? --kingboyk14:38, 13 September 2006 (UTC)
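To illustrate the failure described above: parsing the saved value as an integer breaks as soon as the file contains a name such as "category" instead of an index. Below is a minimal Python sketch of a more tolerant loader that accepts either form. The `SourceType` members, their order, and the function name are assumptions for this example; they are not taken from AWB's `WikiFunctions.Lists.SourceType`.

```python
from enum import IntEnum


class SourceType(IntEnum):
    # Hypothetical members; the real enum and its ordering may differ.
    CATEGORY = 0
    LINKS_ON_PAGE = 1
    TEXT_FILE = 2


def parse_source_type(value: str,
                      default: SourceType = SourceType.CATEGORY) -> SourceType:
    """Accept a numeric index ("0") or a member name ("category")."""
    value = value.strip()
    try:
        # Raises ValueError for non-numeric text and for unknown indexes.
        return SourceType(int(value))
    except ValueError:
        pass
    try:
        return SourceType[value.upper().replace(" ", "_")]
    except KeyError:
        # Fall back instead of aborting the whole settings load.
        return default


print(parse_source_type("0"), parse_source_type("category"))
```

Falling back to a default keeps one bad attribute from stopping the rest of the settings from loading; whether that is preferable to reporting the error is a design choice.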
Make list from category - first 200 articles. Sometimes I want to sample the category and not get the entire thing (especially if it contains 100,000 articles!). "Links on page" for a category page doesn't currently work; an alternative to my request might be to make "links on page" work for a category page, i.e. it returns the listing on the first page.
My Auto Wiki Browser refuses to believe that I'm logged in, even though I very obviously am. As you can see at the screenshot to the left, I had logged in successfully, yet it was still prompting me to log in again. What on earth is the matter? Ingoolemotalk04:48, 16 September 2006 (UTC)
I'm finding that when I create a list of articles from multiple large categories, AWB omits a substantial number of the articles. Specifically, the subcats of Category:Orphaned articles, there are about 17,000 articles listed, and when I create a list from them, many articles are left out of the list (several hundred at least), even if I try it twice. So um... any ideas? If this is a known bug, is there any reliable tool to generate a list of all articles in a large category? --W.marsh14:29, 16 September 2006 (UTC)
Ah, I'm glad you posted this. When I build a listing of Category:Living people I get 120,000 or so articles. If I build a list of Category:Biography articles of living people (talk pages tagged with living=yes) I only get 101,000. I do a bot run and discover that thousands of my remaining 20,000 or so articles already have living=yes. Either MediaWiki hasn't updated the category properly (unlikely, because the job queue runs often enough on WPBiography); the list comparer is broken (possible, but I don't think it's this); or there's something wrong with the list grabbing from large cats. --kingboyk 14:37, 16 September 2006 (UTC) PS My plugin keeps a log so I can furnish a skipped list if need be. --kingboyk 14:37, 16 September 2006 (UTC)
I've noticed this; as it only occurs on very large categories, I half suspect it is the query API rather than AWB, but I'll find out for sure soon. Martin 15:21, 16 September 2006 (UTC)
Adding wikiproject banner to talk pages
Hi, is it possible to add a wikiproject banner to the talk pages of articles using AWB? I clicked on more options, clicked on append message and wrote down the banner of the project {{WP India}}. And then I saved. But nothing happened. Please suggest -- Lost19:50, 16 September 2006 (UTC)
Thanks, I did set all settings as far as I could gather. But not able to do it. Help would be greatly appreciated. -- Lost04:52, 17 September 2006 (UTC)
Purpose of AWB?
I'm sorry if this is a stupid question, but what is the actual purpose of AWB and/or what is the main function of it that makes it superior to simply going around in IE and editing pages? The article doesn't exactly make it clear (to me). I tend to see mainly spelling and grammar errors corrected with AWB tags in the change-log. What exactly does AWB allow you to do? TheHYPO00:42, 17 September 2006 (UTC)
AWB can be used for repeating the same task over and over and over and over again. Like adding a template to every page in a category, (even hundreds of them). It can also be used to do tasks like update a link or image on pages. It is not designed to replace your normal browser, or to be your primary editor. Some bots run solely using the find and replace utility of AWB. — xaosfluxTalk02:02, 17 September 2006 (UTC)
User login
Hi, I'm using AWB on Swedish Wikipedia, and it works well. To my knowledge, I have not entered my username in AWB or its config files, still AWB is logging in with my standard login. How can it work? Magic? I'm clogging down the RC with my edits though, how do I make AWB login as my bot account? //Knuckles06:22, 17 September 2006 (UTC)
Yes, it's magic :) Actually, no, it's because AWB uses the Internet Explorer engine. You'll have to log out of Wikipedia in IE and log back in as your bot. If you want to run AWB and do manual edits at the same time using 2 different accounts, use IE for your bot and do your manual edits in another browser like Firefox or Opera. --kingboyk09:13, 17 September 2006 (UTC)
Hi there. Occasionally, people will come across the Black Mesa (game mod) page using AWB and "correct" the spelling of a person being quoted, even when the spelling is in their exact words. Is there a way to prevent this? Thanks. Viewer06:28, 17 September 2006 (UTC)
Put "(sic)" or "[sic]" next to the intentional spelling mistake, and any AWB user with any wits about them will know it's as quoted and leave it? --kingboyk09:10, 17 September 2006 (UTC)
What to do, what to do?
I have just downloaded and been registered for AWB and looked at the Terms and Conditions. Does spell-checking - the task I plan to complete with it - class as unnecessarily minor edits? Thanks. Ck lostsword|queta!|Suggestions? 17:25, 17 September 2006 (UTC)
Are you sure? I'd been under the impression from Wikipedia:Minor edits that simple spelling corrections were minor edits. Where I'm still fuzzy though, is whether the addition of a {{stub}} template counts as minor or not. --Elonka18:15, 18 September 2006 (UTC)
Err... no, this is a separate issue. You're talking about "do I tick the Wikipedia 'minor edit' box or not?". The original question was about the terms and conditions of AWB and not making "unnecessary minor edits". --kingboyk 18:19, 18 September 2006 (UTC)
In addition to Kingboyk, the way I see it, an unnecessary minor edit is one that doesn't change the presentation, i.e. how the page looks to end-users, like the examples listed under Rules of Use and many of the ones under General fixes. Spelling is not an unnecessary edit, because it improves the quality of the article. Harryboyles 12:30, 20 September 2006 (UTC)
Memory leak?
Again, this is something I've observed before I started using a plugin, so the problem is within AWB itself. I've found that AWB memory usage can increase steadily throughout a session until it's at 400MB or more of physical RAM. Also in the past it's been normal for me to wake up in the morning and find that AWB stalled throughout the night. To counter the second problem, I've added a feature to my plugin to stop and restart AWB if the list isn't empty and it doesn't send any articles to the plugin in 10 minutes. Unfortunately that has the side effect of trying to keep AWB running if it's struggling for memory.
My machine has 1GB of memory, but this morning when I got up both of my AWB processes had crashed as out of memory and, rather annoyingly, they'd taken my Firefox with umpteen open tabs down with them. I can only imagine that certain resources aren't being disposed of correctly or objects are somehow kept alive when no longer needed. Any ideas, Martin, and has anyone else doing thousands of automated edits noticed this? --kingboyk 10:04, 18 September 2006 (UTC)
It's the IE control, it seems to want to cache pages. I have never had any problems with it, even on large runs, maybe your IE has a different option set to cache pages in a different way or something. Martin10:08, 18 September 2006 (UTC)
Ah. Well, remember the issue I had with gigs of pages being cached? Also, since I zapped that cache my MSDN help viewer has been f*cked too. My version of IE must have problems. Any registry settings or owt you know of to help fix it? --kingboyk10:20, 18 September 2006 (UTC)
My PC's got 2GB of RAM in it, and 1GB of page file. I've had AWB running for long enough to have to close it because it had used all the page file. I haven't really done AWB runs recently, or any large ones, but it seems to be a bit better in the newer version.
I've noticed it in any .NET app that I've created: whenever you open or close forms and press buttons, the memory usage just keeps increasing! I know people run AWB bots and stuff, with, I think, mboverlord running one quite a lot... And Martin, you have bluebot, don't you? Reedy Boy 10:41, 18 September 2006 (UTC)
I regularly do runs of multiple thousands of edits without any problem, the memory usage does get quite high after a couple of thousand, but it seems to reach a ceiling eventually. Historically there was a problem with stalling occasionally, but that particular problem has been solved. Martin13:10, 18 September 2006 (UTC)
I made ~3000 edits today, I noticed memory usage went up to ~300mb, then IE seemed to purge itself and it went right back down. Martin18:27, 18 September 2006 (UTC)
Pre-loading pages
As a feature request, would it be possible to have AWB pre-load a page? Sort of like running a tabbed browser? I notice that when I'm running through a long list (such as Special:Uncategorizedpages), that I usually only need a few seconds to actually decide what to do with a particular page, but that it takes just as long to wait for the next page to load after I hit "save." If AWB could be pre-loading the next page in the list, while I'm making the decision on the current one, that would speed things up considerably, as I wouldn't have the "wait for page load" delays. --Elonka18:20, 18 September 2006 (UTC)
It would be pretty difficult to implement. Normally the delay is fairly insignificant, but the servers have been slow for the last couple of days. Martin18:27, 18 September 2006 (UTC)
Also, sometimes I use AWB from a dialup location (yes, I know I have masochistic tendencies <grin>), so in those situations, the pre-load would still be very useful. :) But I understand if it would be too difficult -- I figured it couldn't hurt to ask! --Elonka18:54, 18 September 2006 (UTC)
Would it be that difficult? You could grab the next page in advance using an invisible webcontrol object. Of course, some nasty plugin developer might come along and have a plugin programatically insert new items at the top of the file list though ;) --kingboyk19:04, 18 September 2006 (UTC)
One way to speed up the loading process in general is by setting Internet Explorer to not load images. Back in User:JoeBot's heyday (i.e. avoiding studying for finals), it really helped for long lists of articles with misspellings. JoeSmackTalk 18:26, 19 September 2006 (UTC)
As a follow-up, I've found that I can also speed things up by keeping open two instances of AWB, each working on a different section of a list. That way as soon as I click "Save" on one, I can flip to the other version and work on its page, while waiting for a new page to load in the first version. It'd still be a bit easier if it worked like Firefox with the animated "swirl" on a tab showing me that a page was still loading, but this way works too. :) --Elonka23:37, 21 September 2006 (UTC)
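The pre-loading idea in this thread (fetch the next page while the current one is being reviewed) can be sketched in Python as follows. This is not how AWB or its web control works; `with_prefetch` and the `fetch` callable are hypothetical names, and the sketch assumes the list does not change after the next page has been requested, which is the caveat raised above about plugins inserting items at the top of the list.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, Tuple


def with_prefetch(titles: Iterable[str],
                  fetch: Callable[[str], str]) -> Iterator[Tuple[str, str]]:
    """Yield (title, text) pairs, loading the next page in the background."""
    it = iter(titles)
    with ThreadPoolExecutor(max_workers=1) as pool:
        try:
            current = next(it)
        except StopIteration:
            return
        future = pool.submit(fetch, current)
        for upcoming in it:
            # Queue the next page before handing the current one to the caller.
            next_future = pool.submit(fetch, upcoming)
            yield current, future.result()
            current, future = upcoming, next_future
        yield current, future.result()


# Stand-in for a slow page load; a real fetch would request the page text.
def fake_fetch(title: str) -> str:
    return f"text of {title}"


for title, text in with_prefetch(["Page A", "Page B", "Page C"], fake_fetch):
    print(title, "->", text)
```

A single worker thread keeps the requests in list order and never has more than one page loading ahead, so the extra load on the servers stays at one request in advance.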
I'd really rather not deal with HTML from Wikipedia pages, as I think that's a job which should be encapsulated in the webcontrol :) My plugin doesn't get its hands dirty with such things, it just processes article text :) Nice suggestion though; any chance of implementing something like that into the webcontrol Martin? --kingboyk14:30, 19 September 2006 (UTC)
The html snippet above is missing in the page returned from the "&action=edit" URLs, so that would mean an additional load of the whole page before going into edit of the page. Very inefficient, but it could work. Maybe we could "simply" ask the MediaWiki devs (duh!) to include the revision ID somewhere in an html comment of the "&action=edit" page. We could try to make a patch for MediaWiki. But I don't know what chances that patch would have to get live ;-)... --Ligulem 14:47, 19 September 2006 (UTC)
"The html snippet above is missing in the page returned from the "&action=edit" URLs" Doh! How bizarre - I imagine they'd be quite happy to rectify that? --kingboyk14:48, 19 September 2006 (UTC)
It would make sense to put the revision id in with the other javascript variables; after all, there are similar variables already there:
<script type= "text/javascript">
var skin = "monobook";
var stylepath = "/skins-1.5";
var wgArticlePath = "/wiki/$1";
var wgScriptPath = "/w";
var wgServer = "http://en.wikipedia.org";
var wgCanonicalNamespace = "Project_talk";
var wgNamespaceNumber = 5;
var wgPageName = "Wikipedia_talk:AutoWikiBrowser";
var wgTitle = "AutoWikiBrowser";
var wgArticleId = 3625052;
var wgIsArticle = false;
var wgUserName = "Bluemoose";
var wgUserLanguage = "en";
var wgContentLanguage = "en";
</script>
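If the revision id were exposed alongside these variables, a client could read it the same way it can read the existing ones. The Python sketch below parses simple `var name = value;` lines such as those quoted above. It is illustrative only: `parse_js_vars` is a hypothetical helper, it handles only one-line string, integer and boolean values, and it reads only variables that appear in the quoted page (no revision id variable is assumed to exist).

```python
import re

# Matches one-line declarations of the form: var name = value;
VAR_RE = re.compile(r'^\s*var\s+(\w+)\s*=\s*(.+?);\s*$', re.MULTILINE)


def parse_js_vars(html: str) -> dict:
    """Collect simple javascript variable declarations into a dict."""
    result = {}
    for name, raw in VAR_RE.findall(html):
        raw = raw.strip()
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            result[name] = raw[1:-1]          # quoted string
        elif raw in ("true", "false"):
            result[name] = (raw == "true")    # boolean
        elif re.fullmatch(r"-?\d+", raw):
            result[name] = int(raw)           # integer
        else:
            result[name] = raw                # anything else, left as text
    return result


SAMPLE = '''<script type="text/javascript">
var wgPageName = "Wikipedia_talk:AutoWikiBrowser";
var wgArticleId = 3625052;
var wgIsArticle = false;
</script>'''

page_vars = parse_js_vars(SAMPLE)
print(page_vars["wgArticleId"])
```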
Hmm. No hotline from my side. Don't expect a fast response from a bugzilla entry. An entry on bugzilla with a patch has much better chances. We could ask on wikitech-l what would be the best way to implement that. Think I should definitely start doing a test MediaWiki install finally. I've synced to the sources with SVN lately to dig up something for the village pump [1]. Oh well. --Ligulem16:25, 19 September 2006 (UTC)
Problems w/ running AWB under VirtualPC
I've been trying to persuade AWB to run on Windows XP running on my PowerMac by Virtual PC. I can open the program fine, can log in and get the green light, but when I try to set AWB running I get an edit window appearing at the top, the text appearing in the box on the bottom-right, and the message "Loading changes" appears in the bottom left. However, that message doesn't go away, and the process then restarts (I get the "restarting in 6..5..4..etc, and a "Page cannot be displayed" message in the top window). Does anyone have any suggestions for getting this running properly? Thanks. Mike Peel17:01, 19 September 2006 (UTC)
Since you have my plugin installed, I'd suggest starting by exiting AWB and deleting the file "Kingbotk AWB Plugin.dll" from the AWB folder. Then try again, and report back here. Let's keep things simple by finding out if vanilla AWB works. --kingboyk17:15, 19 September 2006 (UTC)
Tried again with a fresh copy of AWB, version 3030. All preferences should be as standard. Exact same problem. In case it matters, it's located in C:\Documents and Settings\Administrator\Desktop\AutoWikiBrowser. I'm running Windows XP SP2, with .NET framework 2.0, under Virtual PC for Mac 7.01 on Mac OS X 10.4.7. The only other application installed on XP is Virtual Machine Additions. I've not installed any XP updates past SP2. Mike Peel18:38, 19 September 2006 (UTC)
Yes. I have no problem accessing the internet from the virtual machine in general. Nor have I experienced any unusual delays in loading pages. AWB gets the contents of categories, and the initial page contents, without a problem. It then seems to stall when applying changes to the content. Mike Peel19:26, 19 September 2006 (UTC)
Though it would be interesting if it did work on that platform, as I have no way of testing/debugging, it is almost impossible for me to work out what is going wrong. If I had to hazard a guess, I would say that the IE control doesn't work 100% on that platform. Martin10:11, 21 September 2006 (UTC)
I've just downloaded the latest version of AWB, and it seems to be working fine now. I guess that the bug was one of those fixed in the last few versions. My thanks to both of you for your help. Mike Peel16:33, 6 October 2006 (UTC)
Probably you don't want to be asked about plugins, but I'll try anyway.
Here's the situation. On WP:CFD we deal with lots of moves daily. Currently some of us have pywikipedia wrappers to do the task. For instance at Wikipedia:Categories for discussion/Working we have listings with some fixed structure. For instance today I would copy and paste the webpage into notepad to get
# Category:Fictional aerokineticists to Category:Fictional wind manipulators
# Category:Fictional atmokineticists to Category:Fictional weather manipulators
# Category:Fictional chronokineticists to Category:Fictional time manipulators
# Category:Fictional cryokineticists to Category:Fictional ice manipulators
# Category:Fictional biokineticists to Category:Fictional characters with healing powers
and then a python script uses a regex to extract the category and process them in batch with pywikipedia.
Caveat. The pywikipedia category script isn't very good. It doesn't cope well with categories included as
<pre>
[[category:category name|some parameter]]
</pre>
and it dies on categories added to redirect pages. Thus, things can't really be handled automatically anyway. But AWB seems to be able to, so I'm humbly asking you to help CFD and consider writing a plugin that does basically the same thing: take that structured listing, extract the category names and then either move or remove the category (could be 2 plugins) with AWB. -- Drini 18:49, 19 September 2006 (UTC)
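The extraction step described in this request, plus a replacement that keeps a sort key such as `|some parameter`, can be sketched in Python as below. This is not the pywikipedia script or an AWB plugin; `parse_moves` and `recategorise` are hypothetical names, and the matching is simplified (it does not treat underscores as spaces and does not handle redirect pages).

```python
import re

# One listing line: "# Category:Old name to Category:New name"
MOVE_RE = re.compile(
    r'^#\s*Category:(?P<old>.+?)\s+to\s+Category:(?P<new>.+?)\s*$',
    re.MULTILINE,
)


def parse_moves(listing: str) -> list:
    """Extract (old, new) category name pairs from the structured listing."""
    return [(m.group("old"), m.group("new")) for m in MOVE_RE.finditer(listing)]


def recategorise(text: str, old: str, new: str) -> str:
    """Replace a category link with the new name, preserving any sort key."""
    pattern = re.compile(
        r'\[\[\s*[Cc]ategory\s*:\s*' + re.escape(old) + r'\s*(\|[^\]]*)?\]\]'
    )
    return pattern.sub(
        lambda m: "[[Category:" + new + (m.group(1) or "") + "]]", text
    )


LISTING = """# Category:Fictional aerokineticists to Category:Fictional wind manipulators
# Category:Fictional cryokineticists to Category:Fictional ice manipulators
"""

for old_name, new_name in parse_moves(LISTING):
    print(old_name, "->", new_name)
```

Using a function as the replacement avoids any special treatment of backslashes or group references that might appear in a category name.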
I don't have the time to help at the moment, alas.
I was wondering though - AWB can recategorise out of the box can't it?
I noticed that if I drag and drop an XML file onto the browser control in AWB, AWB tries to start processing. Presumably this is an effect of IE raising an event, or is it actually intentional?
On a related note, the behaviour I was looking for is being able to drop an XML settings file (perhaps onto the options tabs) and have AWB load those settings. Might be cool. Yay or nay? --kingboyk15:37, 20 September 2006 (UTC)
Adding articles to categories
I added a bunch of Wikipedia pages to a new category using AWB, and somebody has complained that they weren't sorted. This has led me to thinking that if AWB is adding non-mainspace pages to categories maybe it ought to add the PAGENAME variable as a sort key automatically? --kingboyk09:49, 21 September 2006 (UTC)
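As a rough illustration of the suggestion above, the Python sketch below builds a category link and adds `{{PAGENAME}}` as the sort key only when the title carries a namespace prefix. The function name and the short list of prefixes are assumptions for this example; they are not AWB behaviour and not a complete list of namespaces.

```python
# Hypothetical, incomplete set of non-mainspace prefixes for this sketch.
NAMESPACE_PREFIXES = {"Wikipedia", "Template", "Category", "User", "Help", "Portal"}


def category_link(category: str, title: str) -> str:
    """Return the category link to append to the page with the given title."""
    prefix, sep, _ = title.partition(":")
    if sep and prefix in NAMESPACE_PREFIXES:
        # Sort under the bare page name rather than under the namespace prefix.
        return "[[Category:" + category + "|{{PAGENAME}}]]"
    return "[[Category:" + category + "]]"


print(category_link("Wikipedia tools", "Wikipedia:AutoWikiBrowser"))
print(category_link("Video games", "Star Trek: Voyager"))
```

Checking against known prefixes rather than any colon keeps mainspace titles that merely contain a colon from being given a sort key.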
This is probably a futile hope, but I was wondering if there's any (reasonably simple) way to adapt the AWB program to use on another (outside) wiki (using mediawiki software). It's rather time consuming to propagate a new wiki, and I thought some sort of at least partially automated editing system, would make the task a bit easier, and programming bots seems like far too complex an endeavour... TheHYPO03:05, 22 September 2006 (UTC)
I think it would be great if there were an option to set a settings file as the default settings on startup. I know you can replace the default.xml with one of your own but it would be handy if you could do it within the program. Harryboyles09:32, 22 September 2006 (UTC)
I'd like a "further attention" button for making a list of articles I need to come back to later - during the couple of runs I've made I found articles with extensive problems that I can't really fix "on the fly". - Stephanie Daugherty (Triona) - Talk - Comment - 18:01, 22 September 2006 (UTC)
There's a "false positive" button, but you can't do a lot with it so far: it just dumps the name into a flat file for later retrieval. It would be nice to develop this feature further. Your input would be invaluable. HTH HAND —Phil | Talk 20:35, 23 September 2006 (UTC)
I'm aware that one of the points say "Don't edit too fast; consider opening a bot account if you are regularly making more than a few edits a minute.". I think I may be infringing on that: I am replacing album infobox ratings, i.e.(4/5) or with (the Rating-5 template). I do a few of these a minute, is this acceptable? -- Reaper X01:37, 23 September 2006 (UTC)
Actually I am. Mass mistakes are something I don't want on my track record. That, and I am not computer-oriented enough to know a thing about how to get/make/do whatever with a bot. -- Reaper X 02:07, 23 September 2006 (UTC)
Gee, I'm going to seek help in opening a bot account. Thanks anyway. -- Reaper X02:46, 23 September 2006 (UTC)
Bot accounts aren't as difficult as they may sound. It just means that you can make multiple edits per minute, and they aren't going to show on recent changes, I believe, but will on people's watchlists. You can run AWB in the same way, just log into the bot account. Reedy Boy 09:43, 23 September 2006 (UTC)
Protected pages
I'm not being allowed to save edits to protected pages on en.wikinews. Yes, I'm an admin; I run cleanup on old articles as they go into the archive. Someone has protected these, but missed things like multiple wiki links. I'm sure AWB has let me save edits to protected articles before; the only difference I'm aware of is that I've added en.wikinews to my MSIE trusted sites so the dynamic content on the front page works. (I'd take a fix for the dynamic content or a fix in whatever makes collapsible sections work on Wikipedia but not Wikinews (the style sheet perhaps?)) --Brianmc 17:09, 23 September 2006 (UTC)
Bot timer
Hi Martin: I have an idea for the bot timer, which I don't think is too difficult. Can you change the bot timer so that it starts counting when the previous edit finishes? Right now it starts when AWB is ready to make the edit. Perhaps you could do something like set a variable bottimerready = 1 when the previous edit is finished and then start the countup timer, and when the value of the timer equals the selected delay, set bottimerready = 0 and stop the timer and the program can only make an edit when bottimerready = 0. I'm sure you'll have a much better way to do this, but it's just an idea. —Mets501 (talk)22:27, 23 September 2006 (UTC)
I think the current method makes more sense, apart from being more simple, it naturally slows down the edit rate when the servers are slow. Martin16:23, 9 October 2006 (UTC)
The problem with it is that if AWB goes through 100 pages and skips 90 of them at the beginning, it will still wait before making the 1st edit even though it hasn't edited in several minutes. —Mets501 (talk) 18:12, 9 October 2006 (UTC)
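The timer proposed in this thread, one that counts from the moment the previous edit finished, can be sketched in Python as follows. It illustrates the idea only and is not AWB's bot timer; the class and method names are hypothetical, and the clock is injectable purely so the example can be checked without real waiting.

```python
import time


class EditThrottle:
    """Enforce a minimum delay measured from the end of the previous edit."""

    def __init__(self, delay: float, clock=time.monotonic):
        self.delay = delay
        self._clock = clock
        self._last_edit_done = None  # no edit has been made yet

    def edit_finished(self) -> None:
        """Record the moment the previous edit completed."""
        self._last_edit_done = self._clock()

    def seconds_to_wait(self) -> float:
        """Time still to wait before the next edit may be saved."""
        if self._last_edit_done is None:
            return 0.0
        elapsed = self._clock() - self._last_edit_done
        return max(0.0, self.delay - elapsed)


# Example with a fake clock: 10 second delay, previous edit finished at t=100.
now = [100.0]
throttle = EditThrottle(10, clock=lambda: now[0])
throttle.edit_finished()
now[0] = 104.0
print(throttle.seconds_to_wait())
```

With this approach, time spent skipping pages counts toward the delay, so a run that skips for several minutes can save its next edit immediately. The trade-off raised in the reply still applies: the edit rate no longer slows down automatically when the servers are slow.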
Better with Every Version
Cheers Martin! - More useful features at our fingertips
This version doesn't seem to hog memory as much either!! It increases, and then drops back down to below 120MB again =D Reedy Boy 10:29, 24 September 2006 (UTC)
First time with regular expressions
Where am I going wrong with this regex substitution? (for year wikilinks in cvg infoboxes)
find: \[198(0-9)\]
substitute: [198$1 in video gaming|198$1]
That doesn't work.
If I change the find to: \[198[0-9]\] then it finds years ('1983' for example), but doesn't catch the digit for substitution (it actually puts '$1').
I finally got it working with: \[198(0|1|2|3|4|5|6|7|8|9)\], which substitutes correctly but that's a bit yukky looking. So what's wrong with (0-9) ?
Thanks for any help. Marasmusine22:10, 25 September 2006 (UTC)
Right. The square brackets [ define a range or series to match. The parentheses ( define a capture group. So, (0-9) searches for a literal "0-9" and captures it. [0-9] matches a digit but doesn't capture. ([0-9]) matches and captures. --kingboyk22:44, 25 September 2006 (UTC)
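The three cases explained above can be checked with a short Python sketch. Note the assumptions: Python's `re` module is used here rather than the .NET regex engine behind AWB, so the replacement refers to the captured digit as `\1` where AWB uses `$1`, and the single square brackets follow the patterns as quoted in this thread.

```python
import re

text = "[1983]"
replacement = r"[198\1 in video gaming|198\1]"

# (0-9) is a capture group containing the literal text "0-9": no match here.
literal_group = re.sub(r"\[198(0-9)\]", replacement, text)

# [0-9] matches one digit but captures nothing, so there is no group 1.
no_capture = re.search(r"\[198[0-9]\]", text)

# ([0-9]) both matches the digit and captures it for the replacement.
captured = re.sub(r"\[198([0-9])\]", replacement, text)

print(literal_group)
print(no_capture.groups())
print(captured)
```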
You might want to make use of the code for digits '\d'. As in:
\[198(\d)\]
or:
\[(\d{4})\]
Links such as '[1983 in video gaming|1983]' are sometimes called 'easter egg links' because it looks just like [1983] and has an unpredictable outcome. Some editors think that they are not a particularly good idea. You can see them discussed from time to time in wp:mosdate and wp:context.
Converting formats like '[August 11] [2000]' to '[August 11] [2000 in video gaming|2000]', as you have done, is not a good idea. This is because it breaks the date preference mechanism. This is an understandable mistake that many people make because of the unfortunate way that dates are handled in Wikipedia. Please can others help Marasmusine with this.
Is there a way for AWB to start with a list of articles in the main namespace and add messages to the articles' talk pages? If not, could you please add it? That would really help with adding project banners to talk page headers. Please advise. - Ganeshk (talk) 05:39, 26 September 2006 (UTC)
In the list context menu there is an option called "Convert to talk" which converts the articles to the appropriate talk page. Martin09:34, 26 September 2006 (UTC)
Martin, I don't see that menu option on my AWB. I see the following only:
Hi, I would like to notify you that there have been some namespace changes on dawiki. I would therefore like you to change /AWB/WikiFunctions/Variables.cs to the following:
Now I only use I.E. for AWB related purposes, on this machine, but I have for some time had a situation where AWB can get stuck on "Loading page to check we are logged in...." After this IE seems unable to access enWP, although other sites seem fine (including ang.wp).
Sometimes clearing the cache/cookies appears to help, sometimes rebooting. I have just cleared everything, installed the latest AWB and re-installed I.E. Nothing seems to work, so I'm reinstalling the dotnet framework. Has anyone else had this problem? RichFarmbrough, 21:36 29 September2006 (GMT).
Oh yes, and sometimes when it clears I get stacked "You are not logged in" dialogue boxen. RichFarmbrough, 21:45 29 September2006 (GMT).
I've not seen this before, but the next version will have an improved mechanism for detecting login status etc. so hopefully it will be fixed anyway. Martin14:03, 30 September 2006 (UTC)
If that includes recognising that a "please login" winforms messagebox is already up and doesn't pop up that would be great (and fix a bug of mine where the plugin nudges AWB to restart and AWB pops up another login box, leaving me with 10+ of em when I get up ;)) --kingboyk10:55, 1 October 2006 (UTC)
Just a little more info, in case anyone else experiences it: The fix is (in I.E) "delete files" and check "delete all offline content". Not a real problem, now I can consistently overcome it. RichFarmbrough, 20:08 4 October2006 (GMT).
Customizing
Could you make a menu where we can choose the "general fixes" we want? On the other languages there are fixes we don't want (such as deleting more than one "return at the line", the substitution of "Vienne", etc.). Thank you very very much for this future feature! 86.213.165.7113:39, 30 September 2006 (UTC)
also not everybody wants disambig at the bottom, so this would be good idea ;] gregul 17:53, 30 September 2006 (UTC)
1. Again about "ignore external/interwiki links, ... images, ...": is ignoring the whole image section done on purpose? (At the moment it ignores the complete [[image:...]] section.) I think the ignoring could end at the "|" character, because after it comes the description of the image, which can contain matched strings.
2. Could you expand the above feature to ignore text inside double brackets, between [[ and |?
3. How about also ignoring galleries? As in 1 above, only the names of the images included in the gallery, not their descriptions; I think that could be included too.
4. Would it be possible (other than by modifying the sources myself) to ignore images on other-language wikis? "[[Image:" has a different name there ("[[Grafika:" on plwiki), so in fact the option isn't working completely.
5. How can I avoid the situation where I have found some articles in the database but most of them are redundant, because I always set "ignore external/interwiki links, ...", so they won't be changed anyway? gregul 21:25, 30 September 2006 (UTC)
These 5 things are important, so please tell me what you think. gregul
The general fixes are designed for the en wikipedia, they would be far too complex to customise, as many rely on the fact that other functions have already happened. However, as most wikis use the same format, the general fixes should largely work ok with most projects, if they don't, then don't use them. Having said that, I can change things like moving disambig tags to only work on en wiki. Martin10:41, 1 October 2006 (UTC)
Date wikifying bug
A recent edit of Arsenal F.C. using AWB [2] exposed what is probably a bug. AWB looked to wikify all dates in the article; two of the article titles used in the {{cite web}} templates had dates within them, so the code changed from:
{{cite web
| title=Arsenal Holdings plc Results for the year ended 31 May 2006
to:
{{cite web
| title=Arsenal Holdings plc Results for the year ended [[31 May]] 2006
This broke the links when they were displayed. I have since reverted these two specific changes, though I kept the rest. [3] Could this bug please be fixed? Thanks. Qwghlm18:50, 1 October 2006 (UTC)
I wouldn't call these links broken, they link fine and display fine. RichFarmbrough, 20:12 4 October2006 (GMT).
placing categories in the right place
can you include into general cleanup that in :pl.wiki (and others) a category name is [(k|K)ategoria and not only [Kategoria ? and perform change from small (if present) to big letter
[4] – of course I can use a regex, but that would force me to do "general fix" after "find and replace" to have this placed in the right order
Yes please, and the first letter of the name of the article in the category too! And for the French Wikipedia, if any letter of the article title is an accented letter, then write the title in the category without the accented letters (e.g. [[Catégorie:Travail du bois]] in Élagage des conifères => [[Catégorie:Travail du bois|Elagage des coniferes]]). Thank you very very much! 81.51.91.23612:31, 2 October 2006 (UTC)
It seems (I'm not sure because I've got a lot of regexes included) that the new AWB changes [[kategoria: to [[Kategoria:, but the first letter still remains small. This could also be applied to [[grafika: etc. (images, for all languages), and maybe to any other constructions that I don't remember at this time. gregul
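For anyone wanting to try both requests outside AWB first, here is a rough sketch in Python. It is not AWB's code: the function names `fix_namespace_case` and `sort_key` and the default list of namespace names are assumptions made up for this illustration. The first function capitalises the namespace and the first letter of the page name; the second strips accents to build a sort key like the French example above.

```python
import re
import unicodedata

def fix_namespace_case(wikitext, names=("kategoria", "grafika")):
    """Capitalise the namespace and the first letter of the page name."""
    pattern = re.compile(r"\[\[(" + "|".join(names) + r"):(\w)", re.IGNORECASE)
    return pattern.sub(
        lambda m: "[[" + m.group(1).capitalize() + ":" + m.group(2).upper(),
        wikitext,
    )

def sort_key(title):
    """Strip accents so the title sorts under its unaccented letters."""
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fix_namespace_case("[[kategoria:drzewa]]"))
print(sort_key("Élagage des conifères"))
```

Letters that have no decomposed form (such as the Polish ł) pass through `sort_key` unchanged, so a real implementation would probably need a per-language table as well.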
Help Creating Regex Expression for Bot
I'd like some help creating a regular expression to make some changes to an edit I recently made with my bot. Specifically, I'd like to replace:
{{Merge-date|MONTH YEAR|RANDOM TEXT}} with {{Merge|RANDOM TEXT|date=MONTH YEAR}}
{{Mergeto-date|MONTH YEAR|RANDOM TEXT}} with {{Mergeto|RANDOM TEXT|date=MONTH YEAR}}
{{Mergefrom-date|MONTH YEAR|RANDOM TEXT}} with {{Mergefrom|RANDOM TEXT|date=MONTH YEAR}}
Don't forget if you're doing this in a case-sensitive fashion, the first letter can be upper or lower case and still be pointing to the same template (due to Wikipedia's Mediawiki configuration). [Mm] would cover that. --kingboyk11:32, 3 October 2006 (UTC)
Thanks guys. I'm not running it case sensitive. You can check out the effects on Alphachimpbot. One correction to the above syntax (which is awesome) though: {{Merge\|$2|date=$1}} should be {{Merge|$2|date=$1}}, etc. (just remove the "\") alphaChimp(talk)17:41, 3 October 2006 (UTC)
Single replacement version: Find: {{(Merge[a-z]*)-date\|(.*?|)\|(.*?|)}}, and replace with {{$1|$3|date=$2}} (assuming there are no other merge*-date templates, which would also be caught in this pattern). -- JHunterJ17:54, 3 October 2006 (UTC)
By chance I have just (yesterday) implemented the checkpage to work on other projects, it will be in the next version. Martin14:06, 3 October 2006 (UTC)
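The single-pattern approach can be tried out on sample text before running a bot. The sketch below is in Python rather than AWB's .NET regex, so the replacement uses `\1` instead of `$1`; it also simplifies the `(.*?|)` groups to `(.*?)`, escapes the braces, and matches case-insensitively as in the run described above. The name `convert` and the sample template calls are made up for illustration.

```python
import re

# One pattern for Merge-date, Mergeto-date and Mergefrom-date.
MERGE_DATE = re.compile(
    r"\{\{(Merge[a-z]*)-date\|(.*?)\|(.*?)\}\}",
    re.IGNORECASE,
)

def convert(wikitext):
    """Move the date from the template name into a date= parameter."""
    return MERGE_DATE.sub(r"{{\1|\3|date=\2}}", wikitext)

print(convert("{{Mergeto-date|October 2006|Some article}}"))
```

As noted above, any other template whose name matches Merge*-date would be rewritten too, and text containing a piped link in the second parameter would need a stricter pattern.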
I made this change yesterday with AWB. As you can see in the diff, the div tag was moved when it shouldn't have been. Is this an easy fix? If it makes a difference, I was not using "apply general fixes". --Kbdank7110:26, 4 October 2006 (UTC)
That's a very rare bug, will only occur with maths tags inside image links using certain settings, but I have fixed it now anyway in the version I just released. thanks Martin13:17, 4 October 2006 (UTC)
OK, thanks. I really hope the error wasn't proliferated any further than that one article. (If you check the article, you'll see that the first pass by Alphachimpbot had no difficulties) Alphachimp14:29, 4 October 2006 (UTC)
Problem with CheckPage
In Main.cs in code
else
{
MessageBox.Show(UserName + " is not enabled to use this.", "Problem", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
System.Diagnostics.Process.Start(Variables.URLShort + "/wiki/Wikipedia:AutoWikiBrowser/CheckPage");
return false;
}
"/wiki/Wikipedia:" should be replaced with "/wiki/Project:", or it will send non-enwiki users to the non-existant page on en:. MaxSem17:44, 5 October 2006 (UTC)
I don't know if it's due to changes you've made in the latest version, whether it's because I cleared my IE cache manually again, or because I turned off image display, but I've just had an AWB run tag 4,400 pages without crashing. That's the best I've managed for a long time. I'll give you the credit Martin, so - thanks very much! :) --kingboyk13:09, 6 October 2006 (UTC)
I added a timeout function when loading/saving, that may have been it (as well as quite a bit of tweaking), I think the servers have been quick recently as well. Thanks! Martin13:33, 6 October 2006 (UTC)
Thank you for the speedy reply. There's a problem though; that also finds [[Interstate 495 (Massachusetts)|I-495]] and replaces it with [[I-495 (Massachusetts)|I-495]]. I believe this is because the variable can be any length, so it finds this false positive as long as there is a .svg somewhere after it. --NE201:32, 10 October 2006 (UTC)
I actually wanted Interstate( |_)([0-9A-Z]|)([0-9A-Z]|)([0-9A-Z]|)([0-9A-Z]|).svg, since there are routes like I-H201 and I-35E. Thank you. --NE203:14, 10 October 2006 (UTC)
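The false positive can be reproduced with a short test. The sketch below is in Python (`\1` where AWB uses `$1`). The `loose` pattern is only a guess at the kind of unbounded expression that caused the problem, since the earlier suggestion is not quoted here; `bounded` limits the route number to one to four digits or capital letters, a slight variant of the pattern above, which also allows an empty number.

```python
import re

text = "[[Interstate 495 (Massachusetts)|I-495]] [[Image:Interstate 495.svg]]"

# Assumed earlier form: the group can be any length, so the match runs
# from the first "Interstate" all the way to the ".svg" in the image link.
loose = re.compile(r"Interstate[ _](.*)\.svg")

# Bounded form: at most four characters may sit between the name and ".svg".
bounded = re.compile(r"Interstate[ _]([0-9A-Z]{1,4})\.svg")

print(loose.sub(r"I-\1.svg", text))
print(bounded.sub(r"I-\1.svg", text))
```

With the bounded pattern the article link is left alone and only the image name changes; names such as "Interstate H201.svg" and "Interstate_35E.svg" are still covered.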
"Messages can only be appended to talk pages"
I'm trying to nominate a large number of images for deletion, but I can't add the "message", since it only lets me do it on talk pages. Is there any way to bypass this? --NE201:51, 10 October 2006 (UTC)
Hi Bluemoose: I figure the IRCMonitor included with AWB should have an option "Only if comment matches regex:" similar to the existing "Only if title matches regex:". Perhaps you would consider that in a future release? Thanks. –Outriggr§04:27, 12 October 2006 (UTC)
it's definitely in "apply general fixes", because after unsetting it, it doesn't try to do a change like this
[6]gregul
I've done a load more tweaking to make things work better in other languages, so this will be fixed in next release. Martin18:09, 12 October 2006 (UTC)
With the newest version, 3.0.4.1, I'm getting frequent "hangs" after I click on the "Save" button. Sometimes, yes, it's just slow, and if I wait a minute it'll automatically move on to the next article in the list, but sometimes it's gotten stuck for several minutes, until I click on "Ignore", at which point it wakes up and moves on. When I check the related article, the save is getting processed, such that if I instead click "Stop" and then restart the process, it skips the article as having already been handled. The problem seems to be occurring about 50% of the time today... Is it just a problem with sluggish servers, or is there perhaps some other bug going on? --Elonka19:10, 12 October 2006 (UTC)
(update) Another odd behavior worth mentioning, is that it will display the correctly "saved" article in the upper panel, before it freezes. So it's definitely getting some feedback from the server. --Elonka19:16, 12 October 2006 (UTC)
Quite possibly due to server problems, but I have fixed a few things anyway, so hopefully the problem will disappear one way or other. Martin15:10, 14 October 2006 (UTC)
Help with regular expressions
Hello all! I'm just wondering if it's possible to do the following with AutoWikiBrowser using expressions...
For the N/S set, the first number can be 1-3 digits, the second 1-2 digits, the third 1-2 and 0-4 (##.###). The same pattern repeats for the E/W set.
I've just been searching for °, and it's been working, but I have some 1,000 articles to do and it's quite tedious to have to manually reformat each entry. Thadius85602:57, 13 October 2006 (UTC)
If there are other optional spaces, use more ( |) and don't forget to increment the variable numbers. --NE203:02, 13 October 2006 (UTC)
Works great. Thanks! Just for the record, I was wrong before... the N/S coordinate only goes to 90 deg at the poles (so there's no need for a 28th digit to, as N/S can never be 3 digits). It seems you already knew this, though. :) Thadius85620:54, 13 October 2006 (UTC)
Both seem to be working. However, I'm running into a problem with (some) articles where somebody input it across two lines.
coordinates = 33° 42' 50.8000" N
96° 40' 25.2000" W |
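A note on the two-line case: in .NET regular expressions the Multiline option only changes where ^ and $ match, and Singleline only lets . match a newline, so neither helps unless the pattern itself allows whitespace between the two halves. Using \s between them does, because \s matches line breaks as well as spaces. The sketch below shows this in Python, where the rule is the same; it only extracts the parts, since the target format is not shown here, and the names `ONE` and `PAIR` are made up for the example.

```python
import re

# One coordinate: degrees (1-3 digits), minutes (1-2 digits),
# seconds (1-2 digits plus up to 4 decimals), then a compass letter.
ONE = r"(\d{1,3})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d{1,4})?)\"\s*([NSEW])"

# Two coordinates separated by any whitespace, including a line break.
PAIR = re.compile(ONE + r"\s+" + ONE)

text = 'coordinates = 33° 42\' 50.8000" N\n96° 40\' 25.2000" W |'

match = PAIR.search(text)
print(match.groups())
```

In AWB the captured groups would then be referred to as $1 to $8 in the replacement.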
I tried setting both versions to multiline="true" and have had no luck. So, I attempted to modify NE2's version (placing the original as singleline="true" and my version as multiline="true"), but apparently I'm not advanced enough to figure it out. Here's what I tried:
However be aware it's not tested for the southern or eastern hemispheres. RichFarmbrough, 11:35 14 October2006 (GMT).
It doesn't appear to handle decimals in the seconds, but I do like \d better than [0-9] though, and it handles spelled-out compass points. Again untested:
I have added a note to the caution "Don't do anything controversial with it" to notify do-gooders that the Wikipedia mathematics community has repeatedly confronted the issue of automatic conversion of entities to Unicode, and firmly objects. Individuals writing articles are free to use UTF-8 characters if they prefer, but existing entities (like “&theta;”) should not be replaced. (Because many characters do not have HTML entity names, I keep a private page with a large table to use in my own edits, and inform newcomers of Code2000 and the upcoming STIX fonts.) We are not anti-Unicode, nor anti-bot; this consensus involves issues special to mathematics, such as coexistence with TeX markup and MathML. Your cooperation will be appreciated. --KSmrqT05:01, 13 October 2006 (UTC)
I'll change the code so maths articles are ignored automatically. Though I have to say, there is hardly a firm objection in the given discussions. In fact, the logic used against unicode is highly flawed, as unicode is already used virtually everywhere it could be. Martin09:23, 13 October 2006 (UTC)
Many thanks. One underlying problem is that mathematics markup for Wikipedia is an ugly mishmash of TeX, wiki, HTML, and UTF-8. There are bugs and limitations in the first two, and browser problems with the latter. The mathematics community outside of Wikipedia has standardized on LaTeX, and a project is in the works to make a better converter than texvc; it's called blahtex, and can produce lovely MathML. It already works well by itself, but use of MathML in Wikipedia output requires the pages generated by MediaWiki to be valid XML, and the developers have been slow to make the necessary fixes. The goal is to be able to use LaTeX markup exclusively, for both inline and display equations, and have it always look readable, attractive, and consistent. Since the TeX is going to use names like theta, it is preferable to use named entities for the same characters where possible.
For example, the fundamental trigonometric identity
But it is highly unsatisfactory inline, , where the font sizes, font faces, and baselines clash with the surrounding text. In fact, TeX theta alone looks bad inline, coming out either or depending on whether it is converted to a PNG image or not. Instead we can write sin² θ + cos² θ = 1 in an inline context, using wiki markup.
This illustrates the kind of crap Wikipedia mathematicians have to live with today. And whenever a well-meaning bot gleefully changes all our &theta; markup to θ, it not only makes our life more difficult in the present, but complicates our conversion to all-TeX markup (with beautiful consistent MathML typesetting) in the future.
Bear in mind that I am only relaying one of the arguments from our extended discussions, and not trying to do justice to all the issues raised. But I hope this helps others better appreciate that the request is not Luddite.
As an example of a wiki markup bug, nested superscripts should be able to use the obvious markup.
''A''<sup>''B''<sup>''C''</sup></sup>
No such luck, as the bug converts this to ABC. Instead, we must use a dodge like
One likely way forward, is to use AWB when Blahtex is ready to rock an' roll to convert all the current maths display methods. This would mean that there is little harm in unicodifying at present. What do you think? RichFarmbrough, 11:28 14 October2006 (GMT).
No, changing &theta; to θ makes life more difficult for mathematical editors even in the present. People who do not write a lot of mathematics think WYSIWYG is a boon. Not necessarily. Do a time-and-motion study, in the style of GOMS, for typing the name versus selecting a special character from a menu. Then add in the penalty of inconsistency, using \theta in displayed TeX versus a UTF-8 character inline. If we had a two-view editor, in the mold of Lilac, then we might get the best of both worlds. I don't see that happening here any time soon. --KSmrqT22:58, 14 October 2006 (UTC)
white space removal
Hi! I have a suggestion, to the 'skip' section, could a checkbox saying skip if only whitespace is removed be created, to help bots decrease the server load? ST47Talk10:23, 13 October 2006 (UTC)
Hi, the "Skip when no changes" does do this to a certain extent, but to be honest, a bot should be automatically skipping articles when the change it is making is not done. The only time this may not be possible is when the change is part of the "General fixes", but that is why I made the "More" skip options. Martin10:37, 13 October 2006 (UTC)
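For bot authors who want such a check in their own tooling, the idea is simple to sketch. This is not how AWB's skip options are implemented; the function name `only_whitespace_changed` is made up, and the comparison just strips all whitespace from both versions before comparing them.

```python
def only_whitespace_changed(before, after):
    """Return True if the two texts differ, but only in whitespace."""
    if before == after:
        return False  # nothing changed at all
    # Removing every whitespace character leaves only the visible content.
    return "".join(before.split()) == "".join(after.split())

print(only_whitespace_changed("Some text.\n\n\n[[Category:X]]",
                              "Some text.\n\n[[Category:X]]"))
```

A bot could skip the save whenever this returns True, which avoids edits that only remove blank lines or trailing spaces.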
stub on bottom
It isn't good if "section stub" is moved to the end, because a section stub relates to a section, not to the whole article. Could this option be unset in some way? In other languages moving stubs to the end isn't welcome. gregul
many "general fixes" are useful everywhere, but yes, the name is {sekcja stub}, I also thought about not moving {stub} or setting this after an article (before categories), many admin's bots move it this way gregul
Error with 3.0.41
"The system cannot find the file specified. (Exception from HRESULT: 0x80070002)" Problem occurring at beginning of run, where I am having problems with AWB confirming that login and authorisation are there. (Similar to my problem above, but harder to shift.)
Now: Unhandled exception: perhaps this data will help.
************** Exception Text **************
System.NullReferenceException: Object reference not set to an instance of an object.
at WikiFunctions.Browser.WebControl.set_Status(String value)
at WikiFunctions.Browser.WebControl.IncrememntTime(Object sender, EventArgs e)
at System.Windows.Forms.Timer.OnTick(EventArgs e)
at System.Windows.Forms.Timer.TimerNativeWindow.WndProc(Message& m)
at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)
Loaded assemblies in a comment here.
I'd look at the code, but I can't seem to sync tortoise SVN anymore.
I've fixed the NullReferenceException. Googling for "HRESULT: 0x80070002" it seems this error is normally caused by a problem with the .NET framework installation or (rather strangely) if the computer has some kind of adware running on it. See [7] and [8]. thanks Martin15:05, 14 October 2006 (UTC)
I've been using AWB on a number of projects, especially Commons. AWB knows where to look for its files, but it would be helpful if the shortcut in the edit summary was not WP:AWB but COM:AWB (the short version there). This would also be true for Wikinews (WN), Wikibooks (WB) and so on. Thanks.--Nilfanion (talk) 02:22, 15 October 2006 (UTC)
Fixed for Commons and Meta, but this system requires some rewriting. Currently, summary tag and project namespace depend on language only and makes no difference between WP and WS for example. I'll write a code to load namespaces directly from the server. MaxSem07:42, 15 October 2006 (UTC)
Wow, rewrote a lot of code, can someone more proficient in English check new messages before I commit them:
An error occured while loading project information from the server. Please make sure that your internet connection works and such combination of project/language exist.
Please make sure that your internet connection works and such combination of project/language exist.
It might be corrected in different ways, depending on what it should mean.
Please make sure that your internet connection works and that such a combination of project and language exists.
Please make sure that your internet connection works and that such combinations of project and language exist.
Or you might mean something else. I'm not sure how to interpret the "/", whether as "and", "or", or something else. --KSmrqT13:06, 15 October 2006 (UTC)
The problem is that such hardcoding needs to be much more complicated than it is currently. Different projects of the same language may have a different set of namespaces, and with growing numbers of new projects, especially Wikiversities, keeping this list up to date would become harder and harder. Take a look at my implementation here. MaxSem11:13, 15 October 2006 (UTC)
There are currently about 700 Wikimedia projects, without looking at wikia etc. etc.. RichFarmbrough, 12:02 15 October2006 (GMT).
What would be good is if the namespaces were only loaded if they were not hardcoded in already, for example we could hardcode the namespaces for wikipedia projects, as these are the most common, but make it load the namespaces for other projects at runtime. Martin12:55, 15 October 2006 (UTC)
Existing code typically supports only wikipedias in other languages. For example, no: Namespaces[4] = "Wikipedia:";, and therefore AWB will not function fully on no.wikibooks, no.wikisource and no.wikiquote. Adding nested switch statements to fix this will complicate the code even more. MaxSem16:49, 15 October 2006 (UTC)
I just committed what I meant, this way the namespaces are only loaded for sites that are not wikipedia, but otherwise your method is used to get them. Martin19:23, 15 October 2006 (UTC)
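The compromise described above (hardcoded namespaces for Wikipedias, a lookup from the server for everything else) can be outlined as follows. This is a rough Python sketch, not the committed C# code: the table `HARDCODED`, the function `get_namespaces` and the `fetch_from_server` callback are all hypothetical names, and the only real data is the `Namespaces[4] = "Wikipedia:"` value quoted earlier for no.

```python
# Hypothetical hardcoded table: project namespace (index 4) per Wikipedia language.
HARDCODED = {
    "en": {4: "Wikipedia:"},
    "no": {4: "Wikipedia:"},
}

def get_namespaces(project, lang, fetch_from_server):
    """Use the hardcoded table for Wikipedias, ask the server otherwise."""
    if project == "wikipedia" and lang in HARDCODED:
        return HARDCODED[lang]
    # Sister projects (wikibooks, wikisource, ...) are looked up at runtime.
    return fetch_from_server(project, lang)

# Stand-in for the real server request, for demonstration only.
def fake_fetch(project, lang):
    return {4: project.capitalize() + ":"}

print(get_namespaces("wikipedia", "no", fake_fetch))
print(get_namespaces("wikibooks", "no", fake_fetch))
```

Passing the fetch function in as a parameter keeps the sketch free of network code and makes the fallback easy to test.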