This is an archive of past discussions on Wikipedia:STiki. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
The AGF dialogue is quite a new feature. I think what is happening is that one presses "revert" and this runs all the sanity checks (i.e., you beat everyone else to it). This hands off control to the custom message dialogue, you spend a few minutes writing a message, and then press "done". Then, both edits (the revert and the message) are committed to Wikipedia. I need to make the change so the revert commits immediately, and the message posts whenever it is authored. As it is now, others can get in on the action while you are writing the AGF message. Added to bug tracking as T#025, should be a simple fix. Thanks, West.andrew.g (talk) 21:06, 6 March 2013 (UTC)
I would like to request permission to use this tool. I don't have the ability to do rollbacks, and my Wikipedia edit count is less than 1000, but I do have experience with reverting vandalism on SmashWiki, where I was given adminship at one point. I believe that I have a good understanding of what qualifies as vandalism, and would be a good operator of this tool. Pikamander2 (talk) 20:55, 11 March 2013 (UTC)
Obviously, you can hold off to see what Andrew has to say, but have you attempted to apply for rollback rights here on the English Wikipedia? It took me a while, but for good reason - they were screening me to make sure I was "safe." It was pretty frustrating because STiki makes it so much easier to edit effectively and the main complaint was that I didn't have enough edits (in quantity) in the article space, but you can get there using Twinkle with relative ease. I recommend the normal approach. As an addition, looking at [1], you may already be a candidate for Rollback rights with a good application. --Jackson Peebles (talk) 17:08, 12 March 2013 (UTC)
Done on assumption of good faith and experience on other sites. Make sure you use this responsibly, and do not be surprised if other STiki users (or myself) audit some of your initial contributions to ensure they comply with policy. Thanks, West.andrew.g (talk) 06:20, 13 March 2013 (UTC)
Reviewing reverts made by another STiki user
I've just been asked to classify my own revert which I made about 15 minutes previously. I've also recently had a discussion with I dream of horses where she has been asked to classify edits from other STiki users including ones I made.
Am I right in thinking this is a case of the queue running low? I seem to recall a discussion about this before but I can't remember what the upshot of it was.
The queue is never "running low" in an absolute sense. The queue always has about 4 million entries (one for each article). When we say "running low", we are talking about how ripe the vandalism probabilities are. If no one has used STiki for a while, the top of the queue will be filled with RIDs that the methods computed to have an ~80%+ chance of being vandalism. With STiki's recent surges in popularity, people have been going through ~4000 RIDs per day, draining it to the point where the edits presented have a ~25% or less chance of being vandalism (per the models; in practice slightly less due to competing tools and ecosystem considerations).
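The draining effect described above can be sketched with a toy priority queue. This is only an illustration, assuming the queue is ordered purely by vandalism probability; the names `RevisionQueue`, `enqueue`, `next_rid`, and `top_probability` are hypothetical and are not taken from the STiki source.

```python
import heapq


class RevisionQueue:
    """Toy priority queue: the highest vandalism probability is served first."""

    def __init__(self):
        # heapq is a min-heap, so store (-probability, rid) to pop the maximum.
        self._heap = []

    def enqueue(self, rid, probability):
        heapq.heappush(self._heap, (-probability, rid))

    def next_rid(self):
        """Remove and return (rid, probability) for the most suspicious edit."""
        neg_probability, rid = heapq.heappop(self._heap)
        return rid, -neg_probability

    def top_probability(self):
        """Probability of the edit that would be served next, or None if empty."""
        return -self._heap[0][0] if self._heap else None
```

Each classification removes the current top entry, so heavy usage lowers `top_probability()` even though the queue itself never empties.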
See if you can find the old thread where we decided not to whitelist STiki edits in the CBNG queue (I know the STiki queue does white-list). I thought we were doing this for CBNG as well (a brief code review suggests otherwise), and I want to make sure we did not have any compelling arguments to the contrary. Otherwise, this is something that takes virtually no time at all to implement. Thanks, West.andrew.g (talk) 05:30, 14 March 2013 (UTC)
Guys, you do know that the archive list in the top-right corner has a search box, don't you?
The most relevant discussion seems to be Wikipedia talk:STiki/Archive 6#White-list. There was no compelling argument against. However, Andrew stated that his favoured approach was to improve the underlying machine-learning algorithm, rather than putting in a non-learning filter on the end. I have a certain degree of sympathy with this approach. However:
Are you about to release an improved machine-learning algorithm Andrew? If not, a whitelist could be a good quick fix. It could even be taken out later if it proves redundant when a new algorithm comes along.
The competing tools and ecosystem considerations that you mention above affect the accuracy of calculated probabilities, as you state. But they may also mean that the machine-learning algorithms become less effective in relation to a queue-exhaustion situation. So even if you had a super algorithm based on super features, it wouldn't take the competing tools and ecosystem considerations into account and you might always benefit from some kind of whitelist.
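A non-learning whitelist of the kind discussed above could be as small as the following sketch. It assumes each queue candidate is a dictionary with an `editor` field and that the whitelist is a plain set of user names; both are assumptions for illustration, not a description of how the STiki or CBNG queues are actually stored.

```python
def drop_whitelisted(candidates, trusted_editors):
    """Return only the candidate edits whose editor is not on the whitelist.

    candidates: list of dicts with (hypothetical) keys "rid" and "editor".
    trusted_editors: set of user names whose edits should not be queued.
    """
    return [edit for edit in candidates if edit["editor"] not in trusted_editors]
```

Because the filter sits after scoring, it could be removed later without touching the underlying model, which matches the "quick fix" framing above.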
I was using an old version of STiki until tonight, and always noticed that the last Revqueue option (meta (combination)) was grayed out. Now I'm using the most recent version. Now the only two options I can select are CBNG and STiki. Is there a reason for this? --I dream of horses (T) @ 05:25, 16 March 2013 (UTC)
That will come. There are very naive ways to do this (i.e., summing the scores) and very elegant ways to do it (multi-dimensional machine-learning). I have the data to support the latter case, but it will take a little effort to get right. I'd prefer to do it correctly. There are also some implementation-level issues to consider (i.e., what happens when one queue is down) -- not a major issue to think about -- but a bit of code to write in practice. Thanks, West.andrew.g (talk) 21:34, 16 March 2013 (UTC)
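For reference, the naive end of that spectrum might look like the sketch below. It is an assumption-laden illustration only: it averages (rather than sums) whichever scores are available so that a queue being down does not shrink the result, and the function name and the use of `None` for an unavailable queue are hypothetical.

```python
def naive_meta_score(scores):
    """Combine per-queue scores for one revision by simple averaging.

    scores: dict mapping a queue name to a probability in [0, 1],
            or to None when that queue is down or has no score.
    Returns the mean of the available scores, or None if there are none.
    """
    available = [s for s in scores.values() if s is not None]
    if not available:
        return None
    return sum(available) / len(available)
```

This ignores any interaction between the two scoring systems, which is exactly what the machine-learning approach would be expected to capture.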
Locked into reverting
Hey Andrew, remember me? (Cluenet) Anyway, when you click "Good Faith Revert" it pops up asking what message, but locks you into reverting. You should be able to close the window and backtrack from the edit. 930913(Congratulate) 09:05, 14 March 2013 (UTC)
Hi A930913,
How useful would you find such a feature? Andrew is currently considering making the revert happen immediately, i.e. before the user message is chosen. See T#025, in #STICKY: Known Bugs and Feature Requests, above.
I suppose one option would be to keep them simultaneous, as they are at present. You could then have a "cancel revert" button on the message selection screen. You would deal with T#025 in a different way... Like make it do sanity checks on both pages and ensure it is OK to proceed before it does both reverts at the same time.
Hmmm. This is a consideration I didn't make when I spoke previously about T#025. At present, you are not locked into reverting. Pressing the "cancel" button in the dialogue will cancel both the revert and the message (is there a "cancel" button? if not, closing the dialogue achieves the effect). How does everyone think this should proceed? I'd prefer not to build an "undo" button where one can revert themselves; this is inelegant and could run into issues (e.g., edit conflicts). Thanks, West.andrew.g (talk) 16:50, 14 March 2013 (UTC)
Negative. There is no cancel button, nor can one close the dialogue. (Using release 2013/01/26)
At the moment, if I realise there is a better warning on huggle, I'll occasionally copy the article over and revert there. If I've clicked "Good Faith Revert", I get locked in and have to kill STiki. 930913(Congratulate) 20:29, 16 March 2013 (UTC)
I apologize. I was thinking of the "No MSG" = "No message" button in my head. This still doesn't answer whether it is more elegant to commit the revert/message separately, or do them together after the fact and risk another editor conflicting the original revert. West.andrew.g (talk) 21:30, 16 March 2013 (UTC)
If the transaction is performed correctly (revert and warn together), there should be no mess.
Use cases:
User clicks GFR, then clicks cancel - nothing should happen and the user gets the same page back.
User clicks GFR, then clicks submit - Revert attempted, confirmation received and warning applied.
User clicks GFR, another editor reverts, then user clicks submit - Revert attempted, edit conflict detected, bailout performed.
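The three use cases above can be sketched as a single function. This is a minimal illustration and not STiki's implementation: `FakeWiki` is a hypothetical in-memory stand-in for the wiki, the conflict check is simplified to comparing revision IDs, and passing `None` as the message is an assumed way of representing "cancel".

```python
class EditConflict(Exception):
    """Raised when the page changed after the diff was shown to the user."""


class FakeWiki:
    """In-memory stand-in for the wiki (hypothetical, for illustration only)."""

    def __init__(self, latest_rid):
        self.latest_rid = latest_rid
        self.messages = []

    def revert(self, expected_rid):
        # Bail out if someone else edited (or reverted) in the meantime.
        if self.latest_rid != expected_rid:
            raise EditConflict(expected_rid)
        self.latest_rid += 1  # the revert itself becomes the newest revision

    def warn(self, user, text):
        self.messages.append((user, text))


def good_faith_revert(wiki, rid, user, message):
    """Revert and warn together; message is None if the user cancelled."""
    if message is None:
        return "cancelled"   # use case 1: nothing happens
    try:
        wiki.revert(rid)
    except EditConflict:
        return "conflict"    # use case 3: no revert, so no message either
    wiki.warn(user, message)
    return "reverted"        # use case 2: revert confirmed, then warning
```

The ordering matters: the warning is only posted once the revert has gone through, so a user who is beaten to the revert never sends a message claiming credit for it.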
I just pressed good-faith revert on this diff. Obviously it didn't revert because I had already edited the article outside of STiki. However, it did offer to send a message. In a different situation, someone could get beaten to a revert, miss the "conflict or error" message, and end up sending someone a message saying they reverted them, when it was someone else. It makes sense for STiki to give up on the message if the revert can't be performed. Yaris678 (talk) 22:19, 17 March 2013 (UTC)
Little Error Report
When choosing an AGF template, for the template called "unencyclopedic details", I noticed a small error, in which the words "may" and "collaborate" were connected together. Cheers, Kevin12xd (contribs) 01:51, 8 February 2013 (UTC)
Thanks, the error will be corrected in the next distribution. As has been mentioned here and elsewhere, suggestions for existing and/or novel templates will certainly be entertained. Thanks, West.andrew.g (talk) 04:46, 8 February 2013 (UTC)
I also noticed some typos and there are also a few wording changes I could suggest. But it would be a lot easier to make these suggestions if the text was already on a page in this wiki. That would also, obviously, make it easier to do a diff, make further suggestions etc.
As I believe I also said there, I support this notion and will integrate any changes that the community makes to the text(s) (or any new templates they come up with). Be bold, if you wish, and copy-paste what we already have inside STiki onto that wiki page (I'm sure you can come up with a better way to format it than I can). Thanks, West.andrew.g (talk) 13:50, 14 February 2013 (UTC)
OK. It may be a while until I am on a machine that lets me use STiki but I'll set something up next time I have an opportunity (unless someone else beats me to it). Yaris678 (talk) 15:01, 14 February 2013 (UTC)
Right. Just had an opportunity to edit using STiki and... text of the AGF messages won't go into the clipboard. It looks like setting up this page is a job for someone familiar with the STiki source code. *Looks at Andrew* Yaris678 (talk) 22:04, 17 March 2013 (UTC)
Thanks. I have just created WP:STiki/Good-faith-revert messages based on your email. I think that is the best way to present the info but if someone can think of a better way, be my guest.
I am going to be off wiki for a few days but don't let that stop others from having a go and tweaking the template messages.
We'll see if anyone shows an interest. Regardless, we should try to work this link into the "Wikipedia" namespace documentation. Thanks, West.andrew.g (talk) 04:02, 19 March 2013 (UTC)
Done -- Seems reasonable to me. I'm not really a Twinkle user, but as this seems to have no effect on those who don't use the tool -- I see no harm. It's a trivial addition and will be included in the next release (no need for the formal numbering and all that, the change has already been made in source). Thanks, West.andrew.g (talk) 16:28, 20 March 2013 (UTC)
Do you know about Wikipedia:STiki/Good-faith-revert messages? If you would like the good-faith revert messages to be different or would like to add new messages, please edit that page.
Dear Andrew,
It might be worth telling us if the number of characters on a line and the two spaces on the end of each line are significant. I assume not, but I tried to preserve these aspects of the info that you emailed me.
You can nuke those and treat them as normal text (in terms of line lengths). It still might be nice to keep the "copy we want to use" in a text box like that though, and try to facilitate any discussion beneath it. I can just imagine a case where people start discussing modifications in threading and it would become very difficult to figure out where consensus might lie. Thanks, West.andrew.g (talk) 07:18, 29 March 2013 (UTC)
OK. I just went to remove the line breaks and found that Wikipedia doesn't do line breaks for code! I have added a note to the page.
Hey Andrew, good news - my proposal was accepted by WMF! I'd like to be able give them some measurable results. I'll be using statistics from other tools, too, but if you have some information on STiki users, such as:
The number of TOTAL STiki users, regardless of the quantity of edits
The number of STiki users with at least fifty edits
The amount of total STiki edits made
And any other information that you may find pertinent, this could be extremely helpful. I've gotten what I can off of the leaderboard, but I thought you might have an easy way to find out this information. If not, just tell me so and I'll find the information out, myself!
Great news! Yes, I can easily query for these statistics. STiki has had 653 unique user names use the tool. 440 users have made more than 50 "classifications" (but an "innocent classification" is not an "edit"); not sure exactly what you need here. STiki has been used to revert 279,528 edits (though total "edits" are probably roughly 2x that given a warning is typically placed); it is also impossible to quantify instances where STiki helped someone find damage, but then they reverted it using a tool like Twinkle. Hope this helps, let me know if more clarity is needed on anything. Thanks, West.andrew.g (talk) 18:35, 2 April 2013 (UTC)
I have mentioned a couple of times the idea of using some kind of regression tree on the odds of vandalism. I have discovered that this approach has a name! It's called a logistic regression tree.
A Google search came up with this paper by Wei-Yin Loh. Looks promising. What I find interesting about that paper is that the LOTUS algorithm it describes aims to choose a single predictor variable / feature on each leaf. If you used log(CBNG_odds) as a feature this would probably end up being used on most leaves. The tree would mostly just determine the constant and coefficient to modify log(CBNG_odds) by for different conditions. It may also find a few cases where log(CBNG_odds) doesn't help and determine what does in those cases. The perfect way to do a combined scoring system!
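To make the "constant and coefficient on each leaf" idea concrete, here is a small sketch. It is not the LOTUS algorithm (which also learns the tree and its splits); the single split on `is_anonymous` and the leaf values are invented purely for illustration.

```python
import math


def log_odds(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical one-split tree: each leaf holds (constant, coefficient).
# The (0.0, 1.0) leaf leaves the CBNG probability unchanged.
LEAVES = {
    True: (0.5, 1.2),   # made-up values for anonymous editors
    False: (0.0, 1.0),  # made-up values for registered editors
}


def combined_probability(cbng_probability, is_anonymous):
    """Apply the leaf's logistic model to log(CBNG_odds)."""
    constant, coefficient = LEAVES[is_anonymous]
    return sigmoid(constant + coefficient * log_odds(cbng_probability))
```

In a fitted tree, the leaves and their values would come from training on past classifications rather than being written by hand.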
As far as I can tell logistic model trees and logistic regression trees are the same thing.
Oo... look... there's an article called logistic model tree. It doesn't say much at the moment though. Maybe I should have a go at expanding it. Anyone else want to join in?
Looks very cool. Theoretically, I think this is pretty straightforward to build (given the 800,000+ classifications we can train over). There are some temporal issues that give me concern, however. I will post some graphs I've recently produced once I get this dissertation wrapped up that well capture the phenomenon and how it might influence our training. West.andrew.g (talk) 05:50, 10 April 2013 (UTC)
FR 2: I find the editor info at the bottom is too easily overlooked while naturally looking at diffs starting at the top of the screen, how about moving it to the top? (this also is more consistent with the normal diff). This may be personal but I wonder what others think. Maybe a preference if divided opinion.
Greetings Widefox. Bug 1 has been reported and a solution will be pushed in the next release (it parallels T#025 in the tracking table above). FR1 sounds reasonable, it should be trivial to include in the next release, as well. I will allow the community to discuss FR2. I have some doubts about its visual pleasantness, but IIRC it's a rather simple parameter which determines placement -- so an option/preference may be viable. Thanks, West.andrew.g (talk) 17:17, 12 April 2013 (UTC)
In terms of FR2, I am guessing that the reason that Andrew put the diff first was that the original STiki queue used to ignore language completely so having humans inspect the diff was the key missing ingredient. Of course, now the STiki queue does include some language features and CBNG uses language heavily... so arguably there is less of a reason to put the diff first now.
I quite like having it like it is but you're not the first person to request this change. Maybe having a user preference to change the position of the diff browser makes sense.
Thanks for the quick response. I figured it presumptuous to edit the GFR messages page without first checking in here, so thanks I will in future. Just wanted to give you further feedback:
FR3: Allow more custom GFR messages (at least 6) and expanding the set ones, such as "incorrect edit to DAB page". I will try to get some together and add to Wikipedia:STiki/Good-faith-revert messages.
I test STiki with Java 8 dev previews (Java 8 b85 as of today - it's there, there just isn't a webpage for it yet), which work great (both 64-bit and 32-bit). I'm sure you're aware of this. Widefox; talk 13:02, 13 April 2013 (UTC)
Space
I'm usually using Huggle, but tried STiki out. I think it's dangerous that the space key redoes the last action. I just did some "vandalism" because I thought space was for innocent. (And why does the top of this page say I shall discuss the software on the article, and not here on the talk page?) Christian75 (talk) 23:09, 16 April 2013 (UTC)
Hi Christian,
I wasn't aware that space redid the last action. I agree that this is a dangerous feature. Space is a big key that is easy to press by mistake.
In terms of what it says at the top of the page, I think you are mis-reading it. It says "Please discuss STiki here." not "Please discuss STiki here".
I just downloaded STiki for the first time, but I can't connect to the server. I can't think what the problem is, as outbound port 3306 is working fine according to this site, and I have the latest version of the software. Is anyone else having trouble connecting? — Mr. Stradivarius♪ talk ♪08:25, 16 April 2013 (UTC)
I'm stumped. I thought it might have been something to do with my Java environment, as I was using OpenJDK, but I installed Oracle Java 1.7 and still get the same problem. Anyone have any ideas? — Mr. Stradivarius♪ talk ♪09:14, 16 April 2013 (UTC)
STiki is starting up fine, but showing you the "no connection" dialogue? You can also try pinging "armstrong.cis.upenn.edu" -- that is where STiki lives. Thanks, West.andrew.g (talk) 13:40, 16 April 2013 (UTC)
Yes, that's right, it shows the "no connection" dialogue and then closes. Pinging the STiki server works fine, and the connection is pretty fast, so no problems in that area. I suspect it's some sort of network issue, though - I have dual boot Ubuntu and Windows 7, and I get the same dialogue from STiki with both. — Mr. Stradivarius♪ talk ♪15:30, 16 April 2013 (UTC)
@Andrew - it's a home network. Perhaps it could be a NAT issue? @Widefox - this is an automatic check that STiki performs when it starts up, so I don't get to choose between http or https. I don't get as far as the main STiki screen - when I run the program, it just shows the "no connection" message, and then exits. — Mr. Stradivarius♪ talk ♪16:50, 16 April 2013 (UTC)
This is perplexing, more so because it seems like you know what you are doing. What does "nmap -p 3306 armstrong.cis.upenn.edu" give you? West.andrew.g (talk) 22:38, 16 April 2013 (UTC)
I know just enough about computers to get myself in trouble. :) Nmap gives me this:
Nmap scan report for armstrong.cis.upenn.edu (158.130.51.53)
Host is up (0.19s latency).
PORT STATE SERVICE
3306/tcp open mysql
Nmap done: 1 IP address (1 host up) scanned in 3.67 seconds
I just checked again, and I'm still getting the "no connection" dialogue. (This is in Ubuntu.) Perplexing indeed. I'll have the opportunity to try this from a different connection tomorrow, so I'll do that and then report the results back here. — Mr. Stradivarius♪ talk ♪07:10, 17 April 2013 (UTC)
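For anyone without nmap, a rough equivalent of the port check above can be done with Python's standard library. This is only a sketch: it tests whether a TCP connection can be opened, which is less than what nmap reports, and the function name `port_is_open` and the 5-second default timeout are arbitrary choices.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `port_is_open("armstrong.cis.upenn.edu", 3306)` would repeat the check from this thread. As the thread shows, an open port does not by itself guarantee that the STiki client can connect.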
Ok, I've just tried it from a new connection, and STiki starts up fine. This connection also looks like it is using NAT, though, so that in itself is not the problem. So, there's some sort of difference between my home network and this new network, that isn't NAT, isn't anything to do with my actual computer, and isn't the ability to connect to the STiki server on port 3306, that stops STiki from starting. I really can't think what that might be, though. — Mr. Stradivarius♪ talk ♪09:03, 18 April 2013 (UTC)
Hmm, network problem? Various things to try:
0. You have turned everything off and on again?
1. If wifi, try ethernet.
2. Double NAT?
3. Router: 3.1 try a different router (if you can); 3.2 see if there's a firmware update; 3.3 change settings (temporarily: turn off router firewall, ALG off, turn on WAN ping, packet inspection).
4. Packet fragmentation problem? 4.1 check MTU on router.
5. Test/workaround: 5.1 try proxy (without giving it password); 5.2 VPN.
6. Client DNS: opendns or google etc.
Widefox; talk 20:17, 22 April 2013 (UTC)
Why?
Was using the tool when I got this diff... It was already done using STiki, so how could it have crept in!? Also I had 4 and a half day old diffs shown today. Cheers, TheStrikeΣagle 13:05, 23 April 2013 (UTC)
Was this diff of another STiki user served from the STiki queue or the CBNG queue? Andrew has said he will remove these from the CBNG queue. Presumably that will come in the next update. It isn't mentioned in the bug/request list so I don't know if that means it has been forgotten or maybe Andrew did it straight away and didn't add it to the list.
Congratulations on finding a 4-day-old diff. These used to be fairly common but the high usage that STiki gets means that they are less common now.
Yeah, it will be in the next "update" -- although since this is a server-side change, I don't normally note those in the bugs/feature table unless they are significant. I thought I had changed this, maybe I just forgot to merge the changes on the live server. As a point of reference, two STiki classifications have been on edits that were over one year old. West.andrew.g (talk) 20:13, 23 April 2013 (UTC)
Andrew, Do you know (or can you search the database to find) the oldest diff that has been reverted using STiki? Who did the revert? Yaris678 (talk) 20:08, 24 April 2013 (UTC)
Ran the query quickly. It seems many of the oldest reverts are "good faith" in nature. Surviving for 33778741 seconds was 1st place. Active duration of 32057836 seconds for 2nd place. Third place is RID=469874327 surviving for 31588969 seconds, and given that it was REV-DELETED, I think we can assume it was probably vandalism. Thanks, West.andrew.g (talk) 22:09, 24 April 2013 (UTC)
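The survival times above are easier to read in days. A quick conversion, using nothing beyond the three figures quoted in the post (the helper name `survival_days` is just for illustration):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def survival_days(seconds):
    """Split a duration in seconds into (whole days, leftover seconds)."""
    return divmod(seconds, SECONDS_PER_DAY)


for s in (33778741, 32057836, 31588969):
    days, remainder = survival_days(s)
    print(f"{s} s = {days} days + {remainder} s ({s / SECONDS_PER_DAY:.2f} days)")
```

That works out to just under 391 days, about 371 days, and about 365.6 days respectively, so all three edits survived for more than a year.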
Interesting.... both first and second place can be characterised as follows:
Within the last month (reverts on the 30th and 31st of March 2013)
Reverted by Mr X
Presumably, Mr X decided they were both link spam.
Hmmm... I suppose if one is using an older version of STiki they would still have access to the disabled queues that are not being populated with new edits. That would probably be a pretty convenient route to make really old reverts happen. West.andrew.g (talk) 14:02, 25 April 2013 (UTC)
Unable to close the "customize the AGF message box"
This is a big issue, as it happens that I change my mind about a revert, and I am not able to close the box, making me have to go through with the edit even if it is not 100% appropriate. LiquidWater 20:26, 5 May 2013 (UTC)
I am aware of the issue, as it piggy-backs on T#025 in the bug tracking table. It will be fixed in the next release. Thanks, West.andrew.g (talk) 05:54, 7 May 2013 (UTC)
News from Snuggle
I saw these plots by EpochFail, creator of Snuggle, and found them quite interesting.
I also note that EpochFail has started using his desirability metric for new users. I had a quick go on Snuggle and it seems very effective. STiki users may be interested to know that the desirability metric is based on STiki scores.
The data is not terribly surprising, but I will note those are some very beautiful transparent graphs.
I'll note that the fact STiki has virtually no reverts < 60 seconds old is intentional. An explicit delay and client-side queuing dynamics prevent users from seeing edits much younger, and I see this as a good thing. STiki aims for elegant workload distribution, and just being another player in the Huggle-esque rat-race to vandalism only creates more repetitive work and would create more edit conflicts. Thanks, West.andrew.g (talk) 18:11, 8 May 2013 (UTC)
Yes. Totally agree that it is good that STiki is looking at older edits. That's why I like the graph, it illustrates this advantage of STiki well. You can see that we don't even look at edits during that big peak that Huggle has at about 20 seconds. We wait till things have calmed down slightly.
Are there any plans for STiki to be able to look at a chain of edits by multiple users? That would obviously allow us to be even more chilled about it. i.e. if vandalism was followed by a non-reverting edit and STiki still remembered it we could happily revert it weeks later.
To be frank, my next set of cycles plans to: (a) Handle the current round of bugs/feature requests and push out a new release of STiki, (b) I owe some serious time to User:Ocaasi and WP:Turnitin. I won't say that it won't happen, but I am not ready to promise it anytime soon.
I'd also like to look at the "meta" queue, and I think I owe you a graph or two from my recent explorations in that direction.
Towards improving queue accuracy, it would be very helpful if we knew how many watchlisters an article has. I know there is a toolserver tool that does this. However, I think it is: (a) quite inefficient, and (b) will only say "under 30 watchlisters" if that is the case for privacy reasons. West.andrew.g (talk) 19:54, 8 May 2013 (UTC)
Well, I'm glad to hear I'll get some of your time ;) Administrators (like you!) can see pages with under 30 watchers, but this is kept secret because those pages would be ideal targets for vandalism. You might want to check at WP:AN/I or WP:VPT about using your admin status to incorporate that data. It'd be very useful to us but you would need to keep it absolutely secret. Ocaasit | c 20:01, 8 May 2013 (UTC)
Ooo... the WP:Turnitin collaboration looks very interesting. Are you going to do a STiki-style client/server model where users get served info about a possible copyvio?
Still looking forward to those graphs.
I can appreciate you have a lot on at the moment!
I have been thinking about the edits by multiple users. I think you could pretty much separate the GUI improvements from the new queue and do whichever you fancied first. But I won't go on about it now. Maybe I'll do a bigger post on the subject in the future.
Hi, I'm User: One Of Seven Billion, and I am here to ask for permission to use your tool, as I have not made more than 1000 edits, and do not have rollback rights (I was denied them). I have some experience with vandalism, and I just want to make this encyclopedia, this infinite web of knowledge, even better, and make it an encyclopedia, not a vandal haven! One Of Seven Billion (talk) 20:03, 8 May 2013 (UTC)
Not done Why were you denied rollback? Looking through your contribution history it seems you have done a lot of trivial work, which while helpful, doesn't demonstrate any subtle judgement capabilities. The little damage you have undone on your user-page... well.... looks a bit suspicious once IP geolocation is brought to bear. If you are truly interested in anti-damage work you should consult the WP:CVUA who gives tutorials and can brief you on policies. Until you have their certification/recommendation, I am going to decline your STiki permissions request. Thanks, West.andrew.g (talk) 20:23, 8 May 2013 (UTC)
Request for Update to Statistics
Hey Andrew! I hope that all is well! Now that school is done for the semester and I've wrapped up some research I've been doing, I'm almost done with the STiki tutorial. Before I release it, I'd like to wrap up my baseline statistics (as unscientific as this may be, it'll give the community something). Could you please provide me with the same information as you provided before? --Jackson Peebles (talk) 17:35, 12 May 2013 (UTC)
I'm happy to provide any statistics you need, but clicking-through on that wikilink I noticed you already seem to have the statistics for "today" (and they appear accurate). What other types of things are you interested in? Thanks, West.andrew.g (talk) 22:09, 12 May 2013 (UTC)
Ah, I suppose I did get most of what you provided, before. How many users with over 50 edits do you have? Also, anything else you'd like to include. Thanks, as always! --Jackson Peebles (talk) 05:20, 14 May 2013 (UTC)
Greetings STiki users. Below is the CHANGELOG for the 2013_1_26 release (available for download on WP:STiki). I have done some changes to my hardware and OS, so I am hopeful we do not run into any issues with Java versioning. I am excited there is some new GUI functionality in this release. This is likely to be the final version I release as a graduate student (!), but as always, I appreciate your suggestions and bug fixes! Thanks, West.andrew.g (talk) 03:38, 26 January 2013 (UTC)
When making an "AGF revert" classification, a dialog will now appear that allows users to post custom notifications/explanations to the offending user's talk page. This dialog can be disabled via the "options" menu. The dialog contains a text field in which any message can be authored, but more likely, users will take advantage of the pre-written/template-esque options provided in a drop-down menu. There is also the capability to store custom messages if an editor is so inclined. Overall, it is hoped this will result in less WP:BITE behavior towards those being reverted (T#019).
The "edit properties" panel now displays the relevant "permissions" or "groups" which the editor under inspection has obtained. These are placed parenthetically next to the user name, and are listed below. If a registered user has none of these, then "no permissions" will be displayed. No information is displayed about IP editors (as they can have no such permissions) (T#022).
"autoconf" -> autoconfirmed
"conf" -> confirmed
"review" -> reviewer
"rb" -> rollbacker
"admin" -> sysop
Minor organizational changes to the "edit properties" window to bring greater prominence to the "summary" and "editor" fields, since these are often vandalism indicators (T#024).
Users can now generate a "live" leaderboard for their own viewing purposes inside the tool, via "Rev. Queue" -> "Generate leaderboard". It is a slightly simplified version of the one at WP:STiki/leaderboard, but should serve the edit-count-itis folks quite nicely until the nightly updates are made on Wikipedia (T#023).
A previous change prevented "null diffs" from being pointlessly displayed to end users. That change essentially caused such RIDs to be ignored. This caused them to stay in the queue, often accumulating and forcing the program to process (but re-skip them) at start-up. A fix now dequeues null diffs upon their discovery.
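The difference between skipping and dequeuing null diffs can be sketched as follows. This is an illustration under assumptions, not the actual fix: the queue is modelled as a `collections.deque` of RIDs, and `fetch_diff` is a hypothetical callable that returns an empty value for a null diff.

```python
from collections import deque


def next_displayable(queue, fetch_diff):
    """Return the first RID with a non-empty diff, dequeuing null diffs.

    queue: deque of RIDs, most suspicious first (modified in place).
    fetch_diff: callable mapping an RID to its diff text ("" or None if null).
    """
    while queue:
        rid = queue[0]
        if fetch_diff(rid):
            return rid       # non-empty diff: leave it queued for classification
        queue.popleft()      # null diff: remove it so it is never re-processed
    return None
```

With the earlier "ignore" behaviour, the `popleft()` step would be missing and the same null diffs would be encountered again at every start-up.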
The "WikiTrust" and "Link Spam" queues have been disabled. The former is dependent on third-party computation which has been broken for some time (and no one ever used the queue, anyways). The link spam queue is not really used either, but that queue is computed on a local machine. The primary reasoning for disabling it is cost. The classification model is dependent on queries to the Alexa API, which are not free, and we make many thousands monthly ($65+) (T#021).
I encourage other STiki fans to check out and distribute this tutorial. I found it to be effective while concise (and likely to improve if Yaris' comments are heeded). I didn't have much to comment on that Yaris hadn't already, but I did second that a human voice (at least for English) would be more engaging. Thanks, West.andrew.g (talk) 15:24, 21 May 2013 (UTC)
Following that, when Jackson has a copy he is willing to consider "reasonably final", we should make sure there is a stable URL and then prominently include it somewhere on the STiki main page. Thanks, West.andrew.g (talk) 15:25, 21 May 2013 (UTC)
STiki may be down! (UPDATE: Online)
With my thesis now defended (!), I am beginning to transition from UPenn to a new research position in the Washington D.C. area. In the immediate term, the STiki server is moving off my personal desk and into a position within a dedicated server room at Penn. This change should not impact end users aside from some brief down time effective immediately. Newer versions of the software were prepared for this change and should work immediately after everything gets plugged in and comes online. However, older versions may incur a bit more of a delay as the new DNS routing will need to propagate. Thanks, West.andrew.g (talk) 18:45, 14 May 2013 (UTC)
Thank you! The IP address will change, but the URL/API should not. I am just waiting for the University to do the DNS switch and for that to propagate around the Internet. The new IP is 158.130.7.3 if you are really eager, but otherwise all should be back to normal shortly (assuming they get to this before end-of-business today). Thanks, West.andrew.g (talk) 19:43, 14 May 2013 (UTC)
I'm struggling to look up any scores for revisions that occurred after 04:22, 15 May 2013. Is it possible that STiki has stopped updating? --EpochFail (talk • work) 14:11, 16 May 2013 (UTC)
Was going to test this and see what revisions I got however I'm currently unable to connect to the STiki backend or download STiki from the main page. Tried accessing from both my machines - everything else seems to be working fine. Anyone else having issues? Fraggle81 (talk) 16:14, 18 May 2013 (UTC)
I am going to assume this issue is resolved unless I hear otherwise -- since I see you did perform some classifications yesterday, Fraggle. Thanks, West.andrew.g (talk) 13:26, 21 May 2013 (UTC)
Weird, especially in how the link is resolving to a location different from that displayed in wiki syntax. Besides that, I'm having no issue. This direct link doesn't work?: STiki. Thanks, West.andrew.g (talk) 21:52, 18 May 2013 (UTC)
I have fixed the link. It appears to work now, but earlier neither the direct link nor http://cis.upenn.edu/ would work. Anyway, I downloaded it, and when installing I received an X error, but unfortunately I can't tell you what it was because shortly after that Fedora went into a kernel panic *shrug*. --wintonian talk 23:06, 18 May 2013 (UTC)
Was the X error as you were unzipping the download? You don't need to install it. You just need to unzip it and then run it with Java. Yaris678 (talk) 06:54, 19 May 2013 (UTC)
No, it was after I ran "java -jar STiki_2013_01_26.jar" after extracting it. BTW, is there a way to run it without being root? It doesn't appear to like sudo. --wintonian talk 15:43, 19 May 2013 (UTC)
I really hope you don't need root to run Java! Can you run other Java programs? Is this with Sun/Oracle's Java 7? If you do need root (what is the world coming to), have you tried the Fedora equivalent of gksu instead of sudo (i.e. beesu)? Widefox; talk 17:10, 19 May 2013 (UTC)
Thanks, you're right, I certainly don't want to run it as root. Anyway, I found the problem: I wasn't added to the 'wheel' group. I don't really run Java apps, and yes, I am using OpenJDK 7. Finally I can get some vandal bashing done for the first time since converting to Linux and failing to intoxicate Huggle with Wine. Just in case anyone ends up in the same predicament, they can just use the commands provided by the Fedora Project, for F18 at least. --wintonian talk 18:10, 19 May 2013 (UTC)
As a general FYI, I have never heard of such difficulty getting STiki to run on a Linux box. Indeed, it was programmed in Fedora-flavored operating systems, which I use on a daily basis. Wintonian, is your machine part of a large institutional network? The actual machine, not just the connection? I could imagine the sysadmins having some restrictive policies. Otherwise this is very weird. STiki needs minimum permissions on the file system. It needs to execute itself, and the program only writes a configuration file in user space (which, if it fails, should be a caught Exception that does not break program operation). Thanks, West.andrew.g (talk) 14:45, 21 May 2013 (UTC)
Some institutions are very restrictive. You know the trouble I had getting STiki to work on my girlfriend's (Windows) laptop? I have worked out that it won't let you execute anything with Java... which is weird cos it does have Java installed... I can only assume her employer has put some kind of lock on it so that Java can be used for very limited things. Yaris678 (talk) 15:23, 21 May 2013 (UTC)
Nope, just me at home. Having only used Linux for a year, I freely admit I still have a lot to learn, but as soon as I added myself to the 'wheel' group it worked with or without sudo. The strange thing is I have just tried removing myself from the group again and it still works. *confused*. --wintonian talk 00:02, 22 May 2013 (UTC)
Edits older than 20 days
Over the past two weeks using STiki on a portable Wifi connection, I've been receiving very old edits, with one of my reversions ending up getting shown on the interface, and the software refuses to show any recent edits. What's going on? hmssolent\You rang? ship's log 11:39, 16 May 2013 (UTC)
Hmm... this could be related to the STiki server moving, but I wouldn't have thought that it would last for very long. Can you confirm the following?
You have only been getting old edits - no new edits at all.
This has been going on for about two weeks.
If these are both true then I'm not sure what the issue is. Some questions that might help us diagnose the problem:
What is the youngest edit you have seen recently? Does there seem to be a cut-off point? e.g. all edits shown are from before 1st of May.
Which queue have you been using? Does this issue occur on both the "STiki (metadata)" and the "ClueBot-NG" queue?
Both are true. The youngest edit was eight days old, the oldest being 40, and I've never used the STiki metadata queue. On my home network the ClueBot-NG queue showed very recent edits, the oldest being 3 days old. hmssolent\You rang? ship's log 02:24, 17 May 2013 (UTC)
I have finally managed to get on to STiki and I can report I have the same problem. The youngest diff I have seen is 16 days old. I too had the problem with both the CBNG and STiki queues. It does look like the STiki server isn't looking at new edits any more.
If I had to take a stab in the dark I would say that the new machine that the server is installed on can't scan new edits. Maybe that requires some software it doesn't have or maybe there is a firewall issue.
HMSSolent, do you still get recent diffs on your machine on your home network? That is the one piece of evidence that doesn't fit.
Fixed -- IRC issues were causing the main thread to hang. IRC output has been temporarily disabled (I don't think anyone uses it, anyways). Let me know if things do not return to normal. I'll investigate this further and report back at greater length sometime later. Thanks, West.andrew.g (talk) 16:40, 18 May 2013 (UTC)
I would like to use STiki. I make many reverts of vandalism, but I often don't have time to categorize the vandalism (i.e. good faith). LieutenantLatvia (talk) 01:48, 23 May 2013 (UTC)
Done -- I've approved your use of the tool, with 700+ article space edits and minimal/no complaints, I feel this user qualifies. Please familiarize yourself with the policies and usage suggestions at WP:STiki before using. Thanks, West.andrew.g (talk) 02:41, 23 May 2013 (UTC)
Time-in-queue and probability decay
I'll quickly post these graphs I promised Yaris in an earlier thread; produced during dissertation writing. Basically, the longer any Wikipedia edit remains most recent on a page, the greater the probability that *someone* has reviewed it. Thus, an edit with an (initial) computed vandalism probability of 80% that was made two months ago has probably been reviewed by this point, so a recent edit with the same 80% (initial) probability is more likely to be vandalism than the older one.
The left-most figure makes this "temporal accuracy decay" concrete. All edits in this picture were computed by a back-end scoring engine to have an initial vandalism probability of ~80% (within a small epsilon). We see that the actual revert probability (based on human response) for these edits falls immediately to about 60% (because STiki forces a 60-100 second delay or so for Hugglers to do their thing, they deplete some of the true positives). We are able to plot a line that more accurately estimates the actual vandalism probability as a function of time.
The question here is whether we should use these empirical regressions to affect prioritization. This would certainly increase the hit-rate of end users, but may also effectively eliminate the possibility of finding really old vandalism (as the age decay would push these deep into the queues).
I think the *ideal* way to do this is to investigate the product of "time since edit" and "watchlister count" (which, per Occassi's suggestion, means I need to investigate how my 'admin' rights enable me in that respect). This product would seem an accurate heuristic of whatever we are trying to measure here ("has someone reviewed this?", "how many people have reviewed this?").
The second graph just confirms that many of the edits which STiki reverts have a lifespan long enough for this to have an effect. Thanks, West.andrew.g (talk) 17:52, 21 May 2013 (UTC)
Very interesting graphs. Thanks.
Strange that a straight line fits. This implies that after about 100 hours the probability will be negative! Maybe it's actually the log odds (log(p/(1-p))) that follow a straight line.
up to 5 seconds - the results are dominated by CBNG. This will have a very steep slope but the gradient will lessen for lower nominal probabilities since there is a degree of correlation between the CBNG vandalism prob and the STiki vandalism prob (and CBNG reverts if its prob is above a specific value). In practice, the gradient of this line is both hard to measure and irrelevant, because of the time scale.
5 seconds to 2 minutes - the results are dominated by Huggle. This will have a less steep slope than for CBNG.
2 minutes to 36 hours - the results are dominated by article watchlisters. The gradient of the slope will be less than for Huggle and related to the number of article watchlisters.
36 hours plus - the results are dominated by people finding the vandalism by chance. The gradient of the slope will be less than for watchlisters. The number of watchlisters may serve as a proxy for the number of people viewing the article but the relationship won't be as good as it is when watchlisters dominate.
These times are obviously approximate. If I had the data I might be able to be more specific. We would probably find that the width of the watchlister-dominated section is dependent on the number of watchlisters.
In practice, you probably want to ignore the first two lines and base probability decay on the second two lines. This will help to avoid competing with the Huggle users.
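The four regimes listed above can be written down as a simple lookup, which makes it easier to see which ones a decay model would keep. This is only a sketch: the function name is made up for illustration, and the cut-offs are the approximate ones quoted in the list, not measured values.

```python
def dominant_reviewer(age_seconds):
    """Return which group most likely reviews an edit of this age.

    Boundaries are the approximate ones from the discussion above
    (5 seconds, 2 minutes, 36 hours); they are not fitted to data.
    """
    if age_seconds <= 5:
        return "ClueBot NG"
    if age_seconds <= 2 * 60:
        return "Huggle"
    if age_seconds <= 36 * 3600:
        return "watchlisters"
    return "chance discovery"

def used_for_decay(age_seconds):
    """Only the last two regimes would feed the probability decay."""
    return dominant_reviewer(age_seconds) in ("watchlisters", "chance discovery")
```

Under this reading, an edit only starts to have its probability decayed once it is more than about two minutes old, which is what keeps the model from competing with Huggle users.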
I have a few points to make about the issue of eliminating the possibility of finding really old vandalism:
1. If you fit a line to log odds rather than probability then the calculated probability won't drop to zero.
2. The calculated probability will obviously still go very low though, meaning that it will eventually reach the point that it will never get to the front of the queue.
3. Concerns raised by point 2 can mostly be dealt with by correctly dealing with the transition from the watchlister-dominated line to the lower-gradient chance-discovery line.
4. I think the only remaining concern is that some people just like finding really old cases of vandalism!
I think it may be a good idea to implement probability decay in a way that lets users choose to not use it. This would effectively turn the current two queues into four queues. It would keep people looking for old vandalism happy and would also mean that anyone else could occasionally check for corner cases that have fallen between the cracks.
I guess there is another issue, related to point 4. Applying a probability decay will obviously result in people seeing more recent edits, on average. Some STiki users might prefer to see slightly older edits because of the reduced chance of an edit conflict. As with point 4, this can be addressed by making the probability decay optional. Another approach would be to allow users to increase the minimum age of diffs that they see. Yaris678 (talk) 07:46, 24 May 2013 (UTC)
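To make the log-odds suggestion concrete, the sketch below compares a straight line fitted to the probability with a straight line fitted to log(p/(1-p)). The starting value of 60% and the linear slope (chosen so the line crosses zero at about 100 hours) echo figures mentioned in this thread; the log-odds slope is purely hypothetical, and none of these numbers are fitted to real STiki data.

```python
import math

def logit(p):
    """Log odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Inverse of logit; always returns a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def linear_probability(p0, hours, slope_per_hour):
    """Straight line in probability space; can leave the [0, 1] range."""
    return p0 + slope_per_hour * hours

def decayed_probability(p0, hours, slope_per_hour):
    """Straight line in log-odds space, mapped back to a probability."""
    return inv_logit(logit(p0) + slope_per_hour * hours)

# Illustrative values only (see the assumptions stated above).
p_linear = linear_probability(0.6, 120, -0.006)    # goes negative past ~100 hours
p_logodds = decayed_probability(0.6, 120, -0.02)   # small, but still positive
```

The log-odds version keeps shrinking with age without ever reaching zero, which is the behaviour described in point 1.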
T#018
Until T#018 is implemented, can we have a workaround such as also sending a (final) warning to the editor's page as well as reporting to AIV? That way, if declined by AIV, at least they will have a warning rather than nothing. My example is User_talk:70.60.86.147 where I decided to manually warn after being declined:
I don't know where Reaper Eternal got insufficiently warned from. The IP had already been blocked that month! All the edits post-block were vandalism and the IP had already been warned post-block by CBNG.
I can't really see how your idea would work Widefox... Look at it from the IP's point of view. They get a final warning "do that one more time and you get blocked" so they stop... and they get blocked anyway.
I tend to agree with Yaris on these points. You could push for a new template that states: "a request has been lodged for your blocking... even if that fails... be warned this is your very last opportunity... here are some links to help you out either way". I have no real opinion on whether this is a good idea or not, but this is RfC material for broader consensus, not something that STiki should pursue independently. Indeed, it is awkward if an AIV post fails and no warning is issued -- but perhaps the burden here is better placed on the AIV admins, who should issue a "last warning" message (or other appropriate level) if they reject an AIV post so as not to confuse automated tools and/or any subsequent AIV requests. Thanks, West.andrew.g (talk) 19:33, 30 May 2013 (UTC)
A level 4 warning with the text saying their blocking is being discussed at WP:AIV. In a similar way to mandatory WP:ANI notification, but just as a way of giving a level 4 warning. The problem is that without such a belt and braces workaround, fully automated STiki users may be unaware of repeatedly failing to warn or block. We have custom AGF messages, so would it really require an RfC for a temporary workaround? Widefox; talk 03:10, 31 May 2013 (UTC)
That suggestion would also require wider consensus. I don't think it would be a good idea. I would prefer not to give vandals any jollies about how people are discussing blocking them or about how they managed to escape being blocked. See also: WP:DENY. Yaris678 (talk) 11:58, 31 May 2013 (UTC)
Feature request - plurals
I realise this feature request is a bit nitpicky but hopefully it will be quite quick to do and it will make STiki edit summaries look better.
In the edit summary box, can we have a code word #s# that puts an s in the edit summary if more than one edit is reverted? This means the default edit summary could be
Reverted edit#s# by [[Special:Contributions/#u#|#u#]] identified as test/vandalism using [[WP:STiki|STiki]]
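A minimal sketch of how such a placeholder could be expanded is below. The `#s#` code word is the one proposed in this request and `#u#` is taken from the summary text quoted above; the function name, its parameters, and the substitution order are assumptions for illustration, not STiki's implementation.

```python
TEMPLATE = ("Reverted edit#s# by [[Special:Contributions/#u#|#u#]] "
            "identified as test/vandalism using [[WP:STiki|STiki]]")

def render_summary(template, user, edits_reverted):
    """Fill in the #s# (plural) and #u# (username) placeholders.

    #s# is replaced first so that a username which happens to
    contain the text "#s#" is not altered afterwards.
    """
    plural = "s" if edits_reverted > 1 else ""
    return template.replace("#s#", plural).replace("#u#", user)

single = render_summary(TEMPLATE, "Example", 1)
multiple = render_summary(TEMPLATE, "Example", 3)
```

With one edit reverted the summary begins "Reverted edit by", and with three it begins "Reverted edits by".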
Hi, I am requesting access to use STiki. Although my contribs show that I have more than 1500 edits on WP, the tool doesn't allow me to edit. I usually watch recent changes for reverts and I feel this tool will help me be quicker in vandal reverts. Amit (talk) 14:09, 31 May 2013 (UTC)
Done -- The automatic entry threshold is 1000 article namespace edits, of which you have only ~750. Regardless, I find that you have demonstrated good judgement in your prior revert actions and I have approved your STiki account. Thanks, West.andrew.g (talk) 14:14, 31 May 2013 (UTC)
Done -- Normally such a lack of anti-vandalism experience would result in a declined request. However, in his brief career, Matty seems to have performed a large variety of maintenance tasks (disambiguation, speedy deletion tagging, etc.) with minimal negative feedback. He has also requested the assistance of a CVUA mentor. I encourage him to carefully read policy pages, but I will assume good faith that he will slowly tread into this new domain. Remember that a "pass" classification hurts no one. Thanks, West.andrew.g (talk) 13:08, 1 June 2013 (UTC)
I went on STiki just now and didn't seem to get many recent or relevant edits. The STiki and Cluebot NG queues seem to be equally faulty in this respect. The newest edit I got was around 1 hour; the oldest was something like 270 days. The average timestamp was around 5-10 days. My revert rate was also particularly low. Is there any occurrence which can explain this, and does anyone else have the same problem? 069952497a Comments and complaints Stuff I've done 00:11, 2 June 2013 (UTC)
Same here - I'm having last month's feed jam all over again, only worse. The oldest was something like 605-700 days old, the newest ones being half an hour ago. Same goes for the STiki metadata. hmssolent\You rang? ship's log 01:04, 2 June 2013 (UTC)
I noticed some strange stats coming through yesterday and wondered whether a vandal was draining the system. I haven't booted up STiki to check, but the symptoms would match. In the meantime, you can stalk the vandalism in my IRC channel @ ##930913 (connect). 930913 (Congratulate) 04:31, 2 June 2013 (UTC)
Everything is fine on STiki's end. CBNG's feed did go down for a little while today, but STiki attached immediately when it came back online (this is *completely* beyond my control -- I will shift default queue selection if CBNG remains down for too long). The STiki (metadata) calculation shows no blip. I have heuristics to detect if a vandal is trying to affect queuing dynamics. I encourage users not to over-scrutinize queue properties -- I have tried to rigorously analyze these for my dissertation -- and there are far too many environmental variables to try to make sense of what is going on. The STiki process emails me 6x daily regarding its status and statistics -- so I am generally quite in tune if something is going wrong. Thanks, West.andrew.g (talk) 07:11, 2 June 2013 (UTC)
Odd, I have no record of CBNG going down. *Ten minutes later* Ah, you got caught on the wrong side of the netsplit - "16:07 -!- Netsplit radian.cluenet.org <-> decay.nullroute.eu.org quits: STikiQueuer"
Only 6x per day? I get mine every five minutes. So you can detect and revert malicious STiki usage (or breach attempts)? I wonder if I could pose as another user. Could you detect that? 930913 (Congratulate) 09:07, 2 June 2013 (UTC)
I am not going to discuss STiki security/vulnerability on-wiki, as that just encourages WP:BEANS. However, if you browse the archives here (or maybe my user-page), I remember an interesting discussion a while back with the CBNG folks on a privacy-safe mechanism to check claimed wiki identity to third-party services. Also, I get human-readable reports 6x daily -- the heuristics and status checks are a more continuous matter. Thanks, West.andrew.g (talk) 13:09, 2 June 2013 (UTC)
Request for Access
Hi, I am requesting access to STiki. I have nominated many articles for deletion that have been deleted and have reported vandals to administrators and gotten them blocked.
Thanks, Surfer43 (talk) 03:34, 2 June 2013 (UTC)
Done -- Yeah, I did the database access addition and decided a nap was appropriate before confirming here. This case establishes a lower bound on what I am comfortable approving. Only ~200 previous edits. Very similar to -- but with less experience than -- the "Matty.007" account above. While assuming good faith, I am also vetting the work of these users in hindsight to ensure quality. Though some users have been around quite a long time, broadly the retention rate for anti-vandalism work is quite poor, and I think a permissive approach is beneficial to the project. Thanks, West.andrew.g (talk) 22:12, 2 June 2013 (UTC)
I for one would really love this feature, especially as I work within a VPN most times. Just mentioning it to give it some weightage ;-). Other than that it has been a great tool for me till now. Amit (talk) 06:01, 3 June 2013 (UTC)
I would like to use the tool STiki to help revert vandalism. I currently help review new articles and tag them. I also help patrol recent changes for vandalism. I would use this tool to make my change patrolling easier. I only started contributing recently but I do have a few edits. If you decline my request I understand. Falkirks Talk 02:59, 6 June 2013 (UTC)
Done -- Well qualified. Nearly 1000 edits across all namespaces doing maintenance work and polite handling of newbie talk matters. As an aside, I am curious why a lot of semi-new new page patroller folks have recently seemed to show up at STiki's doorstep. No complaints, of course! But did STiki get some good press in that sub-space? Thanks, West.andrew.g (talk) 03:32, 6 June 2013 (UTC)
Thanks, I am not exactly sure why new page patrollers are showing up here. Very strange indeed. I think I read about STiki while reading some Wikipedia documentation, not on new page patrol. Falkirks Talk 14:02, 6 June 2013 (UTC)
I think it is a chain reaction. Most people want to contribute to the project, and anti-vandal edits are a quick and easy way to contribute to WP (even with only an average subject knowledge of the articles that are being edited). You may also notice that most of these new users (including me) use STiki and do a high volume of edits to revert vandalism or good faith erroneous edits. Most editors probably see other recent change patrollers using STiki and then come here to request the same. The edit summary with the STiki link is the "good press" :-). Amit (talk) 17:15, 6 June 2013 (UTC)