Just a thought... one article is featured on the Main Page per day, so presuming that we don't want to duplicate any, we need the delta-FA for any day to average to at least 1. Over the last year (June 2005 - May 2006) the delta-FA is 371, so it's barely above 1. There is, of course, a buffer of quite a number of FAs which have not been used yet. -- Mithent 01:05, 3 June 2006 (UTC)
But didn't the Brilliant prose category serve a similar purpose? As an aside, does anyone have any idea which article was the first to receive FA/Brilliant Prose status? Lisiate 01:54, 11 August 2006 (UTC)
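The arithmetic in this comment can be checked with a short sketch. The net change of 371 and the June 2005 to May 2006 window come from the comment itself; the function name is made up for illustration.

```python
from datetime import date


def average_daily_net_promotions(net_change: int, start: date, end: date) -> float:
    """Average net FA change per day over the inclusive period [start, end]."""
    days = (end - start).days + 1  # +1 because both end points are counted
    return net_change / days


# Figures quoted in the comment: a net gain of 371 FAs from June 2005 to May 2006.
rate = average_daily_net_promotions(371, date(2005, 6, 1), date(2006, 5, 31))
print(f"{rate:.3f} net new FAs per day")
```

A value above 1 means the pool of unused FAs grows; a value below 1 means the Main Page draws down the buffer.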
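Let the FA proportion equal the number of FAs divided by the number of articles (where t is time, in units of months):
$P(t) = \frac{FA(t)}{A(t)}$ (eq 1)
So take the derivative of the above with respect to time:
$P'(t) = \frac{FA'(t)\,A(t) - FA(t)\,A'(t)}{A(t)^2}$ (eq 2)
FA'(t), the change in FA count with time, is defined as:
$FA'(t) = \Delta FA$ (eq 3)
Substitute equation 3 into equation 2:
$P'(t) = \frac{\Delta FA \cdot A(t) - FA(t)\,A'(t)}{A(t)^2}$ (eq 4)
The FA proportion isn't significantly changing now. The numerator in equation 4 must, therefore, be 0. Therefore:
$\Delta FA \cdot A(t) = FA(t)\,A'(t)$ (eq 5)
It's well known that the article count is exponential:
$A(t) = P e^{rt}$ (eq 6)
$A'(t) = r P e^{rt}$ (eq 7)
Substitute 6 and 7 into 5:
$\Delta FA \cdot P e^{rt} = FA(t) \cdot r P e^{rt}$ (eq 8)
Divide both sides by $P e^{rt}$:
$\Delta FA = r \cdot FA(t)$ (eq 9)
Divide by r:
$FA(t) = \frac{\Delta FA}{r}$ (eq 10)
And there you have it. The FA proportion is constant when the FA count and delta FA reach this state. From the data on this statistics page, we see that FA'(t) (aka, ΔFA) is roughly 30 for all t, so:
$FA(t) = \frac{30}{r}$ (eq 11)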
Also, I think you might be confusing FA(t) with FA'(t). FA'(t) (aka, monthly promotions minus monthly demotions) is roughly constant, at about 30-ish per month. Therefore, FA(t) - the integral of a constant - is linear. And the graph bears this out pretty well too. Raul654 19:46, 1 March 2007 (UTC)
I agree that the FA count is not constant, and is increasing linearly with time, at a rate of about 30 articles a month. But I don't think eq.11 says this.
Correct me if I am wrong, but FA(t) is the number of featured articles, as a function of time, t (presumably in units of a month); FA'(t) is the derivative of FA(t) with respect to t; A(t) is the total number of articles, as a function of t, and P(t) is the proportion of featured articles to total articles, as a function of t; no? And r is a constant (the exponent in A = P·e^(rt))?
So FA(t)=30/r means that the number of featured articles is a constant? (30 is a constant, and so is r, so 30/r is a constant, and has no dependence on t.)
Equations 5-11 describe the conditions necessary to keep the FA proportion constant. As long as FA(t)=30/r is true, the FA proportion will stay constant. Raul65402:15, 2 March 2007 (UTC)[reply]
A better conclusion might be to start with ΔFA = 30, and so integrate directly to get FA = 30 t + c (i.e. the number of featured articles increases linearly with respect to time); or with ΔFA = r × FA, which implies an exponential increase in FA, I think. -- ALoan (Talk) 23:21, 1 March 2007 (UTC)
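The two readings in this exchange can be compared numerically. The sketch below assumes a linear FA count (30 net promotions per month, as quoted above) and an exponential article count; the growth rate, starting article count and starting FA count are hypothetical values chosen only for illustration. It shows that the FA proportion is momentarily flat exactly where FA(t) = ΔFA / r, and is rising before that point and falling after it.

```python
import math

DELTA_FA = 30.0   # net promotions per month (figure quoted in the thread)
R = 0.05          # hypothetical monthly growth rate of the article count
P = 1000.0        # hypothetical article count at t = 0
C = 100.0         # hypothetical FA count at t = 0


def fa(t: float) -> float:
    """Linear FA count: the integral of a constant DELTA_FA."""
    return DELTA_FA * t + C


def articles(t: float) -> float:
    """Exponential article count."""
    return P * math.exp(R * t)


def proportion_slope(t: float) -> float:
    """Derivative of fa(t) / articles(t), computed with the quotient rule."""
    a = articles(t)
    a_prime = R * a
    return (DELTA_FA * a - fa(t) * a_prime) / a ** 2


# The slope is zero where fa(t) equals DELTA_FA / R.
t_star = (DELTA_FA / R - C) / DELTA_FA
print(t_star, fa(t_star), proportion_slope(t_star))
```

With a linear FA count the condition holds at a single instant rather than for all t, which is the point raised in the replies above.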
Is there some easy way to see this broken out into # promoted and # demoted? I know how to find both the number promoted and the list of articles in question but I am more interested in a log of which articles were demoted in which months. Is this available somewhere? Dalf | Talk 05:05, 19 January 2007 (UTC)
The graphs show that the percentage of articles that are featured has been in a steady decline. Any thoughts about why this might be true? JoshuaZ18:35, 1 March 2007 (UTC)[reply]
Actually, as I reported to the signpost - I think we hit a turning point in February. In February, for the first time in 2 years, the proportion increased.
People with bots/more knowledge than I have of perl et al: Can someone automate the process of grabbing the sizes of all FA articles and listing the 10 largest, 10 smallest, and getting us mean, median, and mode? MrZaiustalk09:11, 20 May 2007 (UTC)[reply]
I came across this page a couple of days ago following what links here, and have actually started working on a script to do this. Hopefully it should be finished within the next few days. Dr pda (talk) 20:45, 22 November 2007 (UTC)[reply]
I've finished the script, which works for any template transcluded on an article page (so it can also be used for stubs, infoboxes, cleanup tags etc). Documentation can be found at User talk:Dr pda/generatestats.js. Below is the output for {{featured article}}. A couple of caveats: The size is the size of the wikitext, not the readable prose size. Calculating the prose size requires loading each page, whereas the wiki text size is stored in the database, and can be accessed via the API interface. It would be possible to run my prose size script on the top ten articles to see what their actual prose sizes are. Also I note the total number of articles is four fewer than the FA number; possibly there are some recent FA's which don't have the star yet. I'll look into this. Dr pda (talk) 01:54, 24 November 2007 (UTC)[reply]
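For readers who want to reproduce the summary requested above (largest, smallest, mean, median, mode), here is a minimal sketch. It is not Dr pda's script; it assumes the sizes have already been collected into a dictionary, and the titles and byte counts below are hypothetical.

```python
import statistics


def size_summary(sizes: dict, n: int = 10) -> dict:
    """Summarise a {title: size_in_bytes} mapping."""
    ranked = sorted(sizes.items(), key=lambda item: item[1])  # ascending by size
    values = [size for _, size in ranked]
    return {
        "smallest": ranked[:n],
        "largest": ranked[-n:][::-1],            # biggest first
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "modes": statistics.multimode(values),   # every most-common value
    }


# Hypothetical titles and wikitext sizes in bytes, for illustration only.
example_sizes = {
    "Article A": 12000,
    "Article B": 45000,
    "Article C": 30000,
    "Article D": 30000,
    "Article E": 98000,
}
summary = size_summary(example_sizes, n=2)
print(summary)
```

The same function works for wikitext sizes or readable prose sizes; only the input changes.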
Also, we should not be off by 4, because Gimmetrow checks regularly; it would not be in recent promotions, because four is too few. Maybe it's a glitch in WP:FFA, or someone removed a FA template. Gimmetrow can help there, because his scripts check regularly. SandyGeorgia (Talk) 02:06, 24 November 2007 (UTC)[reply]
I added lengths from the script for the ten Dr pda listed, plus the two Sandy mentioned. I assume the size Dr pda listed was "Wiki text size" but it didn't match exactly so I didn't add that number in the two additional rows. I assume that by "readable text" Sandy means the number the script gives as "prose (text only)". I also resorted the list by that number. Mike Christie(talk)11:12, 24 November 2007 (UTC)[reply]
Thank you, Mike; I prefer to look at article size via readable prose, since we shouldn't "penalize" well cited articles. I think that includes now all the extra-long articles, but if I remember another, I'll add it. SandyGeorgia (Talk) 13:53, 24 November 2007 (UTC)[reply]
I modified my article statistics script to calculate the "readable prose" size. Unfortunately the only way to do this is to load each article, so it takes about an hour to run over the ~1700 featured articles, as opposed to the 20 seconds or so if one just uses the wiki text size from the database. For this particular application (finding out which are the longest and shortest FAs) prose size is the way to go, but for other applications of the script (e.g. finding out which are the longest articles with a particular stub tag, or getting an idea of article sizes in a very large category) wiki text size might be sufficient. The article sizes given below should match the prose size returned by my prose size script; there may be differences of up to 1kB due to differences in rounding, since I had to rewrite the algorithm. Btw the discrepancy User:Mike Christie noticed above was due to me using 1000 instead of 1024 to convert to kB; I have now fixed this. The articles which I missed because they didn't have the FA star were Hurricane Dog (1950) (promoted November 5), Andrew Cunningham, 1st Viscount Cunningham of Hyndhope (promoted September 9), Sargon of Akkad (promoted July 24), and Society of the Song Dynasty, though someone's added the star for that now. I've also got the complete list of FA's sorted by prose size; if anyone's interested I'll put it in my userspace. From this one can see that just over 75% of all FAs are 32kB or shorter. Dr pda (talk) 01:27, 27 November 2007 (UTC)[reply]
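The 1000 versus 1024 discrepancy mentioned above is easy to see in a small sketch. The function names and the sample sizes are hypothetical; only the two divisors and the 32kB cut-off come from the comment.

```python
def to_kb(size_bytes: int, bytes_per_kb: int = 1024) -> float:
    """Convert a byte count to kB using the given divisor."""
    return size_bytes / bytes_per_kb


def share_at_or_below(sizes_bytes, limit_kb: float, bytes_per_kb: int = 1024) -> float:
    """Fraction of the given sizes that are at or below limit_kb."""
    within = sum(1 for size in sizes_bytes if to_kb(size, bytes_per_kb) <= limit_kb)
    return within / len(sizes_bytes)


size = 50_000  # hypothetical prose size in bytes
print(to_kb(size, 1000))  # decimal convention
print(to_kb(size))        # binary convention, slightly smaller

# Hypothetical sample: three of these four sizes are 32kB or shorter.
sample = [10240, 32768, 40960, 20480]
print(share_at_or_below(sample, 32))
```

Articles near a size threshold can land on different sides of it depending on which divisor is used, so the convention should be stated alongside any such list.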
I'm not too concerned about the FAs without a star, I was more worried about whether there was a bug in my script which was missing articles for an unknown reason. Dr pda (talk) 02:06, 27 November 2007 (UTC)[reply]
Is there any way you can write a program to read through an article and count how many inline cites there are in an article [including multiple invocations of the same ref]? I think that would be useful so that we can calculate the ref/prose ratio and see which articles need to be reffed properly. I did it manually for all the Australian FAs to see which ones were underreferenced (needless to say it is only a guide or rule of thumb, since an article can be sourced with few cites if all the info is derived from the same chunk of book over and over). Blnguyen (bananabucket) 02:19, 27 November 2007 (UTC)
I know. There are some articles that can have 3 refs for each kb of prose yet the refs still don't cover all of it, whereas some have only 1.5 or so and cover everything (usually when the author mostly based it on one book and only has one ref per para). But anything that trawls up about < 1.2 (from about 120 that I did manually) will always have unreferenced sections. Blnguyen (bananabucket) 02:28, 27 November 2007 (UTC)
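A rough sketch of the ref/prose ratio described above. It assumes citations appear as `<ref>` tags in the wikitext and that the readable prose size is supplied separately; the function names and sample text are hypothetical, and citations produced purely by templates would not be counted. The 1.2 threshold is the rule of thumb from the comment above.

```python
import re

# Opening <ref ...> tags, including self-closing reuses such as <ref name="x" />.
# Closing </ref> tags and <references /> are deliberately not matched.
REF_OPEN = re.compile(r"<ref(?:\s[^>]*)?>", re.IGNORECASE)

RULE_OF_THUMB = 1.2  # refs per kB of prose, from the comment above


def count_inline_cites(wikitext: str) -> int:
    """Count every citation invocation, including repeats of a named ref."""
    return len(REF_OPEN.findall(wikitext))


def refs_per_kb(wikitext: str, prose_bytes: int) -> float:
    """Citations per kB of readable prose (1 kB taken as 1024 bytes)."""
    return count_inline_cites(wikitext) / (prose_bytes / 1024)


sample = (
    'Fact one.<ref name="smith">Smith 2001, p. 4.</ref> '
    'Fact two.<ref name="smith" /> '
    "Fact three.<ref>Jones 1999.</ref>\n<references />"
)
ratio = refs_per_kb(sample, prose_bytes=2048)  # prose size is hypothetical
print(count_inline_cites(sample), ratio, ratio < RULE_OF_THUMB)
```

As noted in the thread, the ratio is only a guide: an article based on one book can be fully sourced with few cites.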
Sandy, Rick Block, Gimmetrow, and myself have finished auditing this page (the grunt work was done at the now-deleted user:Raul54/test). The numbers are more accurate and (for the FA proportion) more precise. Raul654 14:48, 6 July 2007 (UTC)
Clicking on an end of month diff may result in a different FA count than shown in this chart; the numbers on this chart are accurate. If clicking on FA promotions or FA demotions yields a different number than shown in the chart, please let us know, as that could indicate vandalism or another problem; those numbers should match the archives exactly. SandyGeorgia (Talk) 15:51, 6 July 2007 (UTC)[reply]
I didn't check article history in between the first December 2001 iteration of WP:FA and the January 2004 RefreshingBrilliantProse (RBP); that would be way too much work since there are no good records and it would involve stepping through WP:FA diffs one at a time. So, if any of these articles were defeatured and re-featured between Dec 2001 and Jan 2004, I didn't pick that up. (Kudos to Yannismarou and Ceoil for maintaining Wiki's oldest FAs to current standards!) SandyGeorgia (Talk) 17:53, 22 November 2007 (UTC)
OK, I finished 2008, but was not sure about the superscript. I always was not sure how to 56.25%. Is this OK? Wanted to check before I did more. Ruhrfisch ><>°° 02:18, 18 September 2010 (UTC)
(out) Glad to help - I need to do some PRs and will work on this the next few days. I also added the year total for 2008, is that OK? Finally, do you want the numbers linked to the archives? Ruhrfisch><>°°02:41, 18 September 2010 (UTC)[reply]
I am doing totals and percentages with a spreadsheet, so they will be correct unless I mistyped something. Since the promoted FAs are given by month to Oct. 2003, but the archived FACs only are given by month to April 2004 (with aggregates before), how do you want to handle those? Ruhrfisch><>°°21:33, 18 September 2010 (UTC)[reply]
cross-posted to WT:FA, please comment in general over there, and add more stats here.
Has anyone ever kept track of stats on types of featured articles and topics? For example, I've done some analysis at User:Carcharoth/Featured articles needing regular updates. The list of living people featured articles was obtained with the Cat Scan tool, though the raw list was filtered to focus on single-person biographies. The results (as of 23 February 2008) were:
Of 1906 featured articles, 436 were about people or groups of people.
Of these, 33 were about music groups, leaving 403 single-person biographies (21%)
How about suggesting somewhere else to put the list? I'm not trying to be silly here, Sandy, but I've been making what I thought were useful suggestions and analysis around the FA process, and I'm getting the impression that there is some resistance to this from you, or at least some terseness. Do I have to contribute more regularly to reviews to get a better response, or something? For what it is worth, there are biographical encyclopedias out there, so the idea of having a section on "biographies" as a topic is not completely out of left field. Carcharoth (talk) 02:59, 23 February 2008 (UTC)[reply]
I'm sorry if my responses seem terse Carcharoth. I missed this comment last time through, and only noticed it now that Gimmetrow posted (below). I admit to being frustrated at following a conversation that's in four places: the talk pages of WP:FAC, WP:FAR, WP:FA and WP:FAS. I'm not sure where to suggest you can put the list, because I'm not aware of what types of readers might want to see that specific grouping, particularly considering the other sources of similar info Gimme posted below. There are so many different and interesting ways of sorting out the FAs that it's hard for me to view bios as having a place any more special than any other grouping. I often toy with different ideas about how we would divide up the FA list when it becomes necessary, and I can never convince myself for one way any better than any other, but we'll cross that bridge when we come to it. I'm also concerned that the title of your list could mislead; as I said at the talk page of WP:FAR, I don't believe that bios necessarily need regular updating any more than several other categories of FAs do. SandyGeorgia (Talk) 06:26, 23 February 2008 (UTC)[reply]
No problem, Sandy. Sorry if I got a bit frustrated there. I agree the title of my page is a bit misleading now. It was intended as a list of FAs that might need regular updating, but I soon gave up on that idea and got sidetracked into cleaning up the biography list. Gimmetrow points out Category:FA-Class biography articles, but my point, made several times now, is that that category (like all the WP 1.0 assessment categories) includes featured lists, which I want to separate out. Easily done (eg. using CatScan) but still annoying to have stuff mixed up like that. More specific to WPBiography is the mixing up of group articles (in this case music group articles) with single-person biographies. It is the latter I wanted to extract, and that took a while, especially as the "musicians" category includes both single-person musician biographies and music group "biographies". I suspect this is a historical feature due to the set up of WPBiography being done by Kingboyk, who is involved in music articles. I have no problems with music groups being included in WPBiography, but I do wish they could be filtered out more easily. Music groups are not the only examples of "group" articles, of course, but they are the most common, I think. Carcharoth (talk) 11:01, 23 February 2008 (UTC)
From the above, we can see that 21% of the featured articles are biographical articles in the classical sense (ie. articles about the life story of a single person, as opposed to a band or group of people). I'm not sure whether it means anything (would we expect the figure to be higher or lower?) but I hope I'm not the only person to find that interesting. Carcharoth (talk) 03:14, 23 February 2008 (UTC)[reply]
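The 21% figure follows directly from the counts listed earlier in this thread (1906 featured articles, 436 about people or groups, 33 of those about music groups). A minimal check, with variable names chosen for illustration:

```python
total_fas = 1906          # featured articles as of 23 February 2008
people_or_groups = 436    # FAs about people or groups of people
music_groups = 33         # of those, FAs about music groups

single_person_bios = people_or_groups - music_groups
share = single_person_bios / total_fas

print(single_person_bios, f"{share:.1%}")
```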
My gut feeling (without having done any analysis) is that bios are not unique. There are probably many subdivisions reaching about 20%, like warfare or history (since these are overlapping, not mutually exclusive categories), so I don't know that bios are unique. But I haven't checked. SandyGeorgia (Talk) 04:11, 23 February 2008 (UTC)[reply]
Thanks for those links. The former is more what I'm looking for, but as I said, the category system at the moment isn't really clean enough and mixes up various types of "people-related" articles with the more normal "single-person" biographies. Anyway, I have some graphs to upload now! They don't include the biography stats, unfortunately, but maybe later. Carcharoth (talk) 11:01, 23 February 2008 (UTC)[reply]
OK, I took data from this page version of Wikipedia:Featured articles, and rustled up the following graphs. This sort of thing varies a lot over time as newer articles are featured and older ones defeatured, and as new section "types" are added, so a snapshot at any one particular time is more interesting than useful, but as I'd collected the data, I thought I'd put the graphs up anyway.
It should be noted that the types at WP:FA are broad ones, intended to keep the page clean and readable compared to the more extensive categorisation seen at WP:GA and in the WP 1.0 assessment categories (by wikiproject), and that (by design) there is no overlap between types, though in fact many articles could fit in more than one. Having said that, there are currently 28 types, with more than half the featured articles being made up of the top seven types, and the top eight types having over 100 articles (biology and medicine; media; music; geography and places; history; warfare; sport and recreation; and literature and theatre). There are 6 types with less than 20 articles (computing; language and linguistics; business economics and finance; mathematics; philosophy and psychology; and food and drink).
Possibly doing more analysis than this will not be productive, but one idea I had was to continue trying to work out (using the categories) what the numbers of featured articles are at the very broadest levels, such as portals and topics such as history, science, technology, biography, and so on. Some people have said that Category:FA-Class articles provides this, but in fact that category assessment system (for WP 1.0) bundles featured lists and featured articles together, so (for example), Category:FA-Class geography articles has two articles, a featured article and a featured list, and doesn't include most of the articles under the "Geography and places" section of WP:FA. The useful endpoint of all this would be to feed through updated lists for portals, such as Portal:Geography, Portal:History, Portal:Science and Portal:Biography to use. Those portals (and many others) change their featured article on a monthly basis, but in some cases there are enough articles available to have the articles updating more frequently, on either a weekly or daily basis. See Portal:Biography/Selected article/Candidates#Automatic rotation for an example of this. I am sure other portals now have enough featured articles available to rotate featured articles daily, and I hope analysis like this will help.
First, I'm shocked. I had never actually tallied the numbers, but I thought Warfare and History were the largest categories, and would be the first to require division. Perhaps they just "look" large because their articles tend to have longer titles? I had no idea Biology and Medicine was the largest group. (OK, tooting my own horn, and Marskell, Casliber and TimVickers' too :-)) Anyway, next ... it looks like we have the makings of a Dispatch article. I'll ping Marskell and make sure he's looking in here (I'm not sure he watches this page). SandyGeorgia (Talk) 21:57, 23 February 2008 (UTC)[reply]
Ps, can we sync this with WP:WBFAN and find what percentage of those in Biology and medicine were nommed by Casliber, Marskell and TimVickers (and anyone else I don't know about)? I think you've got a WP:FCDW article here. (FAR saves of Tuberculosis and Schizophrenia go to Tim and Cas, btw.) Ah, and the Dino guys; they beef that category up, what is that number? SandyGeorgia (Talk) 22:04, 23 February 2008 (UTC)
An FCDW article? Wow. Please feel free to use the stats for that, but please do double-check my figures. As for the dino articles, I looked at Talk:Acrocanthosaurus and was shocked to see that WikiProject Dinosaurs hadn't updated its assessment tag! In the end, I found a total of 16 that hadn't been updated, which probably means that the WikiProject should get an award for doing more article writing than article assessment! :-) Anyway, the newly updated Category:FA-Class dinosaurs articles contains 23 featured articles and one list. The rest of the section can be broken down into other subcategories, not for the purpose of browsing, I hasten to add, but to help identify trends and hotspots. I'll do that now for biology and medicine. Carcharoth (talk) 23:00, 23 February 2008 (UTC)[reply]
The 161 "biology and medicine" ones are: 67 "animals" (including 28 birds), 28 "general biology", 23 "dinosaurs", 22 "medicine", 11 "people" and 10 "plants" (including one fungus). If that helps. I think it would help to get lists separated out from featured articles in the assessment categories. Category:FA-Class bird articles has 43 members, but that includes 12 lists, Georg Forster, Archaeopteryx, and Flight feather, a total of 15 that subtracted from the 43 figure yields 28, the number of articles I've labelled "birds". I think the WP:1.0 people did say a while back they were thinking of doing a separate listing or category for featured lists. Indeed Category:FL-Class articles exists, but it seems the uptake is not 100% yet. Maybe that could go in some announcement or report? Carcharoth (talk) 00:26, 24 February 2008 (UTC)
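Since Carcharoth asked for the figures to be double-checked, here is a small consistency check of the breakdown above. All numbers are taken from the comment; the variable names are illustrative.

```python
# Sub-groups of the "biology and medicine" section, as listed in the comment.
breakdown = {
    "animals": 67,
    "general biology": 28,
    "dinosaurs": 23,
    "medicine": 22,
    "people": 11,
    "plants": 10,
}
section_total = sum(breakdown.values())

# Category:FA-Class bird articles: 43 members, of which 12 are lists and
# 3 (Georg Forster, Archaeopteryx, Flight feather) are not counted as birds.
bird_category_members = 43
excluded = 12 + 3
bird_articles = bird_category_members - excluded

print(section_total, bird_articles)
```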
(od) Nice work! The last time I did a dispatch, I whipped it together in half-an-hour on Monday night. Unless Carcharoth has a problem, I'll use these graphs to do the same tomorrow night (mentioning that you compiled them, of course). I think we can segue from the broad numbers to the issue of pop culture over-representation and how much it's been debated (e.g. here). Marskell (talk) 20:02, 24 February 2008 (UTC)[reply]
That discussion Marskell linked was interesting. I note he said there that "The four of 28 FA categories that absorb pop cult—Media, Music, Sport and recreation, and Video games—account for 500, or 26%, of our FAs." - that may be an oversimplification. I had a closer look at "Music" and found at least 26 (of the 153) are decidedly not popular culture (though some of the "Music of..." articles are borderline). See the list here for example. "Media" is more uniformly popular culture, though even that has Film Booking Offices of America, B movie, BBC television drama, Blackface, Kinetoscope, Mutual Broadcasting System and Sound film as encyclopedia articles on more "classical" (non-popular culture) topics. That is 7 out of 159, though the biographies of very famous people would also be "classical" encyclopedia topics (some of the biography articles probably count as "popular culture"). Even "Video games", at first glance irredeemably popular culture, has ESRB re-rating of The Elder Scrolls IV: Oblivion, GameFAQs, Nintendo Entertainment System, PlayStation 3, Super Nintendo Entertainment System, and Wii, though all those are still about contemporary topics. As for "Sport and recreation", there probably are some in there that are "popular culture", but not really more than in other areas. As you can probably tell, I enjoy identifying broad category areas. One thing I'd love to do is identify the articles on "established" topics, as opposed to articles on "contemporary" topics that may, to be fair, only be transient and not become a noticeable part of history, no matter how 'popular' or 'current' they are today. Something like "pre-1960s", and excluding living people. But that would take a lot of time. It would be nice if the "403 biographies" bit could be mentioned, oh, and that there are 80 articles on living people. Carcharoth (talk) 00:11, 25 February 2008 (UTC)
FA proportion column heading
There's obviously been some confusion about this... The proportion of FAs is equal to the number of Featured Articles divided by the total number of articles in Wikipedia. (The fact the number of articles is given in thousands is irrelevant). For example, for July 2008, there were ~2,484,000 articles (as shown by the "2484" in the number of articles column), and 2,163 FAs. Dividing the latter by the former yields 0.000871 = 0.0871%, the number shown in the "FA proportion" column. Nothing gets multiplied by 100. Tompw (talk) (review) 21:12, 10 August 2008 (UTC)
The multiplication by 100 is to convert from a decimal number (0.000871) to a percentage (0.0871%). Percent comes from the Latin per centum, i.e. 'for every hundred', so the percentage sign represents an implied division by 100. The multiplication by 100 is necessary to convert the raw number into "units" of percentage. Dr pda (talk) 21:22, 10 August 2008 (UTC)[reply]
Umm... 0.000871 is exactly equal to 0.0871%. They are the same number, just written differently. 0.000871*100 = 0.0871 = 8.71%, which isn't the number wanted. Tompw (talk) (review) 21:33, 10 August 2008 (UTC)[reply]
Semantics: the calculation to get the percentage shown in the chart is one number divided by the other times 100. Do as you will with the chart; I don't have time to worry about such a small issue as where we put the % (until someone is confused down the road and wants to put it back). SandyGeorgia (Talk) 21:55, 10 August 2008 (UTC)[reply]
(outdent) I agree that 0.000871 is equivalent to 0.0871%. I also agree that 0.000871*100 = 0.0871. I interpret the formula in the column heading as the algorithm necessary to obtain the percentage of FAs (which is what Sandy has just said). Perhaps it would be clearer if the percentage symbol was moved to the column header to show the "units" in which the values in the column are expressed.
FA percentage #FAs/#articles*100 (%)
0.0871
0.0868
...
though this then requires the reader of a row somewhere in the middle of the table to scroll back up to find out what this number means. Dr pda (talk) 22:23, 10 August 2008 (UTC)[reply]
OK, how about a compromise: how about #FAs/#articles*100% ? Tompw (talk) (review)
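Both sides of this exchange describe the same value; the difference is only in where the factor of 100 and the percent sign are written. A minimal sketch using the July 2008 figures quoted above (the function name is hypothetical):

```python
def fa_proportion(fa_count: int, articles_thousands: int) -> float:
    """FA count divided by total articles (the table gives articles in thousands)."""
    return fa_count / (articles_thousands * 1000)


p = fa_proportion(2163, 2484)      # July 2008 figures from the discussion

as_fraction = f"{p:.6f}"           # the raw ratio
as_percent_manual = f"{p * 100:.4f}%"   # multiply by 100, then append the sign
as_percent_format = f"{p:.4%}"     # let the format spec do the conversion

print(as_fraction, as_percent_manual, as_percent_format)
```

The last two strings are identical, which is why "#FAs/#articles*100" with a "%" in the header and "#FAs/#articles*100%" both describe the column correctly.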
Thought this might be of interest to people. The following histogram shows the distributions of the readable prose sizes for all 4262 GAs and 2066 FAs as of June 2008. After about 25kB the number of GAs is approximately equal to the number of FAs. The majority of GAs are shorter articles; if I recall correctly the reason GA was introduced was to provide some sort of recognition for articles which were not necessarily of a length/comprehensiveness to reach FA, so as far as this goes it seems to be working. There is also a non-zero minimum length for the articles which have been promoted to FA so far. Dr pda (talk) 04:28, 21 September 2008 (UTC)[reply]
I thought the graph was a good demonstration of the fact that, despite what many claim, GA does reward short articles. Unlike FA, where there is much soul-searching and wikilawyering about how best to prevent short articles reaching FA status. But that doesn't mean that GA is only for short articles. --Malleus Fatuorum (talk) 17:55, 16 November 2008 (UTC)
And I thought the discussion at FA was over comprehensiveness, not length (although keeping the discussion focused on that has proven impossible). SandyGeorgia (Talk) 17:59, 16 November 2008 (UTC)
fixed counts. Both bar and pie charts should now be correct. My error was in using the category separator character "·" to count in emacs: this works for all sections except for Mathematics whose second FA "1 − 2 + 3 − 4 + · · ·" used three of them. -84user (talk) 21:55, 16 November 2008 (UTC)[reply]
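The counting error described above can be reproduced in a few lines. The sketch assumes a section is a single line of titles joined by the "·" separator; apart from the series title quoted in the comment, the titles are placeholders.

```python
SEPARATOR = "·"

# "Article X" and "Article Y" are placeholders; the middle title is the one
# named in the comment, which itself contains three separator characters.
titles = ["Article X", "1 − 2 + 3 − 4 + · · ·", "Article Y"]
section_line = f" {SEPARATOR} ".join(titles)

# Naive approach: number of separators plus one.
naive_count = section_line.count(SEPARATOR) + 1

# Counting the titles themselves avoids the problem.
true_count = len(titles)

print(naive_count, true_count)
```

Counting from a list of titles (for example, from the page's links) rather than from separator characters sidesteps titles that contain the separator.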
Gnumeric deserves the thanks really, I just pasted and chose chart settings. If anyone wants the settings that made these (for future charts) the whole gnumeric files are at these webs.com URLs:
There has been a change in how the semi-automated peer reviews (SAPRs) are linked in peer reviews (now the link is to the SAPR script on the tool server allowing interested users to run SAPRs themselves if they want). This means there will no longer be an archive of SAPRs to use as the PR stat for WP:FAS.
From Dec. 2004 to Nov. 2007 the PR stat was how many PRs were in the PR archive for that month (how many closed that month). I took over doing the stats and started using the number in the SAPR archive (how many PRs were opened in a month). Since the stat used to be how many PRs were in the archive for that month, and will be going back to that for August 2009 and beyond, I was BOLD and went back and changed the stats for the months using SAPRs to the PR archive stat instead.
The two numbers are not identical - here they are with the archive stat (followed by the SAPR stat, so July 2009 had 128 in the SAPR archive and 127 in the PR archive):
2009 Jan 138 (150); Feb 137 (136); Mar 147 (123); Apr 117 (132); May 125 (107); Jun 114 (122); Jul 127 (128)
2008 Jan 185 (184); Feb 187 (188); Mar 156 (183); Apr 178 (204); May 168 (184); Jun 179 (173); Jul 158 (176); Aug 164 (177); Sep 133 (161); Oct 148 (158); Nov 153 (149); Dec 121 (153)
2007 Dec 107 (156)
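To see where the two counts diverge most, the figures above can be parsed and compared. This is a sketch only: the data strings are copied from the list above (archive count first, SAPR count in parentheses), while the regular expression and function name are illustrative.

```python
import re

RAW = {
    2009: "Jan 138 (150); Feb 137 (136); Mar 147 (123); Apr 117 (132); "
          "May 125 (107); Jun 114 (122); Jul 127 (128)",
    2008: "Jan 185 (184); Feb 187 (188); Mar 156 (183); Apr 178 (204); "
          "May 168 (184); Jun 179 (173); Jul 158 (176); Aug 164 (177); "
          "Sep 133 (161); Oct 148 (158); Nov 153 (149); Dec 121 (153)",
    2007: "Dec 107 (156)",
}

ENTRY = re.compile(r"([A-Z][a-z]{2}) (\d+) \((\d+)\)")


def sapr_minus_archive(raw: dict) -> dict:
    """Map (year, month) to SAPR count minus PR archive count."""
    diffs = {}
    for year, line in raw.items():
        for month, archive, sapr in ENTRY.findall(line):
            diffs[(year, month)] = int(sapr) - int(archive)
    return diffs


diffs = sapr_minus_archive(RAW)
worst = max(diffs, key=lambda key: abs(diffs[key]))
print(worst, diffs[worst])
```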
I figured it was better to be consistent for the Dec 2007 to July 2009 stats and make the switch in FAS, but some of the discrepancies are odd. If someone feels like checking, I would appreciate (I have July 2009 off by one from the SAPR archive, so there may be errors). Ruhrfisch><>°°03:38, 2 August 2009 (UTC)[reply]
Since May 14, 2008 User:PeerReviewBot has archived peer reviews; before then I or User:Allen3 archived peer reviews that were older than 14 days with no comments, or that were at FAC or FLC. Before this date part of the discrepancy might be due to someone archiving their own PR and not adding it to the archive. For example, Category:March 2008 peer reviews has 165 PRs, while the March 2008 PR archive has 156, and the SAPR archive has 183. If someone listed an article twice for PR in a single month, they would have two SAPRs, but only be listed once in the category for that month. I am not sure how the category is defined (PR started that month or closed that month or something else). Ruhrfisch ><>°° 13:29, 2 August 2009 (UTC)
OK, I went and looked at the data and linked the sources used and made the following table. I am not sure what the discrepancies are due to - I think the double listing would show up in all places. Anyway here's the table - I think I would use the category number. Any thoughts or insights welcome, Ruhrfisch><>°°02:29, 3 August 2009 (UTC)[reply]
I added totals. The category is highest, PR archive lowest. My current preference is to use the archive for the earliest two months and the category thereafter. Ruhrfisch><>°°11:30, 3 August 2009 (UTC)[reply]
Where there appears to be a discrepancy between the PR archive and the PR category, the most likely culprit is template transclusion limits. I checked out Wikipedia:Peer review/January 2009 and the 34 missing articles correspond to the 34 templates which are not transcluded (scroll to the bottom of the page). The number of articles in the PR archive can be computed more readily by counting the number of articles on the corresponding VeblenBot page, e.g. This should compare robustly with the number of articles in the category, as it is a snapshot of the category at the moment when the archive was closed: this has been typically a month after the category was active. Geometry guy21:10, 3 August 2009 (UTC)[reply]
Thanks, so three things should give us the same number in theory - the monthly PR archive, the monthly PR category, and the monthly VeblenBot PR page. The archive page can be incomplete because of transclusion size issues, so it is not reliable. The category and VeblenBot pages only go back as far as February 2008, so before then we have to use the PR archive. Of the category and the VeblenBot pages, the category gives the number directly (have to count by hand on VeblenBot's page) and the category makes it clearer which things should not be counted - see User:VeblenBot/C/February 2008 peer reviews and Category:February 2008 peer reviews - it is clearer in the category page that four of the things listed are WikiProject PRs and so should not count in the total for this page. I plan to use the category number from Feb. 2008 and will link to the page. Ruhrfisch ><>°° 13:10, 4 August 2009 (UTC)
OK, the stats page has been updated with numbers from the monthly PR category from Feb 2008 on. I adjusted the numbers in a few cases - some PRs from WikiProjects were in the cat, so I did not count those. I also removed the cat from a few copies - all of this means the table above is not always accurate, but the FAS page is. I linked the number to the cat page so anyone interested can check. Thanks to G'guy. Any thoughts on how to check the archives older than Feb 2008 for transclusion limit errors? Ruhrfisch><>°°18:05, 4 August 2009 (UTC)[reply]
You may recall at the beginning of the year I wrote a dispatch for the Signpost (Wikipedia:Wikipedia Signpost/2009-02-16/Dispatches) on the activity of the major content review processes, including PR. One of the results was the image at right . To obtain this I exported all the revisions of WP:PR and for each revision counted the number of occurrences of {{Wikipedia:Peer review/ in the wiki source of the revision. (Or the equivalent pages/templates after the VeblenBot automation). I still have the numerical data, so in principle I could write a script to go through these data and tally up how many reviews were removed from the page each month. (Though this would include malformed noms which were subsequently deleted). These numbers could then be compared to the numbers from the PR archive pages. I'll try to do this in the next few days. Dr pda (talk) 02:05, 5 August 2009 (UTC)[reply]
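The per-revision count described above amounts to a substring count. A minimal sketch, assuming the revisions are already available as (timestamp, wikitext) pairs; the sample revisions are made up for illustration and the export step is not shown.

```python
MARKER = "{{Wikipedia:Peer review/"


def count_reviews(wikitext: str) -> int:
    """Number of peer review subpages transcluded in one revision of the page."""
    return wikitext.count(MARKER)


# Hypothetical revisions of the peer review page.
revisions = [
    ("2005-02-01T00:00:00Z",
     "{{Wikipedia:Peer review/Example one}}\n{{Wikipedia:Peer review/Example two}}"),
    ("2005-02-02T00:00:00Z",
     "{{Wikipedia:Peer review/Example two}}"),
]

counts = [(timestamp, count_reviews(text)) for timestamp, text in revisions]
print(counts)
```

As the comment notes, a raw count like this cannot distinguish genuine closures from malformed nominations that were later deleted.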
As promised I have tried rerunning over my data, however the numbers I got for 2005–2007 were quite different from those in the WP:FAS table. Part of this is that by just counting the number of occurrences of {{Wikipedia:Peer review/, the script counted malformed/joke noms, or double counted reviews removed by vandalism and subsequently restored. What one really needs to do is step through diff-by-diff looking at the titles of the reviews, and/or how long they were on the page (i.e. excluding noms which were reverted within a short period). Fortunately I discovered difflib in Python, which allowed me to do exactly this. The numbers, and indeed the lists of articles, I get still don't tally with the numbers/articles on the archive pages (Wikipedia:Peer review/February 2005 etc). I don't mind going through and correcting the archives to add missing articles etc, but before doing so I wanted to doublecheck what convention we are using to assign the month. Is it the month the PR was opened, the month in which the last comment was made, or the month in which the review was removed from WP:PR? Dr pda (talk) 03:26, 14 August 2009 (UTC)[reply]
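A minimal sketch of the diff-by-diff idea, using `difflib.ndiff` from the Python standard library. The actual script is not shown in this discussion, so the function names, the `(timestamp, wikitext)` input format, and the one-day threshold for "a short period" are all assumptions.

```python
import difflib
import re
from datetime import datetime, timedelta

# Captures the subpage title after the prefix quoted in the discussion.
TITLE_RE = re.compile(r"\{\{Wikipedia:Peer review/([^}|]+)")


def titles(wikitext):
    """Return the review titles transcluded in one revision, in page order."""
    return [t.strip() for t in TITLE_RE.findall(wikitext)]


def closed_reviews(revisions, short=timedelta(days=1)):
    """Map each review title to the time it was removed from the page.

    `revisions` is an assumed format: (datetime, wikitext) pairs in
    chronological order. Listings shorter than `short` (an assumed
    threshold) are ignored, and a removal undone within `short` is
    treated as reverted vandalism rather than a closure.
    """
    listed_since = {}  # title -> time it first appeared
    closed = {}        # title -> (listed_since, removed_at)
    previous = []
    for timestamp, wikitext in revisions:
        current = titles(wikitext)
        diff = list(difflib.ndiff(previous, current))
        added = {line[2:] for line in diff if line.startswith("+ ")}
        removed = {line[2:] for line in diff if line.startswith("- ")}
        # A title in both sets was only moved within the page.
        for title in removed - added:
            start = listed_since.pop(title, None)
            if start is not None and timestamp - start >= short:
                closed[title] = (start, timestamp)
        for title in added - removed:
            if title in closed and timestamp - closed[title][1] < short:
                start, _ = closed.pop(title)
                listed_since[title] = start
            else:
                listed_since[title] = timestamp
        previous = current
    return {title: removed_at for title, (_, removed_at) in closed.items()}
```

The returned removal times could then be bucketed by month; assigning a review to the month it was opened would need the `listed_since` value instead.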
Thanks so much for doing this - the current archives are based on the month in which the peer review is closed, so to be consistent that should be used throughout. User:Allen3 used to do the archiving of PRs until the end of December 2007, and I am pretty sure this is how he did it too (should I ask him?). When I used the semi-automated peer reviews, I used the month the PR opened (as that was when the SAPR was generated), but now all the numbers are based on the category for that month (from Feb. 2008 on) or the archive for that month (before that). I really appreciate your work on this, but is this worth the effort? Ruhrfisch><>°°13:50, 14 August 2009 (UTC)[reply]
Hi, I updated the GA statistics for August, and I'm pretty sure the number you've given for FAs is wrong. There are 3374 FAs now (3 September), but two articles were promoted on 2 September, so the number for the end of August should be 3372, and the net number of promotions should be 28 (in either case the number should be red, since August has 31 days.) Lampman (talk) 16:58, 3 September 2011 (UTC)[reply]
Hello Ruhrfisch. I maintain the GA statistics page, and I think the number of FACs promoted in February is correct; it doesn't take into account the article promoted in March before you updated the FA stats page. I updated the GA stats page at 00:18 UTC on March 1st, and there were 4,181 FAs at that time. If you want to take No. 34 Squadron RAAF into account, the total should be 4,182 and not 4,181. AmericanLemming (talk) 06:44, 2 March 2014 (UTC)[reply]
If 34SQN isn't taken into account, that's correct, tks guys. I will double-check the whole thing as usual later today when I get a moment. Cheers, Ian Rose (talk) 06:57, 2 March 2014 (UTC)[reply]
Thanks to both of you - ideally the stats should be done just after midnight GMT of the last day of the month. Even if they are done late (as the Feb 2014 stats were this time) the idea is that the stats reflect what was done in the month of February. Since 34SQN was promoted on March 1, it should count on current FACs, but not be counted in the FAs promoted stat. Ruhrfisch><>°°18:03, 2 March 2014 (UTC)[reply]
Sandy, re. this edit, the use of colour was deliberate based on the hidden note on the main page that "no" is used if the figure is negative. Based on your edit and looking over the history of the stats, perhaps what's meant is "less than half" or some such -- pls let me know for future reference and so we can amend the note as necessary... Cheers, Ian Rose (talk) 07:45, 2 January 2015 (UTC)[reply]
hm, I had never seen that hidden note, and had never used that system. What I did always was:
30 is white
30–59 is light green
60 and above is dark green
below 30 is pink
The only negative that was in the chart before was from the Refreshing Brilliant Prose. Then, the two low months (less than 10) were Mar 2004 and Nov 2011, so I made them red. Perhaps what we really need to do is separate negative months from low months, because should we end up with some sort of new "sweeps-style" Refreshing, we could have a large negative like Jan 2004. But can we amend the note to not be counting days in a month? I had never done that-- just used 30 as a neutral white. SandyGeorgia (Talk) 16:49, 2 January 2015 (UTC)[reply]
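The colour convention described above can be written down as a small rule, sketched here under stated assumptions. The list gives both "30 is white" and "30–59 is light green", so this sketch assumes exactly 30 is white and 31–59 is light green; it also assumes that red covers both negative months and months below 10, which is how the chart is described as standing. The function name is hypothetical.

```python
def stats_colour(net_promotions):
    """Return the cell colour for a month's net FA change.

    Assumptions: exactly 30 is white, 31-59 is light green, and
    anything below 10 (including negative months) is red.
    """
    if net_promotions < 10:
        return "red"
    if net_promotions < 30:
        return "pink"
    if net_promotions == 30:
        return "white"
    if net_promotions < 60:
        return "light green"
    return "dark green"


print(stats_colour(28))  # pink
```

Separating negative months from low months, as suggested above, would mean adding one more branch for `net_promotions < 0` with its own colour.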
Article count jumps by 100,000 in a day
@FAC coordinators: You may or may not have noticed that between March 28 and March 29 the article count on the English Wikipedia jumped from about 4,753,000 to about 4,848,000, an increase of roughly 95,000 articles overnight. I brought this up, first at Talk:Main Page and then at Wikipedia:Village pump (technical)/Archive 136#Article count jumps by 100,000 in a day.... To make a long story short, it seems that the automatically generated article count isn't especially accurate, and they've now started running some maintenance script at the end of every month to fix it manually. Thus, it does seem that the drastic jump in article count is legitimate rather than a bug or human error.
Anyway, it might be a good idea to add a note to this month's FA stats explaining the reason for the unusually large increase in the number of articles; I'm thinking of perhaps doing that for the GA stats page, which I maintain. AmericanLemming (talk) 04:11, 31 March 2015 (UTC)[reply]
Has anyone noticed that the counter works funny now? It's 4,855,939 at the moment, which would imply that only about 7,000 articles have been created this month (?!). GregorB (talk) 13:22, 21 April 2015 (UTC)[reply]
I have noticed that. In the past the article count would increase by about 1,000 every day, but it has been increasing very slowly this month, and for a week or so it was actually decreasing by a few hundred articles each day. I brought it up at the Village Pump's technical section, but no one got back to me on it. See Wikipedia:Village pump (technical)/Archive 136#GA count drops 3,500 overnight. It may be worthwhile to bring it up again; as that was a secondary concern of mine at the time (the GA count has since been fixed), it's understandable that it was overlooked. AmericanLemming (talk) 16:55, 21 April 2015 (UTC)[reply]
Thanks - just asked the question in the WP:VPT. My guess is that the March jump overshot the actual article count for some reason, and now it's "compensating". GregorB (talk) 17:16, 21 April 2015 (UTC)[reply]
6284 + 17 = 6301, not the 6302 indicated at WP:FA month end (Z1720, be sure to check that the tally matches as a way of finding errors). [5] (By the way, updating this chart is much harder than it looks, so thanks for taking it on :) SandyGeorgia (Talk) 12:49, 1 August 2023 (UTC)[reply]
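The tally check mentioned above is simple arithmetic, and a sketch of it is below. The function name is hypothetical, and it assumes the 17 is the net change for the month (promotions minus demotions).

```python
def check_tally(previous_total, net_change, reported_total):
    """Compare last month's total plus the net change with the reported total.

    Returns (expected_total, matches).
    """
    expected = previous_total + net_change
    return expected, expected == reported_total


print(check_tally(6284, 17, 6302))  # (6301, False)
```

A mismatch like this one means either the net change or one of the two totals was recorded incorrectly.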