This is the workshop support page for the user script RedlinksRemover.js. Comments and requests concerning the program are most welcome. Please post discussion threads below the section titled Discussions. Thank you. By the way, the various scripts I have written are listed at the bottom of the page.[1]
This script is functional
This script processes bulleted lists, removing the redlinked end nodes iteratively until none are left. (A redlinked end node is a list item that consists of nothing more than a redlink, and that has no children.) After it has done that, this script delinks the remaining red links, and deletes red category links. It doesn't remove list items that have annotations, or that have children (indented entries beneath them).
This semi-automated editing script is currently alpha software.
It is still new and should be used with caution to ensure the results are as expected. Please check changes carefully before saving, and report any problems.
Script's workshop
This is the work area for developing the script and its documentation. The talk page portion of this page starts at #Discussions, below.
Description / instruction manual
The redlink remover has two major uses (but it is not limited to these):
It can help clean up outlines that have accumulated too many redlinks.
It simplifies creation of outlines using standard templates. A problem with outline generation templates is that they include every possible link that a particular type of topic (say, provinces, or cities) might have, which creates outlines with lots of red links. Following up outline creation with this script will solve that problem. Tip: it is best to work on the outline with redlinks for a while before using the redlink remover, because the script will delink those redlinks that have children, leaving them in as informative branches in the outline. Removing redlinks too early creates extra work, as many of the topics may need to be added back in or relinked.
How to install this script
Important: this script was developed for use with the Vector skin (it's Wikipedia's default skin), and might not work with other skins. See the top of your Preferences appearance page, to be sure Vector is the chosen skin for your account.
To install this script, add this line to your vector.js page:
Save the page and bypass your cache to make sure the changes take effect. By the way, only logged-in users can install scripts.
Explanatory notes (source code walk-through)
This section explains the source code, in detail. It is for JavaScript programmers, and for those who want to learn how to program in JavaScript. Hopefully, this will enable you to adapt existing source code into new user scripts with greater ease, and perhaps even compose user scripts from scratch.
You can only use so many comments in the source code before you start to choke or bury the programming itself. So, I've put short summaries in the source code, and have provided in-depth explanations here.
My intention is threefold:
to thoroughly document the script so that even relatively new JavaScript programmers can understand what it does and how it works, including the underlying programming conventions. This is so that the components and approaches can be modified, or used again and again elsewhere, with confidence. (I often build scripts by copying and pasting code that I don't fully understand, which often leads to getting stuck.) To prevent getting stuck, the notes below include extensive interpretations, explanations, instructions, examples, and links to relevant documentation and tutorials, etc. Hopefully, this will help both you and me grok the source code and the language it is written in (JavaScript).
to refresh my memory of exactly how the script works, in case I don't look at the source code for weeks or months.
to document my understanding, so that it can be corrected. If you see that I have a misconception about something, please let me know!
In addition to plain vanilla JavaScript code, this script relies heavily on the jQuery library.
$ is the alias for jQuery (the jQuery library), and mw is the alias for mediaWiki (the MediaWiki library).
These two aliases are set up like this:
(function(mw,$){}(mediaWiki,jQuery));
That also happens to be a "bodyguard function", which is explained in the section below...
Bodyguard function
The bodyguard function assigns an alias for a name within the function, and reserves that alias for that purpose only. For example, if you want "t" to be interpreted only as "transhumanist".
Since the script uses jQuery, we want to defend jQuery's alias, the "$". The bodyguard function makes it so that "$" means only "jQuery" inside the function, even if it means something else outside the function. That is, it prevents other javascript libraries from overwriting the $() shortcut for jQuery within the function. It does this via scoping.
The bodyguard function is used like a wrapper, with the alias-containing source code inside it, typically, wrapping the whole rest of the script. Here's what a jQuery bodyguard function looks like:
(function ($) {
    // you put the body of the script here
})(jQuery);
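To see the scoping at work, here is a minimal sketch that runs outside a wiki page. The names demonstrateBodyguard and fakeJQuery are hypothetical stand-ins I made up for illustration; on a real page the argument passed in would be the actual jQuery object.

```javascript
// Hypothetical demonstration; fakeJQuery stands in for the real jQuery object.
function demonstrateBodyguard() {
    var $ = 'some other library';               // "$" means something else out here
    var fakeJQuery = { libraryName: 'jQuery' };

    // The bodyguard: inside this wrapper, "$" is the parameter, not the outer variable.
    var insideValue = (function ($) {
        return $.libraryName;
    })(fakeJQuery);

    // After the wrapper has run, the outer "$" is untouched.
    return { inside: insideValue, outside: $ };
}

var bodyguardResult = demonstrateBodyguard();
```

Inside the wrapper "$" refers to whatever was passed in, while the outer "$" keeps its own meaning, which is the protection described above.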
Many of my scripts create menu items using mw.util.addPortletLink, which is provided in a resource module. Therefore, in those scripts it is necessary to make sure the supporting resource module (mediawiki.util) is loaded, otherwise the script could fail (though it could still work if the module happened to already be loaded by some other script). To load the module, use mw.loader, like this:
// For support of mw.util.addPortletLink
mw.loader.using(['mediawiki.util'], function () {
    // Body of script goes here.
});
The ready() event listener/handler makes the rest of the script wait until the page (and its DOM) is loaded and ready to be worked on. If the script tries to do its thing before the page is loaded, there won't be anything there for the script to work on (such as with scripts that will have nowhere to place the menu item mw.util.addPortletLink), and the script will fail.
The part of the script that is being made to wait goes inside the curly brackets. You would generally start that on the next line, and put the ending curly bracket, closing parenthesis, and semicolon on a line of their own, like this:
$(function () {
    // Body of function (or even the rest of the script) goes here, such as a click handler.
});
This is the reserved word var, which is used to declare variables. A variable is a container you can put a value in. To declare the variable portletlink, write this:
var portletlink;
A declared variable has no value (it is undefined) until you assign it one, like this:
portletlink = "yo mama";
You can combine declaration and assignment in the same statement, like this:
var portletlink = mw.util.addPortletLink('p-tb', '#', 'Remove red links');
Caveat: if you assign a value to a variable that does not exist, the variable will be created automatically. If it is created outside of a function, it will have global scope. For user scripts used on Wikipedia, having a variable of global scope means the variable may affect other scripts that are running, as the scripts are technically part of the same program, being called via import from a .js page (.js pages are programs). So, be careful. Here are some scope-related resources:
mw.util.addPortletLink: the ResourceLoader module to add links to the portlets.
portletId: the id of the portlet (that is, menu) where the new menu item is to be placed. The various menus ("portlets") are:
p-navigation: Navigation section in left sidebar
p-interaction: Interaction section in left sidebar
p-tb: Toolbox section in left sidebar
coll-print_export: Print/export section in left sidebar
p-personal: Personal toolbar at the top of the page
p-views: Upper right tabs in Vector only (read, edit, history, watch, etc.)
p-cactions: Drop-down menu containing move, etc. (in Vector); subject/talk links and action links in other skins
href: Link to a Wikipedia or external page (the initial purpose of portletlink was to link somewhere)
text: Text that displays in the menu (the title of the
id: HTML id (optional)
tooltip: Tooltip to display on mouseover (optional)
accesskey: Shortcut key press (optional)
nextnode: id of the existing portlet link to place the new portlet link before (optional) (Don't forget: ids have a leading "#")
The optional fields must be included in the above order. To skip a field, pass the value null for that parameter (or an empty string, that is, a pair of quotes with nothing between them).
To place the menu items in alphabetical order, and so that they don't move around in the menu, for your last menu item specify the id of an existing menu item to anchor it. Then set "next node" for the next to last item as the id for the menu item you just set, and so on.
Important: All we've done so far above is assign mw.util.addPortletLink to a variable. It won't do anything until we bind the variable to a click handler (see below).
click handler
To make a menu item that does something when you click on it, you have to "bind" mw.util.addPortletLink, via its variable, to a handler. Like this:
(The variable used in this example is "portletlink").
$(portletlink).click(function (e) {
    e.preventDefault();
    // do some stuff
});
The "handler" is the part between the curly brackets.
What is the default being prevented? Portletlink's default action is to link somewhere. We don't want it to do that, and so that is what e.preventDefault(); is for.
Calling a function
In JavaScript, a function is a subroutine, essentially, a program within the main program. Functions are usually placed at the end of the program, after its core, but can also be located in a library, like jQuery. You call a function by its name. The function "example" is called like this:
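A minimal sketch of such a call follows. Only the name "example" comes from the text; the function body is a hypothetical placeholder. Note that the call appears before the definition, which works because function declarations are hoisted. That is why functions can be placed at the end of the program.

```javascript
// Calling the function: its name followed by parentheses.
var message = example();

// Hypothetical function body, placed after the call on purpose.
function example() {
    return 'example was called';
}
```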
The location object pertains to the URL of the current document, and href is one of its properties.
window.location.href.indexOf
This applies the indexOf method to the URL, to return the index (starting position) of a given string. This can be used to check if the URL contains a specific string.
if (window.location.href.indexOf('action') >= 0) essentially means "if 'action' is in the URL". That is, its position in the URL is equal to or greater than 0 (0 represents the first spot, 1 is the second spot, etc.), telling us that it is in there. If it is not there, indexOf returns -1.
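Here is a small sketch of that check. It uses a made-up URL string (an assumption for illustration) in place of window.location.href, so that it can be run anywhere.

```javascript
// Hypothetical URL; in the browser this would be window.location.href.
var url = 'https://en.wikipedia.org/w/index.php?title=Outline_of_geology&action=edit';

// indexOf returns the starting position of the string, or -1 if it is absent.
var hasAction = url.indexOf('action') >= 0;    // 'action' is in the URL
var hasPreview = url.indexOf('preview') >= 0;  // 'preview' is not
```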
window.location.href.substr
Gets part of the URL.
The substr method returns a substring of the string the method is applied to, given a start index and a length. If only a start index is provided, the substring will be from that index to the end of the string. In this case, the string is window.location.href (that is, the URL). Note that 0 represents the first character of the string.
So, window.location.href.substr(0,6) would return the first 6 characters of the URL.
That's not particularly useful, as we probably want to manipulate the string based on what is in it. For example...
What that returns is the beginning of the URL through the # character, which we can in turn use in concatenation. The following line of code concatenates (adds) ?action=edit to the substring, and then replaces the URL with it:
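A minimal sketch of these steps, using a made-up URL string in place of window.location.href. It assumes the fragment (the "#" and everything after it) should be dropped before ?action=edit is appended; the variable names are my own. The last step in the browser would be assigning the result to window.location.href, which is left out here so the sketch can run anywhere. (substr is a legacy method; slice and substring are the usual alternatives.)

```javascript
// Hypothetical URL; in the browser this would be window.location.href.
var url = 'https://en.wikipedia.org/wiki/Outline_of_geology#Branches';

// substr(start, length): the first 6 characters.
var firstSix = url.substr(0, 6);

// Cut the URL off at the '#', if there is one (assumption: the '#' itself is dropped too).
var hashIndex = url.indexOf('#');
var base = hashIndex === -1 ? url : url.substr(0, hashIndex);

// Concatenate the edit action onto the trimmed URL.
var editUrl = base + '?action=edit';
```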
This method returns the value of the attribute specified for an element it is attached to (with a dot, for example someElement.getAttribute('attribute')). This allows elements to be processed by a particular attribute, such as their class.
This command makes a message box with a message appear, with an OK button. The script will not continue until the OK button is pushed.
The message is included within the parentheses. It can be a string, a variable, or an object. If it is a variable or an object, its value or contents is displayed in the message.
The script puts the target page into edit mode, but then doesn't edit anything. Fixed
2017-04-06 The script runs the functions at the end of the script, when the "Remove red links" menu has not been clicked, and I don't know why
Desired/completed features
Completed features are marked with Done
Remove redlinked entries in outlines
Remove redlinked bullet entries that both have no annotation and have no children. (If one has an annotation, or a child, don't remove it.) Because this could create new candidates, this function needs to be looped.
Check for annotation
To check for children, see if any bullet entries that follow it have more bullets than it does
If no changes are made during a complete loop, stop. (How do you check for changes?)
To prevent infinite looping, stop after 10 iterations (it can always be run again)
When no more candidates are to be found, remove redcats, and delink the remaining redlinks.
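The looping plan above can be sketched as follows. This is not the script's actual code: it is a simplified, line-by-line illustration under my own assumptions (the function names, the redTitles array of known red link titles, and the rule that an "annotation" is any text after the closing brackets are all hypothetical). It answers "how do you check for changes?" by comparing the number of lines before and after each pass.

```javascript
// Hypothetical helper: number of leading asterisks on a wikitext line.
function bulletDepth(line) {
    var m = /^\*+/.exec(line);
    return m ? m[0].length : 0;
}

// One pass: drop list items that are nothing but a red link and have no child.
function removeRedEndNodesOnce(lines, redTitles) {
    return lines.filter(function (line, i) {
        // Matches only a bare, non-piped link with nothing after it (no annotation).
        var m = /^(\*+)\s*\[\[\s*([^\]|]+?)\s*\]\]\s*$/.exec(line);
        if (!m || redTitles.indexOf(m[2]) === -1) {
            return true;                          // not a bare red link: keep
        }
        var next = lines[i + 1] || '';
        return bulletDepth(next) > m[1].length;   // keep only if the next line is a child
    });
}

// Repeat passes until nothing changes, or until the pass limit is reached.
function removeRedEndNodes(wikitext, redTitles, maxPasses) {
    var lines = wikitext.split('\n');
    var limit = maxPasses || 10;
    for (var pass = 0; pass < limit; pass++) {
        var pruned = removeRedEndNodesOnce(lines, redTitles);
        if (pruned.length === lines.length) {
            break;                                // no changes during a complete pass
        }
        lines = pruned;
    }
    return lines.join('\n');
}
```

Each pass looks only at the line that follows an entry, so a redlinked parent survives until its last redlinked child has been removed in an earlier pass.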
Save title to variable. Done don't have to. Can check title directly.
Some features will work only on outlines, and will check the title variable for "Outline of" first. Done used
(if match "Outline of" in title, then do....) Done used if (document.title.indexOf("Outline ") != -1) {}
Integrate anno.js (the annotation toggler).
get it working right first
For stream editing commands, the script will have an optional interactive mode.
For Macro compatibility, all toggles will have an on-"button" and an off-"button".
Entry linker (checks unlinked entry names for the existence of non-disambiguation page article titles. If one exists, linkify it.)
Entry inserter (checks template for entries missing in the current outline, then checks each title for existence.
If one exists, insert it, but not if it is a disambiguation page.)
Display a random outline, but not if currently in edit mode.
Display next outline in the main list of outlines, but not if currently in edit mode.
Development notes
Trycatch needed, and more
The Transhumanist, where you use localStorage.getItem() or setItem() you should always wrap that in try catch, as it can fail at any moment (even if you checked previously). This can be due to the browser running out of storage space for the domain, or because the browser is running in privacy mode or with an ad blocker extension or something. Also, your new RegExp() calls should be lifted outside of the for loops, so that they aren't continuously recreated. For wpTextbox1.value, realise that sometimes the content might be managed by an editor (the syntax highlighting beta does this, for instance). We use the jquery.textSelection plugin to abstract away from these differences. Don't check document.title, check mw.config.get( 'wgTitle' ) or mw.config.get( 'wgPageName' ). And when you use mw.util.addPortletLink, you have to ensure that the mediawiki.util plugin is loaded already, which you can do by using mw.loader.using. —TheDJ (talk • contribs) 14:47, 27 October 2017 (UTC)
@The Transhumanist: I am guessing that your problems are not caused by a lack of dependencies, but rather by the way you are using the localStorage object. According to the docs, you should be using localStorage.setItem('foo','bar'), not localStorage.foo='bar'. If you use the API in a non-standard way I wouldn't be surprised if there were differences between the way the various browsers handle it. — Mr. Stradivarius ♪ talk ♪ 13:18, 12 February 2017 (UTC)
Actually, after some more reading, it seems that the localStorage.foo='bar' syntax is fine (although the setItem syntax is preferred). That link does give some other suggestions as to things that could be wrong, though: localStorage might not be implemented on old browsers, it might be disabled by users, or it might be full. — Mr. Stradivarius ♪ talk ♪ 15:01, 12 February 2017 (UTC)
Also, I would use a unique prefix for your localStorage keys, maybe olutils_ (so the current key would be olutils_redlinks), to reduce the chance of clashes between your data and other localStorage data saved by MediaWiki or by other gadgets. — Mr. Stradivarius ♪ talk ♪ 13:24, 12 February 2017 (UTC)
@Mr. Stradivarius: It had little to do with memory, but your suggestion provided the essential clue. Since I had 2 versions of the script running simultaneously, the second one worked because of data stored locally by the first one. Without that storage there, the second script failed, which became apparent when I customized the localStorage key per your suggestion. That led me to a bug. I fixed the bug, and now the second script works on its own. There are still some bugs, though (the menu item has to be clicked again after getting a preview, twice, for it to work, but it does work). Thank you! The Transhumanist 00:59, 13 February 2017 (UTC)
@The Transhumanist: Also, all calls to localStorage should always be wrapped in a try catch. localStorage can easily fail due to being full, or due to being in a privacy mode or some other restriction that the browser is placing. —TheDJ (talk • contribs) 07:32, 13 February 2017 (UTC)
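The advice in this thread (try/catch around every storage call, plus a unique key prefix) could be sketched like this. The helper names safeSetItem and safeGetItem are hypothetical, and the storage object is passed in as a parameter (my own design choice) so the helpers can be tried with a stand-in object; in a user script you would pass window.localStorage.

```javascript
// Prefix suggested in the thread, to avoid clashes with other scripts' keys.
var KEY_PREFIX = 'olutils_';

// Hypothetical helper: returns true if the value was stored, false if storage failed.
function safeSetItem(storage, key, value) {
    try {
        storage.setItem(KEY_PREFIX + key, value);
        return true;
    } catch (err) {
        return false;   // storage full, disabled, privacy mode, etc.
    }
}

// Hypothetical helper: returns the stored value, or the fallback if missing or unreadable.
function safeGetItem(storage, key, fallback) {
    try {
        var value = storage.getItem(KEY_PREFIX + key);
        return value === null ? fallback : value;
    } catch (err) {
        return fallback;
    }
}
```

With these, a failed write simply reports false instead of stopping the script, and a failed read falls back to a default.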
This is the way Twinkle specifies variables in a regular expression; to my knowledge it's the only way to do it. The plus signs are acting as string concatenation operators (string + string = concatenation). And you couldn't express this in literal notation, because literal notation can't accept variables (it is literal after all).
As an example of using new RegExp, this regexp in literal notation: /^Hello\s+/gi is entirely equivalent to new RegExp('^Hello\\s+', 'gi'). Note the double escaping! This is because character escapes in regular expression are processed separately from character escapes in strings.
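The equivalence, and the variable splicing that only the constructor form allows, can be checked with a short sketch. The variable names are made up for illustration.

```javascript
// The same pattern, written both ways. Note the doubled backslash in the string form.
var literal = /^Hello\s+/gi;
var constructed = new RegExp('^Hello\\s+', 'gi');

// Only the constructor form can splice in a variable via string concatenation.
var word = 'Hello';
var fromVariable = new RegExp('^' + word + '\\s+', 'gi');
```

All three objects end up with the same source pattern and the same flags.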
(edit conflict)@The Transhumanist: It's difficult to quickly assess exactly what's going on without seeing the data it's being run against and the matches you are seeing. Is it possible that there's actually multiple matches in the input text? E.g. if you look for "apple" in "apple, orange, pineapple", two matches is the expected result. You would need to look for "\bapple\b" to restrict both ends to word boundaries, but that would still give multiple matches against "red apple, green apple, orange". There is nothing about that code snippet which suggests that multiple matches should be unexpected behaviour.
I think your problem here is that you need to deal with the text before and after the thing the regexp is supposed to match. Looking at Alex's original script, I believe you need to use something like his original regular expressions, as it looks like they already deal with the beginning and end of the string. I don't see why you appear to be reinventing the wheel here, as it looks like Alex's script already deals with that issue.
As for "plus signs as used here", do you mean the string concatenation operators? If you don't recognise basic JS operators and string concatenation, I suggest that you may need to learn fundamental JS programming before continuing. Try the tutorials and guides at https://developer.mozilla.org/en-US/docs/Web/JavaScript.
Literal notation? If you feed "apple" into the above snippet, via the "redlinks" array, you'd get the equivalent of /(apple)/i. That's very basic stuff, so you should probably be doing some reading on Mozilla's MDN site (or some other JS learning resource).
Ok, now it's clearer exactly what you are talking about. This is expected behaviour, it's standard regexp group stuff as Syockit explained below. Don't use the term "nested RegExp" like that, as that's not what it is and that term just adds to the confusion here. Murph9000 (talk) 20:50, 5 May 2017 (UTC)
The parentheses create a capturing group. The first match is the whole matched string, while the second one is the captured group. Try with RegExp(RegExp.quote(redlinks[i]),'i') and see if it works. Syockit (talk) 12:57, 5 May 2017 (UTC)
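A small sketch of the difference follows. RegExp.quote is not a built-in method, so this uses a hypothetical escapeRegExp helper in its place; the sample text and title are made up.

```javascript
// Hypothetical helper (stand-in for RegExp.quote): escape regex special characters.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var title = 'apple';
var text = 'Red Apple pie';

// With parentheses: the result holds the whole match AND the captured group.
var withGroup = text.match(new RegExp('(' + escapeRegExp(title) + ')', 'i'));

// Without parentheses: the result holds only the whole match.
var withoutGroup = text.match(new RegExp(escapeRegExp(title), 'i'));
```

So the "two matches" are really one match reported twice: once as the whole match and once as group 1.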
Wow. It's been many moons since anyone has asked me for JS help- I thought I'd become just a mostly-faded memory for a few editors. With that being said, Syockit is right as far as I can tell in that the parentheses create a capturing group. I'm not entirely sure why they're there at all- I'd use the same nodeScoop2 you currently have without the parentheses around the RegExp.quote; i.e. try:
I forgot the quotes. So I put those back, and adjusted the replace strings to account for the removal of the capturing group delimiters, and it worked. Now to try it on the current script... The Transhumanist 02:29, 6 May 2017 (UTC)
Glad I could help. Best, --03:34, 6 May 2017 (UTC)
Perhaps you are looking for String.indexOf(). Oftentimes people discover regular expressions and somehow convince themselves that everything must be expressed in terms of regexes. If regex is not working for you, it is ok not to use it. 91.155.195.247 (talk) 20:07, 5 May 2017 (UTC)
You will have the starting position of the string and its length (= the length of the substring you are looking for). String.substring() will extract the matching string for you, which will be the same as the string you were looking for, except possibly for case. This is how a programmer would do it, not with regexes. 91.155.195.247 (talk) 15:55, 7 May 2017 (UTC)
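That approach might look something like the sketch below. The function name findIgnoringCase is hypothetical, and the lowercase comparison is my own way of handling "except possibly for case"; it assumes lowercasing does not change the length of the text, which holds for ordinary English titles.

```javascript
// Hypothetical helper: find needle in haystack, ignoring case, without a regex.
// Assumes toLowerCase() does not change the string's length.
function findIgnoringCase(haystack, needle) {
    var start = haystack.toLowerCase().indexOf(needle.toLowerCase());
    if (start === -1) {
        return null;   // not found
    }
    // The match has the same length as the needle; extract it as actually written.
    return {
        start: start,
        text: haystack.substring(start, start + needle.length)
    };
}

var found = findIgnoringCase('Red Apple pie', 'apple');
```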
I cannot clearly see what you want to achieve, but I find this code overkill. MediaWiki adds the title of the actual destination as the title attribute of each link, and the class new for red links.
This jQuery one-liner simply unlinks all red links. This snippet actually inserts the linked text before each link and then removes the link.
The function passed to before returns what is to remain after link removal. The this refers to the currently iterated element due to jQuery's design. If we want to completely remove a link, make the function return nothing. The following example completely removes red category links and treats other red links as usual.
Sorry, but I don't understand what you are trying to achieve. If you want to remove red links from the DOM (in the generated code of the view), then you can use JavaScript (faster) or jQuery (slower) to remove or replace all of them at once, or do more things on each of them in a loop. With JavaScript you need to use either "getElementsByClassName" (for example applied to class="new") or "getElementsByTagName" for all <a> elements, and then you can apply styles ('color', 'cursor', …) or replace them with your own content such as their "innerHTML" values. With jQuery >= 1.2 you can use something like $(".new").replaceWith(function() { return $(this).text(); }); or $(".new").replaceWith(function() { return this.innerHTML; });, while with jQuery >= 1.4 you can use the unwrap function like this: $(".new").contents().unwrap();. jQuery seems to be shorter, but this is because you do not see the whole code that is behind the execution of it, and it is much slower than doing it in native JavaScript (when it is well written, of course). All of them, JavaScript and jQuery, should be wrapped in a document ready function (via JavaScript or jQuery), a setTimeout function, or both. If you need to store their values, then you can create a for or a while loop for each of them and then do whatever you want to.
Of course, if you are working on the source code, then the above does not apply at all. About the regex, I need more about the data, plus tests and examples. The reason for its multiple matches has been well explained above. Just a note: if you are sending and parsing a huge quantity of data, for example the whole content of an article, then something like Perl is always the fastest and best solution possible, because it was conceived for reporting on big log files such as those generated by a server. AWK and sed are also good at this. Unfortunately, I do not think that they are available here. –pjoef (talk • contribs) 12:18, 6 May 2017 (UTC)
The script is User:The Transhumanist/OLUtils.js, and the section we are working on here is for processing outlines, and starts with this:
if (document.title.indexOf("Outline ") != -1) {.
For outlines, the script is supposed to remove list item entries (including bullet and carriage return) that are comprised entirely of redlinks, but only if they have no children. Red end nodes. It goes through several iterations, just in case the removal of a red end node renders other red entries into end nodes. After all those have been removed, then the script deletes any red category links, and finally delinks the remaining embedded red links. I've provided a more in-depth explanation below under #What the script is supposed to do. For non-outlines, it just deletes red cats and delinks the rest of the redlinks. The Transhumanist 04:09, 7 May 2017 (UTC)
The whole regex
The sample I posted at the beginning of this thread was simplified to show the problem that it was returning 2 matches instead of the expected 1. So, I thought the script might do unexpected replacements, but that has not happened (yet). But I've run into other problems...
The regex from the script is more involved than the sample, and is for matching the line the key topic (redlinks[i]) is included on plus the whole next line:
The reason the whole next line is included is because I'd like to delete entries based upon the type of line that follows (or more accurately, does not follow). If the entry is not followed by a child, then it gets deleted, but should be kept if it does have a child. The weird thing is, that the part matching the whole next line is in the 4th set of parentheses, so you would expect $4 to back reference that. In practice, it is $3 that accesses that capturing group. And I don't know why. Though the solution (ignoring the parentheses around the embedded RegExp, when counting the capturing groups) seems to be working. But, I've run into a worse problem...
// Here is the regular expression for matching the scoop target (to "scoop up" the redlinked entry with direct (non-piped) link, plus the whole next line)
var nodeScoop2 = new RegExp('\\n((\\*)+)[ ]*?\\[\\[\\s*' + (RegExp.quote(redlinks[i])) + '\\s*\\]\\].*?\\n(.*?\\n)', 'i');

// To actualize the search string above, we create a variable with method:
var matchString2 = wpTextbox1.value.match(nodeScoop2);
alert(matchString2); // for testing

// Declare match patterns
var patt1 = new RegExp(":");
var patt2 = new RegExp(" – ");
var patt3 = /$1\*/;

// Here's the fun part. We use a big set of nested ifs to determine if matchString2 does not match criteria. If it does not match, delete the entry:

// If matchString2 isn't empty
if (matchString2 !== null) {
    // If has no coloned annotation (that is, does not have a ":")
    if (patt1.test(matchString2) === false) {
        // If has no hyphenated annotation (that is, does not have " – ")
        if (patt2.test(matchString2) === false) {
            // ...and if the succeeding line is not a child (that is, does not have more asterisks)
            if (patt3.test(matchString2) === false) {
                // ... then replace nodeScoop2 with the last line in it, thereby removing the end node entry
                wpTextbox1.value = wpTextbox1.value.replace(nodeScoop2, "\n$3");
                incrementer++;
                alert("removed entry");
            }
        }
    }
}
The problem is patt3. I'm trying to check for the asterisks at the beginning of the second line. If there is one more asterisk on that line than in the line before it, it means it is a child. In which case I do not want to delete the parent. But, the above code deletes the parents anyways.
In the example below, $1 should match the asterisk at the beginning of the parent line, and $1\* (patt3) should match the asterisks at the beginning of the child line. But it doesn't seem to be working. And when I add an alert to test for the value of patt3 or $1, the script crashes!
* Parent
** Child
If $1 includes asterisks in it, does it return those asterisks escaped?
I did. See the RegExp below. Notice that the double escaped asterisk is inside a capturing group. When you use $1 to refer to that capturing group, will the asterisks in there still be escaped? When I try to use alert to test for $1, it crashes the script.
"*" is a quantifier (a special character) and, like all other special characters, it needs to be escaped when it is part of the pattern of characters that you want to find or replace. See: w3schools.com/jsref/jsref_obj_regexp.asp. For debugging, I suggest you use the console.log() method instead of alert, to display data directly within the browser's debugger. More at: w3schools.com/js/js_debugging.asp. The debugger itself should also be able to show you what the error is and where it is within your code. About the editing of the article and the DOM manipulation: it doesn't save the changes, but if a user is in the editor window/view and presses the save button, all changes that have been made to the content will be saved. –pjoef (talk • contribs) 09:26, 7 May 2017 (UTC)
P.S.: I haven't tested it out but probably $1 is "undefined". In this case you need to check for this before you use it: if ($1) …. –pjoef (talk • contribs) 09:34, 7 May 2017 (UTC)
Running the code in generated document seems to be easier because we can make use of HTML structure. A leaf link safe to remove is the only child of li.
Hi. Thanks for the suggestions. I have some questions for you: Would the code you provided edit the article, or just affect the view? I'm looking for editing solutions. How could a script remove children list items in the edit window? The Transhumanist 03:57, 7 May 2017 (UTC)
I got your message. It looks like you may have gotten the help you need. When working with RegExp, I like to try them on some sample strings to see what each one is actually matching, and what it's returning. There's a great website for doing that: regex101. Nathanm mn (talk) 16:12, 6 May 2017 (UTC)
We still haven't figured it out. The problem I'm trying to solve is how to identify when a list item has a child. A child list item will have one more asterisk at the beginning than the parent. So, I set up a capturing group for the asterisks at the beginning of the parent (so $1 would be the back reference), and then try to match that number of asterisks plus one more in the child (using $1\*). But it isn't working. I am stuck. There are other criteria which the entries to be removed must fail, otherwise I wish to keep them. So simply getting rid of all children isn't what I'm after. We already know they are red linked entries, because the first half of the program puts all redlinks into an array, which we process in the second half of the program. Then the nested if structure checks first whether the current redlink in the array has a matching entry. If it does, then we check to see if it has no colon annotation. If it doesn't have a colon separator, then we check to see if it doesn't have a hyphenated annotation. If it doesn't have an en dash separator, then we check to see if it has no children. If it doesn't have a child, then we delete it from the wiki source, modifying the actual article itself.
Once all redlinked entries that fail our tests are removed, then the rest of the program mops up, deleting red category links, and delinking all redlinks that still remain after that. We know, due to the extensive filtering we just subjected them to, that they are all embedded redlinks, the content of which we want to keep. I'll make a sample below that presents examples of the data instances to be processed. The Transhumanist 22:12, 6 May 2017 (UTC)
Geology – this text is an annotation. And here is an embedded redlink 1. After all the end node (dead end) redlinked entries are removed, this redlink will be delinked.
Redlink 9 – this annotation will prevent this entry and all its parents from being deleted. They will however be delinked after list item removal and redlinked category removal are completed.
What we want to do is remove the list entries for which the topic is a redlink, but which do not have annotations, and which do not have children. Then we delete redlinked categories, and delink whatever redlinks are leftover — those will be by definition embedded, such as redlink 1 and redlink 3. Redlink 3 is embedded by virtue of having children.
Redlink 2 is a dead end. It is an end node in the tree structure that contains only a redlink. It gets deleted.
The script goes through the list multiple times, until it no longer finds dead end redlinks. This is because when it removes a redlinked end node, that may cause its redlinked parent to become a dead end node (such as when it has no other children). Multiple iterations catch these. So the entire branch starting with Redlink 10 will be deleted.
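The repeated passes described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the names removeDeadEnds, depthOf, bareEntry and the sample outline are all hypothetical, and it assumes the redlinked titles have already been collected into a set. Here a "child" is taken to be any immediately following line with more leading asterisks.

```javascript
// Hypothetical input: titles already known to be redlinks.
const redlinks = new Set([
  "Redlink 2", "Redlink 3", "Redlink 4", "Redlink 10", "Redlink 11",
]);

// A bare entry: asterisks, one unpiped link, and nothing else (no annotation).
const bareEntry = /^(\*+)\s*\[\[([^\]|]+)\]\]\s*$/;

// Number of leading asterisks, i.e. the nesting depth of a list item.
function depthOf(line) {
  const m = line.match(/^\*+/);
  return m ? m[0].length : 0;
}

// Remove bare redlinked items that have no children, repeating until a
// full pass removes nothing (a removal can turn a parent into a dead end).
function removeDeadEnds(wikitext, redlinks) {
  let lines = wikitext.split("\n");
  let changed = true;
  while (changed) {
    changed = false;
    const kept = [];
    for (let i = 0; i < lines.length; i++) {
      const m = lines[i].match(bareEntry);
      const isBareRedlink = m !== null && redlinks.has(m[2].trim());
      const hasChild =
        i + 1 < lines.length && depthOf(lines[i + 1]) > depthOf(lines[i]);
      if (isBareRedlink && !hasChild) {
        changed = true; // drop this line
      } else {
        kept.push(lines[i]);
      }
    }
    lines = kept;
  }
  return lines.join("\n");
}

const outline = [
  "* [[Geology]] – annotated entry",
  "** [[Redlink 2]]",
  "** [[Redlink 3]]",
  "*** [[Redlink 4]]",
  "*** [[Psychology]]",
  "** [[Redlink 10]]",
  "*** [[Redlink 11]]",
].join("\n");

const result = removeDeadEnds(outline, redlinks);
console.log(result);
```

With this sample, Redlink 3 survives because Psychology remains as its child, while the Redlink 10 branch disappears over two passes.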
Here is the problem I've run into: the script currently and erroneously deletes the Redlink 3 list item. Because $1\* or $1\\* do not seem to be identifying the Redlink 4 list item as having more asterisks in the wikisource than the Redlink 3 list item. I do not know why. What should happen is that Redlink 3 would be retained because of Redlink 4, and after Redlink 4 is removed, then Redlink 3 is checked again and is kept by virtue of having Psychology as a child. But, when Redlink 3 is deleted in error, it makes Psychology a child of Geology, thus ruining the tree structure.
All this processing is to be done in the editor, so that the redlinked entries are actually removed from the article.
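For the mop-up step mentioned earlier (deleting red category links, then delinking the redlinks that remain), a rough sketch might look like this. The function name mopUp, the set redTitles and the sample text are assumptions made for illustration; the real script gathers its redlinks in its own way.

```javascript
// Hypothetical set of titles known to be redlinks.
const redTitles = new Set(["Redlink 1", "Category:Missing category"]);

function mopUp(wikitext, redTitles) {
  return wikitext
    .split("\n")
    // Drop lines that consist only of a red category link.
    .filter((line) => {
      const m = line.match(/^\s*\[\[(Category:[^\]|]+)(?:\|[^\]]*)?\]\]\s*$/);
      return !(m && redTitles.has(m[1].trim()));
    })
    .join("\n")
    // Replace each remaining red link with its visible text.
    .replace(/\[\[([^\]|]+)(?:\|([^\]]*))?\]\]/g, (whole, target, label) =>
      redTitles.has(target.trim()) ? (label || target) : whole
    );
}

const sample = [
  "* [[Geology]] – see also [[Redlink 1]] and [[Redlink 1|the first redlink]]",
  "[[Category:Missing category]]",
  "[[Category:Outlines]]",
].join("\n");

const cleaned = mopUp(sample, redTitles);
console.log(cleaned);
```

Blue links and existing categories pass through unchanged; only titles in the redlink set are touched.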
Your patt3 is off for a couple of reasons. First, with the $n regex matches, in general you access them using RegExp.$1 (which will be a string containing the match), not just $1 – except for within String.replace function, when just $1 is used in the replacement string [1]. Secondly, with regex literals, what you type is literally what you get as the regex string. So var patt3 = /$1\*/; will literally be interpreted as /$1\*/ (where $ asserts position at the end of the string; 1 matches the character 1; \* matches the character *).
What you could use instead is var patt3 = new RegExp("\\*{"+(RegExp.$1.length+1)+"}"); which, for example, will give you the regex /\*{3}/ when the RegExp.$1 match is "**" - Evad37 [talk] 04:59, 7 May 2017 (UTC)
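A small standalone illustration of that suggestion, for anyone following along. It reads the captured asterisks from the match result rather than from RegExp.$1, and it anchors the pattern to the start of the line; both choices, along with the sample lines and variable names, are assumptions of this sketch rather than part of the suggestion above.

```javascript
// Hypothetical sample lines: a parent item and a child one level deeper.
const parentLine = "** [[Redlink 3]]";
const childLine = "*** [[Redlink 4]]";

// Capture the parent's leading asterisks; m[1] is the captured string.
const m = parentLine.match(/^(\*+)/);
const depth = m[1].length;

// Build the pattern from a string: "\\*" in the string becomes \* in the regex.
// Matches exactly depth+1 asterisks at line start, not followed by another one.
const childPattern = new RegExp("^\\*{" + (depth + 1) + "}(?!\\*)");

console.log(childPattern.source);           // ^\*{3}(?!\*)
console.log(childPattern.test(childLine));  // true
console.log(childPattern.test(parentLine)); // false
```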
I'll try it and will let you know how it works. By the way, what about var patt3 = new RegExp("$1\*");? Why won't that work? (That was the first thing I tried, before going literal). The Transhumanist 23:14, 7 May 2017 (UTC)
$1 as part of a string doesn't have any special meaning, except within the string .replace function. So var patt3 = new RegExp("$1\*"); would give you the regex /$1*/. To use the actual match instead of $1, you would use var patt3 = new RegExp(RegExp.$1 + "\*"); which would e.g. give you the regex /***/ for a match "**". To actually get valid regex, the match would have to be escaped (note also that the single slash in "\*" doesn't get preserved unless it is double-escaped as "\\*"). - Evad37 [talk] 23:55, 7 May 2017 (UTC)
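The points above can be checked in a console. The sketch below is only an illustration: escapeRegExp is a commonly used escaping helper, not something from the script, and the variable names are made up for the example.

```javascript
// In a string literal the single backslash is dropped: "\*" is just "*".
console.log("\*" === "*"); // true

// "$1" has no special meaning in a pattern string.
console.log(new RegExp("$1\*").source); // $1*

// Concatenating an unescaped match of "**" yields /***/, which is invalid.
const captured = "**";
let threw = false;
try {
  new RegExp(captured + "\*");
} catch (e) {
  threw = e instanceof SyntaxError;
}
console.log(threw); // true

// Escape regex metacharacters so the captured text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Escaped match plus one more literal asterisk (note the doubled backslash).
const patt3 = new RegExp("^" + escapeRegExp(captured) + "\\*");
console.log(patt3.source); // ^\*\*\*
```

Note that this pattern also matches lines with four or more asterisks, so it detects "deeper than the parent" rather than "exactly one level deeper".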
Thank you Evad. Using your code, the script now works, matching about 90% of what it is supposed to. So far, I've cleaned up all the country outlines for Africa. Now working on Asia. I'm not sure why it is skipping some entries that it shouldn't, but I'm sure I'll figure it out by observing as I use it. The Transhumanist 22:39, 11 May 2017 (UTC)
RedlinksRemover.js – remove red linked list items that are end nodes (last item of a branch), reiteratively, and delink the rest, from outlines and lists.
Search results page enhancements
SearchSuite.js – suite of search features, each on a switch, so you can turn them on and off as desired (and it remembers the switch positions). SearchSuite replaces all the search scripts below.
StripSearchSorted.js – provides menu item that turns sorted strip search on/off; includes the features of all the other search scripts below, and sorts the results alphabetically. Co-written by User:Evad37; he did the heavy-duty programming.
StripSearch.js – provides menu item that turns strip search on/off; strips search results down to bare pagenames.
StripSearchSansRedirecteds.js – strips search results down to bare pagenames, with redirected entries removed. No off switch.
StripSearchInWikicode.js – strips search results down to bare pagenames, and presents as bullet list with pagenames enclosed in double square bracket link delimiters (just like in wikicode). Redirected entries are also removed. No off switch.
StripSearchSimple.js – strips all search results down to bare pagenames. No off switch.
Viewing enhancements for list and outline editors hunting for list items
ViewAsOutline-AllPagesWithPrefix.js – adds list item wikicodes to the on screen results of All pages with prefix, for easy copying/pasting into outlines and lists. No off switch.
ViewAsOutline-Category.js – removes the alphabetical headings, and adds list item wikicodes, for easy copying/pasting into outlines and lists. No off switch.
ViewAsOutline-Book.js – converts page to outline format on screen, with wikicodes for easy copy and paste into outlines and lists. No off switch.
ViewAsOutline-CategoryTree.js – on menu item click, converts tree to outline format on screen, with wikicodes for easy copying/pasting into outlines and lists. Refresh page to undo.
ViewAsOutline-Templates.js – converts navboxes and sidebar boxes on current page into outline format, with wikicodes displayed, for easy copying/pasting into outlines. No off switch. (Needs to be enhanced to show base links rather than pipes.)
ViewAsOutline-Glossary.js – converts glossaries to list format on screen, with wikicodes for easy copy-and-paste into outlines and lists. No off switch. (In early development, very rough. Currently, only converts bullet list glossary format.)
Coming eventually
QuickPortal.js – portal tool, for creating and restarting portals. Will be expanded for modification and maintenance as well.
OutlineDedupeHolding.js – remove duplicate list items from the outline's holding sections (See also, holding bin, place these, general concepts, and list section). That is, it will remove from each of these sections, in turn, all topics that exist anywhere else in the body of the page (not in templates).
StripSearchFilter.js – narrow down search results.
OutlineMain2LI.js – convert main links in outline to list items.
TopicPlacerFromBin.js – Topic placer (loads topics from holding bin into an array, then assists in placing each one). For use on lists and outlines.
FetchCategory.js – import category to present location.
FetchCategories.js – import categories to tagged locations.
FetchSection.js – import section to present location.
TopicSender.js – prompts for parent topic (outline), then checks whether the topic is already there; if it isn't, it sends the topic to the receiving section of that outline.
PlaceCategory.js –
OutlineViewConventional.js – change the viewed formatting of the current outline into that of a conventionally indented outline, without headings.