Sections immediately after the POLL (12 Apr 2005 - 10 August 2005) last posting 8 October 2005
Suggestion for increasing granularity
Wiggle room needed
With/without diacritics: how about "anything goes if you can prove you can clean up your own mess?"
Wrongtitle excess
Existence versus common.
Even foreign words used as foreign words need to be fully Anglicized - Jimbo Wales
Native spelling
Write for the reader
Proposal and straw poll regarding place names with diacritical marks
Following a comment by Philip Baird Shearer on my Talk page, I have moved this discussion and straw poll here from Wikipedia_talk:Naming_conventions. Dpbsmith(talk) 17:51, 9 Apr 2005 (UTC)
This PROPOSAL is to add language to the policy. This might be a clarification of existing policy, it might be a statement of de facto policy, it might be a change in policy, I can't really tell.
The proposal is prompted by a current debate at Talk:Zürich on Zürich vs. Zurich. This particular debate happens to be complicated by uncertainty as to which is the official spelling, and also by uncertainty as to whether either spelling is really overwhelmingly predominant in English. I don't want to address these issues of fact here, I just want to clarify policy for cases where competing names differ only by diacritical marks. Dpbsmith(talk) 15:08, 9 Apr 2005 (UTC)
Proposal
Whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used.
up◦land 18:00, 9 Apr 2005 (UTC). I support the proposition in general, but question whether it is applicable in the particular case of Zurich/Zürich.
Urhixidur Moderately support; my beef would be a) use of "most common" over "more correct", and b) what is the definition of "most common" (Google hit counts are a very poor measure)? | Urhixidur 18:49, 2005 Apr 9 (UTC)
DopefishJustin(・∀・) - This is what I was thinking would be the best policy, and if there are any cases that need to be decided individually, I don't see how that would be a problem. Wikipedia policies are not inviolable law. DopefishJustin(・∀・) 23:30, Apr 9, 2005 (UTC)
Andrew pmk 16:29, 10 Apr 2005 (UTC) See explanation on Talk:Zürich. However, there should always be a redirect from the version without the diacritical mark.
Curps 17:22, 10 Apr 2005 (UTC) - Both versions need to exist, and both will get used in appropriate contexts (in a list of world cities, you'd probably use "Zurich"; in an article about elections in Switzerland, you'd probably use "Zürich"). So, it's not such a big deal after all, merely a question of which one is the redirect and which one is the article title. It's probably better to use the version with diacritics as the page title, because the absence of diacritics bothers the people who care about them a lot more than their presence bothers the people who don't care about them. -- Curps 17:22, 10 Apr 2005 (UTC)
affects other cases, such as El Nino. dab(ᛏ) 18:51, 10 Apr 2005 (UTC)
DmitryKo This is a step in the right direction. The convention is already being followed on a case-by-case basis - see Zürich, Göring, Führer, El Niño/La Niña etc. DmitryKo 23:01, 11 Apr 2005 (UTC)
Haukurth 02:06, 23 May 2005 (UTC) This would be a useful guideline (even if it may not apply in every single case). We currently use Reykjavík rather than Reykjavik and I think that's a good example of this policy being followed in practice.
—Pt(T) 20:28, 12 Jun 2005 (UTC) For example, in Estonian the diacritics change the pronunciation and meaning notably. Käpp is pronounced as cap and means a paw, while kapp is pronounced as cup and means a cupboard. I think there are even examples where omitting the diacritics from one's name makes it actually insulting and offensive!—Pt(T) 20:28, 12 Jun 2005 (UTC)
Instantnood — 21:02, Jun 26, 2005 (UTC) - except for loan words that have existed in English, incorporated into the English vocabulary without the diacritic marks, for a very long time.
Sounds about right. — Chameleon 21:10, 26 Jun 2005 (UTC)
There are pros and cons, but in the end, as long as we use Latin alphabets and there is no automatic pronunciation help, using national letters will do more good than harm. Of course, redirects from the plain English version are a must. --Piotr Konieczny aka Prokonsul PiotrusTalk 30 June 2005 12:36 (UTC)
Strong support. I don't like that laziness and ignorance are about to take over Wikipedia. Redirects from spellings without diacritics easily solve the problems for the lazy and ignorant. -- Elisson | Talk 30 June 2005 22:11 (UTC)
Absolutely. — Trilobite (Talk) 1 July 2005 09:08 (UTC)
Now that the wiki handles it, sure. But make sure the redirects are there! Alphaxτεχ 3 July 2005 07:30 (UTC)
I agree with what Curps said above. dbenbenn | talk 7 July 2005 17:12 (UTC)
Support. It's the system most print encyclopedias seem to use, and it's the one I've been following in my articles on Cameroon (e.g., Yaounde vs. Yaoundé). Writing the title without the mark looks uneducated or lazy. Amcaja 12:43, 11 July 2005 (UTC)
Support. Native names with diacritics are far more accurate. Mandel 13:48, July 20, 2005 (UTC)
Support. Before I read the text of the proposal, I was going to say "Iff the English is equivalent to the native sans diacritics" and since the proposal does say that, then unconditional support. These words aren't English; if we have English versions of them (i.e. Moscow, Cologne) then this would not be an issue, because those are English words. But they are foreign words, and professionalism demands we keep the original spelling as long as we're using the original word. --Golbez 23:21, August 18, 2005 (UTC)
Support (provided that no one objects to a redirect page from the illiterate version (e.g. Zurich) for those of us with restricted keyboard mappings.) Red King 23:08, 7 October 2005 (UTC)
Support. Why are users getting so worked up about using Zürich instead of Zurich, what's the big problem? I see also the naming wars about the "ß". Unless the Kulturminister come up with a new law to completely abolish the "ß" in favour of the "ss", it remains, simple as that. Gryffindor 13:48, 22 October 2005 (UTC)
The question this policy deals with will come up rarely, and in every case there will be specific issues affecting the resolution. The proposed policy is a poor substitute for Wikipedians' judgement. —MichaelZ. 2005-04-9 15:40 Z
Thryduulf 15:56, 9 Apr 2005 (UTC) I think that these cases should be decided on an individual basis.
The proposal says: "Whenever the most common English spelling is... " What that basically means is that "we shouldn't use the most common English spelling if it happens to lack the diacritical marks of the native spelling." But why should we create policy in the English Wikipedia that prevents us from using the most common English spelling? Nobbie 03:42, 10 Apr 2005 (UTC)
Suspicious! For now I hold the "most common name"-principle to be superior — given that the person/whatever is sufficiently known to start with. Ruhrjung 13:13, Apr 10, 2005 (UTC)
Oppose. The policy is to use English. That implies that if most English speakers would write the name without the umlaut, the spelling to use on this Wikipedia is without the umlaut. It is also the version that most English-speakers will tend to search on, and the version that is easiest to type on a keyboard set up for the English language. --Tony Sidaway|Talk 13:43, 10 Apr 2005 (UTC)
Oppose. The proposal actually says that the général rule "common English spelling" should be replaced with the "native spelling". Is this an English Wikipedia or an international one? I vote use the English language. Philip Baird Shearer 10:30, 11 Apr 2005 (UTC)
Oppose. Trying to apply specific general rules like this to the idiosyncrasies of spelling conventions is like trying to mend a watch with a sledgehammer. Let's have general guidelines, but keep them simple and vague because the right thing to do differs from language to language and case to case. Gdr 11:41, 2005 Apr 11 (UTC)
Oppose. The rôle of díacritics in English is to show the user as an overeducated snob. Vietnamese and Turkish names would become impossible as page titles. --Audiovideo 00:03, 12 Apr 2005 (UTC)
Oppose. No one ever accused the English language of being consistent or logical. We spell the way we spell and change our spellings with impunity. Articles should be placed at the common modern English spellings and moved when those change. Rmhermen 13:20, Apr 12, 2005 (UTC)
Oppose. Letters with diacritics are different from letters without diacritics. If we're going to use the English names, we should always use the English names, and that means without the diacritics. (This follows Chicago style). Remes 19:24, 13 Apr 2005 (UTC)
Oppose. Certain diacritics should be allowed (but by no means should they be considered obligatory) where there is a demonstrable tradition of using those diacritics in contemporary English prose, like façade, El Niño, résumé, führer (but not for example rôle), but diacritics from languages other than German, French, Spanish and possibly a few other closely related languages, should be avoided except when being presented as examples of that language. As Audiovideo mentions above, names from languages that use many diacritics whose significance is probably unfamiliar to most English readers, like Turkish, Vietnamese, and Polish, should be avoided in normal English-language text. In particular, diacritics that represent sound distinctions that aren't realized in the English pronunciation of words are not necessary, such as tone marks in Vietnamese or kreska n in Polish. The diacritics in the examples I give above are acceptable because they call attention to the fact that the letters they modify are pronounced differently than the ordinary English pronunciation of those letters. Nohat 03:59, 14 Apr 2005 (UTC)
Oppose. There are a few exceptions, such as those suggested by Nohat (altho' I disagree with the stated rationale), but, for the most part diacritics are not included in common English spellings. Niteowlneils 04:47, 18 Apr 2005 (UTC)
Oppose. This is English language and there is long-established practice not to use diacritics, with some exceptions, e.g., in loan phrases. Mikkalai 03:55, 24 Apr 2005 (UTC)
Oppose. We should present information in the way most accessible to English-speaking readers. Even more important is encouraging people to be bold in editing. Someone who happens to know something about Zürich should be able to write it up as Zurich, without having to use anything beyond the characters on a common English-language keyboard. I know it's not all that hard, but it would still put some people off. JamesMLane 17:58, 15 Jun 2005 (UTC)
Oppose. I hate seeming like I oppose other languages or other cultures or whatever, but this is the English Wikipedia, and non-English information needs to be presented from inside an English "wrapper". kmccoy(talk) 19:57, 15 Jun 2005 (UTC)
Oppose, especially with all the new letters possible in the new MW version that are absolutely never used in English. Proteus(Talk) 28 June 2005 22:48 (UTC)
Oppose. This would only complicate the wording and wouldn't achieve anything really useful. --Joy [shallot] 29 June 2005 17:49 (UTC)
Strongly oppose. 'Common names' means common names. I don't care if people think that means English speakers are lazy, obstinate, callous, insensative (sp?), or whatever. The fact is that most native English speakers NEVER use diacritics, so the 'common names' policy in most cases will favor the plain-text version of the name. There ABSOLUTELY should not be a blanket policy reversing a 5-year old Wikipedia standard--some names may actually have enuf currency to include the diacritics--should be decided case by case. And as far as any current exceptions to the rule, A) two wrongs don't make a right, and B) My WP time has been reduced lately because of personal matters, so I haven't had time to oppose every existing exception--just the proposed changes. I wish people would stop investing so much energy in things that were decided 5 years ago, and just move on, building the encyclopedia as then defined. 4-5 years and 500,000 articles later is far too late to redefine the rules, unless there is an extremely compelling reason, and I don't think that requirement is met here. Niteowlneils 30 June 2005 08:16 (UTC)
Also, many of the diacritics are not visible on many browsers--the focus should be on making information as accessible as possible. Niteowlneils 05:35, 3 November 2005 (UTC)
Oppose (if this vote is still going). The abandonment of real English names is sad: Lyons, Brunswick, Leghorn, Cracow, Bombay etc. The diacritics issue is yet another concession to political correctness and a reduction in knowledge: English does not use diacritics, and the few words that have them are not yet English. There is no need to use Québec or Zürich. --Henrygb 22:11, 25 July 2005 (UTC)
Oppose. This is an English encyclopedia, and we should use the most common English spelling. Just as we do not pronounce Paris as pair-EE in English, we shouldn't adhere to foreign spellings either. --131.111.193.120 21:21, 7 August 2005 (UTC)
Oppose as policy. This work-around would commit us to using Zürich and all similar cases, but using Nuremberg and so forth, because Nuremberg differs from Nürnberg by more than a diacritic. That is just too silly. Compare Fowler's Modern English Usage. Septentrionalis 22:17, 9 August 2005 (UTC)
Oppose. Why would we use native names? This is in English. Common usage should prevail. Christopher Parham(talk) 20:33, 2005 August 10 (UTC)
Oppose. WP:UE exists for exactly the reason of preventing such discussions. The most commonly used name by English speakers should prevail. Unless most people have started typing the umlaut, it should be "Zurich".—Larineso 22:48, 11 August 2005 (UTC)
Oppose. Use the most common name in English. Use diacritics only when these are commonly used in English, and when the matter is dubious (usage is about equal), do NOT use the diacritics, for easier entry and display. Redirs from diacritic forms are of course fine. DES(talk) 19:36, 15 August 2005 (UTC)
Oppose. Use English. Even as policy (Zurich -> Zürich vs. Nürnberg -> Nuremberg) it is doubtful. Problems with Vietnamese will maybe convince the international spelling brigades one day. Tobias Conradi(Talk) 20:07, 18 August 2005 (UTC)
Oppose. It would be pretty silly for an encyclopedia to call Germany's third-largest city "Munich" but Canada's second-largest city "Montréal". Of course we should use accents when they are commonly used in English (eg "Île de la Cité"). - Farquard 19:00, 23 August 2005 (UTC)
Oppose. This is the English-language Wikipedia. It's not hard to figure out what the common English spelling is. Chuck 13:21, 29 August 2005 (UTC)
Strongly Oppose. Most native English speakers don't use them, and for non-native English speakers, they probably don't know the ones that aren't used in their native language - e.g. does the average native Spanish speaker know Croat diacritics? Plus to which, using them is blatant discrimination against some languages (Greek, Russian, Arabic, etc) and in favour of others (German, French, etc). Why give some languages favoured treatment? Noel(talk) 23:40, 29 August 2005 (UTC)
Oppose Unless the diacritics are commonly used in English we should use English orthography. older≠wiser 19:55, September 10, 2005 (UTC)
Oppose. With exceptions as suggested by Nohat and agree with Henrygb regarding English names of places/historical people. – AxSkov (☏) 01:24, 18 September 2005 (UTC)
Oppose. Article titles and in-text references should be in English; to do otherwise might be perceived as being pretentious. However, non-English spellings should also appear upfront with English spellings and only when they are germane to the discussion (e.g., found in a non-English title, etc.) -- with redirects to English titles/articles -- or if a translation is unavailable or impractical. Orwell advocated simplicity in writing and discouraged use of foreign phrases when possible; shouldn't we do the same with characters? E Pluribus Anthony 08:07, 20 September 2005 (UTC)
Strongly oppose. The vast majority of English speakers do not even use diacritics in any way, shape or form. Relying on redirects, e.g. of état to etat, would result in huge numbers of redirect pages and still there would be gaps. The only possible way to permit diacritics in English WP titles would be to set the search function so that e-with-any-diacritic equals e-with-no-diacritic in both directions, that is both in search term and in title. Non-Latin characters have no such equivalents, so they can't work at all in searches. JohnSankey 23:28, 4 October 2005 (UTC)
Strongly oppose. This is the English Wikipedia, and we should use the English alphabet. The proposal would also hide information not only from some web searches but also from many page searches in browsers. The foreign spellings certainly ought to be mentioned; the English ones should be used. Gene Nygaard 21:07, 7 October 2005 (UTC)
Oppose: spellings in the modern English alphabet should generally be preferred in titles for the reasons Gene Nygaard gave. CDThieme 22:41, 20 October 2005 (UTC)
Oppose: the present wording should be interpreted as covering diacritics: If a native spelling uses different letters than the most common English spelling (eg, Wien vs. Vienna), only use the native spelling as an article title if it is more commonly used in English than the anglicized form. The article title names the English article about the object, not the object. Distinctions between local/English spelling/naming should be in the intro. Rd232 talk 13:22, 25 October 2005 (UTC)
Oppose. This is the English lang WP. Curps's argument about who's bothered by the issue is nicely pragmatic, but the ultimate extension of it would lead to abandoning the whole NPOV policy. Nurg 22:36, 29 October 2005 (UTC)
johnk 29 June 2005 15:29 (UTC) I am ambivalent about this. It seems to me that diacritics/accents deriving from French, Spanish, Portuguese, German, and perhaps the Scandinavian languages are frequently used in English, and, as Nohat notes, indicate to English speakers differences in pronunciation. Diacritics from other languages are much less commonly seen, and their significance is much less widely understood, so the primary effect is to make words look odd. What does "Ċ" indicate, for instance? Or "Ő"? On the other hand, it doesn't seem terribly important, and I don't think it's exactly a matter of English vs. non-English names. So I'm going to abstain, especially since I think that diacritics in names from French, Spanish, Portuguese, German, and probably the Scandinavian languages ought to be used regardless. johnk 29 June 2005 15:29 (UTC)
"This particular debate happens to be complicated by uncertainty as to which is the official spelling, and also by uncertainty as to whether either spelling is really overwhelmingly predominant in English. I don't want to address these issues of fact here..."
So this proposal is to add a "convention" which may not even apply to this particular debate. Does it apply anywhere? Why make up rules that don't apply to any debated issue? Each case is different, and when there is a dispute we can resolve it with discussion and consensus. If this happens to come up, how do we know in advance that this will be the best solution?
If this were stated as policy, I believe it would resolve the issue with Zurich/Zürich. I just wanted to make it clear that Zürich/Zurich is not a pure case since both spellings seem to be used both by natives and by English speakers.
The general policy is "give priority to what the majority of English speakers would most easily recognize." Note that the word is "recognize," not "write." It can be argued that diacritical marks do not have any effect on "recognition," i.e. Zürich and Zurich are precisely equally recognizable.
I just reviewed Wikipedia:Naming conventions (use English) and could not determine an answer from that page. What do you get when you apply that page to the Zürich/Zurich question? It says "use the most commonly used English version of the name for the article (as you would find it in other encyclopedias)," but Britannica and Encarta both use Zürich even though the American Heritage Dictionary and Merriam-Webster both use Zurich. Do you think that page is saying that encyclopedias are more authoritative than dictionaries? Moreover, the page you cite notes "a trend in part of the modern news media and maps to use native names of places and people, even if there is a long-accepted English name." How does that apply in deciding Zurich/Zürich?
Based on observation of some existing articles, Mel Etitis asserts that my proposal is existing practice. If he/she is right and we're doing it already, why not codify existing practice? Dpbsmith(talk) 16:10, 9 Apr 2005 (UTC)
Your proposal implies that it may not affect Zürich/Zurich, but now you're saying it would. You believe.
Respected authorities use different conventions. No matter how you cut it, there would still be lots of debate on Zürich/Zurich. There would be even more about whether your proposal applies to it.
Also, it's inappropriate to put forth a proposal which would render moot a vote in progress (or would it? Let's start some more discussions). This is a needless complication. —MichaelZ. 2005-04-9 16:34 Z
I'd still appreciate an answer to my question. In your opinion, a) does the policy articulated on Wikipedia:Naming conventions (use English), answer the question of what the article on the biggest city in Switzerland should be titled, and b) if so, is the answer Zürich or Zurich? Dpbsmith(talk) 02:22, 10 Apr 2005 (UTC)
Are you asserting that Zurich is the most common English spelling, and that it is simply the native spelling with diacritical marks omitted? —MichaelZ. 2005-04-10 06:36 Z
Well, how about this. Remember, you say the situation is already covered by Wikipedia:Naming conventions (use English). So, hypothetically, let's posit that: in English "Zurich" and "Zürich" are, together, the most common spelling; "Zürich" is far from rare, but "Zurich" is more common; and that in Switzerland the two forms are, together, the most common spelling; "Zurich" is far from rare, but "Zürich" is more common. If that were the case then, under currently stated policy, what should the article be titled? Dpbsmith(talk) 11:53, 10 Apr 2005 (UTC)
If we can't even answer a simple question about the use case that the proposed policy is designed to fix, then how can this policy be helpful in any other case? —MichaelZ. 2005-04-10 15:30 Z
by Philip Baird Shearer
I think that this poll is a very bad idea. The wording is such that it is forcing a confrontation Eg:
Support: #Urhixidur Moderately support; my beef would be a) use of "most common" over "more correct", and b) what is the definition of "most common" (Google hit counts are a very poor measure)? | Urhixidur 18:49, 2005 Apr 9 (UTC)
Oppose: Thryduulf 15:56, 9 Apr 2005 (UTC) I think that these cases should be decided on an individual basis.
I brooded quite a bit about the wording. I was trying to keep it very short and simple. First, I was trying to articulate what some had said is the de facto policy. I may not have done a good job of articulating it. Second, I brooded about "most common," but that's exactly the language used in Wikipedia:Naming convention—"Use the most common name of a person or thing that does not conflict with the names of other people or things," so the fact that the "most common" name will often be disputed can't be helped unless some better criterion can be enunciated.
I thought that it was constructive to say that in the special case of names with and without diacritics we should have a rule that avoids the need for trying to determine which of the two forms is more common in English.
I thought very seriously about adding something to suggest that this should be a guideline for new articles but should never be used as a reason to move an existing article. That is, in general, articles should not be moved from a no-diacritic version to a with-diacritics version or vice versa. I realize that enshrining inconsistency bothers some people, but this is very much like our policy on U. S. versus British spelling, and I think it would be a good idea. Dpbsmith(talk) 11:53, 10 Apr 2005 (UTC)
With an argument like the one over the name Zurich, I can see the value of the initial poll. But not the second one, because all it has shown is that opinions are still split close to 50/50. In this case, because the topic is much more complicated, it is better to use the fuzzy logic that exists at the moment than to force one or the other style, because neither side has made a strong enough case to convince the other and it has an English Wikipedia-wide impact affecting many, many articles. So in this case I think Polls are evil -- Philip Baird Shearer 10:30, 11 Apr 2005 (UTC)
Well, apologies for evilness. But I couldn't and still can't tell what existing policy is on this matter. Perhaps because there isn't any. Perhaps there shouldn't be any. I think the discussion has been interesting and not terribly heated. I particularly like Curps' remark: "It's probably better to use the version with diacritics as the page title, because the absence of diacritics bothers the people who care about them a lot more than their presence bothers the people who don't care about them." At least I now know what my own (certainly humble, possibly evil) opinion is, which is that:
Existing pages should not be moved simply to add or remove diacritical marks. I'd like to see a formal adoption of a policy of intentional inconsistency, as is the case with U. S./British spelling and usage. No argument that's been made convinces me that any such move either way can possibly be worth the bother. Search and lookup issues can be handled by redirects and by making sure that both spellings appear on the page somewhere, anywhere. Dpbsmith(talk) 13:10, 11 Apr 2005 (UTC)
There is a policy but people choose to ignore it:
primary author does not seem to work. There are dozens of articles which have been changed from primary author eg El Nino and Hermann Goering. The Second Battle of Zurich was changed because it was argued that as Zurich was at Zürich so should the battle name for consistency.
In many cases, once the change has started some people insist on changing every instance from diacritic-free to including only a version with diacritics (apart from references and external links). E.g. Ubeda and, until recently "fixed" with a first-line bodge, El Nino and Hermann Goering. Is this surprising when Jerzy can write in the Talk:Zürich history "Oppose change: [to Zurich. The] ugly and jarring absence of umlaut looks *ignorant*"? From that statement I suspect it is unlikely that Jerzy will respect primary author if he comes across a "Zurich" link on a secondary page. Philip Baird Shearer 17:06, 11 Apr 2005 (UTC)
comment cesarb 17:09, 9 Apr 2005 (UTC) (But remember to always create a redirect from the english spelling in these cases)
Even if there is a redirect from "the english spelling", if the word is not included in the text without diacritics, then it will not be found by many external search engines with default settings. E.g. http://www.google.com.au, http://www.google.co.nz, http://www.google.co.uk and http://www.google.co.za all work the same way: they differentiate on diacritics, so searches on "Zaire" and "Zaïre" return different pages. http://www.google.ie and http://www.google.ca seem to be set up as bilingual (one of the languages uses diacritics). Germany's http://www.google.de returns similar results to google.ca and google.ie, so it is a perceived cultural difference by Google, not a technical one. Other popular external search engines like Ask Jeeves will fail to find Zurich if all the spellings of the word on the Zurich page are with a Ü. --Philip Baird Shearer 09:47, 10 Apr 2005 (UTC)
I'm surprised that it sometimes doesn't ignore diacritics. I recall it didn't ignore in the past, and then some time ago started ignoring (which is way more useful, considering the amount of people who drop the diacritics either when searching or when writing a page). Probably what happened is that it started redirecting Brazilian users to http://www.google.com.br around that time. Even then, I still think keeping the diacritics is the best option (we're not doing SEO, are we? With the redirects, typing the name in the internal search box will always return the right page). --cesarb 13:05, 10 Apr 2005 (UTC)
If the text is not embedded in the page without diacritics, many native English speakers will not find the article with external search engines. Why restrict the audience for an article by adopting a user-unfriendly approach? Wikipedia cannot explain to a person that a word can be spelt with diacritics if they cannot find the page in the first place. Why restrict the audience who can read the article by not allowing the word to appear in the article in a diacritics-free version? This is not a symmetrical problem. Any native English speaker who uses diacritics in an external search engine is either going to be using one which also matches the forms without diacritics or will have the nous to try both. --Philip Baird Shearer 10:30, 11 Apr 2005 (UTC)
by SebastianHelm
Nobbie made a good point above, which almost swayed me. I then thought about proposing to qualify the wording with something like "if two spellings are at least not uncommon, and if the modification isn't too distracting". However, I still decided to support the initiative because it offers a clear rule. If we add fuzzy criteria then we'll end up spending more time discussing than contributing (as can be seen in the Zürich case). I think there is no harm in adding diacritics. To be exact, the rule should probably say: "Consider all spellings stripped of diacritics. If the most common English one equals the native one (stripped of diacritics as well), we will use this spelling including diacritics." But the proposal comes close enough. — Sebastian(T) 10:50, 2005 Apr 10 (UTC)
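The restated rule in the comment above can be read as a mechanical comparison. Here is a minimal sketch in Python, offered only as an illustration. It assumes that "stripped of diacritics" means removing Unicode combining marks after NFD decomposition, and it takes the "most common English spelling" as a given input rather than trying to determine it. The function names `strip_diacritics` and `choose_title` are hypothetical and do not come from the discussion.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks (assumed meaning of 'stripped of diacritics')."""
    # NFD splits a letter such as "ü" into "u" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def choose_title(most_common_english: str, native: str) -> str:
    """Hypothetical helper applying the restated rule to one pair of spellings."""
    # If the two spellings differ only by diacritics, prefer the native form.
    if strip_diacritics(most_common_english) == strip_diacritics(native):
        return native
    # Otherwise the rule does not apply; keep the common English spelling.
    return most_common_english


if __name__ == "__main__":
    print(choose_title("Zurich", "Zürich"))       # differs only by the umlaut
    print(choose_title("Nuremberg", "Nürnberg"))  # differs by more than a diacritic
    print(choose_title("Vienna", "Wien"))         # a separate English name
```

Under these assumptions the first pair yields the native spelling and the other two keep the English name, which is the Zürich/Nuremberg split some of the opposing comments object to. Letters such as "ß", "ø" or "ł" have no NFD decomposition, so this sketch would not treat them as diacritic variants; handling them would need an explicit mapping.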
From borrowed Général to native General
Whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used.
Stripping diacritics (Funny foreign squiggles) is the second step of integrating a foreign borrowed word into English. At what point does a word go from being a foreign one to being an English one? I suggest that common English usage should be the guide, not the above.
By the above sentence, we should store articles under Napoléon, Général etc. Just how long does a borrowed word have to be written in English stripped of diacritics before it becomes a native English word? Does this mean that if General is moved to Général as this wording would suggest that it is OK for anyone to alter any text anywhere in Wikipedia to link to Général instead of General "because that is the name the page is stored under and is the main link"? If not, why is General an exception to the rule and how does one tell? --Philip Baird Shearer 10:30, 11 Apr 2005 (UTC)
Yes, and if this convention is adopted, then we'll be moving hotel to hôtel, without any debate. We'll also add diacritics to Québec and Montréal, although this is not the most common practice in Canadian English. —MichaelZ. 2005-04-11 19:46 Z
Diacritic/Non-diacritic usage
If we are adopting :Whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used. What about Diacritic#Non-diacritic usage? Are all of those characters verboten unless in common English usage? How does any English-speaking person who is not an expert tell the difference (without looking at Diacritic/Non-diacritic usage)? Seems like a lot of extra work when Occam's razor and keep it simple can be used by following "common English usage". --Philip Baird Shearer 10:30, 11 Apr 2005 (UTC)
Do not use diacritics
If we are adopting: "Whenever the most common English spelling is simply the native spelling with diacritical marks omitted, the native spelling should be used.", then is it always possible to put on the first line:
Word or (in the native language spelling: ŴöŖď)
As this policy states that the "most common English spelling" is being replaced with a "native spelling"? If not, is the proposed Wikipedia policy saying that the only way to spell a foreign word, which is spelt with diacritics in the foreign language, is with diacritics, and that if it is not spelt that way it is wrong in English? Or does it have this format:
ŴöŖď (/Sometimes/Mostly/Nearly always/Always spelt in English Word) eg
Général (Always spelt in English: General)
or must it only be in the format
Général ... [with no mention of the most common English spelling at all]
I have done my best to educate Philip about the meaning of diacritic, name, word, loanword, transliteration, and related concepts, but to my chagrin, obviously without much, or any, success, as the numerous "funny foreign squiggles" comments on this page testify. In a nutshell, this poll concerns non-English names exclusively, without applying to loanwords at all. dab(ᛏ) 10:47, 11 Apr 2005 (UTC)
Please educate me: when does a borrowed word become an English word? E.g. is Zurich a loanword or a German word? This directly affects whether such a word would fall under this policy if it is implemented. This suggested policy, without constraints, is in danger of turning some articles in en.wikipedia into Franglais and Germish, etc. (see loanword#Terms). Philip Baird Shearer 12:19, 11 Apr 2005 (UTC)
you are asking a meaningful linguistic question now! Fortunately, we are talking about proper names, so we do not need to know the precise answer to that, for this particular policy. dab(ᛏ) 20:10, 11 Apr 2005 (UTC)
So the proposed convention applies to proper names only? Better reword it and start a new vote. —MichaelZ. 2005-04-11 19:49 Z
I hope you (and all other voters) have noted the title of this section, Proposal and straw poll regarding place names with diacritical marks (emphasis mine). dab(ᛏ) 20:06, 11 Apr 2005 (UTC)
If you just drop the proposed wording into the page, it won't mean just place names. If you're planning to change it, then why are we voting on this? Please at least update the proposal to something realistically usable, and place a prominent note that it's been changed. —MichaelZ. 2005-04-11 20:53 Z
However, on second thoughts, this proposal may need to be applied with a grain of salt, namely in contexts (language transliterations) where some names match the common English spelling by coincidence. E.g. we'd not move Krishna to Kṛṣṇa, but we'd move Ramayana to Ramāyaṇa (granted, both aren't placenames, but you get the idea). We may need to restrict this policy to languages spelled with the Latin alphabet natively, and come up with separate guidelines for individual transliteration schemes. Yes it's complicated. That's as it should be, we are writing an encyclopedia after all: We're not making it more complicated than need be, merely just as complicated as it is in reality. dab(ᛏ) 20:18, 11 Apr 2005 (UTC)
It's a rough proposal. It's a straw poll. Suggestions for more precise wording that captures the concept are certainly welcome. Dpbsmith(talk) 21:27, 11 Apr 2005 (UTC)
They would break too. Just check your own comment — they have been converted into entities with value above 255. They will only work after the English Wikipedia is converted to UTF-8. --cesarb 02:06, 12 Apr 2005 (UTC)
this policy should anticipate the software update. Unicode titles will be supported in the near future, and we are using Template:Wrongtitle in the meantime. dab(ᛏ) 08:12, 12 Apr 2005 (UTC)
by Gdr
Applied rigorously this proposal would butcher many spellings, including, in my opinion:
The proposal is too broad to be usable in cases where the "use English" naming convention coincides with technical restrictions. In many articles whose title cannot include diacritics we use the native spelling in the body but the diacritic-less spelling in the title. The number of these articles is non-trivial, as witnessed by the number of links to the titlelacksdiacritics template, so policy wording this oversimplified, which conflicts with those cases, makes the overall naming convention self-contradictory and useless. --Joy [shallot] 16:15, 12 Apr 2005 (UTC)
It appears that many votes are targeting specific groups of articles, but the policy cannot tolerate this kind of approach.
There are articles where English pronunciation or transliteration de facto doesn't exist - nobody should really have any objection on the inclusion of diacritics in article titles on e.g. some obscure Polish writer or a Hungarian village, and a name that doesn't follow English pronunciation rules. It makes absolutely no difference to the English viewers how it's written because the title is just a bunch of letters to them.
However, there are articles where English pronunciation exists, and differs from the native one. The painfully obvious example is Zurich - I can't spell SAMPA, but it's different from the German one. In such a case, regardless of how it's spelled natively, I think it's fair for the native speakers to back off and let the English form be used on the English Wikipedia.
Between those two cases there's the case where the English spelling is not obvious or standardized. For example, the tennis player whose Croatian name is "Ivo Karlović" is often spelled "Ivo Karlovic" in English texts that don't support diacritics (think CNN etc). Sometimes the English-speaking announcer pronounces that as Karlovich, sometimes Karlovick. The first form emulates the Croatian pronunciation, the second doesn't. Ivo wouldn't turn around if you called out "Karlovik!" (well, at least when he was younger and unaccustomed to that :). How does one decide that "Karlovic" is the right form? The rationale for that ("let's just strip 'em") is at least as trivial as the rationale for using "Karlović" ("let's copy the original verbatim").
All in all, having one single poll that doesn't discriminate doesn't seem like the smartest idea to get a real consensus. --Joy [shallot] 30 June 2005 14:49 (UTC)
Proposal on the use of ß
There has been a discussion at Wikipedia talk:Manual of Style on the use of the German eszett character ß on Wikipedia. Consensus appears to be developing that even though it is a Latin character, it is one that is not likely to be familiar to many English readers, and so its use should be eschewed in article titles. I propose adding this comment to the page:
The German eszett character ß, which represents ss, is unfamiliar to many English readers. Unlike other non-English Latin letters, such as ø and letters with diacritics, it is not readily recognized as a variant of a familiar letter. It should not be used in article titles. Nohat20:47, 28 August 2005 (UTC)
I object. ß and ss are not the same. We are writing an encyclopedia. Encyclopedia should use real names, not their approximations. -- Naive cynic22:47, 28 August 2005 (UTC)
Do you also propose that articles about Chinese topics have their Chinese-language names at the article titles? ß is not an English letter or even anything like an English letter. When it comes to article titles, it should be treated the same way we treat Chinese characters. No one is suggesting that we don't include the "correct" German spelling in the article itself. This proposal only concerns article titles. Nohat02:22, 29 August 2005 (UTC)
Yes, Chinese topics that don't have English names should have their Chinese-language names (in the Latin script) as the article titles as well. It's already a policy, so I don't have to propose it. -- Naive cynic06:55, 29 August 2005 (UTC)
But the Chinese-language names in the Latin script are "approximations", not "real names", which are in the Chinese script. The reason we use these approximations is because the Chinese characters are not identifiable to most English readers. The same principle applies to ß. It's not identifiable to most English readers, and so shouldn't be used in page titles. Nohat00:28, 30 August 2005 (UTC)
There is a considerable difference between Großherzogtum Luxemburg and 自由的百科全书. Even if someone hasn't had occasion to encounter a German letter before, the presence of one unknown symbol doesn't make the whole word unidentifiable.
I'd expect correct names to be used especially in the article titles. It is, of course, a good idea to mention alternative spellings in the article text, if they are not obvious. -- Naive cynic16:48, 30 August 2005 (UTC)
How about a word like "iß"? Completely unintelligible if the reader isn't familiar with the letter ß. The same uselessness as 自由的百科全书. There is a qualitative difference between iß and a word that only uses letters that are used in English. The page already says "If a native spelling uses different letters than the most common English spelling (eg, Wien vs. Vienna), only use the native spelling as an article title if it is more commonly used in English than the anglicized form." No article title whose title might have ß has the form with ß more commonly used in English than "ss", so explicitly disallowing ß is just a straightforward application of this already-existent policy. Finally, your assertion that "ß" is "correct", with the implication that "ss" is incorrect is mistaken. We don't chastise the Japanese for spelling English words using katakana; why should English writers be expected to spell foreign words using letters that English doesn't have? Strauss, for example, is merely how that name is spelled in English. If you don't believe me, look it up in a dictionary. No English dictionaries spell anything, including borrowed German words and names, with ß. Nohat05:32, 31 August 2005 (UTC)
I'm not sure why you think that I claim that transcribing ß as ss is incorrect. It definitely isn't. Nor am I going to chastise anyone for using this convention, especially considering the fact that I'll, myself, happily use it in informal, and, in many cases, in formal writing. When writing an encyclopedic definition, I'd like, however, to hold myself to higher standards.
I'm also concerned about the consequences of this proposal for languages like Azeri. Azeri uses the Latin alphabet, with one special letter, ə, as in, e.g., Xankəndi (and yes, I know it should be moved to Stepanakert, that's not the point). Will this letter be banned from article titles as well? Admittedly, it is hardly unknown to English readers, since it is used in English phonetic notation. -- Naive cynic 08:23, 31 August 2005 (UTC)
ə is just turned e and is easily identifiable as such (especially given, as you mention, that it is used almost universally in English phonetic notations). Turning a letter is equivalent to adding a diacritic. On the other hand, there is no such easily recognized visual relationship between ss and ß. To someone who only knows English, ß is completely opaque. It looks like B, not ss. Nohat08:32, 31 August 2005 (UTC)
To me ß looks like an old style s followed by a z. But that's because I know what it's called :) Before I did it looked like a B to me too. I don't think ə is easily identifiable at all. It doesn't look like a turned e to me, it just looks like a new letter. - Haukurth20:22, 1 September 2005 (UTC)
I have to admit that connection between ss and ß isn't obvious. Oh well, I won't object to your proposed addition. Even if I'm not entirely happy with it, there is some point behind it. -- Naive cynic14:01, 31 August 2005 (UTC)
To many English speakers with a mathematics or science background, especially without familiarity with the German language, the biggest problem is that ß looks like β, the Greek letter beta commonly used in those contexts. Plus, in the old extended ASCII character sets, there was usually only one letter used by those who wanted either of these; whichever it was supposed to be wasn't that important. Gene Nygaard14:47, 17 October 2005 (UTC)
Confirming my observation above, from ß article: "the original IBM DOS codepage, CP437 (aka OEM-US), conflates the two characters, assigning them the same codepoint (0xE1) and a glyph that minimises their differences." Gene Nygaard15:06, 17 October 2005 (UTC)
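The code page claim quoted above can be checked against the `cp437` codec that ships with Python. This is a small sketch rather than anything from the discussion itself; the helper name `fits_cp437` is made up for the illustration. It shows what character the codec assigns to byte 0xE1, and whether ß and β can each be encoded in that code page.

```python
def fits_cp437(ch: str) -> bool:
    """Hypothetical helper: True if the character has a slot in Python's cp437 codec."""
    try:
        ch.encode("cp437")
        return True
    except UnicodeEncodeError:
        return False


# Decode the single byte 0xE1 using Python's built-in CP437 table.
decoded = bytes([0xE1]).decode("cp437")
print(decoded, hex(ord(decoded)))

# Compare the German sharp s (U+00DF) with the Greek small beta (U+03B2).
print("sharp s fits:", fits_cp437("\u00df"))
print("beta fits:", fits_cp437("\u03b2"))
```

Python's table has to pick a single Unicode character for the shared slot, so text written with that byte to mean beta comes back as ß when decoded this way.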
Indeed. ß looks like beta or B - but from my point of view this is a plus. The point about having names in the Latin alphabet is that even if you don't really know how to properly pronounce them you can at least come up with something. Pronouncing ß as a voiced labial plosive is indeed an awkward mispronunciation - but at least it's something. And two people unfamiliar with the letter are likely to come up with a similar mispronunciation. Let's look at some other letters. The Polish Ł probably looks like an L to most people. People unfamiliar with Polish who come across a word with Ł in it will probably pronounce it as an L and think of it as a kind of L. But in reality it's pronounced completely differently. The same goes for the Icelandic Ðð and Þþ. Most people will see them as a variant of 'd' and a variant of 'p' respectively. They're not pronounced like that in reality but at least you can think of them as vaguely familiar entities. This is categorically different from the experience of a reader encountering a row of Japanese symbols if she isn't familiar with Japanese. The sequence 浄土宗 is completely unhelpful. A sequence like "Þor" in my father's name can at least be thought of as "Weird P - O - R".
But of course we aim to do better than leaving the reader with thoughts of "weird P's" etc. We want to provide accessible information on pronunciation and, where appropriate, common transliterations. - Haukur Þorgeirsson18:11, 17 October 2005 (UTC)
Big deal. The Polish Ł indeed looks like an L. The Poles know that as well as we do (Polish jokes notwithstanding). It should come as no surprise that people "pronounce it as an L and think of it as a kind of L". But that is the Poles' problem, not anybody else's. Nobody is forcing them to continue to use that mutation. If it bothers them, they know what they can do to fix it. Gene Nygaard23:48, 18 October 2005 (UTC)
In fact, the Polish Ł not only "looks like" a kind of L, it "is" a kind of L, namely "hard L" (or "velar" L) which happens to be pronounced as /w/ in contemporary Polish (but in other Slavic languages, such as Russian, the etymologically corresponding consonant is pronounced as an l with the tongue tip much farther back than even in English "fall"). Cf. also (among others) the fact that Polish "Wrocław" /'vrɔtswav/ with ł, corresponds to German "Breslau" with l; Wisła is the Polish name of the Vistula (both names from Latin "Vistula"); etc. - Tonymec 04:23, 19 October 2005 (UTC)
The Swiss always transliterate ß to ss, as does every German with a typewriter that doesn't produce ß, when doing crossword puzzles, or when capitalising the letter. This is always allowed if no ß is available. The Swiss have done this since the 1930s, probably due to the wonderfully overloaded Swiss typewriter layout. (Switzerland, being quadrilingual, needs to accommodate French accents like é and ç, as well as German ones ö, ä, ü. There simply aren't enough keys for also including ß.) The last Swiss newspaper, the respected NZZ, abandoned the ß in 1974. The German spelling reform of 1996 confirms the ß-less Swiss regional variant. Arbor 07:30, 30 August 2005 (UTC)
I just added the recommendation for ß from the MoS page, where a consensus formed to avoid it but to mention the German spelling in parentheses. --Tysto 08:15, 2005 August 30 (UTC)
I agree with your wording but not your haste. You really ought to have waited 5 days without a contrary view before saying that a rough consensus had emerged on this page. (Why 5 days? It is the usual time used to build a consensus; see WP:VDF and WP:RM.) It is only 2 days since Naive cynic objected to this and (s)he has not indicated that (s)he now agrees with this change. Only Nohat had positively said that this change should take place before you made it; that is hardly a consensus. A consensus taken from a page which the editors of this page may or may not read cannot be described as a consensus for a change to this page. As the topic was already under consideration here, I think you should have indicated your intention on this talk page and waited to see if anyone objected. -- Philip Baird Shearer 10:13, 30 August 2005 (UTC)
I disagree with doing away with ß. By the same logic we should not use þ and ð which are present in the names of many Icelandic people with articles on Wikipedia. - Haukurth17:35, 31 August 2005 (UTC)
I disagree, they should be used in article titles. I think there's a pretty clear consensus for that among people editing articles on Icelandic people, places and things. Here are just a few examples:
Please do not start moving these around. Such an action would be fought tooth and nail by Icelandic Wikipedians, me included. Why forbid ð but allow æ? Is it really completely clear that æ is a variant of a letter in the English alphabet? How about œ? If those are obviously 'ae' and 'oe' (and I don't think they are) then why isn't ß obviously 'sz' (and I don't think it is)? I'm removing the ß dictum from the page. I think more time is needed to discuss the matter before a consensus can be declared. - Haukurth20:12, 1 September 2005 (UTC)
Frankly, I oppose the use of æ in article titles, also. It is Wikipedia, not Wikipædia. In fact I oppose the use of diacritics, or any character outside of 7-bit (127 character) ASCII, in article titles, but I think that ß, þ, and ð are larger problems than æ and Œ, and those in turn are larger problems than diacritics. But clearly non-English characters such as ß, þ, and ð, and Chinese, Cyrillic, and Hebrew characters (all of which I think I have seen used in article titles) ought to be absolutely forbidden in article titles. This is the English-language Wikipedia, and all article titles should be English words if at all possible, and certainly ought to be in the alphabet used for writing English. DES(talk) 20:33, 1 September 2005 (UTC)
well, ascii is a rather arbitrary set. Why not restrict ourselves to [A-Z0-9] for article titles, so titles will be correctly rendered on 1960s terminals? I support the idea that every article should have an ascii-only redirect, but I really see no reason at all not to use the capability to have unicode titles introduced with the last mediawiki update. dab(ᛏ)07:30, 2 September 2005 (UTC)
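The idea of an ASCII-only redirect for every title can be sketched as follows. This is an illustration under stated assumptions, not how MediaWiki or any existing bot does it: the function name `ascii_redirect_title` and the `FALLBACKS` table are hypothetical, and the choice of replacement for each letter (for example ð as "d" rather than "dh") is exactly the kind of decision this page is arguing about.

```python
import unicodedata

# Hypothetical replacements for letters that have no Unicode decomposition.
FALLBACKS = {
    "ß": "ss",
    "þ": "th", "Þ": "Th",
    "ð": "d", "Ð": "D",
    "æ": "ae", "Æ": "Ae",
    "œ": "oe", "Œ": "Oe",
    "ł": "l", "Ł": "L",
    "ø": "o", "Ø": "O",
}


def ascii_redirect_title(title: str) -> str:
    """Return an ASCII-only form of a title, suitable as a redirect name."""
    # First replace the special letters, then drop combining marks.
    replaced = "".join(FALLBACKS.get(ch, ch) for ch in title)
    decomposed = unicodedata.normalize("NFKD", replaced)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    if not stripped.isascii():
        # Anything left over (e.g. Chinese or Cyrillic) needs a real romanization.
        raise ValueError(f"no ASCII fallback for {title!r}")
    return stripped


for title in ["Seyðisfjörður", "Straße des 17. Juni", "Paul Erdős"]:
    print(title, "->", ascii_redirect_title(title))
```

Titles in other scripts deliberately raise an error here instead of being guessed at, since romanization schemes are a separate question.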
þ and ð in addition to ß should not be permitted in article titles and should not be used in article text, except when giving the Icelandic/German name for something. They are not English letters and are not easily identifiable as variants of English letters, like say æ and Œ are (and they are). The criterion should be "can someone who only knows the English alphabet look at it and figure out what letters to write?" This is true of accented letters and the so-called ligatures like æ and Œ, but is not true of þ, ð and ß, which are misleading. þ looks like b or p; ð looks like d; ß looks like B. Furthermore, both æ and œ are in contemporary use in modern British usage, clearly establishing them as legitimately English letters.
If you can't figure out what letter it is by looking at it, then it shouldn't be allowed in titles. It's a criterion with a sound logical premise, and follows logically from already-existing policies. Nohat07:52, 2 September 2005 (UTC)
wow. who just gave you definition power over English alphabet? If the term has any meaning at all, it certainly does include þ and ð. Your approach would also seem to suggest, "don't use difficult words people don't understand". What people? 12 year olds? Six year olds? well, then there is nothing to stop us from merging English Wikipedia into simple:, is there. The correct approach is, rather, to explain things you think some people will not understand. And by "explain" I don't mean wordy sermons along the lines of "the funny thing in the title is a letter that was used a long, long time ago by the Vikings", but a simple hyperlink, either to thorn or to icelandic language or to transliteration. If people want to know about Thorn or the Icelandic language, they can go and read about it. If they don't want do know, no problem, leave them alone. Thorn, in particular, was in use up to Milton's time. Now, if Milton didn't write in English, I don't know who did. dab(ᛏ)08:01, 2 September 2005 (UTC)
English that uses thorn is not the same language that Wikipedia is written in, so arguments about what was used then do not apply to the English of today. Indeed, English alphabet says "in Modern English orthography, þ, ȝ, ð, and ƿ are obsolete". Second, I'm not saying that we shouldn't use those characters at all, just that they shouldn't be used for page titles, or in general text, where their use negatively impacts usability without any reasonable positive effect to offset it. Their use should be restricted to identifying the native forms of names and so forth, like we do for other things that don't natively use the English alphabet. We wouldn't use Chinese characters in page titles or interspersed into article text. Why should þ and ð be an exception? Because they were once upon a time used in English? In modern English, þ is just as foreign as 中, and they should be treated the same. Nohat08:55, 2 September 2005 (UTC)
I agree that thorn is obsolete in English orthography, of course. I do not agree that "[to] modern English, þ is just as foreign as 中". 中国 is a redirect, and that's as it should be, because it has a romanization, chung-kuo. Now, chung-kuo is also a redirect, because there is a common English name, China. If we were talking about some remote Chinese village, however, there would be no common English name, and we would use the romanization (not the Chinese characters). þ doesn't need to be romanized, it is used in languages written in the Latin alphabet. How many times have I argued on this very page that romanizations should be used for article titles, unless there is a common anglicized name. Case in point, we use Thor (nobody suggests a move to Þórr!), because it is the familiar anglicized spelling, but we use Seyðisfjörður because there is no familiar anglicized variant of the name. I accept that one may be of various minds about the question, but please at least try to be aware of the details rather than lumping all together, transliteration/romanization/anglicization/ascii/diacritics/ISO 8859/UTF-8, into naive "English" vs. "Foreign" categories. "China" is an "English" name. "Thor" is an anglicized spelling (the "English" name being thunor or thunder). "Seydhisfjordhur" is not so much anglicized as crudely ascii-ized. dab(ᛏ) 08:00, 3 September 2005 (UTC)
Your example about Þórr and Thor is a good one. This is exactly what we've agreed on in Wikipedia:Naming conventions (Old Norse/Old Icelandic/Old English) - use the familiar English form where one exists, as in the case of Thor and Odin. For other characters we use the Old Norse spelling, rather than choosing between several different possible anglicizations. There isn't really a familiar English form of, say, Veðrfölnir because Veðrfölnir himself isn't familiar. Every English translator of the Eddas (the main source of Norse mythology) seems to have his own idea of if and how to anglicize the names, leading to the chaos that, using Wiglaf's standard, we are now trying to fix. - Haukurth 14:48, 3 September 2005 (UTC)
Ah, I can see that it might be confusing but check the page again. The 'No Agreement' subheading refers to an earlier proposal (using the Orchard/Lindow forms). Currently those of us who are actively working on the Norse mythology pages are in agreement as to how to proceed. - Haukur Þorgeirsson08:03, 6 October 2005 (UTC)
I am not confused. There is disagreement with the "standardization" attempt up and down that page, and it should not be confused with policy. The policy remains "use the most common name in English" for the article title. Jonathunder08:26, 6 October 2005 (UTC)
It's not intended as a set-in-stone policy. It's a guideline on how to apply Wikipedia:Naming conventions (use English) to Old Norse names. We all agree on using the common English form where one exists. Currently those of us who are most actively working on the ON pages (Wiglaf, Salleman and myself) make use of that standard. If you have a comprehensive counterproposal we'll certainly be happy to discuss it. What none of us wants is to have some sort of unscientific Google-fight over every single name à la requested page moves at its worst. That would consume vast resources and we'd end up with a very inconsistent corpus of Norse mythology articles. We're currently slowly working on normalizing the titles in accordance with our guideline. That means we're often moving pages from Swedish, Icelandic, German or obsolete 19th century anglicized forms to the normalized forms. Of course our guideline isn't a perfect solution but I think the result will be a great improvement on the previous mess. Thanks for taking an interest. - Haukur Þorgeirsson11:28, 6 October 2005 (UTC)
Just because a letter (or in this case a ligature) is unfamiliar to people is no reason to avoid its use. Let's not dumb down any more than necessary. There is no Academy laying down the rules of English (thank goodness), so there are no hard and fast rules, including the alphabet. Many English words retain accents from foreign languages (or may do so): naïve, rôle, façade, fiancée, etc. Why should ß be any different? See the list of English words with diacritics for dozens of examples of "foreign" accents in English words. --Stemonitis 10:39, 3 October 2005 (UTC)
I should mention that when the above words are written in standard English texts (i.e. newspapers, books, magazines, etc.), these words are often written without diacritics, e.g. naïve → naive, rôle → role, façade → facade, fiancée → fiancee, etc. Even café → cafe, unless someone is being trendy. --Mark 05:51, 6 October 2005 (UTC)
I wouldn't say very common for 'façade', 'résumé', 'fiancée' and 'café', but equally common with their non-diacritic versions. These words are in the process of losing their diacritics; the words role and naive have pretty much finished this process of diacritic loss. I know for certain that I never use diacritics for the words role, naive and cafe, but sometimes use diacritics for 'façade' (as 'c' before an 'a' is /k/), 'resumé' (to distinguish it from the word 'resume' (start again)), and 'fiancée'. I think most English speakers who stick with the diacritic versions do so because it is 'chic' or 'trendy', so they can appear more cosmopolitan. --Mark 08:15, 7 October 2005 (UTC)
"Why should ß be any different"? Because its use has not been established by longstanding convention. English words with diacritics are unusual, so much so that apparently they're judged worthy of their own article. :-) In fact, I'd call them living fossils, and say it's unlikely that any more will ever come into the language. Dragging ß into English would be an uphill battle for the NY Times or other major source of language standards, and it's quixotic for WP to even try. Stan16:26, 3 October 2005 (UTC)
Are you sure there aren't some reasonably new loan-words in English that have diacritics? 'Führer', maybe? Or, I don't know, 'gemütlich' - how old is that? Or maybe there's something even newer. It's an interesting point - maybe you're right but it's not immediately evident to me. In any case no-one is going to drag ß into English. The only issue is whether we should use it in German proper names; book titles, song titles, street names etc. Take Straße des 17. Juni as a test case. - Haukurth18:37, 3 October 2005 (UTC)
I understand that English-speaking users are getting worked up about using the "ß" instead of the "ss", but the reality remains that there are certain words in German which cannot really be written with "ss" because the pronunciation would change too. There is a difference between saying "Straße" and "Strasse", which would be more similar to "Trasse", as well as "Groß" and not "Gross", and "Weiß" and not "Weiss". The other problem is that family names in the German-speaking areas are sometimes written with a "ß", and not a double "ss". On another topic, the same goes for Umlaute. Some family names are written with an Umlaut "ä" "ö" "ü", and some are written with "ae" "oe" "ue" but are still pronounced as an Umlaut, although written without one. So it is best to stick with how it is done in the native tongue, instead of imposing some supposed "English" system that is neither correct here nor there. Unless the Kulturminister come up with a new law to completely abolish the "ß", it remains and will be used. Gryffindor 14:27, 22 October 2005 (UTC)
Using the English alphabet when writing in English is correct. Any claims to the contrary are nonsense.
The half-letter ß does avoid one of the major problems—it is never used as the initial letter, so it has a lesser effect in messing up indexing in things such as categories in Wikipedia. That's also why I called it a half-letter; it doesn't even have an uppercase form. The same cannot be said about the other letters under discussion here, however; they do often mess up the indexing horribly, whether it is done manually in a list within an article, or automatically (unless manually overridden) in things such as categories. Gene Nygaard14:42, 22 October 2005 (UTC)
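The indexing and uppercase points above can be illustrated with a short Python sketch. The title list and the `SORT_KEYS` override table are invented for the example and are not how Wikipedia categories are actually implemented. The sketch compares plain code-point ordering with a manually overridden sort key, and shows how Python's default case mapping treats ß.

```python
titles = ["Zurich", "Þingvellir", "Akureyri", "Ölfus"]

# Plain code-point order: Ö (U+00D6) and Þ (U+00DE) both sort after Z.
by_code_point = sorted(titles)

# Hypothetical manual overrides, one per title that would otherwise be misfiled.
SORT_KEYS = {"Þingvellir": "Thingvellir", "Ölfus": "Olfus"}


def sort_key(title: str) -> str:
    """Use the manual override when present, otherwise the title itself."""
    return SORT_KEYS.get(title, title).casefold()


by_sort_key = sorted(titles, key=sort_key)

# Python's str.upper() follows the Unicode default mapping of ß to "SS".
upper_street = "Straße".upper()

print(by_code_point)
print(by_sort_key)
print(upper_street)
```

The override table has to be maintained by hand, which is the extra work the comment above is complaining about.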
Alphabet vs character set
This article seems to confuse the terms alphabet and character set. It recommends using the "Latin alphabet," but the Latin alphabet does not include the letter J, for example. It apparently means the Latin character set, but seems overly broad as well as muddy. I think it should just recommend using the "English alphabet" (26 letters) with diacritics. That would help make it clear that ß, for example, should be avoided. Are there any letters outside the 26 English letters that are really acceptable for English speakers? --Tysto 08:15, 2005 August 30 (UTC)
YES IF VE VSE THE LATIN ALPHABET VE VILL HAVE PROBLEMS. Use the English alphabet. However do not include with it a recommendation to use diacritics. Philip Baird Shearer
The expression "Latin alphabet" does not mean "the alphabet which the Romans used to write with in Classical times". In particular, the (modern) Latin alphabet, unlike the Classical Romans' alphabet, has lowercase letters, additional letters like U, J, W and even thorn; it also has (in its various national variants) digraphs like œ, æ, etc., and letters with diacritics like é, è, ê, ë, ę, ē, ĕ, ç, č, ć, ĉ, õ, ø, etc. Similarly what is known today as "the Cyrillic alphabet" is not the alphabet which Cyril and Methodius used to evangelize Bulgaria. -- Tonymec 17:05, 14 September 2005 (UTC)
Well, accented characters are generally speaking acceptable, even though they're not ordinarily used in English. However, the more "exotic" the diacritic, the less likely it is to be understood correctly. In other words, diacritics used in French and Spanish are almost universally understood (if not what the diacritic means, then that it's "some variety" of the unadorned letter), but less familiar ones, like under-comma and double acute, probably will result in greater puzzlement. I don't think in the end that accented letters provide a real impediment to understanding in the way that the other "Latin" letters like ß, ð, and þ would. So yes, I agree in principle that it should say limited to 26 English letters, but should be explicit that there isn't consensus to forbid accented letters. Nohat09:21, 30 August 2005 (UTC)
I do not think "diacritics used in French and Spanish are almost universally understood" is correct. I think "diacritics used in French and Spanish are almost universally ignored when read in English texts" is more accurate; and if a native English speaker does not recognise a foreign letter like ß or ð, it is not possible to ignore it, so such letters should be avoided. Philip Baird Shearer 09:44, 30 August 2005 (UTC)
I agree with Philip here that English speakers generally do ignore diacritics over/under/through letters, so a 'ü' would just be seen as a 'u' and pronounced as such, 'ł' as 'l', 'ę' as 'e', etc. But þ, ð or ß are different, because to an average English speaker they are either unrecognisable or resemble other letters such as p, d and B, which of course don't sound the same at all. English speakers can and do have problems with these letters. This is English Wikipedia after all, so we should cater primarily for English speakers and then secondly for others. – AxSkov (☏) 06:17, 14 September 2005 (UTC)
You know, just because you can ignore diacritics in English text does not also mean you can also ignore parenthetical comments. :-) Nohat09:52, 30 August 2005 (UTC)
The biggest problem with diacritics, to me, is that they are not easily entered into the search box, and not consistently handled by search engines. Their effects on the words involved are also often unclear. Personally I support, as I said above, a limitation of article titles to pure 7-bit (127 character) ASCII. DES(talk) 20:38, 1 September 2005 (UTC)
To add my point of view, I do understand the argument that ß is opaque to many English readers. I disagree, but I am willing to accept the point. I am not quite sure what to think of ð and þ. My gut feeling is that the former is in, but the latter is out. Dunno.
However, I do know what to think about the Hungarian double acute accent, diacritics and undercomma. My English dictionary (COD) contains garçon and even ångström (the latter only as a variant spelling, not as an entry). The argument “Use English” or “This is the English-language Wikipedia” simply won’t dispel those symbols, for the COD certainly “uses English”, and Henry Higgins would have a word or two to say about people who use other dictionaries. Many of my English maths books and articles spell Paul Erdős with a double acute accent, and even though I haven’t tallied them, that spelling might even be the predominant English spelling of the man’s name. (A result, no doubt, of the facility offered by TeX for entering the name, and the childish technophilia of many of my colleagues. But that is not the point.) Majority or not, I am certain the Hungarian double acute accent is used for Paul Erdős in most contexts where this is technologically feasible. Which it certainly is on Wikipedia.
In short, I am willing to accept the argument from unintelligibility with respect to a small handful of roman letter forms, ß being an example. (On the other hand, I would gleefully support a movement to educate the unwashed masses about what ß, ð and þ are. I am happy to learn new stuff, including letters I didn't know before, and I assume the readers of an encyclopaedia share that sentiment. I suppose the argument from enlightenment cuts both ways here: we want to spread knowledge, including about spelling, but on the other hand we don’t want such spellings to obfuscate unnecessarily.) But I don’t accept the argument from Use English. Use English tells us to use Cologne instead of Köln. It says nothing about Paul Erdős. Arbor08:39, 5 September 2005 (UTC)
Yes, thank you, that's what I'm trying to say. There is no problem at all with Paul Erdős as long as Paul Erdos redirects there. Even the unwashed masses will recognize ő as an o plus Funny Foreign Squiggles. I am also amenable to replacing ß with ss, since that's a familiar practice even in German orthography itself (Swiss orthography never uses ß at all, and in Germany, ss is acceptable when ß is unavailable, and in ALLCAPS). Surely, otoh, you are not suggesting we use ð and not þ? Surely, case by case, it's either both or neither? dab(ᛏ) 08:47, 5 September 2005 (UTC)
I simply don't know what to think. I have a weak spot for ß and ð and þ. I do expect to find Thor under Thor, but on the other hand, I think the use of ð in Icelandic place names is quite appropriate and non-intrusive. I agree that it seems strange to allow ð but not ß and þ, so—letting myself be ridden by that hobgoblin of little minds, consistency—it seems I must admit all three, forbid ð against my better judgement, or shut up about it. Arbor10:53, 5 September 2005 (UTC)
Good call about Thor. I think I am slowly coming around to embracing both ð and þ, seeing that they are used (and of use) mainly in proper names for things Icelandic. Does this mean I was rash or limp-wristed in my abandonment of ß? After all, þ looks no more like t than ß looks like s. (Far be it from me to assume lewdness in Wikipedia's readers, but I have the suspicion that the spelling Þorn (horse) is easily misread as a P.) By analogy, would then not ß be an acceptable spelling as well? Or is the proliferation of German things (Gauß, Strauß) the reason to transliterate the “opaque” character ß to ss, arguing perhaps that “German proper names are used in a wider context than Icelandic proper names, and already have a traditional English transliteration, which Wikipedia should adopt.” But then what about obscure German proper names (Warschauer Straße, Gießen)—shouldn't they be treated with the same attitude as Eikþyrnir? As I said, I am unsure what to think. Maybe I am mistaken in my assumption that ß and ð and þ should be treated the same. Arbor 08:54, 8 September 2005 (UTC)
I think having the articles at the properly spelled names (Gießen), with redirect from the "simplified" versions (Giessen) would be the best solution. And mention the simplified version (and perhaps a pronunciation key) in the first line. Markussep10:36, 12 September 2005 (UTC)
I fervently disagree with Markussep; his solution is not a very good one. In all the English language reference books I've been looking through – which include atlases, encyclopaedias, etc. – 'ss' is used instead of 'ß', but they do use the letters þ, ð and letters with diacritics (esp. my Doubleday Atlas). It is standard in English orthography to transliterate 'ß' to 'ss'. Something tells me that the proponents of using ß have no idea what is orthographically standard in English. To many English speakers 'ß' can very easily be mistaken for B and pronounced as such. This causes confusion in the pronunciation of words that include this letter, so if Giessen was spelt Gießen then it would be approx. pronounced /giːbən/ rather than /giːsən/ by an average English speaker. To an English speaker, calling words that use 'ss' rather than 'ß' a "simplified version" is wrong, when it's just standard transliteration. – AxSkov (☏) 05:47, 14 September 2005 (UTC)
I agree with AxSkov that keeping esszett is more radical than keeping eth and thorn. I would be happy with a convention that expands esszett to ss, but leaves eth and thorn alone (except for cases of familiar names like Thor), and keeps accents on vowels. Esszett does have a distinctive value in German, but only in a few cases (like Masse vs. Maße). For most names it is perfectly permissible to expand to ss. dab(ᛏ)05:55, 14 September 2005 (UTC)
Eszet to double-s is a standard even in German, for example when writing in ALL CAPS, in a Swiss locale, or on a typewriter without an ß key. Thorn and edh are probably unrecognizable to most native English speakers, indeed to most people with no knowledge of Scandinavian languages. Thorn could IMO be transliterated as th, edh as dh or even as d, e.g., in Loftleidir-Icelandic for Loftleiðir-Icelandic, an aviation company, or Gudmundsson for Guðmundsson, a patronym: e.g. nowadays Tomas Gudmundsson redirects to Tómas Guðmundsson. Is the difference between d and ð just a diacritic — a bar through the tail of the d — or is it more than that? If it is just a diacritic, then the above redirection is in line with present Wiki guidelines; if it is more than that, then IIUC Guðmundsson ought to redirect to Gudmundsson rather than vice-versa. -- Tonymec16:47, 14 September 2005 (UTC)
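For illustration, the transliterations suggested in this thread (ß to ss, þ to th, ð to d, accents simply dropped) could be sketched as a small Python function that derives the plain-letter form of a title, for example when creating redirects. This is only a sketch under stated assumptions: the function name `ascii_fold` and the mapping table `SPECIAL` are hypothetical, the choice of "d" rather than "dh" for ð is just one of the options mentioned above, and none of this describes how Wikipedia's software actually handles titles.

```python
import unicodedata

# Hypothetical mapping for letters that have no Unicode decomposition.
# The choices follow suggestions made in this discussion (e.g. "d" for eth).
SPECIAL = {
    "ß": "ss",
    "þ": "th", "Þ": "Th",
    "ð": "d", "Ð": "D",
    "æ": "ae", "Æ": "Ae",
    "ø": "o", "Ø": "O",
    "ł": "l", "Ł": "L",
}


def ascii_fold(title: str) -> str:
    """Return the title with special letters expanded and accents removed."""
    out = []
    for ch in title:
        if ch in SPECIAL:
            out.append(SPECIAL[ch])
            continue
        # NFD splits an accented letter into base letter + combining marks,
        # e.g. "ő" becomes "o" followed by a combining double acute accent.
        for part in unicodedata.normalize("NFD", ch):
            if not unicodedata.combining(part):
                out.append(part)
    return "".join(out)


for name in ["Gießen", "Paul Erdős", "Tómas Guðmundsson", "Ísafjörður"]:
    print(name, "->", ascii_fold(name))
```

Characters that are neither in the table nor decomposable (Cyrillic letters, for instance) pass through unchanged, so the result is not guaranteed to be pure ASCII; a real redirect scheme would need a per-language decision for those.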
What is Wikipedia for? For distributing knowledge. By eliminating characters which exist in Old English and modern languages, you are trying to sanitize for users without in any way benefiting them. The use of ð, þ, æ, ö, í and the whole enchilada allows them to see that there is more out there than just those 26 letters, and it broadens their minds instead of causing an implosion in their brains; you seem to believe users can't handle anything else. This is like allowing someone to live his life in a protected bubble and then allowing him to be shocked when he leaves it and finds out that no one understands what he refers to and tells him, there is no such place as Isafjordur, but we do have an Ísafjörður, is that what you were looking for? I find this talk of eliminating accents to be incredibly short-sighted and political correctness gone mad. Limiting to the Latin alphabet (as opposed to Cyrillic, Arabic, Hebrew etc) is of course perfectly acceptable, but trying to limit it further to a tiny subset of 26 letters... --Stalfur 11:32, 8 October 2005 (UTC)
I don't think the pronunciation should be an argument in this discussion. There are many names without diacritics or funny foreign thingies that don't give the average English speaker a clue about how they should be pronounced, like Zschopau, Clwyd or Worcester. BTW how many average English speakers would recognize the thorn as something with a "th" sound, and not b or p? Markussep 09:19, 14 September 2005 (UTC)
The case of Mexico, newly mentioned in the main page, looks enlightening to me. "Mexico", pronounced /'mɛksɪkøʊ/, is so well entrenched in the English language that it is no longer perceived as foreign; people aren't conscious of the fact that it comes from Mexican Spanish "México" /'meʃiko/. The same might apply to "Paris" /'pærɪs/ vs. /pa'ri/ but since they're written the same, it doesn't change how to title the wiki page. I would advance that the same process of "anglicization" is happening in the case of "Zurich" /'zjʊərɪk/ vs. "Zürich" /'tsyriç/. The dispute seen here may be due to the fact that the process is less complete and/or that the -ich final is not perceived as English, as evidenced by some people's tendency to overcorrect "Munich" /'mju:nɪk/ to /mju:nɪç/ or even to "Münich" with a spurious umlaut, not realizing that the German name is actually "München" /'mynçən/. — Tonymec 22:00, 11 October 2005 (UTC)
I added the following to the paragraph on text:
If a word can be spelt in English with or without diacritical marks then both variant spellings should be given early in the article for educational reasons and to aid external access to the page from the rest of the world wide web.
User:Curps deleted it with the comment "actual practice is relevant, because that's what this page documents; rv recently added sentence which doesn't reflect current actual practice, and you haven't shown consensus for this proposed change"
So I am laying out the stall here:
There are many many pages where this is already done. Eg Zurich and El nino.
It would reduce a lot of conflict over page names.
It helps with external searches as not all search engines are diacritic smart and almost all are not smart in all languages.
This is only making explicit to this hot issue what is already covered by other guidelines:
WP:MOS states "if possible, make the title the subject of the first sentence of the article."
WP:UE "The body of such an article, preferably in its first paragraph, should list all of the other names by which the subject is known, so those too can be searched for."
The thing is that for umlauts and graves most English speaking people can ignore them, but it gets more difficult the more exotic the diacritic marks are, particularly where they also alter the font, e.g. "Gdańsk", which on the screen and bold font I am using comes out as a blob with a line above it. Rather than try to decide which diacritic marks can be ignored and which ones can not, it is easier if a simple rule of having both spellings is included. Then if the character comes over as a "?" or whatever, at least the word can still be read using the 26 letters of the English alphabet. If a letter is not in use in modern English, then the transliteration really ought to be included anyway.
If anyone has a reason for rejecting an explicit addition to the paragraph I would like to hear what it is and see if there is a form of wording we can agree upon. --Philip Baird Shearer 12:38, 12 October 2005 (UTC)
I disagree with Philip on a lot but I think this proposal of his has some merits that should not be ignored. First of all it moves us one step away from the winner-takes-all mentality that so pervades naming disputes here. I'd like to reword it a bit - we should handle the case where more than one ascii-spelling is used in English in addition to the native spelling. On Höðr, for example, we have Hod, Hodur and Hoth as common ascii anglicizations. Since there are also several other anglicized spellings in use (Höd etc.) I decided not to clutter the lead with them and use a separate section linked to with a footnote. But essentially I support Philip's proposal. Many editors want to use ascii versions, some readers will find them helpful and I can see no harm in including them (though, of course, we should try to do so elegantly and avoid clutter as far as possible). - Haukur Þorgeirsson12:54, 12 October 2005 (UTC)
Seems like a good sound common-sense proposal to me. We're not trying to suppress or favor any particular spelling, we're trying to serve readers and make information easy to find. Let the consensus begin here. Dpbsmith(talk)23:08, 12 October 2005 (UTC)
I hate to sound cynical or not assume good faith, but Philip seems to be in favor of applying a general rule only when this would favor his own preferences; however, if a general rule is not in his favor he insists on a case-by-case vote.
We had a survey in this talk page about diacritics. After 3 1/2 months, the tally in favor was 70% to 30%. After six months, it has now dipped to just under 60%, still a fairly respectable margin (a US presidential election by such a margin would be termed a "landslide"). Some might argue that the reduced margin reflects changing opinion over time, but I believe it's more likely the proponents simply considered it a settled issue and moved on, and aren't even aware of straggler votes still coming in. It's unusual for a survey to last this long... certainly, if it had been terminated after 3 1/2 months at the end of July at 70% to 30%, there would have been little ground for complaint on procedural grounds.
Given that a survey involving discussion and voting by dozens of people has apparently not settled anything in Philip's mind, and that Philip still insists that any page moves have to be determined by case-by-case voting, I really don't see how this new discussion involving just the handful of us is going to settle anything either.
In particular, Philip asserts that the Requested move vote at Úbeda in March trumps the global survey here that ran from April to... well for all I know it's still technically going on. In other words, it seems that the status quo at each article should apparently be grandfathered, in which case the outcome of any discussion here would apply only to new articles, while modifications to any particular article would require an individual vote or discussion at thousands of articles' talk pages.
You could argue a "famous" exemption for Mexico and Panama and Romania, I don't think anyone's even tried to move them, but surely no such exemption applies to the sleepy little town of Úbeda, and if you look at Category:Municipalities in Spain, a large number of municipalities have diacritics. Yet he seizes on the technicality of the earlier requested move vote... surely the survey was intended to establish a global policy and supersede earlier case-by-case voting. That was the whole point of having the survey in the first place.
Philip's "anti-squiggle" agenda will probably stir up a hornet's nest of User:Kolokol 's wherever he goes... as a longtime Wikipedia user he is surely aware of the Gdańsk/Danzig edit wars that lasted for years, yet from what he wrote above it seems he wants to blithely saunter in and mess with Gdańsk (of all articles, this is the third rail of Wikipedia articles if there ever was one).
I'm not interested in a recipe for chaos. If that's not what Philip has in mind, then he needs to clarify his position. What exactly will it take for him to accept a majority opinion that is different from his own? And what exactly do you have in mind when you say your proposal would reduce a lot of conflict over page names ? Somehow I doubt you are conceding anything on that issue. -- Curps00:46, 13 October 2005 (UTC)
Re Mexico and Panama and Romania -- I never envisioned the above ongoing vote as referring to country names. Country names are clearly not the same case as Úbeda and, I would also add, Zurich and Gdansk probably make up a third group, situated somewhere between those two extremes. –Hajor01:00, 13 October 2005 (UTC)
I am not going to debate with you, Curps, the use, or non-use, of funny foreign squiggles in page names; that is not what this section is about. My position on this is well known and there is no need to repeat it here.
There was a proposal to move Zaire see Talk:Zaire.
Côte d'Ivoire is a country with funny French squiggles in its name and I for one have no problem with that one as it is in common English usage unlike the ones I object to. But I do approve of the "(often called Ivory Coast in English)" on the first line as it aids search engines and give a little information for those who do not know that Ivory Coast is also used in English.
The word Zurich is where it is at the moment because of a WP:RM vote. It has both words in the title.
The Gdansk vote, over the use of Gdansk or Danzig for different historical periods, was for Gdansk not Gdańsk so why is Gdańsk used instead of that which was voted for?
Altering the introduction of an article does not take a vote. Curps, you are the one who said that "you haven't shown consensus for this proposed change", yet so far you have not given a reason for not adding a phrase to encourage the usage of both, other than it might upset some people; but it is the removal of a spelling which tends to upset people, not the addition of a spelling. As the words Zurich and Zürich both appear on the first line of the article, it does not really matter whether the article name is Zurich or Zürich. If only one name appears on the first line then it becomes much more important what the name of the article is, as most people have a preference over the spelling of the name on the first line; but when both forms of the word are included in the first line, far fewer people are going to object to the name of the article.
Note that while it is fairly common to use the French spelling Côte d'Ivoire including diacritics in English text, as well as the English version "Ivory Coast", the third option is also very common in English—the diacritics-less version of the French spelling, Cote d'Ivoire:
"This guideline is supposed to drive Wikipedia usage, not be driven by it"
I think this is a very interesting point and I'd like to say a few words about it. To begin with it is not self-evident to me that this is or should be the case. If you're writing a traditional encyclopedia this top-down model might make sense.
You write some guidelines, give them to your writers and tell them to follow them. But Wikipedia doesn't work like that. Wikipedia is very decentralized. Many users will simply never bother to read the Manual of Style or other policy pages. They may neither know nor care that such documents exist and still be useful editors.
Personally I feel it is useful for the policy and style pages to describe current practice. I also feel it is strange to prescribe something that is decidedly not current practice. A recent example is the ban on 'þ' and 'ð' on WP:MOS. Some people interested in the MOS thought this made sense and decided to insert it. But in actual practice 'þ' and 'ð' are used in the titles of the articles on every Icelander who has those letters in their name. As far as I am aware, anyhow. So for a while the MOS prescribed something that was diametrically opposite to what was actually being done.
That's one reason I removed that paragraph.
I think it's very useful to know what the current practice is. Things like: "We don't usually use diacritics in country names but we normally do use them in city names." Things like: "We normally do use thorn but in some cases we replace eszett with 'ss'". This bottom-up way of writing style guidelines seems to me to have a place on Wikipedia alongside more traditional centralized discussions. Thoughts? - Haukur Þorgeirsson 16:13, 14 October 2005 (UTC)
I agree to a large extent. It's problematic and likely futile to try to overturn what appears to be an established practice merely by making an edit to a style guide: all that will be accomplished will be that the editors of any articles affected will become aware of the existence of the style guide and edit it back (and if it truly is an established practice, it is very likely they will be a majority). In this sense, a style guide is often more descriptive than prescriptive: it helps editors to understand established practices and avoid inconsistency. -- Curps16:34, 14 October 2005 (UTC)
Exactly! But there's also a more subtle point. The general opinion of people interested in policy pages may differ from the general opinion of people editing a certain subset of articles. In those cases it can be important not to alienate people doing useful work with heavy-handed legislation from above. - Haukur Þorgeirsson22:06, 14 October 2005 (UTC)
No, I disagree. For example there are two views on the policies and the guidelines. If one comes across a page which is way out of kilter then one uses the guidelines to steer the page back onto the strait and narrow, and if one has a pet page where someone comes in and starts quoting the WP policies and guidelines then they are using them as a bludgeon. It depends on the end of the stick one is holding. For months I ignored this area because I did not have an interest in it, until someone renamed a page I had written, stating that as Zürich was the page name any other page with Zurich in its title should be called Zürich.... That led to WP:RM which led here.... So usually, whether one wants to or not, if one is an active editor one ends up in the Wikipedia: pages.
I do however think that, whether the statement "This guideline is supposed to drive Wikipedia usage, not be driven by it" is correct or incorrect, the adopted style of Wikipedia guidelines is not reportage or descriptive but prescriptive, even if it is only reporting what is happening, and that the edits which were removed were tending to the descriptive, not the prescriptive. Philip Baird Shearer 22:34, 14 October 2005 (UTC)
You make some good points. I too feel I got dragged into this when someone suddenly wanted to move Baldr to Balder - and then Höðr to Hodur and Lóðurr to Lodur. I sympathize with you on the battles of Zurich. This is what I'm getting at above - don't alienate people happily working in their little corner. If some military history geek wants to call it "Battle of Zurich" then it might be best to leave him alone. I do think descriptive style guides are useful, as per Curps. I want to know what's currently being done and where the controversies are. That equips me to make intelligent decisions on what to do myself. - Haukur Þorgeirsson
No, what happened was one admin suddenly moved the article to Baldr and several people objected. After much discussion, it was moved to the most common English name of Balder. Through that discussion, people discovered lots of articles had been moved away from English names, under the guise of a little-advertised renaming project to standardise things that came from Old Norse on Icelandic spellings, even if it means using characters that aren't in the modern English alphabet. For some of those articles, attempts were made to move them back, but huge efforts were made to recruit folks from Icelandic and other Nordic projects to vote, and now most page move discussions end in no consensus. [- CDThieme]
This is a misrepresentation of facts and an unwarranted attack on a good admin. Wiglaf asked on Talk:Baldur whether people thought the article should remain at the Modern Icelandic name Baldur (where it was created in 2001). Four months passed without comment so he felt free to move the article to Baldr (the unmodified original name, used in any number of English texts). Half a year later there was a proposal to move the page to the anglicized form Balder - where it had never been before. The "little-advertised" standard you speak of (to use the original Old Norse names in most cases, instead of choosing between the various anglicizations) was advertised on Wikipedia:Naming conventions, on Talk:Norse mythology and I've been advertising it on my user page for months. Those who were actually editing articles on Norse mythology knew about it. People coming in through WP:RM in a crusade against diacritics may not have. The vote on moving Höðr to Hodur went 20 to 8 in favor of Höðr. Yes, I advertised that vote but so did those in favor of the move. - Haukur Þorgeirsson09:21, 15 October 2005 (UTC)
And it is you who is advocating Icelandic spellings. You said you were fine with a move to Baldur (the Modern Icelandic name) and you supported a move to Hodur. The only way to go from Höðr to Hodur is by going through Modern Icelandic Höður and stripping the diacritics. - Haukur Þorgeirsson09:31, 15 October 2005 (UTC)
I wasn't moving for a vote - I wanted a discussion! :) This policy page is about using English names, not about using the English alphabet, using the English language, using English date conventions, using English sausages or anything else. There are other policy pages for that where appropriate. I know you probably do want a policy saying we should use the English alphabet but currently this policy page does not say that we should. - Haukur Þorgeirsson13:29, 15 October 2005 (UTC)
You seem to have overlooked the first part of this article's own name, Haukur. It's not just about "using English names" but also a subpage of Wikipedia:Naming conventions, so it only deals with the use of English names in one specific context:
"Naming conventions is a list of guidelines on how to appropriately create and name pages."
The current title better reflects that, and is less likely to be misunderstood, than your proposal.
Template for articles with accented characters (also ß etc.) in titles
I didn't know exactly where on this page to put this announcement, so I'm putting it here. I have created a template to draw users' attention to the fact that a title, such as Großglockner contains a foreign character, what that character is, and what the title would be if one used only the "standard" 26 unaccented English letters. I dare say the wording is not perfect, but it's a step in the right direction, I think. I hope this can be seen as a compromise between rigidly using only the 26 letters and using unfamiliar characters.
I suggested something similar (but less specific) at the MoS page: Wikipedia talk:Manual of Style#Variant spellings/names in one place?. I would like to see this applied to a wider range of "alternative names"—the straightforward transliteration -ß- to -ss- is far from the most difficult problem we are facing. The template (or, even if we don't use a template, just the convention) should be applicable to Höðr as well. I would expect something like this at the top of the article: "Höðr is also given in English texts as Hödr, Höthr, Hödhr, Höd, Höth, Hödh, Hodr, Hothr, Hodhr, Hod, Hoth, Hodh, Hödur, Hodur, Hother, Höder or Hoder." A simpler example would be Paul Erdős (commonly(!) spelt in English(!) texts with a Hungarian double acute accent), which ought to inform the reader that he can be spelt (in English text) with just an umlaut, like this: Paul Erdös, as well as without any umlauts at all: Paul Erdos. Finally, the Hungarian form Pál Erdős must be mentioned as well. In short, I like your suggestion but would love to see a concrete wording that encompasses more variant spellings and names. Arbor 08:52, 17 October 2005 (UTC)
Indeed, the template as currently conceived can only cope with a single strange character, but it would make sense to include all possible (or likely) spellings (and mis-spellings). I'm not sure that a simple list of alternatives is enough, though; a short explanation of why the name is sometimes given differently (because of unfamiliarity with, mistrust of, or prejudice against characters outside the English "ABC") is necessary, and is what I was aiming at with the template. In particular, I think linking to the unusual letter (e.g. ő), thereby giving details of who uses it when, and how it's pronounced, is vital. --Stemonitis 09:01, 17 October 2005 (UTC)
To get back to our actual role here, the naming of articles, the fact that the Paul Erdös spelling is much more common in English than Erdős (with any variant of first name) means that it is absolutely essential to have redirects from Erdös–Borwein constant as well as Erdos–Borwein constant to Erdős–Borwein constant. There are at least half a dozen other article names with similar problems. Shouldn't there be more about this on this project page? Gene Nygaard13:15, 17 October 2005 (UTC)
I like the idea of linking to the letter in question, like ő (provided this is appropriate). To elaborate, I am somewhat split between two conflicting mission statements. Our aim is to spread knowledge. The page at Carl Friedrich Gauss should tell about a famous mathematician, giving lots of detail that "normal" people might not care about. For example, I can learn that his brain weighed 1,492 g. Does everybody care? Certainly not. But this is an encyclopaedia, which I visit in order to have knowledge thrust down my throat. So how should we spell him? His name is Gauß, after all! Answer 1: use the English transliteration. This is how he is normally called (out of typographical laziness or sound convention, it makes no matter), and people should not be burdened by the extra information about how his name is written in German. After all, to most people ß looks like B. Answer 2: Obviously, he should be spelt correctly (Gauß), learning this single letter once and for all takes half a second, and is very much in line with the reader's apparent goal to inform herself about German culture. This is an encyclopaedia, for crying out loud! If people object to actually learning something maybe they should switch on the telly instead.
Now, I think that both answers are right. But they obviously disagree, and both do it for the same reason: both want to disseminate information. That's why I have a hard time making up my mind (I can find half a dozen equally valid objections to both positions—let's not repeat them here). But no matter how I (and the rest of WP) will eventually decide on this matter, the idea of linking to ő seems to be sound. Arbor11:48, 17 October 2005 (UTC)
I think you've both made some very good points here, Stemonitis and Arbor, and I like the new template. I think it looks cool on Großglockner. Now, it is true that people come to an encyclopedia seeking knowledge, i.e. they come here to learn something they don't already know. On the other hand people are often more interested in information about subject X than in information about the name of subject X. Thus I think we should be careful not to drown the reader in the latter when she arrives at a page. For example I think it would look a bit cluttered to list all the possible anglicizations of Höðr at the top of the article, let alone if I were to outline the rationalization for each. That's why I put this information in a separate section and linked to it with a footnote. Hopefully a reader bewildered by the name will click on the footnote and be taken to the information. There are now also redirects in place from the variants and pronunciation information (IPA and sound files) in the section about the name. I welcome all suggestions on how we can make the article as accessible as possible. It's something of a test-case, especially since it got a lot of votes in its WP:RM.
Now, I'm definitely of the opinion that we ought to distinguish between Strauß and Strauss, as well as between Þorsteinsson and Thorsteinsson. If we don't we're just throwing away information. Our challenge, for the best encyclopedia in the world, is to combine accuracy with accessibility. That problem can't be defined away with a policy decision - it will take hard work. - Haukur Þorgeirsson12:45, 17 October 2005 (UTC)
Sorry to say but I don't really like this template until we have definitely resolved the "ß" issue, because it gives a free-for-all to start moving and changing every article with an "ß". For example I found this template on the Straße des 17. Juni. Sorry to say but to imply that changing it to Strasse des 17. Juni is alright is wrong. Unless the Kulturminister come up with a new law (again) to abolish the "ß", Straße is always written with an "ß", not "ss". The pronunciation would also change in that case, making the issue even more complicated. Gryffindor 14:32, 22 October 2005 (UTC)
We have been quite careful to make the template state that only where there is some overriding reason is "Strasse des 17. Juni" (for instance) to be used, and that where there is no good reason the ß should be kept in place. It's difficult to explain all that in just a couple of lines of text, but I think the current solution is quite effective. Your statement that "Straße" is always spelt with ß is (as you know) not entirely true. In capitals, in URLs, and other places where ß is not available, "ss" is used. That's all we're trying to say in the template, not that "Strasse" is an acceptable variant for everyday use, just that there are times when the reader might see it written differently. --Stemonitis14:44, 22 October 2005 (UTC)
Feel free to suggest a new wording, we've gone through several already. But the fact is that many non-German texts will transliterate the name, even when there are no typographical reasons to do so. We're trying to report on that. Wikipedia can describe usage but it doesn't have the power of prescribing it - except insofar as we have a tiny influence through the forms we choose to use ourselves. Thus, by our using of the ß in the name we imply that this is the form we find to be more suitable for use in a reference work. - Haukur Þorgeirsson20:36, 22 October 2005 (UTC)
Google returns:
about 20,900 English pages for "Straße des 17. Juni" -Strasse
about 38,500 English pages for "Strasse des 17. Juni" -Straße
Now that is not to say that "Straße des 17. Juni" is wrong in English but why should "Strasse des 17. Juni" be defined as wrong as it is about 1/3 more popular in a Google search than "Straße des 17. Juni"? I am not saying that a Google search should necessarily be the arbitrator of such decisions, but neither should foreign rules on such things.
Why should "Straße des 17. Juni" be "more suitable for use in a reference work" than "Strasse des 17. Juni"? Why not keep it simple and use the most common English spelling in all cases where there is a clear majority? Or perhaps, to encourage new article development, leave it in whatever form the primary author of an article (which is not a stub) has written it, unless there is another clear guideline for change, as we do with the BE/CE Wikipedia spelling divide. Philip Baird Shearer 21:03, 22 October 2005 (UTC)
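As a quick arithmetic check on the two Google hit counts quoted above, the following minimal sketch computes the relative share of each spelling. The counts are the ones reported in the comment; the variable names are illustrative only, and hit counts remain a rough measure at best.

```python
# Hit counts as quoted in the discussion (English pages, each form excluding the other).
eszett_hits = 20_900   # "Straße des 17. Juni" -Strasse
strasse_hits = 38_500  # "Strasse des 17. Juni" -Straße

total = eszett_hits + strasse_hits
strasse_share = strasse_hits / total   # fraction of combined hits using "ss"
eszett_share = eszett_hits / total     # fraction of combined hits using "ß"
ratio = strasse_hits / eszett_hits     # how many times more often "ss" appears

print(f"'ss' share: {strasse_share:.1%}, 'ß' share: {eszett_share:.1%}, ratio: {ratio:.2f}")
```

On these figures the "ss" form accounts for a little under two thirds of the combined hits, which is roughly 1.8 times as many as the "ß" form.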
There aren't any foreign rules involved. There is a foreign name, which is "Straße des 17. Juni". People will expect an encyclopedia to use the more pedantically correct option. That is how we best serve our readers. In any case our template does not define the 'ss' spelling as wrong. - Haukur Þorgeirsson21:26, 22 October 2005 (UTC)
I think you are on thin ice. Are you suggesting it is more pedantically correct to refer to a name in China in Mandarin? If not, why is it more correct to use "ß" instead of "ss" when the guidelines say use English and common English usage probably would come down on the side of "ss"? However I think that the most important point is that both spellings appear on a page, and the template will encourage that. It is just that I do not want the template to be an excuse for not using English, and the template should not be worded in such a way that it implies that "ß" is "more correct" than "ss" in English. Unless your (inevitable) answer puts a burr under my saddle, I will not reply to anything more on the template here, as I doubt I will add anything to the conversation I have not already said many times before ;-) -- Philip Baird Shearer 21:45, 22 October 2005 (UTC)
Burr! :) Mandarin is one of the languages spoken in China. Many article titles on Wikipedia carry names from that language. We transliterate them in the titles, though. I guess I would call the use of Chinese characters more pedantically correct than the use of transliterations - but if I were to fight on that battlefield I'd probably jump into the trench for the most pedantically accurate transliteration method, with tonal marks and the rest. But this is a very different issue (and one where I'm completely out of my depth). For one thing the Chinese character script is very far from providing a phonetic or phonemic representation of the Mandarin language (or other languages it is used to write). And I'm not sure "transliteration" is even a good word for something like Hànyǔ Pīnyīn. The usual word, I gather from Wikipedia, is "romanization". Back to our template I quite agree that it shouldn't imply that the transliterated variants are incorrect. We should just say that they are commonly used where post-ascii characters are unavailable or, well, where people just don't want to use them for whatever reason. That's reality and we should describe reality. As I've stated many times before I assign far less importance to the "most common" guideline than you do, especially when it manifests in Google tests. That guideline is often sidelined, even in article titles where diacritics or foreign names are no issue. Take Wallis, Duchess of Windsor. By the way, I had a lovely 北京烤鸭 in Chinatown tonight :) - Haukur Þorgeirsson22:11, 22 October 2005 (UTC)
An attempt to build a consensus
Clearly, as the straw poll shows, the Wikipedia community is at present not able to build a consensus over the use of diacritics in article names. Over the last year this issue has wasted a lot of time for a lot of Wikipedia editors. So I propose to reduce conflict by adding the following to the WP:UE page:
Words with diacritics need not be respelled to contain only the 26 letters of the English alphabet, nor vice-versa; for example, either Zurich or Zürich is acceptable. If agreement can not be reached over the spelling of a word, then consider following the spelling style preferred by the first major contributor (that is, not a stub) to the article. If more than one spelling is commonly used in English then all commonly used spellings should appear on the page, with the most common spellings in the first section.
The first sentence is an adaptation of the current WP:UE "American spellings need not be respelled to British standards nor vice-versa; for example, either Colour or Color is acceptable." The second WP:MOS#National varieties of English "If all else fails, consider following the spelling style preferred by the first major contributor (that is, not a stub) to the article." The third is an attempt to adapt the WP:UE phrase "However, any non-Latin-alphabet native name should be given within the first line of the article (with a Latin-alphabet transliteration if the English name does not correspond to a transliteration of the native name)"
I don't think it's very clear, and it also seems to be attempting to go beyond the scope of the authority of us on Wikipedia:Naming Conventions: a list of guidelines on how to appropriately create and name pages.
There are significant differences in the application of a "first major contributor" rule here, as compared to national varieties of English. I'd throw that idea out right away. Gene Nygaard20:34, 6 November 2005 (UTC)
It also doesn't deal with the issue of what names with diacritics do to things like indexing of categories, especially when it involves the first letter, to a lesser extent for the next few letters. Gene Nygaard20:40, 6 November 2005 (UTC)
First of all, categories can be handled very simply: an accented letter sorts the same way as an unaccented letter. Thus for instance for Úbeda we use:
[[Category:Municipalities in Spain|Ubeda]]
This is pretty standard, it's a solved issue.
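The sort-key convention described above can be sketched in a few lines of Python. This is only an illustration, not part of MediaWiki: the function names `category_sort_key` and `category_link` are hypothetical, and the approach (Unicode NFD decomposition followed by removal of combining marks) is an assumption about how one might derive the unaccented key automatically.

```python
import unicodedata


def category_sort_key(title: str) -> str:
    """Return the title with combining accents removed (hypothetical helper).

    Letters are decomposed into base letter + combining mark (NFD),
    then the combining marks are dropped.
    """
    decomposed = unicodedata.normalize("NFD", title)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def category_link(category: str, title: str) -> str:
    """Build a category link in wiki markup that uses the unaccented sort key."""
    return f"[[Category:{category}|{category_sort_key(title)}]]"


print(category_link("Municipalities in Spain", "Úbeda"))
```

Note that this only handles letters that decompose into a base letter plus an accent. Characters such as ß, þ, ð and æ have no such decomposition and would pass through unchanged, so they would still need an explicit sort key chosen by hand.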
Regarding Philip's proposal, I'll just copy what I wrote on his talk page:
The "spelling style" refers only to British vs. American spellings such as "color" vs. "colour", and normally applies to words within an article, not to the title (since titles are proper names, there will usually be a definitive official spelling). The reason we have to be flexible here is because of national sensitivities, we don't wish to favor one dialect of English over another.
However, in cases where no national sensitivities or characteristic dialect differences are involved, then Wikipedia, like any other publication, uses a style guide and in general we expect consistency of usage. For instance titles that are created in all-lowercase by the original author are modified to the capitalized versions. Titles created with the disambiguation "(movie)" are modified to use "(film)" instead, and we don't have a case-by-case vote for each and every one.
In the case of diacritics, it is not the case that British and American usages diverge, so it is really purely a style-guide issue, and we need to have a consistent style, not case-by-case voting. Unanimity would be wonderful, but failing that we have a healthy majority in favor of titles with diacritics and the recent WP:RM vote at Úbeda reflects that once again.
We need to have one consistent style for routine non-famous names, with debate only needed in exceptional cases (eg, country names like "Mexico", which as far as I know nobody has ever proposed to rename). Case by case voting or random inconsistency over thousands of articles is really not at all acceptable.
The "foreignchar" template was introduced by someone as an attempted compromise. Gene, wouldn't diacritic title + foreignchar + properly-sorted-category be an acceptable solution to you? -- Curps21:11, 6 November 2005 (UTC)
Well, I don't understand why we should use the national spelling and not the most common English spelling (as recorded by, e.g., the OED) in the case of Zurich/Zürich and not in the case of Mexico/México. Personally I am in favour of using the most common English spelling throughout (e.g., Zurich without umlaut, Liège with accent) and to mention the national-language spelling(s) (and also the transcriptions of non-Latin-alphabet spellings if any) in the first paragraph. - Tonymec00:19, 7 November 2005 (UTC)
We've been debating this since April or earlier. The result has consistently been a significant majority in favor of using diacritics in titles, at times as high as 70% to 30%, currently approximately 60% to 40%. The issue now is simply, how do we move forward. This is a style-guide issue and we need some consistency, it's simply not acceptable to have thousands of individual votes at every single page. We can debate the "famous" or exceptional cases like Zurich/Zürich, but routine cases should follow the general rule. In a recent case where a vote was nevertheless held (Úbeda), there were twice as many votes in favor of the diacritic as against. It's time to move on. -- Curps01:24, 7 November 2005 (UTC)
We essentially agree on the issue, Curps, but I have some procedural concerns. To begin with the poll is currently running at less than 60% in favor and has been for some time. Even the most watered down definition of consensus on Wikipedia is a 60% supermajority. Nevertheless it is true that we really do need a policy here and majority rule, brutal though it is, may be one viable option - but then we must try to take the concerns of the large minority into account one way or another. There must be some attempt at compromise if we don't want an open revert war on the policy page.
Now, personally, I think the currently running poll is flawed. Taken literally it would indeed make México out of Mexico, Panamá out of Panama and România out of Romania - something which doesn't really have much support at all. It would also, on the face of it, "settle" the Zürich/Zurich issue - even though a poll on that specific name was deadlocked at 31/31 last time I checked. And yet it seems very clear to me that we should have a policy to settle "routine" issues like Úbeda and there we do probably have a sizeable majority in favor of retaining the "foreign" character (though it is worth remembering that the article was originally moved to Ubeda through a vote).
I doubt people have patience for another poll, but if I could go back in time and change the wording of this one I would include an exemption for country names and capital cities in it. The support for having the article on the capital city of Latvia at Rīga rather than Riga may not reach the support for the measure as a whole. Personally I'd even be happy to concede Reykjavík (my preferred location) to Reykjavik if it meant I'd be left alone with Seyðisfjörður and Þorlákshöfn.
The poll also has the disadvantage of lumping together French characters like 'é', which will display correctly with just about every computer system used to read Wikipedia, and Vietnamese characters like 'ồ', which will fail to display on many systems. Personally I'm fine with one rule for both - but many people are not, as you can see on the recent vote at Talk:Ho Chi Minh.
There are many complex problems involved. However, the facts on the ground as well as the majority in the poll do strongly favor the use of "foreign" characters. It would be nice to see some acknowledgment of that in the policy pages.
And if we are going to rewrite this page then we should rewrite it so that it actually means something - and the same thing to different people. Currently I read the page, see that Spanish names don't need transliteration, and come up with Úbeda. Philip reads the page and sees that we should use the "most common English name". He does some Googling and comes up with Ubeda. Who's right? Both or neither since the policy does absolutely nothing to clarify when a "common English name" exists and when it does not and yet, this concept of "common English name" is what the policy is based on!
As it is currently written the policy is worse than useless since all sides of a naming dispute can use it to justify their preferred version. Let's try to come up with something more concrete. - Haukur Þorgeirsson20:43, 8 November 2005 (UTC)
I would also warn against using the first major contributor criterion. Many articles about towns in German-speaking areas (for instance) will have been written by native German speakers, who often assume (I know from experience) that either English-speaking readers, or the wiki software, will not be able to cope with such outlandish German letters as ß, ä, ö and ü, and so change everything to the spellings that they think English speakers will be more comfortable with. I'm sure that they are mistaken; the software can clearly cope, and I think that English speakers can also cope with foreign letters every so often. This phenomenon is no doubt not restricted to German. All in all, this means that articles will be created at a spelling that the first major contributor knows to be wrong. --Stemonitis08:58, 7 November 2005 (UTC)
The "first major contributor" part is also entirely contrary to Wikipedia:No ownership of articles; the first contributor doesn't have any greater rights than later contributors. The "don't keep switching back and forth between British vs. American spelling" guideline simply reflects the fact that we don't favor one dialect or the other, but for issues that don't involve any such dialect differences, it's simply a style-guide issue where consistency is needed. -- Curps 15:19, 7 November 2005 (UTC)
It is not about ownership, it is about reducing conflict. The WP:MOS#National varieties of English includes the following: "If an article is predominantly written in one type of English, aim to conform to that type rather than provoking conflict by changing to another." and "If all else fails, consider following the spelling style preferred by the first major contributor (that is, not a stub) to the article.". I see little difference between National varieties of English and, given that the Wikipedia community is split over the issue, changing words to have or not have diacritics. --Philip Baird Shearer 10:36, 17 November 2005 (UTC)
Any normal publication in English would consider the issue of American vs. British spelling to be a style issue where consistency is needed. Since we have our peculiar rule there we could have a peculiar rule here too. I can't say I support Philip's proposal, though. I could maybe accept some sort of freezing-the-status-quo rule for borderline cases like Zürich and Riga but I don't think it's viable in the general case. Either we include accents in the names of small Spanish towns or we don't. It would be very confusing and misleading for the reader to include them in some cases and not others - much more confusing than including both British and American spellings. Incidentally this is the current situation with ß - we have Weißenburg-Gunzenhausen on the one hand and Weissenburg in Bayern on the other. I'm not happy with that and I can't believe anyone is. - Haukur Þorgeirsson20:57, 8 November 2005 (UTC)
Any "normal" publication has a geographical center where its headquarters and most of its staff are located. The Economist uses British spelling, and the Wall Street Journal uses American spelling. However, Wikipedia must avoid favoring one dialect over the other because we welcome contributors from all over the world, so we make a conscious choice to live with inconsistency in this regard. However, there are no British vs. American dialect issues involved for diacritics. It's purely a style guide issue and consistency is needed. Famous names like Zürich and Rīga can be debated, but for the rest we need to stick to a general rule. -- Curps21:40, 8 November 2005 (UTC)
It is precisely the same issue as AE CE. Most of the people voting on an issue involving Polish or Icelandic, or wherever, who vote in favour of local diacritics tend to come from those countries (or have a strong interest in the local language). Sometimes they cross over into other debates but they are most passionate about their own patch. Most of the people not supporting the use of diacritics tend to come from countries which do not use them. Philip Baird Shearer 02:48, 17 November 2005 (UTC)
But diacritics are used routinely almost everywhere except English-speaking countries, so it's mainly the English speakers (and not even all of them) who oppose the idea of diacritic usage. The pro-diacritic side comprises almost everyone else. The reason there's a stalemate is that the English speakers (i.e. speakers of no other language) outnumber the internationals on the English Wikipedia. A sensible solution would seem to be that (in the absence of a widely used English name, e.g. Vienna), we leave it up to the locals: Americans would decide what is the correct spelling of American towns, the British decide about British towns, the Spanish about Spanish towns, the Germans German ones, the Polish Polish ones, and so on. What right have we English speakers to impose our spelling rules on foreign placenames? (And before someone suggests it, no, of course this doesn't mean using Arabic or Chinese writing systems, etc.) --Stemonitis 07:16, 17 November 2005 (UTC)
It's not as simple as just a question of who comes from what linguistic background. My native language (French) uses characters like â à Ç ç É é ê è ë î ï ô œ Œ û ù which have been called "funny French squiggles" elsewhere on this page. However I don't push systematic use of diacritics. This is the English-language Wikipedia and I am in favour of using the most common English-language form in it; when I have a doubt, I check some respectable source like the unilingual dictionaries published by the Oxford University Press, where I find "Zurich" (with no umlaut) but "Düsseldorf" and "Liège" (with diacritics) with no mention of a different spelling in each case. - Tonymec12:23, 19 November 2005 (UTC)
This is an English (language) Wikipedia, not an International Wikipedia, so why not use the most common English spellings and the ones that most English readers find the most comfortable? This is the major reason for the AE CE compromise: it is not that Americans can not read "colour" and know what it means, it is because it looks wrong, just as "color" looks wrong to New Zealanders. Most French and German diacritics are easily ignored, but the more outlandish the language the more difficult it is to ignore them and the more jarring they are. I know from the acres of text written on this that I am not going to convince you. Which is precisely why I am suggesting a compromise (just as is done in WP:MOS to reduce conflict):
Words with diacritics need not be respelled to contain only the 26 letters of the English alphabet, nor vice-versa; for example, either Zurich or Zürich is acceptable. If agreement can not be reached over the spelling of a word, then consider following the spelling style preferred by the first major contributor (that is, not a stub) to the article. If more than one spelling is commonly used in English then all commonly used spellings should appear on the page, with the most common spellings in the first section.
If most articles about Spanish towns in en.wikipedia.org are written by people who use diacritics then diacritic usage will predominate for Spanish towns. -- Philip Baird Shearer10:15, 17 November 2005 (UTC)
I have some reservations about that, but it seems a reasonable compromise for this situation. I do not, however, accept that "reader comfort" is a good criterion. Wikipedia is here to inform, not to comfort. --Stemonitis10:37, 17 November 2005 (UTC)
Looking around on en.wikipedia.org, it seems very clear to me that it is indeed an International Wikipedia written in English. Why else would we consider having detailed articles about the road system in Germany or any old small town in Spain with "29% of employment in the service-sector"? I also rather resent any language being described as outlandish; it seems xenophobic. I can appreciate that there are sometimes technical problems with old browsers, but for people with good fonts in their browsers (and that's an ever increasing number of the reader base) let me ask the following question: Why is the French é more reasonable than, let's say, the Latvian ī or indeed the Vietnamese ồ? Stefán Ingi 15:08, 17 November 2005 (UTC)
Outlandish is not xenophobic; the further a language's origins are from English the more outlandish it tends to be. It is a two-way street: the further English is from a language the more outlandish English will seem to a native speaker of that language. Written representations of languages tend to follow a similar pattern. --Philip Baird Shearer 13:50, 19 November 2005 (UTC)
I'm not quite sure how the above proposal addresses the instance of page names, which can be established long before the first major contributor shows up. I'd argue that in the instance of naming a page, the first major contributor is actually the page's creator. Is that what is meant by this proposal? Hiding talk 17:39, 17 November 2005 (UTC)
No, you are misunderstanding me. That guidance from the MoS does not apply to article names, only to article text. I am asking you to explain how your proposal addresses article names, something the MoS does not address. As I say, in choosing the name of an article, can you tell me why the first person to name an article on the topic is not the first major contributor? What else could be more significant in deciding the name of an article? I would propose that to clarify this point we choose the name of the oldest article on the place, whether that includes diacritics or not. Hidingtalk15:36, 19 November 2005 (UTC)
I do not think it would be in the Wikipedia community's interest for a naming convention and the MOS to have different specifications; it makes more sense to coordinate them. Also the last thing we need is 1,000s of stubs created with or without diacritics to lay claim to "preferred" names. "first major contributor (that is, not a stub)" usually works well for AE CE spelling and I do not see why it would not work for page names, as that is the effect it has on AE CE page names (see Petrol and the heated debates over the name). Besides, in the future a section may be added to the MOS about diacritics usage within a page, in which case it definitely should be coordinated with the AE CE divide. --Philip Baird Shearer 19:53, 19 November 2005 (UTC)
I do not see why the current common English guideline can not work just as well then. It's worth remembering that these are only guidelines, and a consensus on an article's talk page for a move to a title with diacritics is strong enough to allow non-conformity. I can't see the problem your proposal is attempting to fix. As there is no section in the MoS on this issue, there is no non-conformity between the two, and it isn't ultimately necessary to have the two conform, since they address different areas, the MoS addressing article text and the naming conventions addressing article names. I also stand by my reasoning above with regards to the flaws in your proposal. Hiding talk 14:04, 21 November 2005 (UTC)
We have been discussing this for half a year and I am more than sick of it by now. Not because there are different opinions, but because there seems to be no progress. In November, we still get proposals containing the phrase "the English alphabet". I have tried to show several times that the question can be broken down into several parts: in deadlocks like this, looking for and agreeing on sub-questions is the only way forward. My attempts at differentiation were largely ignored, and I'll not re-submit them, because they apparently tend to blow the "the English alphabet has 26 letters, what is there to discuss?"-mindset. A "first major contributor" rule is a horrible solution, and a testimony to the sad state of things on the BC/CE front, and certainly not something to be emulated. For the time being, I'll try to just live with the fact that there can be no progress anytime soon. Maybe the {{foreignchar}} template is not such a bad thing, as silly as it looks. I used to hate templates that seem to assume the reader is a moron, but Wikipedia tends to be so littered by these that I stopped being annoyed at them. dab (ᛏ) 16:04, 21 November 2005 (UTC)
Because, Hiding, it cuts both ways: "a consensus on an article's talk page for a move to a title with no diacritics is strong enough to allow non-conformity". We all spend hours and hours deciding what the consensus is on each individual page with no guideline, hence my suggestion for a compromise. See Ubeda for a change both ways and far more effort expended on a name than the text of the article (most of which I translated from the Spanish page months and months ago). To say that there is no connection between the MOS and NC is just not true, because the MOS influences spelling (and spelling consistency within a page) and the NC dictates page name on the first line, so as I said before look at Petrol (and Tram and orange (colour)) to see how the MOS influences page names. --Philip Baird Shearer 20:18, 22 November 2005 (UTC)
Philip, I'm well aware of the workings at those articles. If people wish to invest hours and hours deciding a consensus on where a page belongs, they will do so regardless of what the naming conventions and the MoS dictate, these are merely fall back positions for when no consensus can be reached. The fall back position in this case, as I have stated above, is, in my opinion, fine as it is. There is obviously a connection between the MoS and the naming conventions. However I will allow you to beat me into submission and retire from this debate because we are going around in ever decreasing circles. Hidingtalk09:31, 23 November 2005 (UTC)
More attempts to build a consensus - transliterations
I agree with Philip. One way or another it's time we started acting like adults and come to some sort of compromise here. Spending hours on different requested move pages rehashing the same old arguments is not the way forward (I'm as guilty of this as anyone).
I also agree with dab that we should try to break the problem down into more manageable pieces. Philip has been campaigning for the inclusion of transliterated versions where article titles use "foreign" characters. That's one part of the larger question and now I'll try to break it down into even smaller questions:
Should we consistently provide transliterations for characters which are not obviously derived from the 26 letters of the English alphabet (þ, ð, ß etc.)? [Yes, I suppose. I think 'foreignchar' or something similar is good for this.]
Should we consistently provide transliterations for ligature characters (æ, œ etc.)? [I don't know. Those two ligatures, at any rate, seem to be reasonably well known to readers of English.]
Should we consistently provide transliterations when the most "correct" transliteration is different from the most "obvious" one ('ü' becomes 'ue' rather than 'u', 'þ' becomes 'th' rather than 'p' etc.)? [I suppose we should. But when the obvious version is common too, as in ü > u, that possibility can be mentioned as well.]
Should we provide transliterations when they simply involve stripping diacritics (the infamous Úbeda > Ubeda)? [I'd rather not, really, but I can swallow it if it helps build consensus. I certainly feel that the foreignchar template is overkill in cases like that.]
Should we handle characters differently based on their availability in common fonts? Is it fair to privilege e.g. Icelandic characters like 'þ' and 'ð' above Vietnamese characters like 'ễ' and 'ố'? [I have not formed an opinion. Obviously we want to be accessible to as many people as possible but I can sympathise with the Vietnamese editors who feel this is a raw deal for them.]
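The distinction drawn in the questions above, between letters that need a real substitution (þ, ð, ß), ligatures (æ, œ), the "correct" versus "obvious" treatment of 'ü', and plain diacritic stripping, can be made concrete with a small sketch. Everything here is illustrative: the mapping tables, the function names and the `german_umlauts` switch are assumptions made for this example (in particular, rendering ð as "d" is only one of several conventions), not an agreed Wikipedia rule.

```python
import unicodedata

# Hypothetical substitution table for letters not derived from the 26 basic
# letters, plus the two common ligatures. The choices are assumptions.
SPECIAL = {
    "þ": "th", "Þ": "Th",
    "ð": "d", "Ð": "D",
    "ß": "ss",
    "æ": "ae", "Æ": "Ae",
    "œ": "oe", "Œ": "Oe",
}

# The "correct" German convention (ü > ue), as opposed to the "obvious" ü > u.
GERMAN_UMLAUTS = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}


def strip_diacritics(text: str) -> str:
    """Drop combining marks after NFD decomposition (Úbeda > Ubeda)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def transliterate(text: str, german_umlauts: bool = False) -> str:
    """Apply the substitution table, then strip any remaining diacritics."""
    table = dict(SPECIAL)
    if german_umlauts:
        table.update(GERMAN_UMLAUTS)
    replaced = "".join(table.get(ch, ch) for ch in text)
    return strip_diacritics(replaced)


for name in ["Úbeda", "Straße des 17. Juni", "Þorlákshöfn", "Zürich"]:
    print(name, "->", transliterate(name))
print("Zürich", "->", transliterate("Zürich", german_umlauts=True))
```

The two calls on "Zürich" show why the third question matters: the same name yields two different transliterations depending on which convention is applied, and a template or redirect scheme would have to decide whether to mention one or both.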
You seem to be making the assumption that the article title should contain foreign characters in the name. What if we make the assumption that they do not? Then how does that affect your proposals (for example, does it throw out the foreignchar template)? --Philip Baird Shearer 14:38, 23 November 2005 (UTC)
I'm not really proposing anything, I'm just trying to define some questions which might be useful. Yes, the above only applies to article titles which have some "foreign" characters in them. When the article title uses an English/Anglicized name we have other, and probably easier, questions to answer. For example I think we're both fine with stuff like "Asgard (Old Norse: Ásgarðr)". - Haukur Þorgeirsson14:48, 23 November 2005 (UTC)
It is my impression that very few people, if any, advocate the use of one spelling in the title and another spelling in the article. If we're going to use accented and unusual characters, then we'll be using them in titles as well, since there's no technical reason not to. If you intend to use unaccented titles, then {{wrongtitle}} (or something similar) comes into play instead of {{foreignchar}}! I fail to see any advantage of having an unaccented title for an accented article. Next suggestion, please. --Stemonitis 14:46, 23 November 2005 (UTC)
Stemonitis have you read the previous acres of talk on this? It is not wrong to use words without funny foreign squiggles when writing or naming an article in English unless common usage dictates that foreign squiggles are used. Look at Talk:Zurich as an example. This is primarily a cultural issue not a technical one. That is why I am trying to find a compromise.
Haukur Þorgeirsson, I am also fine with "Höðr (often anglicized as Hod)". The problem is that once one tries to define what is acceptable, one runs into lots of points of view, for example: "Why is the French é more reasonable than, let's say, the Latvian ī or indeed the Vietnamese ồ?" (Stefán Ingi). So if common English usage is the guide for the Anglicization. --213.86.212.216 16:48, 23 November 2005 (UTC)
The English alphabet includes diacritics
Look up "outré" and "soupçon" in the dictionary (for instance Merriam-Webster (m-w.com), the Random House Dictionary of the English Language, or the American Heritage Dictionary (as used at dictionary.com)). These are English words for which the only correct spelling includes diacritics (according to the above three dictionaries and probably others).
Also consider the somewhat old-fashioned but still perfectly valid spelling variant "coöperation". Note that this "ö" is a purely English letter: the original French is coopération, with plain "o"!
Also consider our article Encyclopædia Britannica. There seems to be a conspicuous lack of urgency by anyone to "correct" this spelling to "ae" instead of "æ".
If "é" and "ç" are required to correctly spell certain English words, why wouldn't they be used in the names of French cities?
If "ö" is acceptable in English spellings that are not derived from any foreign spelling, then why wouldn't it be used in the names of German cities?
If "æ" and "œ" are acceptable spelling variations in English words that are derived from Latin, then why wouldn't they be used in names used in Norse mythology?
The argument that English spellings only use the 26 ASCII-conformant letters simply doesn't hold.
I more or less agree with you that these letters are probably familiar to most English readers. But saying that "soupçon" is the only spelling of the word is a bit too prescriptive for my tastes - a Google search yields twice as many pages for "soupcon". A lot of people agree with what Jimbo said years ago: "My perspective is that if I don't see it on my keyboard, and if I didn't sing it in the alphabet song, it's 'fancy' and therefore should be avoided." That would make æ, œ, ö, ç and é fancy. Very well. But many names - including some from English-speaking countries - do include those fancy letters and in those cases I think we should use them in the relevant article titles. That's what we're currently doing and that seems to be the majority opinion around here - but our policy is far from explicit on the point. Those who give primacy to "most common spelling" arguments and brandish Google as their weapon have a lot to stand on in our policies. They understandably get frustrated when they have to yield again and again.
It's also important to remember the context of a letter. The English 'ö' and the German 'ö' are not really the same - and the latter is often transliterated with 'oe'. - Haukur Þorgeirsson 09:52, 23 November 2005 (UTC)
The ö in coöperation isn't really a different letter. It is a diaeresis rather than an umlaut. It is also rarely used any more.
That usage is, of course, one reason an umlauted o is more acceptable in English usage than the thorn, the edh or the eszett. Gene Nygaard 14:33, 23 November 2005 (UTC)
The process of Anglicization can take time, and for some borrowed words it will never take place. French meues being one example (although I understand that even that may be changing [5]) and "marché ouvert" another.
BTW dictionary.com has outre as well as outré (and this link). Also, I think for such discussions only the OED will do. --Philip Baird Shearer 10:13, 23 November 2005 (UTC)
Consulting the OED will probably only add dozens of English words where the spelling uses diacritics. The best you could hope for is that "soupcon" might be listed in the OED as a rare spelling variant... so rare that widely used unabridged dictionaries don't list it at all. -- Curps 17:33, 23 November 2005 (UTC)
The body of such an article, preferably in its first paragraph, should list all of the other names by which the subject is known, so those too can be searched for.
to this:
The body of each article, preferably in its first paragraph, should list all common names by which its subject is known, so those too can be searched for.
I think the second version is clearer since it is completely unclear what "such an article" was supposed to refer to. This is a guideline that should apply to every article.