Re: [Wikitech-l] The lang tag in <HTML> ain't identical to $wgContLanguageCode


Gerard Meijssen-3
Brion Vibber wrote:

> Shinjiman wrote:
>> <html xmlns="" xml:lang="XXX" lang="XXX">
>> The lang (and xml:lang) attribute set on the HTML tag is not correct for some
>> languages, and this value is not supposed to be identical to
>> $wgContLanguageCode.
> Incorrect; it *is* supposed to be the value of $wgContLanguageCode, as by
> definition $wgContLanguageCode is the RFC 3066 language code for the language of
> the wiki's content.
> A reasonable case might be made that when variant display conversion is engaged,
> the lang attribute should be overridden.
>> For example there's no such language tag called "simple",
> Indeed there's not; that would be "en".
> Note that $wgContLanguageCode is not the same as the *domain name* or *interwiki
> identifier*. These are separate issues.
>> according to ISO 639, RFC 1766, RFC 3066 (R1, R2). Hence my previous patch,
>> submitted to Bug:5790. The main purpose of the patch is to add a new
>> language tag mapping for user interface languages that use an
>> incorrect language tag.
> That would be stupid and useless. Instead, use the correct code to begin with.
> -- brion vibber (brion @
The case for the simple wikipedia is indeed obvious. More problematic is
when you want to link a wikipedia that uses a code that will never be
accepted as a language code because it is considered a language family.
Or a code that is used for another language. Or a language where the
code is specific while the wikipedia uses it to indicate a larger
language "continuum". Another issue is that language codes are retired;
this leads to a different interpretation of the meaning of ku, fa
and several others (this is part of ISO-639-3).

Having meaningful links between the Wikimedia codes for interwiki links
and language codes is not trivial. For WiktionaryZ we are going to
standardise on ISO-639-3 and have CLEAR codes that identify languages
that are not recognised at present. One consequence is that the Babel
templates will use the ISO-639-3 codes as well.
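
To illustrate keeping the two kinds of code apart, here is a minimal
Python sketch of a lookup from interwiki identifier to language tag.
The table name and function name are hypothetical and only the
"simple" to "en" entry comes from this thread. The hard cases mentioned
above (language families, reused or retired codes) would each need a
hand-checked entry and are deliberately not filled in here.

```python
# Hypothetical table: interwiki identifier -> language tag.
# Only "simple" -> "en" is taken from the discussion; any further
# entries would have to be verified one by one.
INTERWIKI_TO_LANG_TAG = {
    "simple": "en",  # Simple English Wikipedia: the content language is English
}


def lang_tag_for(interwiki_id: str) -> str:
    """Return the language tag to use for an interwiki identifier.

    Assumption: an identifier without an explicit entry is already a
    valid language code, so it is returned unchanged.
    """
    return INTERWIKI_TO_LANG_TAG.get(interwiki_id, interwiki_id)


print(lang_tag_for("simple"))  # en
print(lang_tag_for("fa"))      # fa
```

The fall-through default is the weak point: it silently treats any
unknown identifier as a language code, which is exactly what goes wrong
for the problematic wikis.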

RFC 3066 says that it reserves tags for subsequent revisions of the
ISO-639 code. ISO-639-3 clearly states that its codes will not be
recycled, and that this principle will be maintained for any
future revisions of the code. It is therefore safe to use ISO-639-3.


Wiktionary-l mailing list