Revamping interwiki prefixes

Revamping interwiki prefixes

This, that and the other
Long message coming up... please be brave and take a look :)

This is a proposal to try and bring order to the messy area of interwiki
linking and interwiki prefixes, particularly for non-WMF users of MediaWiki.

At the moment, anyone who installs MediaWiki gets a default interwiki table
that is hopelessly out of date.  Some of the URLs listed there have seemingly
been broken for 7 years [1].  Meanwhile, WMF wikis have access to a nice,
updated interwiki map, stored on Meta, that is difficult for anyone else to
use.  Clearly something needs to be done to sort this out.

What I propose we do to improve the situation is along the lines of bug 58369:

1. Split the existing interwiki map on Meta [2] into a "global interwiki
    map", located on MediaWiki.org (draft at [3]), and a "WMF-specific
    interwiki map" on Meta (draft at [4]).  Wikimedia-specific interwiki
    prefixes, like bugzilla:, gerrit:, and irc: would be located in the map
    on Meta, whereas general-purpose interwikis, like orthodoxwiki: and
    wikisource: would go to the "global map" at MediaWiki.org.

2. Create a bot, similar to l10n-bot, that periodically updates the default
    interwiki data in mediawiki/core based on the contents of the global map.
    (Right now, the default map is duplicated in two different formats [5]
    [6], which is quite messy.)

3. Write a version of the rebuildInterwiki.php maintenance script [7] that
    can be bundled with MediaWiki, and which can be run by server admins to
    pull in new entries to their interwiki table from the global map.
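
To make step 3 a little more concrete, here is a minimal Python sketch of the
merge such a script might perform.  It is only an illustration (the real
rebuildInterwiki.php is PHP), and it assumes both maps are already available
as prefix-to-URL dictionaries with lowercase prefixes.  The function name
merge_interwiki and every URL below are hypothetical placeholders.

```python
from typing import Dict, List, Tuple


def merge_interwiki(
    local: Dict[str, str], global_map: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Return (merged table, list of newly added prefixes).

    Prefixes already present locally keep their local URL, so an admin's
    customisations are never overwritten by the global map.
    """
    merged = dict(local)
    added: List[str] = []
    for prefix, url in sorted(global_map.items()):
        key = prefix.lower()  # assumption: local prefixes are lowercase
        if key not in merged:
            merged[key] = url
            added.append(key)
    return merged, added


# Placeholder data; "$1" stands for the page title, as in the iw_url column.
local_table = {
    "wikipedia": "https://wikipedia.example/wiki/$1",
    "mywiki": "https://wiki.example.org/$1",
}
global_table = {
    "wikipedia": "https://changed.example/wiki/$1",
    "orthodoxwiki": "https://orthodoxwiki.example/$1",
}
merged, added = merge_interwiki(local_table, global_table)
print(added)
```

Only adding missing prefixes is a design choice for this sketch; a real
script might also want an option to refresh URLs that have changed upstream.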

This way, fresh installations of MediaWiki get a set of current, useful
interwiki prefixes, and they have the ability to pull in updates as required.
It also has the benefit of separating out the WMF-specific stuff from the
global MediaWiki logic, which is a win for external users of MW.

Two other things it would be nice to do:

* Define a proper scope for the interwiki map.  At the moment it is a bit
   unclear what should and shouldn't be there.  The fact that we currently
   have a Linux users' group from New Zealand and someone's personal blog on
   the map suggests the scope of the map has not been well thought out over
   the years.  My suggested criterion at [3] is:

     "Most well-established and active wikis should have interwiki
     prefixes, regardless of whether or not they are using MediaWiki
     software.  Sites that are not wikis may be acceptable in some cases,
     particularly if they are very commonly linked to (e.g. Google,
     OEIS)."

* Take this opportunity to CLEAN UP the global interwiki map!
** Many of the links are long dead.
** Many new wikis have sprung up in the last few years that deserve to be
    added.
** Broken prefixes can be moved to the WMF-specific map so existing links on
    WMF sites can be cleaned up and dealt with appropriately.
** We could add API URLs to fill the iw_api column in the database
    (currently empty by default).

I'm interested to hear your thoughts on these ideas.

Sorry for the long message, but I really think this topic has been neglected
for such a long time.

TTO

----

PS. I am aware of an RFC on MediaWiki.org relating to this, but I can't see
that gaining traction any time soon.  This proposal would be a more
light-weight way of dealing with the problem at hand.

[1] https://gerrit.wikimedia.org/r/#/c/84303/
[2] https://meta.wikimedia.org/wiki/Interwiki_map
[3] https://www.mediawiki.org/wiki/User:This,_that_and_the_other/Interwiki_map
[4] https://meta.wikimedia.org/wiki/User:This,_that_and_the_other/Local_interwiki_map
[5] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finterwiki.list
[6] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finterwiki.sql
[7] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FWikimediaMaintenance.git/master/rebuildInterwiki.php



_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Revamping interwiki prefixes

Nathan Larson
On Thu, Jan 16, 2014 at 4:53 AM, This, that and the other <
[hidden email]> wrote:

> 1. Split the existing interwiki map on Meta [2] into a "global interwiki
>    map", located on MediaWiki.org (draft at [3]), and a "WMF-specific
>    interwiki map" on Meta (draft at [4]).  Wikimedia-specific interwiki
>    prefixes, like bugzilla:, gerrit:, and irc: would be located in the map
>    on Meta, whereas general-purpose interwikis, like orthodoxwiki: and
>    wikisource: would go to the "global map" at MediaWiki.org.
>

Why is it worth the trouble of maintaining two separate lists? Do the
Wikimedia-specific interwiki prefixes get in people's way, e.g. when
they're reading through the interwiki list and encounter what is, to them,
useless clutter? As the list starts getting longer (e.g. hundreds,
thousands or tens of thousands of prefixes), people will probably do a Find
on the list rather than scrolling through, so it may not matter much if
there's that little bit of extra clutter. Sometimes I do use those
Wikimedia-specific prefixes on third-party wikis (e.g. if I'm talking about
MediaWiki development issues), and they might also end up getting used if
people import content from Wikimedia wikis.


> * Define a proper scope for the interwiki map.  At the moment it is a bit
>   unclear what should and shouldn't be there.  The fact that we currently
>   have a Linux users' group from New Zealand and someone's personal blog
>   on the map suggests the scope of the map has not been well thought out
>   over the years.  My suggested criterion at [3] is:
>

People will say we should keep those interwikis for historical reasons. So,
I think we should have a bot ready to go through the various wikis and make
edits converting those interwiki links to regular links. We should make
this tool available to the third-party wikis too. Perhaps it could be a
maintenance script.
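
As a rough illustration of what such a conversion pass could look like, here
is a small Python sketch that rewrites links using retired prefixes into plain
external links.  It is an assumption-laden toy, not an existing tool: the
regular expression only handles simple [[prefix:Title|label]] links, the
function name convert_links is made up, and the URL pattern in the example is
a placeholder.

```python
import re
from typing import Dict
from urllib.parse import quote

# Matches [[prefix:Title]] and [[prefix:Title|label]]; deliberately simple,
# so nested links, templates and leading-colon links are left alone.
LINK_RE = re.compile(r"\[\[([^\[\]|:]+):([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def convert_links(wikitext: str, retired: Dict[str, str]) -> str:
    """Replace links whose prefix is in `retired` with external links.

    `retired` maps a lowercase prefix to a URL pattern containing "$1".
    """

    def repl(match: "re.Match[str]") -> str:
        prefix, target, label = match.group(1), match.group(2), match.group(3)
        pattern = retired.get(prefix.strip().lower())
        if pattern is None:
            return match.group(0)  # not a retired prefix: leave untouched
        title = quote(target.strip().replace(" ", "_"), safe="/:")
        text = label if label else f"{prefix}:{target}"
        return f"[{pattern.replace('$1', title)} {text}]"

    return LINK_RE.sub(repl, wikitext)


sample = "See [[oldwiki:Main Page|the old wiki]] and [[Help:Contents]]."
retired_prefixes = {"oldwiki": "http://old.example.org/wiki/$1"}
converted = convert_links(sample, retired_prefixes)
print(converted)
```

A real bot would need a proper wikitext parser and would have to skip nowiki
sections, comments and the like; the point here is only the shape of the
transformation.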


>     "Most well-established and active wikis should have interwiki
>     prefixes, regardless of whether or not they are using MediaWiki
>     software.  Sites that are not wikis may be acceptable in some cases,
>     particularly if they are very commonly linked to (e.g. Google,
>     OEIS)."
>

Can we come up with numerical cutoffs for what counts as "well-established",
"active", and "very commonly linked to", so that people know what to expect
before they put a proposal forth, or will it be like notability debates,
and come down to people's individual opinions of what should count as "very
commonly linked to" (as well as a certain amount of ILIKEIT
<https://en.wikipedia.org/wiki/Wikipedia:Arguments_to_avoid_in_deletion_discussions#I_like_it>
and IDONTLIKEIT, even if users deny that's the basis for their decision)?
We might get the help of WikiIndex and (especially) WikiApiary in getting
the necessary statistics.
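
To show what numerical cutoffs might look like in practice, here is a small
Python sketch.  Every threshold, field name and figure in it is invented
purely for illustration; nothing here reflects an agreed policy or real
statistics from WikiIndex or WikiApiary.

```python
from dataclasses import dataclass

# Hypothetical cutoffs, invented purely for illustration.
MIN_CONTENT_PAGES = 1000   # "well-established"
MIN_RECENT_EDITS = 50      # "active": non-spam edits in the last 30 days
MIN_INBOUND_LINKS = 500    # "very commonly linked to" (non-wiki sites)


@dataclass
class SiteStats:
    is_wiki: bool
    content_pages: int
    recent_non_spam_edits: int
    inbound_links: int


def qualifies(stats: SiteStats) -> bool:
    """Apply the draft criterion with explicit numbers instead of judgment."""
    if stats.is_wiki:
        return (
            stats.content_pages >= MIN_CONTENT_PAGES
            and stats.recent_non_spam_edits >= MIN_RECENT_EDITS
        )
    # Non-wiki sites are only acceptable if they are very commonly linked to.
    return stats.inbound_links >= MIN_INBOUND_LINKS


busy_wiki = SiteStats(True, 6000, 300, 40)
dormant_wiki = SiteStats(True, 4000, 2, 900)
popular_site = SiteStats(False, 0, 0, 12000)
print(qualifies(busy_wiki), qualifies(dormant_wiki), qualifies(popular_site))
```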


> ** Many of the links are long dead. (snip)
>
> ** We could add API URLs to fill the iw_api column in the database
>    (currently empty by default).
>

Those two should be uncontroversial.


> Sorry for the long message, but I really think this topic has been
> neglected for such a long time.
>

It's okay, it's a complicated subject with a lot of tricky implementation
decisions that need to be made (which is probably part of why it's been
neglected). Thanks for taking the time to do a thorough analysis.

Re: Revamping interwiki prefixes

This, that and the other
"Nathan Larson"  wrote in message
news:CAF-JeUxsM-jQ85nij+OALA=[hidden email]...

> Why is it worth the trouble of maintaining two separate lists? Do the
> Wikimedia-specific interwiki prefixes get in people's way, e.g. when
> they're reading through the interwiki list and encounter what is, to
> them, useless clutter?

I can't say I care about people reading through the interwiki list.
It's just that with the one interwiki map, we are projecting "our"
internal interwikis, like strategy:, foundation:, sulutil:, wmch: onto
external MediaWiki installations.  No-one needs these prefixes except
WMF wikis, and having these in the global map makes MediaWiki look too
WMF-centric.

> Sometimes I do use those
> Wikimedia-specific prefixes on third-party wikis (e.g. if I'm talking
> about MediaWiki development issues)

This is a good argument to include gerrit:, rev:, mediazilla: etc. on
the global interwiki map.

> and they might also end up getting used if
> people import content from Wikimedia wikis.

They're mainly used in meta-discussions, so I doubt this is a concern.

> People will say we should keep those interwikis for historical
> reasons. So, I think we should have a bot ready to go through the
> various wikis and make edits converting those interwiki links to
> regular links. We should make this tool available to the third-party
> wikis too. Perhaps it could be a maintenance script.

Amen to this.  https://bugzilla.wikimedia.org/show_bug.cgi?id=60135

> Can we come up with numerical cutoffs for what counts as
> "well-established", "active", and "very commonly linked to", so that
> people know what to expect before they put a proposal forth, or will it
> be like notability debates, and come down to people's individual
> opinions of what should count as "very commonly linked to" (as well as
> a certain amount of ILIKEIT
> <https://en.wikipedia.org/wiki/Wikipedia:Arguments_to_avoid_in_deletion_discussions#I_like_it>
> and IDONTLIKEIT, even if users deny that's the basis for their
> decision)? We might get the help of WikiIndex and (especially)
> WikiApiary in getting the necessary statistics.

I don't see the need for instruction creep here.  I'm for an inclusive
interwiki map.  Inactive wikis (e.g. RecentChanges shows only sporadic
non-spam edits) and non-established wikis (e.g. AllPages shows little
content) should be excluded.  So far, there have been no issues with
using subjective criteria at meta:Talk:Interwiki map.

> It's okay, it's a complicated subject with a lot of tricky
> implementation decisions that need to be made (which is probably part
> of why it's been neglected). Thanks for taking the time to do a
> thorough analysis.

And thank you, Nathan, for your contributions.

Re: Revamping interwiki prefixes

Nathan Larson
On Thu, Jan 16, 2014 at 7:35 PM, This, that and the other <
[hidden email]> wrote:

> I can't say I care about people reading through the interwiki list. It's
> just that with the one interwiki map, we are projecting "our" internal
> interwikis, like strategy:, foundation:, sulutil:, wmch: onto external
> MediaWiki installations.  No-one needs these prefixes except WMF wikis, and
> having these in the global map makes MediaWiki look too WMF-centric.
>

It's a WMF-centric wikisphere, though. Even the name of the software
reflects its connection to Wikimedia. If we're going to have a
super-inclusive interwiki list, then most of those Wikimedia interwikis
will fit right in, because they meet the criteria of having non-spammy
recent changes and significant content in AllPages. If you're saying that
having them around makes MediaWiki "look" too WMF-centric, it sounds like
you are concerned about people reading through the interwiki list and
getting a certain impression, because how else would they even know about
the presence of those interwiki prefixes in the global map?


> I don't see the need for instruction creep here.  I'm for an inclusive
> interwiki map.  Inactive wikis (e.g. RecentChanges shows only sporadic
> non-spam edits) and non-established wikis (e.g. AllPages shows little
> content) should be excluded.  So far, there have been no issues with using
> subjective criteria at meta:Talk:Interwiki map.


I dunno about that. We have urbandict: but not dramatica:, both of which are
unreliable sources, but likely to be used on third-party wikis (at least
the ones I edit). We have wikichristian: (~4,000
<http://www.wikichristian.org/index.php?title=Special:Statistics> content
pages) but not rationalwiki: (~6,000
<http://rationalwiki.org/wiki/Special:Statistics> content pages).
The latter was rejected
<https://meta.wikimedia.org/w/index.php?title=Talk%3AInterwiki_map&diff=4573672&oldid=4572621>
a while ago. Application of the subjective criteria seems to be hit-or-miss.

If we're going to have a hyper-inclusionist system of canonical interwiki
prefixes <https://www.mediawiki.org/wiki/Canonical_interwiki_prefixes>, we
might want to use WikiApiary and/or WikiIndex rather than MediaWiki.org as
the venue. These wikis that already have a page for every wiki could add
another field for interwiki prefix to those templates and manage the
interwiki prefixes by editing pages. Thingles said
<https://wikiapiary.com/w/index.php?title=User_talk%3AThingles&diff=409395&oldid=408940>
he'd be interested in WikiApiary's getting involved. The only downside is
that WikiApiary doesn't have non-MediaWiki wikis. It sounded
<http://wikiindex.org/index.php?title=User_talk:Leucosticte&diff=prev&oldid=144256>
as though Mark Dilley might be interested in WikiIndex's playing some role
in this too. But even WikiIndex has the problem of only containing wikis;
the table will have to have other websites as well.

Re: Revamping interwiki prefixes

Arcane 21
If I might weigh in here, I don't see the harm in including all the WMF wikis in the interwiki map.

MediaWiki is closely tied to the WMF, so those links make logical sense, and in my opinion it does no harm to include them.

> Date: Thu, 16 Jan 2014 22:40:37 -0500
> From: [hidden email]
> To: [hidden email]
> Subject: Re: [Wikitech-l] Revamping interwiki prefixes
>
> (snip)

Re: Revamping interwiki prefixes

This, that and the other
Nathan is right that I am contradicting myself a bit.  It's true that if you don't
look at the interwiki map, you'll never know what's there - you'll never know that
WMF is stuffing the default map full of its own junk.  What I really meant to say is
that external users will feel short-changed that we get to add "our" internal
interwikis to the global map, yet they aren't allowed to add their internal wikis
(equivalent to our strategy, outreach, etc) to the global map, for any given reason.

I'm not getting a coherent sense of a direction to take.  Do we split the existing
interwiki map into a local and a global map (as I originally proposed)?  Do we start
from scratch, rewriting the interwiki map from a blank slate, or do we start with
what we've got?  Do we flood external MW users with a ton of new prefixes, or do we
ship a mostly empty table to new MW installations?  Do we scale right back and limit
ourselves to a small core of interwiki prefixes?  Do we take up Tim's idea and toss
interwikis altogether?




Re: Revamping interwiki prefixes

Bináris
In reply to this post by This, that and the other
2014/1/17 This, that and the other <[hidden email]>

> "Nathan Larson"  wrote in message news:CAF-JeUxsM-jQ85nij+OALA=
> [hidden email]...

Nice quoting. :-)

> I can't say I care about people reading through the interwiki list. It's
> just that with the one interwiki map, we are projecting "our" internal
> interwikis, like strategy:, foundation:, sulutil:, wmch: onto external
> MediaWiki installations.  No-one needs these prefixes except WMF wikis, and
> having these in the global map makes MediaWiki look too WMF-centric.


One central interwiki map with an extra flag? It could be branched into WMF
and general maps while still being maintained together.
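
A minimal Python sketch of that idea, assuming a single list in which each
entry carries a boolean flag.  The field name wmf_only, the function
split_map and all URLs are hypothetical; the sketch only shows how two views
could be derived from one jointly maintained source.

```python
from typing import Dict, List, NamedTuple, Tuple


class Entry(NamedTuple):
    prefix: str
    url: str
    wmf_only: bool  # the hypothetical "extra flag"


def split_map(entries: List[Entry]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Derive (general map, WMF map) from one jointly maintained list."""
    general = {e.prefix: e.url for e in entries if not e.wmf_only}
    wmf = {e.prefix: e.url for e in entries}  # WMF wikis see every entry
    return general, wmf


# Placeholder URLs; "$1" stands for the page title.
central = [
    Entry("wikisource", "https://wikisource.example/wiki/$1", False),
    Entry("orthodoxwiki", "https://orthodoxwiki.example/$1", False),
    Entry("gerrit", "https://gerrit.example/r/$1", True),
]
general_map, wmf_map = split_map(central)
print(sorted(general_map))  # prefixes shipped to every installation
print(sorted(wmf_map))      # prefixes available on WMF wikis
```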

Re: Revamping interwiki prefixes

go moko
> From: Bináris <[hidden email]>
> To: Wikimedia developers <[hidden email]>
> Sent: Friday, January 17, 2014 10:07 AM
> Subject: Re: [Wikitech-l] Revamping interwiki prefixes
>
> One central interwiki map with an extra flag? May be branched for WMF and
> general as well as maintained together.

What if filling the interwiki table with predefined links was an installation option, possibly with several lists, and void?


Re: Revamping interwiki prefixes

Nathan Larson
On Fri, Jan 17, 2014 at 9:00 AM, go moko <[hidden email]> wrote:

> What if filling the interwiki table with predefined links was an
> installation option, possibly with several lists, and void?


Probably won't (and shouldn't) happen, since we're trying to keep the
installer options close to the bare minimum. Changing or clearing the
interwiki table is pretty easy with the right script or special page. The
fact that, when the installer's interwiki list was changed in 2013, it was
accurate to say "This list has obviously not been properly updated for many
years. There are many long-dead sites that are removed in this patch"
suggests that third party wikis are pretty good at ignoring interwiki links
they don't want or need.

I disagree that "collisions are very rare, and none of the alternatives
seem viable or practical". Collisions (or whatever one would call them)
happen fairly often, and the resulting linking errors
<https://en.wikiquote.org/w/index.php?title=User%3ALeucosticte&diff=1503289&oldid=1503284>
can be hard to notice because one sees the link is blue and assumes it's
going where one wanted it to.

It wouldn't be such a problem if wikis would name their project namespace
Project: rather than the name of the wiki. Having it named Project: would
be useful when people are importing user or project pages from Wikipedia
(e.g. if they wanted to import userboxes or policy pages) and don't want
the Wikipedia: links to become interwiki links. I would be in favor of
renaming the project namespaces to Project: on Wikimedia wikis; that's how
it is on MediaWiki.org (to avoid a collision with the MediaWiki: namespace)
and it seems to work out okay. I'll probably start setting up my third
party wikis that way too, because I've run into similar problems when
exporting and importing content among them. Perhaps the installer should
warn that it's not recommended to name the meta namespace after the site
name.
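
A check like that could be quite small.  The Python sketch below flags
interwiki prefixes that coincide with namespace names, comparing them
case-insensitively and treating spaces and underscores alike.  The function
names and sample data are made up for illustration, and the comparison rule
is an assumption about how such a warning might work, not a description of
what the installer does.

```python
from typing import Iterable, List


def normalise(name: str) -> str:
    """Fold case and spaces so 'Wikipedia talk' equals 'wikipedia_talk'."""
    return name.strip().replace(" ", "_").lower()


def find_collisions(
    prefixes: Iterable[str], namespaces: Iterable[str]
) -> List[str]:
    """Return the sorted names used both as a prefix and as a namespace."""
    namespace_set = {normalise(n) for n in namespaces}
    return sorted({normalise(p) for p in prefixes} & namespace_set)


interwiki_prefixes = ["wikipedia", "wikisource", "gerrit"]
site_namespaces = ["User", "Template", "Wikipedia", "Wikipedia talk"]
collisions = find_collisions(interwiki_prefixes, site_namespaces)
print(collisions)
```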

Tim's proposal seems pretty elegant but in a few situations will make links
uglier or hide where they point to. E.g. "See also" sections with interwiki
links (like what you see here
<https://en.wikipedia.org/wiki/Help:Interwiki_linking#See_also>) could
become like the "Further reading" section you see here
<https://en.wikipedia.org/wiki/Wikipedia:Policies_and_guidelines#Further_reading>,
in which one has to either put barelinks or make people hover over the link
to see the URL it goes to.

Interwiki page existence detection probably wouldn't be any more difficult
to implement in the absence of interwiki prefixes. We could still have an
interwiki table, but page existence detection would be triggered by certain
URLs rather than prefixes being used. I'm not sure how interwiki
transclusion would work if we didn't have interwikis; we'd have to come up
with some other way of specifying which wiki we're transcluding from,
unless we're going to use URLs for that too.
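
To sketch how URL-triggered detection might work, the Python snippet below
reverses an interwiki table: given a full URL, it finds the entry whose URL
pattern matches and recovers the page title.  The table contents and the
function name match_url are hypothetical, and the matching is plain string
comparison with no URL normalisation, so treat it as a starting point only.

```python
from typing import Dict, Optional, Tuple


def match_url(url: str, table: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (prefix, title) for the first pattern that fits `url`."""
    for prefix, pattern in table.items():
        head, placeholder, tail = pattern.partition("$1")
        if not placeholder:
            continue  # pattern has no title slot, so it cannot be reversed
        if (
            url.startswith(head)
            and url.endswith(tail)
            and len(url) > len(head) + len(tail)
        ):
            return prefix, url[len(head):len(url) - len(tail)]
    return None


# Placeholder table; "$1" stands for the page title.
table = {
    "examplewiki": "https://wiki.example.org/wiki/$1",
    "exampleidx": "https://index.example.org/index.php?title=$1&action=view",
}
print(match_url("https://wiki.example.org/wiki/Main_Page", table))
print(match_url("https://elsewhere.example.net/Main_Page", table))
```

Once a URL has been mapped back to an entry, an existence check could then
be made against that wiki (for example via its API URL, if one is recorded).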

In short, I think the key is to come up with something that doesn't break
silently when there's a conflict between an interwiki prefix and namespace.
For that purpose, it would suffice to keep interwiki linking and come up
with a new delimiter. But changing the name of the Project: namespace would
work just as well. Migration of links could work analogously to what's
described in bug 60135
<https://bugzilla.wikimedia.org/show_bug.cgi?id=60135>.

TTO, you were saying "I'm not getting a coherent sense of a direction to
take" -- that could be a good thing at this point in the discussion; it
could mean people are still keeping an open mind and wanting to hear more
thoughts and ideas rather than jumping to too hasty a conclusion. But I
guess it is helpful, when conversations fall silent, for someone to push
for action by asking, "...so, in light of all that, what do you want to
do?" :)

Re: Revamping interwiki prefixes

Nathan Larson
I forgot to mention, another problem is that you can't even import
Wikipedia: namespace pages to your wiki without changing your interwiki
table to get rid of the wikipedia: interwiki prefix first. The importer
will say, "Page 'Wikipedia:Sandbox' is not imported because its name is
reserved for external linking (interwiki)." As I noted
<https://bugzilla.wikimedia.org/show_bug.cgi?id=60168#c2> in bug 60168,
it's unclear why we would standardize the other namespace
names (Help:, Template:, MediaWiki:, User:, etc.) from one wiki to the
next, but not Project:

Re: Revamping interwiki prefixes

Bartosz Dziewoński
On Fri, 17 Jan 2014 17:33:52 +0100, Nathan Larson <[hidden email]> wrote:

> it's unclear why we would standardize the other namespace
> names (Help:, Template:, MediaWiki:, User:, etc.) from one wiki to the
> next, but not Project:

The User: namespace differs per-wiki in many languages other than English, to reflect the name of the wiki (something akin to "Wikipedia-editor", "Wikisource-editor" etc.).

I don't think standardizing Project: is a good idea, in particular because the name could be confused with "Wikiprojects", which have a separate namespace of their own on some wikis (including non-English ones).

--
Matma Rex
