Revamping interwiki prefixes

Revamping interwiki prefixes

This, that and the other
Sorry about the borked line wrapping in the previous message - I'm
resending it so you can read it properly!

----

This is a proposal to try and bring order to the messy area of interwiki
linking and interwiki prefixes, particularly for non-WMF users of
MediaWiki.

At the moment, anyone who installs MediaWiki gets a default interwiki
table that is hopelessly out of date.  Some of the URLs listed there
have seemingly been broken for 7 years [1].  Meanwhile, WMF wikis have
access to a nice, updated interwiki map, stored on Meta, that is
difficult for anyone else to use.  Clearly something needs to be done.

What I propose we do to improve the situation is along the lines of
bug 58369:

1. Split the existing interwiki map on Meta [2] into a "global
    interwiki map", located on MediaWiki.org (draft at [3]), and a
    "WMF-specific interwiki map" on Meta (draft at [4]).
    Wikimedia-specific interwiki prefixes, like bugzilla:, gerrit:, and
    irc: would be located in the map on Meta, whereas general-purpose
    interwikis, like orthodoxwiki: and wikisource: would go to the
    "global map" at MediaWiki.org.

2. Create a bot, similar to l10n-bot, that periodically updates the
    default interwiki data in mediawiki/core based on the contents of
    the global map. (Right now, the default map is duplicated in two
    different formats [5] [6], which is quite messy.)

3. Write a version of the rebuildInterwiki.php maintenance script [7]
    that can be bundled with MediaWiki, and which can be run by server
    admins to pull in new entries to their interwiki table from the
    global map.

This way, fresh installations of MediaWiki get a set of current, useful
interwiki prefixes, and they have the ability to pull in updates as
required.  It also has the benefit of separating out the WMF-specific
stuff from the global MediaWiki logic, which is a win for external users
of MW.
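
As a rough illustration of step 3, here is a minimal Python sketch of the
merge an update script might perform. The function and class names are
hypothetical, the URLs are placeholders rather than real map entries, and the
rule "never overwrite an existing local row" is an assumption about how
admins would want updates to behave, not something the proposal specifies.
Only the column names iw_prefix, iw_url and iw_api come from the interwiki
table itself.

```python
from typing import Dict, List, NamedTuple


class InterwikiRow(NamedTuple):
    """One row of the interwiki table (only the columns used here)."""
    iw_prefix: str
    iw_url: str
    iw_api: str = ""


def merge_global_map(
    local: Dict[str, InterwikiRow],
    global_map: List[InterwikiRow],
) -> List[str]:
    """Add prefixes from global_map that are missing from the local table.

    Existing local rows are left untouched (an assumed policy), so any
    site-specific customisation survives. Returns the prefixes added.
    """
    added = []
    for row in global_map:
        prefix = row.iw_prefix.lower()  # treat prefixes as case-insensitive
        if prefix not in local:
            local[prefix] = row._replace(iw_prefix=prefix)
            added.append(prefix)
    return added


# Placeholder data: example.org URLs stand in for real map entries.
local_table = {
    "wikisource": InterwikiRow("wikisource", "https://old.example.org/$1"),
}
fetched = [
    InterwikiRow("wikisource", "https://wikisource.example.org/wiki/$1"),
    InterwikiRow("OrthodoxWiki", "https://orthodoxwiki.example.org/$1"),
]
added = merge_global_map(local_table, fetched)
print(added)
```

A real script would also have to fetch and parse the map page and write to
the database; those parts are left out because they depend on details the
proposal does not settle.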

Two other things it would be nice to do:

* Define a proper scope for the interwiki map.  At the moment it is a
   bit unclear what should and shouldn't be there.  The fact that we
   currently have a Linux users' group from New Zealand and someone's
   personal blog on the map suggests the scope of the map has not been
   well thought out over the years.
   My suggested criterion at [3] is:

     "Most well-established and active wikis should have interwiki
     prefixes, regardless of whether or not they are using MediaWiki
     software.
     Sites that are not wikis may be acceptable in some cases,
     particularly if they are very commonly linked to (e.g. Google,
     OEIS)."

* Take this opportunity to CLEAN UP the global interwiki map!
** Many of the links are long dead.
** Many new wikis have sprung up in the last few years that deserve to
    be added.
** Broken prefixes can be moved to the WMF-specific map so existing
    links on WMF sites can be cleaned up and dealt with appropriately.
** We could add API URLs to fill the iw_api column in the database
    (currently empty by default).

I'm interested to hear your thoughts on these ideas.

Sorry for the long message, but I really think this topic has been
neglected for far too long.

TTO

----

PS. I am aware of an RFC on MediaWiki.org relating to this, but I can't
see that gaining traction any time soon.  This proposal would be a more
light-weight way of dealing with the problem at hand.

[1] https://gerrit.wikimedia.org/r/#/c/84303/
[2] https://meta.wikimedia.org/wiki/Interwiki_map
[3] https://www.mediawiki.org/wiki/User:This,_that_and_the_other/Interwiki_map
[4] https://meta.wikimedia.org/wiki/User:This,_that_and_the_other/Local_interwiki_map
[5] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finterwiki.list
[6] http://git.wikimedia.org/blob/mediawiki%2Fcore.git/master/maintenance%2Finterwiki.sql
[7] https://git.wikimedia.org/blob/mediawiki%2Fextensions%2FWikimediaMaintenance.git/master/rebuildInterwiki.php



_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Revamping interwiki prefixes

Marius Hoch
You might want to have a look at
https://www.mediawiki.org/wiki/Requests_for_comment/New_sites_system .
That's more future proof than using the current interwiki system IMO.
Also we already use a subset of that for Wikidata.

Cheers,

Marius


On Thu, 2014-01-16 at 22:06 +1100, This, that and the other wrote:


Re: Revamping interwiki prefixes

Tim Starling-2
In reply to this post by This, that and the other
On 16/01/14 22:06, This, that and the other wrote:
>     "Most well-established and active wikis should have interwiki
>     prefixes, regardless of whether or not they are using MediaWiki
>     software.
>     Sites that are not wikis may be acceptable in some cases,
>     particularly if they are very commonly linked to (e.g. Google,
>     OEIS)."

I think the interwiki map should be retired. I think broken links
should be removed from it, and no new wikis should be added.

Interwiki prefixes, local namespaces and article titles containing a
plain colon intractably conflict. Every time you add a new interwiki
prefix, main namespace articles which had that prefix in their title
become inaccessible and need to be recovered with a maintenance script.
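
The conflict can be shown with a small Python sketch. This is a simplified
model of title resolution, not MediaWiki's actual code: the function name,
the lookup order (namespaces first, then interwiki prefixes) and the sample
title and prefix are all assumptions made for illustration.

```python
from typing import Optional, Set, Tuple


def resolve_link(
    target: str, interwiki_prefixes: Set[str], namespaces: Set[str]
) -> Tuple[str, Optional[str], str]:
    """Classify a link target as a namespace, interwiki or main-namespace title.

    Returns (kind, prefix, rest). Anything before the first colon is
    compared case-insensitively against the known namespaces and prefixes.
    """
    head, sep, rest = target.partition(":")
    if sep:
        key = head.strip().lower()
        if key in namespaces:
            return ("namespace", key, rest.strip())
        if key in interwiki_prefixes:
            return ("interwiki", key, rest.strip())
    return ("main", None, target)


namespaces = {"talk", "user"}

# Before the hypothetical prefix "tron" exists, the article resolves locally.
before = resolve_link("Tron: Legacy", {"gerrit"}, namespaces)
# before == ("main", None, "Tron: Legacy")

# After "tron" is added as an interwiki prefix, the same title is hijacked.
after = resolve_link("Tron: Legacy", {"gerrit", "tron"}, namespaces)
# after == ("interwiki", "tron", "Legacy")
```

Nothing about the stored article changes between the two calls; only the
prefix set does, which is why the page becomes unreachable by its title.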

There is a very good, standardised system for linking to arbitrary
remote wikis -- URLs. URLs have the advantage of not sharing a
namespace with local article titles.

Even the introduction of new WMF-to-WMF interwiki prefixes has caused
the breakage of large numbers of article titles. I can see that is
convenient, but I think it should be replaced even in that use case.
UI convenience, link styling and rel=nofollow can be dealt with in
other ways.

-- Tim Starling



Re: Revamping interwiki prefixes

Nathan Larson
On Thu, Jan 16, 2014 at 10:56 PM, Tim Starling <[hidden email]> wrote:

> I think the interwiki map should be retired. I think broken links
> should be removed from it, and no new wikis should be added.
>
> Interwiki prefixes, local namespaces and article titles containing a
> plain colon intractably conflict. Every time you add a new interwiki
> prefix, main namespace articles which had that prefix in their title
> become inaccessible and need to be recovered with a maintenance script.
>
> There is a very good, standardised system for linking to arbitrary
> remote wikis -- URLs. URLs have the advantage of not sharing a
> namespace with local article titles.
>
> Even the introduction of new WMF-to-WMF interwiki prefixes has caused
> the breakage of large numbers of article titles. I can see that is
> convenient, but I think it should be replaced even in that use case.
> UI convenience, link styling and rel=nofollow can be dealt with in
> other ways.
>

These are some good points. I've run into a problem many times when
importing pages (e.g. templates and/or their documentation) from Wikipedia,
that pages like [[Wikipedia:Signatures]] become interwiki links to
Wikipedia mainspace rather than redlinks. Also, usually I end up accessing
interwiki prefixes through templates like
Template:w <https://meta.wikimedia.org/wiki/Template:W> anyway. It would
be a simple matter to make those templates generate URLs
rather than interwiki links. The only other way to prevent these conflicts
from happening would be to use a different delimiter besides a single
colon; but what would that replacement be?

Before retiring the interwiki map, we could run a bot to edit all the pages
that use interwiki links, and convert the interwiki links to template uses.
A template would have the same advantage as an interwiki link in making it
easy to change the URLs if the site were to switch domains or change its
URL scheme.
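
A minimal Python sketch of the rewrite such a bot might apply is below. The
function name, the regular expression and the assumption that a template
named after each prefix already exists (e.g. {{gerrit|...}}) are all
hypothetical; real wikitext has many more edge cases (nowiki, comments,
nested templates) than this handles.

```python
import re
from typing import Set

# [[prefix:target]] or [[prefix:target|label]]; a leading colon never matches.
LINK_RE = re.compile(r"\[\[([^\[\]|:]+):([^\[\]|]*)(?:\|([^\[\]]*))?\]\]")


def interwiki_to_template(wikitext: str, prefixes: Set[str]) -> str:
    """Rewrite interwiki links for the given prefixes as template calls."""

    def repl(match: "re.Match[str]") -> str:
        prefix = match.group(1).strip().lower()
        if prefix not in prefixes:
            return match.group(0)  # namespace or ordinary link: leave as is
        parts = [prefix, match.group(2)]
        if match.group(3) is not None:
            parts.append(match.group(3))  # keep the display label
        return "{{" + "|".join(parts) + "}}"

    return LINK_RE.sub(repl, wikitext)


source = "See [[gerrit:12345]] and [[bugzilla:58369|the bug]], not [[Help:Links]]."
converted = interwiki_to_template(source, {"gerrit", "bugzilla"})
print(converted)
# See {{gerrit|12345}} and {{bugzilla|58369|the bug}}, not [[Help:Links]].
```

Targets containing "=" would need a named or numbered parameter to survive
template expansion, so a production bot would have to escape those.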

Re: Revamping interwiki prefixes

This, that and the other
In reply to this post by Tim Starling-2
"Tim Starling"  wrote in message news:lba9ld$8pj$[hidden email]...

> I think the interwiki map should be retired. I think broken links
> should be removed from it, and no new wikis should be added.
>
> Interwiki prefixes, local namespaces and article titles containing a
> plain colon intractably conflict. Every time you add a new interwiki
> prefix, main namespace articles which had that prefix in their title
> become inaccessible and need to be recovered with a maintenance script.
>
> There is a very good, standardised system for linking to arbitrary
> remote wikis -- URLs. URLs have the advantage of not sharing a
> namespace with local article titles.
>
> Even the introduction of new WMF-to-WMF interwiki prefixes has caused
> the breakage of large numbers of article titles. I can see that is
> convenient, but I think it should be replaced even in that use case.
> UI convenience, link styling and rel=nofollow can be dealt with in
> other ways.
>
> -- Tim Starling

The main advantage of interwiki prefixes is the convenience you mention.  They
save a great deal of unnecessary typing and remembering of URLs.  On any WMF wiki,
we can simply type [[gerrit:12345]] and know that the link will point where we
want it to.
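
For readers unfamiliar with the mechanism, here is a minimal Python sketch of
how a prefix expands into a URL. Interwiki URL patterns use $1 as the
placeholder for the link target; the function name and the example.org
pattern are hypothetical and do not reflect the real map entry for gerrit:.

```python
from typing import Dict
from urllib.parse import quote


def expand_interwiki(link: str, interwiki_map: Dict[str, str]) -> str:
    """Expand 'prefix:target' into a URL using the map's $1 pattern."""
    prefix, sep, target = link.partition(":")
    if not sep:
        raise ValueError("not an interwiki link: %r" % link)
    pattern = interwiki_map[prefix.strip().lower()]  # KeyError if unknown
    return pattern.replace("$1", quote(target.strip()))


# Placeholder pattern, not the real entry.
interwiki_map = {"gerrit": "https://gerrit.example.org/r/$1"}
url = expand_interwiki("gerrit:12345", interwiki_map)
print(url)  # https://gerrit.example.org/r/12345

# If the site moves, only the map changes; every existing link follows.
interwiki_map["gerrit"] = "https://review.example.org/c/$1"
moved_url = expand_interwiki("gerrit:12345", interwiki_map)
print(moved_url)  # https://review.example.org/c/12345
```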

Some possible alternatives to our current system would include:
* to make people manually type out URLs everywhere (silly)
* to use cross-wiki linking templates instead of interwikis.  This has its own set
of problems: cross-wiki transclusion is another area in sore need of attention (see
bug 4547); we need to decide which wikis get their own linking templates; how do we
deal with collisions between local and global (cross-wiki) templates?  etc.  To me,
it doesn't seem worth the effort.
* to introduce a new syntax for interwiki links that does not collide with internal
links (too ambitious?)

I personally favour keeping interwikis as we know them, as collisions are very rare,
and none of the alternatives seem viable or practical.  Maybe the advent of
interactive editing systems like VisualEditor and Flow will make them obsolete, but
until then, editors need the convenience and flexibility that they offer when
writing wikitext.

It seems as though your proposal, Tim, relates to the WMF cluster.  I'd be
interested to know what your thoughts are with relation to the interwiki table in
external MediaWiki installations.

TTO

Re: Revamping interwiki prefixes

Gabriel Wicke-3
In reply to this post by Tim Starling-2
On 01/16/2014 07:56 PM, Tim Starling wrote:

> I think the interwiki map should be retired. I think broken links
> should be removed from it, and no new wikis should be added.
>
> Interwiki prefixes, local namespaces and article titles containing a
> plain colon intractably conflict. Every time you add a new interwiki
> prefix, main namespace articles which had that prefix in their title
> become inaccessible and need to be recovered with a maintenance script.
>
> There is a very good, standardised system for linking to arbitrary
> remote wikis -- URLs. URLs have the advantage of not sharing a
> namespace with local article titles.


The underlying issue here is that we are still using wikitext as our
primary storage format, rather than treating it as the textual user
interface it is. With HTML storage this issue disappears, as interwiki
links are stored with full URLs. When using the wikitext editor,
prefixes are introduced correctly and on demand, so you get the
convenience without the conflicts.

Currently Flow is the only project using HTML storage. We are working on
preparing this for MediaWiki proper though, so in the longer term the
interwiki conflict issue should disappear.
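
To make the idea concrete, here is a minimal Python sketch of reintroducing a
prefix on demand when a stored full URL is turned back into wikitext. It is
not how Parsoid does it; the function name, the matching strategy and the
example.org pattern are assumptions for illustration only.

```python
from typing import Dict


def url_to_wikitext(url: str, interwiki_map: Dict[str, str]) -> str:
    """Serialise a stored URL as an interwiki link if a pattern matches.

    Falls back to a plain external link, so a URL never has to compete
    with local article titles for the same syntax.
    """
    for prefix, pattern in sorted(interwiki_map.items()):
        base, sep, tail = pattern.partition("$1")
        if not sep:
            continue  # pattern has no $1 placeholder; cannot be inverted
        if (
            len(url) > len(base) + len(tail)
            and url.startswith(base)
            and url.endswith(tail)
        ):
            target = url[len(base):len(url) - len(tail)]
            return "[[%s:%s]]" % (prefix, target)
    return "[%s]" % url


interwiki_map = {"gerrit": "https://gerrit.example.org/r/$1"}  # placeholder
known = url_to_wikitext("https://gerrit.example.org/r/12345", interwiki_map)
unknown = url_to_wikitext("https://other.example.org/page", interwiki_map)
print(known)    # [[gerrit:12345]]
print(unknown)  # [https://other.example.org/page]
```

Because the stored form is always the full URL, adding or removing a prefix
later changes only how the link is displayed in the wikitext editor.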

Gabriel


HTML Storage for Wikidata (was Re: Revamping interwiki prefixes)

Jay Ashworth-2
----- Original Message -----
> From: "Gabriel Wicke" <[hidden email]>

> Currently Flow is the only project using HTML storage. We are working on
> preparing this for MediaWiki proper though, so in the longer term the
> interwiki conflict issue should disappear.

Where, by "HTML storage" I hope you actually mean "something that isn't HTML"
storage, since HTML is a *presentation* markup language, not a semantic one,
and thus singularly unsuited to use for the sort of semantic storage a wiki
engine requires...

Cheers,
-- jra
--
Jay R. Ashworth                  Baylink                       [hidden email]
Designer                     The Things I Think                       RFC 2100
Ashworth & Associates       http://www.bcp38.info          2000 Land Rover DII
St Petersburg FL USA      BCP38: Ask For It By Name!           +1 727 647 1274


Re: HTML Storage for Wikidata (was Re: Revamping interwiki prefixes)

Gabriel Wicke-3
On 01/17/2014 10:17 AM, Jay Ashworth wrote:

> ----- Original Message -----
>> From: "Gabriel Wicke" <[hidden email]>
>
>> Currently Flow is the only project using HTML storage. We are working on
>> preparing this for MediaWiki proper though, so in the longer term the
>> interwiki conflict issue should disappear.
>
> Where, by "HTML storage" I hope you actually mean "something that isn't HTML"
> storage, since HTML is a *presentation* markup language, not a semantic one,
> and thus singularly unsuited to use for the sort of semantic storage a wiki
> engine requires...

I mean our HTML5+RDFa DOM spec format [1], which is semantic markup that
also displays as expected. It exposes all the semantic information of
Wikitext in RDFa, which is why Parsoid can provide a wikitext editing
interface to it.

Gabriel

[1]: https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec


Re: Revamping interwiki prefixes

MZMcBride-2
In reply to this post by Tim Starling-2
Tim Starling wrote:
>I can see that is convenient, but I think it should be replaced even in
>that use case. UI convenience, link styling and rel=nofollow can be dealt
>with in other ways.

Re: https://meta.wikimedia.org/wiki/Interwiki_map

It's not just convenience. Interwiki links are an easy way to implement
global (across all Wikimedia wikis) templates. They're very simple linker
templates, but templates just the same.

Instead of {{bugzilla|}} for Bugzilla, you use [[bugzilla:]]. Instead of
updating dozens of templates on hundreds of wikis indefinitely, you can
update a centralized interwiki map. The centralized map also helps avoid
conflicts. And if one day one of the targets moves and doesn't leave a
redirect (boo!), we can theoretically update the interwiki map and all of
the links across Wikimedia wikis will continue to work. I believe we use
this feature occasionally.

We could make parser functions such as "{{#bugzilla:}}", but depending on
who you ask, wikitext as a written form is on its way out. I'm not sure
the investment is worth the return.

I suppose it's possible that people are using interwiki markup to disable
the typical link icons, but instead we should be discussing link icons
generally in the user interface. This is pretty far removed from interwiki
links, in my opinion. I do know that people occasionally use redirection
to get around weird link generation behavior when using interwiki markup.
As I recall, space interpretation was the center of that (i.e., query
paths containing "_" v. "+" v. "%20" v. " " &c.).
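
The four renderings of a space mentioned above can be reproduced with
Python's standard library. This is only a sketch of the encodings
themselves; the helper name is hypothetical, and which variant a given
target site accepts is exactly the part that varies.

```python
from typing import Dict
from urllib.parse import quote, quote_plus


def space_variants(title: str) -> Dict[str, str]:
    """Return the common ways a space in a title ends up in a URL."""
    return {
        "underscore": quote(title.replace(" ", "_")),  # wiki-style titles
        "plus": quote_plus(title),                     # form/query encoding
        "percent": quote(title),                       # generic path encoding
        "raw": title,                                  # no encoding at all
    }


variants = space_variants("Main Page")
for name, value in variants.items():
    print(name, value)
# underscore Main_Page
# plus Main+Page
# percent Main%20Page
# raw Main Page
```

A single $1 pattern applies one of these to every link, which is why a
target that expects a different one ends up being reached via a redirect.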

Regarding rel=nofollow and link trustworthiness: I'm not sure any sane
search engine continues to trust user input these days. I thought lessons
of the past taught developers that people are pretty unscrupulous. :-)

 
MZMcBride


