zh interwikis

zh interwikis

Platonides
I'm currently working on an interwiki bot, and I'm finding that some
interwikis get duplicated: some wikis refer to en:City while others refer to
en:Town, etc.
But the zh interwikis are the worst, as there is not only that kind of
problem but also up to 3 ways to link to 'http://zh.wikipedia.org/wiki/$1':
zh, zh-cn, zh-tw. And there is also zh-min-nan for
'http://zh-min-nan.wikipedia.org/wiki/$1'.
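
To make the situation concrete, here is a minimal Python sketch of the prefix
table described above. The prefixes and URL templates are the ones listed in
this message; the names INTERWIKI_MAP, expand and aliases_of are made up for
illustration and do not come from any existing bot.

```python
from urllib.parse import quote

# Prefix -> URL template, exactly as listed in the message above.
INTERWIKI_MAP = {
    "zh": "http://zh.wikipedia.org/wiki/$1",
    "zh-cn": "http://zh.wikipedia.org/wiki/$1",
    "zh-tw": "http://zh.wikipedia.org/wiki/$1",
    "zh-min-nan": "http://zh-min-nan.wikipedia.org/wiki/$1",
}


def expand(prefix, title):
    """Fill the $1 placeholder with the percent-encoded page title."""
    template = INTERWIKI_MAP[prefix]
    return template.replace("$1", quote(title.replace(" ", "_")))


def aliases_of(prefix):
    """Return every prefix that shares its URL template with `prefix`."""
    target = INTERWIKI_MAP[prefix]
    return sorted(p for p, t in INTERWIKI_MAP.items() if t == target)
```

Calling aliases_of("zh") on this table shows the three prefixes that collapse
onto the same site, which is the source of the duplicates.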

What I have found is that zh means 中文 (Zhōngwén), Chinese; zh-cn means
Simplified and zh-tw Traditional. However, they all link to the same place.
So what's the preferred option? Should zh-cn/zh-tw links be converted to zh
links? Should all 3 stay even if they link to the same page? How could a
bot know whether a page should be referred to as simplified, traditional...?
Is there any meta: page explaining this? (If so, where? I haven't found one.)
Or should I simply stop recognising zh* as interwikis?

Thanks in advance.

_______________________________________________
Wikibots-l mailing list
[hidden email]
http://mail.wikipedia.org/mailman/listinfo/wikibots-l

Re: zh interwikis

Rob Hooft
Platonides wrote:
> I'm currently working on an interwiki bot, and I'm finding that some
> interwikis get duplicated: some wikis refer to en:City while others refer
> to en:Town, etc.
> But the zh interwikis are the worst, as there is not only that kind of
> problem but also up to 3 ways to link to
> 'http://zh.wikipedia.org/wiki/$1': zh, zh-cn, zh-tw. And there is also
> zh-min-nan for 'http://zh-min-nan.wikipedia.org/wiki/$1'.

Are you trying to write your own interwiki bot? Have you tried working
with the existing interwiki bot in the pywikipediabot suite? If there is
something wrong with the code in that bot, it is better to fix it
than to write an alternative! Otherwise other people may be using
broken code.

The problems with the zh: languages are all solved in the pywikipediabot
suite.

Regards,

Rob Hooft

--
Rob W.W. Hooft  ||  [hidden email]  ||  http://www.hooft.net/people/rob/

Re: zh interwikis

Andre Engels
In reply to this post by Platonides
All zh-cn and zh-tw codes are deprecated nowadays. The reason that
they existed is that at some point in the past, many Chinese pages
existed in separate simplified and traditional versions on the Chinese
Wikipedia. This is not the case any more. Nowadays, there is only a
single page, which can be automatically transliterated into either
simplified or traditional (simplified being the default), depending on
the wishes of the user. If everything is set up right, any zh-tw link
should go to a page that now redirects to the corresponding simplified
page. The page can be seen in traditional characters using the '不转换'
tab at the top of the page.

In short:
* zh: links are good
* zh-cn: links should be changed to zh: links
* zh-tw: links will usually go to a redirect to the corresponding
zh-cn: page; if not, they should be changed to zh: links as well.
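
The summary above can be sketched in Python roughly as follows. This is only
an illustration under assumptions, not the pywikipediabot code: the names
DEPRECATED, LINK_RE and normalize_interwikis are hypothetical, and the regular
expression only handles plain [[prefix:Title]] links.

```python
import re

# Deprecated codes and their replacement, per the summary above.
DEPRECATED = {"zh-cn": "zh", "zh-tw": "zh"}

# Matches simple links such as [[zh-tw:Title]]; piped links are ignored.
LINK_RE = re.compile(r"\[\[([a-z-]+):([^\]|]+)\]\]")


def normalize_interwikis(wikitext):
    """Return (prefix, title) pairs with zh-cn/zh-tw folded into zh.

    Exact duplicates are dropped; the original order is kept.
    """
    seen = set()
    result = []
    for prefix, title in LINK_RE.findall(wikitext):
        key = (DEPRECATED.get(prefix, prefix), title.strip())
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

Note that this sketch does not follow redirects. A zh-tw: link whose title
differs from the zh: title would still come out as a second zh: entry, so a
real bot would have to resolve the redirect before comparing titles.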

Regarding the question of how the bot knew what was simplified and what
was traditional: on pages in one version there used to be a link to
the page in the other version; the text of this link was standard (or
rather, there were not more than about 3 versions of it). If there was
such a link to 'this page in traditional characters', the bot would
assume the page was in simplified characters and treat that link as an
interwiki to zh-tw:. If there was such a link to 'this page in
simplified characters', the bot would assume the opposite. If only one
of the two pages existed, the bot would make the link zh: rather than
zh-cn: or zh-tw:. But, as said, that's all a thing of the past.
Nowadays, zh-cn: and zh-tw: interwikis should be considered
deprecated.
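
For the record, the old heuristic described above could be sketched like
this. The marker strings are placeholders invented for the example (the
actual link texts are not given here), and old_zh_code is a hypothetical
name, not a function from the bot.

```python
# Placeholder link texts; the real wordings are not given in this thread.
TO_TRADITIONAL_MARKERS = ("繁體中文版",)
TO_SIMPLIFIED_MARKERS = ("简体中文版",)


def old_zh_code(page_text):
    """Guess the code the old heuristic would have assigned to a page."""
    if any(marker in page_text for marker in TO_TRADITIONAL_MARKERS):
        # Page links to its traditional twin, so the page is simplified.
        return "zh-cn"
    if any(marker in page_text for marker in TO_SIMPLIFIED_MARKERS):
        # Page links to its simplified twin, so the page is traditional.
        return "zh-tw"
    # No link to another version: only one page exists.
    return "zh"
```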

2006/2/1, Platonides <[hidden email]>:

> I'm currently working on an interwiki bot, and I'm finding that some
> interwikis get duplicated: some wikis refer to en:City while others refer to
> en:Town, etc.
> But the zh interwikis are the worst, as there is not only that kind of
> problem but also up to 3 ways to link to 'http://zh.wikipedia.org/wiki/$1':
> zh, zh-cn, zh-tw. And there is also zh-min-nan for
> 'http://zh-min-nan.wikipedia.org/wiki/$1'.
>
> What I have found is that zh means 中文 (Zhōngwén), Chinese; zh-cn means
> Simplified and zh-tw Traditional. However, they all link to the same place.
> So what's the preferred option? Should zh-cn/zh-tw links be converted to zh
> links? Should all 3 stay even if they link to the same page? How could a
> bot know whether a page should be referred to as simplified, traditional...?
> Is there any meta: page explaining this? (If so, where? I haven't found one.)
> Or should I simply stop recognising zh* as interwikis?
>
> Thanks in advance.

--
Andre Engels, [hidden email]
ICQ: 6260644  --  Skype: a_engels
