[Wikimedia-l] A proposal towards a multilingual Wikipedia


[Wikimedia-l] A proposal towards a multilingual Wikipedia

Denny Vrandečić
I have been thinking about this for a while, and now finally managed to
write it down as a proposal. Details are on meta on the following link,
below is the intro to the proposal:

<http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia>

I tried to anticipate some possible questions and provide answers on the
page. Besides that, I obviously hope that Wikimania could provide a place
to start this conversation. And yes, I am aware that the proposal would
lead to a very restrictive solution, but imagine what good it already could
achieve! And since it is not meant to replace anything, but enrich our
current projects... well, read for yourself.

Cheers,
Denny


Wikipedia provides knowledge in more than 200 languages. Whereas a small
number of languages are fortunate enough to have a large Wikipedia, many of
the language editions are far away from providing a comprehensive
encyclopedia by any measure. There are several approaches towards closing
this gap, mostly focusing on increasing the number of contributors to the
small language editions or on improving the provision of automatic or
semi-automatic translations of articles. Both are viable. In the following
we present a proposal for a different approach, which is based on the idea
of a multilingual Wikipedia.

Imagine a small extension to the template system, where a template call
like *{{F12}}* would not be expanded by a call to the template
Template:F12, but rather to Template:F12/en, i.e. the template name
suffixed with the language code selected by the reader of the page. A
template call such as *{{F12:Q64|Q5119|Q183}}* can be expanded by
Template:F12/en into *“Berlin is the capital of Germany.”* and by
Template:F12/de into *“Berlin ist die Hauptstadt Deutschlands.”* (in the
example, the template parameters Q64, Q5119 and Q183 refer to the Wikidata
items for Berlin, capital and Germany respectively, and the templates query
these items for their labels in the respective language). A simple article
could be built up this way, sentence by sentence.
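
The expansion described above could be sketched roughly as follows. This is
only an illustration in Python, not MediaWiki template or Lua code: the
LABELS table is a hypothetical stand-in for Wikidata label lookups, and the
names label, f12_en, f12_de, FRAMES and expand are assumptions made for this
sketch. It uses the item IDs the message gives for Berlin, capital and
Germany.

```python
# Hypothetical label store standing in for Wikidata; it holds only the
# three items from the example.
LABELS = {
    "Q64": {"en": "Berlin", "de": "Berlin"},
    "Q5119": {"en": "capital", "de": "Hauptstadt"},
    "Q183": {"en": "Germany", "de": "Deutschland"},
}


def label(item, lang):
    """Return the label of an item in the given language."""
    return LABELS[item][lang]


def f12_en(subject, role, obj):
    # English frame: "<subject> is the <role> of <object>."
    return f"{label(subject, 'en')} is the {label(role, 'en')} of {label(obj, 'en')}."


def f12_de(subject, role, obj):
    # German frame. The article "die" and the genitive "-s" are hard-coded
    # for this one example; a real frame would need grammatical information.
    return f"{label(subject, 'de')} ist die {label(role, 'de')} {label(obj, 'de')}s."


# Frames are stored under "<name>/<language code>", mirroring Template:F12/en.
FRAMES = {"F12/en": f12_en, "F12/de": f12_de}


def expand(name, lang, *args):
    """Expand a frame call using the frame for the reader's language."""
    return FRAMES[f"{name}/{lang}"](*args)


print(expand("F12", "en", "Q64", "Q5119", "Q183"))
print(expand("F12", "de", "Q64", "Q5119", "Q183"))
```

The content page carries only the language-neutral call; everything
language-specific lives in the per-language frame.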

That wiki would consist of *content*, i.e. the article pages, possibly just
a simple series of template calls, and *frames*, i.e. the templates that
lexicalize the parameters of a given template call into a sentence (Note
that “sentence” here should not be considered literally. It could be a
table, an image, anything). The implementation of the frames can be done in
normal wiki template syntax, in Lua, in a novel mechanism, or a mix of
these. This would be up to the communities creating them.

Read the rest here:
<http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia>

--
Project director Wikidata
Wikimedia Deutschland e.V. | Obentrautstr. 72 | 10963 Berlin
Tel. +49-30-219 158 26-0 | http://wikimedia.de

Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e.V.
Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter
der Nummer 23855 B. Als gemeinnützig anerkannt durch das Finanzamt für
Körperschaften I Berlin, Steuernummer 27/681/51985.
_______________________________________________
Wikimedia-l mailing list
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Jane Darnell
Love it!

2013/8/7, Denny Vrandečić <[hidden email]>:

> I have been thinking about this for a while, and now finally managed to
> write it down as a proposal. Details are on meta on the following link,
> below is the intro to the proposal:
>
> <http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia>

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Emilio J. Rodríguez-Posada
In reply to this post by Denny Vrandečić
This may work very well for little stubs about repetitive stuff, like the
introductions of cities (location, population, foundation date, country,
etc). But how will that work for the remaining sections of the Berlin
article (history, geography, politics...)? https://en.wikipedia.org/wiki/Berlin


2013/8/7 Denny Vrandečić <[hidden email]>

> I have been thinking about this for a while, and now finally managed to
> write it down as a proposal. Details are on meta on the following link,
> below is the intro to the proposal:
>
> <http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia>

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Denny Vrandečić
I thought so myself, but then I did a bit of research to figure out the
state of natural language generation. I could not easily find a current
state of the art, but I found this list of examples on the KPML website
that is linked from the proposal. They are from 1998:

<http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/R3b12-English/Docu/ENGLISH-reuters-mismatches-19981209/index.html>
<http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/R3b12-English/Docu/ENGLISH-nigel-exerciseset-mismatches-19981209/index.html>

There are examples like:
"Analysts say that the private position is far more sensible, because it
leads to much needed capital for European computer and semiconductor
companies, while giving them a toehold in the lucrative Japanese domestic
market."

"Because of its importance, any reaction of the sixty people whose
televisions are attached to the system is monitored closely."

Since they managed it 15 years ago, I believe we can do it too. At least
try and fail.
Even if the complexity of our sentences does not rise that high, it seems
to me that there is plenty of content that would be beneficial to make
available.

Cheers,
Denny






Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Anders Wennersten-2
In reply to this post by Denny Vrandečić
Thanks for sharing your very interesting ideas. While I do not fully
support your idea of implementation, I share your basic view of the need
and think some of the concepts you introduce have a very high potential
to better utilize the power of us having many versions.

I have put my feedback on the talk page and hope there will be a
possibility to evolve this concept further in some type of workgroup. I
also see an interesting relation to the talk of machine translation,
where I believe we can do a lot very quickly if we limit the vocabulary
to be included in such a tool.

Anders


Denny Vrandečić wrote on 2013-08-07 02:20:

> I have been thinking about this for a while, and now finally managed to
> write it down as a proposal. Details are on meta on the following link,
> below is the intro to the proposal:
>
> <http://meta.wikimedia.org/wiki/A_proposal_towards_a_multilingual_Wikipedia>

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Emilio J. Rodríguez-Posada
Most of the time the best approach is a combination of several approaches.

Perhaps we can use the Denny system for the short introduction of articles
(for example: geography, biographies) and optional automatic translation
for the rest of the article.

I mean, if you follow a red link in a small Wikipedia, it loads the i18n
template + Wikidata bits, so you have a brief summary about the topic. Then
you can save that "live" generated stub, and expand it (using
autotranslation from another Wikipedia).



Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Denny Vrandečić
In reply to this post by Anders Wennersten-2
Thank you, Anders. Yes, I published the idea in order to garner feedback
and further evolve it. It is by no means ready-perfect-finished, it is
rather really just a first draft. So suggestions, constructive critique,
and improvements are obviously extremely welcome. --~~~~



Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Denny Vrandečić
In reply to this post by Emilio J. Rodríguez-Posada
Obviously, this system should only be used as far as it carries. I don't
know how far it might carry us - it might fail miserably, and not get
beyond the "Rome is a city. Rome is in Italy. Rome is known for The
Colosseum, coffee and Vatican City (state)." stage. It might lead to a
glorious future, where we really create an open source system that allows
everyone to write in every language and express a wide range of human
thought.

I am personally hesitant about automatic translations, and whether we can
achieve the coverage (in language pairs) and the quality (of Wikipedia).
But that is only my opinion. A hybrid approach, if we can support it and
build it, would obviously be the safest bet, as both endeavors are rather
risky. I see a lot of possible space for a hybrid system, as you describe
it.

One advantage of my proposal is that its cost is rather small. For
supporting translation I haven't yet seen a sufficiently sketched proposal
that allows us to estimate the potential cost and potential benefit.

Cheers,
Denny






2013/8/7 Emilio J. Rodríguez-Posada <[hidden email]>

> Most times the best approach is a compilation of several approaches.
>
> Perhaps we can use the Denny system for the little introduction of articles
> (for example: geography, biographies) and optional automatic translation
> for the rest of the article.
>
> I mean, if you follow a red link in a small Wikipedia, it loads the i18n
> template + Wikidata bits, so you have a brief summary about the topic. Then
> you can save that "live" generated stub, and expand it (using
> autotranslation from another Wikipedia).
>
>
> 2013/8/7 Anders Wennersten <[hidden email]>
>
> > Thanks for sharing your very interesting ideas. While I do not fully
> > support your idea of implementation, I share your basic view of the need
> > and think some of the concepts you introduce have a very high potential to
> > better utilize the power of us having many versions.
> >
> > I have put my feedback on the talk page and hope there will be a
> > possibility to evolve this concept further in some type of workgroup. I
> > also see an interesting relation to the talk of machine translation, where I
> > believe we can do a lot very quickly if we limit the vocabulary to be
> > included in such a tool.
> >
> > Anders
> >
> >




Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

David Cuenca Tudela
On Wed, Aug 7, 2013 at 8:50 AM, Denny Vrandečić <[hidden email]> wrote:

> [...] It might lead to a glorious future, where we really create an open
> source system that allows everyone to write in every language and express
> a wide range of human thought.
>

As much as I love this proposal, I have some reservations, namely:
http://www.xkcd.com/191/

So if there are already complaints about the skewed participation in
Wikipedia, this would be the right step to skew it even further. This is
not necessarily bad (maybe it could even work without patrolling), however
instead of stating "allow everyone to write in every language", I think it
would be more realistic to state "allow everyone who wants to take the
extra effort of participating in such a project to write in every
language".
Predicted demographics: 95% women from the "global south" :)


> I am personally hesitant about automatic translations, and whether we can
> achieve the coverage (in language pairs) and the quality (of Wikipedia).
> But that is only my opinion. A hybrid approach, if we can support it and
> build it, would obviously be the safest bet, as both endeavors are rather
> risky. I see a lot of possible space for a hybrid system, as you describe
> it.
>

+1000


>
> One advantage of my proposal is that its cost is rather small. For
> supporting translation I haven't yet seen a sufficiently sketched proposal
> that allows us to estimate the potential cost and potential benefit.
>

As with so many things, it will be hard to assess cost/benefits without
making some effort. A safe bet could be to try with an existing pair or
develop a pair with an estimated high demand. If that works, scale up;
otherwise stop there.

Micru

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

metasj
On Tue, Aug 13, 2013 at 1:57 PM, David Cuenca <[hidden email]> wrote:

> Predicted demographics: 95% women from the "global south" :)

(-:

>> I am personally hesitant about automatic translations, and whether we can
>> achieve the coverage (in language pairs) and the quality (of Wikipedia).
>> But that is only my opinion. A hybrid approach, if we can support it and
>> build it, would obviously be the safest bet, as both endeavors are rather
>> risky. I see a lot of possible space for a hybrid system, as you describe
>> it.
>
> +1000

I have a lot of love for this idea in general, and a hybrid approach
to this part; thank you for articulating it so clearly.

>> One advantage of my proposal is that its cost is rather small. For
>> supporting translation I haven't yet seen a sufficiently sketched proposal
>> that allows us to estimate the potential cost and potential benefit.
>
> As with so many things, it will be hard to assess cost/benefits without
> making some effort. A safe bet could be to try with an existing pair or
> develop a pair with an estimated high demand.

Is there a pair where some work has already been done?

SJ


Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

David Cuenca Tudela
On Mon, Aug 19, 2013 at 5:31 PM, Samuel Klein <[hidden email]> wrote:

>
> > As with so many things, it will be hard to assess cost/benefits without
> > making some effort. A safe bet could be to try with an existing pair or
> > develop a pair with an estimated high demand.
>
> Is there a pair where some work has already been done?
>

For Apertium there are quite a few already done:
http://wiki.apertium.org/wiki/Main_Page

Regarding new language pairs, I have no idea whether the priorities for
Wikipedia would be the same as those of the Apertium community.
It might be worth considering which languages to prioritize and how to
measure success or lack thereof.

Cheers,
Micru

Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

Denny Vrandečić
Using a rather simple pair like Afrikaans - Dutch or a heavily researched
one like English - Spanish would give us the wrong impression of how
this will scale. We should at least add a few random pairs like Yoruba -
Gujarati or Kazakh - Lombard. Most of the 67,000 language pairs that we
will have to cover will look like the latter, not like the first two.
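
For a sense of where a figure of this size can come from: the number of
language pairs grows quadratically with the number of language editions.
The sketch below is only an illustration, not the calculation behind the
number in this message. The function names and the language counts (200,
260, 285) are assumptions; the thread does not say which count was used
or whether each translation direction was counted separately.

```python
def directed_pairs(n: int) -> int:
    """Ordered (source, target) pairs among n languages: n * (n - 1)."""
    return n * (n - 1)


def undirected_pairs(n: int) -> int:
    """Unordered pairs among n languages: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Hypothetical language counts, chosen only to show the growth rate.
    for n in (200, 260, 285):
        print(f"{n} languages: {directed_pairs(n)} directed pairs, "
              f"{undirected_pairs(n)} undirected pairs")
```

Under these assumptions, about 260 languages with each direction counted
separately gives 67,340 pairs, which is in the range of the 67,000
mentioned above.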






Re: [Wikimedia-l] A proposal towards a multilingual Wikipedia

David Cuenca Tudela
Something to take into account should be the efficiency a language pair can
have: for instance, how many articles are available, how easy it is to
translate articles, how many bilingual speakers there are for a given pair,
and perhaps also how much it can help to harmonize relationships between
speakers of both languages.

There seems to be much more demand for languages that are geographically
closer. While speakers of Kazakh might have little interest in reading the
Lombard or Gujarati Wikipedias, they might be more inclined to visit the
Tatar Wikipedia, whose language, by the way, is closely related and much
easier to translate.

So no, I don't think we should base our decisions on the theoretical number
of pairs that can exist, but on the ones that offer the best efficiency.

Cheers,
Micru





--
Etiamsi omnes, ego non