Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Lucie-Aimée Kaffee
Wikimedia as a movement has long given consideration to small language
Wikipedias.
I would like to point you to recent research that I have been pursuing
together with Hady Elsahar of the Université de Lyon and Pavlos Vougiouklis
of the University of Southampton, which has recently resulted in two
accepted publications.

My research interest involves mainly underserved languages on Wikidata and
Wikipedia, and how we can support them better.

One of the ways to support small Wikipedias is the ArticlePlaceholder [1].
The idea is to take the existing multilingual information in Wikidata [2]
and display it in a reader-friendly way on the Wikipedia in the respective
language (provided a Wikidata label exists in that language).

However, at the moment the data is shown only in tabular form, which is
not very reader-friendly and might not be the ideal way to encourage
editors to work on the articles.

Therefore, we worked on producing sentences from the information in
Wikidata in the given language. We trained a neural network model; the
details can be found in the preprint of the NAACL paper here:
https://arxiv.org/abs/1803.07116
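For intuition, here is a hypothetical sketch (not the paper's actual
preprocessing) of how a set of Wikidata triples about an entity might be
linearized into the input sequence a sequence-to-sequence model consumes;
the marker tokens and property labels below are illustrative assumptions:

```python
# Hypothetical sketch: flattening Wikidata-style (property, value) pairs
# into one token sequence for a sequence-to-sequence encoder. Marker
# tokens and property labels are illustrative, not taken from the paper.

def linearize_triples(entity_label, triples):
    """Return a flat token list: entity first, then property/value pairs."""
    tokens = ["<entity>", entity_label]
    for prop_label, value_label in triples:
        tokens += ["<prop>", prop_label, "<val>", value_label]
    return tokens

# Example input an encoder might see for the item about Berlin:
tokens = linearize_triples("Berlin", [
    ("instance of", "city"),
    ("country", "Germany"),
])
# tokens -> ['<entity>', 'Berlin', '<prop>', 'instance of', '<val>', 'city',
#            '<prop>', 'country', '<val>', 'Germany']
```

The decoder would then learn which surface words of the target sentence
realize which slot of this sequence.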
Given the promising results of this neural network approach, we
extended the work to see how we could fit this text generation into the
existing ArticlePlaceholder, and tested it with the Esperanto and Arabic
Wikipedia communities. The ESWC paper preprint for this work can be found
here:
https://2018.eswc-conferences.org/wp-content/uploads/2018/02/ESWC2018_paper_131.pdf

We show that our approach is feasible for generating text from Wikidata for
Wikipedia. Editors tend to reuse the generated sentences, which suggests
they can be a good encouragement to create full articles from those
summaries.

We would like to implement the work in a test Wikipedia to see if
communities are interested in adopting the technology on a large scale in
their Wikipedias.

Furthermore, we would love to hear your input: Do you believe one-sentence
summaries are enough, or can we serve the communities' needs better with
more than one sentence? Would this still be true if longer abstracts were of
lower text quality? What other interesting use cases for such a technology
in the Wikimedia world can you imagine? And especially if you are part of an
underserved language Wikipedia community, what is your opinion of the
project?

[1] https://www.mediawiki.org/wiki/Extension:ArticlePlaceholder and
https://commons.wikimedia.org/wiki/File:Generating_Article_Placeholders_from_Wikidata_for_Wikipedia_-_Increasing_Access_to_Free_and_Open_Knowledge.pdf
[2]
https://eprints.soton.ac.uk/413433/1/Open_Sym_Short_Paper_Wikidata_Multilingual.pdf

--
Lucie-Aimée Kaffee
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Leila Zia
Hi Lucie-Aimée,

Nice to see work in this direction is progressing. Some comments in-line.

On Wed, Apr 4, 2018 at 7:49 AM, Lucie-Aimée Kaffee <[hidden email]> wrote:
>
> Therefore, we worked on producing sentences from the information on
> Wikidata in the given language. We trained a neural network model, the
> details can be found in the preprint of the NAACL paper here:
> https://arxiv.org/abs/1803.07116

It would be good to do human (both readers and editors, and perhaps
both sets) evaluations for this research, too, to better understand
how well the model is doing from the perspective of the experienced
editors in some of the smaller languages as well as their readers. (I
acknowledge that finding experienced editors when you go to small
languages can become hard.)

> Furthermore, we would love to hear your input: Do you believe, one sentence
> summaries are enough, can we serve the communities needs better with more
> than one sentence?

This is a hard question to answer. :) The answer may depend on many
factors, including the language you want to implement such a system in
and the expectations the users of that language have in terms of online
content available to them in their language.

> Is this still true if longer abstracts would be of lower
> text quality?

same as above. You are signing yourself up for more experiments. ;)

I would be interested to know:
* What is the perception of the readers of a given language about
Wikipedia if a lot of articles that they go to in their language have
one sentence (to a good extent accurate), a few sentences but with
some errors, more sentences with more errors, versus not finding the
article they're interested in at all?
* Related to the above: what is the error threshold beyond which the
brand perception will turn negative (to be defined: maybe by
measuring whether the user returns in the coming week or month)? This may
well be different in different languages and cultures.
* Depending on the result of the above, we may want to look at
offering the user the option to access that information, but outside
of Wikipedia, or inside Wikipedia but very clearly labeled as Machine
Generated as you do to some extent in these projects.

> What other interesting use cases for such a technology in the
> Wikimedia world can you imagine?

The technology itself can have a variety of use-cases, including
providing captions or summaries of photos even without layers of image
processing applied to them.

Best,
Leila


Re: Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Ziko van Dijk
Hello,

A most interesting thread, as it touches the topic from different angles. I
agree that what is actually needed is a study among readers about their
preferences.

Personally, I have some doubt whether it improves an ArticlePlaceholder
to create sentences from the data (as was done in the geographical
"articles" created by bots). The data itself is most suitable for
databases, to be looked up in a table. Reading "Berlin has 3,500,000
inhabitants" is not really an improvement compared to "Berlin /
inhabitants: 3,500,000".

Sentences have the most power when they combine information into knowledge,
as in "Berlin's population, currently 3,500,000, was much different
during the Cold War because of the declining attractiveness for businesses".

In general, I would advise against one-sentence summaries; a reader might
be disappointed when they come via Google to a website and then find only
one sentence.

(I hope I understood the question well; I cannot follow the math in your
article. Is there anywhere an example of your "summaries" to read?)

Kind regards
Ziko

Re: Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Lucie-Aimée Kaffee
Hi Leila,

First of all thanks for your input!

>
> > Therefore, we worked on producing sentences from the information on
> > Wikidata in the given language. We trained a neural network model, the
> > details can be found in the preprint of the NAACL paper here:
> > https://arxiv.org/abs/1803.07116
>
> It would be good to do human (both readers and editors, and perhaps
> both sets) evaluations for this research, too, to better understand
> how well the model is doing from the perspective of the experienced
> editors in some of the smaller languages as well as their readers. (I
> acknowledge that finding experienced editors when you go to small
> languages can become hard.)
>

We worked with editors in the follow-up study, to be published at ESWC:
https://2018.eswc-conferences.org/wp-content/uploads/2018/02/ESWC2018_paper_131.pdf
We also asked native speakers for their input on the fluency of the
sentences. However, I agree it would be interesting to dive deeper into the
question of how the community perceives the ArticlePlaceholder in general
and the generated summary in particular.


>
> > Furthermore, we would love to hear your input: Do you believe, one
> sentence
> > summaries are enough, can we serve the communities needs better with more
> > than one sentence?
>
> This is a hard question to answer. :) The answer may rely on many
> factors including the language you want to implement such a system in
> and the expectation the users of the language have in terms of online
> content available to them in their language.
>

I agree. The best approach would therefore probably be to study the current
usage of the ArticlePlaceholder and the targeted communities, and draw
conclusions about their real needs from those points.

>
> > Is this still true if longer abstracts would be of lower
> > text quality?
>
> same as above. You are signing yourself up for more experiments. ;)
>
> I would be interested to know:
> * What is the perception of the readers of a given language about
> Wikipedia if a lot of articles that they go to in their language have
> one sentence (to a good extent accurate), a few sentences but with
> some errors, more sentences with more errors, versus not finding the
> article they're interested in at all?
> * Related to the above: what is the error threshold beyond which the
> brand perceptions will turn negative (to be defined: may be by
> measuring if the user returns in the coming week or month.)? This may
> well be different in different languages and cultures.
> * Depending on the result of the above, we may want to look at
> offering the user the option to access that information, but outside
> of Wikipedia, or inside Wikipedia but very clearly labeled as Machine
> Generated as you do to some extent in these projects.
>
These questions are very interesting, and in part formalize what we have
already discussed as well. The best way would be to actually study this with
the communities involved, as we started to in the ESWC paper, but focusing
on the different interest groups in particular: readers of Wikipedia,
readers coming from outside Wikipedia, editors of Wikipedia, and new
editors.

>
> > What other interesting use cases for such a technology in the
> > Wikimedia world can you imagine?
>
> The technology itself can have a variety of use-cases, including
> providing captions or summaries of photos even without layers of image
> processing applied to them.
>

This sounds like a very interesting idea. I saw that the WMF has already
started work on image captions; I will be following it with great
curiosity :)

Best,
Lucie


Re: Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Lucie-Aimée Kaffee
Hello Ziko,

Thanks for your mail! I responded inline below.

On 6 April 2018 at 03:04, Ziko van Dijk <[hidden email]> wrote:

> Hello,
>
> A most interesting thread, as it touches the topic from different angles. I
> agree that it needs actually a study among readers about their preferences.
>
As I mentioned to Leila, the ESWC paper does work with editors, but I
agree that more thought and work should be put into actual Wikipedia
readers.

>
> Personally, I may have some doubt whether it improves an ArticlePlaceholder
> to create sentences from the data (as they did in the geographical
> "articles" created by bots). The data itself is most suitable for
> databases, to be looked up in a table. Reading "Berlin has 3,500,000
> million inhabitants" is not really an improvement compared to "Berlin /
> inhabitants: 3,500,000".
>
> Sentences have the most power when they combine information to knowledge,
> like in "Berlin's population, currently 3,500,000, has been much different
> during the Cold War because of the declining attractiveness for
> businesses".
>
> In general, I would advise against one-sentence-summaries; a reader might
> be disappointed when he comes via Google to a website and then only finds
> one sentence.
>

Just to clarify: the summaries are generated from multiple triples.
Basically, this means the sentences are a bit more complex than just
verbalizing one triple per sentence. However, even with a neural network,
there is a limit to how much context we can produce for each sentence.
Therefore, we integrated the question of how editors work with the data, as
we see it as an important aspect of the workflow. Basically, the
ArticlePlaceholder can be a better option than no information at all, but
the ideal would still be an actual editor picking up a topic and writing
and maintaining a full article.
Furthermore, in our current (theoretical) design we still keep all the
information available from Wikidata in the form of triples. Therefore, we
don't replace any information; we just add a sentence that is more
reader-friendly and gives a first overview before looking at the raw
triples.
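A minimal sketch of that design choice (hypothetical template logic, not
our trained model): the raw triples stay available in tabular form, and one
sentence combining several of them is added on top:

```python
# Hypothetical sketch of the design described above: the triples are kept
# as a table, and one reader-friendly sentence combining several of them
# is added. A real system would use the trained model, not this template.

def render_placeholder(entity, triples):
    facts = dict(triples)
    sentence = None
    if "instance of" in facts and "country" in facts:
        # Combine two triples into one sentence rather than one each.
        sentence = f"{entity} is a {facts['instance of']} in {facts['country']}."
    # The raw data remains available regardless of the sentence.
    table = [f"{entity} / {prop}: {value}" for prop, value in triples]
    return sentence, table

sentence, table = render_placeholder("Berlin", [
    ("instance of", "city"),
    ("country", "Germany"),
    ("population", "3,500,000"),
])
# sentence -> "Berlin is a city in Germany."
# table[2] -> "Berlin / population: 3,500,000"
```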

>
> (I hope I understood the question well; I cannot follow the math in your
> article. Is there anywhere an example of your "summaries" to read?)
>
The summaries are learned from the first sentences of Wikipedia articles,
so they have the same kind of structure and content. If you are able to
read Arabic or Esperanto, generated sentences can be found here:
https://github.com/pvougiou/Mind-the-Language-Gap/tree/master/Results/Our%20Model
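As a hypothetical sketch of how such training pairs can be assembled (the
sentence-splitting heuristic and field names are assumptions, not the
paper's actual preprocessing):

```python
# Hypothetical sketch: pair an entity's triples with the first sentence of
# its Wikipedia article to form one training example. The crude splitting
# heuristic below is an assumption, not the paper's actual preprocessing.

def first_sentence(article_text):
    """Cut at the first period followed by a space (rough heuristic)."""
    end = article_text.find(". ")
    return article_text if end == -1 else article_text[: end + 1]

def make_training_pair(triples, article_text):
    return {"input": triples, "target": first_sentence(article_text)}

pair = make_training_pair(
    [("instance of", "city"), ("country", "Germany")],
    "Berlin is the capital of Germany. It has a long history.",
)
# pair["target"] -> "Berlin is the capital of Germany."
```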

Cheers,
Lucie


Re: Generation of Wikipedia Summaries from Wikidata in Underserved Languages using Deep Learning

Ziko van Dijk
Thank you, Lucie, for taking the effort to answer in detail. As I said, I
am afraid I cannot really understand your paper, as I come from the
humanities. And of course, a study about reader expectations was not part
of your paper and research. Personally, I would start there, and I know
that Wikipedia research has always paid more attention to contributors than
to readers.

You are actually opening a new issue: what is useful for readers is one
thing. The other thing is: does an ArticlePlaceholder help an editor to
improve an article? I would suppose that it is best to start the article on
your own, but that may depend on the topic of the article.

I do speak Esperanto, by chance. :-)
https://eo.wikipedia.org/wiki/Uzanto:Ziko

Kind regards,
Ziko
