[Wikimedia-l] Most wanted articles across languages

Amir E. Aharoni
Hi!

There's a little research project I've been working on in the last few
weeks: What are the articles that people are most often looking for in
their language, and *cannot* find?

I did this by looking at the logs of searches in the language search box
in the interlanguage links panel and counting, for each article, the
searches for a language that didn't yield any result.
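
A minimal sketch of that counting step, assuming each log record carries the
article the reader was on, the language they searched for, and the number of
results returned. The record layout, the sample data, and the function names
are illustrative assumptions, not the format of the actual logs or the code
behind the report:

```python
from collections import Counter

# Hypothetical log records: (article, searched_language, results_found).
# The real log format is not described in this thread.
sample_log = [
    ("Avicii", "hi", 0),
    ("Avicii", "hi", 0),
    ("Football", "es", 0),
    ("Avicii", "de", 3),
    ("Blueberry", "he", 0),
]


def count_failed_searches(records):
    """Count, per (article, language) pair, the searches that found nothing."""
    failed = Counter()
    for article, language, results_found in records:
        if results_found == 0:
            failed[(article, language)] += 1
    return failed


def most_wanted(records, language, top_n=10):
    """Return the articles most often searched for in `language` without a result."""
    per_article = Counter()
    for (article, lang), count in count_failed_searches(records).items():
        if lang == language:
            per_article[article] += count
    return per_article.most_common(top_n)


print(most_wanted(sample_log, "hi"))  # [('Avicii', 2)]
```

Grouping by (article, language) first keeps the per-language report a simple
filter over one set of counts.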

This can be useful to the editors in different languages for understanding
which articles are in demand and should be created. This may also be useful
for considering how to reorganize existing articles. Of course, actually
doing this is up to the editing communities in each language; I'm just
trying to show where exactly this happens.

My first attempt at producing a report about it can be found here:
https://meta.wikimedia.org/wiki/Most_wanted_articles_across_languages

This is my first attempt to make a public version of this report, so you
may find some issues there, for example contradictory or missing data.
Also, the tables could probably be more nicely designed. Bug reports,
suggestions for improvement, and all other feedback are welcome.
However, I believe this is good enough for taking a first look and reaching
some conclusions.

The immediate finding that I can see is that the most notable articles
that people cannot find fall into the following two categories:
* Topics that are popular in the news: "Avengers: Infinity War", "General
Data Protection Regulation", "Avicii". In particular, I should note that
topics that are featured in Google Doodles [1] come up often: "Georges
Méliès", "Mahadevi Varma", etc.
* Topics that are covered in another language, but cannot be found because
of different organization of information. This often happens with articles
where there are cultural differences between languages, for example
"Football" in the English Wikipedia refers to several different games (I'd
guess that many people around the world are interested in "Association
Football"). This also often happens with articles about biology and
species: "Homo Sapiens", "Blueberry", etc.; these are organized differently
in different Wikipedias.

[1] https://www.google.com/doodles/


--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
‪“We're living in pieces,
I want to live in peace.” – T. Moore‬
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: [hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] Most wanted articles across languages

jmh649
Excellent. Google also provided a list of some of the most missing items in
13 languages of India as part of Project Tiger.

https://meta.wikimedia.org/wiki/Supporting_Indian_Language_Wikipedias_Program/Contest/Topics

James

On Thu, May 31, 2018 at 10:58 AM, Amir E. Aharoni <
[hidden email]> wrote:

> [...]

--
James Heilman
MD, CCFP-EM, Wikipedian

Re: [Wikimedia-l] Most wanted articles across languages

Amir E. Aharoni
This is indeed comparable, though from a slightly different angle, and we
are doing it completely ourselves.

Hopefully it will be directly useful to editors, and also for improving the
software. For example, we already used it to improve the functionality of
the search box itself, so that it would be able to find languages with
alternate names, such as "castellano" and "español" for Spanish, and a few
more.
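
As a rough illustration of matching alternate language names, a lookup along
these lines would resolve both "castellano" and "español" to Spanish. The
alias table and function names are assumptions made for this sketch; it does
not show how the actual search box is implemented:

```python
import unicodedata


def normalize_name(name):
    """Trim, Unicode-normalize, and case-fold a typed language name."""
    return unicodedata.normalize("NFC", name.strip()).casefold()


# Illustrative alias table keyed by language code (hypothetical data).
_RAW_ALIASES = {
    "es": ["español", "castellano", "Spanish"],
}

# Flatten to a normalized alias -> language code lookup.
LANGUAGE_ALIASES = {
    normalize_name(alias): code
    for code, aliases in _RAW_ALIASES.items()
    for alias in aliases
}


def resolve_language(query):
    """Return the language code for a typed name, or None if it is unknown."""
    return LANGUAGE_ALIASES.get(normalize_name(query))


print(resolve_language("Castellano"))  # es
print(resolve_language("español"))     # es
```

Normalizing both the table keys and the query with the same function means
differences in case, surrounding whitespace, or accent encoding do not cause
a miss.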

On Thu, May 31, 2018 at 10:41, James Heilman <[hidden email]> wrote:

> [...]