Users across all Wikimedia projects


Users across all Wikimedia projects

Piscopo A.
Hello everyone,

I would like to carry out a study of how users work across different projects in the Wikimedia ecosystem.
I can’t find any dataset containing all usernames and user IDs across all the projects, or at least those of users with a global account.
I’ve tried Quarry, but a query that retrieves all of this data is too large to run there, so it is not really a solution.
Can anybody point me to a resource I can download and process myself, e.g. a global account user dataset, or the whole user database table that can be queried in Quarry?

Thanks,
     Alessandro

–––
Alessandro Piscopo
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton
email: [hidden email]

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Users across all Wikimedia projects

Scott Hale
Hi Alessandro,

Usernames are now unique across Wikimedia projects, so it is possible to
simply union/intersect the usernames from any set of projects to measure
the overlap. As you have seen, however, there is a huge number of users,
although most are not active in any given month. One approach I have used
[1] is to monitor the recent changes feeds of multiple projects for a time
period and look at the overlap within that data, but that obviously does
not speak to longer-term trends. For those, I would first define a suitable
activity metric (e.g., one or more edits per month [2], or 5 or more edits
[3]) to get a list of active editors per project.
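
As a rough illustration of the filter-then-intersect idea above, here is a
minimal Python sketch. It assumes you already have one (project, username)
pair per edit for the time window of interest, however you obtained them.
The function names, the project labels and the toy data are all made up for
the example, and the threshold of 5 edits is just one of the metrics
mentioned above. It relies on usernames being unique across projects.

```python
from collections import Counter, defaultdict


def active_editors(edits, min_edits=5):
    """Return {project: set of usernames with at least min_edits edits}.

    `edits` is an iterable of (project, username) pairs, one per edit,
    already restricted to the time window of interest (e.g. one month).
    """
    counts = Counter(edits)
    active = defaultdict(set)
    for (project, user), n in counts.items():
        if n >= min_edits:
            active[project].add(user)
    return dict(active)


def cross_project_editors(active, min_projects=2):
    """Return {username: sorted projects} for users active in several projects."""
    projects_by_user = defaultdict(set)
    for project, users in active.items():
        for user in users:
            projects_by_user[user].add(project)
    return {
        user: sorted(projects)
        for user, projects in projects_by_user.items()
        if len(projects) >= min_projects
    }


# Toy data (hypothetical): one tuple per edit in the chosen window.
edits = (
    [("enwiki", "Alice")] * 6
    + [("dewiki", "Alice")] * 5
    + [("enwiki", "Bob")] * 7
    + [("dewiki", "Bob")] * 2
    + [("wikidatawiki", "Carol")] * 9
)

active = active_editors(edits, min_edits=5)

# Plain set intersection gives the overlap between two projects.
overlap = active["enwiki"] & active["dewiki"]
print(sorted(overlap))                # ['Alice']
print(cross_project_editors(active))  # {'Alice': ['dewiki', 'enwiki']}
```

Running the same functions once per month would give a series of overlap
sets, which is one possible way to look at change over time.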

[1] Hale, S. A. (2014). Multilinguals and Wikipedia editing. In Proceedings
of the 6th Annual ACM Web Science Conference, WebSci ’14, ACM.
http://www.scotthale.net/pubs/?websci2014
[2] https://stats.wikimedia.org/v2/#/en.wikipedia.org/contributing/editors/normal|line|2-Year~2016070100~2018080300|~total
[3] https://meta.wikimedia.org/wiki/Research:Wikistats_metrics/Editors


Best wishes,
Scott

On Fri, Aug 3, 2018 at 2:52 PM Piscopo A. <[hidden email]> wrote:

> [...]


--
Dr Scott A. Hale
http://scott.hale.us
[hidden email]

Re: Users across all Wikimedia projects

Piscopo A.
Hi Scott,

Thank you very much; the paper is definitely relevant to what I am doing.
However, I would like to look at how the behaviour of multilinguals evolves over time, so I will have to find another solution for getting usernames.
I’ll keep you and the community updated on future developments.

Cheers,
      Alessandro

–––
Alessandro Piscopo
Web and Internet Science Group
School of Electronics and Computer Science
University of Southampton
email: [hidden email]

On 3 Aug 2018, at 15:43, Scott Hale <[hidden email]> wrote:

> [...]