Query user history edits

Haifeng Zhang
Dear folks,

Is there a good way to query a user's edit history, e.g., edit count during a period?

My current solution uses the usercontribs API (https://www.mediawiki.org/wiki/API:Usercontribs).

However, the process has stalled, possibly because of a query limit.


Thanks,

Haifeng Zhang
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Query user history edits

Morten Wang
Hi Haifeng,

In my experience, this depends on how many users you're looking to get
information about. Is it a few hundred? A few thousand? A million+?

If you are getting the edit history for a limited number of users (say a
few hundred to a few thousand), then using the API can work well. One thing
to keep in mind when using the API is that your requests might be throttled
and/or there might be database lag. Are you using a software library to
access the API? If not, I'd consider using one so that throttling/lag
doesn't become an issue. That's one of the reasons why I use Pywikibot
<https://www.mediawiki.org/wiki/Manual:Pywikibot> for API requests.
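
To make the throttling and lag points above concrete, here is a minimal sketch of counting one user's edits through list=usercontribs using only the Python standard library. It is not Pywikibot and not code from this thread. Assumptions: API_URL points at English Wikipedia, the User-Agent string is a placeholder you would replace, and the function names (build_params, fetch_json, count_edits) are invented for illustration. The sketch follows the `continue` values the API returns and waits before retrying when the API answers with a `maxlag` error.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumption: English Wikipedia; any MediaWiki api.php endpoint should work.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(user, start, end, cont=None):
    """Build a list=usercontribs request for edits between start and end.

    start/end are ISO 8601 strings; with ucdir=newer, start is the earlier one.
    """
    params = {
        "action": "query",
        "list": "usercontribs",
        "ucuser": user,
        "ucstart": start,
        "ucend": end,
        "ucdir": "newer",
        "ucprop": "ids|timestamp",
        "uclimit": "max",
        "maxlag": "5",  # ask the server to refuse the request when replicas lag
        "format": "json",
    }
    params.update(cont or {})  # merge continuation values from the last reply
    return params


def fetch_json(params):
    """Send one GET request and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    headers = {"User-Agent": "edit-count-sketch/0.1 (placeholder contact)"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_edits(user, start, end, fetch=fetch_json, pause=5.0, max_retries=5):
    """Count a user's edits in a period, following continuation and lag errors."""
    total, cont, retries = 0, None, 0
    while True:
        data = fetch(build_params(user, start, end, cont))
        error = data.get("error")
        if error and error.get("code") == "maxlag":
            retries += 1
            if retries > max_retries:
                raise RuntimeError("server still lagged after retries")
            time.sleep(pause)  # back off, then repeat the same request
            continue
        if error:
            raise RuntimeError(str(error))
        retries = 0
        total += len(data["query"]["usercontribs"])
        cont = data.get("continue")
        if not cont:
            return total
```

A call such as count_edits("Example", "2019-01-01T00:00:00Z", "2019-03-01T00:00:00Z") would issue one request per batch of results. The `fetch` argument exists so the loop can be exercised without network access. A library like Pywikibot handles these details for you.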

If you're interested in querying a large number of users (say tens of
thousands or more), then getting an account on Toolforge
<https://tools.wmflabs.org> so you can run SQL queries against the
replicated MediaWiki databases would make sense. I've frequently used that
approach for data gathering for research purposes.

Hope that helps! And if not, don't hesitate to ask questions :)


Cheers,
Morten

On Wed, 27 Mar 2019 at 07:22, Haifeng Zhang <[hidden email]> wrote:

> Dear folks,
>
> Is there a good way to query a user's edit history, e.g., edit count
> during a period?
>
> My current solution is using usercontribs API (
> https://www.mediawiki.org/wiki/API:Usercontribs).
>
> But, the process has been stalled maybe due to some query limit.
>
>
> Thanks,
>
> Haifeng Zhang

Re: Query user history edits

James Hare-5
I will also note that Quarry <https://quarry.wmflabs.org/> is good for
querying database replicas – no Toolforge account required.


--
*James Hare* (he/him)
Associate Product Manager
Wikimedia Foundation <https://wikimediafoundation.org/>


On Wed, Mar 27, 2019 at 9:30 AM Morten Wang <[hidden email]> wrote:

> Hi Haifeng,
>
> In my experience, this depends on how many users you're looking to get
> information about. Is it a few hundred? A few thousand? A million+?
>
> If you are getting the edit history for a limited number of users (say a
> few hundred to a few thousand), then using the API can work well. One thing
> to keep in mind when using the API is that your requests might be throttled
> and/or there might be database lag. Are you using a software library to
> access the API? If not, I'd consider using one so that throttling/lag
> doesn't become an issue, it's one of the reasons why I use Pywikibot
> <https://www.mediawiki.org/wiki/Manual:Pywikibot> for API requests.
>
> If you're interested in querying a large number of users (say tens of
> thousands or more), then getting an account on Toolforge
> <https://tools.wmflabs.org> so you can run SQL queries against the
> replicated MediaWiki databases would make sense. I've frequently used that
> approach for data gathering for research purposes.
>
> Hope that helps! And if not, don't hesitate to ask questions :)
>
>
> Cheers,
> Morten

Re: Query user history edits

Aaron Halfaker-2
If you share what kind of query you'd like to run, people on this list might
even write you an example Quarry query :)  See also:
https://www.mediawiki.org/wiki/Talk:Quarry  People post query requests there,
and others help them draft the right query for their needs.

On Wed, Mar 27, 2019 at 11:28 AM James Hare <[hidden email]> wrote:

> I will also note that Quarry <https://quarry.wmflabs.org/> is good for
> querying database replicas – no Toolforge account required.
>
>
> --
> *James Hare* (he/him)
> Associate Product Manager
> Wikimedia Foundation <https://wikimediafoundation.org/>

Re: Query user history edits

Maximilian Klein
Here is a sample query I just wrote that gets the edit count of a single user
(by username) during a period of time on Quarry:
https://quarry.wmflabs.org/query/34716 .
The gotcha that might be important for this query is using the table
`revision_userindex` rather than the plain `revision` table, which doesn't have
such an index.
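
For readers who cannot open the Quarry link, the following is a rough sketch of what such a query could look like. It is not the query behind the link above. Assumptions: the column names (`rev_actor`, `rev_timestamp`, `actor_id`, `actor_name`) follow the MediaWiki schema as I understand it today, while older replica views exposed `rev_user_text` instead; the helper names are invented; and the `%s` placeholders are meant for a Python database driver, so on Quarry you would paste literal values in their place.

```python
from datetime import datetime, timezone

# Assumed schema: revision_userindex joined to actor (see the note above).
EDIT_COUNT_SQL = """
SELECT COUNT(*) AS edit_count
FROM revision_userindex
JOIN actor ON rev_actor = actor_id
WHERE actor_name = %s
  AND rev_timestamp >= %s
  AND rev_timestamp < %s
""".strip()


def mw_timestamp(dt):
    """Format a datetime as a 14-digit MediaWiki timestamp (YYYYMMDDHHMMSS).

    Timezone-aware values are converted to UTC; naive values are taken as UTC.
    """
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc)
    return dt.strftime("%Y%m%d%H%M%S")


def edit_count_query(username, start, end):
    """Return (sql, params) counting a user's edits in [start, end)."""
    if start >= end:
        raise ValueError("start must be earlier than end")
    # User names are stored with spaces rather than underscores.
    name = username.replace("_", " ")
    return EDIT_COUNT_SQL, (name, mw_timestamp(start), mw_timestamp(end))
```

The timestamp helper is there because MediaWiki stores revision times as 14-digit strings rather than as SQL datetime values, so the period bounds have to be written in the same form.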


Make a great day,
Max Klein ‽ http://notconfusing.com/


On Wed, 27 Mar 2019 at 09:42, Aaron Halfaker <[hidden email]>
wrote:

> If you share what kind of query you'd like to run, people on this list might
> even write you an example Quarry query :)  See also:
> https://www.mediawiki.org/wiki/Talk:Quarry  People post query requests
> there, and others help them draft the right query for their needs.

Re: Query user history edits

Haifeng Zhang
Thanks a lot for answering my questions, guys.

May I save/upload my own table (with user names and time ranges) to Quarry?

It looks like I have to enter this information manually in the SQL queries.
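
One possible workaround for the manual entry, sketched under assumptions: generate the SQL text from your own list of user names and time ranges with a small script, then paste the result into Quarry. The function names are invented, the schema (`revision_userindex`, `actor`, `rev_actor`, `actor_name`, `rev_timestamp`) is assumed as in current MediaWiki, and the generated query has not been run against the live replicas, so treat it as a starting point rather than a tested solution.

```python
def sql_quote(value):
    """Quote a string literal for MariaDB (escape backslashes and quotes)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def check_timestamp(ts):
    """Accept only 14-digit MediaWiki timestamps (YYYYMMDDHHMMSS)."""
    if not (len(ts) == 14 and ts.isascii() and ts.isdigit()):
        raise ValueError(f"bad timestamp: {ts!r}")
    return ts


def build_batch_query(rows):
    """Build one query from (username, start_ts, end_ts) rows.

    The rows become an inline table (SELECT ... UNION ALL ...), which is
    then joined against the revisions so each row gets its own edit count.
    """
    selects = []
    for name, start, end in rows:
        selects.append(
            "SELECT {} AS user_name, '{}' AS start_ts, '{}' AS end_ts".format(
                sql_quote(name), check_timestamp(start), check_timestamp(end)
            )
        )
    if not selects:
        raise ValueError("no rows given")
    inline = "\n  UNION ALL\n  ".join(selects)
    return (
        "SELECT w.user_name, w.start_ts, w.end_ts, COUNT(rev_id) AS edit_count\n"
        "FROM (\n  " + inline + "\n) AS w\n"
        "LEFT JOIN actor ON actor_name = w.user_name\n"
        "LEFT JOIN revision_userindex\n"
        "  ON rev_actor = actor_id\n"
        " AND rev_timestamp >= w.start_ts\n"
        " AND rev_timestamp < w.end_ts\n"
        "GROUP BY w.user_name, w.start_ts, w.end_ts;"
    )
```

For example, print(build_batch_query([("Example", "20190101000000", "20190201000000")])) produces a single statement to paste into Quarry. The LEFT JOINs are a design choice so that users with no edits in their range still appear with a count of 0. For a very long list, splitting the rows into several smaller queries is probably safer.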


Also, I tried to get a Toolforge account. When I followed the step "Create a Wikimedia developer account" <https://toolsadmin.wikimedia.org/register/>, the page showed:

"Developer account creation is currently disabled. We apologise for the inconvenience."


Best,

Haifeng Zhang
________________________________
From: Wiki-research-l <[hidden email]> on behalf of Maximilian Klein <[hidden email]>
Sent: Wednesday, March 27, 2019 12:46:39 PM
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] Query user history edits

Here is a sample query I just wrote that gets the edit count of a single user
(by username) during a period of time on Quarry:
https://quarry.wmflabs.org/query/34716 .
The gotcha that might be important for this query is using the table
`revision_userindex` rather than the plain `revision` table, which doesn't have
such an index.


Make a great day,
Max Klein ‽ http://notconfusing.com/



Re: Query user history edits

Nick Wilson (Quiddity)
Hi Haifeng,

Regarding the Toolforge account, I'm sorry to say that account creation is
currently disabled. This is a temporary measure that we hope to undo soon,
but for now there is no exact timeline or public Phabricator task that I can
point you to. However, I've added you to the list of people to ping when it
is available again. Thank you for your patience.


On Wed, Mar 27, 2019 at 8:18 PM Haifeng Zhang <[hidden email]>
wrote:

> Thanks a lot for answering my questions, guys.
>
> May I save/upload my own table (with user names and time ranges) to Quarry?
>
> It looks like I have to enter this information manually in the SQL queries.
>
>
> Also, I tried to get a Toolforge account. When following the step to
>
> Create a Wikimedia developer account<
> https://toolsadmin.wikimedia.org/register/>, the page showed:
>
> "Developer account creation is currently disabled. We apologise for the
> inconvenience."
>
>
> Best,
>
> Haifeng Zhang


--
Nick "Quiddity" Wilson (he/him)
Community Engagement - Documentation
Wikimedia Foundation