Re: Article Lifecycle stats

Re: Article Lifecycle stats

Peter Ekman
  I'll suggest Wikihistory, e.g.
https://tools.wmflabs.org/xtools/wikihistory/wh.php?page_title=Tulip_mania
which gives all the editors (ranked by number of edits), article
size(?) and edits (per year, month, or even weeks).
There's a bit more at
https://en.wikipedia.org/w/index.php?title=Tulip_mania&action=info#mw-pageinfo-watchers
which includes info on page watchers, recent edits and a Wikidata
link.
Page views at https://tools.wmflabs.org/pageviews/?project=en.wikipedia.org&platform=all-access&agent=user&range=latest-30&pages=Tulip_mania
only go back a couple of years.  Before that there is an inconsistent
series (somewhere).
All of these are available from the history tab on the article page.
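
For fetching page views in an automated way rather than through the tool's web page, a minimal Python sketch follows. It assumes the Wikimedia REST API per-article pageviews route, which is not mentioned in this thread; the helper names pageviews_url and total_views are made up for illustration, and the defaults mirror the platform=all-access and agent=user settings in the link above.

```python
from urllib.parse import quote

# Assumption: the Wikimedia REST API per-article pageviews route
# (not named in the thread).
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="en.wikipedia.org",
                  access="all-access", agent="user", granularity="daily"):
    """Build a per-article pageviews URL; start and end are YYYYMMDD strings."""
    # Titles use underscores for spaces and must be percent-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/{access}/{agent}/{article}/{granularity}/{start}/{end}"


def total_views(payload):
    """Sum the 'views' field over the 'items' list of a decoded JSON response."""
    return sum(item["views"] for item in payload.get("items", []))
```

The URL can be fetched with any HTTP client; the decoded JSON body is what total_views expects.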
The only other thing that I'd want is the ORES scores (AI quality
prediction for any individual version, given its permid).
Is the best place to get these
https://ores.wmflabs.org/v2/scores/enwiki/wp10/?revids=769824240 ?
Is there an easy way to get a regular-interval time series of these?
(I wouldn't expect a complete time series for 1,000s of versions!)
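
One way to approximate a regular-interval series is to pick, for each tick of a fixed interval, the revision that was live at that moment, and then score only those revisions. The Python sketch below does the sampling step; it is an illustration, not an existing tool. The function names are hypothetical, the revision list is assumed to come from the article history, and joining several revision IDs with "|" in one request is an assumption about the endpoint quoted above.

```python
from datetime import datetime, timedelta

# URL pattern copied from the message above. Batching several IDs with "|"
# is an assumption about this endpoint.
ORES_URL = "https://ores.wmflabs.org/v2/scores/enwiki/wp10/?revids={}"


def sample_revisions(revisions, step):
    """Pick, for each tick of width `step`, the revision live at that tick.

    `revisions` is a list of (datetime, revid) pairs sorted oldest first.
    Returns (tick, revid) pairs from the first edit up to the last edit.
    """
    if not revisions:
        return []
    samples = []
    tick = revisions[0][0]
    end = revisions[-1][0]
    i = 0
    while tick <= end:
        # Advance to the newest revision made at or before this tick.
        while i + 1 < len(revisions) and revisions[i + 1][0] <= tick:
            i += 1
        samples.append((tick, revisions[i][1]))
        tick += step
    return samples


def ores_url(revids):
    """Build one scoring URL for a batch of revision IDs."""
    return ORES_URL.format("|".join(str(r) for r in revids))
```

With step=timedelta(days=30) this yields roughly one revision per month, so a long history reduces to a few dozen scoring requests.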

Hope this helps.

Peete






=====
Message: 1
Date: Wed, 15 Mar 2017 21:18:59 +1300
From: "Stuart A. Yeates" <[hidden email]>
To: Research into Wikimedia content and communities
        <[hidden email]>
Subject: [Wiki-research-l] tool / framework for article lifecycle
        stats ?
Message-ID:
        <CAC_Lu0bjKgdcRYeKVr2U9Cr_hzatW10jiVi8=[hidden email]>
Content-Type: text/plain; charset="utf-8"

Is there a tool or framework for getting article lifecycle stats in an
automated fashion? Is anyone aware of something like that? Things like
creator (+ their basic stats), total # of edits, who's edited the article
(+ their basic stats), article age, article flags, etc.

I'm reasonably platform / language agnostic. I'll only need stats on dozens
of articles an hour, so no need for a weaponised platform.
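
As a rough illustration of what such a script could look like at this volume, here is a minimal Python sketch built on the standard MediaWiki Action API (action=query with prop=revisions). It is not a tool suggested in this thread. The helper names history_url and summarize are hypothetical, and the sketch covers only creator, edit count, editors and article age; article flags and per-editor stats would need further queries.

```python
from collections import Counter
from datetime import datetime, timezone
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def history_url(title, cont=None):
    """URL for one page of revision history, oldest revision first."""
    params = {
        "action": "query", "format": "json", "formatversion": "2",
        "prop": "revisions", "titles": title,
        "rvprop": "ids|timestamp|user", "rvlimit": "max", "rvdir": "newer",
    }
    # To page through a long history, pass the previous response's
    # "continue" object as `cont`.
    params.update(cont or {})
    return API + "?" + urlencode(params)


def parse_ts(ts):
    """Parse an API timestamp such as 2017-03-15T08:18:59Z as UTC."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def summarize(revisions, now):
    """Summarise revision dicts (oldest first) carrying 'user' and 'timestamp'."""
    # Revisions whose user name is suppressed have no 'user' key.
    users = [rev.get("user", "(hidden)") for rev in revisions]
    created = parse_ts(revisions[0]["timestamp"])
    return {
        "creator": users[0],
        "total_edits": len(revisions),
        "editors": Counter(users).most_common(),
        "age_days": (now - created).days,
    }
```

The revision dicts are the entries of the "revisions" list in each decoded response, concatenated across pages before calling summarize.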


cheers
stuart
--
...let us be heard from red core to black sky

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Article Lifecycle stats

Jonathan Morgan
Hi Peter,

Re: your question about getting historical ORES article quality predictions, there's an open ticket on Phabricator to add these data to the public replicas hosted on Labs (and therefore accessible via both SSH tunnel and Quarry). Chime in on the discussion if you'd like to see these tables added!

Right now I believe the best way to access historical scores is to download and parse the full dataset. But halfak or others may be aware of better methods.

Best,
Jonathan

On Wed, Mar 15, 2017 at 7:26 AM, Peter Ekman <[hidden email]> wrote:
[...]



--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation



Re: Article Lifecycle stats

Aaron Halfaker-3
Regrettably, we don't have anything better than what J-Mo pointed out.  If you'd like to see this dataset hosted on Labs, please do say so on that ticket.  Our DBAs will need to spend some time working to make this dataset available, but we can help show that this is a priority.  Right now, my statements about "potential use-cases" carry less weight than requests from people who really want to query this dataset in a public space like Quarry or from Tool Labs.

-Aaron

On Wed, Mar 15, 2017 at 12:31 PM, Jonathan Morgan <[hidden email]> wrote:
[...]