Wikiscan statistics tool for Wikimedia projects

Pine W
Wikiscan is an interesting tool for statistics fans. I suggest briefly
reading this IEG page
<https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then
playing with the tool on https://wikiscan.org/

Pine
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Wikiscan statistics tool for Wikimedia projects

יגאל חיטרון-2
Hello. It's amazing, thank you very much!
Could I suggest one more feature, please? With it, the tool will be
perfect. I'm talking about aggregation. Any kind of historical statistics
available for a day, month or year could also be shown for a range of time.
For example, with monthly statistics, we could set the From field to Jan
2008 and the To field to May 2011, and get the aggregated numbers for this
range. Is it possible?
Thank you very much again,
Igal (User:IKhitron)

On Jul 30, 2017 22:18, "Pine W" <[hidden email]> wrote:

> Wikiscan is an interesting tool for statistics fans. I suggest briefly
> reading this IEG page
> <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then
> playing with the tool on https://wikiscan.org/
>
> Pine

Re: Wikiscan statistics tool for Wikimedia projects

Akeron-2
In reply to this post by Pine W
Hi Igal,
All suggestions are welcome :)
Supporting this feature shouldn't be too difficult in theory because the
tool already works with this kind of aggregation (months are built from
days, years from months...). The main problem is scalability for stats which
require uniqueness, like the number of users or the number of edits *per
page*. That's why yearly stats can actually be disabled on some big wikis.
So it would be feasible, but with a limit on the number of edits in the
range (around 3-5 million), and it would be very slow to load with lots of
edits.
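
A minimal Python sketch of the uniqueness problem described above. The data and the names `daily_edits`, `edit_count`, `naive_unique_users` and `unique_users` are hypothetical and are not taken from Wikiscan's code. It only illustrates that plain edit counts can be summed from smaller periods, while unique users cannot, because the same user may edit on several days.

```python
# Hypothetical daily data: the user name behind each edit of that day.
daily_edits = {
    "2017-07-29": ["Alice", "Bob", "Alice"],
    "2017-07-30": ["Alice", "Carol"],
    "2017-07-31": ["Bob"],
}

def edit_count(days):
    """Edit counts are additive: a range total is the sum of daily totals."""
    return sum(len(daily_edits[d]) for d in days)

def naive_unique_users(days):
    """Wrong: summing daily unique counts counts repeat editors several times."""
    return sum(len(set(daily_edits[d])) for d in days)

def unique_users(days):
    """Correct: needs the union of the per-day user sets, which is the part
    that gets expensive to store and merge on a big wiki."""
    users = set()
    for d in days:
        users |= set(daily_edits[d])
    return len(users)

days = sorted(daily_edits)
print(edit_count(days))          # 6
print(naive_unique_users(days))  # 5
print(unique_users(days))        # 3
```

The same reasoning applies to edits *per page*: the per-page breakdown has to be kept for every period before a range can be merged.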

Akeron

2017-07-31 14:29 GMT+02:00 יגאל חיטרון <[hidden email]>:

> Hello. It's amazing, thank you very much!
> Could I suggest one more feature, please? With it, the tool will be
> perfect. I'm talking about aggregation. Any kind of historical statistics
> available for a day, month or year could also be shown for a range of time.
> For example, with monthly statistics, we could set the From field to Jan
> 2008 and the To field to May 2011, and get the aggregated numbers for this
> range. Is it possible?
> Thank you very much again,
> Igal (User:IKhitron)

Re: Wikiscan statistics tool for Wikimedia projects

יגאל חיטרון-2
Thank you. I will not say that I understood your explanation, but I'll try:
if you have the number of viewers of some page for every year, can't you
sum them and compare the total with another page's to sort them? And the
same for the number of some user's edits?
Igal

On Jul 31, 2017 17:19, "Akeron" <[hidden email]> wrote:

> Hi Igal,
> All suggestions are welcome :)
> Supporting this feature shouldn't be too difficult in theory because the
> tool already works with this kind of aggregation (months are built from
> days, years from months...). The main problem is scalability for stats which
> require uniqueness, like the number of users or the number of edits *per
> page*. That's why yearly stats can actually be disabled on some big wikis.
> So it would be feasible, but with a limit on the number of edits in the
> range (around 3-5 million), and it would be very slow to load with lots of
> edits.
>
> Akeron

Re: [Analytics] Wikiscan statistics tool for Wikimedia projects

Erik Bernhardson
In reply to this post by Akeron-2
On Mon, Jul 31, 2017 at 7:18 AM, Akeron <[hidden email]> wrote:

> Hi Igal,
> All suggestions are welcome :)
> Supporting this feature shouldn't be too difficult in theory because the
> tool already works with this kind of aggregation (months are built from
> days, years from months...). The main problem is scalability for stats which
> require uniqueness, like the number of users or the number of edits *per
> page*. That's why yearly stats can actually be disabled on some big wikis.
> So it would be feasible, but with a limit on the number of edits in the
> range (around 3-5 million), and it would be very slow to load with lots of
> edits.
>

One way to handle the scalability problem is to use HyperLogLog counters.
HyperLogLog is an approximate algorithm: you can store daily counters and
then merge them to get weekly/monthly/etc. counts, avoiding the cost of
redoing the calculation over something like an entire year just for the one
stat. Of course, because these are approximate they may not be exactly what
you are looking for; just an idea.
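
To make the idea concrete, here is a small, self-contained Python sketch of HyperLogLog with mergeable daily counters. It is a toy written for illustration under stated assumptions (SHA-1 as the hash, 4096 registers, made-up page titles, the class name `HyperLogLog`); it is not what Wikiscan uses, and a production setup would more likely rely on an existing implementation.

```python
import hashlib
import math

class HyperLogLog:
    """Toy HyperLogLog with 2**p registers (illustrative only)."""

    def __init__(self, p=12):
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a hash of the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)               # first p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)  # remaining 64 - p bits
        # Rank: position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def merge(self, other):
        """Sketch of the union: element-wise maximum of the registers."""
        assert self.p == other.p
        merged = HyperLogLog(self.p)
        merged.registers = [max(a, b)
                            for a, b in zip(self.registers, other.registers)]
        return merged

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)
        estimate = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if estimate <= 2.5 * self.m and zeros:
            # Small-range correction (linear counting).
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)

# Two hypothetical "daily" counters of edited pages, overlapping on 2000 pages.
day1, day2 = HyperLogLog(), HyperLogLog()
for i in range(0, 6000):
    day1.add(f"Page_{i}")
for i in range(4000, 10000):
    day2.add(f"Page_{i}")

both = day1.merge(day2)
# True values are 6000, 6000 and 10000; the printed numbers are estimates.
print(day1.count(), day2.count(), both.count())
```

Each counter has a fixed size (4096 small integers here) regardless of how many pages were seen, and merging never double-counts pages that appear on both days, which is what makes the daily-to-monthly roll-up cheap.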



Re: Wikiscan statistics tool for Wikimedia projects

Akeron-2
In reply to this post by יגאל חיטרון-2
It depends on the size of the dataset. If you already know the pages or
users you want to compare (a limited dataset of reasonable size), then
there is no scalability issue and it should not be too difficult to
implement. Otherwise, with a large dataset, it requires a lot of resources
to merge the numbers for every page or user over the period.
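
The difference between the two cases can be sketched as follows. This is a hypothetical illustration with made-up numbers and names (`yearly_views`, `totals_for`, `ranking_over`), not Wikiscan code: looking up a few known pages touches only those keys in each period, while a ranking over the whole range first has to merge every page of every period.

```python
from collections import Counter

# Hypothetical yearly view counts per page (made-up numbers).
yearly_views = {
    2015: Counter({"Page_A": 120, "Page_B": 300, "Page_C": 50}),
    2016: Counter({"Page_A": 200, "Page_B": 100, "Page_D": 500}),
    2017: Counter({"Page_A": 90, "Page_C": 400}),
}

def totals_for(pages, years):
    """Cheap case: only the requested pages are looked up in each period."""
    return {p: sum(yearly_views[y][p] for y in years) for p in pages}

def ranking_over(years, top=3):
    """Expensive case: every page of every period is merged before sorting."""
    merged = Counter()
    for y in years:
        merged.update(yearly_views[y])
    return merged.most_common(top)

years = [2015, 2016, 2017]
print(totals_for(["Page_A", "Page_B"], years))
# {'Page_A': 410, 'Page_B': 400}
print(ranking_over(years))
# [('Page_D', 500), ('Page_C', 450), ('Page_A', 410)]
```

With millions of pages per period, the `merged` counter in the second function is what grows large, which matches the resource concern raised above.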


2017-07-31 16:59 GMT+02:00 יגאל חיטרון <[hidden email]>:

> Thank you. I will not say that I understood your explanation, but I'll try:
> if you have the number of viewers of some page for every year, can't you
> sum them and compare the total with another page's to sort them? And the
> same for the number of some user's edits?
> Igal

Re: Wikiscan statistics tool for Wikimedia projects

יגאל חיטרון-2
I see. Pity.
Igal

On Jul 31, 2017 19:38, "Akeron" <[hidden email]> wrote:

> It depends on the size of the dataset. If you already know the pages or
> users you want to compare (a limited dataset of reasonable size), then
> there is no scalability issue and it should not be too difficult to
> implement. Otherwise, with a large dataset, it requires a lot of resources
> to merge the numbers for every page or user over the period.

Re: [Analytics] Wikiscan statistics tool for Wikimedia projects

Akeron-2
In reply to this post by Erik Bernhardson
Thanks Erik, it looks interesting. Actually I am able to maintain a full
dataset for users but not for pages on big wikis, so it may be a good
alternative for displaying the approximate number of edited pages over a
month or more.

2017-07-31 17:22 GMT+02:00 Erik Bernhardson <[hidden email]>:

> One way to handle the scalability problem is to use HyperLogLog counters.
> HyperLogLog is an approximate algorithm: you can store daily counters and
> then merge them to get weekly/monthly/etc. counts, avoiding the cost of
> redoing the calculation over something like an entire year just for the one
> stat. Of course, because these are approximate they may not be exactly what
> you are looking for; just an idea.

Re: Wikiscan statistics tool for Wikimedia projects

Vira Motorko
In reply to this post by Pine W
Hi,

I have a question here. Looking at the stegosaurus-shaped graphs on
http://uk.wikiscan.org, I have a hard time understanding what they show.
Users/IPs/bots who registered? Who edited? Something else?

I'm sorry for being dumb today.

--
Vira Motorko // Віра Моторко
Wikimedia Ukraine <https://ua.wikimedia.org/> nonprofit organisation // ГО
«Вікімедіа Україна»
mobile: +380667740499 | facebook: vira.motorko
<https://www.facebook.com/vira.motorko> | wikipedia: Ата
<https://meta.wikimedia.org/wiki/User:Ата>

If this email is about your daily job and it reaches you outside of
working hours, please feel free to answer when it's appropriate!
If you have an email address like [hidden email], please consider changing
it. Thank you!

2017-07-30 22:17 GMT+03:00 Pine W <[hidden email]>:

> Wikiscan is an interesting tool for statistics fans. I suggest briefly
> reading this IEG page
> <https://meta.wikimedia.org/wiki/Grants:IEG/Wikiscan_multi-wiki>, then
> playing with the tool on https://wikiscan.org/
>
> Pine

Re: Wikiscan statistics tool for Wikimedia projects

Strainu
Hi Vira,

I'm not 100% sure, but if I understand correctly, these are basically the
same stats from the official reports, only at day level:

- users/day: how many authenticated users were active (aka made
edits), on average, in a day of that year
- ip/day: how many anonymous users were active on average, in a day of
that year.
- user edits/day: how many edits were made by authenticated users in a
day, on average
... and so on.
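
If that reading is right, the per-day figures could be computed roughly as in the sketch below. This is only an illustration of the definitions given above, with made-up edit records and a hypothetical function name `per_day_averages`; it says nothing about how Wikiscan actually computes its graphs, and it averages over the days present in the data rather than over a full calendar year.

```python
from statistics import mean

# Hypothetical edit records: (date, user name or IP, is_registered).
edits = [
    ("2017-01-01", "Alice", True),
    ("2017-01-01", "Bob", True),
    ("2017-01-01", "Alice", True),
    ("2017-01-01", "192.0.2.7", False),
    ("2017-01-02", "Alice", True),
    ("2017-01-02", "192.0.2.7", False),
    ("2017-01-02", "192.0.2.9", False),
]

def per_day_averages(edits):
    """Average daily active users, active IPs and registered-user edits."""
    days = sorted({day for day, _, _ in edits})
    users = {d: set() for d in days}      # distinct registered editors per day
    ips = {d: set() for d in days}        # distinct anonymous editors per day
    user_edits = {d: 0 for d in days}     # edits by registered users per day
    for day, who, registered in edits:
        if registered:
            users[day].add(who)
            user_edits[day] += 1
        else:
            ips[day].add(who)
    return {
        "users/day": mean(len(users[d]) for d in days),
        "ip/day": mean(len(ips[d]) for d in days),
        "user edits/day": mean(user_edits[d] for d in days),
    }

print(per_day_averages(edits))
```

For the two sample days this gives an average of 1.5 active users, 1.5 active IPs and 2 user edits per day.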

HTH,
  Strainu

2017-08-14 14:15 GMT+03:00 Vira Motorko <[hidden email]>:

> Hi,
>
> I have a question here. Looking at the stegosaurus-shaped graphs on
> http://uk.wikiscan.org, I have a hard time understanding what they show.
> Users/IPs/bots who registered? Who edited? Something else?
>
> I'm sorry for being dumb today.