Wikistats

Wikistats

Anthony-73
Regarding the files at http://dammit.lt/wikistats/ :

What are "en.b", "en.d", "en2", etc?

Are edits included, or only views?

Are the hit counts actual, or 1/10th sampled, or something else?

pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz> is
the hour *beginning* 20:00:00?
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Wikistats

Domas Mituzas
Hello Anthony,

I'm back at my lair (phew, finally ;-)

> Regarding the files at http://dammit.lt/wikistats/ :
> What are "en.b", "en.d", "en2", etc?

Suffixes indicate projects. From filter.c (http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/filter.c?revision=34989&view=markup):

projects[] = {
    {"wikipedia","",NULL},
    {"wiktionary",".d",NULL},
    {"wikinews",".n",NULL},
    {"wikimedia",".m",check_wikimedia},
    {"wikibooks",".b",NULL},
    {"wikisource",".s",NULL},
    {"mediawiki",".w",NULL},
    {"wikiversity",".v",NULL},
    {"wikiquote",".q",NULL},
    NULL
},

en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a  
time, and apparently there're some referrals.
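
For anyone reading the files programmatically, here is a minimal sketch of how a code like "en.b" could be split using the table above. The names split_project_code and SUFFIXES are made up for illustration; only the suffix-to-project pairs come from the quoted filter.c excerpt, and the special check_wikimedia handling for ".m" is not reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Suffix-to-project pairs copied from the filter.c excerpt quoted above. */
struct project_suffix {
    const char *project;
    const char *suffix;
};

static const struct project_suffix SUFFIXES[] = {
    {"wiktionary", ".d"}, {"wikinews", ".n"},   {"wikimedia", ".m"},
    {"wikibooks", ".b"},  {"wikisource", ".s"}, {"mediawiki", ".w"},
    {"wikiversity", ".v"}, {"wikiquote", ".q"},
};

/*
 * Split a code such as "en.b" into its prefix ("en") and project name.
 * The prefix is copied into `prefix` (at most prefix_size - 1 characters).
 * Returns the project name, or NULL if the suffix is not in the table.
 * A code without any suffix (e.g. "en2") is treated as wikipedia.
 */
const char *split_project_code(const char *code, char *prefix, size_t prefix_size)
{
    const char *dot = strrchr(code, '.');
    size_t prefix_len = dot ? (size_t)(dot - code) : strlen(code);
    const char *project = "wikipedia"; /* the empty suffix in the table */

    if (prefix_size == 0)
        return NULL;

    if (dot) {
        project = NULL;
        for (size_t i = 0; i < sizeof SUFFIXES / sizeof SUFFIXES[0]; i++) {
            if (strcmp(dot, SUFFIXES[i].suffix) == 0) {
                project = SUFFIXES[i].project;
                break;
            }
        }
        if (!project)
            return NULL;
    }

    if (prefix_len >= prefix_size)
        prefix_len = prefix_size - 1; /* truncate rather than overflow */
    memcpy(prefix, code, prefix_len);
    prefix[prefix_len] = '\0';
    return project;
}
```

With this sketch, "en.b" gives prefix "en" and project "wikibooks", while "en2" (no suffix) gives prefix "en2" and project "wikipedia".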

> Are edits included, or only views?

That is views only - though you can find the actual logic in the file above,
it is mostly this pattern:

http://*.*.org/wiki/*

which is what we have for special pages and views.
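
As a rough illustration of that pattern (this is not the actual filter.c logic, which lives in the file linked above), the glob can be tried out with POSIX fnmatch(). The names VIEW_PATTERN and looks_like_view are hypothetical.

```c
#include <fnmatch.h>

/* Glob form of the pattern quoted above; an approximation only. */
static const char VIEW_PATTERN[] = "http://*.*.org/wiki/*";

/* Returns 1 if url matches the view pattern, 0 otherwise. */
int looks_like_view(const char *url)
{
    /* flags = 0: '*' may also match '/', so titles containing slashes match. */
    return fnmatch(VIEW_PATTERN, url, 0) == 0;
}
```

Under this sketch, http://en.wikipedia.org/wiki/Main_Page matches, while an index.php URL such as http://en.wikipedia.org/w/index.php?title=Main_Page&action=edit does not.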

> Are the hit counts actual, or 1/10th sampled, or something else?

They are actual, with duplicates removed (that is, we don't count
cache-to-cache traffic, only end-user-to-cache).

> pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz> is
> the hour *beginning* 20:00:00?

Ending, I think. Let me check... yes, end time. The logic is in
produceDump() at http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/collector.c?revision=30113&view=markup :)
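
Given that answer, a consumer of the files might derive the covered interval from the file name roughly like this. It is only a sketch: it assumes each file covers exactly one hour (as the question implies) and that names always have the form pagecounts-YYYYMMDD-HHMMSS.gz. The names struct stamp, parse_pagecounts_name and interval_start are illustrative and not part of webstatscollector.

```c
#include <stdio.h>

/* Timestamp fields taken from a pagecounts file name. */
struct stamp {
    int year, month, day, hour, minute, second;
};

static int days_in_month(int year, int month)
{
    static const int days[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
    int leap = (year % 4 == 0 && year % 100 != 0) || year % 400 == 0;
    return (month == 2 && leap) ? 29 : days[month - 1];
}

/* Parse "pagecounts-YYYYMMDD-HHMMSS.gz". Returns 0 on success, -1 otherwise. */
int parse_pagecounts_name(const char *name, struct stamp *end)
{
    int n = sscanf(name, "pagecounts-%4d%2d%2d-%2d%2d%2d",
                   &end->year, &end->month, &end->day,
                   &end->hour, &end->minute, &end->second);
    if (n != 6)
        return -1;
    if (end->month < 1 || end->month > 12)
        return -1;
    if (end->day < 1 || end->day > days_in_month(end->year, end->month))
        return -1;
    if (end->hour < 0 || end->hour > 23 || end->minute < 0 ||
        end->minute > 59 || end->second < 0 || end->second > 59)
        return -1;
    return 0;
}

/* Step back one hour from the end time to get the assumed interval start. */
struct stamp interval_start(struct stamp end)
{
    struct stamp start = end;

    if (start.hour > 0) {
        start.hour--;
        return start;
    }
    start.hour = 23; /* crossed midnight: move to the previous day */
    if (start.day > 1) {
        start.day--;
        return start;
    }
    if (start.month > 1) {
        start.month--;
    } else {
        start.month = 12;
        start.year--;
    }
    start.day = days_in_month(start.year, start.month);
    return start;
}
```

For pagecounts-20090501-200000.gz this gives an interval from 19:00:00 to 20:00:00 on 2009-05-01; a file stamped 000000 on that day would start at 23:00:00 on 2009-04-30.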

I think I may end up documenting this somewhat more, but I need to do  
some promised and long overdue development on this project.

Domas


Re: Wikistats

Nikola Smolenski
Domas Mituzas wrote:

> Hello Anthony,
>
> I'm back at my lair (phew, finally ;-)
>
>> Regarding the files at http://dammit.lt/wikistats/ :
>> What are "en.b", "en.d", "en2", etc?
>
> suffixes indicate projects - from http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/filter.c?revision=34989&view=markup :
>
> projects[] = {
> {"wikipedia","",NULL},
> {"wiktionary",".d",NULL},
> {"wikinews",".n",NULL},
> {"wikimedia",".m",check_wikimedia},
> {"wikibooks",".b",NULL},
> {"wikisource",".s",NULL},
> {"mediawiki",".w",NULL},
> {"wikiversity",".v",NULL},
> {"wikiquote",".q",NULL},
> NULL
> },
>
> en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a  
> time, and apparently there're some referrals.
>
>> Are edits included, or only views?
>
> That is views only - though you can find actual logic in above file,  
> it is mostly this pattern:
>
> http://*.*.org/wiki/*
>
> which is what we have for special pages and views.
>
>> Are the hit counts actual, or 1/10th sampled, or something else?
>
> They are actual, with duplicates removed (that is, we don't count in  
> cache-to-cache traffic, only end-user-to-cache).
>
>> pagecounts-20090501-200000.gz <http://dammit.lt/wikistats/pagecounts-20090501-200000.gz> is
>> the hour *beginning* 20:00:00?
>
> ending, I think. let me check, yes, end time. logic is in
> produceDump() at http://svn.wikimedia.org/viewvc/mediawiki/trunk/webstatscollector/collector.c?revision=30113&view=markup :)
>
> I think I may end up documenting this somewhat more, but I need to do  
> some promised and long overdue development on this project.

If no one minds, I think I will copy this email to the toolserver wiki :)


Re: Wikistats

Platonides
In reply to this post by Domas Mituzas
Domas Mituzas wrote:
>> Are edits included, or only views?
>
> That is views only - though you can find actual logic in above file,  
> it is mostly this pattern:
>
> http://*.*.org/wiki/*
>
> which is what we have for special pages and views.

However, note that after saving an edit, the editor will be sent to a view.



Re: Wikistats

Domas Mituzas
Hi,
> However, note that after saving an edit, the editor will be sent to  
> a view.

yes, you're absolutely right, but no differentiation is done on that.  
technically, you're not editing, you're viewing :)

Domas


Re: Wikistats

Platonides
Domas Mituzas wrote:
> Hi,
>> However, note that after saving an edit, the editor will be sent to  
>> a view.
>
> yes, you're absolutely right, but no differentiation is done on that.  
> technically, you're not editing, you're viewing :)
>
> Domas

I know, but it's worth remembering for people who might want to do
some kind of edit differentiation.



Re: Wikistats

Andre Engels
In reply to this post by Domas Mituzas
On Mon, Aug 31, 2009 at 11:03 AM, Domas Mituzas<[hidden email]> wrote:

> en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a
> time, and apparently there're some referrals.

Wikimedia news, October 2003:
--
A portion of traffic to "www.wikipedia.org" will be diverted to
"en2.wikipedia.org", while most of it will go to "en.wikipedia.org",
where all logins will be directed. Until the server configuration is
more stable and transparent load-sharing is set up, this should help
share some of the traffic without burdening the other wikis too
greatly.
--

I think the reason that en got the lion's share is that en2 was on one
machine with the other languages whereas en was on a machine on its
own. At that time apparently en: still had significantly more traffic
than all other languages taken together.

--
André Engels, [hidden email]


Re: Wikistats

Domas Mituzas
Andre,

> Wikimedia news, October 2003:

Thanks for that! Awesome artifact ;-)

Domas


Re: Wikistats

Brion Vibber-3
In reply to this post by Andre Engels
On 8/31/09 7:51 AM, Andre Engels wrote:

> On Mon, Aug 31, 2009 at 11:03 AM, Domas Mituzas<[hidden email]>  wrote:
>
>> en2 is, um, http://en2.wikipedia.org/ ;-) it used to exist once upon a
>> time, and apparently there're some referrals.
>
> Wikimedia news, October 2003:
> --
> A portion of traffic to "www.wikipedia.org" will be diverted to
> "en2.wikipedia.org", while most of it will go to "en.wikipedia.org",
> where all logins will be directed. Until the server configuration is
> more stable and transparent load-sharing is set up, this should help
> share some of the traffic without burdening the other wikis too
> greatly.
> --
>
> I think the reason that en got the lion's share is that en2 was on one
> machine with the other languages whereas en was on a machine on its
> own. At that time apparently en: still had significantly more traffic
> than all other languages taken together.

Ah, the good old days! Sure glad we figured out Squid soon after that... ;)

-- brion
