Wikimedia access data

Wikimedia access data

Erik Moeller-4
For those of you who haven't seen it, take a look at Domas Mituzas' wiki-stats:

http://dammit.lt/wikistats/

This is real, accurate hourly snapshot data on the access to Wikipedia
captured from the Wikimedia Squid servers. Project counts show the
total access in a time period to the different language editions.

This is great stuff for visualization, behavioral pattern analysis,
and other purposes. If you do something with it, let us know. :-)

URL may change in the future - we'll put a redirect on the above one
if that happens.
--
Erik Möller
Deputy Director, Wikimedia Foundation

Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Wikimedia access data

Ward Cunningham
Thanks to Domas for collecting this and thanks to Erik for mentioning it. 

Here is a command line I used to look for big numbers:

perl -n -e 'print if /^en .*\d\d\d\d$/' pagecounts-20080220-160000 | less

And here is a command line I used to make an html page that let me browse those pages:

perl -n -e 'print if s/^en (.*?) \d+ (\d{4,})$/<a href="http:\/\/en.wikipedia.org\/wiki\/$1">$1<\/a> $2<br>/' pagecounts-20080220-160000 >xx.html
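
For anyone who prefers Python, here is a minimal sketch of the same filter. It assumes the line layout implied by the one-liners above ("<project> <title> <number> <number>") and, like them, filters on the last field; the function name big_rows and the sample lines are made up for illustration and are not part of wikistats.

```python
import re

# Assumed layout, inferred from the Perl one-liners above:
#   "<project> <title> <number> <number>"
LINE_RE = re.compile(r"^(\S+) (.*?) (\d+) (\d+)$")


def big_rows(lines, project="en", min_digits=4):
    """Yield (title, last_field) for lines of `project` whose last
    field has at least `min_digits` digits (mirrors \\d{4,} above)."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m is None or m.group(1) != project:
            continue
        last = m.group(4)
        if len(last) >= min_digits:
            yield m.group(2), int(last)


# Made-up sample lines, only to show the call shape.
sample = [
    "en Main_Page 12 34567\n",
    "en Foo 1 999\n",
    "de Bar 3 123456\n",
]
for title, number in big_rows(sample):
    print(number, title)
```

To run it on a real snapshot, pass an open file object (for example the pagecounts-20080220-160000 file) instead of the sample list.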

Best regards. -- Ward

__________________
Ward Cunningham
503-432-5682





On Feb 18, 2008, at 2:12 PM, Erik Moeller wrote:

For those of you who haven't seen it, take a look at Domas Mituzas' wiki-stats:

http://dammit.lt/wikistats/

This is real, accurate hourly snapshot data on the access to Wikipedia
captured from the Wikimedia Squid servers. Project counts show the
total access in a time period to the different language editions.

This is great stuff for visualization, behavioral pattern analysis,
and other purposes. If you do something with it, let us know. :-)

URL may change in the future - we'll put a redirect on the above one
if that happens.

Re: Wikimedia access data

Phanikumar Bhamidipati
In reply to this post by Erik Moeller-4
Hi All,

We are two research students looking for Wikipedia access data. We tried to use the statistics available at http://dammit.lt/wikistats/. However, we would like more detailed data, drilled down to per-page access, i.e., triplets of the form <Page, IP, Date&Time>.

Could you please let us know if we can get such information? The IP details can be anonymized, if required. We are only looking for detailed Wikipedia page access log information.

Thanks,
Phani


--
Phani
http://telugabbai.wordpress.com/


Re: Wikimedia access data

Liam Wyatt
Have you tried: http://stats.grok.se/
This is not what you want exactly, but it does give detailed listings of pageviews per article. 
I don't think you'll be able to find stats for viewing rates per IP address but I might be wrong. You can for editing stats, but not viewing stats. 

Best, 
-Liam

www.wikipediaweekly.org
Skype - Wittylama
Wikipedia - [[User:Witty lama]]



On 31/08/2008, at 4:10 PM, Phanikumar Bhamidipati wrote:

Hi All,

We are two research students looking for Wikipedia access data. We tried to use the statistics available at http://dammit.lt/wikistats/. But, we would like to know these data in detail: drilled down to per page access, i.e., triplets of the form <Page, IP, Date&Time>.

Could you please let us know if we can get such information? The IP details can be anonymous, if required. We are only looking for a detailed Wikipedia page access log information.

Thanks,
Phani


Re: Wikimedia access data

Reid Priedhorsky-2
In reply to this post by Phanikumar Bhamidipati
Phanikumar Bhamidipati wrote:

> Hi All,
>
> We are two research students looking for Wikipedia access data. We tried to
> use the statistics available at http://dammit.lt/wikistats/. But, we would
> like to know these data in detail: drilled down to per page access, i.e.,
> triplets of the form <Page, IP, Date&Time>.
>
> Could you please let us know if we can get such information? The IP details
> can be anonymous, if required. We are only looking for a detailed Wikipedia
> page access log information.

Wikimedia was kind enough to share a 1/10 streaming sample of their
access logs with us and several other researchers. I do not know if they
still do this. It's a LOT of data: several gigs per day even after
compression. They consider IP addresses to be private data and share
only <Page, Date&Time>.

Our contact for this is Tim Starling, [hidden email] I think.

Reid


Archives Wiki

Liam Wyatt

Dear All, 

I've just been informed that the American Historical Association (AHA) has begun their own wiki: "Archives Wiki".
It is a MediaWiki distribution and is intended to be a clearing house of information for historians about the various archives that professional historians use around the world: the opening hours, the quirky cataloging, the important collections, etc.
The announcement was made in the association's fortnightly newsletter here:

I encourage anyone who is a historian or uses archives in general to log in there and help out. This could be a very useful resource in its own right but also, if it works, will help break down the barriers to having more professional academics using and helping out on Wikimedia projects.

All the best, 
-Liam Wyatt

wikipediaweekly.org
Skype - Wittylama
Wikipedia - [[User:Witty lama]]


Re: Wikimedia access data

Phanikumar Bhamidipati
In reply to this post by Reid Priedhorsky-2
Thanks for sharing the info. We are looking for the IP addresses as well, with anonymous names (like A, B, C) because we want to identify user sessions from the log.
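
To make the request concrete, here is a rough sketch of what session identification over pseudonymized addresses could look like. Nothing in this thread says such data is available; the keyed-hash labels, the 30-minute inactivity gap, and the names pseudonym and sessions are all assumptions chosen for illustration.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonym(ip, secret):
    """Map an IP to an opaque label with a keyed hash, so the same IP
    always gets the same label without exposing the address itself."""
    digest = hmac.new(secret, ip.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def sessions(records, gap_seconds=1800):
    """Group (label, timestamp, page) records into sessions.

    A new session starts whenever two consecutive hits from the same
    label are more than `gap_seconds` apart (assumed heuristic).
    Timestamps are plain numbers of seconds.
    """
    by_label = defaultdict(list)
    for label, ts, page in records:
        by_label[label].append((ts, page))

    result = []
    for label, hits in by_label.items():
        hits.sort()
        current = [hits[0]]
        for prev, hit in zip(hits, hits[1:]):
            if hit[0] - prev[0] > gap_seconds:
                result.append((label, current))
                current = []
            current.append(hit)
        result.append((label, current))
    return result
```

The labels play the role of the "A, B, C" names: they let hits be grouped per visitor while the log itself carries no raw addresses.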

Regards,
Phani

On Wed, Sep 3, 2008 at 1:22 AM, Reid Priedhorsky <[hidden email]> wrote:
Wikimedia was kind enough to share a 1/10 streaming sample of their
access logs with us and several other researchers. I do not know if they
still do this. It's a LOT of data: several gigs per day even after
compression. They consider IP addresses to be private data and share
only <Page, Date&Time>.

Our contact for this is Tim Starling, [hidden email] I think.

Reid



Re: Wikimedia access data

Han-Teng Liao (OII)
In reply to this post by Erik Moeller-4
Dear Erik,
   Is there anywhere I can work with the access data for Chinese Wikipedia more easily? I am trying to visualize how Chinese Wikipedia is being accessed before and after the Olympics/block/unblock. It seems that, with the currently published wikistats, I have to do some painful 'reverse engineering'.
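
One possible starting point, sketched under assumptions: the hourly files are named like pagecounts-20080220-160000 (as in the commands earlier in this thread), each line starts with a project code followed by a space, and "zh" is the code for Chinese Wikipedia by analogy with "en". Which column holds the request count is also an assumption, so it is left as a parameter (defaulting to the last field, which is what the earlier one-liners filter on). The function names are made up.

```python
import re
from datetime import datetime

# File name pattern taken from the commands earlier in the thread.
NAME_RE = re.compile(r"pagecounts-(\d{8})-(\d{6})")


def snapshot_time(filename):
    """Parse the snapshot timestamp out of a pagecounts file name."""
    m = NAME_RE.search(filename)
    if m is None:
        raise ValueError("not a pagecounts file name: " + filename)
    return datetime.strptime(m.group(1) + m.group(2), "%Y%m%d%H%M%S")


def project_total(lines, project="zh", field=-1):
    """Sum one numeric column over all lines of a given project.

    `field` is the column index after splitting on whitespace; the
    default (last column) is an assumption, not a documented fact.
    """
    prefix = project + " "
    total = 0
    for line in lines:
        if not line.startswith(prefix):
            continue
        parts = line.split()
        try:
            total += int(parts[field])
        except (ValueError, IndexError):
            continue  # skip malformed lines
    return total
```

Calling both functions on every hourly file gives one (time, total) point per snapshot, which can then be plotted across the period of interest.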
Best regards,

Liao,Han-Teng
DPhil student at the OII(web)
needs you(blog)