Re: Wiki-research-l Digest, Vol 117, Issue 14


Re: Wiki-research-l Digest, Vol 117, Issue 14

Alex Druk
I just grepped the monthly totals from Erik Zachte's dumps at http://dumps.wikimedia.org/other/pagecounts-ez/merged/ (grep "^en.z  Special:Random ")
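
For anyone who wants to repeat the lookup without grep, here is a minimal Python sketch of the same filter. It assumes each line of a merged file is whitespace-separated as "<project> <title> <monthly_total> ..."; that layout, the function name monthly_total, and the sample lines are illustrative assumptions, not taken from the dumps themselves.

```python
def monthly_total(lines, project="en.z", title="Special:Random"):
    """Sum the monthly totals for one exact project/title pair.

    Assumes whitespace-separated lines of the form
    "<project> <title> <monthly_total> ..." (an assumed layout).
    """
    total = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        if fields[0] == project and fields[1] == title:
            try:
                total += int(fields[2])
            except ValueError:
                continue  # third field was not a number
    return total


# Made-up sample lines, only to show the matching behaviour.
sample = [
    "en.z Main_Page 1000 A1B2",
    "en.z Special:Random 250 A1",
    "en.z Special:Random/Talk 7 A1",
    "de.z Special:Random 99 A1",
]
print(monthly_total(sample))
```

Like the trailing space in the grep pattern, the exact title comparison keeps Special:Random/Talk out of the total. If the downloaded file is bz2-compressed, the lines could be supplied by bz2.open(path, "rt", errors="replace"), where path is whatever local file name you saved it under.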

On Mon, May 11, 2015 at 2:00 PM, <[hidden email]> wrote:
Send Wiki-research-l mailing list submissions to
        [hidden email]

To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
or, via email, send a message with subject or body 'help' to
        [hidden email]

You can reach the person managing the list at
        [hidden email]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Wiki-research-l digest..."


Today's Topics:

   1. Re: How to explain drop in random searches (Oliver Keyes)


----------------------------------------------------------------------

Message: 1
Date: Sun, 10 May 2015 08:30:37 -0400
From: Oliver Keyes <[hidden email]>
To: Research into Wikimedia content and communities
        <[hidden email]>
Subject: Re: [Wiki-research-l] How to explain drop in random searches
Message-ID:
        <[hidden email]>
Content-Type: text/plain; charset=UTF-8

Using what data?

On 10 May 2015 at 05:29, Alex Druk <[hidden email]> wrote:
> Hi everyone,
>
>
>
> I am trying to understand the dynamics of random searches
> (Special:Random) on English Wikipedia.
>
> From 01/2012 to 10/2014 the average number of random searches per month
> was about 86 million, or about 30% of Main_Page pageviews, but from
> November 2014 it dropped to 31,000 per month (or 0.008% of Main_Page).
>
> How to explain such a dramatic drop? Any ideas?
>
>
> --
> Thank you.
>
> Alex Druk, PhD
> wikipediatrends.com
> [hidden email]
> (775) 237-8550 Google voice
>
> _______________________________________________
> Wiki-research-l mailing list
> [hidden email]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>



--
Oliver Keyes
Research Analyst
Wikimedia Foundation



------------------------------

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


End of Wiki-research-l Digest, Vol 117, Issue 14
************************************************



--
Thank you.

Alex Druk
[hidden email]
(775) 237-8550 Google voice


Re: Wiki-research-l Digest, Vol 117, Issue 14

Oliver Keyes
A reduction or alteration in automata activity, possibly? Erik's dumps
contain literally no filtering for scammers or crawlers, and we're a
hot locale for spammer activity.

--
Oliver Keyes
Research Analyst
Wikimedia Foundation


Re: Wiki-research-l Digest, Vol 117, Issue 14

R.Stuart Geiger
Going from 86,000,000 a month to 31,000 a month is quite a drop, and the shift is pretty dramatic. It goes from 1.7 million one day to 715 the next and stays flat (http://stats.grok.se/en/201410/Special:Random). 

I was also thinking there could be a bot or something that is scraping Special:Random, but the drop also happens for Special:Random/Talk -- which hardly anybody uses, but it still drops flat the same day (http://stats.grok.se/en/201410/Special:Random/Talk). It doesn't happen for Special:Upload or Special:Log though.

October 16th, 2014 is the day it changes. Anybody know of something that might have changed that day with logging? Also, there have to be way more than ~1,000 hits a day to Special:Random. Perhaps pageviews started to be counted for the page that it got redirected to, rather than the Special:Random page itself. But then why wouldn't it go to 0? What are those ~1,000 hits a day? 
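
A rough way to pin down the day of the break from a daily series is to look for the steepest relative day-over-day fall. The sketch below is only an illustration: the function name largest_drop is hypothetical, and apart from the 1.7 million and 715 figures and the October 16th date mentioned above, the sample counts are invented.

```python
def largest_drop(daily_counts):
    """Return (date, previous, current) for the steepest relative fall.

    daily_counts is a chronological list of (date_string, count) pairs.
    Returns None when no day is lower than the day before it.
    """
    best = None
    best_ratio = 1.0
    for (_, prev), (day, cur) in zip(daily_counts, daily_counts[1:]):
        if prev <= 0:
            continue  # cannot compute a ratio against zero
        ratio = cur / prev
        if ratio < best_ratio:
            best_ratio = ratio
            best = (day, prev, cur)
    return best


# Illustrative series; only 1.7 million -> 715 comes from the thread.
series = [
    ("2014-10-14", 1_650_000),
    ("2014-10-15", 1_700_000),
    ("2014-10-16", 715),
    ("2014-10-17", 730),
]
print(largest_drop(series))
```

Running the same function over Special:Random/Talk, Special:Upload and Special:Log would show whether they break on the same day, which is the comparison being made by eye here.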

👻 ~~ it is a mystery ~~ 👻


Re: Wiki-research-l Digest, Vol 117, Issue 14

Oliver Keyes
I can happily check the sampled logs for hits to those pages prior to and on those dates, if that'd help?

--
Oliver Keyes
Research Analyst
Wikimedia Foundation
