[Wikimedia-l] Wikipedia Zero in Google search result

[Wikimedia-l] Wikipedia Zero in Google search result

Benjamin Chen
Hi,

I noticed that when I'm searching on Google, many Wikipedia results are in the form of lang-code.zero.wikipedia.org, perhaps just since a day or two ago.

I'm not sure which items are indexed this way, but it is a real problem: there is no link on the page that takes you to the standard site (even the notice links to the main page of m.wikipedia.org, not to the corresponding article on m.wikipedia.org).

Regards,

Benjamin Chen / [[User:Bencmq]]


_______________________________________________
Wikimedia-l mailing list
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l

Re: [Wikimedia-l] Wikipedia Zero in Google search result

K. Peachey-2
Can you please file this in bugzilla <https://bugzilla.wikimedia.org>?

Thanks.

On Mon, May 27, 2013 at 9:41 PM, Benjamin Chen <[hidden email]> wrote:

> Hi,
>
> I noticed that when I'm searching on Google, many Wikipedia results are in the form of lang-code.zero.wikipedia.org, perhaps just since a day or two ago.
>
> I'm not sure which items are indexed this way, but it is a real problem: there is no link on the page that takes you to the standard site (even the notice links to the main page of m.wikipedia.org, not to the corresponding article on m.wikipedia.org).
>
> Regards,
>
> Benjamin Chen / [[User:Bencmq]]

Re: [Wikimedia-l] Wikipedia Zero in Google search result

MZMcBride-2
K. Peachey wrote:
>Can you please file this in bugzilla <https://bugzilla.wikimedia.org>?

https://bugzilla.wikimedia.org/show_bug.cgi?id=48856


MZMcBride




Re: [Wikimedia-l] Wikipedia Zero in Google search result

Tomasz Finc-2
Looping in Dan Foy, who's managing the Zero backlog.

On Mon, May 27, 2013 at 8:01 AM, MZMcBride <[hidden email]> wrote:

> K. Peachey wrote:
>>Can you please file this in bugzilla <https://bugzilla.wikimedia.org>?
>
> https://bugzilla.wikimedia.org/show_bug.cgi?id=48856
>
>
> MZMcBride

Re: [Wikimedia-l] Wikipedia Zero in Google search result

Kul Wadhwa
Adam Baso (copied on this email) is working on it and a fix is ready. He'll
do some testing to make sure it's resolved.

On Tue, May 28, 2013 at 10:22 AM, Tomasz Finc <[hidden email]> wrote:

> Looping Dan Foy in who's managing the Zero backlog.

--
Kul Wadhwa
Head of Mobile
Wikimedia Foundation

Re: [Wikimedia-l] Wikipedia Zero in Google search result

Jon Robson
As I mentioned on the bug report [1], I worry more about the fact that
the content is not accessible to users on another IP. If a user of
Wikipedia Zero shares a link, do we not want that to be accessible to
other people? By showing a message as we currently do ("Sorry,
zero.wikipedia.org is only supported by select mobile carriers and is
not available from your mobile carrier."), are we not denying people
knowledge and going against our mission? Should this not redirect, or
at least be a message pointing to the actual content?

Also, if we block Google from indexing zero, any links shared on the
zero domain will not contribute to the page ranking of that page, which
seems a bit dumb...

As I mentioned, the fact that pages are currently in Google results is
only a temporary problem, due to an upstream bug [2] in the
MobileFrontend extension, and should resolve itself as the cache is
cleared.

[1] https://bugzilla.wikimedia.org/show_bug.cgi?id=48856#c3
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=35233
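
The redirect suggested above amounts to mapping a <language>.zero.wikipedia.org address onto its <language>.wikipedia.org counterpart. A minimal Python sketch of that mapping follows; the function name, the host pattern, and the decision to leave other hosts untouched are all assumptions made for illustration, not the behaviour of the Zero or MobileFrontend code.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed host shape, taken from the thread: <language>.zero.wikipedia.org
ZERO_HOST = re.compile(r"^(?P<lang>[a-z][a-z0-9-]*)\.zero\.wikipedia\.org$")


def canonical_url(url):
    """Return the <language>.wikipedia.org equivalent of a Zero URL.

    Hypothetical helper: URLs on any other host are returned unchanged,
    and an explicit port, if any, is dropped from rewritten URLs.
    """
    parts = urlsplit(url)
    # urlsplit().hostname is already lower-cased (or None if absent).
    match = ZERO_HOST.match(parts.hostname or "")
    if match is None:
        return url
    host = match.group("lang") + ".wikipedia.org"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))
```

A server could use such a mapping either to issue a redirect or to build the link shown in the "not available from your mobile carrier" message; which of the two is appropriate is the open question in this thread.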

On Tue, May 28, 2013 at 1:13 PM, Kul Wadhwa <[hidden email]> wrote:

> Adam Baso (copied on this email) is working on it and a fix is ready. He'll
> do some testing to make sure it's resolved.

--
Jon Robson
http://jonrobson.me.uk
@rakugojon


Re: [Wikimedia-l] Wikipedia Zero in Google search result

Adam Baso
In reply to this post by Kul Wadhwa
Hello All,

We had shelved my patch, patch 64629 <https://gerrit.wikimedia.org/r/64629>,
in hopes that an earlier patch, patch 61809
<https://gerrit.wikimedia.org/r/61809> (bug 35233
<https://bugzilla.wikimedia.org/show_bug.cgi?id=35233>), would resolve the
issue naturally as Google re-indexed. But it appears Google has re-indexed
and yet the .zero.wikipedia.org URLs are still present in Google's index,
instead of the <language>.wikipedia.org URLs.

I have thus resubmitted patch 64629 <https://gerrit.wikimedia.org/r/64629> for
re-review. We will need to further discuss whether it is appropriate to
have Google completely remove .zero.wikipedia.org links from their cache,
or if perhaps we need to open a support thread with Google about canonical
URLs.





Re: [Wikimedia-l] [WikimediaMobile] Wikipedia Zero in Google search result

James Alexander-4
At some level there seems to be a change in Google's (or our) settings that
is doing this everywhere. I've also been seeing a lot of links indexed and
appearing in Google as the primary domain too (wikipedia.org/wiki/Boston
rather than en.wikipedia.org/wiki/Boston; I've seen it on Wikivoyage as well
and I assume the others). At some level we're probably going to want to
figure out what's happening, because at some point down the road 'something'
changed; I'm just not sure on which side.

James Alexander
Legal and Community Advocacy
Wikimedia Foundation
(415) 839-6885 x6716 @jamesofur


On Tue, May 28, 2013 at 1:49 PM, Adam Baso <[hidden email]> wrote:

> Hello All,
>
> We had shelved my patch, patch 64629 <https://gerrit.wikimedia.org/r/64629>,
> in hopes that an earlier patch, patch 61809 <https://gerrit.wikimedia.org/r/61809> (bug
> 35233 <https://bugzilla.wikimedia.org/show_bug.cgi?id=35233>), would
> resolve the issue naturally as Google re-indexed. But it appears Google has
> re-indexed and yet the .zero.wikipedia.org URLs are still present in
> Google's index, instead of the <language>.wikipedia.org URLs.
>
> I have thus resubmitted patch 64629 <https://gerrit.wikimedia.org/r/64629> for
> re-review. We will need to further discuss whether it is appropriate to
> have Google completely remove .zero.wikipedia.org links from their cache,
> or if perhaps we need to open a support thread with Google about canonical
> URLs.

Re: [Wikimedia-l] [WikimediaMobile] Wikipedia Zero in Google search result

James Alexander-4
Sorry for the double post. For what it's worth, it does not appear that
we have a canonical URL on most pages. Mobile has a canonical link
pointing to the main site, but neither zero nor the main site has
a canonical URL expressly stated.

James Alexander
Legal and Community Advocacy
Wikimedia Foundation
(415) 839-6885 x6716 @jamesofur


On Tue, May 28, 2013 at 1:53 PM, James Alexander
<[hidden email]> wrote:

> At some level there seems to be a change in Google's (or our) settings that
> is doing this everywhere. I've also been seeing a lot of links indexed and
> appearing in Google as the primary domain too (wikipedia.org/wiki/Boston rather than
> en.wikipedia.org/wiki/Boston; I've seen it on Wikivoyage as well and I assume
> the others). At some level we're probably going to want to figure out
> what's happening, because at some point down the road 'something' changed;
> I'm just not sure on which side.
>
> James Alexander
> Legal and Community Advocacy
> Wikimedia Foundation
> (415) 839-6885 x6716 @jamesofur

Re: [Wikimedia-l] [WikimediaMobile] Wikipedia Zero in Google search result

Jon Robson
FWIW, adding debug=true on zero domains should show that the canonical URL
is present. As stated before, this will fix itself in less than
30 days as the caches update.
e.g. http://hak.zero.wikipedia.org/wiki/Th%C3%A8u-Ya%CC%8Dp?debug=true

As James points out, the main site doesn't use canonical URLs and
probably should... but I'd say that's another bug.
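
Whether a given page states a canonical URL can be checked by looking for a <link rel="canonical"> element in its HTML. The Python sketch below does this with the standard-library parser only; the class and function names are made up for illustration, and fetching the page (for example with ?debug=true appended) is left out on purpose.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <link ... /> tags, because the
        # default handle_startendtag() delegates to handle_starttag().
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the first canonical URL declared in `html`, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None
```

Running find_canonical() over the HTML of a zero, mobile, and desktop page would show which of the three actually declares a canonical URL, which is the comparison James and Jon are making here.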


On Tue, May 28, 2013 at 1:58 PM, James Alexander
<[hidden email]> wrote:

> Sorry for the double post. For what it's worth, it does not appear that
> we have a canonical URL on most pages. Mobile has a canonical link
> pointing to the main site, but neither zero nor the main site has
> a canonical URL expressly stated.
>
> James Alexander
> Legal and Community Advocacy
> Wikimedia Foundation
> (415) 839-6885 x6716 @jamesofur

--
Jon Robson
http://jonrobson.me.uk
@rakugojon


Re: [Wikimedia-l] Wikipedia Zero in Google search result

Adam Baso
In reply to this post by Adam Baso
All,

My mistake. The pages in Google's index that I used for sampling - the ones
that have "Sorry, ..." in their description in Google search results - are
cached pages. I assumed incorrectly that those pages were based on recent
indexing (e.g., in the past few days).

I think we can actually stick to the original plan of Google re-indexing
and the search results de-emphasizing the <language>.zero.wikipedia.org
links within the next 30 days.

I still find it strange that there are <language>.zero.wikipedia.org links
that turned up higher in the search engine rankings than their
better-established <language>.wikipedia.org counterparts. But I suppose
with fewer competing page elements, especially on long-tail articles with
fewer or no direct links to the desktop page, this is maybe not totally
unexpected.

-Adam





Re: [Wikimedia-l] [WikimediaMobile] Wikipedia Zero in Google search result

Federico Leva (Nemo)
I don't know if there's a general bug report about canonical URLs & co.
indexing, but there's one about Google messing up with 301/302 redirects
which is spreading quite a bit lately. Erik wrote them to no avail some
time ago.
https://bugzilla.wikimedia.org/show_bug.cgi?id=26115

Nemo


Re: [Wikimedia-l] [WikimediaMobile] Wikipedia Zero in Google search result

Erik Moeller-4
On Wed, May 29, 2013 at 8:10 AM, Federico Leva (Nemo)
<[hidden email]> wrote:
> I don't know if there's a general bug report about canonical URLs & co.
> indexing, but there's one about Google messing up with 301/302 redirects
> which is spreading quite a bit lately. Erik wrote them to no avail some time
> ago.
> https://bugzilla.wikimedia.org/show_bug.cgi?id=26115

Actually none of the examples in the bug still return the reported
results for me.

Erik
--
Erik Möller
VP of Engineering and Product Development, Wikimedia Foundation


Re: [Wikimedia-l] Wikipedia Zero in Google search result

Adam Baso
In reply to this post by Adam Baso
Update:

We've added an enhancement to Wikipedia Zero so that if a user who isn't on
a participating carrier network navigates to a Wikipedia Zero page on
<language>.zero.wikipedia.org, such as
http://en.zero.wikipedia.org/wiki/Muse_%28band%29 , the user will be
presented an option to visit the canonical URL of the article. If clicked,
the canonical URL should get the user to the mobile or desktop version of
the page, based on device type.

We're hoping that by next week the Google index will be refreshed so as to
correctly mark the <language>.zero.wikipedia.org pages as duplicate pages
in the omitted section. Upon confirmation of as much, the current plan is
to introduce https://gerrit.wikimedia.org/r/#/c/69420/ to prevent indexing
of <language>.zero.wikipedia.org altogether.
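
One common way to keep a host out of search indexes is to send an `X-Robots-Tag: noindex` response header (or the equivalent robots meta tag) for every page on it. The Python sketch below only illustrates that general idea; it is not the content of the Gerrit change linked above, which this thread does not describe, and the function name is an assumption.

```python
def robots_headers(host):
    """Return extra response headers for a request to `host`.

    Hypothetical helper: hosts of the form <language>.zero.wikipedia.org
    get a noindex directive, every other host gets no extra header.
    """
    host = host.lower().rstrip(".")
    if host.endswith(".zero.wikipedia.org"):
        return {"X-Robots-Tag": "noindex"}
    return {}
```

Crawlers can only see such a directive on pages they are allowed to fetch, which fits the plan of waiting for the index to refresh before turning indexing off.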


On Tue, May 28, 2013 at 6:26 PM, Adam Baso <[hidden email]> wrote:

> All,
>
> My mistake. The pages in Google's index that I used for sampling - the
> ones that have "Sorry, ..." in their description in Google search results -
> are cached pages. I assumed incorrectly that those pages were based on
> recent indexing (e.g., in the past few days).
>
> I think we can actually stick to the original plan of Google re-indexing
> and the search results de-emphasizing the <language>.zero.wikipedia.org links within the next 30 days.
>
> I still find it strange that there are <language>.zero.wikipedia.org links that turned up higher in the search engine rankings than their
> better-established <language>.wikipedia.org counterparts. But I suppose
> with fewer competing page elements, especially on long-tail articles with
> fewer or no direct links to the desktop page, this is maybe not totally
> unexpected.
>
> -Adam

Re: [Wikimedia-l] Wikipedia Zero in Google search result

Matthew Flaschen-2
On 06/18/2013 06:35 PM, Adam Baso wrote:
> Update:
>
> We've added an enhancement to Wikipedia Zero so that if a user who isn't on
> a participating carrier network navigates to a Wikipedia Zero page on
> <language>.zero.wikipedia.org, such as
> http://en.zero.wikipedia.org/wiki/Muse_%28band%29 , the user will be
> presented an option to visit the canonical URL of the article. If clicked,
> the canonical URL should get the user to the mobile or desktop version of
> the page, based on device type.

That's good to hear.  It would be helpful if, when visiting on desktop
(the original report,
https://bugzilla.wikimedia.org/show_bug.cgi?id=48856, is about desktop
search), it did not mention "mobile carriers", "data charges", and such.
Perhaps it could even redirect silently.

If that's not feasible for now, perhaps the message could be a bit more
general so it reads better on desktop.

Matt Flaschen


Re: [Wikimedia-l] Wikipedia Zero in Google search result

Adam Baso
In reply to this post by Adam Baso
(cross-posted on mobile-l)

Okay, looks like the index of zero.wikipedia.org pages in Google has shrunk
by some 20 million entries. Nonetheless, a number of really old pages
(e.g., going back to 6-May-2013) are still in the Google index with article
text. I'll set a reminder to check on the Google index again in 30 days,
and hopefully we can finally put the no-index rules in place at that
time.

The good news is that many of the pages are now correctly suppressed in
natural search as non-canonical pages. In other words, a user would need to
go through omitted results or do a site:<domain> search to see them.
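
For anyone repeating the check, a site:<domain> query can be built as in the Python sketch below. The function name is made up for illustration, and only the plain `q` parameter is used; any further search parameters are deliberately left out.

```python
from urllib.parse import urlencode


def site_search_url(domain):
    """Build a Google search URL restricted to `domain` via the site: operator."""
    return "https://www.google.com/search?" + urlencode({"q": "site:" + domain})
```

Calling site_search_url("zero.wikipedia.org") gives a URL that lists whatever pages from that domain are still in the index.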

-Adam


On Tue, Jun 18, 2013 at 3:35 PM, Adam Baso <[hidden email]> wrote:

> Update:
>
> We've added an enhancement to Wikipedia Zero so that if a user who isn't
> on a participating carrier network navigates to a Wikipedia Zero page on
> <language>.zero.wikipedia.org, such as
> http://en.zero.wikipedia.org/wiki/Muse_%28band%29 , the user will be
> presented an option to visit the canonical URL of the article. If clicked,
> the canonical URL should get the user to the mobile or desktop version of
> the page, based on device type.
>
> We're hoping that by next week the Google index will be refreshed so as to
> correctly mark the <language>.zero.wikipedia.org pages as duplicate pages
> in the omitted section. Upon confirmation of as much, the current plan is
> to introduce https://gerrit.wikimedia.org/r/#/c/69420/ to prevent
> indexing of <language>.zero.wikipedia.org altogether.
>
>
> On Tue, May 28, 2013 at 6:26 PM, Adam Baso <[hidden email]> wrote:
>
>> All,
>>
>> My mistake. The pages in Google's index that I used for sampling - the
>> ones that have "Sorry, ..." in their description in Google search results -
>> are cached pages. I assumed incorrectly that those pages were based on
>> recent indexing (e.g., in the past few days).
>>
>> I think we can actually stick to the original plan of Google re-indexing
>> and the search results de-emphasizing the <language>.zero.wikipedia.orglinks within the next 30 days.
>>
>> I still find it strange that there are <language>.zero.wikipedia.orglinks that turned up higher in the search engine rankings than their
>> better-established <language>.wikipedia.org counterparts. But I suppose
>> with fewer competing page elements, especially on long-tail articles with
>> fewer or no direct links to the desktop page, this is maybe not totally
>> unexpected.
>>
>> -Adam
>>
>>
>>
>>
>> On Tue, May 28, 2013 at 1:49 PM, Adam Baso <[hidden email]> wrote:
>>
>>> Hello All,
>>>
>>> We had shelved my patch, patch 64629 <https://gerrit.wikimedia.org/r/64629>,
>>> in hopes that an earlier patch, patch 61809 <https://gerrit.wikimedia.org/r/61809> (bug
>>> 35233 <https://bugzilla.wikimedia.org/show_bug.cgi?id=35233>), would
>>> resolve the issue naturally as Google re-indexed. But it appears Google has
>>> re-indexed and yet the .zero.wikipedia.org URLs are still present in
>>> Google's index, instead of the <language>.wikipedia.org URLs.
>>>
>>> I have thus resubmitted patch 64629 <https://gerrit.wikimedia.org/r/64629> for
>>> re-review. We will need to further discuss whether it is appropriate to
>>> have Google completely remove .zero.wikipedia.org links from their
>>> cache, or if perhaps we need to open a support thread with Google about
>>> canonical URLs.
>>>
>>>
>>>
>>>
>>> On Tue, May 28, 2013 at 1:13 PM, Kul Wadhwa <[hidden email]> wrote:
>>>
>>>> Adam Baso (copied on this email) is working on it and a fix is ready.
>>>> He'll do some testing to make sure it's resolved.
>>>>
>>>> On Tue, May 28, 2013 at 10:22 AM, Tomasz Finc <[hidden email]> wrote:
>>>>
>>>>> Looping Dan Foy in who's managing the Zero backlog.
>>>>>
>>>>> On Mon, May 27, 2013 at 8:01 AM, MZMcBride <[hidden email]> wrote:
>>>>> > K. Peachey wrote:
>>>>> >>Can you please file this in bugzilla <https://bugzilla.wikimedia.org>?
>>>>> >
>>>>> > https://bugzilla.wikimedia.org/show_bug.cgi?id=48856
>>>>> >
>>>>> >
>>>>> > MZMcBride
>>>>
>>>>
>>>>
>>>> --
>>>> Kul Wadhwa
>>>> Head of Mobile
>>>> Wikimedia Foundation
>>>>
>>>
>>>
>>
>

Re: [Wikimedia-l] Wikipedia Zero in Google search result

Adam Baso
(cross-posted on mobile-l)

Update:

I have been checking on the indexed link count over the last couple of
months, and it has been roughly constant. Upon another check in the past
week, it looked like it was time to go ahead with the robots.txt update.

Just yesterday, the robots.txt entry for <lang>.zero.wikipedia.org was
updated to instruct all robots, such as Googlebot, not to index
<lang>.zero.wikipedia.org. It looks like even more
<lang>.zero.wikipedia.org pages may already be starting to fall out of the
index.
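
For illustration only, here is a minimal sketch of how one might check what
such a rule tells a crawler. The robots.txt body below is a hypothetical
stand-in (the actual entry deployed is not quoted in this thread), and the
helper name is_crawl_allowed is made up for this example. Note that
robots.txt rules govern crawling, so this only checks whether a given user
agent may fetch a URL.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for a <lang>.zero.wikipedia.org host.
# Assumption: a blanket rule for all user agents; the real file may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://en.zero.wikipedia.org/wiki/Muse_%28band%29"
    print(is_crawl_allowed(ROBOTS_TXT, "Googlebot", url))
```

To check a live site instead, RobotFileParser also offers set_url() and
read(), which fetch the robots.txt over the network before can_fetch() is
called.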

Thanks for flagging this! Will keep watching the indexed links count as it
dwindles.

Thanks again.
-Adam


On Wed, Jun 26, 2013 at 10:59 AM, Adam Baso <[hidden email]> wrote:

> (cross-posted on mobile-l)
>
> Okay, looks like the index of zero.wikipedia.org pages in Google has
> shrunk by some 20 million entries. Nonetheless, a number of really old
> pages (e.g., going back to 6-May-2013) are still in the Google index with
> article text. I'll set a reminder to check on the Google index again in 30
> days, and hopefully then we can finally put the no-index rules in place at
> that time.
>
> The good news is that many of the pages are now correctly suppressed in
> natural search as non-canonical pages. In other words, a user would need to
> go through omitted results or do a site:<domain> search to see them.
>
> -Adam