Google doesn't index the Commons

Google doesn't index the Commons

Brianna Laugher
Hi,

Someone suggested I (commons and en user:pfctdayelise) post this here.
If I am mistaken and Google does in fact index the Commons, I would
really appreciate someone explaining how to go about it. Comments
welcome on either of my user talk pages, COM:VP or by email.

cheers, Brianna (pfctdayelise)


Originally at [[commons:COM:VP#Keywords]]:

A related problem is that Google does not index the Commons. Well,
more or less. It is basically impossible to find any Commons content
there, as you normally could by searching for ' keyword
site:commons.wikimedia.org '. Personally I find this rather
unbelievable and appalling on Google's part. Does anyone else think
we should, um, make them aware of this? pfctdayelise 13:26, 16 January
2006 (UTC)

        This has been discussed before. Part of the problem is that
descriptive info is in "foo.jpg", but Google has no way of knowing
that we have text pages that just happen to be named identically to
the nontextual image files that everybody in the world uses. Another
problem is that Google ranks by references to the page, and most
commons pages look like orphans or near-orphans, image references from
WP articles being made "secretly" via the MediaWiki local/commons
two-step lookup. On keywords, feel free to add them anywhere, they
will help our own search algorithm. I'm not so inclined to be
concerned about that, considering how many thousands of images still
have not a single link or category. Stan Shebs 00:22, 17 January 2006
(UTC)

            Google does have a way to know that those pages are HTML,
not binary files: it's in the HTTP header field called "Content-Type".
Google just ignores it (or does not even try to load such a page).
[snip]  Duesentrieb(?!) 02:01, 17 January 2006 (UTC)
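
As a rough illustration of the Content-Type point above, here is a minimal Python sketch of a crawler-side check that trusts the header and only falls back on the URL extension when no header is present. The function names, the extension list and the example URL are assumptions made for illustration; nothing here reflects how Google's crawler actually works.

```python
import posixpath
from urllib.parse import urlparse

# Assumed list of extensions a naive crawler might treat as binary images.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".svg"}


def media_type(content_type_header):
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def guess_from_url(url):
    """Naive guess based only on the file extension in the URL path."""
    ext = posixpath.splitext(urlparse(url).path)[1].lower()
    return "image" if ext in IMAGE_EXTENSIONS else "html"


def classify(url, content_type_header):
    """Prefer the Content-Type header; use the URL only if it is missing."""
    if content_type_header:
        if media_type(content_type_header) == "text/html":
            return "html"
        return "other"
    return guess_from_url(url)


# Hypothetical description-page URL whose name ends in ".jpg".
url = "http://commons.wikimedia.org/wiki/Image:Foo.jpg"
print(guess_from_url(url))                       # extension-only guess
print(classify(url, "text/html; charset=utf-8")) # header-based answer
```

The extension-only guess calls the page an image, while the header-based check identifies it as HTML, which is the mismatch described above.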

--
"Mathematicians do it with Nobel's wife."
_______________________________________________
Wikitech-l mailing list
[hidden email]
http://mail.wikipedia.org/mailman/listinfo/wikitech-l

Re: Google doesn't index the Commons

Christof Damian
On Wed, 18 Jan 2006, Brianna Laugher wrote:
>
> A related problem is that Google does not index the commons. Well,
> more or less. It is basically impossible to find any commons content
> in there - as you normally can by going ' keyword
> site:commons.wikimedia.org '. Personally I find this rather
> unbelievable and appalling on Google's behalf. Does anyone else
> think we should, um, make them aware of this? pfctdayelise 13:26, 16
> January 2006 (UTC)

It is rather annoying. I have a wiki with a large number of images too,
with descriptions on the "Image:" pages.

I thought about hacking mediawiki so that it changes the Image:*.jpg
to something else, without having to change any wiki text.
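
A minimal Python sketch of what such a rename could look like, purely as an illustration: the "/info" suffix and both function names are hypothetical, and this is not MediaWiki code. The idea is that the public URL of a description page no longer ends in an image extension, while the wiki text keeps using the original title.

```python
# Hypothetical suffix appended to description-page titles in public URLs.
SUFFIX = "/info"


def to_crawlable(title):
    """Map an "Image:" title to a URL title that does not end in .jpg etc."""
    if title.startswith("Image:") and not title.endswith(SUFFIX):
        return title + SUFFIX
    return title


def from_crawlable(url_title):
    """Recover the original title so existing wiki text needs no changes."""
    if url_title.startswith("Image:") and url_title.endswith(SUFFIX):
        return url_title[: -len(SUFFIX)]
    return url_title


print(to_crawlable("Image:Foo.jpg"))
print(from_crawlable(to_crawlable("Image:Foo.jpg")))
```

Titles outside the "Image:" namespace pass through unchanged in both directions.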

I don't think Google will fix their bot soon; this has been broken for
ages.

christof

--
Christof Damian        
[hidden email]

Re: Google doesn't index the Commons

erchache2000
Christof Damian wrote:

> It is rather annoying. I have a wiki with a large number of images too,
> with descriptions on the "Image:" pages.

I'm using the latest MediaWiki 1.4.x and have the same problem:
Googlebot is caching my pages, but they don't show up in any results
on the Google search page :-S

Re: Google doesn't index the Commons

Rob Church
> I'm using the latest MediaWiki 1.4.x

Hint: MediaWiki 1.4 is nowhere near the latest release, which is
1.5.5. I'd imagine the problem still affects that release, however,
and it affects HEAD too, given that that's what Commons runs. This
seems to be a Google oversight rather than our fault.


Rob Church


Re: Google doesn't index the Commons

Fxp
This issue, and others related to search engine indexing (see SEO:
search engine optimization), can be addressed by:
1) Adding a variable to the template so that the DESCRIPTION meta
differs between pages. For example, put the page title at the
beginning of the description.
2) Dropping the description meta and using only the KEYWORDS meta...
its variables can be adapted in the same way.
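
A minimal Python sketch of idea 1), assuming a hypothetical helper that builds the tag from the page title and a shared site description. The function name and the separator are assumptions for illustration, not part of any MediaWiki skin.

```python
from html import escape


def description_meta(page_title, site_description):
    """Build a description meta tag that starts with the page title."""
    content = f"{page_title} - {site_description}"
    # Escape quotes and angle brackets so the attribute value stays valid.
    return f'<meta name="description" content="{escape(content, quote=True)}" />'


# Two hypothetical pages sharing the same site-wide description text.
print(description_meta("Image:Foo.jpg", "Wikimedia Commons"))
print(description_meta("Image:Bar.png", "Wikimedia Commons"))
```

Because the title leads the description, every page gets a distinct tag even when the rest of the text is shared.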

To check how different sites are indexed, type directly into Google:

site:commons.wikimedia.org
or
site:wikimedia.org

These are only ideas

François

