No-indexing of project-space pages

Newyorkbrad (Wikipedia)
A couple of months ago, I raised on this list the issue of "no-indexing"
Wikipedia pages outside the mainspace, principally including project-space
pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
including userspace as well.  By no-indexing, I refer to coding these pages
such that they will not be picked up by Google or other search engines.

The desirability of this change has been noted by many people, including
very experienced Wikipedians.  As we all know, the popularity of Wikipedia
and the intensive number of internal links means that when a Wikipedia page
contains the name of a living individual, then unless the person is either
extremely notable or happens to have a common name, that page will almost
inevitably become a high-ranking, if not the highest ranking, search engine
result for that individual.  This raises issues enough when the search
result is a BLP or other mainspace article, but it is totally unacceptable
when the high-ranking result destined to follow the individual around
forever is something like:

-  An AfD deciding to delete an article about a person because of her
perceived lack of any sufficiently notable or meaningful accomplishments in
life (these can be courtesy-blanked on request, but how many subjects know
how even to ask); or
-  An RfA, involving a contributor who happens to edit under his real name,
which fails because the user was deemed unqualified for adminship; or
-  An arbitration case, in which an editor was severely criticized or even
banned for violations of Wikipedia policy - regrettable, but not
something for which it would serve any purpose to tar the person's RL
reputation forever; or
-  A long and heated discussion in an ancient ANI thread, again involving a
contributor who edits using her name, involving some ancient wiki-grievance
long forgotten ... until the contributor applies for a scholarship or a job
and someone Googles her name; or
-  An ArbCom election in which the user came in 17th place; or
-  An SSP report in which a user editing under a new name is indelibly
linked to a username based on his real name, which he chose to abandon
months or years earlier because of precisely these very concerns; or
-  A discussion on ANI noticeboard of defamatory or privacy-invading
material in a BLP or other article, which it is rightfully decided to delete
from the article itself ... except it remains preserved in the
noticeboard discussion (I do see that this aspect of the problem has been
addressed on the BLP noticeboard archives, but this type of discussion
occurs on ANI and elsewhere as well); or
-  Various other places where these issues, involving both article subjects
and Wikipedia contributors, continue to arise on a frequent basis.

It has been observed that being named on Wikipedia, whether for legitimate
reasons or otherwise, has a powerful potential to damage a person's life.
(See for example the BLP policy and its talkpage, the ArbCom decisions in
RfAr/Badlydrawnjeff and RfAr/Footnoted quotes, or discussion on various
criticism sites.)  As noted, this raises a troublesome enough suite of
issues when the person in question has been accurately discussed in the
encyclopedia itself.  It is really not acceptable when it occurs as a
happenstance of an ancillary discussion of an article subject or of a
contributor (even a misbehaving or a now-unwelcome contributor).

I have read more than enough complaints from people who have found
themselves in many of the unfortunate situations I describe here.  If they
are Wikipedians, they sometimes come to rue the day they ever thought of
contributing, much less contributing under a name linked to their real
identity.  If they are article subjects with no particular connection to
Wikipedia, they must surely find the situation maddening.  By comparison,
the benefits to the general public of being able to read through internal
Wikipedia discussions of this nature as the result of a casual Google search
must be reckoned, at the best, as slight.

In the prior thread, I believe there was significant support for
implementing coding necessary to cause "no-indexing" of projectspace and
possibly userspace and other-space pages.  The main counter-arguments were:

- That some project-space pages DO warrant indexing.  An example that was
given was the notability policy or the BLP policy.  The solution to this is
to have a "yes-index" feature that would override the no-index code on a
particular project-space page where indexing was agreed to be affirmatively
desirable.  Community discussion could come up with a list of those
particular pages in a week or so.
- That Wikipedia currently lacks a top-quality internal search capability,
and therefore we need to be able to use external search engines such as
Google to perform administrator functions and the like.  There is some merit
to this observation; I certainly have used Google to hunt down references I
remembered when I was writing arbitration decisions, for example.  But
internal administrative convenience is not a good argument to disregard real
harm that we are inadvertently causing to specific individuals.  The
developers can and probably should be tasked, as a high priority, with
improving the search capabilities; but it has been too long since the
problems I have described in this e-mail were identified, and it is time they
were solved.
- The most cynical response has been that Wikipedia thrives on Google-rank
created by internal links and is not going to do anything that would lessen
its page-ranks, whether out of pride or for some conjectured eventual
mercenary reason.  Actually, this was not a counter-argument presented on
WikiEN-l; it's a cynical speculation about motivations that was presented on a
criticism site.  I give it no credence, but it would be easy enough to
disprove once and for all.

Wikipedia and its community are often criticized for irresponsibly
neglecting the negative effects of the project on some of its subjects and
some of its contributors.  We have here an opportunity to take an
incremental but meaningful step toward addressing a group of related,
significant concerns.  I would like to urge that the on-again, off-again
discussion of this proposal proceed to a conclusion either here or on-wiki
and that some definitive action be taken in the near future.

(Finally, I would appreciate it if responses could focus on the substance of
this post and not on the identity of its author.)

Regards,
Newyorkbrad
_______________________________________________
WikiEN-l mailing list
[hidden email]
To unsubscribe from this mailing list, visit:
https://lists.wikimedia.org/mailman/listinfo/wikien-l

Re: No-indexing of project-space pages

Stephen Bain
On Wed, Jul 23, 2008 at 10:47 AM, Newyorkbrad (Wikipedia)
<[hidden email]> wrote:
> A couple of months ago, I raised on this list the issue of "no-indexing"
> Wikipedia pages outside the mainspace, principally including project-space
> pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> including userspace as well.  By no-indexing, I refer to coding these pages
> such that they will not be picked up by Google or other search engines.

Note that much of this is already done, see our robots file:

http://en.wikipedia.org/robots.txt

Currently all AFD, RFA, RFC and RFAR subpages (but not the main AFD
page, the main RFA page etc) are blocked from indexing. Of your
examples the admin noticeboard and userspace are probably the big
examples of pages that are still indexed that we might not want to be
so.

Note that the robots file can easily be updated by a request on
bugzilla [1] if there is consensus for it.
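
As a rough illustration of how rules of this kind behave, the sketch below feeds a made-up rule set to Python's standard `urllib.robotparser` and checks a few paths. The two `Disallow` lines are assumptions written for this example, not the contents of the real robots file, and `build_parser` and `is_crawlable` are hypothetical helper names.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT copied from the real robots file.
# The trailing slash means only subpages match, not the main page itself.
RULES = """\
User-agent: *
Disallow: /wiki/Wikipedia:Articles_for_deletion/
Disallow: /wiki/Wikipedia:Requests_for_adminship/
""".splitlines()


def build_parser(lines):
    """Parse robots.txt lines held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


def is_crawlable(parser, path, agent="Googlebot"):
    """Return True if the rules let `agent` fetch `path` on the wiki host."""
    return parser.can_fetch(agent, "http://en.wikipedia.org" + path)


PARSER = build_parser(RULES)
```

With these assumed rules, a subpage such as `/wiki/Wikipedia:Articles_for_deletion/Example` is refused, while the main AFD page and ordinary articles stay crawlable, which matches the subpage-only blocking described above.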

> - That Wikipedia currently lacks a top-quality internal search capability,
> and therefore we need to be able to use external search engines such as
> Google to perform administrator functions and the like.  There is some merit

On this point, there's been great improvement in MediaWiki's search
capabilities this year with the MWSearch backend coming online.

----
[1] Like this request, for example:
https://bugzilla.wikimedia.org/show_bug.cgi?id=10288

--
Stephen Bain
[hidden email]


Re: No-indexing of project-space pages

Joe Szilagyi
On Wed, Jul 23, 2008 at 7:24 AM, Stephen Bain <[hidden email]>
wrote:

> On Wed, Jul 23, 2008 at 10:47 AM, Newyorkbrad (Wikipedia)
> <[hidden email]> wrote:
> > A couple of months ago, I raised on this list the issue of "no-indexing"
> > Wikipedia pages outside the mainspace, principally including project-space
> > pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> > including userspace as well.  By no-indexing, I refer to coding these pages
> > such that they will not be picked up by Google or other search engines.
>
> Note that much of this is already done, see our robots file:
>
> http://en.wikipedia.org/robots.txt
>
> Currently all AFD, RFA, RFC and RFAR subpages (but not the main AFD
> page, the main RFA page etc) are blocked from indexing. Of your
> examples the admin noticeboard and userspace are probably the big
> examples of pages that are still indexed that we might not want to be
> so.
>

Just to pick everyone's favorite topic as an example:

http://www.google.com/search?hl=en&pwst=1&q=+site:en.wikipedia.org+%22Articles+for+deletion%22+Brandt+wikipedia

What is the benefit to allowing Google to index DRV, talk pages, and
user/user talk pages? Aside from the MediaWiki native search function not
always being that great, the only negative to blocking or restricting search
engines to cover strictly article space would be a possible loss of
Google Juice, which should not be a concern.

- Joe

Re: No-indexing of project-space pages

Newyorkbrad (Wikipedia)
In reply to this post by Stephen Bain
On 7/23/08, Stephen Bain <[hidden email]> wrote:

>
> On Wed, Jul 23, 2008 at 10:47 AM, Newyorkbrad (Wikipedia)
> <[hidden email]> wrote:
> > A couple of months ago, I raised on this list the issue of "no-indexing"
> > Wikipedia pages outside the mainspace, principally including project-space
> > pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> > including userspace as well.  By no-indexing, I refer to coding these pages
> > such that they will not be picked up by Google or other search engines.
>
> Note that much of this is already done, see our robots file:
>
> http://en.wikipedia.org/robots.txt
>
> Currently all AFD, RFA, RFC and RFAR subpages (but not the main AFD
> page, the main RFA page etc) are blocked from indexing. Of your
> examples the admin noticeboard and userspace are probably the big
> examples of pages that are still indexed that we might not want to be
> so.
>
> Note that the robots file can easily be updated by a request on
> bugzilla [1] if there is consensus for it.
>
> > - That Wikipedia currently lacks a top-quality internal search capability,
> > and therefore we need to be able to use external search engines such as
> > Google to perform administrator functions and the like.  There is some merit
>
> On this point, there's been great improvement in MediaWiki's search
> capabilities this year with the MWSearch backend coming online.
>
> ----
> [1] Like this request, for example:
> https://bugzilla.wikimedia.org/show_bug.cgi?id=10288
>
> --
> Stephen Bain
> [hidden email]



Thank you for this update.  I think there may have been progress that I have
missed in the past couple of months.  When I posted on this topic a few
months ago, either some of these types of pages were not yet no-indexed, or
no one mentioned the fact, or if they did I overlooked it.

Other pages that should be excluded from indexing (if they aren't already)
include SSP, RfCU, the old PAIN archives, WQA, and I'm sure people can put
together a list of a few more.

As for userspace, I think the optimal solution would be to allow the
individual user to opt in or out of indexing, if that is doable without too
much fuss.  (And indefblocked or banned users would automatically be
no-indexed, to give those with identifiable usernames one fewer grievance to
pursue after they have left us.)  Query whether "in" or "out" would be the
better default.

Newyorkbrad

Re: No-indexing of project-space pages

Grease Monkee
In reply to this post by Newyorkbrad (Wikipedia)
On Tue, Jul 22, 2008 at 5:47 PM, Newyorkbrad (Wikipedia) <
[hidden email]> wrote:

> A couple of months ago, I raised on this list the issue of "no-indexing"
> Wikipedia pages outside the mainspace, principally including project-space
> pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> including userspace as well.  By no-indexing, I refer to coding these pages
> such that they will not be picked up by Google or other search engines.
>
> Regards,
> Newyorkbrad
>

Newyorkbrad is a member of Wikipedia Review; he is therefore a troll, or
possibly brainwashed, and must not be listened to. Moderators, thank you for
moderating the post initially. Too bad we can't just ban all the bad people
so all us good people can work in peace.

regards ...

Re: No-indexing of project-space pages

Risker
In reply to this post by Newyorkbrad (Wikipedia)
2008/7/23 Newyorkbrad (Wikipedia) <[hidden email]>:

>
>
> As for userspace, I think the optimal solution would be to allow the
> individual user to opt in or out of indexing, if that is doable without too
> much fuss.  (And indefblocked or banned users would automatically be
> no-indexed, to give those with identifiable usernames one fewer grievance to
> pursue after they have left us.)  Query whether "in" or "out" would be the
> better default.
>

I am of the belief that only mainspace should be indexed by default, and
that only limited project space (e.g., policies) should be indexed but that
all other areas should be no-index by default. Userspace should default to
no-index, in particular. I recall a very contentious BLP-related discussion
that took place over several months, and was discussed not only in project
space and on the talk pages of related articles, but was also discussed on
multiple user pages. When I do a search for that subject now, ("name of
celebrity" +"BLP issue" +wikipedia) what comes up is the many discussions on
userpages, perpetuating the BLP problem.

Risker

Re: No-indexing of project-space pages

Newyorkbrad (Wikipedia)
In reply to this post by Joe Szilagyi
On 7/23/08, Joe Szilagyi <[hidden email]> wrote:

>
> On Wed, Jul 23, 2008 at 7:24 AM, Stephen Bain <[hidden email]>
> wrote:
>
> > On Wed, Jul 23, 2008 at 10:47 AM, Newyorkbrad (Wikipedia)
> > <[hidden email]> wrote:
> > > A couple of months ago, I raised on this list the issue of "no-indexing"
> > > Wikipedia pages outside the mainspace, principally including project-space
> > > pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> > > including userspace as well.  By no-indexing, I refer to coding these pages
> > > such that they will not be picked up by Google or other search engines.
> >
> > Note that much of this is already done, see our robots file:
> >
> > http://en.wikipedia.org/robots.txt
> >
> > Currently all AFD, RFA, RFC and RFAR subpages (but not the main AFD
> > page, the main RFA page etc) are blocked from indexing. Of your
> > examples the admin noticeboard and userspace are probably the big
> > examples of pages that are still indexed that we might not want to be
> > so.
> >
>
> Just to pick everyone's favorite topic as an example:
>
>
> http://www.google.com/search?hl=en&pwst=1&q=+site:en.wikipedia.org+%22Articles+for+deletion%22+Brandt+wikipedia
>
> What is the benefit to allowing Google to index DRV, talk pages, and
> user/user talk pages? Aside from the Mediawiki native search function not
> being always that great, the only negative to blocking or restricting Search
> Engines to just cover strictly Article space would be a possible loss of
> Google Juice, which should not be a concern.
>
> - Joe


Does the current exclusion of XfD's include DRV as well?

Newyorkbrad

Re: No-indexing of project-space pages

Risker
In reply to this post by Grease Monkee
2008/7/23 Grease Monkee <[hidden email]>:

> On Tue, Jul 22, 2008 at 5:47 PM, Newyorkbrad (Wikipedia) <
> [hidden email]> wrote:
>
> > A couple of months ago, I raised on this list the issue of "no-indexing"
> > Wikipedia pages outside the mainspace, principally including project-space
> > pages such as XfDs, AN/ANI, RfA's, RfAr's, and the like, but possibly
> > including userspace as well.  By no-indexing, I refer to coding these pages
> > such that they will not be picked up by Google or other search engines.
> >
> > Regards,
> > Newyorkbrad
> >
>


>
> Newyorkbrad is a member of Wikipedia Review; he is therefore a troll, or
> possibly brainwashed, and must not be listened to. Moderators, thank you for
> moderating the post initially. Too bad we can't just ban all the bad people
> so all us good people can work in peace.
>
> regards ...
>

Sorry, Grease Monkee, I think the thread you were looking for was the one
entitled "Dangerous factionalism."  Perhaps the mods can move it for you.

Risker

Re: No-indexing of project-space pages

Joe Szilagyi
In reply to this post by Newyorkbrad (Wikipedia)
On Wed, Jul 23, 2008 at 8:31 AM, Newyorkbrad (Wikipedia) <
[hidden email]> wrote:

> Does the current exclusion of XfD's include DRV as well?
>

Nope:

http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla:en-US:official&hs=hpL&pwst=1&q=+site:en.wikipedia.org+%22deletion+review%22

- Joe

Re: No-indexing of project-space pages

Puddl Duk
In reply to this post by Risker
On Wed, Jul 23, 2008 at 8:32 AM, Risker <[hidden email]> wrote:

> 2008/7/23 Grease Monkee <[hidden email]>:
>>
>> Newyorkbrad is a member of Wikipedia Review; he is therefore a troll, or
>> possibly brainwashed, and must not be listened to. Moderators, thank you for
>> moderating the post initially. Too bad we can't just ban all the bad people
>> so all us good people can work in peace.
>>
>> regards ...
>>
>
> Sorry, Grease Monkee, I think the thread you were looking for was the one
> entitled "Dangerous factionalism."  Perhaps the mods can move it for you.
>
> Risker

Oh no, I'm quite sure this is the right thread. We MUST NOT allow our
minds to be poisoned by these people, and the only way to do it is to
STAMP OUT their voice on this list. I mean after all, the next thing
you know something crazy might happen, like admitting that a
WikipediaReviewer actually had a good idea.


Re: No-indexing of project-space pages

Gregory Maxwell
In reply to this post by Risker
On Wed, Jul 23, 2008 at 11:29 AM, Risker <[hidden email]> wrote:
> I am of the belief that only mainspace should be indexed by default, and
> that only limited project space (e.g., policies) should be indexed but that
> all other areas should be no-index by default. Userspace should default to
> no-index, in particular. I recall a very contentious BLP-related discussion
> that took place over several months, and was discussed not only in project
> space and on the talk pages of related articles, but was also discussed on
> multiple user pages. When I do a search for that subject now, ("name of
> celebrity" +"BLP issue" +wikipedia) what comes up is the many discussions on
> userpages, perpetuating the BLP problem.

Agreed: http://lists.wikimedia.org/pipermail/wikien-l/2007-September/081682.html
(and some other posts in that month; the link there at the time was to
a google search that showed that 'Thomas Dalton' #1 hit on google was
a "this user has been banned from Wikipedia" userpage notice)

I'd also add portal namespace to the indexable stuff.  But yea...
indexing Main + Portal + named pages elsewhere would be really good.
It would produce the right results for the vast majority of the
searchers.

(and I for one don't mind that this thread was begun by an obviously
evil troll!  ;) )


Re: No-indexing of project-space pages

Charlotte Webb
In reply to this post by Joe Szilagyi
On 7/23/08, Joe Szilagyi <[hidden email]> wrote:
> What is the benefit to allowing Google to index DRV, talk pages, and
> user/user talk pages? Aside from the Mediawiki native search function not
> being always that great, the only negative to blocking or restricting Search
> Engines to just cover strictly Article space would be a possible loss of
> Google Juice, which should not be a concern.

As far as I'm concerned, Google juice, i.e. page-rank and whatnot can
go jump in the lake.

1. Build a search engine of Google-esque calibre (boolean +A +B -"C D"
etc. to search any and all WMF projects of the user's choosing),

2. Put it on the toolserver,

3. Configure the toolserver's robots.txt to unwelcome Google, at least
from indexing anything related to the toolserver search engine.

4. Configure all WMF projects' robots.txt to welcome Google indexing
only of main-space, article, portal, etc. "content" pages.

5. (optional, sounds quite tricky) Split the category namespace.
Figure out some way to train google-bot to:

index content categories like
*Category:English popes
*Category:Bob Dylan songs
*Category:Pacific Ocean

ignore logistical crap like
*Category:Articles with unsourced statements since December 2006
*Category:Unsuccessful requests for adminship
*Category:Suspected Wikipedia sockpuppets of Janis Doe
*Category:Start-Class biography (sports and games) articles
*Category:Sports templates by country
etc. etc.

could somebody think of a reliable way to do this, short of creating a
separate name-space?

—C.W.


Re: No-indexing of project-space pages

Gregory Maxwell
On Wed, Jul 23, 2008 at 12:09 PM, Charlotte Webb
<[hidden email]> wrote:
> 5. (optional, sounds quite tricky) Split the category namespace.
> Figure out some way to train google-bot to:
[snip]

That could be addressed with a __NOINDEX__ parser directive that could
be applied on a page by page basis for things like that...  the
complication there is that eventually people would abuse it to hide
things in places we normally expect to be indexed.


Re: No-indexing of project-space pages

Chris Howie
In reply to this post by Charlotte Webb
On Wed, Jul 23, 2008 at 12:09 PM, Charlotte Webb
<[hidden email]> wrote:
> could somebody think of a reliable way to do this, short of creating a
> separate name-space?

From a technical standpoint we just need a way to either:

1. Insert any <meta/> tag from the article content.  (Bad idea for
security reasons.)

2. Make some template-esque tag like {{{noindex}}} that will instruct
the engine to include the following tag in the <head/> element:

<meta name="robots" content="noindex" />

Note that nofollow should not be present, because we do want it to
crawl to the linked articles.  We just don't want the category
indexed.

This tag should follow transclusion rules -- then we could just insert
this into whatever template we currently use to mark such categories
(if any).
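
To make the "follow transclusion rules" idea concrete, here is a minimal sketch, assuming a toy template expander rather than the real MediaWiki parser. The marker string, the `expand` and `robots_meta` helpers, and the template name in the usage note are all hypothetical; only the emitted `<meta>` tag is taken from the proposal above.

```python
import re

# Hypothetical marker a page or template could contain to opt out of indexing.
NOINDEX_MARKER = "__NOINDEX__"

# Matches a simple {{Template name}} call with no parameters (toy syntax only).
TEMPLATE_CALL = re.compile(r"\{\{([^{}|]+)\}\}")


def expand(wikitext, templates, depth=0):
    """Toy transclusion: replace {{Name}} with that template's text."""
    if depth > 10:  # crude guard against templates that include each other
        return wikitext

    def substitute(match):
        body = templates.get(match.group(1).strip())
        if body is None:
            return match.group(0)  # unknown template: leave the call as-is
        return expand(body, templates, depth + 1)

    return TEMPLATE_CALL.sub(substitute, wikitext)


def robots_meta(wikitext, templates):
    """Return the tag for <head> if the expanded page carries the marker."""
    if NOINDEX_MARKER in expand(wikitext, templates):
        # No "nofollow": crawlers should still follow links to the articles.
        return '<meta name="robots" content="noindex" />'
    return ""
```

For example, if a hypothetical "Maintenance category" template contains the marker, any category page that transcludes it would get the tag, while a page without the template would get nothing.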


P.S. Regarding Gregory's response (that came in while writing this)
potential abuse is not really a concern.  We have a block button.  The
trick is coming up with a policy or guideline on usage so people know
what's acceptable and what's not.

Alternately (thinking while I type here, bear with me) we could have a
MediaWiki: page listing pages that we don't want indexed.  Possibly
specifying a template would catch all pages that template is
transcluded to?  Then it could be protected if it became an issue.

--
Chris Howie
http://www.chrishowie.com
http://en.wikipedia.org/wiki/User:Crazycomputers


Re: No-indexing of project-space pages

Newyorkbrad (Wikipedia)
In reply to this post by Gregory Maxwell
On 7/23/08, Gregory Maxwell <[hidden email]> wrote:

>
> On Wed, Jul 23, 2008 at 12:09 PM, Charlotte Webb
> <[hidden email]> wrote:
> > 5. (optional, sounds quite tricky) Split the category namespace.
> > Figure out some way to train google-bot to:
> [snip]
>
> That could be addressed with a __NOINDEX__ parser directive that could
> be applied on a page by page basis for things like that...  the
> complication there is that eventually people would abuse it to hide
> things in places we normally expect to be indexed.


Changing the default indexing status of a page (index to no-index or vice
versa) could theoretically be made an admin-only function (and would count
as use of an administrator tool, with the accountability implied thereby).
However, this implies longer-term mediawiki programming changes of unknown
complexity, so certainly shouldn't become a barrier to other progress.

Newyorkbrad
_______________________________________________
WikiEN-l mailing list
[hidden email]
To unsubscribe from this mailing list, visit:
https://lists.wikimedia.org/mailman/listinfo/wikien-l
Reply | Threaded
Open this post in threaded view
|

Re: No-indexing of project-space pages

Eugene van der Pijll
In reply to this post by Chris Howie
Chris Howie schreef:
> P.S. Regarding Gregory's response (that came in while writing this)
> potential abuse is not really a concern.  We have a block button.

Indeed.

The worst abuse that can happen is that vandals un-noindex libelous
information, so that it shows up in Google.

But consider that as long as article space is indexed (which it should
be -- we shouldn't put anything in the main namespace that we wouldn't
be happy about showing up in Google) vandals will always be able to do
just that, by adding the info to an article.

Our strategy for the latter tactic is to block those vandals; there is
no need for stronger measures for vandals who tamper with no-index.

Eugene


Re: No-indexing of project-space pages

Gregory Maxwell
In reply to this post by Chris Howie
On Wed, Jul 23, 2008 at 12:20 PM, Chris Howie <[hidden email]> wrote:
[snip]
> 2. Make some template-esque tag like {{{noindex}}} that will instruct
> the engine to include the following tag in the <head/> element:

Parser directives like __noindex__ are the MediaWiki-esque way of
accomplishing things like this ... but it's all the same.

> P.S. Regarding Gregory's response (that came in while writing this)
> potential abuse is not really a concern.  We have a block button.  The
> trick is coming up with a policy or guideline on usage so people know
> what's acceptable and what's not.

It's not just me pointing this out... proposals like this have been
previously rejected on this basis:

https://bugzilla.wikimedia.org/show_bug.cgi?id=9415
https://bugzilla.wikimedia.org/show_bug.cgi?id=8068

Blocking is a good tool to stop abuse but it only works once we've
found it. Someone could sneakily create __noindex__ pages, especially
via transcluding no-indexing templates.

Also of relevance to this discussion please see:
https://bugzilla.wikimedia.org/show_bug.cgi?id=11443

> Alternately (thinking while I type here, bear with me) we could have a
> MediaWiki: page listing pages that we don't want indexed.  Possibly
> specifying a template would catch all pages that template is
> transcluded to?  Then it could be protected if it became an issue.

Having to read some enormous page every page-load wouldn't be good. It
would be better to do the right thing on average per-namespace then
use something in the pages to control exceptions.


Re: No-indexing of project-space pages

Gregory Maxwell
In reply to this post by Newyorkbrad (Wikipedia)
On Wed, Jul 23, 2008 at 12:26 PM, Newyorkbrad (Wikipedia)
<[hidden email]> wrote:
> Changing the default indexing status of a page (index to no-index or vice
> versa) could theoretically be made an admin-only function (and would count
> as use of an administrator tool, with the accountability implied thereby).
> However, this implies longer-term mediawiki programming changes of unknown
> complexity, so certainly shouldn't become a barrier to other progress.

Right. Perfect is the enemy of good.  Get the defaults sane
per-namespace and then we'll be motivated to figure out how to set the
defaults.

I'd propose that only Main, Portal, Category, and Image be indexed.
With Category eventually slimmed down via some more selective process,
and Wikipedia: eventually puffed up.   ... though I'd support any and
all proposals that cut down on the indexing of 'meta' namespaces.
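
The "sane defaults per namespace, exceptions per page" scheme can be sketched roughly as follows. The namespace set comes from the proposal above; the `is_indexable` function and its `override` parameter are hypothetical names used only for illustration, not an existing MediaWiki setting.

```python
from typing import Optional

# Namespaces indexed by default, per the proposal above.
INDEXED_BY_DEFAULT = {"Main", "Portal", "Category", "Image"}


def is_indexable(namespace: str, override: Optional[bool] = None) -> bool:
    """Decide whether a page should be offered to search engines.

    `override` stands in for a per-page exception: True forces indexing,
    False forces no-indexing, None falls back to the namespace default.
    """
    if override is not None:
        return override
    return namespace in INDEXED_BY_DEFAULT
```

Under this sketch a policy page in the Wikipedia: namespace would be no-indexed unless marked with `override=True`, and a maintenance category could be excluded with `override=False`, so the lookup stays a cheap per-page check rather than a scan of one large exceptions list.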


Re: No-indexing of project-space pages

Chris Howie
In reply to this post by Gregory Maxwell
On Wed, Jul 23, 2008 at 12:35 PM, Gregory Maxwell <[hidden email]> wrote:

> On Wed, Jul 23, 2008 at 12:20 PM, Chris Howie <[hidden email]> wrote:
>> P.S. Regarding Gregory's response (that came in while writing this)
>> potential abuse is not really a concern.  We have a block button.  The
>> trick is coming up with a policy or guideline on usage so people know
>> what's acceptable and what's not.
>
> It's not just me pointing this out... proposals like this have been
> previously rejected on this basis:
>
> https://bugzilla.wikimedia.org/show_bug.cgi?id=9415
> https://bugzilla.wikimedia.org/show_bug.cgi?id=8068
>
> Blocking is a good tool to stop abuse but it only works once we've
> found it. Someone could sneakily create __noindex__ pages, especially
> via transcluding no-indexing templates.

People do sneaky mainspace vandalism too.

>> Alternately (thinking while I type here, bear with me) we could have a
>> MediaWiki: page listing pages that we don't want indexed.  Possibly
>> specifying a template would catch all pages that template is
>> transcluded to?  Then it could be protected if it became an issue.
>
> Having to read some enormous page every page-load wouldn't be good. It
> would be better to do the right thing on average per-namespace then
> use something in the pages to control exceptions.

That is how I meant it -- a page of exceptions.  In the case of
categories, it could point at just a template we put on
non-encyclopedic categories, if "noindex-by-transclusion" can work.

--
Chris Howie
http://www.chrishowie.com
http://en.wikipedia.org/wiki/User:Crazycomputers


Re: No-indexing of project-space pages

Gregory Maxwell
On Wed, Jul 23, 2008 at 12:42 PM, Chris Howie <[hidden email]> wrote:
>> https://bugzilla.wikimedia.org/show_bug.cgi?id=9415
>> https://bugzilla.wikimedia.org/show_bug.cgi?id=8068
>>
>> Blocking is a good tool to stop abuse but it only works once we've
>> found it. Someone could sneakily create __noindex__ pages, especially
>> via transcluding no-indexing templates.
>
> People do sneaky mainspace vandalism too.

Indeed. In mainspace. And hide it from being found by noindexing it.
::shrugs::.  It's not primarily my argument. Go read the bugzilla
entries I linked to.

>> Having to read some enormous page every page-load wouldn't be good. It
>> would be better to do the right thing on average per-namespace then
>> use something in the pages to control exceptions.
>
> That is how I meant it -- a page of exceptions.  In the case of
> categories, it could point at just a template we put on
> non-encyclopedic categories, if "noindex-by-transclusion" can work.

An explicit list of exemptions could reasonably grow very large and
it would need to be scanned for membership every time a page is
parsed. I would be somewhat surprised if there were not >1000
meta-categories already.  Go look at how __NOTOC__ works, that would
be the most logical way of doing this in mediawiki.  Thoughts?
