Applying nofollow only to external links added in revisions that are still unpatrolled


Applying nofollow only to external links added in revisions that are still unpatrolled

Nathan Larson
Following the mediawiki-l discussion
<http://lists.wikimedia.org/pipermail/mediawiki-l/2013-November/042038.html>
about $wgNoFollowLinks and various other discussions, in which some
discontent was expressed with the current two options of either applying or
not applying nofollow to all external links, I wanted to see what support
there might be for applying nofollow only to external links added in
revisions that are still unpatrolled (bug 42599
<https://bugzilla.wikimedia.org/show_bug.cgi?id=42599>).

How common do you think it would be for a use case to arise in which one
could be confident that a revision's being patrolled means that the
external links added in that revision have been adequately reviewed for
spamminess? Nemo had mentioned "sysadmins would be interested in this only
if their wiki has a strict definition of what's patrollable which matches
the assumptions here." In my experience, spam is pretty easy to spot
because the bots aren't very subtle about it.

I would think that if someone went around marking such obviously spammy
edits as patrolled, and there were any bureaucrats around who cared
about keeping spam off the wiki, his patrol rights would end up getting
taken away. Spam is a form of vandalism, so it would fall under the duties
of patrollers. At Wikipedia, RecentChanges patrollers are expected to be on
the lookout for spam.
https://en.wikipedia.org/wiki/Wikipedia:Recent_changes_patrol#Spam

--
Nathan Larson <https://mediawiki.org/wiki/User:Leucosticte>
_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Mark A. Hershberger-4
On 11/17/2013 06:41 AM, Nathan Larson wrote:
> I wanted to see what support
> there might be for applying nofollow only to external links added in
> revisions that are still unpatrolled (bug 42599)

I think I could support this.

After that wiki-spammer site last week, I went looking for various forums
and such where these things are discussed and saw that they
(wiki-spammers) don't seem to see nofollow as a real impediment to their
work.

So, I won't say we should just drop nofollow, but it obviously doesn't
put much in the way of spammers.

Mark.




Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Happy Melon-2
In reply to this post by Nathan Larson
On 17 November 2013 11:41, Nathan Larson <[hidden email]> wrote:

> In my experience, spam is pretty easy to spot
> because the bots aren't very subtle about it.
>

I'm sure spam directed at, say, enwiki, would get very subtle very quickly
if spammers thought there was a real chance of it being able to use
enwiki's pagerank weight.  Don't underestimate spammers' ability to learn
and adapt.

--HM

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Ryan Kaldari-2
On Mon, Nov 18, 2013 at 3:09 PM, Happy Melon <[hidden email]> wrote:

>
> I'm sure spam directed at, say, enwiki, would get very subtle very quickly
> if spammers thought there was a real chance of it being able to use
> enwiki's pagerank weight.  Don't underestimate spammers' ability to learn
> and adapt.
>

+1. I think this would be a very bad idea. If we opened up external links
to Google, I'm sure it would only be a matter of time before spammers
started figuring out how to get revision reviewing rights. It's just a
question of economics. Which is cheaper: paying an SEO company $5000 to
improve your pagerank or paying a Wiki-PR editor $100 to do the same (and
probably more effectively)?

Ryan Kaldari

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Marc-Andre
In reply to this post by Happy Melon-2
On 11/18/2013 04:39 AM, Happy Melon wrote:
> I'm sure spam directed at, say, enwiki, would get very subtle very quickly
> if spammers thought there was a real chance of it being able to use
> enwiki's pagerank weight.  Don't underestimate spammers' ability to learn
> and adapt.

Also +1; pagerank is a valuable thing and Wikipedia has lots of it.
Spammers would be quick to find ways to cheat, lie and manipulate their
way into tapping into it.

Right now, we are plagued with the spammers that are too desperate or
stupid to care; if we turned nofollow off, they would all descend upon
us like a plague of locusts.

-- Marc



Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Arcane 21
I agree. While spammers are so pathetic they will do anything for page views, I have to admire (and detest) their ability to adapt in order to spread their nonsense.

Anything that slows them down, even in the slightest degree, is something I support and recommend.


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Risker
I also agree.

Perhaps more importantly, I don't see any actual argument for *not* using
nofollow.  We're not here to drive pagerank for other websites, and our
doing so can be harmful to those sites, or to the article subject.

Risker


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Tyler Romeo
On Mon, Nov 18, 2013 at 11:21 AM, Risker <[hidden email]> wrote:

> Perhaps more importantly, I don't see any actual argument for *not* using
> nofollow.  We're not here to drive pagerank for other websites, and our
> doing so can be harmful to those sites, or to the article subject.
>

Wikipedia's purpose may not be to drive PageRank, but nonetheless I think
the argument for using nofollow is pretty clear. Why would Wikipedia want
to purposely make search engine results less useful? The question here is
whether spammers are smart enough to get around us and boost their PageRank
artificially, which seems to be the case.

--
Tyler Romeo
Stevens Institute of Technology, Class of 2016
Major in Computer Science

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Gabriel Wicke-3
In reply to this post by Nathan Larson
On 11/17/2013 03:41 AM, Nathan Larson wrote:
> Following the mediawiki-l discussion
> <http://lists.wikimedia.org/pipermail/mediawiki-l/2013-November/042038.html>
> about $wgNoFollowLinks and various other discussions, in which some
> discontent was expressed with the current two options of either applying or
> not applying nofollow to all external links, I wanted to see what support
> there might be for applying nofollow only to external links added in
> revisions that are still unpatrolled (bug 42599
> <https://bugzilla.wikimedia.org/show_bug.cgi?id=42599>).

Google and probably other search engines have a custom rule to ignore
rel=nofollow in MediaWiki-powered wikis. It seems that our external
links are too high quality to pass up. See
https://bugzilla.wikimedia.org/show_bug.cgi?id=52617.

Gabriel


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Nathan Larson
In reply to this post by Tyler Romeo
To aggregate some of the arguments and counter-arguments, I posted
https://www.mediawiki.org/wiki/The_dofollow_FAQ and
https://www.mediawiki.org/wiki/Manual:Costs_and_benefits_of_using_nofollow.
It does seem, from my googling of what the owners of smaller wikis have
to say about it, that nofollow is less popular outside of WMF with many of
those wiki owners who have taken the time to analyze the issue. On the
other hand, it could be that people who were happy with the default felt
less dissatisfied with MediaWiki devs' decision and therefore didn't feel
as much need to voice their opinions, since they had already gotten their
way and didn't have to take any measures to override the default.

I do think the implications of changing how nofollow is applied are very
different on, say, Wikipedia than they would be on a small or even
medium-sized wiki where the average user watches RecentChanges instead of a
watchlist. In a small town, you can leave your doors unlocked and get away
with it because you don't have as much traffic coming through and the
neighbors would notice and care about (for curiosity, if no other reason)
the presence of anyone who seemed out of place. It's the same way on these
small wikis; it's rare that anyone comes along to try to subtly add a spam
link, and when they do, it's noticed. Likewise, if someone starts marking
spammy edits as patrolled, that gets noticed.

Spambots are not yet able to be subtle, and getting accustomed to the
norms of a wiki and becoming fluent enough in the native language to fit
in requires skilled labor that is more expensive than that
required to simply pass a CAPTCHA. So, I think that putting dofollow on
patrolled external links would be okay especially on smaller wikis, as the
patrol would stop the spambots from getting a pagerank boost and the labor
costs would deter the subtler ones. Even on Wikipedia, those fighting spam
can take advantage of the same economies of scale as those adding spam,
such as using pattern recognition on the entire wiki to catch people, or
blacklisting individual spammers and taking measures to keep them out (on
the smaller wikis, a person caught spamming can just go to another wiki,
but if you're caught spamming on Wikipedia, there isn't another site of
Wikipedia's size and scope you can go to.)

To say that patrolling wouldn't do enough to keep spam out is basically to
say, at least to some extent, that patrolling is not a very effective
system and that the wiki way doesn't work very well. If Google agrees, they
can stop giving wikis in general, or certain wikis, such influence over
pagerank. The spammers have market incentives to become more sophisticated,
but so does Google, since their earnings depend on keeping their search
results relevant and useful, so that people don't switch to competitors
that do a better job.

The question of what the default configuration should be, or what
configuration should be used on WMF sites, can be addressed in other bugs
besides this one. It doesn't take much coding to change a default setting
from "true" to "false". For now, I would just like to implement the feature
and make it available for those wikis who want to use it. So, is there
support for putting this in the core as an optional feature, and is there
anyone who will do the code review if I write this?
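
To make the proposal concrete, the decision logic could look roughly like
the sketch below. It is written in Python rather than MediaWiki's PHP,
purely for illustration. ExternalLink, added_in_rev, rel_for_link and the
nofollow_unpatrolled_only flag are hypothetical names, not existing
MediaWiki code; only nofollow_links is meant to mirror $wgNoFollowLinks.
The sketch also assumes the wiki can tell which revision introduced each
link, which is the hard part and is not shown here.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class ExternalLink:
    url: str
    added_in_rev: int  # hypothetical: ID of the revision that introduced the link


def rel_for_link(
    link: ExternalLink,
    patrolled_revs: Set[int],
    nofollow_links: bool = True,              # mirrors $wgNoFollowLinks
    nofollow_unpatrolled_only: bool = False,  # hypothetical new option, off by default
) -> Optional[str]:
    """Return the rel value for an external link, or None for no rel attribute."""
    if not nofollow_links:
        # nofollow is disabled wiki-wide, as when $wgNoFollowLinks is false
        return None
    if nofollow_unpatrolled_only and link.added_in_rev in patrolled_revs:
        # the revision that added this link has been patrolled
        return None
    # default behaviour: every external link gets nofollow
    return "nofollow"


link = ExternalLink("http://example.com", added_in_rev=7)
print(rel_for_link(link, patrolled_revs={7}))
print(rel_for_link(link, patrolled_revs={7}, nofollow_unpatrolled_only=True))
```

With the new flag left off, behaviour is unchanged and the first call
yields "nofollow"; with it on, the second call yields None because
revision 7 has been patrolled.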

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Gabriel Wicke-3
On 11/18/2013 09:46 AM, Nathan Larson wrote:
> I do think the implications of changing how nofollow is applied are very
> different on, say, Wikipedia than they would be on a small or even
> medium-sized wiki

As I said, at least for Google there should be no difference as it
ignores rel=nofollow on MediaWiki-powered sites anyway. See
https://bugzilla.wikimedia.org/show_bug.cgi?id=52617.

Gabriel


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Happy Melon-2
In reply to this post by Nathan Larson
On 18 November 2013 17:46, Nathan Larson <[hidden email]> wrote:


> If Google agrees, they
> can stop giving wikis in general, or certain wikis, such influence over
> pagerank. The spammers have market incentives to become more sophisticated,
> but so does Google, since their earnings depend on keeping their search
> results relevant and useful, so that people don't switch to competitors
> that do a better job.
>

Market forces are not our friend.  Google's incentive is to *ignore* spammy
links, not to stop them existing; spammers' incentive is to get their links
wherever they possibly can, and particularly in the places where they're
effective, not to avoid putting links where they're not effective.  Pure
market forces would leave wikis (large and small) attacked by progressively
more sophisticated spam, search engines being progressively smarter about
ignoring the spam, and wikis *still being served with as much spam as
before* (and it being progressively harder to identify and remove).

Wikis can only participate in the arms race by exposing publicly the
*extent* to which spamming is pointless.  Google publicising the fact that
nofollow is ignored (and hence spamming is pointful) is actually a really
unhelpful thing for them to do.  If they really have taken the nofollow
weapon away from wikis altogether, then we need to find a way to get it
back.

--HM

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Brian Wolff
In reply to this post by Nathan Larson
> I do think the implications of changing how nofollow is applied are very
> different on, say, Wikipedia than they would be on a small or even
> medium-sized wiki where the average user watches RecentChanges instead of a
> watchlist. In a small town, you can leave your doors unlocked and get away
> with it because you don't have as much traffic coming through and the
> neighbors would notice and care about (for curiosity, if no other reason)
> the presence of anyone who seemed out of place. It's the same way on these
> small wikis; it's rare that anyone comes along to try to subtly add a spam
> link, and when they do, it's noticed. Likewise, if someone starts marking
> spammy edits as patrolled, that gets noticed.

That's actually the opposite of what I expect. Small wikis have far fewer
resources to deal with spam, so the per capita spam is significantly larger
(imho).

> The question of what the default configuration should be, or what
> configuration should be used on WMF sites, can be addressed in other bugs
> besides this one. It doesn't take much coding to change a default setting
> from "true" to "false". For now, I would just like to implement the feature
> and make it available for those wikis who want to use it. So, is there
> support for putting this in the core as an optional feature, and is there
> anyone who will do the code review if I write this?
>

If there could plausibly exist third-party users who want such a
feature, I (speaking just for myself) see no problem with having it as an
off-by-default feature in core.

-bawolff

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Nathan Larson
In reply to this post by Gabriel Wicke-3
On Mon, Nov 18, 2013 at 12:58 PM, Gabriel Wicke <[hidden email]> wrote:

> On 11/18/2013 09:46 AM, Nathan Larson wrote:
> > I do think the implications of changing how nofollow is applied are very
> > different on, say, Wikipedia than they would be on a small or even
> > medium-sized wiki
>
> As I said, at least for Google there should be no difference as it
> ignores rel=nofollow on MediaWiki-powered sites anyway. See
> https://bugzilla.wikimedia.org/show_bug.cgi?id=52617.
>
> Gabriel
>

Do we have any way of knowing that Yong-Gang Wang of Google is correct
about this? I sent a message to this individual
<https://plus.google.com/105349418663822362024/about> (hopefully
it's the same guy) asking for more information. It seems like a
pretty major departure from past Google policy/practice.

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Gabriel Wicke-3
On 11/18/2013 10:11 AM, Nathan Larson wrote:
> Do we have any way of knowing that Yong-Gang Wang of Google is correct
> about this? I sent a message to this individual
> <https://plus.google.com/105349418663822362024/about> (hopefully
> it's the same guy) asking for more information. It seems like a
> pretty major departure from past Google policy/practice.

I think it is highly likely that he is correct about this. Professional
spammers will likely monitor the effect of their campaigns closely, so
would know about this first. I would expect less wiki spam if
rel=nofollow was actually honored. Especially hidden (unclickable) links
don't have much value apart from page rank.

Gabriel


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Brian Wolff
On 2013-11-18 2:19 PM, "Gabriel Wicke" <[hidden email]> wrote:

>
> On 11/18/2013 10:11 AM, Nathan Larson wrote:
> > Do we have any way of knowing that Yong-Gang Wang of Google is correct
> > about this? I sent a message to this individual
> > <https://plus.google.com/105349418663822362024/about> (hopefully
> > it's the same guy) asking for more information. It seems like a
> > pretty major departure from past Google policy/practice.
>
> I think it is highly likely that he is correct about this. Professional
> spammers will likely monitor the effect of their campaigns closely, so
> would know about this first. I would expect less wiki spam if
> rel=nofollow was actually honored. Especially hidden (unclickable) links
> don't have much value apart from page rank.
>
> Gabriel

That certainly sounds logical for Wikipedia and friends. However, it sounds
kind of odd for MediaWiki in general. There exist many unmaintained MediaWiki
installs just collecting spam.

It would also be interesting to know if other search engines do something
similar.

-bawolff

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Mark A. Hershberger-2
In reply to this post by Happy Melon-2
On 11/18/2013 01:03 PM, Happy Melon wrote:
> wikis *still being served with as much spam as
> before* (and it being progressively harder to identify and remove).

http://xkcd.com/810/


Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Risker
In reply to this post by Gabriel Wicke-3
On 18 November 2013 11:47, Gabriel Wicke <[hidden email]> wrote:

> On 11/17/2013 03:41 AM, Nathan Larson wrote:
> > Following the mediawiki-l discussion
> > <http://lists.wikimedia.org/pipermail/mediawiki-l/2013-November/042038.html>
> > about $wgNoFollowLinks and various other discussions, in which some
> > discontent was expressed with the current two options of either applying or
> > not applying nofollow to all external links, I wanted to see what support
> > there might be for applying nofollow only to external links added in
> > revisions that are still unpatrolled (bug 42599
> > <https://bugzilla.wikimedia.org/show_bug.cgi?id=42599>).
>
> Google and probably other search engines have a custom rule to ignore
> rel=nofollow in MediaWiki-powered wikis. It seems that our external
> links are too high quality to pass up. See
> https://bugzilla.wikimedia.org/show_bug.cgi?id=52617.
>

Oh dear. This becomes a philosophical versus practical discussion.
Practically, we have nowhere near enough spam-fighters to keep just the
obvious spam off our projects, let alone the not-as-obvious spam.

People keep mixing up English Wikipedia (with thousands of active editors,
many of whom do nothing but page patrolling) with the rest of the Wikimedia
projects, many of which have only a handful of active editors, who then get
stuck having to choose between spam-fighting or adding content. Software
decisions should not be made based on the assumption that some editor
somewhere will clean up the problems.

Given the ease with which all MediaWiki wikis can be infiltrated by useless
and spam links, and the particular ease with which most Wikimedia wikis can
be infiltrated, Google's pretty badly polluting their pageranks if they're
giving links to Wikimedia projects any significant rank.

To be honest, I suspect if the Google fellow said anything like this, it
was that they might ignore nofollow on Wikimedia wikis, but I'm pretty
certain that he didn't say MediaWiki wikis.  There are thousands and
thousands of them out there that have been completely abandoned to spam.


Risker

Re: Applying nofollow only to external links added in revisions that are still unpatrolled

Gabriel Wicke-3
On 11/18/2013 12:27 PM, Risker wrote:
> To be honest, I suspect if the Google fellow said anything like this, it
> was that they might ignore nofollow on Wikimedia wikis, but I'm pretty
> certain that he didn't say Mediawiki wikis.

I remember being surprised too that it applied to all MediaWiki
installations rather than just Wikimedia sites. I have pinged him about it.

Gabriel
