[Wikimedia-l] Announcing: The Wikipedia Prize!

Brian
I'm sure many of you recall the Netflix Prize
<http://en.wikipedia.org/wiki/Netflix_Prize>. This is that, for Wikipedia!

Although the initial goal of the Netflix Prize was to design a
collaborative filtering algorithm, it became notorious when the data was
used to de-anonymize Netflix users. Researchers proved that given just a
user's movie ratings on one site, you can plug those ratings into another
site, such as the IMDB. You can then take that information, and with some
Google searches and optionally a bit of cash (for websites that sell user
information, including, in some cases, their SSN) figure out who they are.
You could even drive up to their house and take a selfie with them, or
follow them to work and meet their boss and tell them about their views on
the topics they were editing.

Here, we'll cut straight to the privacy chase. Using just the full history
dump of the English Wikipedia, excluding edits from any logged-in users,
identify five people. You must confirm their identities with them, and
privately prove to me that you've done this. I will then nominate you as
the winner and send you one million Satoshis (the smallest unit of Bitcoin,
times 1 million), in addition to updating this thread.

I suspect this challenge will be very easy for anyone who is determined.
Indeed, even if MediaWiki no longer displayed IP addresses, there would
still be enough information to identify people. Completely getting rid of
the edit history would largely solve the problem. In the meantime, this
Prize will serve as a reminder that when Wikipedia says "Your IP address
will be publicly visible if you make any edits." what they mean is, "People
will probably be able to figure out where you live and embarrass you."

An extra million Satoshis for each NSA employee that you identify. A full
bitcoin if you take a selfie with them.

Let the games begin!

Brian Mingus
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Richard Symonds-3
I worry that encouraging people to do this to prove a political point could
be inappropriate. It's one thing to point out a potential privacy flaw, but
paying people to exploit it may be seen as a step too far.

Richard Symonds
Wikimedia UK
0207 065 0992

Wikimedia UK is a Company Limited by Guarantee registered in England and
Wales, Registered No. 6741827. Registered Charity No.1144513. Registered
Office 4th Floor, Development House, 56-64 Leonard Street, London EC2A 4LT.
United Kingdom. Wikimedia UK is the UK chapter of a global Wikimedia
movement. The Wikimedia projects are run by the Wikimedia Foundation (who
operate Wikipedia, amongst other projects).

*Wikimedia UK is an independent non-profit charity with no legal control
over Wikipedia nor responsibility for its contents.*

On 29 March 2015 at 23:25, Brian <[hidden email]> wrote:

> I'm sure many of you recall the Netflix Prize
> <http://en.wikipedia.org/wiki/Netflix_Prize>. This is that, for Wikipedia!

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Katherine Casey
Publicly identifying anonymous Wikimedians, especially with reference to
their editing histories, is not just an academic way to make a point; it's
messing with people's real lives, and it's not something I'm particularly
comfortable seeing suggested, especially for a reward, on a
wikimedia-hosted listserv. I mean, I see the point you're trying to make,
but making people whose privacy may already be imperfect into
explicitly-outed victims is rather like burning down the house to prove it
ought to have been fireproofed better: you've made your point, but now you
have no house. If you want to see if you can identify people using leaky
data, ask for volunteers from among those who are comfortable having their
identities researched this way and work on identifying them with their
consent.

On Mon, Mar 30, 2015 at 12:48 PM, Richard Symonds <
[hidden email]> wrote:

> I worry that encouraging people to do this to prove a political point could
> be inappropriate. It's one thing to point out a potential privacy flaw, but
> paying people to exploit it may be seen as a step too far.

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

David Gerard-2
Context:

https://lists.wikimedia.org/pipermail/wikien-l/2015-March/thread.html

Brian believes that Wikimedia recording non-logged-in editors' IPs is
*literally* the same as the NSA hoovering up all data they can get
anywhere.


On 30 March 2015 at 18:13, Katherine Casey <[hidden email]> wrote:

> Publicly identifying anonymous Wikimedians, especially with reference to
> their editing histories, is not just an academic way to make a point; it's
> messing with people's real lives, and it's not something I'm particularly
> comfortable seeing suggested, especially for a reward, on a
> wikimedia-hosted listserv. I mean, I see the point you're trying to make,
> but making people whose privacy may already be imperfect into
> explicitly-outed victims is rather like burning down the house to prove it
> ought to have been fireproofed better: you've made your point, but now you
> have no house. If you want to see if you can identify people using leaky
> data, ask for volunteers from among those who are comfortable having their
> identities researched this way and work on identifying them with their
> consent.

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Nathan Awrich
In reply to this post by Katherine Casey
I'm hoping this is satire, but if it isn't, I think anyone paying others to
out Wikimedians should minimally be barred from further participation in
the movement.

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Newyorkbrad (Wikipedia)
In reply to this post by Brian
I agree with the others who have opined that this should not happen.

Newyorkbrad

On 3/29/15, Brian <[hidden email]> wrote:

> I'm sure many of you recall the Netflix Prize
> <http://en.wikipedia.org/wiki/Netflix_Prize>. This is that, for Wikipedia!

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Robert Rohde
In reply to this post by Brian
So, you are offering a prize equivalent to US $2.50?  Not exactly an
inspirational amount of money (though perhaps that is the point).

-Robert Rohde

On Sun, Mar 29, 2015 at 3:25 PM, Brian <[hidden email]> wrote:

> I'm sure many of you recall the Netflix Prize
> <http://en.wikipedia.org/wiki/Netflix_Prize>. This is that, for Wikipedia!

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Oliver Keyes-5
So, let me get this right:

1. You announced that, as David puts it, noting anonymous IPs is the
same as all-the-NSA-stuff-ever;
2. People disputed it, but suggested you go form local consensus that
this was problematic or participate in efforts to improve how we mask
and handle data if that doesn't work for you;
3. You decided that this was hard and a satirical breaching experiment
would be more enjoyable?

I'm...really not sure how this could possibly seem like a constructive
way to go about solving for this problem, to you. Andrew Gray's advice
is good advice, and still stands.

On Mon, Mar 30, 2015 at 6:43 PM, Robert Rohde <[hidden email]> wrote:

> So, you are offering a prize equivalent to US $2.50?  Not exactly an
> inspirational amount of money (though perhaps that is the point).
>
> -Robert Rohde

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Rich Farmbrough
Moreover this may well be a breach of policy, TOS and even law.

On 31 March 2015 at 01:15, Oliver Keyes <[hidden email]> wrote:

> So, let me get this right:
>
> 1. You announced that, as David puts it, noting anonymous IPs is the
> same as all-the-NSA-stuff-ever;
> 2. People disputed it, but suggested you go form local consensus that
> this was problematic or participate in efforts to improve how we mask
> and handle data if that doesn't work for you;
> 3. You decided that this was hard and a satirical breaching experiment
> would be more enjoyable?
>
> I'm...really not sure how this could possibly seem like a constructive
> way to go about solving for this problem, to you. Andrew Gray's advice
> is good advice, and still stands.

--
Landline (UK) 01780 757 250
Mobile (UK) 0798 1995 792

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

geni
On 31 March 2015 at 03:15, Richard Farmbrough <[hidden email]>
wrote:

> Moreover this may well be a breach of policy, TOS and even law.
>
>
Eh, probably not. Go through a bunch of Wikipedia bios of not very notable
people. Find the edits obviously made by the subject of the article. Note
IPs. I don't see any legal issues. Just rather boring, that's all.



--
geni

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Tim Starling-2
In reply to this post by Brian
On 30/03/15 09:25, Brian wrote:
> I suspect this challenge will be very easy for anyone who is determined.
> Indeed, even if MediaWiki no longer displayed IP addresses, there would
> still be enough information to identify people. Completely getting rid of
> the edit history would largely solve the problem.

So... what do you actually want? I am having trouble working out how
many layers of sarcasm to strip back here to find your actual point.

There are alternatives to publishing IP addresses that we have
discussed before, for example automatically creating a user account
with a random name and associating it with a persistent cookie. The
user could set a password or just abandon the account by letting the
cookie expire. CheckUser would still provide access to IP addresses. I
would support such a change. I have no idea whether you would.
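
A minimal sketch of the cookie-backed account idea described above, in Python. This is not MediaWiki code: the names `TempAccount`, `TempAccountStore`, `get_or_create` and the `Anon-` prefix are hypothetical, and the in-memory dictionary only stands in for whatever storage a real implementation would use. It shows a random public name being tied to a persistent cookie token, with IP addresses kept in a private log rather than in the public name.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class TempAccount:
    name: str                                   # random public name shown in history
    ip_log: list = field(default_factory=list)  # private log, CheckUser-style access only


class TempAccountStore:
    """Maps a persistent cookie token to an auto-created account (hypothetical)."""

    def __init__(self):
        self._by_token = {}

    def get_or_create(self, cookie_token, ip):
        """Return (account, token); create both if the cookie is missing or unknown."""
        account = self._by_token.get(cookie_token) if cookie_token else None
        if account is None:
            # No usable cookie: mint a new token and a random account name.
            cookie_token = secrets.token_urlsafe(32)
            account = TempAccount(name="Anon-" + secrets.token_hex(4))
            self._by_token[cookie_token] = account
        account.ip_log.append(ip)  # recorded privately, never part of the name
        return account, cookie_token
```

Under these assumptions, an editor who keeps the cookie keeps the same random name across IP changes, and one who lets the cookie expire simply gets a fresh account on the next edit.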

After reading this post and your posts on wikien-l, here are my
theories on what your non-sarcastic beliefs may be:

1. That we shouldn't store or use IP addresses at all, and that
identification for abuse prevention should be done by some kind of
unspecified cryptographic magic.

2. That disclosure and storage of IP addresses should be limited in
some pragmatic way to reduce the risk of identification by
cross-correlation in the manner you suggest in your $2.50 "prize".

3. That Wikimedia's suit against the NSA is hypocritical and that both
Wikimedia and the NSA have legitimate needs for data collection.

Feel free to narrow it down for me.

-- Tim Starling


Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Cristian Consonni
In reply to this post by Brian
Hi Brian,

2015-03-30 0:25 GMT+02:00 Brian <[hidden email]>:

> Although the initial goal of the Netflix Prize was to design a
> collaborative filtering algorithm, it became notorious when the data was
> used to de-anonymize Netflix users. Researchers proved that given just a
> user's movie ratings on one site, you can plug those ratings into another
> site, such as the IMDB. You can then take that information, and with some
> Google searches and optionally a bit of cash (for websites that sell user
> information, including, in some cases, their SSN) figure out who they are.
> You could even drive up to their house and take a selfie with them, or
> follow them to work and meet their boss and tell them about their views on
> the topics they were editing.

Somewhat tangentially, and to bring this topic back to a more
scientific setting, I would like to point out that there has already
been research on this topic.

I highly recommend reading the following paper:

Lieberman, Michael D., and Jimmy Lin. "You Are Where You Edit:
Locating Wikipedia Contributors through Edit Histories." ICWSM. 2009.
(PDF <http://www.pensivepuffin.com/dwmcphd/syllabi/infx598_wi12/papers/wikipedia/lieberman-lin.YouAreWhereYouEdit.ICWSM09.pdf>)

For those of you that don't want to read the whole paper, you can find
a recap of the most relevant findings in this presentation by Maurizio
Napolitano:
<http://www.slideshare.net/napo/social-geography-wikipedia-a-quick-overwiew>

The main idea is to associate spatial coordinates with Wikipedia
articles where possible; such articles are called "geopages". Then you
extract from the article histories the users who have edited a
geopage. If you plot the geopages edited by a given contributor, you
can see that they tend to cluster, so you can define an "edit area".
The study finds that 30-35% of contributors concentrate their edits in
an edit area smaller than 1 deg^2 (~12,362 km^2, approximately the
area of Connecticut or Northern Ireland[1] (thanks, Wikipedia!)).
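
As a rough illustration of the "edit area" idea, here is a small Python sketch. It makes several assumptions: the edit area is approximated by a latitude/longitude bounding box (the paper's exact definition may differ), the function names `edit_areas` and `square_degree_km2` are made up for this example, and the input is a list of (contributor, latitude, longitude) tuples that someone would have to extract from the page histories first. The second function converts one square degree to km^2 for a spherical Earth with a mean radius of 6,371 km, which gives roughly 12,364 km^2, close to the figure quoted above.

```python
import math
from collections import defaultdict


def edit_areas(edits):
    """Return {contributor: bounding-box area in square degrees}.

    edits: iterable of (contributor, lat, lon), one per geopage edit.
    Simplification: ignores wrap-around at the +/-180 degree meridian.
    """
    points = defaultdict(list)
    for who, lat, lon in edits:
        points[who].append((lat, lon))

    areas = {}
    for who, pts in points.items():
        lats = [lat for lat, _ in pts]
        lons = [lon for _, lon in pts]
        areas[who] = (max(lats) - min(lats)) * (max(lons) - min(lons))
    return areas


def square_degree_km2(radius_km=6371.0):
    """Area of one square degree on a sphere of the given radius, in km^2."""
    return (math.pi * radius_km / 180.0) ** 2
```

With made-up coordinates, a contributor whose geopage edits span 0.5 degrees in each direction gets an edit area of 0.25 deg^2, well under the 1 deg^2 threshold mentioned above.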

For another free/libre project with a geographic focus, like
OpenStreetMap, this is even more marked; check out, for example, the
tool «“Your OSM Heat Map” (aka Where did you contribute?)»[2] by
Pascal Neis.

This, of course, is not a straightforward de-anonymization, but these
methods work in principle for every contributor even if you obfuscate
their IP or username (provided that you can still assign all the edits
from a given user to a single, unambiguous identifier).
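
To make the point about obfuscation concrete, here is a small Python sketch. It assumes IP addresses are replaced by a keyed hash (HMAC-SHA256 from the standard library); this is just one possible obfuscation scheme, not something proposed in this thread, and `pseudonym` and `group_by_pseudonym` are hypothetical names. The IP no longer appears, but because the same input always maps to the same pseudonym, all edits by one contributor can still be grouped and analysed together.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonym(identifier, key):
    """Stable keyed pseudonym for an IP or username (truncated for readability)."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def group_by_pseudonym(edits, key):
    """edits: iterable of (identifier, page). Returns {pseudonym: [pages]}."""
    grouped = defaultdict(list)
    for identifier, page in edits:
        grouped[pseudonym(identifier, key)].append(page)
    return dict(grouped)
```

The grouping is exactly what clustering approaches like the one above need, so hiding the raw identifier does not by itself prevent that kind of analysis.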

C
[1] https://en.wikipedia.org/wiki/Square_degree
[2a] http://yosmhm.neis-one.org/
[2b] http://neis-one.org/2011/08/yosmhm/

Re: [Wikimedia-l] Announcing: The Wikipedia Prize!

Lila Tretikov
All,

As Tim mentioned, we are seriously looking at
privacy/identity/security/anonymity issues, specifically as they pertain to
IP address exposure -- from both legal and technical standpoints. This won't
happen overnight as we need to get people to work on this and there are a
lot of asks, but this is on our radar.

On a related note, let's skip the sarcasm and treat each other with
straightforward honesty. And for non-English speakers -- who are also (if
not more) in need of this -- sarcasm can be very confusing.

Thanks,
Lila

On Fri, Apr 3, 2015 at 4:02 PM, Cristian Consonni <[hidden email]>
wrote:
