Wikipedia tracks user behaviour via third party companies


Tim 'avatar' Bartel
Hi,

recently the report of the KnowPrivacy [1] study - a research project
by the School of Information from University of California in Berkeley
- hit the German media [2].

It came to the conclusion that "All of the top 50 websites contained
at least one web bug at some point in a one month time period." [3]
which includes wikipedia.org.

This is very troubling and irritating for some of our (German) users
who are very sensitive to data privacy topics. So I contacted
Brian W. Carver (University of California), who connected me
to David Cancel, the maintainer of Ghostery, which was used to
identify the web bugs. David wrote to me today:

> The following web bug trackers were reported to us, on the following subdomains:
>   Google Analytics - vls.wikipedia.org
>   Doubleclick - hu.wikipedia.org
> Both were seen in yesterday's data so they're recent. We don't receive any page level information so that's as much detail as we have. Hope that helps.

I wasn't able to track down the Doubleclick web bug on the Hungarian
Wikipedia, but the Google Analytics web bug is integrated in every page of
the West Flemish Wikipedia via JavaScript [4].

Our privacy policy [5] states "The Wikimedia Foundation may keep raw
logs of such transactions [IP and other technical information], but
these will not be published or used to track legitimate users." and
"As a general principle, the access to, and retention of, personally
identifiable data in all projects should be minimal and should be used
only internally to serve the well-being of the projects."

I think we should stop the current use of Google Analytics ASAP.

Bye, Tim.

--
http://wikimedia.de

[1] http://knowprivacy.org
[2] http://www.heise.de/newsticker/Studie-Google-fuehrend-bei-Web-Bug-Nutzung--/meldung/139841
[3] http://www.knowprivacy.org/report/KnowPrivacy_Final_Report.pdf, p. 4
[4] http://vls.wikipedia.org/wiki/MediaWiki:Common.js
[5] http://wikimediafoundation.org/wiki/Privacy_policy

_______________________________________________
foundation-l mailing list
[hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l

Re: Wikipedia tracks user behaviour via third party companies

Domas Mituzas
Hi,

> I think we should stop the current use of Google Analytics ASAP.

I'm usually a proponent of indefinite bans for people who do this, but
there are others who want milder approaches :-)
Indeed, this is a violation of our privacy policy, and should never be
allowed. Thanks for the heads-up.

Do note, hu.wikipedia.org has an external stats aggregator,
'stats.wikipedia.hu', which is hosted on vhost102.sx6.tolna.net - and
all our traffic is sent there ( http://hu.wikipedia.org/w/index.php?title=MediaWiki:Lastmodifiedat&oldid=4493139
  - as well as a few other places )

I removed it from both. Thanks again :)

Domas


Re: Wikipedia tracks user behaviour via third party companies

Neil Harris-2
In reply to this post by Tim 'avatar' Bartel
Surely this is something which should be possible to block at the
MediaWiki level, by suppressing the generation of any HTML  that loads
any indirect resources (scripts, iframes, images, etc.) whatsoever other
than from a clearly defined whitelist of Wikimedia-Foundation-controlled
domains?

Doing this should completely stop site admins from adding web bugs.

-- Neil
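
A minimal sketch of the whitelist idea above, for illustration only. It
scans already-generated HTML for tags that make a browser fetch a resource
automatically and reports any whose host is not on an allowed list. The
domain list, the tag table, and the function names are assumptions made for
this example, not MediaWiki code. Note that a static check like this would
not catch resources injected at run time by JavaScript, which is how the
vls.wikipedia.org case worked.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical whitelist: the thread does not give the real list of
# Wikimedia-Foundation-controlled domains.
ALLOWED_DOMAINS = ("wikipedia.org", "wikimedia.org")

# Tags whose listed attribute causes an automatic fetch (assumed subset).
RESOURCE_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}


def is_allowed(url):
    """Return True for relative URLs and for hosts on the whitelist."""
    host = urlparse(url).hostname
    if host is None:
        return True  # relative URL: served by the same site
    return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)


class ExternalResourceFinder(HTMLParser):
    """Collects (tag, url) pairs that point outside the whitelist."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value and not is_allowed(value):
                self.violations.append((tag, value))


def find_external_resources(html):
    finder = ExternalResourceFinder()
    finder.feed(html)
    return finder.violations
```

A caller could refuse to save (or strip before output) any interface page
for which find_external_resources() returns a non-empty list.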



Re: Wikipedia tracks user behaviour via third party companies

Nikola Smolenski
In reply to this post by Domas Mituzas
Domas Mituzas wrote:
> Do note, hu.wikipedia.org has external stats aggregator,  
> 'stats.wikipedia.hu', which is hosted on vhost102.sx6.tolna.net - and  
> all our traffic is sent there ( http://hu.wikipedia.org/w/index.php?title=MediaWiki:Lastmodifiedat&oldid=4493139 
>   - as well as few other places )

One way to fight this would be to offer more detailed visitor statistics
to people who need them.


Re: Wikipedia tracks user behaviour via third party companies

John at Darkstar
In reply to this post by Neil Harris-2
We need tools to track user behavior inside Wikipedia. As it is now we
know nearly nothing at all about user behavior and nearly all people
saying anything about users at Wikipedia make gross estimates and wild
guesses.

User privacy on Wikipedia is close to a public hoax: pages are
transferred unencrypted and with user names in clear text. Anyone with
access to a public hub is able to intercept and identify users, in
addition to _all_ websites that are referenced during an edit on
Wikipedia through correlation of logs.

Compared to this the whole previous discussion about the Iranian steward
is somewhat strange, if not completely ridiculous.

Get real, the whole system and access to it is completely open!

John



Re: Wikipedia tracks user behaviour via third party companies

John at Darkstar
I forgot a link to an article that describes privacy on Wikipedia
very well! ;)

http://en.wikipedia.org/wiki/The_Emperor%27s_New_Clothes



Re: Wikipedia tracks user behaviour via third party companies

Neil Harris-2
In reply to this post by John at Darkstar
John at Darkstar wrote:

> We need tools to track user behavior inside Wikipedia. As it is now we
> know nearly nothing at all about user behavior and nearly all people
> saying anything about users at Wikipedia make gross estimates and wild
> guesses.
>
> User privacy on Wikipedia is close to a public hoax: pages are
> transferred unencrypted and with user names in clear text. Anyone with
> access to a public hub is able to intercept and identify users, in
> addition to _all_ websites that are referenced during an edit on
> Wikipedia through correlation of logs.
>
> Compared to this the whole previous discussion about the Iranian steward
> is somewhat strange, if not completely ridiculous.
>
> Get real, the whole system and access to it is completely open!
>
> John
>  

As you say, there is no possibility of absolute privacy from anyone with
access to the traffic stream, since the Internet was never engineered to
give this kind of privacy.  Wikipedia is as "completely open" as any other
non-https website -- and, even with https, as with any other website
with publicly visible transactions, for anyone with access to the
traffic stream, simple traffic analysis is generally enough to correlate
user identities to IPs. A combination of http and Tor is probably as
good as it gets in attempting to avoid this, but even this has its
limitations.

But it is simply unreasonable to equate this with no privacy at all.
Most possible eavesdroppers do _not_ have access to the entire traffic
stream, and those who do have access to traffic generally only have
access to part of the traffic stream, and even then, most of them can't
be bothered to eavesdrop, or are discouraged from doing so by privacy laws.

Given this, it is quite reasonable to take appropriate technical
measures that attempt to keep as much of that remaining privacy as
secure as possible.

-- Neil




Re: Wikipedia tracks user behaviour via third party companies

Pedro Sanchez-2
In reply to this post by Tim 'avatar' Bartel
On Thu, Jun 4, 2009 at 1:18 AM, Tim 'avatar' Bartel <
[hidden email]> wrote:

> Hi,
>
> recently the report of the KnowPrivacy [1] study - a research project
> by the School of Information from University of California in Berkeley
> - hit the German media [2].
>

The case of vlswiki is troubling as it's a single sysop who is stubbornly
adding the analytics bug

   - 4 Jun 2009 06:54, Midom (74 bytes): "privacy policy violation"
     http://vls.wikipedia.org/w/index.php?title=MediaWiki:Common.js&oldid=131158

   - 25 Apr 2009 15:13, Tbc (363 bytes): "cannot find in the policy this is not allowed"
     http://vls.wikipedia.org/w/index.php?title=MediaWiki:Common.js&oldid=127221

   - 9 Jul 2008 21:13, Drini (74 bytes): "google analytics is not allowed in global scripts"
     http://vls.wikipedia.org/w/index.php?title=MediaWiki:Common.js&oldid=100968


http://vls.wikipedia.org/w/index.php?title=MediaWiki:Common.js&action=history

What I propose is that re-adding this should result in removal of the
sysop bit for misuse of powers.
Don't we have a committee that checks privacy violations?

Re: Wikipedia tracks user behaviour via third party companies

David Gerard-2
2009/6/4 Pedro Sanchez <[hidden email]>:

> What I propose is this being re-added would cause a removal of sysop bit due
> to misuse of powers.
> Don't we have a committee that checks privacy violations?


The Foundation would surely have this power.


- d.


Re: Wikipedia tracks user behaviour via third party companies

John at Darkstar
In reply to this post by Neil Harris-2
The interesting thing is "who has an interest in which user's identity".
Let's take an example: some organization sets up a site with a honeypot
and logs all visitors. Then they correlate that with the RC logs from
Wikipedia and check who adds external links back to
themselves. They do not need direct access to Wikipedia logs or the raw
traffic.

There is only one valid reason, as I see it, to avoid certain stat
engines, and that is to block advertising companies from getting
information about the readers. The writers do not have any real
anonymity at all.

John



Re: Wikipedia tracks user behaviour via third party companies

Andrew Gray-3
In reply to this post by Tim 'avatar' Bartel
2009/6/4 Tim 'avatar' Bartel <[hidden email]>:

> I think we should stop the current use of Google Analytics ASAP.

Indeed.

For the record, we've discussed Google Analytics before:

* in July 2007, for pms.wiki - nothing implemented, I think

* in October 2007, for en.wikibooks - implemented but then stopped. At
the same time, en.wikinews had implemented it and then stopped it
again; the Wikimania07 site also ran it for most of the year and then
had it taken out when discovered.

* in December 2007, for fi.wiki - implemented but then stopped

* in July 2008, for th.wiki - discovered and removed. A check then
found it on vls.wiki and th.wikisource; the discussion doesn't record
that these were removed, but checking the sites shows they were.

The vls one is interesting - it was removed by Drini in July, per the
foundation-l discussion, and only added back in at the end of April
2009... and there we get this problem.

So, yeah. Pretty solid consensus that this is something to avoid. If
we have some "explanatory notes" to go with the privacy policy
anywhere, it might be worth explicitly mentioning the use of external
logging services and Why Thou Shalt Not.

--
- Andrew Gray
  [hidden email]


Re: Wikipedia tracks user behaviour via third party companies

Gergő Tisza
In reply to this post by Domas Mituzas
Domas Mituzas <midom.lists@...> writes:

> Do note, hu.wikipedia.org has external stats aggregator,  
> 'stats.wikipedia.hu', which is hosted on vhost102.sx6.tolna.net - and  
> all our traffic is sent there (
> http://hu.wikipedia.org/w/index.php?title=MediaWiki:Lastmodifiedat&oldid=4493139 

The stats aggregator for hu.wikipedia.org was set up with community approval,
the public results contain no identifiable per-machine information (you can
check them here: http://stats.wikipedia.hu/cgi-bin/awstats.pl ), and the records
are not used for any other purposes. I think it is well within the lines of the
privacy policy.

As for Doubleclick, that was probably a mistake on KnowPrivacy's part - maybe
they misidentified the aggregator (we use awstats) because Doubleclick uses a
similar method? If not, I would appreciate it if they could provide more
detailed information.



Re: Wikipedia tracks user behaviour via third party companies

Mike.lifeguard-2
In reply to this post by Pedro Sanchez-2
The Ombudsman Commission would likely be that group. Although their
focus has traditionally been CheckUser, their purview actually covers
any and all violations of the privacy policy. Here is one such case. At
this moment, I agree: this sysop shouldn't be.

-Mike


Re: Wikipedia tracks user behaviour via third party companies

Tim 'avatar' Bartel
In reply to this post by Gergő Tisza
Hi,

2009/6/4 Tisza Gergő <[hidden email]>:
> As for Doubleclick, that was probably a mistake on KnowPrivacy's part - maybe
> they misidentified the aggregator (we use awstats) because Doubleclick uses a
> similar method? If not, I would appreciate if they could serve with more
> detailed information.

Sad but true, they don't have further information on that. I'll try to
reproduce it.

Bye, Tim.

--
http://wikimedia.de


Re: Wikipedia tracks user behaviour via third party companies

Neil Harris-2
In reply to this post by John at Darkstar
John at Darkstar wrote:

> The interesting thing is "who has an interest in which user's identity".
> Let's take an example: some organization sets up a site with a honeypot
> and logs all visitors. Then they correlate that with the RC logs from
> Wikipedia and check who adds external links back to
> themselves. They do not need direct access to Wikipedia logs or the raw
> traffic.
>
> There is only one valid reason, as I see it, to avoid certain stat
> engines, and that is to block advertising companies from getting
> information about the readers. The writers do not have any real
> anonymity at all.
>
> John
>  

Indeed they could. But even so, they would still have great difficulty
in getting more than a small fraction of Wikipedia's readers to both
visit the honeypot and make an edit that links to it, and the vast
majority of unaffected users will still avoid being bitten by this
attack. And even then, they will still only have obtained a mapping
between the user's current IP and their Wikipedia account, and will
still have to correlate this back to a personal identity, which is often
harder than it might seem to be in theory.

The world is a dangerous place, but just because privacy and security
can never be absolute is not a reason not to make good-faith efforts to
preserve as much of both as reasonably possible within the limits of
the time and resources available.
 
Just because a door can be knocked down with a sledgehammer (or a wall
demolished with a pneumatic hammer) is not a reason not to have a lock
on it, or a door there in the first place.

-- Neil



Re: Wikipedia tracks user behaviour via third party companies

Neil Harris-2
In reply to this post by Nikola Smolenski
Nikola Smolenski wrote:

> Domas Mituzas wrote:
>  
>> Do note, hu.wikipedia.org has external stats aggregator,  
>> 'stats.wikipedia.hu', which is hosted on vhost102.sx6.tolna.net - and  
>> all our traffic is sent there ( http://hu.wikipedia.org/w/index.php?title=MediaWiki:Lastmodifiedat&oldid=4493139 
>>   - as well as a few other places )
>>    
>
> One way to fight this would be to offer more detailed visitor statistics
> to people who need them.

And another, possibly even more effective, one would be to prevent the
loading of external resources in the software, except possibly via
editors' own custom user JavaScript pages.
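
A minimal sketch of what such a check could look like, in Python rather
than in MediaWiki itself. The allowed-domain list, the tag table, and the
function names are all assumptions made for illustration; none of this is
existing MediaWiki code. It scans generated HTML for tags that make the
browser fetch a resource and reports any whose host is not on the list.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical whitelist; the real set of Foundation-controlled domains
# would have to be defined by the Foundation.
ALLOWED_SUFFIXES = ("wikipedia.org", "wikimedia.org")

# Tags that make the browser fetch a resource, and the attribute holding the URL.
RESOURCE_ATTRS = {"script": "src", "img": "src", "iframe": "src", "link": "href"}


def host_allowed(host):
    """True if host is an allowed domain or a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == s or host.endswith("." + s) for s in ALLOWED_SUFFIXES)


class ExternalResourceFinder(HTMLParser):
    """Collects (tag, url) pairs for resources loaded from non-whitelisted hosts."""

    def __init__(self):
        super().__init__()
        self.blocked = []

    def handle_starttag(self, tag, attrs):
        attr = RESOURCE_ATTRS.get(tag)
        if attr is None:
            return
        url = dict(attrs).get(attr)
        if not url:
            return
        host = urlparse(url).hostname
        # Relative URLs have no host, so they stay on the serving site.
        if host is not None and not host_allowed(host):
            self.blocked.append((tag, url))


def find_external_resources(html):
    finder = ExternalResourceFinder()
    finder.feed(html)
    finder.close()
    return finder.blocked
```

Note that this only inspects static HTML. A script that is itself served
from an allowed domain can still load external resources at run time, so
site-wide JavaScript would need separate review.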

-- Neil



Re: Wikipedia tracks user behaviour via third party companies

David Gerard-2
In reply to this post by Tim 'avatar' Bartel
Web bugs for statistical data are a legitimate want but potentially a
horrible privacy violation.

So I asked on wikitech-l, and the obvious answer appears to be to do
it internally. Something like http://stats.grok.se/ only more so.

So - if you want web bug data in a way that fits the privacy policy,
please pop over to the wikitech-l thread with technical suggestions
and solutions :-)


- d.


Re: Wikipedia tracks user behaviour via third party companies

Michael Snow-3
David Gerard wrote:
> Web bugs for statistical data are a legitimate want but potentially a
> horrible privacy violation.
>
> So I asked on wikitech-l, and the obvious answer appears to be to do
> it internally. Something like http://stats.grok.se/ only more so.
>
> So - if you want web bug data in a way that fits the privacy policy,
> please pop over to the wikitech-l thread with technical suggestions
> and solutions

Precisely. External web bug trackers should be removed without
exception. People who add them innocently, out of an understandable
interest in collecting aggregated information that would not violate the
privacy policy, should be directed to request and help with internal
solutions, kept within appropriate limits to comply with the policy.

--Michael Snow


Re: Wikipedia tracks user behaviour via third party companies

Unionhawk
In reply to this post by Neil Harris-2
> Surely this is something which should be possible to block at the
> MediaWiki level

Maybe the Foundation office could set up Google Analytics itself for all
projects, with a super-secure password, and then never use it. Would this
work, or would somebody else still be able to set up analytics?

Go Freedom!
Unionhawk



On Thu, Jun 4, 2009 at 6:01 AM, Neil Harris <[hidden email]> wrote:

> Tim 'avatar' Bartel wrote:
> > Hi,
> >
> > recently the report of the KnowPrivacy [1] study - a research project
> > by the School of Information from University of California in Berkeley
> > - hit the German media [2].
> >
> > It came to the conclusion that "All of the top 50 websites contained
> > at least one web bug at some point in a one month time period." [3]
> > which includes wikipedia.org.
> >
> > This is very troubling and irritating for some of our (German) users
> > who are very sensitive to data privacy topics. So I established
> > contact with Brian W. Carver (University of California) who connected me
> > to David Cancel, the maintainer of Ghostery, which was used to
> > identify the web bugs. David wrote me today:
> >
> >
> >> The following web bug trackers were reported to us, on the following subdomains:
> >>   Google Analytics - vls.wikipedia.org
> >>   Doubleclick - hu.wikipedia.org
> >> Both were seen in yesterday's data so they're recent. We don't receive any page level information so that's as much detail as we have. Hope that helps.
> >
> > I wasn't able to track down the Doubleclick web bug on the Hungarian
> > Wikipedia, but the Google Analytics web bug is integrated in every page of
> > the West Flemish Wikipedia via JavaScript [4].
> >
> > Our privacy policy [5] states "The Wikimedia Foundation may keep raw
> > logs of such transactions [IP and other technical information], but
> > these will not be published or used to track legitimate users." and
> > "As a general principle, the access to, and retention of, personally
> > identifiable data in all projects should be minimal and should be used
> > only internally to serve the well-being of the projects."
> >
> > I think we should stop the current use of Google Analytics ASAP.
> >
> > Bye, Tim.
> >
> >
> Surely this is something which should be possible to block at the
> MediaWiki level, by suppressing the generation of any HTML  that loads
> any indirect resources (scripts, iframes, images, etc.) whatsoever other
> than from a clearly defined whitelist of Wikimedia-Foundation-controlled
> domains?
>
> Doing this should completely stop site admins from adding web bugs.
>
> -- Neil

Re: Wikipedia tracks user behaviour via third party companies

Unionhawk
In reply to this post by Michael Snow-3
Michael Snow wrote:
> External web bug trackers should be removed without
> exception. People who add them innocently, out of an understandable
> interest in collecting aggregated information that would not violate the
> privacy policy, should be directed to request and help with internal
> solutions, kept within appropriate limits to comply with the policy.

So how do you propose we enforce this? I'm thinking we need to prevent this
from happening in the first place. Analytics like this could pretty much
give checkuser powers to anybody!

They have a legitimate purpose, so, if analytics are wanted/needed by the
Foundation, they may be implemented by the Foundation. Otherwise, no
analytics.

Go Freedom!
Unionhawk



On Thu, Jun 4, 2009 at 11:13 AM, Michael Snow <[hidden email]> wrote:

> David Gerard wrote:
> > Web bugs for statistical data are a legitimate want but potentially a
> > horrible privacy violation.
> >
> > So I asked on wikitech-l, and the obvious answer appears to be to do
> > it internally. Something like http://stats.grok.se/ only more so.
> >
> > So - if you want web bug data in a way that fits the privacy policy,
> > please pop over to the wikitech-l thread with technical suggestions
> > and solutions
> Precisely. External web bug trackers should be removed without
> exception. People who add them innocently, out of an understandable
> interest in collecting aggregated information that would not violate the
> privacy policy, should be directed to request and help with internal
> solutions, kept within appropriate limits to comply with the policy.
>
> --Michael Snow