[Wikimedia-l] Structured data ethical implications

Mister Thrapostibongles
Dear all,
There have been announcements about the Structured data project on Commons,
which is intended to make it easier to view, search, edit, organize and
re-use the metadata on media.  This is clearly of great value to
researchers and developers in image recognition, who will have a large
repository of tagged image files to train their AI implementations on.

There is however an ethical issue here.  Readers will recall that Google
discovered that its facial recognition software was prone to classifying
African-American faces as "gorilla", because the training dataset had not
contained enough non-white faces -- see for example The Verge
https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai


Is the Foundation confident that the Commons repository is sufficiently
diverse that it can ethically offer it to others as a source of training
data?

Thrapostibongles
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and https://meta.wikimedia.org/wiki/Wikimedia-l
New messages to: [hidden email]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:[hidden email]?subject=unsubscribe>

Re: [Wikimedia-l] Structured data ethical implications

Jonathan Morgan
Hi Mister Thrapostibongles,

This is a good point and a valid consideration. WMF is starting to think
about issues like this, and what tools we have available to mitigate
unintended consequences of AI tech (even in cases where we're not building
the AI tech itself, but rather providing training data). I wrote up a white
paper
<https://meta.wikimedia.org/wiki/File:Ethical_and_human-centererd_AI_-_Wikimedia_Research_2030.pdf>
on this topic recently, in consultation with some other folks in research,
product, and legal. This isn't a policy (yet), just a proposal and a
conversation starter. Feedback and discussion welcome!

Best,
Jonathan



On Sun, May 12, 2019 at 1:50 AM Mister Thrapostibongles <[hidden email]> wrote:

> Dear all,
> There have been announcements about the Structured data project on Commons,
> which is intended to make it easier to view, search, edit, organize and
> re-use the metadata on media.  This is clearly of great value to
> researchers and developers in image recognition, who will have a large
> repository of tagged image files to train their AI implementations on.
>
> There is however an ethical issue here.  Readers will recall that Google
> discovered that its facial recognition software was prone to classifying
> African-American faces as "gorilla", because the training dataset had not
> contained enough non-white faces -- see for example The Verge
>
> https://www.theverge.com/2018/1/12/16882408/google-racist-gorillas-photo-recognition-algorithm-ai
>
>
> Is the Foundation confident that the Commons repository is sufficiently
> diverse that it can ethically offer it to others as a source of training
> data?
>
> Thrapostibongles



--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
(Uses He/Him)