[Wikimedia Research Showcase] May 20, 2020: Human in the Loop Machine Learning

Janna Layton
Hi all,

The next Research Showcase will be live-streamed on Wednesday, May 20, at
9:30 AM PDT/16:30 UTC.

This month we will learn about recent research on machine learning systems
that rely on human supervision for their learning and optimization -- a
research area commonly referred to as Human-in-the-Loop ML. In the first
talk, Jie Yang will present a computational framework that relies on
crowdsourcing to identify influencers in social networks (Twitter) by
selectively obtaining labeled data. In the second talk, Estelle Smith will
discuss the role of the community in maintaining ORES, the machine learning
system that predicts quality in Wikipedia applications.

YouTube stream: https://www.youtube.com/watch?v=8nDiu2ebdOI

As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:

*OpenCrowd: A Human-AI Collaborative Approach for Finding Social
Influencers via Open-Ended Answers Aggregation*

By: Jie Yang, Amazon (current), Delft University of Technology (starting
soon)

Finding social influencers is a fundamental task in many online
applications ranging from brand marketing to opinion mining. Existing
methods heavily rely on the availability of expert labels, whose collection
is usually a laborious process even for domain experts. Using open-ended
questions, crowdsourcing provides a cost-effective way to find a large
number of social influencers in a short time. Individual crowd workers,
however, only possess fragmented knowledge that is often of low quality. To
tackle those issues, we present OpenCrowd, a unified Bayesian framework
that seamlessly incorporates machine learning and crowdsourcing for
effectively finding social influencers. To infer a set of influencers,
OpenCrowd bootstraps the learning process using a small number of expert
labels and then jointly learns a feature-based answer quality model and the
reliability of the workers. Model parameters and worker reliability are
updated iteratively, allowing their learning processes to benefit from each
other until an agreement on the quality of the answers is reached. We
derive a principled optimization algorithm based on variational inference
with efficient updating rules for learning OpenCrowd parameters.
Experimental results on finding social influencers in different domains
show that our approach substantially improves the state of the art by 11.5%
AUC. Moreover, we empirically show that our approach is particularly useful
in finding micro-influencers, who are very directly engaged with smaller
audiences.

Paper: https://dl.acm.org/doi/fullHtml/10.1145/3366423.3380254

*Keeping Community in the Machine-Learning Loop*

By: C. Estelle Smith, MS, PhD Candidate, GroupLens Research Lab at the
University of Minnesota

On Wikipedia, sophisticated algorithmic tools are used to assess the
quality of edits and take corrective actions. However, algorithms can fail
to solve the problems they were designed for if they conflict with the
values of communities who use them. In this study, we take a
Value-Sensitive Algorithm Design approach to understanding a
community-created and -maintained machine learning-based algorithm called
the Objective Revision Evaluation System (ORES)—a quality prediction system
used in numerous Wikipedia applications and contexts. Five major values
converged across stakeholder groups that ORES (and its dependent
applications) should: (1) reduce the effort of community maintenance, (2)
maintain human judgement as the final authority, (3) support differing
peoples’ differing workflows, (4) encourage positive engagement with
diverse editor groups, and (5) establish trustworthiness of people and
algorithms within the community. We reveal tensions between these values
and discuss implications for future research to improve algorithms like
ORES.

Paper:
https://commons.wikimedia.org/wiki/File:Keeping_Community_in_the_Loop-_Understanding_Wikipedia_Stakeholder_Values_for_Machine_Learning-Based_Systems.pdf

--
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: [Wikimedia Research Showcase] May 20, 2020: Human in the Loop Machine Learning

Janna Layton
Just a reminder that this is happening on Wednesday. There will be a Q&A,
which might be especially nice if you're in an area that's still socially
distancing. ;)

Re: [Wikimedia Research Showcase] May 20, 2020: Human in the Loop Machine Learning

Janna Layton
May's Research Showcase will be starting in 30 minutes!
