Gaps

Heather Ford-3
Having a look at the new WMF research site, I noticed that notification
and recommendation mechanisms seem to be the key strategy for filling
Wikipedia's content gaps. Having just finished a research project on
exactly this problem and come to the opposite conclusion, i.e. that
automated mechanisms were insufficient for solving the gaps problem, I was
curious to find out more.

This latest research that I was involved in with colleagues was based on an
action research project aiming to fill gaps in topics relating to South
Africa. The team tried a range of different strategies discussed in the
literature for filling Wikipedia's gaps without any wild success. Automated
mechanisms that featured missing and incomplete articles catalysed very few
edits.

When looking for related research, it seemed that others had come to a
similar conclusion, i.e. that automated notifications/recommendations alone
didn't lead to improvements in particular target areas. That makes me think
that either a) I just haven't come across the right research or b) there
are different types of gaps that require different solutions. For example,
gaps across language versions differ from gaps created by incomplete
articles about topics for which there are few online/reliable sources,
which differ again from missing articles about topics for which there are
many online/reliable sources, or from gaps in articles about particular
topics or particular geographic areas.

Does anyone have any insight here? - either on research that would help
practitioners decide how to go about a project of filling gaps in a
particular subject area or about whether the key focus of research at the
WMF is on filling gaps via automated means such as recommendation and
notification mechanisms?

Many thanks!

Best,
Heather.
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Gaps

Leila Zia
Hi Heather,

Thanks for writing. Below are some of my thoughts.

* Whether automatic recommendations work relies heavily on at least a
few factors: the users who interact with these recommendations and
their level of expertise with editing Wikimedia projects, the quality
of the recommendations, how much context is provided as part of the
recommendations, incentives, and the design of the platform/tool/etc.
where these recommendations get surfaced. The last point is critical:
design is key in this context.

* We've had some good success stories with recommendations. As you
have seen, the work we did in 2015 shows that you can significantly
increase the article creation rate (by a factor of 3.2, without loss in
quality) if you do personalized recommendations.[0] Obviously, creating
an article is a task better suited to experienced editors than to
newcomers. Had we done a similar experiment with newcomers, my gut
feeling is that we would have seen a very different result. We also
built a recommendation API [1] that is now being used in Content
Translation for editors to receive Suggestions on what to edit next.
We could see a spike in contributions in the tool after this feature
was introduced. Somewhere between 8% and 15% of the contributions
through the tool come thanks to the recommendations today.[2] There
are other success stories around as well. For example, Ma Commune [3]
focuses on helping French Wikipedia editors expand already existing
articles (specific and limited types of articles for now).
Recommendations have also worked really well in the context of
Wikidata, where contributions can be made through games such as The
Distributed Game [4].

* Specifically about the work we do in knowledge gaps, we're at the
moment very much focused on the realm of machine in the loop (as
opposed to human in the loop) [5]. By this I mean: our aim is to
understand what humans are trying to do on Wikimedia projects and
bring in machines/algorithms to help them do it more
easily/efficiently, with the least frustration and pain. An example of
this approach was when we interviewed a couple of editathon organizers
in Africa as part of The Africa Destubathon and learned that they were
doing a lot of manual work extracting structures of articles to create
templates for newcomers to learn how to expand an already existing
article. That's when we became sure that investing in section
recommendations actually makes sense (later we learned we can help
other projects such as Ma Commune, too, which is great).

* More recently, the Contributors team conducted a research study to
understand the needs of Wikipedia editors through in-person interviews
with editors. The focus areas coming out of this research [6] suggest
that providing in-context help and task recommendations is important.

I hope these pointers help. I know we will talk about these more when
we talk next, but if you or others have questions or comments in the
meantime, I'd be happy to expand. Just be aware that it's annual
planning time around here and we may be slow in responding. :)

Best,
Leila

[0] https://arxiv.org/abs/1604.03235
[1] https://www.mediawiki.org/wiki/GapFinder/Developers
[2] These numbers are a few months old, I need to get updates. :)
[3] https://macommune.wikipedia.fr/
[4] http://magnusmanske.de/wordpress/?p=362
[5] Borrowing the term from Ricardo Baeza-Yates.
[6] https://www.mediawiki.org/wiki/New_Editor_Experiences#Focuses

--
Leila Zia
Senior Research Scientist
Wikimedia Foundation


On Thu, Feb 8, 2018 at 7:03 PM, Heather Ford <[hidden email]> wrote:

> [...]

Re: Gaps

Kerry Raymond
In reply to this post by Heather Ford-3
I think there are two parts to the problem of filling gaps. Drawing attention to the gaps is half of the problem. The other half of the problem is finding the editor who wants to write that article. For example, I often check on the "missing topics" list for WikiProject Queensland (which is machine-generated by counting the number of redlinks in articles tagged on the Talk page as belonging to that project).

https://en.wikipedia.org/wiki/Wikipedia:WikiProject_Queensland/Missing_topics

This is not a highly sophisticated algorithm but it does result in my thinking "oh well, I am sure I could at least write a stub on that topic" and so I write an article.
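The redlink-counting idea described above can be sketched roughly as follows. This is only an illustration, not the actual tooling behind the WikiProject list: the function name, the in-memory `project_links` mapping, and the toy titles are all assumptions made for the example.

```python
from collections import Counter


def rank_missing_topics(project_links, existing_titles):
    """Rank non-existent link targets by how many project articles link to them.

    project_links: hypothetical mapping of article title -> list of linked titles.
    existing_titles: set of titles that already have an article.
    """
    counts = Counter()
    for links in project_links.values():
        # Count each missing target at most once per linking article.
        for target in set(links):
            if target not in existing_titles:
                counts[target] += 1
    # Most-linked missing topics first; ties broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy data, not real Wikipedia content.
project_links = {
    "Town A": ["Town B", "Driver X", "River C"],
    "Town B": ["Town A", "Driver X"],
    "Heritage Hall": ["Town A", "River C", "Driver X", "Driver Y"],
}
existing_titles = {"Town A", "Town B", "Heritage Hall"}

missing = rank_missing_topics(project_links, existing_titles)
print(missing)
```

As in the list discussed here, a ranking like this says nothing about who would want to write each topic; it only surfaces what is most often linked but missing.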

But if you look at the first couple of screens of those "most missing" topics, there are lots of racing car drivers. I have no interest whatsoever in racing car drivers, I have no idea what sources might exist or which might be reliable. So as I pick off other topics from the "most missing" list, it has the effect of increasing the density of racing car drivers at the top of the list. Clearly we have a content gap around racing car drivers, but I won't be doing anything about it.

This reinforces the point Leila makes about personalising the recommendations. I think it's more important to target the right people even if the list you present to them isn't overly sophisticated. The right person will be able to mentally filter a list of things vaguely associated with their topic interests. As Leila says, there's probably less benefit in targeting new users to write new articles. But I've started over 4000 articles and I bet 90% are WikiProject Queensland. Show me any list of wanted Queensland topics and I'll probably be willing to write about *many* of them (but not all). Similarly, if you look at the categories of the articles I write, the category Queensland Heritage Register will come up a lot (probably 1/3 of my articles are about heritage properties). Probably another 1/3 are articles about Queensland towns/suburbs/localities. I think looking at the categories/projects of the articles people write is a very strong indicator of interest areas. And the more articles they write, the more sure you can be that they are confident about starting new articles (a lot of people are not willing to start new articles but will happily contribute to a stub, probably because of a past bad experience with article creation) and the more you can be sure about their areas of interest.
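One way to picture the "categories as an indicator of interest" idea is the sketch below. It assumes a hypothetical mapping from each article an editor started to that article's categories; the function name and all data are made up for illustration (only "Queensland Heritage Register" is a category named in this thread).

```python
from collections import Counter


def interest_profile(started_articles, top_n=2):
    """Return the top categories with the share of started articles in each.

    started_articles: hypothetical mapping of article title -> list of categories.
    """
    total = len(started_articles)
    counts = Counter()
    for categories in started_articles.values():
        # An article contributes at most once to each of its categories.
        counts.update(set(categories))
    # Sort by count (descending), then name, so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(category, n / total) for category, n in ranked[:top_n]]


# Hypothetical toy data for a single editor.
started_articles = {
    "Hall 1": ["Queensland Heritage Register"],
    "Hall 2": ["Queensland Heritage Register", "Buildings"],
    "Church 1": ["Queensland Heritage Register"],
    "Town 1": ["Towns in Queensland"],
    "Town 2": ["Towns in Queensland"],
    "Person 1": ["People from Queensland"],
}

profile = interest_profile(started_articles)
print(profile)
```

The more articles an editor has started, the more stable such shares become, which matches the point that prolific creators have easily inferred interests.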

With the exception of redirects and disambiguation pages, I would think anyone who has started many articles is likely to have easily-inferred topic space interests. For that matter, a lot of people (myself included) talk about their interest areas on their user page, so key words in user pages that fuzzy-match to project names or category names may be another indicator.
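The fuzzy-matching suggestion could look something like this minimal sketch, which uses Python's standard `difflib` module. The project names, the ignored tokens, the cutoff of 0.8, and the sample user-page text are assumptions chosen for illustration, not a tested recommendation.

```python
import difflib
import re

# Tokens too generic to signal an interest (an assumption for this sketch).
IGNORED_TOKENS = {"wikiproject", "in", "of", "the"}


def match_interests(user_page_text, names, cutoff=0.8):
    """Return the project/category names whose words fuzzy-match the user page."""
    # Map each informative token to the names that contain it.
    token_to_names = {}
    for name in names:
        for token in re.findall(r"[a-z]+", name.lower()):
            if token not in IGNORED_TOKENS:
                token_to_names.setdefault(token, set()).add(name)

    matched = set()
    for word in re.findall(r"[a-z]+", user_page_text.lower()):
        close = difflib.get_close_matches(
            word, list(token_to_names), n=3, cutoff=cutoff
        )
        for token in close:
            matched |= token_to_names[token]
    return sorted(matched)


# Hypothetical names and user-page text (note the deliberate typo).
names = ["WikiProject Queensland", "WikiProject Motorsport", "WikiProject Greenland"]
text = "I mostly write about Queenslnd heritage and towns."

interests = match_interests(text, names)
print(interests)
```

A fairly high cutoff is used so that similar-looking names (here "Queensland" and "Greenland") are not confused; a real system would need to tune this against actual user pages.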

However, some of the content gaps on Wikipedia exist because we don't have contributors who are interested in the topic. Given that there is a known difference between the topics that women generally write about compared to men, it's clear that a lack of diversity in editors is likely to lead to content gaps. I would suspect the same is true about other personal characteristics. As an Australian, I am more likely to write about Australia than, say, Greenland, but I did holiday there last year, so actually I have written a little about Greenland and uploaded some photos, but that's just a "blip" in my contribution profile (and I don't think I started any new articles about Greenland). If we have a content gap about Greenland, maybe we don't have enough Greenlanders to fill it?

I think we can't address content gaps unless we also address contributor gaps. This in turn may result in devolving responsibility for things like notability and verifiability down to the Project level. For example, it is often commented that Indigenous Australian topics are a content gap. The problem is a lack of sources. Indigenous Australians did not have a written language, so oral sources are very important, but en.Wikipedia isn't keen on oral sources, so there's a content gap that's hard to fill. And I suspect we have very few Indigenous Australians writing for Wikipedia. Statistically, 3% of our population self-identifies as Indigenous, but they tend to have lower educational attainment, which probably makes them less likely to be Wikipedia contributors, who, based on the 2011 survey, have an above-average likelihood of having a university degree.

So I think we have two flavours of content gap, those for which we have active contributors in the broader topic space who may be enticed to write about the missing topics (which is the problem being principally addressed by this area of research), and those where we do not have active contributors.

Kerry






Re: Gaps

Leila Zia
On Thu, Feb 8, 2018 at 8:56 PM, Kerry Raymond <[hidden email]> wrote:
> I think we can't address content gaps unless we also address contributor gaps.

This is very important. We very likely have reader/consumer gaps, (for
sure) content gaps, and contributor gaps, and these gaps are connected
to each other in ways that we need to understand much better.

Leila


Re: Gaps

Heather Ford-3
Thanks so much for the super helpful comments and suggestions, Leila,
Kerry! I so appreciate it.

And yes, this is a great way to frame the distinction i.e. that some gaps
can be filled by existing contributors (using automated techniques like
recommendations) but others can only be filled by bringing in new
contributors and/or by creating alternative support mechanisms or
incentives (in the way that programmes like GLAM or editing competitions
might do). Curious if anyone else on the list has recommendations for
research in the latter category... I'm still convinced we need more
academic research here :)

Best,
Heather.



Dr Heather Ford
Senior Lecturer, School of Arts & Media <https://sam.arts.unsw.edu.au/>,
University of New South Wales
w: hblog.org / EthnographyMatters.net <http://ethnographymatters.net/> / t:
@hfordsa <http://www.twitter.com/hfordsa>


On 9 February 2018 at 12:18, Leila Zia <[hidden email]> wrote:

> [...]

Re: Gaps

Amir E. Aharoni
In reply to this post by Heather Ford-3
Heather,

Thanks for starting this thread.

Where can I read your research that comes to the conclusion that automated
mechanisms are insufficient for solving the gaps problem?

Sorry if this was mentioned somewhere already; I sometimes get lost on long
emails, and it's possible that I missed it :)


On 9 Feb 2018 at 05:04, "Heather Ford" <[hidden email]> wrote:

> [...]

Re: Gaps

Heather Ford-3
Dear Amir,

I did send this via Twitter, but wanted to send here too in case anyone
else is interested. Our paper summarises some of the research on
notifications. A pre-print is available here:

https://makebuildplay.files.wordpress.com/2018/02/wp_primary_school_paper_acceptedv.pdf


Happy to chat more and would very much like to chat to others doing
research on knowledge gaps on Wikipedia.

Best,
Heather.



On 9 February 2018 at 20:53, Amir E. Aharoni <[hidden email]>
wrote:

> [...]

Re: Gaps

Jonathan Morgan
Thanks, Heather! This looks super interesting and relevant. I look forward
to reading it :)

Jonathan

On Tue, Feb 20, 2018 at 3:28 PM, Heather Ford <[hidden email]> wrote:

> [...]



--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>