February 15, 2017 Research Showcase

Sarah Rodlund
Hi Everyone,

The next Research Showcase will be live-streamed on February 15, 2017, at 11:30 AM (PST), 18:30 UTC.


As usual, you can join the conversation on IRC at #wikimedia-research, and you can watch our past research showcases here.

This month's presentations:

Wikipedia and the Urban-Rural Divide
By Isaac Johnson
Wikipedia articles about places, OpenStreetMap features, and other forms of peer-produced content have become critical sources of geographic knowledge for humans and intelligent technologies. We explore the effectiveness of the peer production model across the rural/urban divide, a divide that has been shown to be an important factor in many online social systems. We find that in Wikipedia (as well as OpenStreetMap), peer-produced content about rural areas is of systematically lower quality, less likely to have been produced by contributors who focus on the local area, and more likely to have been generated by automated software agents (i.e. “bots”). We continue to explore and codify the systemic challenges inherent to characterizing rural phenomena through peer production as well as discuss potential solutions.


Wikipedia Navigation Vectors
By Ellery Wulczyn
In this project, we learned embeddings for Wikipedia articles and Wikidata items by applying Word2vec models to a corpus of reading sessions. Although Word2vec models were developed to learn word embeddings from a corpus of sentences, they can be applied to any kind of sequential data. The learned embeddings have the property that items with similar neighbors in the training corpus have similar representations (as measured by the cosine similarity, for example). Consequently, applying Word2vec to reading sessions results in article embeddings, where articles that tend to be read in close succession have similar representations. Since people usually generate sequences of semantically related articles while reading, these embeddings also capture semantic similarity between articles.
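
The property described in this abstract (items with similar neighbors end up with similar representations) can be illustrated without training a real Word2vec model. The sketch below is a simplified stand-in, not the method used in the project: it represents each article by counts of the articles seen near it in a few toy reading sessions, then compares articles by cosine similarity. The sessions, the window size, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical toy reading sessions: each list is the ordered sequence of
# articles one reader visited. These are made up for illustration.
SESSIONS = [
    ["Paris", "France", "Eiffel Tower"],
    ["Lyon", "France", "Eiffel Tower"],
    ["Python", "Guido van Rossum", "Programming language"],
    ["Ruby", "Programming language", "Guido van Rossum"],
]


def context_vectors(sessions, window=2):
    """Count, for each article, the articles seen within `window` steps of it."""
    vectors = defaultdict(Counter)
    for session in sessions:
        for i, article in enumerate(session):
            lo = max(0, i - window)
            hi = min(len(session), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[article][session[j]] += 1
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


vectors = context_vectors(SESSIONS)
# "Paris" and "Lyon" never share a session but have the same neighbors.
print(cosine_similarity(vectors["Paris"], vectors["Lyon"]))
# "Paris" and "Python" have no neighbors in common.
print(cosine_similarity(vectors["Paris"], vectors["Python"]))
```

In this toy data, "Paris" and "Lyon" are never read in the same session, yet they share the same neighbors, so their similarity is close to 1, while "Paris" and "Python" share none and score 0. Word2vec learns dense vectors by prediction rather than by counting, but the underlying intuition is similar.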

--
Sarah R. Rodlund
Senior Project Coordinator-Product & Technology, Wikimedia Foundation

_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: February 15, 2017 Research Showcase

Sarah Rodlund
Just a reminder this will be taking place in one hour!


On Tue, Feb 14, 2017 at 2:49 PM, Sarah R <[hidden email]> wrote:

--
Sarah R. Rodlund
Senior Project Coordinator-Product & Technology, Wikimedia Foundation