Summary of findings from WMF Summer of Research program now available

Steven Walling-3
Greetings everyone,

Now that the WMF summer research program in the Community Department has come to a close, I wanted to point interested parties to the body of findings we've produced.

We covered a lot of territory, so to save you the trouble if you just want to browse, we collected our most salient results on one wiki page.
Next steps are twofold for this program:
  1. We'll be working with the Global Development team and some volunteers from the local community to extend these analyses to cover Portuguese Wikipedia, specifically to support Global Dev's work in Brazil.
  2. We're choosing and implementing a platform to release not just our code, but the datasets we compiled over the summer. You'll hear more about this soon, but we're taking our time in order to decide on a solution that will work in the long term for sharing open data beyond the dumps.
Last but not least, if anyone would like to have a more in-depth discussion about these findings and the research that produced them, I'm definitely open to hosting IRC office hours with some members of the team. Just let me know if you're interested (on- or off-list) and I'll set something up soon.

--
Steven Walling
Fellow at Wikimedia Foundation


_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: [Foundation-l] Summary of findings from WMF Summer of Research program now available

John Mark Vandenberg
Thanks Steven, and the Community Department.

I am instantly drawn to the analysis of redlinks.
Can we please have this data!!
Article writers are on standby, ready to kill red links ;-)

The special page for this is dead.

http://en.wikipedia.org/wiki/Special:WantedPages

--
John Vandenberg


Re: [Foundation-l] Summary of findings from WMF Summer of Research program now available

R.Stuart Geiger
Thanks for the interest, John!  I put the list of the top 250 up at
http://en.wikipedia.org/wiki/Wikipedia:Most_wanted_articles -- but I
didn't exactly publicize it.  I guess this is my chance to do so now!
Also, a list of the top 1000 redlinked articles is up on a separate
page at http://en.wikipedia.org/wiki/Wikipedia:Most_wanted_articles/July_2011
and the entire dataset is up at
http://toolserver.org/~swalker/redlink_list.csv -- note that it is
42.8 MB!

If you have any other questions about the redlinks/bluelinks dataset,
feel free to ask me.  And you can check out the meta page for more fun
links data, such as how many more links we added between 2009 and
2011, or incoming links to articles about countries / each country's
population: http://meta.wikimedia.org/wiki/Research:One_Link,_Two_Links,_Red_Links,_Blue_Links

Stuart

----
Stuart Geiger
User:Staeiou / @staeiou
Ph.D student, UC-Berkeley School of Information



Re: Summary of findings from WMF Summer of Research program now available

Emilio J. Rodríguez-Posada
In reply to this post by Steven Walling-3
The interesting thing here is that there were 4.8M unique red links in 2009 and 5.6M unique red links in 2011. The more articles are created, the more articles are missing.





Re: Summary of findings from WMF Summer of Research program now available

Piotr Konieczny-4
On 9/10/2011 5:04 PM, emijrp wrote:
> The interesting thing here is, 4.8M unique red links in 2009, and 5.6M unique red links in 2011. The more articles are created, the more articles are missing.

Doesn't surprise me; my rough calculations (http://en.wikipedia.org/wiki/User:Piotrus/Wikipedia_interwiki_and_specialized_knowledge_test) suggest Wikipedia is not even one-tenth complete at this point (counting only whether articles exist, not their quality).
-- 
Piotr Konieczny
PhD Candidate
Dept of Sociology
Uni of Pittsburgh

http://pittsburgh.academia.edu/PiotrKonieczny/
http://en.wikipedia.org/wiki/User:Piotrus


Re: Summary of findings from WMF Summer of Research program now available

Emilio J. Rodríguez-Posada
Interesting, Piotr. I'm working on a similar approach: http://en.wikipedia.org/wiki/User:Emijrp/All_human_knowledge



