Help with bots

Adrianne Wadewitz
User:Sadads and I are currently working on an academic article related
to the coverage of historical information on Wikipedia (see the
outline of our methods below). We were thinking that some of the
existing bots or perhaps people in the bot-writing community might be
able to help us write some scripts that will pull the information we
need off wiki. We are not script writers, but we think what we want to
do is pretty easy. Please let us know if you can help us out! Thanks!

Adrianne (User:Wadewitz) and Alex (User:Sadads)

______________________________________________________________________

In this study, we analyze the ways in which Wikipedia articles
approach historiography. We approached our analysis in two ways:
quantitatively and qualitatively. In our quantitative approach, we
followed these procedures:

First, we looked at the number of different sources each article
cites. We determined this by running a script over the article that
counted the number of discrete citations in the footnotes and works
cited. Because many articles have a large number of sources but rely
on a small number of them for much of their information, we also
looked at how often each source is used and whether any one source is
used disproportionately. While there are reliable sources that could
be used in this way, we have found that heavy reliance on a single
source is a marker of an article that presents only one
historiographical viewpoint.
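
The citation count described above could be sketched roughly as
follows. This is only an illustration, not the script we ran: it
assumes the article's wikitext is already available as a string, that
footnotes use <ref> tags, and that a reused source is marked with the
same name attribute. The function names are made up for this example.
Short footnotes that cite different pages of the same book would be
counted as different sources, so real data would need more cleanup.

```python
import re
from collections import Counter

# Matches <ref ...>body</ref> and self-closing <ref name="x" /> tags.
# The \b keeps the pattern from matching the <references /> tag.
REF_RE = re.compile(
    r'<ref\b(?P<attrs>[^>/]*)(?:/>|>(?P<body>.*?)</ref>)',
    re.IGNORECASE | re.DOTALL,
)
NAME_RE = re.compile(r'name\s*=\s*"?([^"/>]+)"?', re.IGNORECASE)


def count_source_usage(wikitext):
    """Count how often each footnote source is used in the wikitext.

    A source is identified by its ref name when one is given,
    otherwise by the whitespace-normalized footnote text.
    """
    usage = Counter()
    for match in REF_RE.finditer(wikitext):
        name = NAME_RE.search(match.group("attrs") or "")
        if name:
            key = name.group(1).strip()
        else:
            key = " ".join((match.group("body") or "").split())
        usage[key] += 1
    return usage


def top_source_share(usage):
    """Return the fraction of all footnotes that cite the most-used source."""
    total = sum(usage.values())
    if total == 0:
        return 0.0
    return max(usage.values()) / total
```

The number of distinct sources is len(usage), and a high value from
top_source_share would flag an article that leans on one source.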

Second, we are also interested in the types of sources used. Using a
script to check the publication information and template information
of each source, we analyzed the ratio of journal to book to newspaper
to web sources. Moreover, because articles whose sources span a wide
range of publication dates tend to have a good representation of
historiography, we analyzed the publication dates of the sources.
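
As a rough sketch of the source-type and date check, assuming the
sources are formatted with the {{cite journal}}, {{cite book}},
{{cite news}}, and {{cite web}} templates and that the publication
year appears in a year= or date= parameter. The function name and the
accepted year range (1500 to 2099) are assumptions for this example,
and citations that contain nested templates or are written by hand
would be missed.

```python
import re
from collections import Counter

# Matches simple citation templates with no nested {{...}} inside them.
CITE_RE = re.compile(
    r'\{\{\s*cite\s+(journal|book|news|web)\b([^{}]*)\}\}',
    re.IGNORECASE,
)
# Picks a four-digit year out of a |year= or |date= parameter.
# Parameters such as |access-date= are skipped on purpose.
YEAR_RE = re.compile(
    r'\|\s*(?:year|date)\s*=[^|]*?\b(1[5-9]\d\d|20\d\d)\b',
    re.IGNORECASE,
)


def summarize_sources(wikitext):
    """Return (counts per source type, (earliest, latest) year or None)."""
    types = Counter()
    years = []
    for kind, params in CITE_RE.findall(wikitext):
        types[kind.lower()] += 1
        year = YEAR_RE.search(params)
        if year:
            years.append(int(year.group(1)))
    span = (min(years), max(years)) if years else None
    return types, span
```

The ratio of journal to book to newspaper to web sources can be read
from the returned counts, and the span gives the range of publication
dates.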

Third, we searched the articles for the following words, based on a
preliminary survey of 25 articles we used as an initial sample. These
words indicated that the articles approached history and
historiography from an ambiguous or debatable position: “probably”,
“possibly”, “on the other hand”, “one view”, “bias”, “perspectives”.
We also searched for sections such as “Historiography”, “Modern
view”, “Legacy”, and “Assessment”.
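
The word and section search might look something like the sketch
below. It assumes the article wikitext is a string and that section
headings use the usual == Heading == markup. Terms are matched as
whole words, so "bias" would not count "biased"; whether that is the
right choice is an open question. The function names are made up for
this example.

```python
import re
from collections import Counter

HEDGE_TERMS = [
    "probably", "possibly", "on the other hand",
    "one view", "bias", "perspectives",
]
SECTION_TITLES = ["Historiography", "Modern view", "Legacy", "Assessment"]

# Matches wikitext headings such as "== Legacy ==" or "=== Legacy ===".
HEADING_RE = re.compile(r'^(={2,6})\s*(.+?)\s*\1\s*$', re.MULTILINE)


def count_hedge_terms(text):
    """Count case-insensitive whole-word occurrences of each term."""
    lowered = text.lower()
    counts = Counter()
    for term in HEDGE_TERMS:
        pattern = r'\b' + re.escape(term) + r'\b'
        counts[term] = len(re.findall(pattern, lowered))
    return counts


def find_marker_sections(wikitext):
    """Return the headings that match one of the section titles."""
    wanted = {title.lower() for title in SECTION_TITLES}
    return [
        title
        for _, title in HEADING_RE.findall(wikitext)
        if title.lower() in wanted
    ]
```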

We chose to analyze 19th-century FA, GA, and B articles. The GA and FA
articles have undergone a review process on Wikipedia and thus should
be of higher quality. We excluded any B article that had been through
a peer review on the site, as we wanted to contrast articles that had
been through Wikipedia's content review process with articles that
had not. We wanted to know what the “best” articles Wikipedia had to
offer looked like before and after comment by the community. We also
chose this field because both of us have some familiarity with the
time period but neither of us had worked extensively on the articles,
so there was no conflict of interest. We also excluded any military
history articles because of the significant difference in
historiographic focus of the military history community.
Additionally, the Wikipedia community has significantly more coverage
of military history, both in number of articles and in level of
commitment to that subtopic within the community; WikiProject
Military History is one of the most active WikiProjects and has a
different standard of topic coverage.
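
The selection rules above can be written down as a small filter. This
is a sketch under the assumption that each article has already been
tagged with its quality class, whether it has been through peer
review, and whether it falls under military history; the Article
record and its field names are hypothetical, and collecting those
tags from the wiki is a separate step not shown here.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical record; all fields are assumed to be filled in beforehand.
    title: str
    quality: str            # assessment class such as "FA", "GA", or "B"
    peer_reviewed: bool     # has the article been through peer review?
    military_history: bool  # is it a military history article?


def in_sample(article):
    """Apply the inclusion and exclusion rules described in the text."""
    if article.military_history:
        return False
    if article.quality in ("FA", "GA"):
        return True
    # B-class articles count only if they have not been peer reviewed.
    return article.quality == "B" and not article.peer_reviewed
```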


--
Dr. Adrianne Wadewitz
Mellon Digital Scholarship Fellow
Center for Digital Learning + Research
Occidental College
http://www.oxy.edu/center-digital-learning-research/about
https://sites.google.com/site/wadewitz/

_______________________________________________
Wikibots-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikibots-l

Re: Help with bots

Stephen LaPorte-2
On Wed, Jan 2, 2013 at 1:04 PM, Adrianne Wadewitz <[hidden email]> wrote:

> We were thinking that some of the
> existing bots or perhaps people in the bot-writing community might be
> able to help us write some scripts that will pull the information we
> need off wiki.
>

Hello Adrianne,

I have a Python script that I use for gathering similar data.[0] It might
be a little complicated to set up and is poorly documented at the moment,
but I would be happy to run it for you and send the data as CSV or JSON.

Cheers,
Stephen

0. https://github.com/slaporte/qualityvis/blob/master/loupe.py

Re: Help with bots

Morten Wang
Adrianne,

You might also be interested in posting this on wiki-research-l,
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

There are quite a lot of academic researchers reading that list, and I know
at least one of them has done research into references & citations, so in
addition to code you might also get some good feedback on your methods.


Regards,
Morten


