Vandalism

Vandalism

Thomas Stieve
Dear Listserv,

Hope all is well. I am mapping IP address edits per country for 271
language Wikipedias. I would like to exclude IP addresses that are
vandalism. I was thinking of using the ipblocks table for the IP addresses
to be excluded. Because this project is in so many different languages and
my programming skills are intermediate, I would like to use the Wikipedia
tables or registers that the Wikipedians in those languages use to mark
vandalism. If anyone has another idea, I would be most grateful. Perhaps I
am missing a way that Wikipedians across languages are using to mark
vandalism.

Thank you,
Tom


--
Thomas Stieve
Ph.D. Candidate
School of Geography and Development
University of Arizona
_______________________________________________
Wiki-research-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

Re: Vandalism

Kerry Raymond
I’m not quite sure what you want. An IP address may be used by one or many anonymous contributors (workplaces, universities and schools can often appear to Wikipedia as a single IP address). Each of those contributors may make one or more edits. Each of those edits may be vandalism (a deliberate attempt to damage, which is hopefully reverted), a poor-quality but good-faith edit (which may be reverted for a wide variety of reasons) or an acceptable contribution.

Also there is a reluctance to block a known multi-user IP address because of misbehaviour by what appears to be one person.

So, when you say “IP addresses that are vandalism”, can you be more specific about what you want or don’t want?

Kerry

> On 16 Jan 2019, at 9:03 pm, Thomas Stieve <[hidden email]> wrote:
> [...]


Re: Vandalism

Kerry Raymond
In reply to this post by Thomas Stieve
And, FWIW, I don’t think we have a flag on an edit saying that it is vandalism. We have a history that can show that an edit was reverted. On inspection of the edit summary of the reversion, there may be some textual clues, e.g. “rvv”, a common abbreviation for “reverting vandalism”. There may be a message on the reverted IP’s talk page that uses words that suggest vandalism (noting that many of these messages are templates and so have a highly predictable structure, usually with initially neutral terms like “not constructive” escalating to the explicit use of the word “vandalism” in some form). However, these messages may not specifically link to the problematic edit, so you would be looking for talk page messages appearing “shortly” after the revert of the edit.

Not all vandalism is immediately detected; there may be a number of other edits intervening, which may make it impossible to revert.

Not all vandalism is removed with a revert; the removal may occur through “normal editing”, perhaps as part of a larger edit.

Not all reverted edits are vandalism. They may be well-intentioned but breach a Wikipedia policy (e.g. lacking a required citation, or presenting an opinion as fact). Some acceptable edits get reverted for a range of (mostly unacceptable) reasons like gatekeeping, style errors, UI errors (if the GUI loads slowly, my click to say thanks sometimes turns into a revert!), etc.

And finally, speaking as someone who works through her watchlist diligently, sometimes you just can’t tell if an edit is vandalism. The classic is the small change in dates. If there is no citation, or the citation is to an offline resource or a dead link, it may be impossible to tell if the changed information is a genuine correction or a deliberately damaging action. Obviously I may have my suspicions, but I do have the obligation to Assume Good Faith. It’s not easy.
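
As a rough illustration of the edit-summary clue mentioned above (“rvv” and similar), here is a minimal Python sketch. The pattern, the function names, and the input shape (pairs of reverted IP and revert summary) are all hypothetical, and the pattern only covers English-style summaries, so it would need a separate word list per language. Given all the caveats above, treat the result as a noisy hint, not as a vandalism label.

```python
import re

# Hypothetical hint pattern: "rvv" or any word starting with "vandal".
# Real revert summaries vary by wiki and by language.
VANDALISM_HINTS = re.compile(r"\b(rvv|vandal\w*)\b", re.IGNORECASE)


def summary_suggests_vandalism(summary):
    """Return True if a revert's edit summary contains a vandalism hint."""
    return bool(VANDALISM_HINTS.search(summary or ""))


def flag_reverted_ips(reverts):
    """Collect IPs whose edits were reverted with a vandalism-like summary.

    reverts: iterable of (reverted_ip, revert_summary) pairs (assumed shape).
    """
    flagged = set()
    for ip, summary in reverts:
        if summary_suggests_vandalism(summary):
            flagged.add(ip)
    return flagged


# Example with documentation-range IP addresses.
reverts = [
    ("192.0.2.1", "rvv"),
    ("198.51.100.7", "rv, needs citation"),
    ("203.0.113.5", "Reverting vandalism"),
]
flagged = flag_reverted_ips(reverts)
```

Here the first and third IPs are flagged; the second is not, because its revert summary points to a good-faith policy problem.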

Kerry




Re: Vandalism

Jonathan Morgan
Tom,

You may be interested in the ORES Platform
<https://www.mediawiki.org/wiki/ORES>, which provides a vandalism detection
service across many (but not all) Wikipedia languages. It works at the
revision level, not the user level, but I suppose you could filter and/or
aggregate.
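
To illustrate the aggregation step, here is a minimal sketch in Python. It assumes you have already obtained a per-revision "damaging" probability for each IP edit (for example from ORES); the function name, the input shape, and both thresholds are hypothetical choices for illustration, not part of ORES itself.

```python
from collections import defaultdict


def flag_ips_by_damaging_share(revisions, score_threshold=0.8, share_threshold=0.5):
    """Flag IPs for which a large share of edits look damaging.

    revisions: iterable of (ip, damaging_probability) pairs, one per edit
    (assumed shape). Both thresholds are arbitrary and should be tuned.
    """
    totals = defaultdict(int)
    damaging = defaultdict(int)
    for ip, prob in revisions:
        totals[ip] += 1
        if prob >= score_threshold:
            damaging[ip] += 1
    # Keep an IP only if its share of likely-damaging edits is high enough.
    return {ip for ip, n in totals.items() if damaging[ip] / n >= share_threshold}


# Example with documentation-range IP addresses and made-up scores.
revisions = [
    ("192.0.2.1", 0.95),
    ("192.0.2.1", 0.90),
    ("198.51.100.7", 0.10),
    ("198.51.100.7", 0.85),
    ("198.51.100.7", 0.20),
]
excluded = flag_ips_by_damaging_share(revisions)
```

With these made-up scores only 192.0.2.1 is excluded. Filtering at the revision level instead (dropping individual edits and keeping the IP) would be the gentler alternative for shared addresses.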

Best,
Jonathan

On Wed, Jan 16, 2019 at 1:19 PM Kerry Raymond <[hidden email]>
wrote:

> [...]


--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>

Re: Vandalism

Martin Potthast-2
In reply to this post by Thomas Stieve
Hi Tom,

Maybe take a look here: https://webis.de/publications.html#filter:ICWSM

In this paper we conducted a large-scale analysis of vandalism on Wikipedia
across many languages, and the underlying software is available.

Best,
Martin

On Wed, Jan 16, 2019 at 8:03 PM Thomas Stieve <[hidden email]>
wrote:

> [...]


--
Jun-Prof. Dr. Martin Potthast
Leipzig University
Germany

+49 341 97 32382
+49 171 809 1945

leipzig.webis.de