Abstract. We present results of a new approach to detect destructive article revisions, so-called vandalism, in Wikipedia. Vandalism detection is a one-class classification problem, where vandalism edits are the target to be identified among
all revisions. Interestingly, vandalism detection has not been addressed in the Information Retrieval literature by now. In this paper we discuss the characteristics of vandalism as humans recognize it and develop features to render vandalism
detection as a machine learning task. We compiled a large number of vandalism edits in a corpus, which allows for the comparison of existing and new detection approaches. Using logistic regression we achieve 83% precision at 77% recall
with our model. Compared to the rule-based methods that are currently applied in Wikipedia, our approach increases the F-Measure performance by 49% while being faster at the same time.
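For anyone who wants to sanity-check the numbers, here is a minimal sketch of how an F-Measure follows from the precision and recall quoted in the abstract. It assumes the balanced F1 variant (beta = 1); the abstract does not say which weighting the authors used, and the function name `f_measure` is just an illustrative choice, not anything from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F_beta).

    beta = 1.0 gives the balanced F1 score; that choice is an assumption,
    since the abstract does not state which variant was reported.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0:
        # Both inputs are zero: define the score as 0 to avoid dividing by zero.
        return 0.0
    return (1 + b2) * precision * recall / denom


# Figures quoted in the abstract: 83% precision at 77% recall.
reported = f_measure(0.83, 0.77)
print(f"{reported:.3f}")  # prints 0.799
```

Under that assumption the model's F-Measure comes out at roughly 0.80. The abstract alone does not give the baseline score of the rule-based bots, so the 49% figure cannot be re-derived from it.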
Open the PDF, scan to page 667. This bot outperforms MartinBot,
T-850 Robotic Assistant, WerdnaAntiVandalBot, Xenophon, ClueBot,
CounterVandalismBot, PkgBot, MiszaBot, and AntiVandalBot. It
outperforms the best of those (AntiVandalBot) by a very wide margin.
So why are you wasting the ISPs' time and the police's time when the
best of the passive technology routes have not been explored? Using
machine learning, you pit the vandals against themselves. Every time
they perform a particular kind of vandalism, it can never be performed
again, because the bot will recognize it.