The Objective Revision Evaluation System (ORES) is a service for evaluating Wikipedia edits against various metrics, backed by a machine learning system. This vignette is a guide to that service and to the R API client of the same name.
ORES provides a machine learning platform for Wikimedia users, letting them plug in machine learning algorithms or sets of features about edits and train models to classify edits in various ways. It can currently identify whether an edit was made in good faith, the quality of an edit, and whether an edit damaged the article it was made to.
Because the platform has a public API, this R package can hook into it, making it easy to score a series of edits using the models that ORES provides.
Not all projects are supported yet, and not all supported projects have every model; adapting a model to a new language and cultural context takes time. The ores package provides functions to identify which projects are supported, which models they have, and how accurate those models are.
With list_wikis() you can do just what it says on the tin: list the projects that have some kind of ORES support:
library(ores)
list_wikis()
# [1] "dewiki" "enwiki" "eswiki" "etwiki" "fawiki"
# [6] "frwiki" "hewiki" "idwiki" "itwiki" "nlwiki"
# [11] "ptwiki" "testwiki" "trwiki" "ukwiki" "viwiki"
# [16] "wikidatawiki"
Once you have identified the project you want and confirmed that it is supported, you can see what models it has, along with various information about them (accuracy, methodology, the size of the training set), using list_models(), which accepts a wiki name and returns all of that information.
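As a quick sketch (enwiki is used purely for illustration, and the exact structure of the returned metadata may vary between package versions):

library(ores)

# Retrieve the models available for the English-language Wikipedia,
# along with their metadata (accuracy, methodology, training-set size).
enwiki_models <- list_models("enwiki")
str(enwiki_models)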
As noted, ORES currently has three models, exposed through three functions:

check_goodfaith()
check_damaging()
check_quality()
All of these functions have the same structure: you provide a project and a vector of edit IDs, and they return a data.frame containing the project, the edit ID, the predicted classification, and the probability of that prediction being right or wrong.
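For example, here is a minimal sketch using check_goodfaith(), following the project-then-edit-IDs argument structure described above; the edit IDs are placeholders rather than real revisions, and check_damaging() and check_quality() are called the same way:

library(ores)

# Score two edits on the English-language Wikipedia with the
# good-faith model. The result is a data.frame with one row per
# edit: the project, the edit ID, the predicted classification,
# and the probability associated with that prediction.
goodfaith_scores <- check_goodfaith("enwiki", c("34854345", "34854346"))
goodfaith_scores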