Review of “Taking up the mop: identifying future wikipedia administrators”

Paper by Moira Burke and Robert Kraut of Carnegie Mellon University, presented at CHI ’08, Conference on Human Factors in Computing Systems.

This paper presents a model of editors who have successfully passed the peer review process to become admins. The lightweight model is based on behavioral metadata and comments, and does not require any page text. It demonstrates that the Wikipedia community has shifted in the last two years to prioritizing policymaking and organization experience over simple article-level coordination, and that mere edit count does not lead to adminship.

In short, the authors compute a set of statistics for every candidate and then regress them against a binary outcome: whether the election was successful, i.e. whether X became an admin. They analyze Requests for Adminship (RfAs) from before 2006 and after 2006 separately.
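To make that setup concrete, here is a minimal sketch of this kind of regression. It is not the authors' code. It assumes a probit link (which the paper appears to use), two made-up features, and synthetic data; `fit_probit`, `predict`, and the feature names are hypothetical.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF, used as the probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def fit_probit(rows, labels, steps=500, lr=0.1):
    """Fit probit weights by gradient ascent on the mean log-likelihood.

    Each row is [1.0, feature_1, feature_2, ...]; the leading 1.0 is the intercept.
    """
    weights = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = sum(w * xi for w, xi in zip(weights, x))
            # Clamp the probability to avoid division by zero.
            p = min(max(normal_cdf(z), 1e-9), 1.0 - 1e-9)
            scale = (y - p) * normal_pdf(z) / (p * (1.0 - p))
            for j, xi in enumerate(x):
                grad[j] += scale * xi
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def predict(weights, x):
    """Predicted probability that a candidate's RfA succeeds."""
    return normal_cdf(sum(w * xi for w, xi in zip(weights, x)))

# Synthetic candidates (assumption: invented data, not from the paper).
# Columns: intercept, policy edits in thousands, normalized diversity score.
rng = random.Random(0)
rows, labels = [], []
for _ in range(300):
    policy_k = rng.uniform(0.0, 2.0)
    diversity = rng.uniform(0.0, 1.0)
    latent = -1.0 + 1.2 * policy_k + 0.8 * diversity + rng.gauss(0.0, 1.0)
    rows.append([1.0, policy_k, diversity])
    labels.append(1 if latent > 0 else 0)  # 1 = became admin

weights = fit_probit(rows, labels)
print("weights:", [round(w, 2) for w in weights])
print("P(success | 2000 policy edits):", round(predict(weights, [1.0, 2.0, 0.5]), 2))
```

Running separate fits on pre-2006 and post-2006 candidates, as the authors do, would just mean calling `fit_probit` once per subset and comparing the resulting weights.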

The stats they compute are:
Strong edit history
* Article edits ‡
* Months since first edit
Varied experience
* Wikipedia (policy) edits ‡
* WikiProject edits ‡
* Diversity score
* User page edits ‡
User interaction
* Article talk edits ‡
* User talk edits ‡
* Wikipedia talk edits
* Arb/mediation/wikiquette edits
* Newcomer welcomes
* “Please” in comments
* “Thanks” in comments
Helping with chores
* “Revert” in comments ‡
* Vandal-fighting (AIV) edits
* Requests for protection
* “POV” in comments
* Admin attention/noticeboard edits
* X for deletion/review edits ‡
* Minor edits (%)
Observing consensus
* Other RfAs
* Village pump
* Votes
Edit summaries / comments
* Commented (%)
* Avg. comment length (log2 chars)
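Several of these features come straight from edit summaries, which is why the model needs no page text. A rough sketch of how the comment-based ones could be computed follows. The function name `comment_features`, the substring matching, and the reading of "log2 chars" as the log2 of the mean comment length are all assumptions, not details taken from the paper.

```python
import math

# Keywords from the feature list; matched as case-insensitive substrings,
# which is crude ("pov" would also match "poverty").
KEYWORDS = ("please", "thanks", "revert", "pov")

def comment_features(comments):
    """Compute comment-based features for one editor.

    `comments` holds one edit summary per edit; "" means no summary was given.
    """
    total = len(comments)
    non_empty = [c for c in comments if c.strip()]
    features = {
        kw + "_count": sum(1 for c in non_empty if kw in c.lower())
        for kw in KEYWORDS
    }
    # Share of edits that carry any summary at all.
    features["commented_pct"] = 100.0 * len(non_empty) / total if total else 0.0
    # Assumption: log2 of the mean length in characters of non-empty comments.
    avg_len = sum(len(c) for c in non_empty) / len(non_empty) if non_empty else 0.0
    features["avg_comment_len_log2"] = math.log2(avg_len) if avg_len > 0 else 0.0
    return features

sample = ["Revert vandalism", "", "Thanks, please see talk", "fix POV wording"]
features = comment_features(sample)
print(features)
```

A real pipeline would run this over each candidate's full edit history before their RfA and feed the result into the regression alongside the edit counts.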
Merely performing a lot of production work is insufficient for “promotion” in Wikipedia. Candidates’ article edits were weak predictors of success. They also have to demonstrate more managerial behavior. Diverse experience and contributions to the development of policies and WikiProjects were stronger predictors of RfA success. This is consistent with findings that Wikipedia is a bureaucracy [1] and that coordination work has increased substantially [8][13].

However, future work is needed to examine more closely what the admins are doing. Future admins also use article talk pages and comments for coordination and negotiation more often than unsuccessful nominees, and tend to escalate disputes less often.

Although this research has shown that judges pay attention to candidates’ job-relevant behavior, and especially behavior that suggests the candidate will be a good manager and not just a good worker, it is silent about whether other factors identified in the organizational literature [9] (social networks, irrelevant attributes, or strategic self-presentation) also play a role.

Indeed, recent evidence that Wikipedia admins use a secret mailing list to coordinate their actions toward others suggests that sponsorship may also play a role in promotion.

Future research in Wikipedia using techniques like those in the current paper can be used to test theories in organizational behavior about criteria for promotion. An important limitation of the current model is that it does not take the quality of contribution into account. We plan to improve the model by examining measures of length, persistence, and pageviews of edits, which are already being used in more processor-intensive models of existing admin behavior [7] and impact of edits [10].

Criteria for admins have changed modestly over time. Success rates were much higher (75.5%) prior to 2006, and collaboration via article talk pages helped more in the past (+15% for every 1000 article talk edits, compared to +6.3% today). The diversity score performs similarly prior to 2006 (+3.7% then, +2.8% now). However, participation in Wikipedia policy and WikiProjects was not predictive of adminship prior to 2006, suggesting the community as a whole is beginning to prioritize policymaking and organization experience over simple article-level coordination.

If you want to read the details, you can read the PDF of the paper.
Credit: Picture by inju released under Creative Commons.