1
Sep
Tags: Arabic, Bengali, Google, health, Hindi, Kiswahili, pilot, Swahili, Tamil, translate, Wikipedia | By paolo |
In 2008, Google launched Knol, a project competing with Wikipedia. By January 2009 it had grown to 100,000 articles, a result that is hard to call a success.
Since then, Google’s attitude towards Wikipedia seems to have changed a bit, to something more like “Ok, you (Wikipedia) can become the de facto monopolist in the user-generated creation of knowledge, we have other and more challenging competitors to defeat now, we will incorporate you later on down the way”.
Two examples of this new attitude (in my view, of course) are the Kiswahili Wikipedia Challenge and the Health Speaks Wikipedia pilot project.
The Kiswahili Wikipedia Challenge was launched in November 2009 by Google. The task was to translate English Wikipedia articles into Kiswahili or to write Wikipedia articles from scratch. Participants received prizes such as laptops, mobile phones, prepaid internet access modems and Google T-shirts. Google’s stated goal: “We hope to make the online experience richer and more relevant for 100 million African users who speak Kiswahili.”
The results might not be that great. The Wikipedia Signpost of 2010-07-26 quotes from the blog post “What happened on the Google Challenge @ the Swahili Wikipedia”:
Nearly all of them are gone now and left a lot of articles which often are not really state of the art formally and also linguistically … they don’t care because they were there for laptops and other prizes (no need to be rude, but it hurts me pretty bad).
An article in the New York Times is similarly unenthusiastic. Its last paragraphs comment on Google-generated content in the Wikipedias in languages of India.
However, the surge in content created by Google’s project to improve these sites still needs work, according some local site administrators. For example the Wikipedia in Tamil – one of the underrepresented South Asian languages – the entries covered “too many American pop stars and Hindi movies, which Tamils may not need as a priority.” There was also sloppiness in language and coding.
Despite these concerns, Tamil Wikipedia plans on working with Google to continue the additions. The Bengali Wikipedia, however, took greater umbrage and simply deleted the Google-generated content. The Bengali Wikipedians explained that the material simply did not meet their standards.
The Health Speaks Wikipedia pilot project was announced yesterday and is focused on increasing the quantity and quality of online health information in languages spoken in developing countries. They started a pilot project to support community-based, crowd-sourced translation of health information from the English Wikipedia into the Arabic, Hindi and Swahili Wikipedias.
They have chosen hundreds of good quality English language health articles from Wikipedia that they hope will be translated with the assistance of Google Translator Toolkit, made locally relevant, reviewed and then published to the corresponding local language Wikipedia site. They have also funded the professional translation of a small subset of these articles. And they are additionally providing a donation incentive to encourage community translators to participate. For the first 60 days, they will donate 3 cents (US) for each English word translated to the Children’s Cancer Hospital Egypt 57357, the Public Health Foundation of India and the African Medical and Research Foundation (AMREF) for the pilots in Arabic, Hindi and Swahili, respectively, up to $50,000 each. This means that community translators will help their friends and neighbors access quality health information in a local language, while also supporting a local non-profit organization working in health or health education.
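To get a feeling for the size of the donation incentive, here is a minimal sketch of the arithmetic, assuming the figures quoted above (3 US cents per translated English word, capped at $50,000 per language). The function names are my own and purely illustrative; amounts are kept in whole cents to avoid rounding surprises.

```python
# Illustrative arithmetic only; the rate and cap are the figures quoted above.
RATE_CENTS_PER_WORD = 3          # 3 US cents per translated English word
CAP_CENTS = 50_000 * 100         # $50,000 cap per language, in cents


def donation_cents(words_translated: int) -> int:
    """Donation triggered by a number of translated words, in cents."""
    return min(words_translated * RATE_CENTS_PER_WORD, CAP_CENTS)


def words_to_reach_cap() -> int:
    """Smallest number of translated words at which the cap is reached."""
    # Ceiling division without floating point.
    return -(-CAP_CENTS // RATE_CENTS_PER_WORD)


if __name__ == "__main__":
    print(donation_cents(1000) / 100)   # dollars donated for 1,000 words
    print(words_to_reach_cap())         # words needed to exhaust the cap
```

So each language community would need to translate roughly 1.67 million English words within the 60 days to exhaust its cap.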
26
Aug
Tags: britannica, errors, history, Nature, Rosenzweig, Wikipedia | By paolo |
While reading “Can History be Open Source? Wikipedia and the Future of the Past” (review soon!) by Roy Rosenzweig, founder and former director of the Center for History and New Media (which also created Zotero and Omeka!), I came across a mention of the list of 74 errors in the Encyclopædia Britannica that have been corrected in Wikipedia.
Lovely! ;)
25
Aug
Tags: Anarchy, Bureaucracy, democracy, Despotism, Meritocracy, Plutocracy, power, Technocracy, Wikipedia | By paolo |
There is an interesting essay over at meta.wikimedia about Wikipedia’s power structure: Wikimedia’s present power structure is a mix of anarchic, despotic, democratic, republican, meritocratic, plutocratic, technocratic, and bureaucratic elements.
Wow! The entire self-reflection of the Wikipedia community is amazing and the topic is very interesting.
Personally, I wonder how much of these policies and this ethos is created by the community (the humans) and how much by the socio-technical system (the MediaWiki software). My impression is that the software has a large influence, and that the same community would behave very differently under different software. It is often said that wikis work because fixing things is easier than destroying them, but this is a feature of the software and of the buttons and functions (such as rollback) that it gives to users.
Many of these points have resonated with me ever since I read the glorious book by Lawrence Lessig, Code and Other Laws of Cyberspace, but now I’m in a position to test them … at least in Wikipedia! I guess I would be classified as a technocrat ;)
The essay is released under the Creative Commons Attribution/Share-Alike License, so, just because I can, I copy and paste the original HTML after the jump (and most links are of course broken). Enjoy!
25
Aug
Tags: aggressive, Expertise, larry, sanger, Wikipedia | By paolo |
Larry Sanger in the paper “The Fate of Expertise after Wikipedia”:
Over the long term, the quality of a given Wikipedia article will do a random walk around the highest level of quality permitted by the most persistent and aggressive people who follow an article.
Larry Sanger is co-founder of Wikipedia but left years ago. You can read the hyper-interesting account of his involvement with Wikipedia in “The Early History of Nupedia and Wikipedia: A Memoir” (part 1, part 2).
23
Aug
Tags: fun, ideology, motivations, Wikipedia | By paolo |
2 Comments
Paper by Oded Nov, published in Communications of the ACM (November 2007).

A random sample of 370 Wikipedians were emailed a request to participate in a Web-based survey.
A total of 151 valid responses were received (40.8% response rate), of which 140 (92.7%) were from males (first “gosh”!).
The respondents’ mean age was 30.9, and on average they had been contributing content to Wikipedia for 2.3 years.
The average level of contribution was 8.27 hours per week.
The Wikipedians were asked to state, on a scale of 1 to 7, how strongly they agreed or disagreed with a set of statements (items).
The items related to 8 different types of motivation: Protective, Values, Career, Social, Understanding, Enhancement (typical measures of volunteering motivations) and Fun, Ideology (added by the author as relevant for Wikipedia).
Overall, the top motivations were found to be Fun and Ideology. Agreement with Fun was on average 6.10 (in the range 1 to 7!). Ideology was 5.59. The other motivations scored below 4.
Each of the six motivations positively correlated with contribution level.
The Ideology case is particularly interesting (…): while people state that ideology is high on their list of reasons to contribute, being more ideologically motivated does not translate into increased contribution.
It would make sense for organizers of user-generated content outlets to focus marketing, recruitment, and retention efforts by highlighting the fun aspects of contributing.
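The percentages reported above can be re-derived from the raw counts. A minimal sketch follows; the helper name `pct` is my own, and the counts are the ones quoted in this post.

```python
# Re-deriving the survey percentages from the counts quoted above.
INVITED = 370      # Wikipedians who were emailed the survey
RESPONSES = 151    # valid responses received
MALE = 140         # responses from male Wikipedians


def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


if __name__ == "__main__":
    print("response rate:", pct(RESPONSES, INVITED), "%")
    print("male share of respondents:", pct(MALE, RESPONSES), "%")
```

Both values match the figures given in the paper summary (40.8% and 92.7%).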
Credit for image: nojhan released under Creative Commons
18
Aug
Tags: administrator, election, paper, predict, regression, review, sonet, Wikipedia | By paolo |
4 Comments
Paper by Moira Burke and Robert Kraut of Carnegie Mellon University, presented at CHI ‘08, Conference on Human Factors in Computing Systems.
This paper presents a model of editors who have successfully passed the peer review process to become admins. The lightweight model is based on behavioral metadata and comments, and does not require any page text. It demonstrates that the Wikipedia community has shifted in the last two years to prioritizing policymaking and organization experience over simple article-level coordination, and mere edit count does not lead to adminship.
In short, the authors compute many statistics for every user and then run a regression against the binary outcome “election successful, i.e. X became admin”. They analyse Requests for Adminship from before 2006 and from after 2006 separately.
The stats they compute are:
Strong edit history
* Article edits ‡
* Months since first edit
Varied experience
* Wikipedia (policy) edits ‡
* WikiProject edits ‡
* Diversity score
* User page edits ‡
User interaction
* Article talk edits ‡
* User talk edits ‡
* Wikipedia talk edits
* Arb/mediation/wikiquette edits
* Newcomer welcomes
* “Please” in comments
* “Thanks” in comments
Helping with chores
* “Revert” in comments ‡
* Vandal-fighting (AIV) edits
* Requests for protection
* “POV” in comments
* Admin attention/noticeboard edits
* X for deletion/review edits ‡
* Minor edits (%)
Observing consensus
* Other RfAs
* Village pump
* Votes
Edit summaries / comments
* Commented (%)
* Avg. comment length (log2 chars)
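To make the “regression on a binary outcome” idea concrete, here is a minimal sketch. It is not the authors’ model: I swap in a plain logistic regression fitted by gradient descent (the conclusions quoted below mention probit regressions), it uses only two of the features listed above, and the candidate data is entirely made up for illustration.

```python
import math

# Made-up candidates: (article edits in thousands, policy edits in hundreds).
# The label is 1 if the (hypothetical) Request for Adminship succeeded.
CANDIDATES = [
    ((1.0, 0.1), 0), ((5.0, 0.2), 0), ((8.0, 0.3), 0), ((4.0, 0.5), 0),
    ((2.0, 2.0), 1), ((3.0, 3.5), 1), ((6.0, 4.0), 1), ((1.5, 2.5), 1),
]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(data, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for x, y in data:
            err = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_w = [g + err * xi for g, xi in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias


def predict(weights, bias, x) -> float:
    """Estimated probability that a candidate with features x succeeds."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


if __name__ == "__main__":
    w, b = fit_logistic(CANDIDATES)
    print("weights (article, policy):", w, "bias:", b)
    print("policy-heavy candidate:", predict(w, b, (2.0, 4.0)))
    print("article-only candidate:", predict(w, b, (2.0, 0.1)))
```

In this toy data success depends on policy edits rather than article edits, so the fitted weight on policy edits comes out positive; the real paper estimates such effects from actual RfA records and many more features.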
Conclusions
Merely performing a lot of production work is insufficient for “promotion” in Wikipedia. Candidates’ article edits were weak predictors of success. They also have to demonstrate more managerial behavior. Diverse experience and contributions to the development of policies and Wiki Projects were stronger predictors of RfA success. This is consistent with findings that Wikipedia is a bureaucracy [1] and that coordination work has increased substantially [8][13].
However, future work is needed to examine more closely what the admins are doing. Future admins also use article talk pages and comments for coordination and negotiation more often than unsuccessful nominees, and tend to escalate disputes less often.
Although this research has shown that judges pay attention to candidates’ job-relevant behavior, and especially behavior that suggests the candidate will be a good manager and not just a good worker, it is silent about whether other factors identified in the organizational literature [9] (social networks, irrelevant attributes, or strategic self-presentation) also play a role.
Indeed, recent evidence that Wikipedia admins use a secret mailing list to coordinate their actions toward others suggest that sponsorship may also play a role in promotion.
Future research in Wikipedia using techniques like those in the current paper can be used to test theories in organizational behavior about criteria for promotion. An important limitation of the current model is that it does not take the quality of contribution into account. We plan to improve the model by examining measures of length, persistence, and pageviews of edits, which are already being used in more processor intensive models of existing admin behavior [7] and impact of edits [10].
Criteria for admins have changed modestly over time. Success rates were much higher (75.5%) prior to 2006, and collaboration via article talk pages helped more in the past (+15% for every 1000 article talk edits, compared to +6.3% today). The diversity score performs similarly prior to 2006 (+3.7% then, +2.8% now). However, participation in Wikipedia policy and WikiProjects was not predictive of adminship prior to 2006, suggesting the community as a whole is beginning to prioritize policymaking and organization experience over simple article-level coordination.
If you want to read the details, you can read the PDF of the paper.
Credit: Picture by inju released under Creative Commons.
13
Aug
Tags: chart, Deletionism, funny, Inclusionism, Philosophy, pie, Wikipedia | By paolo |
2 Comments
Wow, over time Wikipedia has developed a set of internal editing philosophies, and users can declare their agreement with a philosophy simply by adding a specific template to their user page.
So I could extract the following pie chart from the “Wikipedians by Wikipedia editing philosophy” page. (Update: as HaeB says in a comment, “categories are not disjoint (…) a pie chart might not be the best visualization”. A bar chart might be better…)

The main ideological dichotomy is between Inclusionists and Deletionists. Inclusionists favor keeping and amending problematic articles over deleting them, while Deletionists favor removing articles that are not encyclopedic. Currently there are 1123 self-declared Inclusionists and 261 Deletionists.
As is typical of Wikipedia, fun enters the stage and a new philosophy emerges: AWWDMBJAWGCAWAIFDSPBATDMTD, an acronym for “Association of Wikipedians Who Dislike Making Broad Judgments About the Worthiness of a General Category of Article, and Who Are in Favor of the Deletion of Some Particularly Bad Articles, but That Doesn’t Mean They Are Deletionists”. Currently this is the 3rd most frequent philosophy, with 434 adherents, which shows how much Wikipedians like to have fun ;)
In fact, the most frequent philosophy is WikiGnome, with 2543 adherents (makes useful incremental edits without clamouring for attention, works behind the scenes of a wiki, tying up little loose ends and making things run more smoothly, fixing things like typos, poor grammar, and broken links), but there are also 265 WikiFairies (beautifies Wikipedia by organizing messy articles, improving style, or adding color and graphics).
Myself, I think I’m a Darwikinist or maybe not … ;)
Below are the complete table and the same pie chart in 3D.
Well, there’s a lot of Philosophy(ies) in Wikipedia! ;)
Editing philosophy | Wikipedians
Wikipedian WikiGnomes | 2543
Inclusionist Wikipedians | 1123
Wikipedians in the AWWDMBJAWGCAWAIFDSPBATDMTD | 434
Wikipedian WikiFairies | 265
Deletionist Wikipedians | 261
Wikipedians open to trout slapping | 245
Wikipedians against notability | 228
Eventualist Wikipedians | 222
Mergist Wikipedians | 184
Exopedianist Wikipedians | 111
Darwikinist Wikipedians | 109
Wikipedia users who oppose Flagged Revisions | 94
Structurist Wikipedians | 86
Incrementalist Wikipedians | 85
Exclusionist Wikipedians | 77
Wikipedian WikiElves | 74
Metapedianist Wikipedians | 62
Immediatist Wikipedians | 53
Wikipedia users who support Flagged Revisions | 51
Precisionist Wikipedians | 39
Delusionist Wikipedians | 35
Eguor Wikipedians | 34
Categorist Wikipedians | 31
Hyphen Luddites | 19
Redlinking Wikipedians | 18
Redirectionist Wikipedians | 11
Wikidemocratism Wikipedians | 11
Separatist Wikipedians | 4
Wikipedians open to whale squishing | 3
Transwikist Wikipedians | 2
Unsourced BLP Rescuers | 2
TOTAL | 6516
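Since the categories are not disjoint, a bar per category is probably more honest than slices of a whole. Here is a minimal sketch of a text bar chart for the five largest categories in the table; the shortened labels and the `text_bars` helper are my own.

```python
# Counts copied from the table above (five largest categories only).
COUNTS = {
    "WikiGnomes": 2543,
    "Inclusionists": 1123,
    "AWWDMBJAWG...": 434,
    "WikiFairies": 265,
    "Deletionists": 261,
}


def text_bars(counts: dict, width: int = 40) -> list:
    """Return one text line per category, longest bar first."""
    top = max(counts.values())
    lines = []
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        bar = "#" * max(1, round(width * n / top))  # scale to the largest
        lines.append(f"{name:<14} {bar} {n}")
    return lines


if __name__ == "__main__":
    print("\n".join(text_bars(COUNTS)))
    ratio = COUNTS["Inclusionists"] / COUNTS["Deletionists"]
    print("Inclusionists per Deletionist:", round(ratio, 1))
```

Each bar is scaled against the largest category rather than against the total, so overlapping memberships do not distort the picture.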

3
Aug
Tags: Andrew Lih, presentation, slides, slideshare, Wikipedia, Wikysym | By paolo |
The presentation (embedded below) consists of 148 slides. Below I have selected a few interesting ones.
Slide 42:
• Wikitravel: only 5% of those who press “edit” actually save
• Wikipedia: 1/5 to 2/5
• WikiHow: 30% with guided editing
• Wikia: WYSIWYG editor >> 50%
Sources: Jack Herrick, WikiHow; Erik Zachte, Wikimedia Foundation
Slide 91:
An experiment by The Guardian on crowdsourcing journalism.
A rival newspaper has obtained two million pages of explosive documents that outed your country’s biggest political scandal of the decade. They’ve had a team of professional journalists on the job for a month, slamming out a string of blockbuster stories as they find them in their huge stack of secrets.
How do you catch up? If you’re the Guardian of London, you wait for the associated public-records dump, shovel it all on your Web site next to a simple feedback interface and enlist more than 20,000 volunteers to help you find the needles in the haystack.
Your cost for the operation? One full week from a software developer, a few days’ help from others in his department, and £50 to rent temporary servers.
21
Jul
Tags: comparison, culture, vietnam war, Wikipedia | By paolo |
Just a quick experiment: below I have embedded the page about the Vietnam War from the English Wikipedia and an English translation of the page about the Vietnam War from the Vietnamese Wikipedia. (click here to open just the page embedding the 2 pages).
It would be interesting to automatically check the differences in how different communities (in this case defined by language) represent the same concepts.
For example the beginning of the article from the Vietnamese wikipedia (automatically translated) says: In Vietnam, newspapers still use the name of resistance against American for just this war, [9] as well as to distinguish it from other wars that happened in Vietnam when anti- French , anti- Japanese , anti- Mongolia , against China. Some people [10] feels not name the U.S. invasions of neutrality by the war also reflects elements of a civil war; [10] that some other name for the Vietnam War reflected the views of West rather than the people living in Vietnam. [10] The name of this war is still a matter of controversy. But now scholars in and outside Vietnam have gradually accepted the name “Vietnam War” because of its international nature.
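As a very rough first step towards such an automatic check, one could compare the vocabulary of the two versions. Below is a minimal sketch under strong assumptions: both texts are already in English (the second via automatic translation), words are compared as plain lowercase tokens, and the two sample strings are short placeholders of my own (the second borrows a phrase from the translated passage above), not the real articles.

```python
import re


def tokens(text: str) -> set:
    """Lowercase alphabetic words in a text, as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Vocabulary overlap between two texts (1.0 means identical word sets)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def only_in(a: str, b: str) -> list:
    """Words that appear in text a but not in text b, sorted."""
    return sorted(tokens(a) - tokens(b))


if __name__ == "__main__":
    # Placeholder snippets, not actual article text.
    english = "The Vietnam War was a war fought in Vietnam"
    vietnamese_translated = (
        "In Vietnam, newspapers still use the name of resistance "
        "against American for this war"
    )
    print("overlap:", jaccard(english, vietnamese_translated))
    print("only in the Vietnamese version:",
          only_in(vietnamese_translated, english))
```

Words that appear on only one side (here, for instance, “resistance”) are a crude hint of differences in framing; a serious comparison would need proper translation and far more than word sets.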
16
Jul
Tags: papers, Research, wikia, Wikipedia | By paolo |
A February report on a few papers about Wikipedia presented at the CSCW conference, by David Karger at Haystack Blog, MIT CSAIL Research.
The papers briefly reviewed are:
* Socialization Tactics in Wikipedia and their Effects, by Choi, Alexander, Kraut and Levine: studied how participants’ early experiences of Wikipedia (whether they were invited or began editing on their own; whether their work was ignored, admired, or critiqued; what kind of advice they received) affected users’ later participation in and contributions to Wikipedia.
* The work of sustaining order in Wikipedia: The banning of a vandal by Geiger and Ribes
* Readers are Not Free-Riders: Reading as a Form of Participation on Wikipedia, by Antin and Cheshire: the more you know about Wikipedia (measured with a survey), the more you participate
* Egalitarians at the Gate: One-Sided Gatekeeping Practices in Participatory Social Media, by Keegan and Gergle: which breaking news stories are featured on the front page? They studied whether this decision is made in an egalitarian fashion or whether some individuals have significantly more power. Most interestingly, they found that certain ‘elite users’ who participate in the discussion to an unusually high degree do have inordinate power to “spike” stories, preventing them from appearing, but do not seem to have power to push stories they like into appearance.
* Beyond Wikipedia: Coordination and Conflict in Online Production Groups by Kittur and Kraut. Interestingly they studied Wikia.com, a service hosting over 6000 distinct wikis all running on the same Mediawiki platform as Wikipedia. The uniformity of implementation meant that it could be ruled out as a source of different behaviors in different wikis.