Tag Archives: Wikipedia

Wikimedia Foundation is hiring!

The Wikimedia Foundation (which runs Wikipedia, among other projects) is looking for creative, motivated people who want to work in a highly collaborative environment. There are positions in 22 areas, and many are open until April 17, 2011, so hurry up!
The positions are based in San Francisco, but some may be open to people working remotely.




Cross-cultural studies of Wikipedia

Papers I’m aware of that compare different Wikipedias. Do you know of other investigations comparing Wikipedias?

“Cultural Differences in Collaborative Authoring of Wikipedia” [1] compared the French, German, Japanese and Dutch Wikipedias. The authors applied content analysis methods to just the page “Game” from the different Wikipedias, i.e., just 4 pages. They find some correlations between patterns of contributions (number of deleting actions, of adding actions, of corrective actions) and the four dimensions of cultural influences proposed by Hofstede (Power Distance, Collectivism versus Individualism, Femininity versus Masculinity, and Uncertainty Avoidance). They conclude that cultural differences that are observed in the physical world also exist in the virtual world.
“Cross-cultural analysis of the Wikipedia community” [2] analyzed the English, Hebrew, Japanese, and Malay Wikipedias. The authors used content analysis of 120 Wikipedia talk pages (randomly sampled among “user talk pages”, “article talk pages”, and “Wikipedia policies talk pages”) in 4 language Wikipedias that differ in size and culture: English (western, big), Hebrew (western, small), Japanese (eastern, big) and Malay (eastern, small). They find that “Courtesy” postings were more frequent in large than in small Wikipedias, and in Eastern than in Western ones (significant). This is probably connected to Hofstede’s high vs low power distance, because high politeness is associated with high power distance. Moreover, in collectivistic/high power distance cultures, relationships prevail over tasks. Other correlations were not significant.
“Issues of cross-contextual information quality evaluation — The case of Arabic, English, and Korean Wikipedias” [3] compared the Arabic, English, and Korean Wikipedias. The authors used many different methods, including content analysis of featured articles and counts of Internal Links, edits, Adjacent Pages, Registered Users, … and applied multivariate statistical analysis in order to find correlations. Hofstede’s cultural dimension scores for the United States, South Korea and the Arab World were also used to assess pair-wise similarity of the Wikipedias at the cultural level. They conclude that different Wikipedia communities may have different models for quality.

“Conflictual Consensus in the Chinese Version of Wikipedia” [4] focuses on one single Wikipedia, the Chinese one, and compares regional differences among its contributors based on four regions of origin (Mainland, Hong Kong / Macau, Taiwan, and Singapore / Malaysia). The author claims that the main issue threatening the potential growth of Chinese Wikipedia is neither the internal conflicts nor the external competition from Baidu Baike, but the evolution of the newly established “Avoid Region-Centric Policy”.

“Analyzing Cultural Differences in Collaborative Innovation Networks by Analyzing Editing Behavior in Different-Language Wikipedias” [5] does not use manual content analysis but social network analysis as a lens for comparing the English, German, Japanese, Korean, and Finnish language Wikipedias, finding a difference between egalitarian cultures, such as the Finnish, and quite hierarchical ones, such as the Japanese.


[1] Pfeil, U., Zaphiris, P. and Ang, C. S. 2006. Cultural Differences in Collaborative Authoring of Wikipedia. Journal of Computer-Mediated Communication, 12, 88–113.

[2] Hara, N., Shachaf, P., & Hew, K.F. 2010. Cross-cultural analysis of the Wikipedia community. Journal of the American Society for Information Science and Technology, 61(10), 2097–2108.

[3] Stvilia, B., Al-Faraj, A., & Yi, Y. 2009. Issues of cross-contextual information quality evaluation—The case of Arabic, English, and Korean Wikipedias. Library & Information Science Research, 31(4), 232–239.

[4] Liao, H. 2009. Conflictual Consensus in the Chinese Version of Wikipedia. IEEE Technology and Society Magazine.

[5] Nemoto, K., & Gloor, P. 2010. Analyzing Cultural Differences in Collaborative Innovation Networks by Analyzing Editing Behavior in Different-Language Wikipedias. Proceedings of COINs 2010, Collaborative Innovation Networks Conference, Savannah GA, Oct 7–9, 2010.

Diderot on the Encyclopédie

Noting that it could not be the work of a single man, for no one man is capable of knowing everything, Diderot refutes the Jesuit argument that the task would never be completed by saying that time, energy, and genius make impossible tasks possible.

An encyclopedia ought to make good the failure to execute such a project hitherto, and should encompass not only the fields already covered by the academies, but each and every branch of human knowledge. This is a work that cannot be completed except by a society of men of letters and skilled workmen, each working separately on his own part, but all bound together solely by their zeal for the best interests of the human race and a feeling of mutual good will.

Wondering what Diderot would say about Wikipedia today…

Source: Historical Text Archive.

Paper “Wikipedia research and tools: Review and comments”

The draft paper “Wikipedia research and tools: Review and comments” by Finn Arup Nielsen (dated March 17, 2011) is a very useful 56-page resource highlighting key areas of research on Wikipedia (with citations to relevant work already published). The key areas identified are listed below. It cites 236 papers (with annotations!). Even though it is still a draft, it is a super valuable resource! Check the PDF file.

Identified key research areas: Quality, Factual errors, Coverage and bias, Actuality, Sources, Accessibility, Size across languages, Network analysis, matrix factorizations and other operations, Genre, Article feedback, Vandalism reversion, Biased editing, Use of Wikipedia in court, User contributions, User characteristics, Organization, Popularity, Why do people edit?, Why do people leave?, Why does it work?, Serving content, Using categories, Thesaurus construction, Translation, Trend spotting and prediction, Searching with Wikipedia, Databasing the structured content, Geography, Extending Wikipedia, Quality assessment, certification and rating, Automatic creation of content, Tables and databases, Semantic wikis, Form-based editing, Markup, Extended Authoring, Geographical extension, Extending browsing, Graphic extensions, Video extensions, Real-time editing, Distributed and disconnected Wikipedia, Wiki and programming, Using Wikipedia and other wikis in research and education, Attitude towards Wikipedia, Use of Wikipedia, Citing Wikipedia, Special wikis, Censorship, Carl Hewitt vs. Wikipedia, Wikipedia and wikis as a teach- project, Wikiversity, serves the purpose of building a resource for teaching and learning tool, Using wikis for course communication, Textbooks, Future.

Abstract: I here give an overview of Wikipedia and wiki research and tools. Well over 1,000 reports have been published in the field and there exist dedicated scientific meetings for Wikipedia research. It is not possible to give a complete review of all material published. This overview serves to describe some key areas of research.

Credits: Image by XKCD released under a Creative Commons Attribution-NonCommercial 2.5 License.

Video of evolution in time of the Wikipedia page about London bombings

History unfolding from phauly on Vimeo.

7 July 2005
08.50 London is struck by three bombs.
09.18 (just 28 minutes later) on Wikipedia, the user Morwen creates the page “7 July 2005 London bombings”.
10.38 76 different Wikipedians have already made 250 edits to this page, trying to make sense of reality in real time …
By the end of the day the Wikipedia page “7 July 2005 London bombings” has been edited 2581 times!

The video “History unfolding” shows how the Wikipedia page “7 July 2005 London bombings” evolved over time. Technically, I extracted all the revisions of the Wikipedia page from the API and took a screenshot of each of them using Firefox with the Page Saver extension running on an X virtual framebuffer (I tried khtml2png but I was unable to install it). Then I assembled all the screenshots into a video with mencoder and added the audio.
Wikipedia pages are released under the Creative Commons Attribution-ShareAlike License. The soundtrack I added is Unfinished History by Johaness Gilther, released on Jamendo as Creative Commons Attribution-NoDerivs. So my video is released under Creative Commons Attribution-ShareAlike License. Enjoy!
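
For the curious, here is a minimal Python sketch of the first step described above, listing all the revisions of a page through the MediaWiki API. It is not the script I actually used: the function names (`fetch_json`, `iter_revisions`, `revision_url`), the User-Agent string and the choice of the English Wikipedia endpoint are assumptions made for illustration. The screenshot and mencoder steps are not covered.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the English Wikipedia endpoint of the MediaWiki API.
API_URL = "https://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Send one GET request to the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace it with one that identifies you.
    request = urllib.request.Request(
        url, headers={"User-Agent": "history-unfolding-sketch/0.1"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_revisions(title, fetch=fetch_json):
    """Yield every revision of a page, oldest first.

    The API returns revisions in batches; each reply that is not the
    last one carries a "continue" object whose fields must be merged
    into the next request.
    """
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user",
        "rvlimit": "max",
        "rvdir": "newer",  # oldest revision first
    }
    while True:
        data = fetch(params)
        for page in data["query"]["pages"].values():
            yield from page.get("revisions", [])
        if "continue" not in data:
            break
        params = {**params, **data["continue"]}


def revision_url(revid):
    """Permanent URL of one revision, i.e. the page to take a screenshot of."""
    return f"https://en.wikipedia.org/w/index.php?oldid={revid}"
```

Calling `iter_revisions("7 July 2005 London bombings")` and passing each `revid` to `revision_url` gives the list of pages to capture. The `fetch` argument exists only so that the pagination logic can be tried without touching the network.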

The video is just one example of history unfolding before your eyes as it develops, of how people create their collective memories in real time.
We can now investigate how we, as a society, create our world, our perceptions of the past.
Now we can research past, present and future! And control it together!

“Who controls the past, controls the future; who controls the present, controls the past.”
Nineteen Eighty-Four – George Orwell

Wikipedia mentioned in books in 1975

UPDATE: Dami, in a comment to this post, says “if a word appears in a newer edition of an older work (e.g. in the introduction section of cheap reprints of public domain books) Google will count it as an appearance at the time the original work was published.” I checked and this is true, thanks Dami!

I was playing with Google Books Ngram Viewer, which allows you to check how frequently certain phrases occurred in books published from 1950 to 2008.
Curiously, the following graph reports that some books (only 0.0000011% but greater than zero anyway!) already contained the word “wikipedia” (and “wiki”) in 1950 and in 1975. Maybe there is a small bug even in the mighty Google services?

The following graph instead shows the increase (as expected) of mentions of “wikipedia” and “wiki” in books since 2003.

Percentage of men and women on different social networking sites (Facebook, Twitter, Linkedin, …)

Lots of debate arose around the fact that almost 87% of Wikipedia editors are male. This is not necessarily true, since the survey on which this “fact” is based has some biases (for example, respondents were self-selected).
However, a query run on the Wikipedia database showed that more than 83% of the users who specified a gender self-identified as male.
While these numbers are not 100% representative of reality, it is probably true that most editors are male. This is also acknowledged on a Wikipedia page about the systemic bias of Wikipedia (yes, I know this very page has been written by people whose bias we are trying to interpret but, going to the extremes, it’s turtles all the way down ;)

So the question could be: what is the ratio male/female on other social networking sites?

Just for comparison (and a bit for fun too), I compiled the following table based on the Social Network Analysis Report by Ignite Social Media. The table is sorted so that the first lines are the sites with relatively more females than males. I’m not familiar with all the sites, but it seems that the sites with relatively more women are the very social and playful ones (such as Habbo, Bebo, Myspace, Xanga, Facebook). On the other side of the spectrum there are sites populated mostly by males: sites showing what’s interesting right now thanks to social bookmarking, such as Reddit, Digg, Identi.ca, and “professional” network sites such as Linkedin and Plaxo.
This table is not “scientific” in any way either (for instance, percentages in the report are gathered from Google Ad Planner and Google Insights for Search).
Consider the following table just as more food for thought. Does it confirm your intuitions? Or should I say prejudices? ;)

  Social network site Percentage of females
Habbo 66%
Bebo 62%
Myspace 62%
Xanga 62%
Facebook 55%
Ning 55%
Hi5 52%
Meetup 52%
Tribe 52%
Twitter 52%
Yelp 52%
Flixster 50%
Foursquare 50%
Friendster 50%
Flickr 48%
Last.fm 48%
Livejournal 48%
Metafilter 48%
Multiply 48%
Plaxo 45%
Stumbleupon 45%
Badoo 43%
Mixx 43%
Linkedin 40%
Netlog 40%
Newsvine 40%
Plurk 40%
Identi.ca 34%
Digg 32%
Indianpad 24%
Reddit 24%

Credits: Icons by socialshift, elegantthemes and WpZoom.

Percentage of men and women on different Wikipedias

A few days ago there was an interesting article in the NYTimes about the small percentage of women on Wikipedia.
Today on the gendergap mailing list at Wikipedia there is a very interesting ongoing discussion. Some preliminary statistics from the discussion are:

Wikipedia language    Total registered users    Users who specified gender in preferences (%)    Men       Women    Women (%)
(unknown)             13959842                  2.01%                                            233312    46973    16.76%
(unknown)             1167708                   3.47%                                            35726     4800     11.84%
(unknown)             998668                    2.16%                                            18556     3054     14.13%
(unknown)             78180                     2.66%                                            1666      414      19.90%
Russian               620393                    16.80%                                           80491     23750    22.78%
(unknown)             414511                    3.64%                                            12106     2999     19.85%
(unknown)             368815                    2.92%                                            8977      1781     16.56%
(unknown)             1464442                   2.26%                                            27980     5070     15.34%
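
The two percentage columns can be recomputed from the raw counts. The following is only a sketch in Python, assuming that the first number in each row is the total number of registered users of that Wikipedia (the percentages only add up under this reading); the function name `gender_shares` is made up for the example.

```python
# (total registered users, men, women), copied from the rows of the table above.
ROWS = [
    (13959842, 233312, 46973),
    (1167708, 35726, 4800),
    (998668, 18556, 3054),
    (78180, 1666, 414),
    (620393, 80491, 23750),
    (414511, 12106, 2999),
    (368815, 8977, 1781),
    (1464442, 27980, 5070),
]


def gender_shares(total_users, men, women):
    """Return (% of users who specified a gender, % of women among them)."""
    specified = men + women
    pct_specified = round(100 * specified / total_users, 2)
    pct_women = round(100 * women / specified, 2)
    return pct_specified, pct_women


for total_users, men, women in ROWS:
    pct_specified, pct_women = gender_shares(total_users, men, women)
    print(f"{total_users:>10} {pct_specified:6.2f}% {pct_women:6.2f}%")
```

Note that the percentage of women is computed over the users who specified a gender (men + women), not over all registered users.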

It is interesting to note that on Russian Wikipedia users tend to state their gender much more often (16.80%!). Do you have any idea whether (1) this is a cultural issue specific to Russians, (2) it depends on the practices of the specific Wikipedia in Russian or (3) it depends on the user interface, for example it might be that when you register you are redirected to an HTML page in which you can also specify your gender?
Also interesting is the fact that in this Wikipedia the percentage of women is the highest (22.78%). Probably the reason is that in a place in which gender is stated more often, it is more normal for women to state it as well. Where gender is rarely stated, instead, it is in general unwise for women to explicitly say “Hey, I’m female!”, since this risks attracting (additional) unwanted messages. Or put in other terms, OMG Girlz Don’t Exist on teh Intarweb!!!!1.

Img by nojhan, under Creative Commons

Professor: What is an encyclopedia? Student: Is it something like Wikipedia?

I was viewing the presentation by Steven Walling titled “Why Wikipedians are the Weirdest People on the Internet” (embedded below) and the second slide was a tweet by alisonclement, which says:

Yesterday I asked one of my students if she knew what an encyclopedia is,
and she said, Is it something like Wikipedia?

Amazing! Changing times indeed. I remember when I was a kid and one of the most valuable things in our house was a 20-something-volume encyclopedia, admiringly and respectfully placed at the center of our best cupboard … ;)