“Social networks of Wikipedia” paper accepted at HyperText 2011

My paper “Social networks of Wikipedia” has been accepted at the 22nd ACM Conference on Hypertext and Hypermedia. If you are also going to be in Eindhoven on June 6-9, 2011, please let me know!
If you are interested, you can read the entire paper; the abstract is below. We also released the source code (Python) at sonetlab, along with some network datasets extracted from User Talk pages (in GraphML format, so you can easily import them into your tool of choice; we like Gephi).


Network extracted from User Talk pages of Venetian Wikipedia visualized with Gephi.

Wikipedia, the free online encyclopedia anyone can edit, is a live social experiment: millions of individuals volunteer their knowledge and time to collectively create it. It is hence interesting to try to understand how they do it. While most of the attention has concentrated on article pages, a less known share of activity happens on user talk pages, Wikipedia pages where a message can be left for a specific user. These public conversations can be studied from a Social Network Analysis perspective in order to highlight the structure of the “talk” network. In this paper we focus on this preliminary extraction step by proposing different algorithms. We then empirically validate the differences in the networks they generate on the Venetian Wikipedia against the real network of conversations, extracted manually by coding every message left on all user talk pages. The comparisons show that both the algorithms and the manual process contain inaccuracies that are intrinsic to the freedom and unpredictability of Wikipedia growth. Nevertheless, a precise description of the issues involved allows us to make informed decisions and to base empirical findings on reproducible evidence. Our goal is to lay the foundation for a solid computational sociology of wikis. For this reason we release the scripts encoding our algorithms as open source, together with some datasets extracted from Wikipedia conversations, in order to let other researchers replicate and improve our initial effort.
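
To give a rough idea of what such an extraction step can look like, here is a minimal sketch in Python. It is not one of the algorithms from the paper: it simply assumes that every message on a user talk page ends with a signature link of the form `[[User:Name|...]]`, and it counts one directed edge from the signer to the owner of the page. The function name `extract_edges`, the input format (a dict from page owner to raw wikitext) and the English `User:` namespace prefix are all assumptions made for illustration.

```python
import re
from collections import Counter

# Matches a signature link such as [[User:Alice|Alice]] or [[User:Alice]].
# Assumption: the English "User:" namespace prefix; localized wikis use
# other prefixes. "[[User talk:...]]" links are deliberately not matched.
SIGNATURE_RE = re.compile(r"\[\[User:([^\]|#/]+)", re.IGNORECASE)


def extract_edges(talk_pages):
    """Return a Counter mapping (sender, recipient) to a message count.

    talk_pages: dict mapping a page owner's username to the raw wikitext
    of that owner's user talk page (hypothetical input format).
    """
    edges = Counter()
    for owner, wikitext in talk_pages.items():
        for match in SIGNATURE_RE.finditer(wikitext):
            sender = match.group(1).strip().replace("_", " ")
            if sender != owner:  # skip owners replying on their own page
                edges[(sender, owner)] += 1
    return edges


if __name__ == "__main__":
    pages = {
        "Alice": "Hi! [[User:Bob|Bob]] 10:00\n"
                 ":Thanks. [[User:Alice|Alice]] 10:05\n"
                 "Welcome [[User:Carol]]",
        "Bob": "Nice edit. [[User:Alice|Alice]]",
    }
    for (sender, recipient), weight in sorted(extract_edges(pages).items()):
        print(sender, "->", recipient, weight)
```

A heuristic like this misses unsigned messages and counts any user link as a signature, which is the kind of inaccuracy the abstract refers to when it compares automatic extraction with manual coding.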


3 Responses to this post.

  1. Posted by Przykuta on 18.04.11 at 9:18 am

    Huh, talk pages are good for newbies (or maybe new wikis too), but “old” Wikipedians use IRC, e-mail, mailing lists, IM (ICQ etc). I rarely talk with my colleagues from Wikipedia via their talk pages ;) So, we can’t describe the social structure well using only public talk pages. The visualisation is very nice. THX for the pdf.

  2. Posted by Masur on 18.04.11 at 9:18 am

    I wanted to add that we have discussed this issue (wiki as a social network) and were wondering whether user talk pages can reflect networking among users. As Przykuta noted above, the first dubious thing about using talk pages as indicators of networking is that they aren’t the major way of communicating, and in some instances they are a completely neglected way of “socializing”.

    Another thing is that they contain way too many insignificant records to be treated as a reliable indicator of networking: bot entries, warnings and notices based on templates, various invitations (to wikiprojects and so on) and single exchanges of information. Every week, as an admin, I leave hundreds of such entries on hundreds of different talk pages, and none of them means that I really form any kind of network (certainly not a social one) between me and other users.

    I think these issues should be considered before using talk pages as data for any kind of research.

  3. Posted by paolo on 18.04.11 at 9:18 am

    Thanks Przykuta and Masur for your comments (I’m going to write something on your talk pages on Wikipedia in order to form a (social) link ;)

    I’m aware that considering messages on talk pages is not perfect, but it is an easily measurable indicator. Whether this corresponds to social networking (and how much) is still to be seen. This is just a first step, and interviews with admins and other users would surely be very useful in order to understand how users use user talk pages and which other communication channels they use.

    So, thanks for your comments and … stay tuned! ;)
