Papers about Wikipedia at CSCW 2010

A February report on a few papers about Wikipedia at the CSCW conference, by David Karger at the Haystack Blog (MIT CSAIL Research).
The papers briefly reviewed are:
* Socialization Tactics in Wikipedia and their Effects, by Choi, Alexander, Kraut and Levine: studied how participants’ early experiences of Wikipedia (whether they were invited or began editing on their own; whether their work was ignored, admired, or critiqued; what kind of advice they received) affected users’ later participation in and contributions to Wikipedia.
* The work of sustaining order in Wikipedia: The banning of a vandal, by Geiger and Ribes
* Readers are Not Free-Riders: Reading as a Form of Participation on Wikipedia, by Antin and Cheshire: the more you know about Wikipedia (sampled with a survey), the more you participate
* Egalitarians at the Gate: One-Sided Gatekeeping Practices in Participatory Social Media, by Keegan and Gergle: which breaking news stories are featured on the front page? They studied whether this decision is made in an egalitarian fashion or whether some individuals have significantly more power. Most interestingly, they found that certain ‘elite users’ who participate in the discussion to an unusually high degree do have inordinate power to “spike” stories, preventing them from appearing, but do not seem to have power to push stories they like into appearance.
* Beyond Wikipedia: Coordination and Conflict in Online Production Groups, by Kittur and Kraut. Interestingly, they studied Wikia.com, a service hosting over 6000 distinct wikis, all running on the same MediaWiki platform as Wikipedia. Because the implementation is uniform, it can be ruled out as a source of the different behaviors observed in different wikis.

Review of “Feedback Effects between Similarity and Social Influence in Online Communities”

Today I presented to the other SoNetters a wonderful paper titled “Feedback Effects between Similarity and Social Influence in Online Communities” by David Crandall, Dan Cosley, Daniel Huttenlocher, Jon Kleinberg and Siddharth Suri of Cornell University, presented at the 2008 KDD conference on Knowledge Discovery and Data Mining. My review is just below the slides I used for the presentation.

Besides the points already presented in the slides, here I add a few points relevant for our research on Wikipedia.

Social influence: People become similar to those they interact with
Interaction → similarity
Selection: People seek out similar people to interact with
Similarity → interaction

They considered registered users of the English Wikipedia who have a user discussion page (~510,000 users as of April 2, 2007). These users are responsible for 61% of edits to the roughly 3.4 million articles. The authors ignore actions by users without discussion pages, who tend to have very few social connections.

User’s activity vector v(t): number of times that he or she has edited each article up to that point in time t.
Similarity(u,v): similarity between the activity vectors of users u and v.
Time of first meeting for two users u and v = time at which one of them first makes a post on the user discussion page of the other.
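
To make these three definitions concrete, here is a minimal sketch in Python. The input formats (lists of `(timestamp, article)` and `(timestamp, author, page_owner)` tuples) and the function names are my own assumptions, and cosine similarity is just one common choice for comparing count vectors; the notes above do not say which measure the paper uses.

```python
from collections import Counter
from math import sqrt


def activity_vector(edits, t):
    """Count how many times a user edited each article up to time t.

    `edits` is an iterable of (timestamp, article) pairs (assumed format).
    """
    return Counter(article for ts, article in edits if ts <= t)


def cosine_similarity(u, v):
    """Cosine similarity between two activity vectors (an assumed measure)."""
    dot = sum(count * v[article] for article, count in u.items() if article in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no edits is similar to nobody
    return dot / (norm_u * norm_v)


def first_meeting(talk_posts, u, v):
    """Earliest time at which u posts on v's talk page or vice versa.

    `talk_posts` is an iterable of (timestamp, author, page_owner) triples
    (assumed format). Returns None if the two users never met.
    """
    times = [ts for ts, author, owner in talk_posts if {author, owner} == {u, v}]
    return min(times) if times else None
```

Plotting `cosine_similarity` for a pair of users at times before and after their `first_meeting` is the kind of curve the paper analyzes.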

In principle, one could also try to infer social interactions from posts to the same article’s discussion page. The authors report that using simple heuristics to infer interaction based on posts to article discussion pages produced results closely analogous to those obtained from analyzing user discussion pages.

They find that there is a sharp increase in the similarity between two editors just before they first interact (selection), with a continuing but slower increase that persists long after this first interaction (social influence).

They also create a model and estimate its unobservable parameters by maximum likelihood. The estimates are as follows:
* The probability of communicating versus editing was 0.058 (i.e. roughly 6 of every 100 actions are talk posts and 94 are page edits). We can cite it, and we can even verify it across different Wikipedias and at different time slots.
* When the action is an article edit, the article is chosen from one’s own interests with probability 0.35, from a neighbor’s interests with probability 0.081, from the overall interests of Wikipedia editors with probability 0.5, and by creating a totally new article with probability 0.069.
* When the action is a talk post, the user to communicate with is chosen randomly from the overall set of users with probability 0.71, and is someone who has engaged in a common activity with the complementary probability 0.29.
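
As a rough illustration of how these estimates fit together, the sketch below samples one action at a time using only the probabilities listed above. It is not the authors’ model or code: the names (`P_TALK`, `EDIT_SOURCES`, `TALK_TARGETS`, `sample_action`) are hypothetical, and the sketch only picks the kind of action and where its target comes from, not the actual article or user.

```python
import random

# Estimated probabilities as reported above; the names are illustrative only.
P_TALK = 0.058
EDIT_SOURCES = {
    "own_interests": 0.35,
    "neighbor_interests": 0.081,
    "overall_interests": 0.5,
    "new_article": 0.069,
}
TALK_TARGETS = {
    "random_user": 0.71,
    "shared_activity_user": 0.29,
}


def sample_action(rng):
    """Sample one action: ('talk' or 'edit', where its target is drawn from)."""
    if rng.random() < P_TALK:
        kind, table = "talk", TALK_TARGETS
    else:
        kind, table = "edit", EDIT_SOURCES
    choice = rng.choices(list(table), weights=list(table.values()), k=1)[0]
    return kind, choice
```

Calling `sample_action(random.Random(42))` many times and counting the outcomes is a quick sanity check that, for example, about 6% of sampled actions are talk posts.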

They also do some content analysis on 30 instances of two users meeting for the first time. They examined the content of the initial communication and any reply, looking for references to specific articles or other artifacts in Wikipedia, and they also compared the edit histories of the two users.
Of the 30 messages, 26 referenced a specific article, image, or topic. In 21 cases, the users had both recently worked on the artifact that was the subject of conversation.
The gap between co-activity and communication was usually short, often less than a day, though it stretched back three months in one case.
Informally, communications tended to fall into a few broad categories: offering thanks and praise, making requests for help, or trying to understand the editing behavior of the other person.
This sample of interactions suggests that people most often come to talk to each other in Wikipedia when they become aware of the other person through recent shared activity around an artifact. Awareness then leads to communication, and often coordination.

A really wonderful paper!

Two talks by David Orban in Trento on April 8th: The Open Internet Of Things, and

The SoNet FBK research group is happy to invite you to two talks by David Orban on April 8th in Trento.
The first talk, “The Open Internet Of Things”, will be about OpenSpime. It should appeal to you if you are interested in sensors, positioning devices and memory, social, Web 2.0-style services in the real world, green technology, tech applied to the environment, open hardware and software, communications protocols, and the future in general.
The second talk, “Preparing Humanity For The Impact Of Accelerating Technological Change”, will be about the Singularity University, a recent initiative funded by NASA, Google and more.
I’ll be waiting for you on April 8th!

First talk: The Open Internet Of Things
8 April 2009 - at 10.00 - Conference Room - Fondazione Bruno Kessler - Povo (TN) (up in the hills, see the map)
If we want the forthcoming Internet of Things to flourish, with its distributed smart sensor networks which take the current infrastructures for granted and base their necessarily autonomous activities on massive data collection, then we have to adopt an open architecture. Only an interoperable approach to the design of the next generation of hardware and software systems is going to be able to leverage the dramatic effects, and express the value to human civilization, that the network of tens, or thousands, of billions of new objects, the spime network, is going to shape. For more info see http://www.openspime.com

Second talk: Preparing Humanity For The Impact Of Accelerating Technological Change
8 April 2009 - at 15.00 - Conference Room - Fondazione Bruno Kessler - Trento (downtown, see the map)
The impact of advanced technologies on our societies is becoming more and more extreme, exposing new tensions in our models of human relationships, learning, and values in policies, politics, and business. While relinquishment has been recommended by some, it appears that the way ahead will be the use of more, not less, technology, as billions of people aim to achieve a high quality of life for themselves and their children. The Singularity University, recently formed on an open, international and interdisciplinary approach, employs an advanced curriculum to analyze how the future leaders of enterprise, culture, and science can best prepare to face the serious challenges ahead.

About the speaker:
David Orban is an entrepreneur and visionary. In recognition of his lifetime contribution to exponentially advancing technologies, he has been honored with the position of Advisor and European Lead to the prestigious Singularity University.
He is a Founder and Chief Evangelist of WideTag, Inc., a high technology start-up company providing the infrastructure for an open Internet of Things. David cuts across the limits of deep specialization to contribute to the new renaissance. He explains, “My vision is at the crossroads of technology and society as defined by their co-evolution.” David Orban’s personal motto is, “What is the question I should be asking?” This concept is his vehicle to accelerating cycles of invention and innovation in order to build the new world ahead.


Insights into relationships on Facebook

Interesting blog post by Cameron Marlow, research scientist at Facebook over at overstated.net: Maintained Relationships on Facebook.

They start from a simple question: is Facebook increasing the size of people’s personal networks?

They looked at the communications of a random sample of users over the course of 30 days and defined networks in 4 different ways:

  • All Friends: the largest representation of a person’s network is the set of all people they have verified as friends. In research papers this number ranges between 300 and 3000. On Facebook, the average user has 120 friends.
  • Reciprocal Communication: as a measure of a sort of core network, they counted the number of people with whom a person had had reciprocal communications, or an active exchange of information between two parties. In research papers, this number ranges from 3 (the people with whom one can discuss important matters, for Americans) to 10 or 20 (ongoing contacts at a university).
  • One-way Communication: the total set of people with whom a person has communicated.
  • Maintained Relationships: the set of people for whom a user had clicked on a News Feed story or visited their profile more than twice. This is a sort of over-the-shoulder relationship: I’m “following” (this is the relationship type) the target user without her necessarily knowing it. This is a new type of relationship (not really available, say, 50 years ago), similar to reading someone’s flow of thoughts via a blog or just looking at the pictures they upload to Flickr.
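
To see how the four definitions differ in practice, here is a small sketch that computes the four network sizes for one user from toy logs. The data layout (a set of friends, `(sender, recipient)` message pairs, `(viewer, target)` view events) and the function name are my assumptions, not anything Facebook has published; I also read “more than twice” as a count of views strictly greater than 2 and “one-way” as people the user has written to.

```python
from collections import Counter


def network_sizes(user, friends, messages, views):
    """Sizes of the four networks for `user` (hypothetical log formats).

    friends:  set of the user's confirmed friends
    messages: iterable of (sender, recipient) pairs
    views:    iterable of (viewer, target) pairs, one per News Feed click
              or profile visit
    """
    sent_to = {r for s, r in messages if s == user}
    received_from = {s for s, r in messages if r == user}
    view_counts = Counter(target for viewer, target in views if viewer == user)
    maintained = {target for target, n in view_counts.items() if n > 2}
    return {
        "all_friends": len(friends),
        "one_way": len(sent_to),
        "reciprocal": len(sent_to & received_from),
        "maintained": len(maintained),
    }
```

With real data one would also restrict all the logs to the same 30-day window used in the post.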

An interesting observation: “as a function of the people a Facebook user actively communicate with, you are passively engaging with between 2 and 2.5 times more people in their network”.

And another one: The stark contrast between reciprocal and passive networks shows the effect of technologies such as News Feed. If these people were required to talk on the phone to each other, we might see something like the reciprocal network, where everyone is connected to a small number of individuals. Moving to an environment where everyone is passively engaged with each other, some event, such as a new baby or engagement can propagate very quickly through this highly connected network.


Social Networks and Web 2.0 papers at WWW2009

The recently announced list of accepted papers at the WWW 2009 conference is at the end of this post. I’m particularly interested in the track “Social Networks and Web 2.0” and in the following papers:

  • Ulrik Brandes, Patrick Kenis, Juergen Lerner and Denise van Raaij. Network Analysis of Collaboration Structure in Wikipedia
  • Yutaka Matsuo and Hikaru Yamamoto. Community Gravity: Measuring Bidirectional Effects by Trust and Rating on Online (mentioning the Epinions dataset, maybe the dataset I released on Trustlet)
  • Shilad Sen, Jesse Vig and John Riedl. Tagommenders: Connecting Users to Items through Tags
  • Jérôme Kunegis, Andreas Lommatzsch and Christian Bauckhage. The Slashdot Zoo: Mining a Social Network with Negative Edges
  • Cristian Danescu Niculescu-Mizil, Gueorgi Kossinets, Jon Kleinberg and Lillian Lee. How opinions are received by online communities: A case study on Amazon.com helpfulness votes
  • Meeyoung Cha, Alan Mislove and Krishna Gummadi. A Measurement-driven Analysis of Information Propagation in the Flickr Social Network


My chapter in “Computing with Social Trust”

The book “Computing with Social Trust” is out. In it you can find a chapter by Paolo Avesani and myself about my PhD work on Trust in Recommender Systems. You can download my chapter or buy the dead-tree book from Amazon. The table of contents follows. Enjoy!


Happiness as a contagious virus: please spread it!

Some papers are worth more than others.
Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study by James H Fowler and Nicholas A Christakis.
Solid analysis based on data from 4739 individuals followed from 1983 to 2003.

Conclusions: People’s happiness depends on the happiness of others with whom they are connected. This provides further justification for seeing happiness, like health, as a collective phenomenon.

Objectives: To evaluate whether happiness can spread from person to person and whether niches of happiness form within social networks.

Results:
Clusters of happy and unhappy people are visible in the network, and the relationship between people’s happiness extends up to three degrees of separation (for example, to the friends of one’s friends’ friends).
People who are surrounded by many happy people and those who are central in the network are more likely to become happy in the future.
Longitudinal statistical models suggest that clusters of happiness result from the spread of happiness and not just a tendency for people to associate with similar individuals. A friend who lives within a mile (about 1.6 km) and who becomes happy increases the probability that a person is happy by 25% (95% confidence interval 1% to 57%). Similar effects are seen in coresident spouses (8%, 0.2% to 16%), siblings who live within a mile (14%, 1% to 28%), and next door neighbours (34%, 7% to 70%). Effects are not seen between coworkers. The effect decays with time and with geographical separation.

(credits: Photo by beija-flor released on Flickr under Creative Commons Attribution Noncommercial No Derivative license)

Kickoff meeting and public presentation for LiveMemories project with Ricardo Baeza-Yates from Yahoo! Research

On Wednesday, October 22nd 2008, the kickoff meeting for the LiveMemories project (Active Digital Memories of Collective Life), in which I’m involved, will take place in Trento. The public workshop is open to everybody (there will at least be a translation into Italian).
UPDATE: Now with a blog in Italian: http://lamemoriaaltempodiinternet.wordpress.com.
Check the program of the workshop or read it below, copied and pasted. There will be Ricardo Baeza-Yates, Director of the Yahoo! Research labs in Barcelona, speaking about the Impact of Social Networks; Alessandro Cavalli, Professore di Sociologia, Università di Pavia, speaking about “La Costruzione Sociale della Memoria Collettiva”; Simon Delafond, Web producer at the BBC, UK, speaking about the “BBC Memoryshare initiative”; plus presentations from the project partners and a collective discussion about “Quale modello per la libera circolazione della Memoria?”

I’m really looking forward to the event! If you are interested or you are coming, please let me know! See you!


My first paper published under Creative Commons!

Page-reRank: Using Trust to Re-Rank Authority
Some time ago I received a request to republish one of my papers in the book “Internet Search Engines - An Introduction”. So I took the chance to extend my paper “Page-reRank: using trusted links to re-rank authority” from 4 to 10 pages and gladly gave permission to include it in the book.
The publisher is ICFAI University Press, which of course is not Oxford Press; it is a publisher for Indian universities, and in fact after publication I received a few emails from Indian students.
Anyway, what I’m most proud of is that I have a Creative Commons released paper published in a book! When they asked me to publish it, I set this as a condition and they said “yes”. I had tried many other times to amend the copyright form publishers ask you to sign before publication (in general it basically says “you give us all the rights”) with something a bit more liberal such as a Creative Commons license, so I’m very happy about the license.
The license is a Creative Commons Attribution-Share Alike 3.0 License, so you can legally do whatever you want with the paper as long as you cite me and share what you produce under the same license.
Anyway, in the book I’m in good company: there is also a paper by Prabhakar Raghavan, head of Yahoo! Research, “Using PageRank to Characterize Web Structure”, and one by Ricardo Baeza-Yates, director of the Yahoo! Research labs in Barcelona, “Pagerank Increase under Different Collusion Topologies”.
This post is also an excuse for starting my blog on Nature.

Below is the summary of my paper as it appears in the book, but you can also download the paper from my site.

The tenth article titled “Page-reRank: Using Trusted Links to Re-rank Authority” by Paolo Massa, highlights that the present HTML linking mechanism does not allow the author of a web page to express the endorsements of its content. Consequently, algorithms like PageRank produce rankings that do not capture the different intentions of web authors. The authors explore the possibility of adding simple semantic extensions to the hyper linking mechanism, by using a large real world data set and demonstrate the different page rankings produced by considering extra semantic information in page links. The paper concludes that by adopting (programming) languages that allow authors easily encode simple semantic extensions to their hyperlinks, the web (or search) intelligence can be optimized to pull relevant pages for a given search query.

Blogging on Nature: why not?

Some months ago I was asked to open a blog on Nature. I haven’t been much in the mood for blogging lately, so I postponed the idea of opening the blog on Nature until now.
Partly I was also wondering about questions such as “Wow! A blog on Nature! How does it count? I’ll probably never have a paper in Nature, but a blog, yes. So what? How many blogs are there at the moment on Nature? A quick check says around 80. Uhm. This is not so exclusive. Will I put it in my curriculum? Probably not. Does a blog count as a paper? Surely not. Maybe things will be different in the future? For sure, but not too different”.
Anyway, if 10 years ago somebody had told me “one day, you will blog on Nature!”, I would have replied “No bet!” … well, actually 10 years ago the word “blog” was still to be proposed (the term “blog” was coined by Peter Merholz in April or May of 1999, according to the Blog page on Wikipedia as it is today), so maybe the reply would have been more of an “I will do what?!?”.
Nevertheless, blogging on Nature is surely about new ways of doing research and of publishing your ideas, so I’m in the game.
My blog on Nature is at http://network.nature.com/blogs/user/paolo-massa. The plan for now is to repost, and possibly extend, some of the posts related to trust and society that I publish at gnuband.org; for the future, I guess we’ll see.