Use of Tags on del.icio.us follows a powerlaw

I read the wonderful Ontology is Overrated: Categories, Links, and Tags by Clay Shirky (highly recommended! Read it all!). Near the end, he talks about “Tag Distributions on del.icio.us” and shows a graph that resembles a powerlaw (even though it covers only 2 hours of activity by 64 del.icio.users). After 2 weeks of powerlaws, I see powerlaws everywhere, so I thought “let’s test the hypothesis on a bigger dataset from del.icio.us”. Well, a few minutes of googling told me that many people had already had this idea and had already run their own tests on del.icio.us.
And of course many of them can be found at http://del.icio.us/tag/powerlaw (the del.icio.us page that shows all the URLs tagged “powerlaw”) [this is kind of uber-cool self-referentialism].
Among the many, I will cite just http://www.cozy.org/d/ (from which the image shown here is taken), where 84 popular URLs are studied and the tags used for them are shown to exhibit a powerlaw structure. I suspect the value of del.icio.us can be found in the long tail of tagging as well.
Each dot on the log-log charts represents a tag. The most used tag appears on the left, the least used on the right. All charts share the same x and y range, 0.5 to 1350, so the roughly diagonal lines have a slope of about -1.
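
If you want to check a slope like this yourself, here is a minimal Python sketch: rank the tag counts from most used to least used, take the logarithm of both rank and count, and fit a straight line by least squares. The function name loglog_slope and the synthetic_counts list are my own inventions for illustration; the numbers are made up (count = 1350 / rank), not real del.icio.us data. Keep in mind that a straight-line fit on a log-log chart is only a rough diagnostic, not a rigorous test for a powerlaw.

```python
import math


def loglog_slope(counts):
    """Least-squares slope of log(count) versus log(rank).

    Counts are ranked from largest (rank 1) to smallest.
    Zero or negative counts are dropped because their log is undefined.
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")

    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Ordinary least squares: slope = cov(x, y) / var(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic, made-up tag counts (NOT real del.icio.us data):
# the tag at rank r is used 1350 / r times, an ideal powerlaw.
synthetic_counts = [1350 / rank for rank in range(1, 51)]

print(round(loglog_slope(synthetic_counts), 3))
```

On real tag counts the dots will scatter around the line, especially in the long tail where many tags are used only once or twice, so treat the fitted slope as a rough summary.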
