Tag Archives: Semantic web

Use of Tags on del.icio.us follows a powerlaw

I read the wonderful Ontology is Overrated: Categories, Links, and Tags by Clay Shirky (highly recommended! Read it all!). Near the end, he speaks about “Tag Distributions on del.icio.us” and shows a graph that resembles a powerlaw (even though it covers only 2 hours of activity by 64 del.icio.users). After 2 weeks of powerlaws, I see powerlaws everywhere, and I thought “let’s try to test the hypothesis on a bigger dataset from del.icio.us”. Well, a few minutes of googling told me that many people had already had this idea and had already run tests on del.icio.us.
And of course many of them can be found by looking at http://del.icio.us/tag/powerlaw (the del.icio.us page that shows all the URLs tagged under “powerlaw”) [this is kind of uber-cool-self-referentialism].
Among the many, I just cite http://www.cozy.org/d/ (from which the image shown here is taken), where 84 popular URLs are studied and shown to exhibit a powerlaw structure (in the tags used for them). I suspect the value of del.icio.us can be found in the long tail of tagging as well.
Each dot on the log-log charts represents a tag. The most used tag appears on the left and the least used on the right. All charts have the same x and y range, 0.5 to 1350, so the slope of these lines is about -1.
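For anyone who wants to repeat this kind of check, here is a minimal sketch of how the slope on a log-log rank/frequency chart could be estimated. The function name `loglog_slope` and the tag counts are my own assumptions (the counts are synthetic, not del.icio.us data), and a plain least-squares fit on the logs is only a rough estimator, not necessarily what the cozy.org charts used.

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank).

    counts: positive usage counts, one per tag (hypothetical input).
    """
    ranked = sorted(counts, reverse=True)            # rank 1 = most used tag
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic, made-up counts following count = 1000 / rank,
# i.e. an ideal powerlaw whose log-log slope is -1.
synthetic_counts = [1000 / rank for rank in range(1, 51)]
slope = loglog_slope(synthetic_counts)
print(round(slope, 3))
```

On real tag counts the fitted value will only be approximately -1, and the sparse tail usually adds noise to the fit.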

Folksonomies criticism

I tend to be enthusiastic about folksonomies and forget to consider what they are good at and what they are not; basically, I forget to keep asking myself questions instead of blatantly stating “Here we need a folksonomy! Yeahhey!!!”. Anyway, as a sort of balance, you might want to read a post by Gene Smith and one by danah, who are more critical than I am (unfortunately).

New paper: Learning Contextualised Weblog Topics

I forgot about another paper I wrote: Learning Contextualised Weblog Topics (pdf) will be presented at the WWW 2005 2nd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics in Chiba, Japan, on May 10th, 2005. My boss was going to WWW2005 to present another paper, so we decided to submit our ongoing work to this workshop to get some feedback. We are still working on the system but we should be ready for prime time soon enough … stay tuned!
[I would have loved to meet Ethan Zuckerman, who is the invited speaker at this workshop and whose work on media attention is just delicious. (I even offered to help him code something for monitoring the Italian media world, but it’s too bad I’m so lazy.)]

If you like, check the paper Learning Contextualised Weblog Topics (pdf)
Abstract: In this paper, we examine how a topic-centric view of the Blogosphere can be created. We characterise the problems in aligning similar concepts created by a set of distributed, autonomous users and describe current initiatives to solve the problem. We introduce the Tagsocratic project, a novel initiative to solve the concept alignment problem using techniques derived from research in language acquisition among distributed, autonomous agents.

Tag your friends

The interface of Rojo is totally unusable (at least to me); I don’t understand the interface metaphors. What attracted me was the ability to tag your friends. So, a curiosity: how would you tag me?
Our vision is that the next generation of feed reading requires new forms of organization so we built in the ability to tag your world, your content, your feeds, and even your friends.

FolkOS: Folksonomy Operating System

We used to organize our bookmarks in folders; then del.icio.us came along and we now appreciate folksonomies (flat taxonomies, just a set of free keywords you can attach to URLs). We are used to operating systems that let us categorize files (knowledge) in folders. Would it make sense to have an operating system that lets us categorize files based only on a folksonomy (just add keywords to any file, and all the files live in a flat pool)? I don’t know.
What I do know is that the total lack of competition in the Operating Systems domain (actually just one global monopoly) is depriving all of us of new ideas, new paradigms, progress. If you compare it with the vibrant Web, where a new idea gets implemented and proposed almost daily, you can maybe see how much further along we would be if there were a free market for Operating Systems.
Anyway, what could we call it? What about FolkOS? FolkOS, the Folksonomy Operating System, I can already see the advertisements…. And, yes, I patented the idea, I got every possible TradeMark, and not only on Earth. I patented FolkOS also on Venus and Alpha Centauri (venusians and alphacentaurians beware! Don’t use my patented ideas! I have the best lawyers in the galaxy!).
[I tend to overload my emails with smilies (to show when I’m joking) but I don’t like them in blog posts, so I’m not sure my 4 readers understand when I (try to) make a joke. So, just to be sure, this is a joke … I think patenting computational ideas is total nonsense (maybe a video can help in understanding why)].
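Joke aside, the flat pool of tagged files is easy to prototype. What follows is only a toy sketch under my own assumptions: the class `TagPool`, its methods and the file names are all hypothetical, and a real operating system would obviously need persistence, permissions and much more.

```python
from collections import defaultdict

class TagPool:
    """Flat pool of files: no folders, only free keywords (tags)."""

    def __init__(self):
        # tag -> set of file names carrying that tag
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *tags):
        """Attach one or more tags to a file (tags are case-insensitive)."""
        for t in tags:
            self._files_by_tag[t.lower()].add(filename)

    def find(self, *tags):
        """Return the files that carry every given tag, sorted by name."""
        if not tags:
            return []
        # .get avoids creating empty entries for unknown tags
        sets = [self._files_by_tag.get(t.lower(), set()) for t in tags]
        return sorted(set.intersection(*sets))

# Hypothetical files and tags, for illustration only
pool = TagPool()
pool.tag("thesis.pdf", "trust", "papers")
pool.tag("foaf-slides.ppt", "foaf", "papers")
print(pool.find("papers"))
print(pool.find("papers", "trust"))
```

Asking for several tags at once plays the role that nested folders play today, except that the order of the tags does not matter.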

CiteULike: A free online service to organize your academic papers

[I’ll write something about my trip to Israel later on, as time permits]
I just found on HubLog an online service I was really waiting for: CiteULike (a prototype service to manage your personal library of academic papers). When you are logged in and visiting a page related to a paper, you can post that paper to your online library using a bookmarklet. In doing so, you can also specify tags, a list of keywords you’d like to associate with this article (a la del.icio.us and flickr) and optional notes. The service is very similar to del.icio.us (simple, tag-powered and social), but precisely tailored for academic papers. You can also see all the papers tagged under a certain tag (for example networks). Cool!

Repository of category-tagged blog posts: anyone?

Some colleagues of mine are working on “how people can reach a shared common dictionary/language to denote concepts” (or at least understand each other while still using their own keywords). See Advertising games. We want to test ideas using real data from the blogosphere. The idea is to detect when 2 bloggers are posting about the same concept/topic but use different names to tag it (the post’s category). For example, I use “trust and reputation”, someone else uses “reputation”, but we may be speaking about the same concept.
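Just to make the problem concrete, here is a naive baseline sketch. It is not the approach my colleagues are working on: it simply compares the vocabulary of the posts filed under two categories using Jaccard overlap. The function names and the example posts are made up for illustration.

```python
import re

def words(posts):
    """Set of lowercase words appearing in a list of post texts."""
    vocab = set()
    for text in posts:
        vocab.update(re.findall(r"[a-z]+", text.lower()))
    return vocab

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def category_similarity(posts_a, posts_b):
    """Vocabulary overlap between the posts of two categories."""
    return jaccard(words(posts_a), words(posts_b))

# Made-up posts: my category "trust and reputation" versus
# someone else's "reputation" and an unrelated "skiing" category.
mine = ["trust metrics for reputation systems"]
theirs = ["reputation systems and trust"]
skiing = ["skiing in the alps"]

print(category_similarity(mine, theirs))
print(category_similarity(mine, skiing))
```

A high overlap between two differently named categories would hint that they denote the same concept; with real posts one would at least want to drop common stop words first.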
The questions:
– Is there an aggregated repository of posts with categories?
– If not, do you have any idea how I can collect this information?
– Posts must have a category associated (LiveJournal and Blogger don’t let you do this, while MovableType and WordPress do).
Some of our ongoing web research on the topic can be found at this wiki page, and this one too. Thanks for your help!

(Late) report on FOAF workshop

The FOAF workshop in Galway was almost 20 days ago, so the following report is a little bit late. I hope it can be useful at least as a historical record.
It was fantastic to meet in the flesh many people I had only learnt to appreciate through their blogs. Many of the papers were very interesting. I especially liked the idea of “Semantic cookies” (you keep your profile [as a FOAF file] in a cookie and, with some trick, you give every site access to it, so sites can read it and give you a personalized experience) and “Bootstrapping the FOAF-Web: An Experiment in Social Network Mining” by Peter Mika (the idea is to use Google to infer social relationships among people). And there was also my paper, of course. The presentation was so-so; I think I tried to fit too many concepts into a 15-minute presentation. The only thing I liked was the subtitle I wrote at the last second on the first slide: “Moleskiing: Climbing the peaks of FOAF”.
Almost half of the workshop was devoted to very interesting Breakout sessions.