Tag Archives: Folksonomy

Comments to “A cognitive analysis of tagging” by Rashmi Sinha

I wrote this comment on the great post A cognitive analysis of tagging (or how the lower cognitive cost of tagging makes it popular), but it does not appear in the comments, so I am posting it here.

Wow, I overenjoyed your short-enough essay. Extremely clear!
Might I suggest 3 additional topics you might want to consider and include in your struggle for understanding? I would do it myself but I’ll never be able to write as clearly as you ;-)

1) Visit http://cloudalicio.us/tagcloud.php?url=http://boingboing.net/
The graph shows the evolution over time of the tags used to tag a specific URL (in this case http://boingboing.net). You may notice that in the beginning people used “blogs” more and now people use “blog” more. This suggests people are moving from a category-like way of using del.icio.us (I put boingboing in the “blogs” folder that contains all the blogs) to a tag-like way of using del.icio.us (I name boingboing as a prototype of the class “blog”).
Someone made this point (surely more clearly) on some blog, but I could not find it again. Anyway, this is also true for other blogs, and it is real, living evidence.

2) At http://www.blumpy.org/tagwebs/ there is another “cognitive” approach to the tagweb (or tagspace or tagsphere).
I wrote about it at http://moloko.itc.it/paoloblog/archives/2005/02/04/tag_the_tag_tag_and_metadadaism.html
Jakob argues “a neuron in your brain is a lot like a tag in a tagweb”. A tagweb is a network of tags whose edges are the “this tag is tagged with this tag” relationship; for example, he tags the tag “Victoria” with the tag “female”.
Will it be possible/useful to let users tag the tags themselves?

3) Of course it would be better to have people tagging stuff in a way that makes sense to them but, as soon as tags are public (everyone can see them), there is concern about tag spam (I tag something with a certain tag so that other people will be exposed to it). This is not a problem when tags are private, for example for the tags you use in your Gmail account: no big deal in spamming yourself, no?
I wrote about it at http://moloko.itc.it/paoloblog/archives/2005/01/29/what_is_tag_spam_or_better_tag_spam_exists.html (where you can find interesting links). Or check the image at http://www.micropersuasion.com/2005/07/yahoo_myweb_bec.html
In order to make better tag systems (I think this is one of your goals), we must take this issue into account as well. Of course one simple solution would be to let you see only resources tagged by friends (flickr and Y!MyWeb2.0 let you do this) or by friends of friends, i.e. users deemed trustworthy by a simple and customizable trust metric. What do you think?
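
To make the "friends or friends of friends" idea in point 3 a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how flickr or Y!MyWeb2.0 actually work: the friendship graph is a plain dictionary, bookmarks are hypothetical (user, url, tag) tuples, and the "trust metric" is simply the number of friendship hops from me, customizable through max_hops.

```python
from collections import deque


def trusted_users(friends, me, max_hops=2):
    """Return the users reachable from `me` in at most `max_hops` friendship hops.

    `friends` is a hypothetical dict mapping a user to the users they list
    as friends. `me` is not included in the result.
    """
    hops = {me: 0}
    queue = deque([me])
    while queue:
        user = queue.popleft()
        if hops[user] == max_hops:
            continue  # do not expand beyond the trust horizon
        for friend in friends.get(user, ()):
            if friend not in hops:
                hops[friend] = hops[user] + 1
                queue.append(friend)
    return {user for user in hops if user != me}


def visible_bookmarks(bookmarks, friends, me, max_hops=2):
    """Keep only the (user, url, tag) bookmarks posted by trusted users."""
    trusted = trusted_users(friends, me, max_hops)
    return [b for b in bookmarks if b[0] in trusted]


if __name__ == "__main__":
    friends = {"me": ["ann"], "ann": ["bob"], "bob": ["spammer"]}
    bookmarks = [
        ("ann", "http://boingboing.net/", "blog"),
        ("spammer", "http://example.com/pills", "blog"),
    ]
    print(visible_bookmarks(bookmarks, friends, "me"))
```

With max_hops=1 only direct friends count; with max_hops=2 friends of friends are trusted as well. In this toy data the spammer is three hops away, so the spam bookmark is filtered out in both cases.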

AAAI05: terrific talk by Marty Tenenbaum

AI Meets Web 2.0: Building The Web of Tomorrow Today by Dr. Jay M. Tenenbaum.
Terrific terrific talk, fascinating. I should have podcasted it because you really missed something (except I have nothing to record audio on, would you consider sending me your old mp3 recorder pen?). I was so excited during the talk that I ended up taking a photo of almost every slide. Actually there were 94 slides and I photographed 59 of them! Incredible to me as well.
Anyway, you might want to read the slides (pdf) or maybe you want to have a look at my pictures (possibly as a slideshow).
He introduced all the stuff I enjoy, such as Blogs, RSS, wiki (wikipedia), folksonomies, tags, flickr, Del.icio.us, microformats (aka Lower case semantic web), technorati, pubsub, greasemonkey (bookburro, greasemap) and much more; all tied together in a fascinating, convincing, sensible manner!
After his presentation, we spoke about my research and he seemed interested. He invited me to visit commerce.net for one month or so and I have to say that I really like the idea. I also spoke with Rohit Khare, who is currently working with Tenenbaum and has a whole bunch of very clever, fascinating, realizable ideas that would really make an impact. They also underlined more than once that this kind of architecture/language-of-web2.0 project should be open source, and I totally agree with them.
Actually after the presentation, while I was speaking with Marty and Rohit, Jesse Andrews was there too, the creator of the mind-blowing book burro (actually he got most of the attention, totally deserved by the way). I guess it must be very cool to have someone present your hack at a conference and then go to meet that person and say “You know the Book Burro extension you presented? Well, I’m the creator of it!”. Cool! If you want to see what Jesse looks like, here is a picture of him, and expect some more great hacks from him in a few days.

Visualizing time trends in how a site is tagged on del.icio.us: cloudalicious

The previous entry was about “powerlaws in the use of tags on del.icio.us”. Then at http://del.icio.us/tag/powerlaw, I found Pietro Speroni’s great post Tagclouds and cultural changes that (also) introduces cloudalicious, a one-night project of Terrell Russell. Cloudalicious shows the evolution over time of the tags used to tag any page on del.icio.us. Very very cool!!!
I tried to find a URL that showed a non-converging behaviour but I failed. (Pietro already provided some examples of sites presenting interesting trends in tag use.) Are you able to find at least one controversial URL? That is, a site for which there was a great shift over time in the tags used for it.
For your information, I already tried sites tagged on del.icio.us under controversial tags (such as abortion, scientology, jew). I also tried microsoft.com, as I thought many people would have tagged it as evil, but this is not the case. In general people tend to tag what they like rather than what they don’t like, so as not to increase its visibility. So I tried “terri schiavo blog”, which was very visible for a short period of time: I suspected that tags such as “tasteless” or “awful” would be numerous and growing over time, but this is not the case either.
The only one with a little bit of variance over time I was able to find is boingboing.net. See cloudalicious for http://boingboing.net. Del.icio.users seem to recognize it as a news site as time passes by. And it also seems that Del.icio.users are moving from “blogs” to “blog” as a tag (common pattern or just for boingboing?).
There is some variance also with http://del.icio.us itself: see cloudalicious for http://del.icio.us
So I just repeat the small challenge: Can you find a URL that presents non-converging tag use?

Small suggestion for Terrell Russell (I write it here since I was not able to find his email address on his web site). [I’m sure he has probably already figured out this suggestion by himself, since he was good enough to put together a great tool in one night!]
The Cloudalicious interface at the moment asks for URLs that “can be found at del.icio.us – they’re the red “and X other people” links” (for example, http://del.icio.us/url/ec08a8ddfda4f2f9cad3a142dc49e23b represents http://boingboing.net/).
ec08a8ddfda4f2f9cad3a142dc49e23b is the md5sum of http://boingboing.net/
There are 2 easy ways to obtain it automatically: (1) run md5sum on the server, (2) use http://del.icio.us/url?url=http://… (in which http://… can be replaced by the website we want to cloudalicious).
In this way, users could enter in the Cloudalicious interface the real URL they are interested in (http://boingboing.net) and not the harder to find one (http://del.icio.us/url/ec08a8ddfda4f2f9cad3a142dc49e23b).
A bookmarklet and a greasemonkey extension (working on the site the user is browsing) are left as an easy exercise for the reader as well ;-)
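
Option (1) above can be sketched in a few lines of Python. This is just a minimal illustration, assuming (as the boingboing example suggests) that the identifier is the plain MD5 hex digest of the URL string exactly as typed; the function names are mine, not part of Cloudalicious or del.icio.us.

```python
import hashlib


def delicious_url_hash(url: str) -> str:
    """Return the MD5 hex digest of the URL string, encoded as UTF-8."""
    return hashlib.md5(url.encode("utf-8")).hexdigest()


def delicious_history_url(url: str) -> str:
    """Build the del.icio.us per-URL page address from the real URL."""
    return "http://del.icio.us/url/" + delicious_url_hash(url)


if __name__ == "__main__":
    # The user types the real URL; the tool derives the del.icio.us one.
    print(delicious_history_url("http://boingboing.net/"))
```

Note that the hash is computed on the exact string, so http://boingboing.net and http://boingboing.net/ (with the trailing slash) give two different identifiers; the interface would have to decide which form to look up.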

Lastly, let me mention one of the key points of Clay Shirky in Ontology is Overrated: Categories, Links, and Tags, which is also present in Pietro’s post: the correct way of categorizing something does not exist (the initial Yahoo! approach tried to force this and failed, and librarians still (must) adopt this simplifying but wrong assumption). Instead there are as many correct ways of categorizing a thing as there are users. This resonates with my study on controversial users on Epinions (pdf): the idea that there is a global value of trustworthiness/reputation for every user/peer in the system does not make sense, but still most of the papers in the reputation/trust literature start with this wrong and misleading assumption.

UPDATE: I just found it now, but Pietro in On Tag Clouds, Metric, Tag Sets and Power Laws had already mentioned that the paper by Clay Shirky “Power Laws, Weblogs, and Inequality” started to be tagged as longtail only after the Wired article The Long Tail came out. See cloudalicious for http://www.shirky.com/writings/powerlaw_weblog.html.

Use of Tags on del.icio.us follows a powerlaw

I read the wonderful Ontology is Overrated: Categories, Links, and Tags by Clay Shirky (highly recommended! Read it all!). Near the end, he speaks about “Tag Distributions on del.icio.us” and shows a graph that resembles a powerlaw (even if this is about only 2 hours of activity of 64 del.icio.users). After 2 weeks of powerlaws, I see powerlaws everywhere and I thought “let’s try to test the hypothesis on a bigger dataset from del.icio.us”. Well, a few minutes of googling told me that many people had already had this idea and already performed tests on del.icio.us.
And of course many of them can be found looking at http://del.icio.us/tag/powerlaw (the del.icio.us page that shows all the URLs tagged under “powerlaw”) [this is kind of uber-cool-self-referentialism].
Among the many, I just cite http://www.cozy.org/d/ (from which the image shown here is taken), where 84 popular URLs are studied and shown to exhibit a powerlaw structure (in the tags used for them). I suspect the value of del.icio.us can be found in the long tail of tagging as well.
Each dot on the log-log charts represents a tag. The most used tag appears to the left while the least used appears to the right. All charts have the same x and y range, .5 to 1350; so the slope of these lines is about -1.
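
If you want to check the "slope of about -1" claim on your own data, here is a minimal sketch in Python. It is my own illustration, not the method used at cozy.org: it assumes you already have, for one URL, the number of times each tag was used, ranks the tags from most to least used, and fits a straight line by ordinary least squares to log(count) against log(rank).

```python
import math


def loglog_slope(tag_counts):
    """Least-squares slope of log(count) versus log(rank).

    `tag_counts` is any iterable of positive usage counts, one per tag.
    A slope close to -1 is what a Zipf-like powerlaw would give.
    """
    ranked = sorted((c for c in tag_counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two tags with a positive count")
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(count) for count in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic example: the tag at rank r is used 1350 / r times.
    synthetic = [1350 / rank for rank in range(1, 51)]
    print(round(loglog_slope(synthetic), 3))
```

A least-squares fit on a log-log plot is only a rough diagnostic (a straight-looking line does not prove a powerlaw), but it is enough to compare the slope of your favourite URL with the charts above.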

Folksonomies criticism

I tend to be enthusiastic about folksonomies and forget to consider what they are good at and what they are not; basically I forget to keep asking myself questions instead of blatantly stating “Here we need a folksonomy! Yeahhey!!!”. Anyway, as a sort of balance, you might want to read a post by Gene Smith and one by danah, who are more critical than I am (unfortunately).

New paper: Learning Contextualised Weblog Topics

I forgot about another paper I wrote: Learning Contextualised Weblog Topics (pdf) will be presented at the WWW 2005 2nd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics in Chiba, Japan, May 10th 2005. My boss was going to WWW2005 to present another paper and so we decided to submit our ongoing work to this workshop to get some feedback. We are still working on the system but we should be ready for prime time soon enough … stay tuned!
[I would have loved to meet Ethan Zuckerman, who is the invited speaker at this workshop and whose work on media attention is just delicious. (I even proposed to help him in coding something for monitoring the Italian media world but it’s too bad I’m so lazy)]

If you like, check the paper Learning Contextualised Weblog Topics (pdf)
Abstract: In this paper, we examine how a topic-centric view of the Blogosphere can be created. We characterise the problems in aligning similar concepts created by a set of distributed, autonomous users and describe current initiatives to solve the problem. We introduce the Tagsocratic project, a novel initiative to solve the concept alignment problem using techniques derived from research in language acquisition among distributed, autonomous agents.

Tag your friends

The interface of Rojo is totally unusable (at least to me); I don’t understand the interface metaphors. What attracted me was the ability to tag your friends. So a curiosity: how would you tag me?
Our vision is that the next generation of feed reading requires new forms of organization so we built in the ability to tag your world, your content, your feeds, and even your friends.

FolkOS: Folksonomy Operating System

We used to organize our bookmarks in folders, then del.icio.us came and we now appreciate folksonomies (flat taxonomies, just a set of free keywords you can attach to URLs). We are used to operating systems that allow us to categorize files (knowledge) in folders; would it make sense to have an operating system that allows us to categorize files based only on a folksonomy (just add keywords to any file, all the files are in a flat pool)? I don’t know.
What I know is that the total lack of competition in the Operating Systems domain (actually just one global monopoly) is depriving all of us of new ideas, new paradigms, progress. If you compare it with the vibrant Web, where a new idea gets implemented and proposed almost daily, you can maybe see how far we would be if there were a free market for Operating Systems.
Anyway, what could we call it? What about FolkOS? FolkOS, the Folksonomy Operating System, I can already see the advertisements…. And, yes, I patented the idea, I got every possible TradeMark and not only on Earth. I patented FolkOS also on Venus and Alpha Centauri (venusians and alphacentaurians beware! Don’t use my patented ideas! I have the best lawyers of the galaxy!).
[I tend to overload my emails with smileys (for expressing when I’m joking) but I don’t like them in blog posts, so I’m not sure my 4 readers understand when I (try to) make a joke. So, just to be sure, this is a joke … I think patenting computational ideas is total nonsense (maybe a video can help in understanding why)].