Some colleagues of mine are working on “how people can reach a shared common dictionary/language to denote concepts” (or at least understand each other while still using their own keywords). See Advertising games. We want to test ideas using real data from the blogosphere. The idea is to detect when two bloggers are posting about the same concept/topic but use different names to tag it (the post’s category). For example, I use “trust and reputation”, someone else uses “reputation”, but we may be speaking about the same concept.
The questions:
– Is there an aggregated repository of posts with categories?
– If not, do you have any idea how I can collect this information?
– Posts must have an associated category (LiveJournal and Blogger don’t allow this, while MovableType and WordPress do).
Notes from our ongoing web search on the topic can be found at this wiki page, and this one too. Thanks for your help!
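
As a rough illustration of the detection idea described above, here is a minimal sketch of a crude baseline: compare the category names of two bloggers by word overlap (Jaccard similarity). The function names, the stopword list, and the 0.5 threshold are all assumptions made for this example, not part of any existing tool. A baseline like this only catches names that share words, such as “trust and reputation” and “reputation”; it cannot find categories that denote the same concept with entirely different names.

```python
import re

# Small, hypothetical stopword list; a real experiment would need a better one.
STOPWORDS = {"and", "or", "the", "of", "a", "an", "in", "on", "for"}


def tokens(category):
    """Lowercase a category name and return its set of content words."""
    words = re.findall(r"[a-z0-9]+", category.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def candidate_matches(cats_a, cats_b, threshold=0.5):
    """Return (category_a, category_b, score) triples whose names overlap.

    cats_a and cats_b are the category names used by two different bloggers.
    The threshold is an arbitrary assumption and would need tuning on real data.
    """
    pairs = []
    for ca in cats_a:
        for cb in cats_b:
            score = jaccard(tokens(ca), tokens(cb))
            if score >= threshold:
                pairs.append((ca, cb, score))
    # Highest-scoring candidates first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


if __name__ == "__main__":
    mine = ["trust and reputation", "travel"]
    theirs = ["reputation", "cooking"]
    for ca, cb, score in candidate_matches(mine, theirs):
        print(f"{ca!r} ~ {cb!r} (score {score:.2f})")
```

The pairs returned are only candidates: whether two bloggers really mean the same concept would still have to be checked against the posts themselves.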

  1. Riccardo "Bru" Cambiassi

    Just a dirty trick, but it could be useful:
    why not keep an eye on a social bookmarking service for common tags?

    I mean, when you bookmark something there, you can see under what tags other people bookmarked that specific page.
    When you have a big enough library, you could probably map different ontology systems / dictionaries.

    Moreover, this can be applied effectively to the specific resource (user): in your system, upon a new post, the system could go to the bookmarking service (or some similar in-house social bookmarking tool) and look for alternate tag names for each individual, based not on the topic name but on the meaning (the actual referenced item, in the case of bookmarks).
    hmm, dunno if I was clear enough… maybe I’ll post something about it :D

  2. farez

    yes, was also going to suggest social bookmarks for this, e.g., furl or stumble.

    furl allows optional star ratings to be attached to your bookmarks, and stumble has a -ve/+ve (like/dislike) rating scheme.


  5. paolo

    Thanks for the comments. We are deciding which direction to go in: delicious (easier) or blogs (cooler).
    It seems there is a website pinged by all the WordPress blogs (which are category-enabled) and, if there is, we will use this service to collect category-enabled posts.
    Again, thanks for the comments.
    I will let you know how it goes.
    Did I say “thanks”? ;-)

  6. Lilia

    I guess I’m a bit late :)

    I’d suggest the social bookmarking route: there is no good repository of weblog data by post, and getting data by post/category is even more difficult.

    In case you decide to go for weblogs, email/Skype me: we are struggling with getting post-based archives, so I can share all the horror stories :)

    Also –

