Tag Archives: Semantic web

Report of Conference on Business Information 2008

I spent the beginning of the past week in Innsbruck for the 11th International Conference on Business Information Systems.
My presentation went well but I’ll post about it later. Overall the conference was interesting and worth the trip.
Many talks mentioned the Semantic Web. What surprised me, very positively, was that the approach to the Semantic Web was very pragmatic in all the presentations: a sort of Pragmatic Web or, as I prefer, a lowercase semantic web.
One peak of the conference was a great keynote speech by Fabio Ciravegna titled “Challenges and Methodologies for Acquiring and Sharing Knowledge in Large Distributed Environments”. He presented the approach of his group at the University of Sheffield to knowledge capture, which is very pragmatic and just makes sense. Among other things, he reported that, having noticed that workers in a big company (Rolls-Royce) were creating Word and Excel forms and passing them around via email, they decided to provide a simple web interface for creating forms. This simple change allowed a lot of interesting services to be built on top of it, services which use semantics when it adds value and not for the sake of it. I cannot summarize his many interesting points here but you might want to check his slides (from a different presentation) at around page 71, or just his Web page, which lists the many projects in which semantics is used in a pragmatic, reasonable and value-adding way.
Another peak was a great tutorial by Emanuele Dalla Valle titled “RSWA 2008 – Realizing a Semantic Web Application”. He explained step by step how to develop a Semantic Web application that expects a music style as input, retrieves data from online music archives and event databases, merges them and lets the users explore events related to artists that play the requested style. He put Semantic Web technologies to the test on the Web 2.0 ground of building a mash-up that reuses, transforms and combines existing data taken from the open Web (namely MusicBrainz, MusicMoz and EVDB). Again, a clever use of semantics when semantics can add some value, and a clear explanation.
I suggested that he record this tutorial next time and put the video somewhere on the Web, because it is really a great example (the first I’ve seen) in which the Semantic Web really adds value over simpler ways of developing applications (Web 2.0). For now you can just check his slides (released under a Creative Commons license). Also check the Semantic Web Activities group at Cefriel, which has many interesting projects and ideas.
And there were a few other peaks: Couchsurfing is always a great experience which never ceases to amaze me. We were 6 people (Khrista and Sarah, 2 Canadian girls; Bruno, a Dutch guy who is spending one year traveling around Europe, see useuropeans.com; myself; and Manuel and Yvonne, our 2 lovely hosts) sleeping in a small house with mattresses everywhere.
And I started twittering thanks to a push from Andre’ Passant at the conference, who also helped me keep my foaf file always up to date by automatically including the exports from facebook, flickr and other web2.0 services. For now, however, the foafing hasn’t really worked out.
And I also started geocaching thanks to jailway: after the conference dinner we found my first cache near the Golden Roof.
Summarizing: lowercasesemanticwebbing, couchsurfing, geocaching, twittering, foafing, and some more *ing …

A Semantic Mobs Manifesto for the (r)Evolutionary Web: rejected!

One night, many days ago, Bru and I had a night divertissement (as I liked to call it). During a funny Skype session, we created a paper for SWAP2005 (Semantic Web Applications and Perspectives, 2nd Italian Semantic Web Workshop, Trento, Faculty of Economics, 14-15-16 December, 2005). The title of the paper is “A Semantic Mobs Manifesto for the (r)Evolutionary Web” (pdf). Since the conference is in Trento, I’ll probably go anyway, so the idea was to get one more publication (is there another reason for sending a paper to a conference? ;-)). As I already said, it was a night divertissement: it took us a few hours to create it, and most of that time was spent chatting about the possible title. We skyped some really improbable titles, as far as I remember. And it was a lot of fun.
Anyway, a few days ago I received an email saying that the paper was rejected (the reviews follow below in case you are curious). I think the reviewers did the right thing in rejecting it. It was not a serious contribution to science but more a provocation (and a funny-for-us night divertissement).
So how did we create the paper? We took verbatim a blog post by Ryan King titled “An Evolutionary Revolution – On the shoulders of giants” and we inserted it in the paper. Since the blog post was released under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 licence, we could import it legally: we of course gave credit to Ryan King in the paper, we re-released the paper under the same licence, and the paper was not a commercial work.
Then we added a short quibble about how, with the Semantic Web envisioned in the conference, a paper like this one could easily be created by some software tool, especially when in the near future the number of Creative Commons released texts will be huge. The last lines of the abstract hinted at a Matrix-like scenario in which (human) researchers will no longer be needed. The title was entirely Bru’s fault. Don’t tell him but I think we got rejected because of the title ;-)
So, well, enjoy it. It is released under a Creative Commons licence: respect the licence and do whatever you want with it. Yes, you might even want to cite it in a “real” paper. That would lead to a larger point about “what are conferences for in an era of free, decentralized publishing?” but I guess you will have to wait for another post for that. Anyway, don’t worry, this is not an unexpected or clever post, nothing more than our rejected submission for SWAP ;-)

Below you find the reviews we got and, at the very bottom, the text version of the paper.
I wonder if Danny Ayers was one of the reviewers since he writes “I nearly had a dilemma over whether to give something a positive rating simply because it was really cool, rather than bringing significant academic value to the field. Again fortunately for me the material in question did have value in the latter sense as well, so I could call it a Clear Accept without any ethical worries.” But it can’t be because our paper has no value in the latter sense ;-)

Presentation in standard format, S5

Some days ago I had to give a presentation for the 2K* symposium, a joint initiative of research groups from different IT institutions, based in Trento and in Genova. The 40-minute presentation was titled “Trust in Recommender Systems: an historical overview and recent developments” (check the source code!). It is heavily based on an old presentation; I just added some slides about microformats, a concept I wanted to convey to the audience.
Anyway, I took the occasion to try to create the presentation in HTML using S5: A Simple Standards-Based Slide Show System developed by Eric Meyer. I think I will create all my future presentations in S5 from now on. The advantages: it “forces” you to keep the slides simple (no unnatural flow of information) and short (however you can have animations, check this slide); it is easy to publish the presentation on the Web; anyone can link to a specific slide; search engines find the information and index it; it is highly standard, evolutionary and in the small-pieces-loosely-connected philosophy (for example it would be possible to write a small piece of javascript code that collects slides from different presentations in some meaningful automatic way to create a new presentation, but the possibilities are endless of course, especially if using the S5 format based on the XOXO microformat); I can create the presentation with whatever text editor (perfect if you are in text mode); and it does not require the viewer to have some fancy program (openoffice for the freedom lovers, powerpoint for the others) since a browser suffices.
You can find many presentations in S5 format in the microformats wiki; I also liked this presentation of Firefox, with style vulpes-flagrans or with style greenery. Yes, I know the style I used for my presentation is not that great; if someone with graphical skills would like to create a style for me, it would be very much appreciated of course.
To start playing with S5, I suggest the S5 primer (you need to download the HTML code and edit it) or S5present, an open-source web-based slideshow application (you just create an identity there and then use the site for creating the presentation). Guess what? S5 Presents was written in under 10 hours and 500 lines of code using the fantastic Ruby on Rails framework.
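The idea of collecting slides from different presentations can be sketched in a few lines. This is only a minimal sketch, in Python rather than javascript, and it assumes that each S5 slide is a `<div class="slide">` element; the class name `S5SlideParser`, the function `slides_matching` and the keyword filter are made up for the illustration and are not part of S5.

```python
from html.parser import HTMLParser


class S5SlideParser(HTMLParser):
    """Collect the plain text of every <div class="slide"> in one document."""

    def __init__(self):
        super().__init__()
        self.slides = []    # finished slides, as plain text
        self._depth = 0     # <div> nesting depth inside the current slide
        self._chunks = []   # text pieces of the slide being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth > 0:
            # A nested <div> inside a slide: remember it so that its
            # closing tag does not end the slide too early.
            self._depth += 1
        elif "slide" in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Normalise whitespace and store the finished slide.
                self.slides.append(" ".join(" ".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._chunks.append(data)


def slides_matching(documents, keyword):
    """Return the text of all slides, across documents, mentioning keyword."""
    matches = []
    for html_text in documents:
        parser = S5SlideParser()
        parser.feed(html_text)
        parser.close()
        matches.extend(s for s in parser.slides if keyword.lower() in s.lower())
    return matches
```

Calling `slides_matching` with the HTML of several presentations and a word such as "microformats" returns the matching slides in document order; turning them back into a new S5 file is left out of the sketch.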


A Microformat for grouping all your identities?

Jesse comments on my Identity Burro post in which I spoke about OpenID as a possible method for tying together the various ids you have on social sites (flickr, del.icio.us, …). I want to be able to say that on flickr I’m phauly and on del.icio.us I’m paolomassa and on 43things I’m mariah, etc. He ponders 2 solutions, a centralized and a decentralized one. I’m totally for the decentralized solution. I was suggesting OpenID but, to be sincere, I still need to digest OpenID properly: I can feel it is a great idea but I still need to understand all its power (and how to use it).
Actually I think a microformat would be killer for this. Jesse says: “Distributed solution – people can embed their information on their homepage, which can be mined by a greasemonkey script. If I want to know Paolo’s del.icio.us, flickr, 43things, … I need to visit his homepage and grab his list.”
I would add: they can embed this information … using a microformat and hence adding some simple semantics and a possibility to thousands of services to bloom!
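To make the “embed and mine” idea concrete, here is a minimal sketch in Python. No microformat for this has been agreed on at this point, so the markup is an assumption: the sketch borrows the `rel="me"` link value from XFN as a stand-in, and the profile URLs in the sample are only illustrative shapes, not checked addresses.

```python
from html.parser import HTMLParser


class IdentityLinkParser(HTMLParser):
    """Collect the href of every <a> whose rel attribute contains "me"."""

    def __init__(self):
        super().__init__()
        self.identities = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # rel can hold several space-separated values, e.g. rel="me nofollow".
        rel_values = (attr_map.get("rel") or "").split()
        href = attr_map.get("href")
        if "me" in rel_values and href:
            self.identities.append(href)


def extract_identities(html_text):
    """Return the identity URLs a homepage claims, in document order."""
    parser = IdentityLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.identities


# Hypothetical homepage fragment (assumed markup, illustrative URLs).
SAMPLE = """
<ul>
  <li><a rel="me" href="http://flickr.com/photos/phauly">my photos</a></li>
  <li><a rel="me" href="http://del.icio.us/paolomassa">my bookmarks</a></li>
  <li><a href="http://example.org/">not an identity</a></li>
</ul>
"""

if __name__ == "__main__":
    for url in extract_identities(SAMPLE):
        print(url)
```

The point of the sketch is only that, once the claim is in the markup, any service (or greasemonkey script) can read the list without asking a central site. It does nothing to verify the claims.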
So, following the microformats process, I’m going to send an email to the mailing list to see if there is interest. Then we will document current human behavior on the microformats wiki: are people already writing on their blogs what their identities on social sites are? Do they already do it using some format? Are there already formats for expressing your identities? And then I guess we’ll see what happens.
I can hear someone asking “Attacks? Spamming?”. Yes, on my blog I can claim my identities are boingboing (blog), danah (photos), ethanzuckermann (URLs on del.icio.us), etc. But this is no different from today, when I can open a blog on blogger and claim I’m bill gates or the pope. Or I can leave comments on anyone’s blog as Scoble writing “Microsoft is watching you”, no? Read “What about spam?” on the OpenID.net homepage to get an idea.

A lot of available RDF data

I think RDF is a bit too complicated to be embraced in these times of “bottom-up” evolution. Anyway the biggest problem was (at least in my mind) lack of data.
But today I found a lot of RDF data at rdfdata.org. I haven’t even started thinking about all the cool services you could build with it, since I don’t want to spend the next days diverting from what I should do (writing the thesis). And yes, some are more interesting than others:
Metadata about Elvis impersonators [RDF] (2005-04-01) At last, the semantic web is complete. Extensive metadata about 81 Elvis impersonators, some with scary videoclips. (slurred southern accent:) “Thank you, thank you very much.” ;-)
(via Leigh Dodds)

AAAI05: terrific talk by Marty Tenenbaum

AI Meets Web 2.0: Building The Web of Tomorrow Today by Dr. Jay M. Tenenbaum.
Terrific, terrific talk, fascinating. I should have podcasted it because you really missed something (except I have nothing to record audio on; would you consider sending me your old mp3 recorder pen?). I was so excited during the talk that I took a photo of almost every slide. Actually there were 94 slides and I photographed 59 of them! Incredible to me as well.
Anyway, you might want to read the slides (pdf) or maybe you want to have a look at my pictures (possibly as a slideshow).
He introduced all the stuff I enjoy, such as Blogs, RSS, wiki (wikipedia), folksonomies, tags, flickr, Del.icio.us, microformats (aka Lower case semantic web), technorati, pubsub, greasemonkey (bookburro, greasemap) and much more; all tied together in a fascinating, convincing, making-sense manner!
After his presentation, we spoke about my research and he seemed interested. He invited me to visit commerce.net for one month or so and I have to say that I really like the idea. I also spoke with Rohit Khare, who is working with Tenenbaum and has a whole bunch of very clever, fascinating, realizable ideas that would really make an impact. They also stressed more than once that this kind of architecture/language-of-web2.0 project should be open source, and I totally agree with them.
Actually after the presentation, while I was speaking with Marty and Rohit, there was also Jesse Andrews, the creator of the mind-blowing book burro (actually he got most of the attention, totally deserved by the way). I guess it must be very cool to have someone present your hack at a conference and then go to meet that person and say “You know the Book Burro extension you presented? Well, I’m the creator of it!”. Cool! If you want to see what Jesse looks like, here is a picture of him. Expect some more great hacks from him in a few days.

GreaseMonkey on Trenitalia.com

Trenitalia.com (Italy’s public railways) has some links that work only on IExploder. Until a few weeks ago you could do nothing but send tons of emails asking Trenitalia.com to support standards (you can also sign a petition for Making Internet Explorer Standards Compliant and hope).
BUT NOW you can GreaseMonkey it! [you need the great Firefox browser] Install the Trenitalia Link Fixer script, which fixes the broken links on the Trenitalia website.
(via blackbirdblog)

GreaseMonkey is the real Semantic Web (and now works on HospitalityClub)

GreaseMonkey is an extension for Firefox that allows you to totally (and easily) change the layout of any web page you receive. Don’t like the color of the banner of that_site.com? You can change it! Do you prefer to have the login link on the_other_site.org on the right? You can place it wherever you want! While visiting the page of a certain book on Amazon.com, do you want to see the prices other sites ask for the same book (with this information embedded in the “original” Amazon page)? You can do it (with the BookBurro extension)! Want to hide every Google AdSense ad forever? You can do it! You can find hundreds of scripts (for hundreds of different sites) over at the GreaseMonkey UserScripts wiki or you can easily create yours (as I did, see the end of this post).
Oh yes, this will blow up your business model and “any kid with a bright idea and a knack for DHTML can create a new interface for your site, and it will probably be better than yours.”
And yes, this is much much more real (and useful) than all the Semantic Web you hear about at conferences (with tons of papers and tons of highly funded programs that, at least at the moment, produce almost nothing you can use and play with; if I’m wrong, use the comments to point out interesting stuff).
Anyway, I played a bit with GreaseMonkey. I recommend diveintogreasemonkey by Mark Pilgrim and I suggest you follow it step by step (this is faster than trying to jump to what you need, because you will end up jumping back once you understand that what you skipped was important).
And eventually, I created 2 GreaseMonkey scripts for HospitalityClub that I think can save me a lot of time in using the site. I used HospitalityClub to find hospitality in Trieste when I was attending the School on Networks (thanks truesmile and inquis), I used it to find hospitality in Pittsburgh where I’ll be for the AAAI conference (thanks roder) and yesterday I wanted to use it to find hospitality for my (short) holidays in Italy [not going to tell where]. The problem with HospitalityClub is that the interface is not very usable. My usual use case is the following: I search for all the people offering hospitality in the place where I want to go, and I send all of them the same request. This requires visiting the list of users, clicking on every username to go to her userpage and, on the userpage, clicking on “send message to this user”, which leads to a new page, then copying my name into a field, my passport number into another field and the request text into a text area, and pushing Submit. All these steps must be done for every user!
So I created a GreaseMonkey extension that adds a link near every username: the link takes you directly to the “send message” page.
      [ script: hospitalityclub_addSendMsgLink.user.js ]
And I created another extension that prefills the fields in the “send message” page with the default values (my username, my passport number, the request message).
      [ script: hospitalityclub_defaultValuesInMsg.user.js ]
In this way you just have to push Submit. It would be possible to push Submit automatically with the extension but I wanted to keep some control … interestingly, GreaseMonkey gives you so much power that your small brain is no longer able to manage it. I mean, for example, I have at least 4 extensions that modify google.com pages and I’m no longer able to tell which extension inserts what in which cases… this is something I need to think about a little bit more.
Anyway the 2 extensions are released under GPL (software that gives you freedom) so you are free to play with them, free to study them and free to modify them. Enjoy!

Visualizing time trends in how a site is tagged on del.icio.us: cloudalicious

The previous entry was about “powerlaws in the use of tags on del.icio.us”. Then at http://del.icio.us/tag/powerlaw, I found Pietro Speroni’s great post Tagclouds and cultural changes that (also) introduces cloudalicious, a one-night project of Terrell Russell. Cloudalicious shows the evolution over time of the tags used to tag any page on del.icio.us. Very very cool!!!
I tried to find a URL that showed a non-converging behaviour but I failed. (Pietro already provided some examples of sites presenting interesting trends in tag use.) Are you able to find at least one controversial URL? That is, a site for which there was a great shift over time in the tags used for it.
For your information, I already tried sites tagged on del.icio.us with controversial tags (such as abortion, scientology, jew). I also tried microsoft.com, thinking that many people would have tagged it as evil, but this is not the case. In general people tend to tag what they like rather than what they don’t like, so as not to increase its visibility. So I tried “terri schiavo blog”, which was very visible for a short period of time: I suspected that tags such as “tasteless” or “awful” would be numerous and growing over time, but this is not the case either.
The only one with a little bit of variance over time I was able to find is boingboing.net. See cloudalicious for http://boingboing.net. Del.icio.users seem to recognize it as a news site as time passes by. And it also seems that Del.icio.users are moving from “blogs” to “blog” as tag (common pattern or just for boingboing?).
There is some variance also with http://del.icio.us itself: see cloudalicious for http://del.icio.us
So I just repeat the small challenge: Can you find a URL that presents non-converging tags use?

Small suggestion for Terrell Russell (I write it here since I was not able to find his email address on his web site). [I’m sure he has probably already figured out this suggestion by himself, since he was good enough to put together a great tool in one night!]
The Cloudalicious interface at the moment asks for del.icio.us URLs: “These URLs can be found at del.icio.us – they’re the red ‘and X other people’ links” (for example, http://del.icio.us/url/ec08a8ddfda4f2f9cad3a142dc49e23b represents http://boingboing.net/).
ec08a8ddfda4f2f9cad3a142dc49e23b is the md5sum of http://boingboing.net/
There are 2 easy ways to obtain it automatically: (1) run md5sum on the server, (2) use http://del.icio.us/url?url=http://… (in which http://… can be replaced by the website we want to cloudalicious).
In this way, users could enter in the Cloudalicious interface the real URL they are interested in (http://boingboing.net) and not the harder to find one (http://del.icio.us/url/ec08a8ddfda4f2f9cad3a142dc49e23b).
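Option (1) can be sketched in a few lines of Python. This is a minimal sketch, assuming the identifier is the plain MD5 hex digest of the URL string exactly as typed; the function name `delicious_history_url` is made up for the illustration, and whether del.icio.us normalizes the URL before hashing is not something I have checked.

```python
import hashlib

# Prefix of the del.icio.us per-URL pages, as in the example above.
DELICIOUS_URL_PREFIX = "http://del.icio.us/url/"


def delicious_history_url(page_url: str) -> str:
    """Build the del.icio.us page address for a URL from its MD5 digest."""
    # Assumption: the hash is taken over the URL string as-is, UTF-8 encoded.
    digest = hashlib.md5(page_url.encode("utf-8")).hexdigest()
    return DELICIOUS_URL_PREFIX + digest


if __name__ == "__main__":
    print(delicious_history_url("http://boingboing.net/"))
```

Note that http://boingboing.net and http://boingboing.net/ are different strings and so give different digests; an interface accepting real URLs would probably need to decide how to treat the trailing slash.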
A bookmarklet and a greasemonkey extension (working on the site the user is browsing) are left as easy exercise for the reader as well ;-)

Lastly, let me mention that one of the key points of Clay Shirky in Ontology is Overrated: Categories, Links, and Tags, which is also present in Pietro’s post, is that the correct way of categorizing something does not exist (the initial Yahoo! approach tried to force this and failed, and librarians still (must) try to adopt this simplifying but wrong assumption). Instead there are as many correct ways of categorizing a thing as there are users. This resonates with my study on controversial users on Epinions (pdf): the idea that there is a global value of trustworthiness/reputation for every user/peer in the system does not make sense, but still most of the papers in the reputation/trust literature start with this wrong and misleading assumption.

UPDATE: I just found it now, but Pietro in On Tag Clouds, Metric, Tag Sets and Power Laws was already mentioning that the paper by Clay Shirky “Power Laws, Weblogs, and Inequality” started to be tagged as longtail only after the article from Wired: The Long Tail came out. See cloudalicious for http://www.shirky.com/writings/powerlaw_weblog.html.