2007.10.22

Building the Semantic Web in Blocks
2007-10-22T00:45:42Z

I've been mulling over Semantic Web things for the last few days - and this idea hit me today which I'm calling 'Blocks'. Blocks could potentially be a difficult project to implement, but let me chuck out an idea and see what people think.

I think that people have fallen into a sad loop with regard to the Semantic Web. The Semantic Web is seen as this deeply scary proposition. It has all this weird stuff - like academics and logicians working on ontologies. Top that off with enough nonsense about "committees" - I mean, urgh! Who wants to be on a committee? We make much better decisions on our own without any expert guidance! Basically, we're stuck in a user education rut.

I'm thinking that what we need to do, then, is build a really simple software product that is a bit like Pipes, but open source.

Basically, imagine a simple library that hackers could import that would just provide basic functionality. Firstly, it could pull data in from some API sources like Twitter, Jaiku, last.fm, digg and Flickr (using the web-based conversion services that have been or are being built), and pull in RDF data from services like dbPedia, Revyu and other participants in the Linked Open Data project. Then map that stuff into simple functions like flickr.getDataForUser("tommorris") etc.
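To make the "simple functions" idea a bit more concrete, here is a minimal sketch in Python of what one source block might look like. Everything in it is an assumption for illustration: the FlickrSource class, the get_data_for_user method (a Python-style spelling of flickr.getDataForUser), the example.org subject URI and the fake fetcher are all made up, and no real Flickr API call is made. The only point is the shape: a block takes a username and hands back plain (subject, predicate, object) triples.

```python
# Hypothetical sketch: none of these names come from a real library.
from typing import Callable, Dict, List, Tuple

Triple = Tuple[str, str, str]

FOAF = "http://xmlns.com/foaf/0.1/"


class FlickrSource:
    """Wraps some per-user data fetcher and maps its result to triples."""

    def __init__(self, fetch: Callable[[str], Dict[str, str]]) -> None:
        # The fetcher is injected so the sketch runs without touching the network.
        self.fetch = fetch

    def get_data_for_user(self, username: str) -> List[Triple]:
        record = self.fetch(username)
        # Assumed subject URI scheme, purely for the example.
        subject = "http://example.org/flickr/" + username
        triples = []
        # Sort the keys so the output order is predictable.
        for key, value in sorted(record.items()):
            triples.append((subject, FOAF + key, value))
        return triples


def fake_fetch(username: str) -> Dict[str, str]:
    # Stand-in for a real API call; the data is invented.
    return {"nick": username, "name": "Example User"}


flickr = FlickrSource(fake_fetch)
triples = flickr.get_data_for_user("tommorris")
for triple in triples:
    print(triple)
```

Passing the fetcher in as an argument is one possible design choice: it keeps the block testable offline, and a real HTTP-backed fetcher could be swapped in later without changing the calling code.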

Secondly, it could pull data in a structured way into the graph (yes, the RDF graph - not the graph that all the Silicon Valley finance types are waffling about) and let you process it. This could be as simple as spidering the site and looking for things like microformats, checking for OpenIDs, running rules and so on.
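As a rough illustration of "pull it into the graph and process it", here is a toy triple store in plain Python with wildcard matching and one rule. This is a sketch under stated assumptions, not how RDFLib or Jena do it: the Graph class, the match method and the apply_symmetric_rule helper are hypothetical names, the ex: identifiers are placeholders, and treating foaf:knows as symmetric is an invented rule used only to show a rule firing.

```python
# Hypothetical sketch of a tiny in-memory graph; not a real RDF library.
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class Graph:
    """A toy triple store: a set of (subject, predicate, object) tuples."""

    def __init__(self) -> None:
        self.triples: Set[Triple] = set()

    def add(self, s: str, p: str, o: str) -> None:
        self.triples.add((s, p, o))

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        # None acts as a wildcard in any position.
        for t in sorted(self.triples):
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2]):
                yield t


def apply_symmetric_rule(graph: Graph, predicate: str) -> int:
    """If (a predicate b) is present, add (b predicate a).

    Returns the number of triples that were actually added.
    """
    before = len(graph.triples)
    # Copy the matches into a list first, since we add to the set as we go.
    for s, _, o in list(graph.match(p=predicate)):
        graph.add(o, predicate, s)
    return len(graph.triples) - before


KNOWS = "http://xmlns.com/foaf/0.1/knows"
OPENID = "http://xmlns.com/foaf/0.1/openid"

g = Graph()
g.add("ex:tom", KNOWS, "ex:dan")
g.add("ex:tom", OPENID, "http://example.org/tom")
added = apply_symmetric_rule(g, KNOWS)

for triple in g.match(p=KNOWS):
    print(triple)
```

A real block would sit on top of a proper RDF library rather than a Python set, but the interface a beginner sees could plausibly stay about this small.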

Thirdly, we provide a way of chucking the data back out again in different formats. Obviously, RDF/XML, N3, N-Triples, TriX and the other RDF formats - but also some domain-specific XML formats like RSS/Atom, OPML (for subscription or reading lists), KML (for geodata), even SVG - as well as JSON and YAML, and our humble friend (X)HTML. We'd also be able to push output across networks just as easily as into files - this is the Internet after all.

The idea is that we'd eventually have a sort of standard toolset available so that developers could just download a Semantic Web plugin for their development environment. That means everything from J2EE and Ruby on Rails through to PHP hackers and even kids trying to pimp their MySpace profiles.

Much as Yahoo! makes it easy with Pipes to just connect components together, we need to build some large, 'primitive'-esque chunks that beginner RDF hackers can try out for size. And we'd keep OWL ontologies and reification well away from it all.

I've been playing around with a few ideas - mostly using RDFLib in Python. I wasn't getting very far. RDFLib is nice, but it doesn't seem to support a few nice things like unbound variable filtering (danbri: "not known to be known"), nor does it seem to support a few other SPARQL constructs like CONSTRUCT or DESCRIBE. The only tool that I see that supports the full SPARQL specification is Jena/ARQ - which is a Java library. I have something of an animosity towards Java. It may be because of indoctrination at the hands of Paul Graham, or it may be me having an exceedingly long and complex CLASSPATH. Until I find a better way of doing it, I'm sticking with Java and Jena. Sod static typing.

I'm just in the process of porting the basic infrastructure of some of my little Python hacks over to Jena. Python is great for prototyping in that way (and I've been testing Jython for the same reason...).

I'm not wild about doing it in JavaScript - even though progress is being made by people at the Decentralized Information Group on a seriously bad-ass JavaScript RDF parser. That's fine for a hack, but if you are going to build a rock-solid site or service, you shouldn't be parsing RDF in the browser but letting the server take the strain. If you don't believe me, Google "bulletproof ajax", read Jeremy's book and then come back. I'll wait.

Blocks will not be a replacement for a good RDF library. In fact, Blocks will require a good RDF library. That's part of the reason to use Jena - because it's actually a good library. Blocks, though, is more of an introductory module - a "here's how you do it" module that you can point to in any language or on any platform.

Also, speaking of RDF libraries - I'd love to know what suggestions people have on the matter of Ruby RDF libraries. I've tried a few and they are all a bit disappointing. Suggestions welcome. Ruby is a great language, and I think decent RDF processing is something that Ruby needs.

Tom Morris