2006.12.03

The Microsoft XML Team blog has some links to what's happening at the XML 2006 conference, which is on this week in Boston.
2006-12-03T23:07:47Z

Want to see something silly? The United States Senate is looking at the Amateur Sports Integrity Act in this session (I think) - there's more at GovTrack.us. Basically, they are trying to ban gambling on the Olympics and on high school and college sports events, and to spend more money on detecting drug use. Integrity? How about Iraq rather than high school football doping scandals? ;)
2006-12-03T20:46:57Z

I've just posted my first podcast. Here's the webpage - and here's the RSS.
2006-12-03T20:22:26Z

Wade Roush has an excellent article on the Semantic Web as Web 3.0 - or Web 2.1 at the very least (I wish someone would point me to the CVS or SVN server where these version numbers are being handed out) - but he doesn't talk about one of the things driving this: the beautiful, beautiful Microformats.
2006-12-03T19:42:39Z

That's a bit odd - Ian has found Tesco selling Chocolate and Banana sandwiches.
2006-12-03T10:05:17Z

Only at Sony can a promotion be seen as a demotion. That's what has happened to Ken Kutaragi, the guy who introduced two really great consoles and then fucked it all up by selling a gigantic and expensive monster of a console. So sayeth GigaOm.
2006-12-03T10:01:05Z

Joshua Porter has a post on why iTunes' "Album Only" restrictions suck arse.
2006-12-03T09:37:56Z

A mistake
2006-12-03T12:35:51Z

I've got this really rad idea. All the kids are doing this crazy blogging thing and hanging out on MySpace and posting their videos on YouTube.

Why don't we set up a corporate-branded version of YouTube where they can upload videos about how cool we are?

I mean, we're selling a lifestyle, and they are buying it. I forget, what was it we sell again? Was it beer? Trainers? iPod stuff? I dunno. The consumers will just love it.

They come to our site, post about how cool we are, and we'd like that.

Yep, the invoice will be in the post. Thanks.

*click*

A day in the life of a (hopefully soon to be unemployed) marketing consultant.


Some more SPARQL to OPML fun
2006-12-03T13:04:16Z

Last month, Danny Ayers described some sparql2opml noodling to take some data out of FOAF and turn it into OPML. It got a good reception from us OPML folks - including Adam Green, Richard Edwards, James Corbett and, er, me.

I've spent the last few weeks playing with RDF stuff, and last Monday bought Shelley Powers' book. I've also got a podcast which I recorded yesterday, which I need to post (there are complications).

Danny is using the Web to build a pipeline for his SPARQL query and the transformation to OPML. I want to do it on the server side, which is a bit tougher. There are lots of different RDF libraries for different languages. I didn't particularly want to learn Java just to chuck some RDF around, but Jena does provide a way for people who are already using Tomcat/JSP/servlets etc. to join the RDF game.

No, I decided to use RAP, the RDF API for PHP. Now, it's fully acronym compliant - it supports RDF/XML, N3 and GRDDL. It's not the most intuitive API around, but it all kind of works. I've cursed a little bit less with RAP than I have with the DOM (which must be the most cursed-about framework around!).

I can now do on my server everything that Danny was doing with things bouncing around between sparql.org, w3.org and his XSLT file. And more. The first test project I've done is to try and describe the semantic relationships between public institutions - namely, universities.

I can write a whole load of RDF (either as XML or as N3) and load it into a database. I use oXygen on the Mac to write the XML (or write triples and convert them using this converter), then post the result as a file on my web server and tell RAP to read it into a MySQL database. RAP grabs the triples from the file and stores them all in a table called "statements", and stores the prefix/namespace data in another table.
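Here's a rough sketch of that load step, written in Python with SQLite instead of RAP and MySQL so that it runs on its own. The three-column table layout, the example.org URIs and the `hasCollege` predicate are all made up for illustration; RAP's real "statements" schema may well look different.

```python
import sqlite3

# Hypothetical vocabulary and resources, used only for this sketch.
EX = "http://example.org/schema#"
TRIPLES = [
    ("http://example.org/oxford", EX + "hasCollege", "http://example.org/balliol"),
    ("http://example.org/oxford", EX + "hasCollege", "http://example.org/wadham"),
    ("http://example.org/london", EX + "hasCollege", "http://example.org/heythrop"),
]


def load_triples(conn, triples):
    """Store (subject, predicate, object) rows in a table called 'statements'."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS statements "
        "(subject TEXT, predicate TEXT, object TEXT)"
    )
    conn.executemany("INSERT INTO statements VALUES (?, ?, ?)", triples)
    conn.commit()


def colleges_of(conn, university):
    """Return the colleges of one university, ordered alphabetically by URI."""
    rows = conn.execute(
        "SELECT object FROM statements "
        "WHERE subject = ? AND predicate = ? ORDER BY object",
        (university, EX + "hasCollege"),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")  # in-memory database, nothing is written to disk
load_triples(conn, TRIPLES)
```

Once the triples are in one table, "list the colleges of a university" is a single filtered select, which is roughly what a SPARQL engine does on your behalf.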

Then I write a SPARQL query as basically plain text or plain text with some PHP logic added on top (so, for instance, if I want to change a variable, I can specify it in the URL). This means that instead of sending a SPARQL query as really long encoded data in the URL, I can simply point to a file which will be loaded up, read in and then executed. This makes the system quite a bit more modular. The PHP is there simply so that you can specify things like search queries.
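The idea of a stored query with a few slots filled in from the URL can be sketched like this. It is in Python rather than PHP, the query text is held in a string rather than a file to keep the example self-contained, and the `ex:` vocabulary, the `university` parameter name and the validation pattern are all assumptions for illustration.

```python
import re
from string import Template

# Stands in for a query file on the server. $university is the only slot.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/schema#>
SELECT ?college ?url
WHERE {
  ?college ex:partOf <$university> ;
           ex:homepage ?url .
}
ORDER BY ?college
""")

# Deliberately strict: only plain http(s) IRIs, so a request parameter
# cannot close the <...> and smuggle extra SPARQL into the query.
SAFE_IRI = re.compile(r"^https?://[A-Za-z0-9./_~%#-]+$")


def build_query(params):
    """Fill the stored query from request parameters (a dict of strings)."""
    university = params.get("university", "")
    if not SAFE_IRI.match(university):
        raise ValueError("university must be a plain http(s) IRI")
    return QUERY_TEMPLATE.substitute(university=university)
```

Calling `build_query({"university": "http://example.org/oxford"})` returns the finished query text, ready to hand to whatever SPARQL engine sits behind it. The validation step matters because anything taken from the URL ends up inside the query.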

How am I using this in action? Well, I've written an example script. It is simply a list of colleges within a university. I've added two files to the database - one for the University of Oxford and one for the University of London (and, yes, I know that an Oxford college is different from a London college - that's something I've got on hand when designing the RDF schema which will go public - er - whenever).

You can see the results of the Oxford and London queries by visiting those links. Warning - they are just XML, remember. You should see a list of all of the colleges ordered alphabetically with the URL for their websites.
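For anyone who wants to consume that XML rather than look at it, here is a small sketch of reading a SPARQL result set with Python's standard library. It assumes the W3C SPARQL Query Results XML Format; the sample document and its college data are invented, and the output of a particular processor may differ in detail.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the W3C SPARQL Query Results XML Format.
NS = {"s": "http://www.w3.org/2005/sparql-results#"}

# Invented sample data in the shape of a two-variable result set.
SAMPLE = """\
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="college"/><variable name="url"/></head>
  <results>
    <result>
      <binding name="college"><literal>Balliol College</literal></binding>
      <binding name="url"><uri>http://example.org/balliol</uri></binding>
    </result>
    <result>
      <binding name="college"><literal>Wadham College</literal></binding>
      <binding name="url"><uri>http://example.org/wadham</uri></binding>
    </result>
  </results>
</sparql>"""


def parse_results(xml_text):
    """Turn a SPARQL XML result set into a list of {variable: value} dicts."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("s:results/s:result", NS):
        row = {}
        for binding in result.findall("s:binding", NS):
            value = binding[0]  # the <uri>, <literal> or <bnode> child
            row[binding.get("name")] = (value.text or "").strip()
        rows.append(row)
    return rows
```

Each result becomes a plain dictionary keyed by variable name, which is a convenient shape for any later transformation step.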

As Danny's XSLT shows, it's really quite easy to turn this kind of thing into OPML for display in Grazr.

Because this approach is easy to develop with and each piece is interchangeable, building REST APIs on top of it is very easy. Of course, there are security issues which one has to deal with, which is why I've only made a limited set of data available (the above two links).

Now, here's where it gets interesting - imagine you've got lots of raw data that you want to make available in OPML format so that it can be included in directories. This approach makes a lot of sense for that. You store the data as either flat-file RDF or in a relational database (RAP supports MySQL and Microsoft Access - other libraries offer different choices), and then just query the data out. OPML provides the structure and RDF provides the data that gets included within the structure.

It means that the OPML folks can get what they want, but you don't have to specially code anything.

The next thing one could do is standardise the variable names used in the SPARQL query (in this case, ?url and ?college) by giving them generic names. Instead of having a hand-written stylesheet for turning ?url into an outline component, we define some new standard variable names - I'm thinking ?text, ?htmlUrl, ?xmlUrl, ?linkUrl and ?includeUrl (the last being for OPML 2.0 only). ?text would be required, and the choice of URL variable would take the place of an explicit type. So, if you just want a type="link", you'd simply specify a ?linkUrl. If you wanted a feed, you'd bash out at least an ?xmlUrl and maybe an ?htmlUrl too.

Stylesheets or processors would then be able to pick them up without having to have any logic in the stylesheet and churn out flat OPML files.
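A generic processor along those lines might look like the sketch below, in Python rather than XSLT. It takes rows already keyed by the proposed variable names and writes OPML. The mapping to outline attributes follows one reading of OPML 2.0 (type="rss" with xmlUrl and htmlUrl, type="link" and type="include" with a url attribute) and is an assumption, as is the order in which the URL variables are checked.

```python
import xml.etree.ElementTree as ET


def row_to_outline(row):
    """Build one <outline> element from a dict of standard variable names."""
    if "text" not in row:
        raise ValueError("every row needs a ?text binding")
    attrs = {"text": row["text"]}
    if "xmlUrl" in row:
        # A feed: xmlUrl is required, htmlUrl is optional.
        attrs["type"] = "rss"
        attrs["xmlUrl"] = row["xmlUrl"]
        if "htmlUrl" in row:
            attrs["htmlUrl"] = row["htmlUrl"]
    elif "linkUrl" in row:
        attrs["type"] = "link"
        attrs["url"] = row["linkUrl"]
    elif "includeUrl" in row:
        attrs["type"] = "include"
        attrs["url"] = row["includeUrl"]
    return ET.Element("outline", attrs)


def rows_to_opml(title, rows):
    """Wrap the outlines in a minimal OPML 2.0 document and serialise it."""
    opml = ET.Element("opml", {"version": "2.0"})
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for row in rows:
        body.append(row_to_outline(row))
    return ET.tostring(opml, encoding="unicode")
```

Nothing in this code knows about colleges or universities: the query decides what the data means, and the variable names alone decide what kind of outline comes out.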

This way, the whole process of producing OPML boils down to formulating a SPARQL query and pointing the results in the right direction. And formulating a SPARQL query need only be as complicated as coming up with one standard one and letting the user change a few variables.

People who aren't XML geeks may read this and think "so what?". I sympathise. What this means (and, again, I don't claim originality - Danny Ayers has prior art on this) is that we can build applications that bridge between the RDF space and the OPML space (or the RSS space), and we can develop them relatively quickly.

And for the mashup makers, this should mean more public data available - more APIs and suchlike.

I'm excited by this stuff. Hopefully soon we'll have a fair few more toys to play with! ;)


Generic SPARQL to OPML stylesheet (alpha)
2006-12-03T16:44:02Z

I've hacked Danny Ayers' XSL transform to turn any SPARQL result set into an OPML file. I'm not sure whether he minds.

You can take a look at the XSL file here. It uses the namespace that my SPARQL processor returns ('s'), though I'm not sure whether that is still relevant if you are using a different namespace (I don't know enough about XML namespaces, so please enlighten me).
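On the namespace question: in XML, an element's identity comes from the namespace URI its prefix is bound to, not from the prefix itself, so 's' is only a local shorthand. A quick way to see this is the Python sketch below. The two tiny documents are invented, and the URI is the one used by the W3C SPARQL results format, assumed here to match what the processor returns.

```python
import xml.etree.ElementTree as ET

URI = "http://www.w3.org/2005/sparql-results#"

# The same structure written with an 's' prefix and with a default namespace.
doc_prefixed = f'<s:sparql xmlns:s="{URI}"><s:head/></s:sparql>'
doc_default = f'<sparql xmlns="{URI}"><head/></sparql>'


def expanded_tags(xml_text):
    """Return every element name in '{namespace-uri}local-name' form."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]
```

Both `expanded_tags(doc_prefixed)` and `expanded_tags(doc_default)` give the same list, because the parser resolves each prefix to its URI. XSLT matches elements the same way, so a stylesheet keeps working against output that uses a different prefix as long as the stylesheet's own prefix is bound to the same URI.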

The only problem is priority and overlap. It is possible to, say, overload the outline element with too many attributes - you can give it an xmlUrl, htmlUrl, includeUrl and linkUrl and it'll try to process all of them. In retrospect, I see it would be possible to use xsl:choose and xsl:when rather than xsl:if, which would neatly solve the problem. Again, this is just for testing - in incarnation two, I'll fix these little annoyances.
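The difference between the two XSLT constructs can be mimicked in a few lines of Python. This is an analogy, not the stylesheet itself: the first function applies every test that matches, the way a run of independent xsl:if elements does, and the second stops at the first match, the way xsl:choose with xsl:when does. The priority order in `URL_VARIABLES` is an assumption.

```python
# Assumed priority order for the sketch: feed first, then include, then link.
URL_VARIABLES = ["xmlUrl", "includeUrl", "linkUrl"]


def attrs_with_if(row):
    """Like independent xsl:if tests: every URL variable present is copied."""
    attrs = {"text": row["text"]}
    for name in URL_VARIABLES:
        if name in row:
            attrs[name] = row[name]
    return attrs


def attrs_with_choose(row):
    """Like xsl:choose: only the first URL variable present is copied."""
    attrs = {"text": row["text"]}
    for name in URL_VARIABLES:
        if name in row:
            attrs[name] = row[name]
            break  # first match wins, later variables are ignored
    return attrs
```

Given a row carrying both an xmlUrl and a linkUrl, the first function produces an overloaded outline with both attributes, while the second keeps only the xmlUrl.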

This was really pretty easy, and it seems to work with both of the SPARQL examples that I've run.

It is also relatively easy to make a SPARQL-to-RSS converter, which means you could even use an RDF persistent storage database as a very primitive and crazy blogging engine. ;)

So, let's see the result of today's work - Graze the London query and Graze the Oxford query.


 

Tom Morris

I have a BA in philosophy from London, and am studying for an MA. My philosophical interests are in Victorian-era German philosophy, Kierkegaard, Robert Nozick, hermeneutics and current approaches to the demarcation problem in the philosophy of science. Musically, I like jazz fusion, soul and P-Funk. My musical nirvana would be a mixture of Beethoven, Miles Davis and George Clinton topped with a side-serving of Erykah, Jill and Angie.

I also write for the Citizendium, an online encyclopedia project. If you know about stuff, you should join in.
