tommorris.org

Discussing software, the web, politics, sexuality and the unending supply of human stupidity.


Hack your context and free yourself from the bounds of digital irritation

Over the last few days, I've been working on what I am calling 'context hacks'. Context hacks are hacks that detect human presence in particular contexts in a latent way and enable that contextual information to be shared and reused. A context is, to put it very broadly, a relational state between an agent and their surroundings. Location is one of the important contexts, as are activity, state of mind, connectedness to gadgets and a whole load of other things.

Basically, by context hacking, I want to free this kind of locational and state information from the commercial enterprises that are currently using it for their own purposes.

Why context-hack? Because geolocation is expensive and explicit. Take the two popular location-based services Foursquare and Gowalla: both are built around device-based check-in applications. To use Foursquare, you need to have an iPhone (or a BlackBerry, Android or Palm Pre) - a smartphone that costs a few hundred dollars to buy and a few hundred more dollars each year to own. But more than that, sites like Foursquare require you to explicitly check in with your location. You have to say "hey, I'm at the Barbican Arts Centre right now". Except when you actually are at the Barbican Arts Centre (or wherever) you might actually want to get on with the things you went there for: to enjoy the exhibits, to peruse the library, to wonder at the combined beauty and ugliness of the concrete, to sit in the restaurant with your friends. You don't want to be sitting there opening up Foursquare, waiting for the 3G networks to work, hoping that the GPS is good enough to penetrate all that concrete and then worrying about privacy and whether or not stalkers are going to find you through Facebook.

Context hacking is finding a way of doing this better. The location-based services use mass market devices like iPhones. They are built around devices. If you have an iPhone, it obviously has GPS. But if you've got a Mac running 10.6, your computer knows roughly where it is using technology called Core Location. Core Location uses nearby wifi spots and a service called Skyhook to work out where you are. This is also built into browsers: if you use Firefox, you'll find that if you are connected in an area where Skyhook is working and it is built into your browser and/or OS, when you go to, say, Google Maps, it'll start where you are. I carry a device around with me called a MiFi which provides pay-as-you-go 3G service to me anywhere in the UK, but broadcasts it over a private cloud of wifi. But what most people don't know about the MiFi is it also has a GPS built-in. It broadcasts the location of itself to the devices that use it. If I use an iPod touch with the MiFi, when I use the built-in Google Maps application, it goes to where I am. Clever or what?

But all these things require user intervention. Foursquare requires you to check in. W3C Geolocation-based apps require you to actually go to a web page - sorry, a web app - pages are old-school, remember. I have got enough technology on me that the technology should be able to work out where the hell I am and what I am doing.

That raises another problem: if it does that, I want to control it. I don't want Google to know my contexts. I don't want Facebook or Yahoo or Twitter or Foursquare to control my contextual information. I want to control it. I'm not a megalomaniac or anything: I just want my data.

The closest I have found to this is FireEagle. I go on and on about FireEagle like an eager fanboy, but there is an important thing that FireEagle gets right that the others do not: FireEagle is just a location broker, just like your e-mail provider is just an e-mail provider. They provide the plumbing, someone else builds the services on top. XMPP is plumbing. RSS is plumbing. What people build on top is up to them. Now, this is not to deny the use of sites like Foursquare. You can have something like Foursquare, but under my mental framework of contextual computing, it doesn't own your context, but you can share your context with it if you want to.

FireEagle is fine, then. It does a lot of what I want from contextual computing. But there is one thing it could do slightly better: it could be hosted on tommorris.org rather than on fireeagle.yahoo.net. Not because I have any particular problem with Yahoo - rather, I trust myself with my contextual information more than I trust anyone else. Just like I trust myself with my blog more than I trust anyone else. While I may suck at software - as Dave Winer says, we all build shitty software - the software I build is shitty in a way I like rather than shitty in a way someone else likes!

Eventually, I would like to host my own context server. Now, a context server is like a very private blog. You put your location information on there, and you can tightly control who gets to use it. What would be nice is if we could build a network of these context servers which could talk to each other. My context server would be able to tell other people where I am. A distributed network of context nodes. Of course, context servers wouldn't need to be anything more than HTTP servers. HTTP gives us all we need for context: the context data is in a machine-readable format (or a handful thereof: XML, JSON, RDF, and any other acronym that fits). We provide access to it over authenticated HTTPS. So you might call tommorris.org/location and, if you've got the relevant permissions, you'd get back my location. We could work the details out later.
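To make the permissions idea a little more concrete, here is a minimal sketch in Ruby of the decision a context server might make for each request. It is not a design for the real thing: the tokens, the "location" and "activity" keys, the stored values and the context_response function are all invented for illustration, and the HTTPS layer is left out entirely.

```ruby
require "json"

# Hypothetical permission table: bearer token => context keys it may read.
# Every token and key here is made up for illustration.
PERMISSIONS = {
  "friend-token"  => ["location"],
  "partner-token" => ["location", "activity"]
}.freeze

# Hypothetical stored context. The values and timestamps are invented.
CONTEXT = {
  "location" => { "name" => "Barbican Arts Centre", "updated_at" => "2010-02-01T10:15:00Z" },
  "activity" => { "name" => "at an exhibition", "updated_at" => "2010-02-01T10:20:00Z" }
}.freeze

# Decide how to answer a GET for /<key> presented with the given token.
# Returns [http_status, json_body]; an HTTPS front end would send these back.
def context_response(key, token)
  allowed = PERMISSIONS[token]
  return [401, JSON.generate({ "error" => "unknown token" })] if allowed.nil?
  return [404, JSON.generate({ "error" => "no such context" })] unless CONTEXT.key?(key)
  return [403, JSON.generate({ "error" => "not permitted" })] unless allowed.include?(key)

  [200, JSON.generate(CONTEXT[key])]
end
```

Under these assumptions, a request for "location" with "friend-token" gets the location back as JSON, while the same token asking for "activity" is refused with a 403.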

But before we build this kind of thing, we need to start collecting data. And because context needs to be about people rather than devices, we need to come up with many, many more ways of collecting clues as to context. This is what I have been working on: location hacking, context hacking. I've been doing it with the existing location-based services - specifically FireEagle and Foursquare. I've done FireEagle hacking in the past, so I am using the existing tools I have built - specifically a little command-line updater I wrote in Ruby called 'fe'.

I have built three tiny little context hacks recently that I want to share with you. I am working on more, and trying to come up with more.

One I built today in about six lines of Ruby is very simple indeed: location_ipod.rb. All it does is sit on a machine I have in my house, and every ten minutes attempt to ping my iPod touch. If it finds it, it tells FireEagle that I am at home. That is all it needs to do. If it can't find my iPod touch it doesn't mean I am not at home, but if it does find my iPod, it means I am. I rarely even walk the dog without taking my iPod with me.
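The original location_ipod.rb is not reproduced here, so what follows is only a sketch of the same idea, not the script itself. The IP address is a placeholder, and the arguments passed to 'fe' are an assumption, since the real interface of that updater is not shown in this post. The ping flag (-c) is the Linux and Mac OS X spelling.

```ruby
# Hypothetical values: the iPod's address on the home network, and the
# command line used to update FireEagle (the 'fe' arguments are assumed).
IPOD_HOST      = "192.168.1.50"
UPDATE_COMMAND = ["fe", "update", "Home"].freeze

# True if the host answers a single ping; output is discarded.
def host_present?(host)
  system("ping", "-c", "1", host, out: File::NULL, err: File::NULL) == true
end

# One check. If the iPod answers, report "home". If it does not, that proves
# nothing about where I am, so deliberately do nothing.
# The pinger and updater can be swapped out to try the logic without a network.
def check_once(host, pinger: method(:host_present?), updater: ->(cmd) { system(*cmd) })
  return :unknown unless pinger.call(host)

  updater.call(UPDATE_COMMAND)
  :home
end
```

Running check_once(IPOD_HOST) from cron every ten minutes would give the behaviour described above. Note the asymmetry: only a successful ping ever changes anything.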

This is useful for me as when I go out, I often update my location using interactive apps like Foursquare and Sparrow. But when I get home, I often forget to update my FireEagle status to say I am home again. As my blog uses my FireEagle status to put my location on my blogposts when I am not at home, having this reset my status is important.

The next hack I wrote is called simply 'train'. It is a little script I can schedule when I know I'm going on a train journey to or from London. If I decide, for instance, that I am going on a train journey at 0845 to London, I go to my computer and type echo "train_out" | at 08:45. Inside the train_out script there are more at jobs which fire off my location at all of the stops all the way to London. The train_return script does similarly but going from London to home. It schedules all the stops from London, then also schedules a FireEagle update saying I'm at home to happen about an hour after I get back to the station. That last mile can often take a variable amount of time: buses, taxis, lifts or even walking can vary in duration. As I am going up to London a lot this week, I am planning to be, err, rail-testing the scripts quite hard.
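As a rough illustration of how a script like train_out could queue its updates, here is a sketch that builds one at job per stop. The timetable, the stop names and the 'fe update' arguments are all hypothetical, and the function only builds the shell command lines rather than running them, so they can be inspected first.

```ruby
require "shellwords"

# Hypothetical timetable: minutes after departure => stop name.
STOPS_OUT = {
  0  => "Home Station",
  25 => "Midway Halt",
  55 => "London Terminus"
}.freeze

# Build one shell pipeline per stop that queues a location update with at(1).
# The place name is escaped once for the queued command and once more for the
# printf that feeds it to at.
def at_jobs(stops)
  stops.map do |minutes, place|
    update = "fe update #{Shellwords.escape(place)}"
    run_at = minutes.zero? ? "now" : "now + #{minutes} minutes"
    "printf '%s\\n' #{Shellwords.escape(update)} | at #{run_at}"
  end
end
```

Printing at_jobs(STOPS_OUT) with puts shows the three command lines; passing each one to system would actually queue them. The train_return equivalent would use its own timetable plus one extra entry for the "home" update about an hour after the final stop.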

There is a class of hacks that are based around hacking context information out of other people's systems. I have an Oyster card. I plan to write a script to extract data from their website about Tube and bus journeys I take in London. I can then update my location accordingly using that data.

I found a good one the other day. Waterstone's is Britain's largest chain of booksellers - sort of like Borders or Barnes and Noble in the States. They have a loyalty card scheme - I have one. I got it after purchasing a large, hardback copy of the complete works of Plato, and they told me I'd get a few pounds off the book as it is so expensive. I recently bought an e-book reader off them too, and got a load of points for that. On the Waterstone's website, you can get details of all your recent purchases. I've written a scraper to get this information out. I need to check how quickly this information gets updated - if it doesn't take very long, I can use it as a source of location information. The scraper just uses Celerity, the really nice JRuby abstraction on top of HtmlUnit, the Java library that lets you basically write testing scripts that prod a fast, headless browser.

At this point, I need to explain why this is useful. This may seem like a totally goofy way of tracking context. It is. But it is goofy and useful. This uses up no batteries at all. No mobile phones. No 3G signals or wifi hotspots. I buy book. Computer lists purchase on website. Another computer takes that purchase, parses the shop identifier string and infers context from it.
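The last step, going from a purchase to a context clue, might look something like the sketch below. The shop identifiers, the lookup table and the layout of a scraped purchase row are all invented here; the real format on the Waterstone's site is not shown in this post.

```ruby
require "time"

# Hypothetical lookup from shop identifier to a place. Identifiers are invented.
SHOPS = {
  "WST-0042" => "Waterstone's, example branch in London",
  "WST-0107" => "Waterstone's, another example branch"
}.freeze

# A single piece of evidence about where I was, when, and who says so.
Clue = Struct.new(:place, :seen_at, :source)

# Turn one scraped purchase row into a location clue.
# Returns nil when the shop identifier is not in the table.
def clue_from_purchase(row)
  place = SHOPS[row.fetch(:shop_id)]
  return nil if place.nil?

  Clue.new(place, Time.iso8601(row.fetch(:purchased_at)), "loyalty-card")
end
```

A row such as { shop_id: "WST-0042", purchased_at: "2010-02-01T13:05:00Z" } would then yield a clue placing me at that branch at five past one, with the source recorded so it can be weighed against other clues later.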

As a sole measure of working out identity, tracking Waterstone's loyalty card uses is useless. But the point of it is that it is one out of many measures. Sources of context are diverse: you can and should infer it from hundreds of sources, rather than relying on your smartphone's GPS signal. Put all those layers of context together and build up a personal database. This is why it needs to be on your server - a server you pay for each month, rather than some ad-ridden monstrosity where Google takes it all and sells it off to people whose only interest is selling you legal consulting services for when you get mesothelioma, or a need for a bigger penis or whatever.
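Putting the layers together could start as simply as this sketch: keep every clue, whatever its source, and treat the most recent one that is still fresh as the best guess. The clue layout, the source names and the three-hour freshness window are assumptions made up for the example, not a worked-out scheme.

```ruby
# Each clue is a hash with :place, :seen_at (a Time) and :source.
# Pick the newest clue that is no older than max_age seconds and not in the
# future; returns nil when nothing is fresh enough to say anything.
def best_guess(clues, now:, max_age: 3 * 60 * 60)
  fresh = clues.select do |clue|
    clue[:seen_at] <= now && (now - clue[:seen_at]) <= max_age
  end
  fresh.max_by { |clue| clue[:seen_at] }
end
```

With an iPod ping from five hours ago, a train stop from two hours ago and a bookshop purchase from an hour ago, best_guess returns the bookshop clue. A smarter version might weigh sources differently, but even this much only works if all the clues live in one place you control.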

What can you then do with this contextual information once you've collected it - whether it is from GPS or from goofy loyalty card hacks? Well, know thyself, for one. Write scripts that look at the information and derive useful suggestions from it. Imagine this: you write a script that works out, based on all your context data, how to spend less money on transportation. Or maybe a script that looks at the sort of restaurants you eat at in London and suggests similar restaurants you might like in Manhattan. I say 'scripts', because these would be like Greasemonkey scripts - little scripts that use a common API to your contextual information, and that you can read the source code of before applying. You aren't trusting some roach-motel-owning gangster with your data - if someone wants to use your context data, he writes a script, you run it and it gives you the information. Your data isn't just the filler for AdWords, your data is yours, and anyone who wants to run their hands through it needs to appeal to your best interests, not theirs.

Why come at this from a community angle rather than a start-up angle? Because it works. Look at blogging - we have a shared understanding of blogging, and anyone can do it on their own. We can aggregate them together, but it is decentralized. Anyone can install blog software and run their own. If they don't like what is on offer, they can hack at it. WordPress and Movable Type are both free now. If you don't want the hassle, you can use a hosted service. There is an interplay between what is available on the hosted providers (WordPress.com, Typepad, Tumblr, Posterous, Blogspot, LiveJournal) and the things people do on their own sites. Compare that with the 'attention' space that was hot a few years ago. What happened to that? Completely fizzled out. Google now own attention, and nobody does anything the least bit interesting with it. Yawn. Attention Trust, APML - err, it all failed. (It was also philosophically misguided - the idea that you share 'attention profiles' rather than the actual attention data was stupid, as I pointed out.) We could make sure that doesn't happen with context.

What now, then? I suggested on Twitter recently that we ought to have a hack day for this kind of thing. I am going to look into it - ask a few people. If you have a venue in London that wouldn't mind having a bunch of people who care about this stuff coming along, do get in touch. We could build scripts that collect context data, maybe build context servers either as separate standalone apps or as plugins to stuff like blogging software, and build scripts that do useful things with that context data. Work out what people want from contexts, and flesh out a vocabulary for talking about context. And have some fun.