Later today, forces of nature permitting, I’ll be at the launch of the London Data Store at the GLA building with a live link up to CES in Las Vegas. It feels like a momentous occasion in many ways. The last year has been an incredible one for things opening up: The Guardian Open Platform, the UK Government’s public data initiative and more local stores like DataSF (opened in August 2009). Bizarrely it now seems almost normal that public data is becoming public, and this is just a wonderful thing to behold. It opens up so many possibilities.
It feels quite monumental partly as it’s my home city. I have a vested interest in this data and what can be made of it. I also think it feels this way because we’re starting to understand what will happen when data opens up and why it’s important, what will be made and, more importantly, what sort of data is needed to start building data-driven businesses and economies around public data. I also think they made a great start in the project and I’m excited to see progress.
Let’s be perfectly clear about this: I’m personally for the release of public data wherever it can happen, regardless of business imperative. To a certain extent, it’s a moral right. It’s ours, we want it. I want to be able to inspect how my money is spent, what level of services surround me and what my elected representatives do, and I want the facts to hold people to account and to change my voting behaviour if I don’t like what I see.
Professionally though I want to make things. I want to make utilities with this data that improve people’s lives. However, to do that sustainably I need to make things that people will want to pay for somehow, either through clients commissioning the things, or government procuring them, or end users paying for them in either money or attention (advertising). Without this part of the cycle these datasets will always be personal playthings, and the people in organisations who open up the data will find it harder to do so in the long run, as it is expensive and in lean financial times questions may be asked about why money is being spent there. There has to be a revenue stream and a business model for data to be opened in the long run. Taxation seems to be the best one. Licensing public data in arcane and lawyer-driven ways seems anachronistic and largely only pays for more lawyers. I’ve paid for this data to be created, and now you’re asking me to pay again before I can build a business on it. I’m not debating that data has value, I’m just saying that we need to think about the long-term way of paying for opening it up, which for me is through business and personal taxation. Essentially it feels like central and local government is venturing with its data and that feels totally right.
So, given that we’re getting the environment where this data is available, what sort of things will be commercially viable and what sort of data do they need? It sounds kind of obvious, but it’s utility. With app stores the things that sell are utility. You can get The Guardian on your iPhone as a mobile site, but people in their droves are paying for the convenience and utility of The Guardian iPhone app. It’s the top paid news app in the UK iTunes store and is in the top three of the US store. Likewise the real-time data for the BART in San Francisco is available on their main website and with a little digging on their mobile site, and is integrated to a greater or lesser extent with Google Maps, but people are paying for iBARTLive as it is a utility which makes your life easier and your interaction with the transport system better, more reliable and more interactive.
What’s great about the BART case is that by opening up they are creating an ecosystem with competition and selective pressure, the evolutionary force. There are 3 free iPhone apps, 2 free Android apps, 7 paid iPhone apps and 1 paid Android app listed on their developer site. This suggests to me that it’s not going to be a race to the bottom on price, but a race to the top on features, usability and utility. Isn’t this what we want for our interactions with things powered by our public data? In my opinion it is, and it’s the thing most lacking in our current mechanism of public service provision, where contracts awarded through tender are almost the equivalent of a monopoly.
Cities are complex things. Living within them and navigating them is sometimes tricky, and finding information on badly designed, sprawling websites is not what you want to do in our real time society. If data is the new oil then utility is the new refining industry, finding the elements within the crude raw material and then shaping them and combining them to make valuable things. What is clear though, from looking at what is making money commercially in the utility space, is that timeliness matters. The data needs to be as real time as it can be. So please London, transport schedules and real time transport data. I don’t want to know when my hypothetical transport is due according to a published schedule, I want to know the real thing right now. I don’t want to know only about collated stats from 2007. I’d like them, but I’d also like live data about school openings when it snows, school place availability, waiting lists for allotments, hospital satisfaction, after hours clubs in my area right now. That’s utility. Applications based on out-of-date data are worthless. They don’t provide utility at all, they only build customer dissatisfaction. A news app which gave you last week’s news would be a curiosity and a bad idea.
So London Data Store, here’s to an exciting new start. Developers, here’s to making businesses out of the sort of data we’ve long asked for. As Prof Raper says in his piece today, “tomorrow we must get back to work to make this promised future happen”. And finally, and personally, here’s hoping that I don’t get real time data that my wife has gone into labour while I’m on a panel. I’m a fan of real time information, but there are other real time data streams I’d rather were made available today from the likes of TfL.