Inside TransportAPI: how open data is helping you catch the bus

The number 14 bus gets stuck in traffic on Fulham Road for 20 minutes, arriving later than anticipated at South Kensington Tube station. Just as you’re getting off the bus, Citymapper alerts you that the District line has been suspended between Earl’s Court and Embankment, meaning you’ll almost certainly miss your train from Victoria. The app directs you up the Piccadilly line to Green Park and back down to Victoria, where you can catch the next train – if you hurry.

Feeding all this information to you from one app is no small feat of software engineering. It requires accurate real-time data from at least three sources (bus, Tube and train), all of which use different formats, location IDs and semantics. Enter TransportAPI, a five-year-old British company that is the only single source of public transport data in the UK. Whether you’re leaping onto a bus in Bristol or catching a cab in Canary Wharf, TransportAPI aggregates, harmonises and distributes the data, meaning you’ll know when you’ll reach your destination and how much it will cost.

TransportAPI is already a source of valuable data for travellers, but it wants to do more. Where are the empty seats on the top deck of the bus? Is it cheaper to hire a car and drive to Manchester or take the train? TransportAPI wants to make your apps – either as a consumer or a developer – much more informative.

Making the most of open data

It only takes a glance at the Tube map to realise that running a transport system is a complicated business. The morass of real-time data produced by the country’s various transport systems is even more byzantine, which is why academics Jonathan Raper and David Mountain decided someone needed to make sense of it all.

Image: Transport Buzz displays a map of transport-related tweets

In 2010, they set up TransportAPI, with the aim of knitting together all the real-time data from the various transport companies. “A lot of open data is not well documented,” Jonathan Raper, co-founder and managing director of TransportAPI, told Alphr. “We specialise in trying to understand the syntax and semantics of it.”

“A lot of the digital infrastructure around those services is relatively poor,” added Raper. “You may have all the sensor data you can shake a stick at, but the references that tell you where those sensors are located are not in as good shape. So we maintain a lot of infrastructure data, reference lists and lookup tables of different identifiers that are used by different services.”

A key part of TransportAPI’s job is marrying overlapping data from different sources. For example, bus and Tube services may serve the same location, but the two data feeds will use different identifiers for the same stop. TransportAPI pulls this information together so partner app providers such as Citymapper don’t have to. “We do a lot of harmonisation, error checking, validation and organisation,” said Raper.
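To make the idea of a lookup table concrete, here is a minimal sketch of how two feeds' identifiers for the same stop might be mapped to one shared record. It is an illustration only, not TransportAPI's actual schema: the feed names, the identifiers and the canonical ID format are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stop:
    canonical_id: str  # hypothetical shared identifier
    name: str


SOUTH_KEN = Stop("stop:south-kensington", "South Kensington")

# Hypothetical lookup table: (feed name, feed-specific ID) -> canonical stop.
# Every key and value here is made up for illustration.
LOOKUP = {
    ("bus_feed", "B-1042"): SOUTH_KEN,
    ("tube_feed", "SKN"): SOUTH_KEN,
}


def harmonise(feed: str, feed_id: str) -> Optional[Stop]:
    """Return the canonical stop for a feed-specific ID, or None if unknown."""
    return LOOKUP.get((feed, feed_id))
```

An app that asks for `harmonise("bus_feed", "B-1042")` and `harmonise("tube_feed", "SKN")` gets the same `Stop` back, so it never needs to know that the two operators disagree about what the place is called.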

The journey-planner app makers could get all this information for themselves – often for free – but they would rather pay TransportAPI to provide all that data in one go. “We can aggregate,” said Raper, “which means developers don’t need to sign agreements with lots of different data providers and try to make sense of the different formats. In our service, once you’ve used one interface, you can use it in every transport area. If you build a bus app and then decide to add trains, all you have to do is change the URL to ‘/train’.”
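Raper's point about swapping the URL can be sketched as a request builder in which the transport mode is just one path segment. This is a rough illustration under stated assumptions: the host `api.example.com`, the `/stop/.../departures` path and the `limit` parameter are placeholders invented here, not TransportAPI's documented endpoints.

```python
from urllib.parse import quote, urlencode

# Placeholder host; not a real TransportAPI address.
BASE_URL = "https://api.example.com"


def departures_url(mode: str, stop_id: str, **params: str) -> str:
    """Build a departures URL for a transport mode such as 'bus' or 'train'.

    The path layout is hypothetical; only the mode segment changes between modes.
    """
    path = f"{BASE_URL}/{quote(mode)}/stop/{quote(stop_id, safe='')}/departures"
    query = urlencode(params)
    return f"{path}?{query}" if query else path


bus = departures_url("bus", "490000", limit="3")
train = departures_url("train", "490000", limit="3")
```

The two URLs differ only in the `/bus/` versus `/train/` segment, so the code that sends the request and parses the response can stay the same for both modes.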

Raper claims that making sense of the various transport companies’ data is the biggest technical challenge his team faces. The company maintains databases based on “some very complex and difficult-to-handle data formats such as TransXChange, which is the representation of transport timetables. It has exceptionally difficult semantics associated with it. One of the dev team quipped that it’s so sophisticated it’s capable of encoding interplanetary travel timetables as well as the number 46 bus.”

Transport companies even sabotage data to stop the public or their rivals having access to it. “There are situations where things are deliberately obfuscated,” said Raper, accusing some transport operators of trying to protect their monopolies. “It’s a weapon, an example of a 21st-century digital competition tool for large organisations – to try and make their data very difficult to consume. There may be regulatory reasons why they’re being forced to release it, but there are operational reasons why they don’t want other people to see what they’re doing and how they’re doing it.”

Keeping it real-time

It’s critical that the data provided by TransportAPI to app developers is both fast and reliable. Harassed commuters won’t tolerate error messages or inaccurate data any more than they would the cancellation of their train home.

Raper admits that, while you would expect 99.99% reliability from an infrastructure-as-a-service provider such as Amazon or Microsoft Azure, you can’t expect such high figures from data-as-a-service hosts. Why? “There are outages on the sensor networks that feed in and you can’t control that,” not to mention connections with other services “that sometimes get stressed”. He cited the recent Tube strikes as a time when data feeds are under enormous strain, and TransportAPI uses auto-scaling technology with its cloud providers to help ensure information keeps flowing to travellers during data rush hours.
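For a sense of what a figure like 99.99% means in practice, the short calculation below converts an availability percentage into permitted downtime per year. It is a back-of-the-envelope sketch: the function name is invented here, and the article does not give a figure for data-as-a-service hosts, so any lower percentage you try is purely for comparison.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Convert an availability percentage into minutes of downtime per year."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100 - availability_percent) / 100 * MINUTES_PER_YEAR


four_nines = downtime_minutes_per_year(99.99)
```

At 99.99%, the allowance works out at a little under 53 minutes of downtime a year, which shows how little slack an upstream sensor outage leaves.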

The company also does a lot of internal and external monitoring of its systems. Raper said the firm is constantly checking the performance of the Apache server and its Ruby on Rails application, and uses Pingdom to check how the services look to TransportAPI’s customers. “We do monitor the service pretty obsessively,” admitted Raper.

The company publishes its uptime metrics and is very upfront if there are problems with the feeds. “Ultimately, confidence in our service comes from trust,” he said. “If you’re on fire but you don’t tell anybody, and you don’t tell them afterwards either, people worry. You just can’t afford that.”

Going beyond the bus timetable

Transport data may not be a sexy topic of conversation, but the forthcoming projects TransportAPI is developing with its partners are genuinely innovative. One partner is installing Bluetooth counters at bus stops to estimate how many people are waiting at the stop. (An app that tells you how many people are in the bus queue in front of you – is there anything more quintessentially British?) Then there are the prototype cameras that have been installed on two of London’s double-decker buses, which use face tracking to tell customers on the lower deck how many seats are available upstairs, with the precise location of the free seats shown on an LCD display at the bottom of the stairwell.

The company also has ambitions beyond traditional forms of public transport. TransportAPI is working with a company called Taxicode to provide a price-comparison site for cab journeys, allowing users to enter a pickup point and destination to see the range of prices on offer from different firms, and types of vehicle. The company is also in talks with car hire firms and car clubs to merge their data with public transport prices, so that travellers can work out if it’s cheaper to jump behind the wheel for all or part of a journey.

However, TransportAPI’s most disruptive plans concern the rail network and creating a data-mineable source of British rail fares. Raper claims online ticket sellers such as Thetrainline.com aren’t allowed to data-mine the fare information provided by different train operators. TransportAPI will use open data about rail fares to build a database that can answer queries like “can I get a ticket that costs half the price for three times the journey time?” or “what’s the most scenic route between A and B?” “We’re enabling people to find the answers to those sort of questions, and we think we can find new markets,” said Raper. Handing public transport data back to the public? Now that sounds like a “fare” deal.
