Packaging Public Transport data


#1

I’ve created a data package for Public Transport GTFS data :bus: :train: :ferry: That’s the data from Transport Authorities that’s used by Google Maps and other apps to help plan your journey.

It took a while to work out, so to save you the hassle I’ve documented the process on GitHub. This is probably not the simplest dataset to start with: it contains 8 related files, and the GTFS specification includes a lot of options that individual Transport Authorities can choose to implement.

After you’ve done the work the data package can provide:

After reading about the process, if you’ve got any questions about data packages or JSON Table Schemas, or would like to do something similar for your data, I’m happy to help. :rocket:


Creating and using JSON schemas to validate data
#2

Nice work! Couple of comments:

You might want to make the intro to the README a bit clearer about exactly what is contained, and what its purpose is. It seems to be a mix of scripts, not-updated data, and blog post. For clarity, I would suggest having one repo which is just the (reusable) scripts that can turn raw GTFS into a data package, and a separate location (not necessarily GitHub) to host a sample output of that process, which DataPackage Viewer can point to. The blog post material could then go somewhere else.

For the benefit of other readers, what you’ve made is technically a Tabular Data Package, not just any ordinary garden-variety data package :slight_smile: (I did a bit of work on that spec, especially around making it more readable). A Tabular Data Package contains a JSON Table Schema to define all the fields of each of the packaged CSV files.
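
To make that concrete for other readers, here is a rough sketch of what a descriptor for a single GTFS file might look like, built as a plain Python dict. This is not taken from the repo: the package name `example-gtfs` and the `header_matches` helper are made up for illustration, and only four of the `stops.txt` columns are listed. A real package would describe every file and field the feed actually provides.

```python
import csv
import io
import json


def build_descriptor():
    """Return a minimal Tabular Data Package descriptor for one GTFS file."""
    # JSON Table Schema for stops.txt (illustrative subset of columns)
    stops_schema = {
        "fields": [
            {"name": "stop_id", "type": "string"},
            {"name": "stop_name", "type": "string"},
            {"name": "stop_lat", "type": "number"},
            {"name": "stop_lon", "type": "number"},
        ],
        "primaryKey": "stop_id",
    }
    return {
        "name": "example-gtfs",  # hypothetical package name
        "resources": [
            {"name": "stops", "path": "stops.txt", "schema": stops_schema}
        ],
    }


def header_matches(csv_text, schema):
    """Hypothetical helper: check the CSV header against the schema's field names."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return header == [field["name"] for field in schema["fields"]]


descriptor = build_descriptor()
sample = "stop_id,stop_name,stop_lat,stop_lon\nS1,Central,-27.46,153.02\n"
ok = header_matches(sample, descriptor["resources"][0]["schema"])

# The printed JSON is what would be saved as datapackage.json
print(json.dumps(descriptor, indent=2))
print(ok)
```

The header check is only a sanity test on column names and order. It does not validate types or the primary key, which is what a proper schema-aware tool would do.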

Next question: how do data packages fit into your plans at the moment? Do you have any tools that can consume them? What benefit are you getting (or hoping to get) out of bundling GTFS as a data package?