What should be done to make data comparable and interoperable?


The greatest value from data usually comes when multiple datasets are combined.

What should be done to make it easy to compare and combine data?


Data can only (safely) be compared if:
1 - the semantic definitions of the properties are publicly accessible
2 - there is a consistent overarching information model
3 - relationships between properties are clearly expressed
4 - provenance information is available to determine the quality/status of the data
Following open data standards (and formats) helps on all four counts.
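As a rough sketch, the four conditions above can be expressed as checks against a dataset descriptor. The descriptor keys used here (definition_url, information_model, foreignKeys, sources) and the example URLs are invented for illustration, loosely in the spirit of a Frictionless-style descriptor, not any fixed schema:

```python
# Illustrative only: a minimal dataset descriptor and a checker for the
# four conditions above. All key names here are assumptions, not a spec.

def check_comparability(descriptor):
    """Return condition -> bool for the four comparability conditions."""
    fields = descriptor.get("fields", [])
    return {
        # 1. semantic definitions of the properties publicly accessible
        "semantics": all("definition_url" in f for f in fields),
        # 2. a consistent overarching information model
        "model": "information_model" in descriptor,
        # 3. relationships between properties clearly expressed
        "relationships": "foreignKeys" in descriptor,
        # 4. provenance available to judge quality/status
        "provenance": "sources" in descriptor,
    }

descriptor = {
    "information_model": "https://example.org/model",  # hypothetical URL
    "fields": [
        {"name": "nox_level",
         "definition_url": "https://example.org/defs/nox"},
    ],
    "foreignKeys": [
        {"fields": "station_id",
         "reference": {"resource": "stations", "fields": "id"}},
    ],
    "sources": [{"title": "City sensor network", "date": "2016-08-01"}],
}

print(check_comparability(descriptor))
```

A publisher could run a check like this before release; anything that comes back False is a gap a consumer would otherwise discover the hard way.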


To me this starts with metadata.

What is the dataset intended for, what do the individual fields contain, and why? Ideally with an example.
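For instance, per-field metadata might look like the following. The fields, descriptions, and example values are invented to show the shape, not taken from any real dataset:

```python
# Invented example: per-field metadata answering "what does this field
# contain, and why", with an example value for each field.
field_metadata = [
    {
        "name": "reading_date",
        "type": "date",
        "description": "Calendar date the measurement was taken (UTC).",
        "example": "2016-09-21",
    },
    {
        "name": "pm25",
        "type": "number",
        "description": "Fine particulate matter, micrograms per cubic metre.",
        "example": 12.4,
    },
]

# Render a human-readable data dictionary from the same metadata.
for f in field_metadata:
    print(f"{f['name']} ({f['type']}): {f['description']} e.g. {f['example']}")
```

Keeping this in a machine-readable form means the same metadata can drive both documentation and validation.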


Much goodness can be found here https://github.com/datagovsg/data-quality/blob/master/README.md


While I agree @Gonzo that there is much to like in the Singapore work, I am always disappointed when people hack a standard.

The Frictionless Data standard by Open Knowledge is being built into CKAN and has libraries in many programming languages.
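As a sketch of what that standard looks like in practice, a minimal datapackage.json can be written by hand. The dataset, file path, and columns below are made up, but the top-level keys (name, licenses, resources, schema, fields) follow the Frictionless Data Package and Table Schema specs:

```python
import json

# A minimal Data Package descriptor in the Frictionless style.
# The dataset itself (air-quality.csv and its columns) is invented.
datapackage = {
    "name": "air-quality",
    "licenses": [{"name": "CC-BY-4.0",
                  "path": "https://creativecommons.org/licenses/by/4.0/"}],
    "resources": [
        {
            "name": "readings",
            "path": "air-quality.csv",
            "format": "csv",
            "schema": {
                "fields": [
                    {"name": "station_id", "type": "string"},
                    {"name": "pm25", "type": "number"},
                ],
            },
        }
    ],
}

# Serialise as datapackage.json, the file tools look for.
print(json.dumps(datapackage, indent=2))
```

Because the descriptor is plain JSON, the same file works across the CKAN integration and the various language libraries.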

There is a good summary of W3C data standards work at https://www.w3.org/blog/2016/09/just-how-should-we-share-data-on-the-web/

So now we’re left with a choice: a community-driven standard with tooling, or international standards with limited support.

For me, I’ll be using Frictionless Data until the international standards catch up.