What should be done to make data comparable and interoperable?


#1

The greatest value from data usually comes when multiple datasets are combined.

What should be done to make it easy to compare and combine data?


#2

Data can only be compared safely if:
1 - the semantic definitions of the properties are publicly accessible
2 - there is a consistent overarching information model
3 - relationships between properties are clearly expressed
4 - provenance information is available to determine the quality/status of the data
Following open data standards (and formats) helps in this direction.


#3

To me this starts with metadata.

What is the data intended for, what do the individual fields contain, and why? Ideally with an example.
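
A rough sketch of what that could look like in practice, assuming a small Python structure of my own invention (`FieldDoc`, `undocumented_columns`, and the air quality fields are all hypothetical, not taken from any standard): each field records what it contains, why it is there, and an example value, and a helper flags columns nobody has documented yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDoc:
    """Documentation for one column of a dataset (hypothetical structure)."""
    name: str
    description: str  # what the field contains
    rationale: str    # why the field is collected
    example: str      # a sample value


def undocumented_columns(columns, docs):
    """Return the columns that have no FieldDoc entry, in their original order."""
    documented = {doc.name for doc in docs}
    return [col for col in columns if col not in documented]


# Hypothetical dataset purpose and field documentation.
PURPOSE = "Hourly air quality readings, published for public health reporting."
DOCS = [
    FieldDoc("station_id", "Identifier of the monitoring station",
             "Lets readings be joined to station locations", "S042"),
    FieldDoc("pm25", "PM2.5 concentration in micrograms per cubic metre",
             "Primary health indicator", "12.5"),
]

missing = undocumented_columns(["station_id", "pm25", "recorded_at"], DOCS)
print(missing)  # ['recorded_at']
```

Running a check like this before publishing is one cheap way to make sure the metadata keeps up with the data.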


#4

Much goodness can be found here https://github.com/datagovsg/data-quality/blob/master/README.md


#5

While I agree, @Gonzo, that there is much to like in the Singapore work, I am always disappointed when people hack a standard.

The Frictionless data standard by Open Knowledge is being built into CKAN and many programming languages.

There is a good summary of W3C data standards work at https://www.w3.org/blog/2016/09/just-how-should-we-share-data-on-the-web/

So now we’re left with a choice between a community-driven standard with tools and international standards with limited support.

For my part, I’ll be using Frictionless Data until the international movement catches up.
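
To give a flavour of why a shared schema helps with combining data, here is a minimal sketch using plain Python dicts laid out loosely like a Table Schema descriptor (a `fields` list of `name`/`type` pairs). It does not use any Frictionless library; the two datasets and the `compare_schemas` helper are made up for illustration.

```python
def field_types(schema):
    """Map each field name to its declared type."""
    return {field["name"]: field["type"] for field in schema["fields"]}


def compare_schemas(a, b):
    """Split the fields shared by two schemas into type-compatible and conflicting names."""
    types_a, types_b = field_types(a), field_types(b)
    common = sorted(set(types_a) & set(types_b))
    compatible = [name for name in common if types_a[name] == types_b[name]]
    conflicting = [name for name in common if types_a[name] != types_b[name]]
    return compatible, conflicting


# Hypothetical descriptors for two datasets we would like to join.
readings = {
    "fields": [
        {"name": "station_id", "type": "string"},
        {"name": "recorded_at", "type": "datetime"},
        {"name": "pm25", "type": "number"},
    ]
}
stations = {
    "fields": [
        {"name": "station_id", "type": "string"},
        {"name": "recorded_at", "type": "date"},
        {"name": "latitude", "type": "number"},
    ]
}

compatible, conflicting = compare_schemas(readings, stations)
print(compatible)   # ['station_id']
print(conflicting)  # ['recorded_at']
```

Here `station_id` looks safe to join on, while `recorded_at` needs attention first because the two publishers declared it differently.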


#6