Are CSVs published on Australia's open data portals any good?


I really love this post by Ulrich Atz at the ODI. He analysed more than 20,000 links to CSV files and found that only around one third turned out to be machine-readable.

I often wonder what the scores would be for Australia. I was reminded of the analysis by this tweet from @craig.thomler.

The ODI analysis was done using the R programming language (the code is on GitHub).

Is anyone interested in repeating the analysis for Australia?

@dave perhaps this could be a challenge for the upcoming Tableau Public course?

IMHO the issue here is that, in general, we’ve all been far too permissive in what we accept from government, and too grateful for “anything”.

My view is that the vast majority of open data should be provided as either:

  • CSV (comma-delimited, one header line, strings double-quoted where necessary, no preamble)
  • GeoJSON
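
To make the CSV rules above concrete, here is a rough sketch of how a file could be checked against them in Python. This is only an illustration, not the ODI's R code: the function name `csv_problems` and the specific checks are my own assumptions, and it is a heuristic that will miss some cases (for example, several header rows that all happen to have the same width).

```python
import csv
import io


def csv_problems(text):
    """Return a list of problems; an empty list means the checks passed.

    Hypothetical helper for illustration only.
    """
    # csv.reader defaults to comma delimiters and double-quoted strings.
    rows = list(csv.reader(io.StringIO(text, newline="")))
    if len(rows) < 2:
        return ["expected a header row plus at least one data row"]

    problems = []
    header = [cell.strip() for cell in rows[0]]
    if any(cell == "" for cell in header):
        problems.append("header has empty column names")
    if len(set(header)) != len(header):
        problems.append("header has duplicate column names")

    # Preamble lines, blank lines and stray side columns show up as rows
    # whose field count differs from the header's (row numbers are 1-based).
    ragged = [n for n, row in enumerate(rows[1:], start=2)
              if len(row) != len(header)]
    if ragged:
        problems.append(
            f"rows with a different field count than the header: {ragged}"
        )
    return problems
```

A file with a title line and a blank line above the real header would be flagged, because the first line is treated as the header and the later rows do not match its width. Running something like this over every CSV link on a portal and counting the files that return an empty list would give a crude "machine-readable" score.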

So yes, I'd expect the scores for Australia to be bad, in the same way that average quality on a lot of other metrics is pretty poor at the moment.


From memory, just about all of the ABS data in CSV has many header rows and ‘side’ columns. Guess they are out. :slight_smile: