CSV on the Web

CSVW is a standard for describing and clarifying the content of CSV tables.

This site explains CSVW and suggests some tools you can use for working with it.

The UK Government Digital Service recommends CSVW:

Use the CSV on the Web (CSVW) standard to add metadata to describe the contents and structure of comma-separated values (CSV) data files.

GDS Recommended Open Standards for Government

Use CSVW to annotate CSV files with JSON metadata

Here the metadata provides column names and datatypes. It also declares that the geo-coordinates use the Ordnance Survey National Grid.

grit_bins.csv

42, 425584, 439562
43, 425301, 439519
44, 425379, 439596
45, 425024, 439663
46, 424915, 439697
48, 425157, 440347
49, 424784, 439681
50, 424708, 439759
51, 424913, 440642
52, 425342, 440376
... ... ...

grit_bins.json

{
  "@context": ["http://www.w3.org/ns/csvw", {"@language": "en"}],
  "tables": [{
    "url": "http://opendata.leeds.gov.uk/downloads/gritting/grit_bins.csv",
    "tableSchema": {
      "columns": [
      {
        "name": "location",
        "datatype": "integer"
      },
      {
        "name": "easting",
        "datatype": "decimal",
        "propertyUrl": "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/easting"
      },
      {
        "name": "northing",
        "datatype": "decimal",
        "propertyUrl": "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/northing"
      }
      ],
      "aboutUrl": "#{location}"
    }
  }],
  "dialect": {
    "header": false
  }
}
            

Standardise dialects

CSVW adopts the CSV Dialect standard to tell CSV parsers how a file is formatted: for example, whether cells are delimited with commas or tabs, and whether values are wrapped in double quotes.
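As a rough illustration, a dialect description can be mapped onto the options of an ordinary CSV parser. The sketch below uses Python's standard csv module and handles only a few dialect properties (delimiter, quoteChar, doubleQuote, skipInitialSpace and header). The helper names reader_kwargs and read_rows are invented for this example, and the skipInitialSpace setting used for the grit bins rows is an assumption, not part of the metadata shown above.

```python
import csv
import io


def reader_kwargs(dialect):
    """Translate a few CSVW dialect properties into csv.reader keyword arguments."""
    return {
        "delimiter": dialect.get("delimiter", ","),
        "quotechar": dialect.get("quoteChar", '"'),
        "doublequote": dialect.get("doubleQuote", True),
        "skipinitialspace": dialect.get("skipInitialSpace", False),
    }


def read_rows(text, dialect):
    """Parse CSV text and drop the header row when the dialect says there is one."""
    rows = list(csv.reader(io.StringIO(text), **reader_kwargs(dialect)))
    # In CSVW the "header" property defaults to true.
    if dialect.get("header", True):
        rows = rows[1:]
    return rows


# A tab-delimited file with a header row and a quoted value (made-up data).
rows = read_rows('id\tname\n1\t"Leeds, UK"\n', {"delimiter": "\t"})

# Two rows from grit_bins.csv; skipInitialSpace is assumed here so the
# space after each comma is not kept as part of the value.
grit_rows = read_rows(
    "42, 425584, 439562\n43, 425301, 439519\n",
    {"header": False, "skipInitialSpace": True},
)
```

A fuller implementation would also need to cover properties such as encoding, commentPrefix and skipRows, which this sketch ignores.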

Express types

Table cells may be annotated with a datatype to indicate whether they contain numbers, dates and so on. This makes parsing straightforward: users spend less time preparing the data and can start using it right away.
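For instance, a consumer could use the declared datatypes to convert each cell from text into a native value. This is a minimal sketch covering only a handful of datatypes. The PARSERS table and the parse_row helper are hypothetical names for illustration, and the columns are taken from the grit_bins.json example above.

```python
from datetime import date
from decimal import Decimal

# Map a few CSVW datatype names onto Python converters (far from exhaustive).
PARSERS = {
    "string": str,
    "integer": int,
    "decimal": Decimal,
    "date": date.fromisoformat,
    "boolean": lambda s: {"true": True, "1": True, "false": False, "0": False}[s],
}


def parse_row(cells, columns):
    """Convert one row of text cells using each column's declared datatype."""
    return {
        col["name"]: PARSERS[col.get("datatype", "string")](cell.strip())
        for cell, col in zip(cells, columns)
    }


columns = [
    {"name": "location", "datatype": "integer"},
    {"name": "easting", "datatype": "decimal"},
    {"name": "northing", "datatype": "decimal"},
]
record = parse_row(["42", " 425584", " 439562"], columns)
```

A cell that does not match its declared datatype raises an error here, which is one simple way of surfacing invalid data early.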

Machine-readable footnotes

Don't treat data tables like spreadsheets, cluttering them with footnotes that can only be read by humans! Having a separate metadata file means that both the annotations and the data table itself are cleaner and easier to process. Indeed, using a standard format makes it practically effortless.

Relate tables together

Declare indices and foreign key constraints between tables, letting CSV work like a relational database. You can share multiple tables in a neat package without having to squash everything into one page.
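To give a feel for how this works, the sketch below checks a foreign key declared in the CSVW style (a columnReference pointing at a column of another resource). The two tables, wards.csv and bins.csv, and their contents are entirely hypothetical, and check_foreign_keys is an illustrative helper that handles only single-column keys.

```python
def check_foreign_keys(tables, schemas):
    """Return (table, row number, value) for every cell that breaks a foreign key."""
    violations = []
    for name, schema in schemas.items():
        for fk in schema.get("foreignKeys", []):
            ref = fk["reference"]
            # Collect the values that exist in the referenced column.
            valid = {row[ref["columnReference"]] for row in tables[ref["resource"]]}
            for number, row in enumerate(tables[name], start=1):
                value = row[fk["columnReference"]]
                if value not in valid:
                    violations.append((name, number, value))
    return violations


# Hypothetical data: each bin refers to a ward by its identifier.
tables = {
    "wards.csv": [{"ward_id": "W1"}, {"ward_id": "W2"}],
    "bins.csv": [{"location": "42", "ward": "W1"}, {"location": "43", "ward": "W9"}],
}
schemas = {
    "wards.csv": {"primaryKey": "ward_id"},
    "bins.csv": {
        "foreignKeys": [{
            "columnReference": "ward",
            "reference": {"resource": "wards.csv", "columnReference": "ward_id"},
        }]
    },
}
violations = check_foreign_keys(tables, schemas)
```

Here the second row of bins.csv is reported because no ward with the identifier W9 exists.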

Disambiguate resources

Identify the resources described in a CSV file using URI templates. Understanding the data model helps users to interpret the data, for example to distinguish which columns refer to which entities.
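As a small example, the aboutUrl template "#{location}" from grit_bins.json can be expanded for a row and resolved against the table's URL. This sketch supports only simple {name} variables rather than the full URI Template syntax, and the expand helper is a made-up name for illustration.

```python
import re
from urllib.parse import quote, urljoin


def expand(template, row, base):
    """Fill {name} variables from the row, then resolve the result against base."""
    filled = re.sub(
        r"\{(\w+)\}",
        lambda match: quote(str(row[match.group(1)]), safe=""),
        template,
    )
    return urljoin(base, filled)


base = "http://opendata.leeds.gov.uk/downloads/gritting/grit_bins.csv"
subject = expand("#{location}", {"location": 42}, base)
```

Each row then gets its own identifier: the table URL followed by a fragment holding the value of the location column.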

Create linked data

Create 5 Star Open Data. The CSVW standard describes a csv2rdf procedure for translating CSV into linked data. Enjoy a default translation out of the box or customise your model with the three URI templates that apply to each cell (aboutUrl, propertyUrl and valueUrl).
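The following sketch gives a flavour of the translation for one row of the grit bins table, printing one N-Triples statement per cell. It is a simplified approximation and not the full csv2rdf algorithm: it ignores valueUrl, emits only the per-cell triples, and assumes that a column without a propertyUrl gets its name as a fragment on the table URL. The to_ntriples helper is a hypothetical name.

```python
import re
from urllib.parse import urljoin

XSD = "http://www.w3.org/2001/XMLSchema#"
OS = "http://data.ordnancesurvey.co.uk/ontology/spatialrelations/"
TABLE_URL = "http://opendata.leeds.gov.uk/downloads/gritting/grit_bins.csv"
COLUMNS = [
    {"name": "location", "datatype": "integer"},
    {"name": "easting", "datatype": "decimal", "propertyUrl": OS + "easting"},
    {"name": "northing", "datatype": "decimal", "propertyUrl": OS + "northing"},
]


def to_ntriples(cells, about_url="#{location}"):
    """Emit one N-Triples line per cell of a single row."""
    row = dict(zip((col["name"] for col in COLUMNS), cells))

    def fill(template):
        # Substitute {name} variables, then resolve against the table URL.
        filled = re.sub(r"\{(\w+)\}", lambda m: row[m.group(1)], template)
        return urljoin(TABLE_URL, filled)

    subject = fill(about_url)
    lines = []
    for col in COLUMNS:
        # Assumed default: the column name as a fragment on the table URL.
        predicate = fill(col.get("propertyUrl", "#" + col["name"]))
        value = row[col["name"]]
        datatype = XSD + col["datatype"]
        lines.append(f'<{subject}> <{predicate}> "{value}"^^<{datatype}> .')
    return lines


triples = to_ntriples(["42", "425584", "439562"])
print("\n".join(triples))
```

In practice an existing csv2rdf implementation would be a better choice than hand-rolled code like this, since it follows the complete procedure.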