Grid Schema¶
This document describes the schema to which gridded Earth system data is cast in Argovis.
How to read this schema¶
Each entry in the schema fragments below contain a few keys:
type: the primitive type, format, or object description of a valid entry for this field
description: short comment on what this variable is
fill value (optional): what this should be filled by if absent
current vocabulary (optional): current set of possible values for this key, with explanations as required.
Schema enforcement & population¶
All these schema are enforced via mongodb’s built-in schema validation; those schema validation rules are defined in the grids.py and grids-meta.py scripts at https://github.com/argovis/db-schema.
Once empty collections are generated by the script above, it is populated by pipelines that translate from the formats of upstream data providers, to mongodb-appropriate JSON that matches these schemas. Each section below contains links to the code backing these pipelines. Note that each gridded product lives in its own mongo collection at this time.
Grid Metadata Schema¶
In order to avoid repetition of information that is the same for all grid points, each grid in collection <grid name> will have a record in collection grids-meta with _id=<grid name> that carries all the metadata for that grid. The schema of grids-meta documents are described in this section.
Scripts for translating upstream data to these schema:
Roemmich-Gilson temperature and salinity: https://github.com/argovis/grid-sync/blob/main/translate-rg-grid.py
Required keys¶
Note all keys are required for metadata docs.
_idtype: string
description: a string that matches the collection name that contains the data that should be associated with this metadata record
current vocabulary:
rgTempTotal,rgPsalTotal,ohc
unitstype: string
description: the units of measure by which the
dvariables in the corresponding grids should be interpreted.
levelstype: array of floats
description: the pressures in dbar of all levels of this grid, in the same order as the
drecord of the corresponding grid data.
date_addedtype: ISO 8601 UTC datestring, example
1999-12-31T23:59:59Zdescription: time the record was added to Argovis
lonrangetype: array of floats
description: minimum and maximum longitudes, in degrees, found in this grid
latrangetype: array of floats
description: minimum and maximum latitudes, in degrees, found in this grid
timerangetype: array of ISO 8601 UTC datestring
descriptionL minimum and maximum dates found in this grid
loncelltype: float
description: grid spacing in degrees in longitude
latcelltype: float
description: grid spacing in degrees in latitude
Grid Data Schema¶
Each gridded product is stored in its own collection, the name of which has a corresponding entry in the grids-meta collection. Documents in the gridded data collections obey the following schema, which describe a profile-like column of a grid.
Scripts for translating upstream data to these schema:
Roemmich-Gilson temperature and salinity: https://github.com/argovis/grid-sync/blob/main/translate-rg-grid.py
Required keys¶
Note all keys are required for gridded docs.
gtype: geojson Point object
description: geojson Point tagging the lon/lat of this record.
ttype: ISO 8601 UTC datestring, example
1999-12-31T23:59:59Zdescription: time this data corresponds to.
dtype: array of floats
description: the variable data for this column of the grid. These entries correspond in order to the
levelsarray in the corresponding metadata document which describes their depth, and carry the units of theunitsentry in the same metadata document.