Eratos Dataset Ontology

> 📘 This page describes the schemas for datasets and their accompanying resources (variables, units, aggregates and schedules).

# Eratos Datasets

An Eratos Dataset is wrapped in a metadata structure that describes the underlying data, so that users can assess whether the dataset they are using for a given model is fit for purpose.

![](https://files.readme.io/b840c5f-57.png)

## Schema | Dataset

```yaml
"@id": ern:e-pn.io:resource:eratos.dataset.<DATASETNAME>
"@type": ern:e-pn.io:schema:dataset
"@public": true  # or false
properties:
  - key: name
    type: string
    description: |
      User readable name for the dataset.
    required: true
    textualIndex: true
  - key: description
    type: string
    description: |
      User readable description for the dataset.
    required: true
    textualIndex: true
  - key: type
    type: resource
    description: |
      Defines the type of the dataset.
    required: true
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:dataset.type
  - key: variables
    type: array
    description: |
      Defines the dependent variables that this dataset contains.
    required: false
    items:
      type: object
      properties:
        - key: key
          type: string
          required: true
          conditionalIndex: true
        - key: name
          type: string
          required: false
          textualIndex: true
        - key: description
          type: string
          required: false
          textualIndex: true
        - key: is
          type: resource
          required: true
          conditionalIndex: true
          resourceTypes:
            - ern:e-pn.io:schema:variable
        - key: unit
          type: resource
          required: false
          conditionalIndex: true
          resourceTypes:
            - ern:e-pn.io:schema:unit
        - key: aggregate
          type: resource
          required: false
          conditionalIndex: true
          resourceTypes:
            - ern:e-pn.io:schema:aggregate
  - key: updateSchedule
    type: resource
    description: |
      Defines the update schedule for the dataset.
    required: true
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:schedule
# General properties.
  - key: model
    type: resource
    description: |
      Defines the model that generated this result.
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:model
  - key: experiment
    type: resource
    description: |
      Defines the experiment that generated this result.
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:experiment
  - key: scenario
    type: resource
    description: |
      Defines the scenario that generated this result.
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:scenario
  - key: region
    type: resource
    description: |
      Defines the region which this dataset covers.
    conditionalIndex: true
    resourceTypes:
      - ern:e-pn.io:schema:location
  - key: temporalRange
    type: object
    description: |
      Defines the range of time for the dataset.
    required: false
    properties:
      - key: start
        type: string
        required: true
        conditionalIndex: true
      - key: end
        type: string
        required: true
        conditionalIndex: true
  - key: temporalFrequency
    type: string
    description: |
      If the dataset is dependent on time, this can specify the sample frequency.
    conditionalIndex: true
    enum:
      - Hourly
      - 3Hourly
      - Daily
      - Weekly
      - Monthly
      - Yearly
      - Custom
  - key: temporalFrequencyCustom
    type: string
    description: |
      The custom temporal frequency if temporalFrequency is Custom.
    conditionalIndex: true
  - key: spatialResolution
    type: string
    description: |
      The amount of spatial detail in an observation/dataset.
    conditionalIndex: true
  - key: spatialRange
    type: string
    description: |
      The bottomLeft and topRight corners of the grid.
    conditionalIndex: true
# Grid specific properties.
  - key: grid
    type: object
    description: |
      Defines the grid properties for gridded datasets.
    required: false
    properties:
      - key: type
        type: string
        conditionalIndex: true
        enum:
          - Rectilinear
          - Curvilinear
      - key: dimensions
        type: array
        items:
          type: object
          properties:
            - key: key
              type: string
              required: true
              conditionalIndex: true
            - key: spacing
              type: string
              required: true
              conditionalIndex: true
              enum:
                - Uniform
                - Varying
            - key: is
              type: resource
              required: true
              conditionalIndex: true
              resourceTypes:
                - ern:e-pn.io:schema:variable
            - key: unit
              type: resource
              required: false
              conditionalIndex: true
              resourceTypes:
                - ern:e-pn.io:schema:unit

```

## Dataset Example | SILO Daily Maximum Temperature

```yaml
"@id": ern:e-pn.io:resource:eratos.dataset.silo.maxtemperature
"@type": ern:e-pn.io:schema:dataset
type: ern:e-pn.io:resource:eratos.dataset.type.gridded
name: SILO Historical Maximum Temperature (Daily)
description: SILO daily historical maximum temperature data.
variables: 
  - key: max_temp
    name: Temperature in Celsius
    is: ern:e-pn.io:resource:eratos.variable.temperature
    unit: ern:e-pn.io:resource:eratos.unit.celsius
    aggregate: ern:e-pn.io:resource:eratos.aggregate.daily.max
updateSchedule: ern:e-pn.io:resource:eratos.schedule.03daily
temporalRange:
  start: '1889'
  end: Yesterday
temporalFrequency: Daily
grid:
  type: Rectilinear
  dimensions:
    - key: time
      spacing: Uniform
      is: ern:e-pn.io:resource:eratos.variable.time
      unit: ern:e-pn.io:resource:eratos.unit.utcseconds
    - key: lat
      spacing: Uniform
      is: ern:e-pn.io:resource:eratos.variable.latitude
      unit: ern:e-pn.io:resource:eratos.unit.degrees
    - key: lon
      spacing: Uniform
      is: ern:e-pn.io:resource:eratos.variable.longitude
      unit: ern:e-pn.io:resource:eratos.unit.degrees
  geo:
    proj: "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs"
    coords: ["lon", "lat"]
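---
# Illustrative sketch only, not a real Eratos resource: a second dataset that
# uses the Custom temporal frequency defined in the dataset schema.
# Assumptions: the dataset name in "@id", the name/description text and the
# format of the temporalFrequencyCustom string are hypothetical. All other
# identifiers are reused from the examples on this page.
"@id": ern:e-pn.io:resource:eratos.dataset.example.maxtemperature10day
"@type": ern:e-pn.io:schema:dataset
type: ern:e-pn.io:resource:eratos.dataset.type.gridded
name: Example Maximum Temperature (10-Day)
description: Hypothetical maximum temperature data sampled every 10 days.
variables:
  - key: max_temp
    is: ern:e-pn.io:resource:eratos.variable.temperature
    unit: ern:e-pn.io:resource:eratos.unit.celsius
    aggregate: ern:e-pn.io:resource:eratos.aggregate.max
updateSchedule: ern:e-pn.io:resource:eratos.schedule.noupdate
temporalFrequency: Custom
temporalFrequencyCustom: 10 days  # free-form string; this format is an assumption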


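```

## Schema | Variable

A variable names the quantity that a dataset variable or grid dimension measures, and lists the units it can be expressed in.

```yaml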
"@id": ern:e-pn.io:resource:eratos.variable.<VARIABLENAME>
"@type": ern:e-pn.io:schema:variable
"@public": true  # or false
properties:
  - key: name
    type: string
    description: |
      User readable name for the variable.
    required: true
    textualIndex: true
    
  - key: description
    type: string
    description: |
      User readable description for the variable.
    required: true
    textualIndex: true
    
  - key: units
    type: array
    description: |
      A list of units that this variable can be defined as.
    items:
      type: resource
      conditionalIndex: true
      resourceTypes:
        - ern:e-pn.io:schema:unit

```

## Variable Example | Temperature

```yaml
"@id": ern:e-pn.io:resource:eratos.variable.temperature
"@type": ern:e-pn.io:schema:variable
"@public": true
name: Surface temperature
description: Surface temperature value
units: 
  - ern:e-pn.io:resource:eratos.unit.celsius
  - ern:e-pn.io:resource:eratos.unit.kelvin
  - ern:e-pn.io:resource:eratos.unit.celsiusperday
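---
# Illustrative sketch, assuming the latitude variable referenced by the SILO
# grid dimensions is defined along these lines. The "@id" and the degrees unit
# come from that example; the name and description text are assumptions.
"@id": ern:e-pn.io:resource:eratos.variable.latitude
"@type": ern:e-pn.io:schema:variable
"@public": true
name: Latitude
description: Angular distance north or south of the equator.
units:
  - ern:e-pn.io:resource:eratos.unit.degrees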


```

## Schema | Unit

```yaml
"@id": ern:e-pn.io:resource:eratos.unit.<UNITNAME>
"@type": ern:e-pn.io:schema:unit
"@public": true  # or false
properties:
  - key: name
    type: string
    description: |
      User readable name for the unit.
    required: true
    textualIndex: true
    
  - key: description
    type: string
    description: |
      User readable description for the unit.
    required: true
    textualIndex: true

```

## Unit Example | Celsius

```yaml
"@id": ern:e-pn.io:resource:eratos.unit.celsius
"@type": ern:e-pn.io:schema:unit
"@public": true
name: Degrees Celsius
description: A measure of temperature, degrees Celsius.
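---
# Illustrative sketch, assuming the kelvin unit listed by the temperature
# variable follows the same pattern. Only the "@id" appears on this page;
# the name and description text are assumptions.
"@id": ern:e-pn.io:resource:eratos.unit.kelvin
"@type": ern:e-pn.io:schema:unit
"@public": true
name: Kelvin
description: A measure of absolute temperature, kelvin.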


```

## Schema | Aggregate

```yaml
"@id": ern:e-pn.io:resource:eratos.aggregate.<AGGREGATENAME>
"@type": ern:e-pn.io:schema:aggregate
"@public": true  # or false
properties:
  - key: name
    type: string
    description: |
      User readable name for the aggregate.
    required: true
    textualIndex: true
    
  - key: description
    type: string
    description: |
      User readable description for the aggregate.
    required: true
    textualIndex: true

```

## Aggregate Example | Max

```yaml
"@id": ern:e-pn.io:resource:eratos.aggregate.max
"@type": ern:e-pn.io:schema:aggregate
"@public": true
name: Maximum
description: The greatest amount recorded.

```

## Aggregate Example | Daily Max

```yaml
"@id": ern:e-pn.io:resource:eratos.aggregate.daily.max
"@type": ern:e-pn.io:schema:aggregate
"@public": true
name: Daily Maximum
description: The greatest amount or extent recorded per day (a period of 24 hours).


```

## Schema | Schedule

```yaml
"@id": ern:e-pn.io:resource:eratos.schedule.<SCHEDULENAME>
"@type": ern:e-pn.io:schema:schedule
"@public": true  # or false
properties:
  - key: name
    type: string
    description: |
      User readable name for the schedule.
    required: true
    textualIndex: true
  - key: description
    type: string
    description: |
      User readable description for the schedule.
    required: true
    textualIndex: true
  - key: cron
    type: string
    description: |
      The schedule as defined using a cron expression (See https://en.wikipedia.org/wiki/Cron).

```

## Schedule Example | No Update

```yaml
"@id": ern:e-pn.io:resource:eratos.schedule.noupdate
"@type": ern:e-pn.io:schema:schedule
name: No update
description: The dataset is static and will not update.
cron: "* * * * *"
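---
# Illustrative sketch only, not a published Eratos schedule: how a weekly
# schedule might be written with the same schema. The "@id", name and
# description are hypothetical. The time zone in which the cron expression
# is evaluated is not stated on this page.
# Standard cron field order: minute, hour, day of month, month, day of week.
"@id": ern:e-pn.io:resource:eratos.schedule.example.weekly
"@type": ern:e-pn.io:schema:schedule
name: Weekly (example)
description: Hypothetical schedule that updates every Monday at 03:00.
cron: "0 3 * * 1"  # minute 0, hour 3, any day of month, any month, Monday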

```

## Schedule Example | 03 Daily

```yaml
"@id": ern:e-pn.io:resource:eratos.schedule.03daily
"@type": ern:e-pn.io:schema:schedule
"@public": true
name: 03:00 UTC Daily
description: The schedule will update daily at 03:00 UTC, which is 13:00 AEST.
cron: "0 3 * * *"
```