Saved file formats
Currently, three types of files can be exported: configuration (.conf) files, NumPy (.npz) files and NetCDF (.nc) files.
.conf file format (Providentia)
Users can download their configurations (.conf) and use the corresponding files to launch Providentia again. Configuration files can be used in the dashboard, report, interpolation and download modes, either through the command line or by using the load button.
Numeric file formats
The following tables summarize the variables that are available after exporting data to NumPy (.npz) files and NetCDF (.nc) files.
As Providentia can load multiple networks and species simultaneously, some of these variables are provided per [network]-[species].
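As a small illustration of this naming scheme, the sketch below builds the per-[network]-[species] variable names for a few pairs. The helper `variable_name` and the example pairs are hypothetical: they only mirror the pattern described here and are not part of Providentia.

```python
def variable_name(network, species, suffix):
    """Build a name such as 'EBAS-sconco3_data' (hypothetical helper)."""
    return f"{network}-{species}_{suffix}"


# Illustrative pairs only; use the networks and species present in your export.
pairs = [("EBAS", "sconco3"), ("EBAS", "sconcno2")]

# One data variable name per network-species pair.
data_names = [variable_name(network, species, "data") for network, species in pairs]
print(data_names)
```

The same helper can be reused with other suffixes, for example `ghost_data` or `metadata`.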
NumPy (.npz file format)
Variables
| Variable | Description |
|---|---|
| [network]-[species]_data | Values of the desired species for both observations and models |
| [network]-[species]_ghost_data | GHOST data variables used for additional filtering |
| [network]-[species]_metadata | Metadata of the observations, which vary per month, given as a multidimensional array |
| time | Time in given resolution from the start date |
| data_labels | Labels associated with each data array, e.g. observations, model_1, etc. |
| ghost_data_variables | The names of the GHOST data variables used for additional filtering |
| resolution | Temporal resolution of data |
| start_date | Start date of data |
| end_date | End date of data |
| temporal_colocation | Boolean stating if observations and models have been temporally colocated |
| spatial_colocation | Boolean stating if data has been spatially colocated across [network]-[species] |
| filter_species | Data ranges per species used to filter read data |
| ghost_version | Version of GHOST |
Loading the data
A .npz file can be loaded in Python with:
In [1]: import numpy as np
In [2]: obs = np.load("/home/bsc32/bsc32099/PRV_sconco3_20160101_20160601.npz", allow_pickle=True)
Note that the allow_pickle argument must be set to True.
To list the variables contained in the loaded .npz file, use the “files” attribute:
In [3]: obs.files
Out[3]: ['EBAS-sconco3_ghost_data', 'EBAS-sconco3_data', 'EBAS-sconco3_metadata'...]
Values for a data variable are returned by:
In [4]: data = obs['EBAS-sconco3_data']
Metadata access is special in the .npz files, as the metadata of each [network]-[species] is stored as a structured array. The metadata variable names can be returned by:
In [5]: metadata_vars = obs['EBAS-sconco3_metadata'].dtype.names
Any specific metadata field can be accessed by using one of the metadata variable names:
In [6]: latitude = obs['EBAS-sconco3_metadata']['latitude']
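The steps above can be combined into one short script. The sketch below does not read a real export: it first writes a small synthetic .npz file that uses the documented variable names, so the values, the array shapes and the dimension order (label, station, time for data; station, month for metadata) are assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np

# Synthetic stand-in for an exported file: shapes and values are made up,
# only the variable names follow the documented naming.
labels = np.array(["observations", "model_1"])
data = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)  # assumed (label, station, time)
metadata = np.zeros((3, 1), dtype=[("latitude", "f8"), ("longitude", "f8")])  # assumed (station, month)
metadata["latitude"][:, 0] = [41.4, 48.9, 52.5]

path = os.path.join(tempfile.mkdtemp(), "example.npz")
np.savez(path, **{"EBAS-sconco3_data": data,
                  "EBAS-sconco3_metadata": metadata,
                  "data_labels": labels})

# Load the file and read the variables of one network-species pair.
with np.load(path, allow_pickle=True) as obs:
    key = "EBAS-sconco3"
    values = obs[f"{key}_data"]
    station_metadata = obs[f"{key}_metadata"]
    data_labels = list(obs["data_labels"])

# Select the array belonging to one label, here the observations.
observations = values[data_labels.index("observations")]
latitude = station_metadata["latitude"]
print(observations.shape, station_metadata.dtype.names)
```

With a real export, only the part from `np.load` onwards is needed, and `path` and `key` should be replaced by the exported file and one of the [network]-[species] pairs it contains.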
NetCDF (.nc file format)
Variables
| Variable | Description |
|---|---|
| [network]-[species]_data | Values of the desired species for both observations and models |
| [network]-[species]_ghost_data | GHOST data variables used for additional filtering |
| [network]-[species]_[metadata_var] | Metadata of the observations, which vary per month, given per variable |
| [network]-[species]_qa | Quality assurance flags, GHOST performed quality control checks |
| [network]-[species]_flags | Data flags, standardised flags taken from the data provider |
| time | Time in given resolution from the start date |
| data_labels | Labels associated with each data array, e.g. observations, model_1, etc. |
| ghost_data_variables | The names of the GHOST data variables used for additional filtering |
resolution, start_date, end_date, temporal_colocation, spatial_colocation, filter_species and ghost_version are stored as attributes of [network]-[species]_data.
Loading the data
These files can be read as usual, typically with the netCDF4 library:
from netCDF4 import Dataset
dataset = Dataset("PRV_sconco3_20160101_20170101.nc")
Or xarray:
import xarray as xr
dataset = xr.open_dataset("PRV_sconco3_20160101_20170101.nc")
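Once a file is open, the values and the attributes of a data variable can be read from it. The sketch below uses a small in-memory xarray dataset as a stand-in for an exported file, so the dimension names, the values and the attribute contents are assumptions; with a real file, replace the stand-in with the result of `xr.open_dataset`.

```python
import numpy as np
import xarray as xr

# In-memory stand-in for an exported file (made-up values and dimension names).
dataset = xr.Dataset(
    {
        "EBAS-sconco3_data": (
            ("data_label", "station", "time"),
            np.zeros((2, 3, 4)),
            {"resolution": "hourly", "start_date": "20160101", "end_date": "20170101"},
        )
    }
)

data_var = dataset["EBAS-sconco3_data"]
values = data_var.values                   # plain NumPy array
resolution = data_var.attrs["resolution"]  # attribute stored on the data variable
print(values.shape, resolution)
```

With netCDF4, the corresponding calls would be `dataset.variables["EBAS-sconco3_data"][:]` for the values and `dataset.variables["EBAS-sconco3_data"].getncattr("resolution")` for an attribute.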