Baseline Models
Six baseline models were created and trained:
- Convolutional neural network (CNN)
- Encoder-decoder (ED)
- Heteroskedastic regression (HSR)
- Multilayer perceptron (MLP)
- Randomized prior network (RPN)
- Conditional variational autoencoder (cVAE)
Jupyter notebooks describe how to load and train the simple CNN and MLP models. The environments and code used to train each model, as well as the pretrained models, are in the baseline_models/ folder on GitHub.
The baseline models were trained on the Low-Resolution Real Geography dataset. The subset of variables used to train our models is shown below:
| Input | Target | Variable | Description | Units | Dimensions |
|-------|--------|----------|-------------|-------|------------|
| X |   | T | Air temperature | K | (lev, ncol) |
| X |   | q | Specific humidity | kg/kg | (lev, ncol) |
| X |   | PS | Surface pressure | Pa | (ncol) |
| X |   | SOLIN | Solar insolation | W/m² | (ncol) |
| X |   | LHFLX | Surface latent heat flux | W/m² | (ncol) |
| X |   | SHFLX | Surface sensible heat flux | W/m² | (ncol) |
|   | X | dT/dt | Heating tendency | K/s | (lev, ncol) |
|   | X | dq/dt | Moistening tendency | kg/kg/s | (lev, ncol) |
|   | X | NETSW | Net surface shortwave flux | W/m² | (ncol) |
|   | X | FLWDS | Downward surface longwave flux | W/m² | (ncol) |
|   | X | PRECSC | Snow rate | m/s | (ncol) |
|   | X | PRECC | Rain rate | m/s | (ncol) |
|   | X | SOLS | Visible direct solar flux | W/m² | (ncol) |
|   | X | SOLL | Near-IR direct solar flux | W/m² | (ncol) |
|   | X | SOLSD | Visible diffuse solar flux | W/m² | (ncol) |
|   | X | SOLLD | Near-IR diffuse solar flux | W/m² | (ncol) |
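
To make the dimensions in the table concrete, the following is a minimal sketch of how the listed variables could be stacked into one feature vector per column. It takes T, q, PS, SOLIN, LHFLX, and SHFLX as inputs and the remaining variables as targets. The helper names (`vector_length`, `stack_columns`), the number of vertical levels (`N_LEV = 60`), the toy column count, and the random data are assumptions made for illustration; this is not the code in the baseline_models/ folder.

```python
import numpy as np

# Variable name -> True if the variable has a vertical (lev) dimension.
INPUTS = {"T": True, "q": True, "PS": False, "SOLIN": False,
          "LHFLX": False, "SHFLX": False}
TARGETS = {"dT/dt": True, "dq/dt": True, "NETSW": False, "FLWDS": False,
           "PRECSC": False, "PRECC": False, "SOLS": False, "SOLL": False,
           "SOLSD": False, "SOLLD": False}


def vector_length(spec, n_lev):
    """Number of features per column: n_lev per profile, 1 per scalar."""
    return sum(n_lev if has_lev else 1 for has_lev in spec.values())


def stack_columns(fields, spec):
    """Stack variables into a (ncol, n_features) matrix.

    `fields` maps each name to an array of shape (lev, ncol) for profiles
    or (ncol,) for surface scalars, matching the table's Dimensions column.
    """
    parts = []
    for name, has_lev in spec.items():
        arr = np.asarray(fields[name])
        # Profiles become (ncol, lev); scalars become (ncol, 1).
        parts.append(arr.T if has_lev else arr[:, None])
    return np.concatenate(parts, axis=1)


N_LEV = 60  # assumed number of vertical levels (hypothetical here)
N_COL = 5   # toy number of columns, for illustration only

rng = np.random.default_rng(0)
fields = {
    name: rng.normal(size=(N_LEV, N_COL) if has_lev else (N_COL,))
    for name, has_lev in INPUTS.items()
}
x = stack_columns(fields, INPUTS)

print(x.shape)                        # (N_COL, 2 * N_LEV + 4)
print(vector_length(TARGETS, N_LEV))  # 2 * N_LEV + 8
```

With this layout each column becomes one training sample, so a model such as the MLP maps a vector of length 2 * lev + 4 to a vector of length 2 * lev + 8.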