Top-level workflow in LMRt¶
Expected time to run through: 3 mins
In this tutorial, we demonstrate how to reproduce our 1st LMR reconstruction with the top-level workflow. This top-level workflow is essentailly a shortcut of the job.prepare()
and job.run()
we have seen in the previous notebook on the high-level workflow. With the high-level workflow, users still have the flexibilty to perform intermediate steps, while with the top-level workflow, users can only execute the one-line command to reproduce an experiment totally based on the given
configuration YAML file, without any chance to do any intermediate modification of the experiment.
We will first generate a configuration YAML file with a different job_dirpath
so as to store our results to a different location, and then we use the job.run_cfg()
method to reproduce the reconstruction directly!
Test data preparation¶
Again, if you haven’t done yet, please prepare test data following the steps: 1. Download the test case named “PAGES2k_CCSM4_GISTEMP” with this link. 2. Create a directory named “testcases” in the same directory of this notebook. 3. Put the unzipped direcotry “PAGES2k_CCSM4_GISTEMP” into “testcases”.
Below, we first load some useful packages, including our LMRt
.
[1]:
%load_ext autoreload
%autoreload 2
import LMRt
import os
import numpy as np
import pandas as pd
import xarray as xr
Run!¶
The top-level workflow allows the users to conduct the reconstruction purely based on a configuration YAML file. However, it still allows runtime modification of the job_dirpath
and recon_seeds
in the call for flexibility.
[2]:
LMRt.ReconJob().run_cfg(
cfg_path='./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml',
job_dirpath='./testcases/PAGES2k_CCSM4_GISTEMP/recon_top',
recon_seeds=np.arange(1),
verbose=True,
)
LMRt: job.load_configs() >>> loading reconstruction configurations from: ./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml
LMRt: job.load_configs() >>> job.configs created
LMRt: job.load_configs() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon
LMRt: job.load_configs() >>> /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon created
{'anom_period': [1951, 1980],
'job_dirpath': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon',
'job_id': 'LMRt_quickstart',
'obs_path': {'tas': './data/obs/gistemp1200_ERSSTv4.nc'},
'obs_varname': {'tas': 'tempanomaly'},
'prior_path': {'tas': './data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'},
'prior_regrid_ntrunc': 42,
'prior_season': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
'prior_varname': {'tas': 'TREFHT'},
'proxy_frac': 0.75,
'proxydb_path': './data/proxy/pages2k_dataset.pkl',
'psm_calib_period': [1850, 2015],
'ptype_psm': {'coral.SrCa': 'linear',
'coral.calc': 'linear',
'coral.d18O': 'linear'},
'ptype_season': {'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]},
'recon_loc_rad': 25000,
'recon_nens': 100,
'recon_period': [0, 2000],
'recon_seeds': [0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19],
'recon_timescale': 1,
'recon_vars': 'tas'}
LMRt: job.load_configs() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top
LMRt: job.load_configs() >>> /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top created
LMRt: job.run() >>> job.configs["recon_seeds"] = [0]
LMRt: job.prepare() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top
LMRt: job.load_proxydb() >>> job.configs["proxydb_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl
LMRt: job.load_proxydb() >>> 692 records loaded
LMRt: job.load_proxydb() >>> job.proxydb created
LMRt: job.filter_proxydb() >>> job.configs["ptype_season"] = {'coral.d18O': 'linear', 'coral.SrCa': 'linear', 'coral.calc': 'linear'}
LMRt: job.filter_proxydb() >>> filtering proxy records according to: ['coral.d18O', 'coral.SrCa', 'coral.calc']
LMRt: job.filter_proxydb() >>> 95 records remaining
LMRt: job.seasonalize_proxydb() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_proxydb() >>> seasonalizing proxy records according to: {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_proxydb() >>> 95 records remaining
LMRt: job.seasonalize_proxydb() >>> job.proxydb updated
LMRt: job.load_prior() >>> job.configs["prior_path"] = {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'}
LMRt: job.load_prior() >>> job.configs["anom_period"] = [1951, 1980]
LMRt: job.load_prior() >>> loading model prior fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'}
Time axis not overlap with the reference period [1951, 1980]; use its own time period as reference [850.08, 1850.00].
LMRt: job.load_prior() >>> raw prior
Dataset Overview
-----------------------
Name: tas
Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
Shape: time:12000, lat:96, lon:144
LMRt: job.load_prior() >>> job.prior created
LMRt: job.load_obs() >>> job.configs["anom_period"] = [1951, 1980]
LMRt: job.load_obs() >>> loading instrumental observation fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/obs/gistemp1200_ERSSTv4.nc'}
LMRt: job.load_obs() >>> job.obs created
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["seasonalized_prior_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/seasonalized_prior.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["seasonalized_obs_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/seasonalized_obs.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["prior_loc_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/prior_loc.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["obs_loc_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/obs_loc.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["calibed_psm_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/calibed_psm.pkl
LMRt: job.seasonalize_ds_for_psm() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from prior with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Searching nearest location: 12%|█▏ | 11/95 [00:00<00:00, 104.80it/s]
LMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_prior created
Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 110.35it/s]
/Users/fzhu/Github/LMRt/LMRt/utils.py:243: RuntimeWarning: Mean of empty slice
tmp = np.nanmean(var[inds, ...], axis=0)
LMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.prior_lat_idx & job.proxydb.prior_lon_idx created
LMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].prior_time & job.proxydb.records[pid].prior_value created
LMRt: job.seasonalize_ds_for_psm() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from obs with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Searching nearest location: 11%|█ | 10/95 [00:00<00:00, 98.52it/s]
LMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_obs created
Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 99.58it/s]
Calibrating PSM: 9%|▉ | 9/95 [00:00<00:00, 86.26it/s]
LMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.obs_lat_idx & job.proxydb.obs_lon_idx created
LMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].obs_time & job.proxydb.records[pid].obs_value created
LMRt: job.proxydb.init_psm() >>> job.proxydb.records[pid].psm initialized
LMRt: job.calibrate_psm() >>> job.configs["psm_calib_period"] = [1850, 2015]
LMRt: job.calibrate_psm() >>> PSM calibration period: [1850, 2015]
Calibrating PSM: 68%|██████▊ | 65/95 [00:00<00:00, 86.59it/s]
The number of overlapped data points is 0 < 25. Skipping ...
Calibrating PSM: 100%|██████████| 95/95 [00:01<00:00, 86.98it/s]
Forwarding PSM: 100%|██████████| 95/95 [00:00<00:00, 1519.37it/s]
LMRt: job.proxydb.calib_psm() >>> job.proxydb.records[pid].psm calibrated
LMRt: job.proxydb.calib_psm() >>> job.proxydb.calibed created
LMRt: job.proxydb.forward_psm() >>> job.proxydb.records[pid].psm forwarded
LMRt: job.seasonalize_prior() >>> job.configs["prior_season"] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
LMRt: job.seasonalize_prior() >>> seasonalized prior w/ season [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Dataset Overview
-----------------------
Name: tas
Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
Shape: time:1001, lat:96, lon:144
LMRt: job.seasonalize_ds_for_psm() >>> job.prior updated
LMRt: job.regrid_prior() >>> regridded prior
Dataset Overview
-----------------------
Name: tas
Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
Shape: time:1001, lat:42, lon:63
LMRt: job.regrid_prior() >>> job.prior updated
LMRt: job.prepare() >>> Prepration data saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.prepare() >>> job.configs["precalc"]["prep_savepath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
KF updating: 3%|▎ | 56/2001 [00:00<00:03, 552.02it/s]
LMRt: job.save_job() >>> Prepration data saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.save_job() >>> job.configs["precalc"]["prep_savepath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.run() >>> job.configs["recon_seeds"] = [0]
LMRt: job.run() >>> job.configs["recon_vars"] = tas
LMRt: job.run() >>> job.configs["recon_nens"] = 100
LMRt: job.run() >>> job.configs["proxy_frac"] = 0.75
LMRt: job.run() >>> job.configs["recon_period"] = [0, 2000]
LMRt: job.run() >>> job.configs["recon_timescale"] = 1
LMRt: job.run() >>> job.configs["recon_loc_rad"] = 25000
LMRt: job.run() >>> job.configs["save_settings"] = {'compress_dict': {'zlib': True, 'least_significant_digit': 1}, 'output_geo_mean': False, 'target_lats': [], 'target_lons': [], 'output_full_ens': False, 'dtype': 32}
LMRt: job.run() >>> job.configs saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_configs.yml
LMRt: job.run() >>> seed: 0 | max: 0
LMRt: job.run() >>> randomized indices for prior and proxies saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_r00_idx.pkl
Proxy Database Overview
-----------------------
Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl
Size: 70
Proxy types: {'coral.calc': 6, 'coral.SrCa': 19, 'coral.d18O': 45}
KF updating: 100%|██████████| 2001/2001 [01:08<00:00, 29.36it/s]
LMRt: job.save_recon() >>> Reconstructed fields saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_r00_recon.nc
LMRt: job.run() >>> DONE!
Once done, we will get the struture below in the “recon_top” directory:
.
├── calibed_psm.pkl
├── job_configs.yml
├── job_r00_idx.pkl
├── job_r00_recon.nc
├── job.pkl
├── obs_loc.pkl
├── prior_loc.pkl
├── seasonalized_obs.pkl
└── seasonalized_prior.pkl
For the visualization of the results, please move on to the tutorial regarding visualizations.