Top-level workflow in LMRt

Expected time to run through: 3 mins

In this tutorial, we demonstrate how to reproduce our 1st LMR reconstruction with the top-level workflow. This top-level workflow is essentailly a shortcut of the job.prepare() and job.run() we have seen in the previous notebook on the high-level workflow. With the high-level workflow, users still have the flexibilty to perform intermediate steps, while with the top-level workflow, users can only execute the one-line command to reproduce an experiment totally based on the given configuration YAML file, without any chance to do any intermediate modification of the experiment.

We will first generate a configuration YAML file with a different job_dirpath so as to store our results to a different location, and then we use the job.run_cfg() method to reproduce the reconstruction directly!

Test data preparation

Again, if you haven’t done yet, please prepare test data following the steps: 1. Download the test case named “PAGES2k_CCSM4_GISTEMP” with this link. 2. Create a directory named “testcases” in the same directory of this notebook. 3. Put the unzipped direcotry “PAGES2k_CCSM4_GISTEMP” into “testcases”.

Below, we first load some useful packages, including our LMRt.

[1]:
%load_ext autoreload
%autoreload 2

import LMRt
import os
import numpy as np
import pandas as pd
import xarray as xr

Run!

The top-level workflow allows the users to conduct the reconstruction purely based on a configuration YAML file. However, it still allows runtime modification of the job_dirpath and recon_seeds in the call for flexibility.

[2]:
LMRt.ReconJob().run_cfg(
    cfg_path='./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml',
    job_dirpath='./testcases/PAGES2k_CCSM4_GISTEMP/recon_top',
    recon_seeds=np.arange(1),
    verbose=True,
)
LMRt: job.load_configs() >>> loading reconstruction configurations from: ./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml
LMRt: job.load_configs() >>> job.configs created
LMRt: job.load_configs() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon
LMRt: job.load_configs() >>> /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon created
{'anom_period': [1951, 1980],
 'job_dirpath': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon',
 'job_id': 'LMRt_quickstart',
 'obs_path': {'tas': './data/obs/gistemp1200_ERSSTv4.nc'},
 'obs_varname': {'tas': 'tempanomaly'},
 'prior_path': {'tas': './data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'},
 'prior_regrid_ntrunc': 42,
 'prior_season': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
 'prior_varname': {'tas': 'TREFHT'},
 'proxy_frac': 0.75,
 'proxydb_path': './data/proxy/pages2k_dataset.pkl',
 'psm_calib_period': [1850, 2015],
 'ptype_psm': {'coral.SrCa': 'linear',
               'coral.calc': 'linear',
               'coral.d18O': 'linear'},
 'ptype_season': {'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
                  'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
                  'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]},
 'recon_loc_rad': 25000,
 'recon_nens': 100,
 'recon_period': [0, 2000],
 'recon_seeds': [0,
                 1,
                 2,
                 3,
                 4,
                 5,
                 6,
                 7,
                 8,
                 9,
                 10,
                 11,
                 12,
                 13,
                 14,
                 15,
                 16,
                 17,
                 18,
                 19],
 'recon_timescale': 1,
 'recon_vars': 'tas'}
LMRt: job.load_configs() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top
LMRt: job.load_configs() >>> /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top created
LMRt: job.run() >>> job.configs["recon_seeds"] = [0]
LMRt: job.prepare() >>> job.configs["job_dirpath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top
LMRt: job.load_proxydb() >>> job.configs["proxydb_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl
LMRt: job.load_proxydb() >>> 692 records loaded
LMRt: job.load_proxydb() >>> job.proxydb created
LMRt: job.filter_proxydb() >>> job.configs["ptype_season"] = {'coral.d18O': 'linear', 'coral.SrCa': 'linear', 'coral.calc': 'linear'}
LMRt: job.filter_proxydb() >>> filtering proxy records according to: ['coral.d18O', 'coral.SrCa', 'coral.calc']
LMRt: job.filter_proxydb() >>> 95 records remaining
LMRt: job.seasonalize_proxydb() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_proxydb() >>> seasonalizing proxy records according to: {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_proxydb() >>> 95 records remaining
LMRt: job.seasonalize_proxydb() >>> job.proxydb updated
LMRt: job.load_prior() >>> job.configs["prior_path"] = {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'}
LMRt: job.load_prior() >>> job.configs["anom_period"] = [1951, 1980]
LMRt: job.load_prior() >>> loading model prior fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'}
Time axis not overlap with the reference period [1951, 1980]; use its own time period as reference [850.08, 1850.00].
LMRt: job.load_prior() >>> raw prior
Dataset Overview
-----------------------

     Name:  tas
   Source:  /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
    Shape:  time:12000, lat:96, lon:144
LMRt: job.load_prior() >>> job.prior created
LMRt: job.load_obs() >>> job.configs["anom_period"] = [1951, 1980]
LMRt: job.load_obs() >>> loading instrumental observation fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/obs/gistemp1200_ERSSTv4.nc'}
LMRt: job.load_obs() >>> job.obs created
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["seasonalized_prior_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/seasonalized_prior.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["seasonalized_obs_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/seasonalized_obs.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["prior_loc_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/prior_loc.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["obs_loc_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/obs_loc.pkl
LMRt: job.calibrate_psm() >>> job.configs["precalc"]["calibed_psm_path"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/calibed_psm.pkl
LMRt: job.seasonalize_ds_for_psm() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from prior with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Searching nearest location:  12%|█▏        | 11/95 [00:00<00:00, 104.80it/s]
LMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_prior created
Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 110.35it/s]
/Users/fzhu/Github/LMRt/LMRt/utils.py:243: RuntimeWarning: Mean of empty slice
  tmp = np.nanmean(var[inds, ...], axis=0)
LMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.prior_lat_idx & job.proxydb.prior_lon_idx created
LMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].prior_time & job.proxydb.records[pid].prior_value created
LMRt: job.seasonalize_ds_for_psm() >>> job.configs["ptype_season"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}
LMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from obs with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Searching nearest location:  11%|█         | 10/95 [00:00<00:00, 98.52it/s]
LMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_obs created
Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 99.58it/s]
Calibrating PSM:   9%|▉         | 9/95 [00:00<00:00, 86.26it/s]
LMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.obs_lat_idx & job.proxydb.obs_lon_idx created
LMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].obs_time & job.proxydb.records[pid].obs_value created
LMRt: job.proxydb.init_psm() >>> job.proxydb.records[pid].psm initialized
LMRt: job.calibrate_psm() >>> job.configs["psm_calib_period"] = [1850, 2015]
LMRt: job.calibrate_psm() >>> PSM calibration period: [1850, 2015]
Calibrating PSM:  68%|██████▊   | 65/95 [00:00<00:00, 86.59it/s]
The number of overlapped data points is 0 < 25. Skipping ...
Calibrating PSM: 100%|██████████| 95/95 [00:01<00:00, 86.98it/s]
Forwarding PSM: 100%|██████████| 95/95 [00:00<00:00, 1519.37it/s]
LMRt: job.proxydb.calib_psm() >>> job.proxydb.records[pid].psm calibrated
LMRt: job.proxydb.calib_psm() >>> job.proxydb.calibed created
LMRt: job.proxydb.forward_psm() >>> job.proxydb.records[pid].psm forwarded
LMRt: job.seasonalize_prior() >>> job.configs["prior_season"] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
LMRt: job.seasonalize_prior() >>> seasonalized prior w/ season [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
Dataset Overview
-----------------------

     Name:  tas
   Source:  /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
    Shape:  time:1001, lat:96, lon:144
LMRt: job.seasonalize_ds_for_psm() >>> job.prior updated
LMRt: job.regrid_prior() >>> regridded prior
Dataset Overview
-----------------------

     Name:  tas
   Source:  /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc
    Shape:  time:1001, lat:42, lon:63
LMRt: job.regrid_prior() >>> job.prior updated
LMRt: job.prepare() >>> Prepration data saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.prepare() >>> job.configs["precalc"]["prep_savepath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
KF updating:   3%|▎         | 56/2001 [00:00<00:03, 552.02it/s]
LMRt: job.save_job() >>> Prepration data saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.save_job() >>> job.configs["precalc"]["prep_savepath"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job.pkl
LMRt: job.run() >>> job.configs["recon_seeds"] = [0]
LMRt: job.run() >>> job.configs["recon_vars"] = tas
LMRt: job.run() >>> job.configs["recon_nens"] = 100
LMRt: job.run() >>> job.configs["proxy_frac"] = 0.75
LMRt: job.run() >>> job.configs["recon_period"] = [0, 2000]
LMRt: job.run() >>> job.configs["recon_timescale"] = 1
LMRt: job.run() >>> job.configs["recon_loc_rad"] = 25000
LMRt: job.run() >>> job.configs["save_settings"] = {'compress_dict': {'zlib': True, 'least_significant_digit': 1}, 'output_geo_mean': False, 'target_lats': [], 'target_lons': [], 'output_full_ens': False, 'dtype': 32}
LMRt: job.run() >>> job.configs saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_configs.yml
LMRt: job.run() >>> seed: 0 | max: 0
LMRt: job.run() >>> randomized indices for prior and proxies saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_r00_idx.pkl
Proxy Database Overview
-----------------------
     Source:        /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl
       Size:        70
Proxy types:        {'coral.calc': 6, 'coral.SrCa': 19, 'coral.d18O': 45}
KF updating: 100%|██████████| 2001/2001 [01:08<00:00, 29.36it/s]
LMRt: job.save_recon() >>> Reconstructed fields saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_top/job_r00_recon.nc
LMRt: job.run() >>> DONE!

Once done, we will get the struture below in the “recon_top” directory:

.
├── calibed_psm.pkl
├── job_configs.yml
├── job_r00_idx.pkl
├── job_r00_recon.nc
├── job.pkl
├── obs_loc.pkl
├── prior_loc.pkl
├── seasonalized_obs.pkl
└── seasonalized_prior.pkl

For the visualization of the results, please move on to the tutorial regarding visualizations.