{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# High-level workflow in LMRt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Expected time to run through: 5 mins**\n", "\n", "\n", "In this tutorial, we demonstrate how to reproduce our 1st LMR reconstruction in the previous notebook with the high-level workflow." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test data preparation\n", "\n", "Again, if you haven't done yet, please prepare test data following the steps:\n", "1. Download the test case named \"PAGES2k_CCSM4_GISTEMP\" with this [link](https://drive.google.com/drive/folders/1UGn-LNd_tGSjPUKa52E6ffEM-ms2VD-N?usp=sharing).\n", "2. Create a directory named \"testcases\" in the same directory of this notebook.\n", "3. Put the unzipped direcotry \"PAGES2k_CCSM4_GISTEMP\" into \"testcases\".\n", "\n", "Below, we first load some useful packages, including our `LMRt`." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%load_ext autoreload\n", "%autoreload 2\n", "\n", "import LMRt\n", "import os\n", "import numpy as np\n", "import pandas as pd\n", "import xarray as xr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load configurations" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's again create the `job` object and load the configuration YAML file.\n", "To distinguish between the 1st reconstruction, let's specify the `job_dirpath` in the call, which indicates the path where we store our results, to another location." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mLMRt: job.load_configs() >>> loading reconstruction configurations from: ./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.load_configs() >>> job.configs created\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.load_configs() >>> job.configs[\"job_dirpath\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.load_configs() >>> /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high created\u001b[0m\n", "{'anom_period': [1951, 1980],\n", " 'job_dirpath': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high',\n", " 'job_id': 'LMRt_quickstart',\n", " 'obs_path': {'tas': './data/obs/gistemp1200_ERSSTv4.nc'},\n", " 'obs_varname': {'tas': 'tempanomaly'},\n", " 'prior_path': {'tas': './data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'},\n", " 'prior_regrid_ntrunc': 42,\n", " 'prior_season': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],\n", " 'prior_varname': {'tas': 'TREFHT'},\n", " 'proxy_frac': 0.75,\n", " 'proxydb_path': './data/proxy/pages2k_dataset.pkl',\n", " 'psm_calib_period': [1850, 2015],\n", " 'ptype_psm': {'coral.SrCa': 'linear',\n", " 'coral.calc': 'linear',\n", " 'coral.d18O': 'linear'},\n", " 'ptype_season': {'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],\n", " 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12],\n", " 'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]},\n", " 'recon_loc_rad': 25000,\n", " 'recon_nens': 100,\n", " 'recon_period': [0, 2000],\n", " 'recon_seeds': [0,\n", " 1,\n", " 2,\n", " 3,\n", " 4,\n", " 5,\n", " 6,\n", " 7,\n", " 8,\n", " 9,\n", " 10,\n", " 11,\n", " 12,\n", " 13,\n", " 14,\n", " 15,\n", " 16,\n", " 17,\n", " 18,\n", " 19],\n", " 'recon_timescale': 1,\n", " 'recon_vars': 'tas'}\n" ] } ], "source": [ "job = LMRt.ReconJob()\n", "job.load_configs(\n", " job_dirpath='./testcases/PAGES2k_CCSM4_GISTEMP/recon_high',\n", " cfg_path='./testcases/PAGES2k_CCSM4_GISTEMP/configs.yml',\n", " verbose=True,\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's prepare our `job` with the `.prepare()` method, which will take care of the steps prior to the \"Data assimilation\" section in the previous notebook." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prepare the `job`" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mLMRt: job.load_proxydb() >>> job.configs[\"proxydb_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.load_proxydb() >>> 692 records loaded\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.load_proxydb() >>> job.proxydb created\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.filter_proxydb() >>> filtering proxy records according to: ['coral.d18O', 'coral.SrCa', 'coral.calc']\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.filter_proxydb() >>> 95 records remaining\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.seasonalize_proxydb() >>> seasonalizing proxy records according to: {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.seasonalize_proxydb() >>> 95 records remaining\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.seasonalize_proxydb() >>> job.proxydb updated\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.load_prior() >>> loading model prior fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc'}\u001b[0m\n", "Time axis not overlap with the reference period [1951, 1980]; use its own time period as reference [850.08, 1850.00].\n", "\u001b[1m\u001b[30mLMRt: job.load_prior() >>> raw prior\u001b[0m\n", "Dataset Overview\n", "-----------------------\n", "\n", " Name: tas\n", " Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc\n", " Shape: time:12000, lat:96, lon:144\n", "\u001b[1m\u001b[32mLMRt: job.load_prior() >>> job.prior created\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.load_obs() >>> loading instrumental observation fields from: {'tas': '/Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/obs/gistemp1200_ERSSTv4.nc'}\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.load_obs() >>> job.obs created\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> job.configs[\"precalc\"][\"seasonalized_prior_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/seasonalized_prior.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> job.configs[\"precalc\"][\"seasonalized_obs_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/seasonalized_obs.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> job.configs[\"precalc\"][\"prior_loc_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/prior_loc.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> job.configs[\"precalc\"][\"obs_loc_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/obs_loc.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> job.configs[\"precalc\"][\"calibed_psm_path\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/calibed_psm.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.seasonalize_ds_for_psm() >>> job.configs[\"ptype_season\"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from prior with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Searching nearest location: 13%|█▎ | 12/95 [00:00<00:00, 112.97it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mLMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_prior created\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 107.38it/s]\n", "/Users/fzhu/Github/LMRt/LMRt/utils.py:243: RuntimeWarning: Mean of empty slice\n", " tmp = np.nanmean(var[inds, ...], axis=0)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mLMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.prior_lat_idx & job.proxydb.prior_lon_idx created\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].prior_time & job.proxydb.records[pid].prior_value created\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.seasonalize_ds_for_psm() >>> job.configs[\"ptype_season\"] = {'coral.d18O': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.SrCa': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 'coral.calc': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]}\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.seasonalize_ds_for_psm() >>> Seasonalizing variables from obs with season: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Searching nearest location: 11%|█ | 10/95 [00:00<00:00, 97.04it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mLMRt: job.seasonalize_ds_for_psm() >>> job.seasonalized_obs created\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Searching nearest location: 100%|██████████| 95/95 [00:00<00:00, 99.05it/s] \n", "Calibrating PSM: 9%|▉ | 9/95 [00:00<00:00, 86.19it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mLMRt: job.proxydb.find_nearest_loc() >>> job.proxydb.obs_lat_idx & job.proxydb.obs_lon_idx created\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.proxydb.get_var_from_ds() >>> job.proxydb.records[pid].obs_time & job.proxydb.records[pid].obs_value created\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.proxydb.init_psm() >>> job.proxydb.records[pid].psm initialized\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.calibrate_psm() >>> PSM calibration period: [1850, 2015]\u001b[0m\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Calibrating PSM: 62%|██████▏ | 59/95 [00:00<00:00, 92.65it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "The number of overlapped data points is 0 < 25. Skipping ...\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "Calibrating PSM: 100%|██████████| 95/95 [00:01<00:00, 91.55it/s]\n", "Forwarding PSM: 100%|██████████| 95/95 [00:00<00:00, 1516.82it/s]\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[32mLMRt: job.proxydb.calib_psm() >>> job.proxydb.records[pid].psm calibrated\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.proxydb.calib_psm() >>> job.proxydb.calibed created\u001b[0m\n", "\u001b[1m\u001b[32mLMRt: job.proxydb.forward_psm() >>> job.proxydb.records[pid].psm forwarded\u001b[0m\n", "\u001b[1m\u001b[30mLMRt: job.seasonalize_prior() >>> seasonalized prior w/ season [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]\u001b[0m\n", "Dataset Overview\n", "-----------------------\n", "\n", " Name: tas\n", " Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc\n", " Shape: time:1001, lat:96, lon:144\n", "\u001b[1m\u001b[32mLMRt: job.seasonalize_ds_for_psm() >>> job.prior updated\u001b[0m\n", "\u001b[1m\u001b[30mLMRt: job.regrid_prior() >>> regridded prior\u001b[0m\n", "Dataset Overview\n", "-----------------------\n", "\n", " Name: tas\n", " Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/prior/b.e11.BLMTRC5CN.f19_g16.001.cam.h0.TREFHT.085001-184912.nc\n", " Shape: time:1001, lat:42, lon:63\n", "\u001b[1m\u001b[32mLMRt: job.regrid_prior() >>> job.prior updated\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.prepare() >>> Prepration data saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/job.pkl\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.prepare() >>> job.configs[\"precalc\"][\"prep_savepath\"] = /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/job.pkl\u001b[0m\n" ] } ], "source": [ "job.prepare(verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Run the `job`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Okay, now we are ready to run the job." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "KF updating: 2%|▏ | 49/2001 [00:00<00:04, 482.46it/s]" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mLMRt: job.run() >>> job.configs[\"recon_seeds\"] = [0]\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.run() >>> job.configs[\"save_settings\"] = {'compress_dict': {'zlib': True, 'least_significant_digit': 1}, 'output_geo_mean': False, 'target_lats': [], 'target_lons': [], 'output_full_ens': False, 'dtype': 32}\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.run() >>> job.configs saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/job_configs.yml\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.run() >>> seed: 0 | max: 0\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.run() >>> randomized indices for prior and proxies saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/job_r00_idx.pkl\u001b[0m\n", "Proxy Database Overview\n", "-----------------------\n", " Source: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/data/proxy/pages2k_dataset.pkl\n", " Size: 70\n", "Proxy types: {'coral.calc': 6, 'coral.SrCa': 19, 'coral.d18O': 45}\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "KF updating: 100%|██████████| 2001/2001 [01:09<00:00, 28.68it/s] \n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "\u001b[1m\u001b[36mLMRt: job.save_recon() >>> Reconstructed fields saved to: /Users/fzhu/Github/LMRt/docsrc/tutorial/testcases/PAGES2k_CCSM4_GISTEMP/recon_high/job_r00_recon.nc\u001b[0m\n", "\u001b[1m\u001b[36mLMRt: job.run() >>> DONE!\u001b[0m\n" ] } ], "source": [ "job.run(recon_seeds=np.arange(1), verbose=True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Once done, we will get the struture below in the \"recon_high\" directory:\n", "```\n", ".\n", "├── calibed_psm.pkl\n", "├── job_configs.yml\n", "├── job_r00_idx.pkl\n", "├── job_r00_recon.nc\n", "├── job.pkl\n", "├── obs_loc.pkl\n", "├── prior_loc.pkl\n", "├── seasonalized_obs.pkl\n", "└── seasonalized_prior.pkl\n", "```\n", "\n", "For the visualization of the results, please move on to the tutorial regarding visualizations." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" } }, "nbformat": 4, "nbformat_minor": 4 }