Basic Operations#

In this section, we introduce the basic operations of a cfr.ClimateField.

Required data to complete this tutorial:

Due to the new data fetching feature, the above datasets are not required to be downloaded manually.

[1]:
%load_ext autoreload
%autoreload 2

import cfr
print(cfr.__version__)
import numpy as np

Load the test netCDF file as a ClimateField#

[5]:
fd = cfr.ClimateField().fetch(
    'gistemp1200_GHCNv4_ERSSTv5',  # URL points to the netCDF file OR a supported dataset name
    vn='tempanomaly',  # specify the name of the variable to load
)

fd.da  # check the loaded `xarray.DataArray`
Fetching data: 100%|██████████| 24.1M/24.1M [00:26<00:00, 960kiB/s]
>>> Downloaded file saved at: ./data/gistemp1200_GHCNv4_ERSSTv5.nc.gz

[5]:
<xarray.DataArray 'tempanomaly' (time: 1720, lat: 90, lon: 180)>
[27864000 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-01-15 00:00:00 ... 2023-04-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[7]:
fd = cfr.ClimateField().load_nc(
    './data/gistemp1200_GHCNv4_ERSSTv5.nc.gz',  # path to the netCDF file; compressed data (.nc.gz) is also supported
    vn='tempanomaly',  # specify the name of the variable to load
)

fd.da  # check the loaded `xarray.DataArray`
[7]:
<xarray.DataArray 'tempanomaly' (time: 1720, lat: 90, lon: 180)>
[27864000 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-01-15 00:00:00 ... 2023-04-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean

Time slicing#

We may slice a ClimateField as like with xarray.DataArray:

[43]:
# get the 1st time point
fd[0].da
[43]:
<xarray.DataArray 'tas' (lat: 90, lon: 180)>
[16200 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
    time     object 1880-01-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[44]:
# get the first 5 time points
fd[:5].da
[44]:
<xarray.DataArray 'tas' (time: 5, lat: 90, lon: 180)>
[81000 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-01-15 00:00:00 ... 1880-05-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[55]:
# get discrete data points
fd[[1, 5]].da
<class 'list'>
[55]:
<xarray.DataArray 'tas' (time: 2, lat: 90, lon: 180)>
[32400 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-02-15 00:00:00 1880-06-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[53]:
# get all the months in a year
fd['1990'].da
<class 'str'>
[53]:
<xarray.DataArray 'tas' (time: 12, lat: 90, lon: 180)>
[194400 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1990-01-15 00:00:00 ... 1990-12-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[65]:
# get all the months in a list of years
fd[['1990', '1995']].da
[65]:
<xarray.DataArray 'tas' (time: 24, lat: 90, lon: 180)>
array([[[ 0.35      ,  0.35      ,  0.35      , ...,  0.35      ,
          0.35      ,  0.35      ],
        [ 0.35      ,  0.35      ,  0.35      , ...,  0.35      ,
          0.35      ,  0.35      ],
        [ 0.35      ,  0.35      ,  0.35      , ...,  0.35      ,
          0.35      ,  0.35      ],
        ...,
        [ 3.56      ,  3.56      ,  3.56      , ...,  3.56      ,
          3.56      ,  3.56      ],
        [ 3.56      ,  3.56      ,  3.56      , ...,  3.56      ,
          3.56      ,  3.56      ],
        [ 3.56      ,  3.56      ,  3.56      , ...,  3.56      ,
          3.56      ,  3.56      ]],

       [[-0.69      , -0.69      , -0.69      , ..., -0.69      ,
         -0.69      , -0.69      ],
        [-0.69      , -0.69      , -0.69      , ..., -0.69      ,
         -0.69      , -0.69      ],
        [-0.69      , -0.69      , -0.69      , ..., -0.69      ,
         -0.69      , -0.69      ],
...
        [ 0.5       ,  0.5       ,  0.5       , ...,  0.5       ,
          0.5       ,  0.5       ],
        [ 0.5       ,  0.5       ,  0.5       , ...,  0.5       ,
          0.5       ,  0.5       ],
        [ 0.5       ,  0.5       ,  0.5       , ...,  0.5       ,
          0.5       ,  0.5       ]],

       [[ 0.97999996,  0.97999996,  0.97999996, ...,  0.97999996,
          0.97999996,  0.97999996],
        [ 0.97999996,  0.97999996,  0.97999996, ...,  0.97999996,
          0.97999996,  0.97999996],
        [ 0.97999996,  0.97999996,  0.97999996, ...,  0.97999996,
          0.97999996,  0.97999996],
        ...,
        [-2.98      , -2.98      , -2.98      , ..., -2.98      ,
         -2.98      , -2.98      ],
        [-2.98      , -2.98      , -2.98      , ..., -2.98      ,
         -2.98      , -2.98      ],
        [-2.98      , -2.98      , -2.98      , ..., -2.98      ,
         -2.98      , -2.98      ]]], dtype=float32)
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1990-01-15 00:00:00 ... 1995-12-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[68]:
# get a time period
fd['1990':'2000'].da
[68]:
<xarray.DataArray 'tas' (time: 120, lat: 90, lon: 180)>
[1944000 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1990-01-15 00:00:00 ... 1999-12-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[67]:
# get a time period with steps
fd['1990':'2000':5].da
[67]:
<xarray.DataArray 'tas' (time: 24, lat: 90, lon: 180)>
[388800 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1990-01-15 00:00:00 ... 1999-08-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
[72]:
# get a time period with steps
fd['1990':'2001':5*12].da
[72]:
<xarray.DataArray 'tas' (time: 3, lat: 90, lon: 180)>
[48600 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1990-01-15 00:00:00 ... 2000-01-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean

Rename the variable#

By renaming the variable, we are able to load some presets for visualization.

[29]:
fd = fd.rename('tas')
fd.da
[29]:
<xarray.DataArray 'tas' (time: 1720, lat: 90, lon: 180)>
[27864000 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-01-15 00:00:00 ... 2023-04-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean

Plot a field map at a time point#

[30]:
fig, ax = fd['1990-11'].plot()
../_images/notebooks_climate-ops_18_0.png

The default colorbar is not zero-centered. We may adjust as below:

[31]:
fig, ax = fd['1990-11'].plot(
    levels=np.linspace(-5, 5, 21),       # set levels for the colors
    cbar_labels=np.linspace(-5, 5, 11),  # set the labels for the bar
)
../_images/notebooks_climate-ops_20_0.png

Get the anomaly field#

We may call the .get_anom() method, specifying a reference period, to calculate the anomaly field:

[32]:
fd_anom = fd.get_anom(ref_period=(1951, 2000))
fig, ax = fd_anom['1990-11'].plot(
    levels=np.linspace(-5, 5, 21),       # set levels for the colors
    cbar_labels=np.linspace(-5, 5, 11),  # set the labels for the bar
    extend='both',                       # extend the bar
)
../_images/notebooks_climate-ops_22_0.png

Center the field#

Similar to calculating the anomaly, sometimes we just want to center the values of a field against a reference period. With a cfr.ClimateField, we may call the .center() method to achieve that, again speciying a reference period:

[33]:
fd_center = fd.center(ref_period=(1951, 2000))
fig, ax = fd_center['1990-11'].plot(
    levels=np.linspace(-5, 5, 21),       # set levels for the colors
    cbar_labels=np.linspace(-5, 5, 11),  # set the labels for the bar
    extend='both',                       # extend the bar
)
../_images/notebooks_climate-ops_24_0.png

Annualize/Seasonalize a ClimateField#

It is common that we need to annualize/seasonalize a monthly data. With a ClimateField, we may call the .annualize() method with months as the argument to achieve the goal.

For instance, if we’d like to calculate the DJF annual data, simply:

[34]:
fd_djf = fd.annualize(months=[12, 1, 2])
fd_djf.da.time.values
[34]:
array([1880, 1881, 1882, 1883, 1884, 1885, 1886, 1887, 1888, 1889, 1890,
       1891, 1892, 1893, 1894, 1895, 1896, 1897, 1898, 1899, 1900, 1901,
       1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910, 1911, 1912,
       1913, 1914, 1915, 1916, 1917, 1918, 1919, 1920, 1921, 1922, 1923,
       1924, 1925, 1926, 1927, 1928, 1929, 1930, 1931, 1932, 1933, 1934,
       1935, 1936, 1937, 1938, 1939, 1940, 1941, 1942, 1943, 1944, 1945,
       1946, 1947, 1948, 1949, 1950, 1951, 1952, 1953, 1954, 1955, 1956,
       1957, 1958, 1959, 1960, 1961, 1962, 1963, 1964, 1965, 1966, 1967,
       1968, 1969, 1970, 1971, 1972, 1973, 1974, 1975, 1976, 1977, 1978,
       1979, 1980, 1981, 1982, 1983, 1984, 1985, 1986, 1987, 1988, 1989,
       1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000,
       2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011,
       2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022,
       2023])
[35]:
fd_djf['1881'].da
[35]:
<xarray.DataArray 'tas' (time: 1, lat: 90, lon: 180)>
array([[[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]]], dtype=float32)
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) int64 1881
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
    annualized:    1
[36]:
fd_djf[-1].da
[36]:
<xarray.DataArray 'tas' (lat: 90, lon: 180)>
array([[-1.8566667, -1.8566667, -1.8566667, ..., -1.8566667, -1.8566667,
        -1.8566667],
       [-1.8566667, -1.8566667, -1.8566667, ..., -1.8566667, -1.8566667,
        -1.8566667],
       [-1.8566667, -1.8566667, -1.8566667, ..., -1.8566667, -1.8566667,
        -1.8566667],
       ...,
       [ 5.4666667,  5.4666667,  5.4666667, ...,  5.4666667,  5.4666667,
         5.4666667],
       [ 5.4666667,  5.4666667,  5.4666667, ...,  5.4666667,  5.4666667,
         5.4666667],
       [ 5.4666667,  5.4666667,  5.4666667, ...,  5.4666667,  5.4666667,
         5.4666667]], dtype=float32)
Coordinates:
  * lat      (lat) float32 -89.0 -87.0 -85.0 -83.0 -81.0 ... 83.0 85.0 87.0 89.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
    time     int64 2023
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean
    annualized:    1

Regrid a ClimateField#

Regridding is also a common task. With a ClimateField, we may easily modify the spatial resolution of a field.

[37]:
fd_regrid = fd.regrid(np.linspace(-90, 90, 41), np.linspace(0, 360, 81))
fd_regrid.da
[37]:
<xarray.DataArray 'tas' (time: 1720, lat: 41, lon: 81)>
array([[[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        ...,
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
...
        [        nan,  1.33999991,  1.24000001, ...,  1.30999994,
          1.33999991,         nan],
        [        nan,  4.13999987,  4.13999987, ...,  4.13999987,
          4.13999987,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]],

       [[        nan,         nan,         nan, ...,         nan,
                 nan,         nan],
        [        nan,  1.56999993,  1.56999993, ...,  1.56999993,
          1.56999993,         nan],
        [        nan,  3.11999989,  2.74000001, ...,  4.02999973,
          3.75      ,         nan],
        ...,
        [        nan,  4.80999994,  4.82999992, ...,  4.46999979,
          4.5999999 ,         nan],
        [        nan,  3.        ,  3.        , ...,  3.        ,
          3.        ,         nan],
        [        nan,         nan,         nan, ...,         nan,
                 nan,         nan]]])
Coordinates:
  * time     (time) object 1880-01-15 00:00:00 ... 2023-04-15 00:00:00
  * lat      (lat) float64 -90.0 -85.5 -81.0 -76.5 -72.0 ... 76.5 81.0 85.5 90.0
  * lon      (lon) float64 0.0 4.5 9.0 13.5 18.0 ... 346.5 351.0 355.5 360.0
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean

Crop a cfr.ClimateField#

With a cfr.ClimateField, we may crop a spatial domain by calling the .crop() method, specifying the arguments lat_min, lat_max, lon_min, lon_max:

[38]:
fd_crop = fd.crop(-35, 35, 0, 360)
fd_crop.da
[38]:
<xarray.DataArray 'tas' (time: 1720, lat: 36, lon: 180)>
[11145600 values with dtype=float32]
Coordinates:
  * lat      (lat) float32 -35.0 -33.0 -31.0 -29.0 -27.0 ... 29.0 31.0 33.0 35.0
  * lon      (lon) float32 1.0 3.0 5.0 7.0 9.0 ... 351.0 353.0 355.0 357.0 359.0
  * time     (time) object 1880-01-15 00:00:00 ... 2023-04-15 00:00:00
Attributes:
    long_name:     Surface temperature anomaly
    units:         K
    cell_methods:  time: mean

Calculate the weighted mean of a sptial area#

Lots of climatic indices require to calculate the weighted mean of a spatial area, e.g., the global mean surface temperature (GMST), NINO3.4, etc.

With a cfr.ClimateField, we may call the .geo_mean() method to calculate that quantity, again specifying the arguments lat_min, lat_max, lon_min, lon_max.

[39]:
gmst = fd.annualize().geo_mean()  # by default, the area is global
fig, ax = gmst.plot(ylabel='GMST [K]', linewidth=2, ylim=(-1, 1.5))
../_images/notebooks_climate-ops_34_0.png
[40]:
nino34 = fd.annualize(months=[12, 1, 2]).geo_mean(
    lat_min=-5, lat_max=5,
    lon_min=np.mod(-170, 360), lon_max=np.mod(-120, 360)
)
fig, ax = nino34.plot(ylabel='NINO3.4 [K]', linewidth=2, ylim=(-3, 3))
../_images/notebooks_climate-ops_35_0.png

Validate a field against another#

We may compare two cfr.ClimateFields by calling the .validate() method. Here as an example, we compare the calendar year annualized field with the JJA annualized field, with three different metrics:

[42]:
for stat in ['corr', 'R2', 'CE']:
    valid_fd = fd['1880':'2000'].annualize().compare(
        fd['1880':'2000'].annualize(months=[6, 7, 8]), stat=stat)
    fig, ax = valid_fd.plot()
/Users/fengzhu/Apps/miniconda3/envs/cfr-env/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1878: RuntimeWarning: Degrees of freedom <= 0 for slice.
  var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/Users/fengzhu/Apps/miniconda3/envs/cfr-env/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1878: RuntimeWarning: Degrees of freedom <= 0 for slice.
  var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/Users/fengzhu/Apps/miniconda3/envs/cfr-env/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1878: RuntimeWarning: Degrees of freedom <= 0 for slice.
  var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/Users/fengzhu/Apps/miniconda3/envs/cfr-env/lib/python3.9/site-packages/numpy/lib/nanfunctions.py:1878: RuntimeWarning: Degrees of freedom <= 0 for slice.
  var = nanvar(a, axis=axis, dtype=dtype, out=out, ddof=ddof,
/Users/fengzhu/Github/cfr/cfr/utils.py:519: RuntimeWarning: Mean of empty slice
  denom = np.nansum(np.power(ref-np.nanmean(ref,axis=0),2),axis=0)
/Users/fengzhu/Github/cfr/cfr/utils.py:520: RuntimeWarning: invalid value encountered in divide
  CE    = 1. - np.divide(numer,denom)
../_images/notebooks_climate-ops_37_1.png
../_images/notebooks_climate-ops_37_2.png
../_images/notebooks_climate-ops_37_3.png
[ ]: