Finding/opening ACCESS-OM3 output using intake¶
This notebook demonstrates how to using intake-esm to find and load data data from an ACCESS-OM3 run.
For more information about using intake-esm to find and load data, see:
- the intake-esm documentation
- this section of the access-nri-intake-catalog documentation
# These first two cells must be in all notebooks!
# It allows us to run all the notebooks at once, this cell has a tag "parameters" which allows us to pass in
# arguments externally using papermill (see mkfigs.sh for details)
# Set esm_file to the datastore for the main experiment of interest
esm_file = "/g/data/ol01/outputs/access-om3-25km/MC_25km_jra_iaf+wombatlite-test3v2-00532b88/datastore.json"
# papermill settings. *No need to modify these if running interactively.*
papermill = False # `cwd` and `nbname` will be populated by papermill.
cwd = None # current working directory
nbname = None # notebook name
# Parameters
esm_file = "/g/data/ol01/outputs/access-om3-25km/MC_25km_jra_iaf-1.0-beta-5165c0f8/datastore.json"
papermill = True
cwd = "/g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/"
nbname = "00_template_notebook.ipynb"
if not papermill:
import nci_ipynb, os # requires conda/analysis3-26.03 or later
cwd = nci_ipynb.dir()
nbname = nci_ipynb.name()
os.chdir(cwd)
from mkfigs_configdoc import MkmdWriter
mkmd = MkmdWriter(esm_file, nbname, str(cwd), pm=papermill)
import xarray as xr
import cf_xarray
import intake
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
from distributed import Client
import intake
client = Client(threads_per_worker=1)
print(client.dashboard_link)
http://127.0.0.1:8787/status
IAF = esm_file.find('iaf') > 0
IAF
True
Open the intake-esm datastore¶
COLUMNS_WITH_ITERABLES = [
"variable",
"variable_long_name",
"variable_standard_name",
"variable_cell_methods",
"variable_units"
]
datastore = intake.open_esm_datastore(
esm_file,
columns_with_iterables=COLUMNS_WITH_ITERABLES
)
What ocean variables are available at monthly frequency?¶
def available_variables(datastore):
"""Return a pandas dataframe summarising the variables in a datastore"""
variable_columns = [col for col in datastore.df.columns if "variable" in col]
return (
datastore.df[variable_columns]
.explode(variable_columns)
.drop_duplicates()
.set_index("variable")
.sort_index()
)
datastore_filtered = datastore.search(realm="ocean", frequency="1mon", variable="umo")
available_variables(datastore_filtered)
| variable_long_name | variable_standard_name | variable_cell_methods | variable_units | |
|---|---|---|---|---|
| variable | ||||
| average_DT | Length of average period | days | ||
| average_T1 | Start time for average period | days since 1900-01-01 00:00:00 | ||
| average_T2 | End time for average period | days since 1900-01-01 00:00:00 | ||
| nv | vertex number | |||
| rho2_i | Target Potential Density at interface | kg m-3 | ||
| rho2_l | Target Potential Density at cell center | kg m-3 | ||
| time | time | days since 1900-01-01 00:00:00 | ||
| time_bnds | time axis boundaries | days since 1900-01-01 00:00:00 | ||
| umo | Ocean Mass X Transport | ocean_mass_x_transport | rho2_l:sum yh:sum xq:point time: mean | kg s-1 |
| xq | q point nominal longitude | degrees_east | ||
| yh | h point nominal latitude | degrees_north |
Load monthly sea surface height (zos) and plot the field at the last available time¶
zos = datastore.search(variable="zos", frequency="1mon").to_dask(
xarray_combine_by_coords_kwargs = dict( # These kwargs can make things faster
compat="override",
data_vars="minimal",
coords="minimal",
),
xarray_open_kwargs = dict(
chunks={"yh": -1, "xh": -1}, # Good for spatial operations, but not temporal
decode_timedelta=True
)
)
proj = ccrs.PlateCarree()
fig, ax = plt.subplots(figsize=(15,6), subplot_kw=dict(projection=proj))
zos["zos"].isel(time=-1).plot(ax=ax)
ax.coastlines()
_ = ax.gridlines()
Fixing the plot axes¶
Notice that the white land-masked regions are distorted away from the coastlines in the Arctic in the above plot. This is because a tripolar grid is used, so the grid lines are not zonal and meridional north of 65N, and consequently the nominal 1D coordinates xh and yh are incorrect. To fix this we need to use 2d coordinates geolon and geolat.
We can get these coorindates from the intake-esm datastore. However, note that in some cases geolon and geolat contain NaNs in regions where processors were masked over land. Below we replace these NaNs with zeros so that the coordinates can be used for plotting.
try:
# preferred source, if available (no processor masking holes)
# path is a workaround - see https://forum.access-hive.org.au/t/issues-in-loading-ht-in-latest-conds-envs/6278
path = datastore.search(variable=["geolat", "geolon"], filename="access-om3.mom6.geometry.nc").df.loc[0, 'path']
coords = datastore.search(variable=["geolat", "geolon"], path=path).to_dask().compute()
coords = coords.rename({ "lonh": "xh", "lath": "yh" })
except:
# these contain NaNs in processor masks, so set NaNs to zero so they can be used for plotting
# path is a workaround - see https://forum.access-hive.org.au/t/issues-in-loading-ht-in-latest-conds-envs/6278
path = datastore.search(variable=["geolat", "geolon"], filename="access-om3.mom6.static.nc").df.loc[0, 'path']
coords = datastore.search(variable=["geolat", "geolon"], path=path).to_dask().compute()
coords = coords.fillna(0.0)
zos = zos.assign_coords(coords)
proj = ccrs.PlateCarree()
fig, ax = plt.subplots(figsize=(15,6), subplot_kw=dict(projection=proj))
zos["zos"].isel(time=-1).plot(ax=ax, x="geolon", y="geolat")
ax.coastlines()
_ = ax.gridlines()
For diagnostics that will be run routinely on new OM3 output via papermill¶
Please include a key figure(s) or table(s) for output with the mkfigs.sh papermill workflow
# mkmd.savefig(title, caption, dpi=100)
# Save figure and append to markdown summary.
# title: title of figure
# caption: caption of figure
# dpi (optional; default: 100): dpi for figure
mkmd.savefig(fig,"Template notebook", "Example figure of ACCESS-OM3 sea surface height (m).", dpi=150)
Saved /g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/mkmd/00_template_notebook_01.png Adding entry to per-notebook markdown: /g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/mkmd/00_template_notebook.md
Lines appended to /g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/mkmd/00_template_notebook.md successfully.
# mkmd.table(title, table)
# Append table to markdown summary.
# title: title of table
# table: markdown table strings (a list of strings where each string is a new line)
mkmd.table("example title", available_variables(datastore_filtered).to_markdown().split('\n'))
Adding entry to per-notebook markdown: /g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/mkmd/00_template_notebook.md Lines appended to /g/data/tm70/cyb561/repos/access-om3-paper-1/notebooks/mkfigs_output_MC_25km_jra_iaf-1.0-beta-5165c0f8/mkmd/00_template_notebook.md successfully.
client.close()
2026-06-29 16:47:53,499 - distributed.worker - ERROR - Failed to communicate with scheduler during heartbeat.
Traceback (most recent call last):
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/comm/tcp.py", line 226, in read
frames_nosplit_nbytes_bin = await stream.read_bytes(fmt_size)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
tornado.iostream.StreamClosedError: Stream is closed
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/worker.py", line 1273, in heartbeat
response = await retry_operation(
^^^^^^^^^^^^^^^^^^^^^^
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/utils_comm.py", line 416, in retry_operation
return await retry(
^^^^^^^^^^^^
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/utils_comm.py", line 395, in retry
return await coro()
^^^^^^^^^^^^
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/core.py", line 1259, in send_recv_from_rpc
return await send_recv(comm=comm, op=key, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/core.py", line 1018, in send_recv
response = await comm.read(deserializers=deserializers)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/comm/tcp.py", line 237, in read
convert_stream_closed_error(self, e)
File "/g/data/xp65/public/apps/med_conda/envs/analysis3-26.05/lib/python3.12/site-packages/distributed/comm/tcp.py", line 137, in convert_stream_closed_error
raise CommClosedError(f"in {obj}: {exc}") from exc
distributed.comm.core.CommClosedError: in <TCP (closed) ConnectionPool.heartbeat_worker local=tcp://127.0.0.1:59294 remote=tcp://127.0.0.1:34127>: Stream is closed