Lazy loading

When reading data from a data entry (using DBEntry.get or DBEntry.get_slice), all data is read immediately from the lowlevel Access Layer backend by default. This may take a long time if the data entry stores a lot of data for the requested IDS.

Instead of reading data immediately, IMAS-Python can also load data lazily, at the moment you need it. This speeds up your program when you are only interested in a subset of the data stored in an IDS.

Enable lazy loading of data

You can enable lazy loading of data by supplying the keyword argument lazy=True to DBEntry.get or DBEntry.get_slice. The returned IDS object then fetches data from the backend at the moment you access it. See the example below:

Example with lazy loading of data

import os

import matplotlib
import numpy

# To avoid possible display issues when Matplotlib uses a non-GUI backend
if "DISPLAY" not in os.environ:
    matplotlib.use("agg")
else:
    matplotlib.use("TKagg")

from matplotlib import pyplot as plt

import imas
from imas.ids_defs import MDSPLUS_BACKEND

database, pulse, run, user = "ITER", 134173, 106, "public"
data_entry = imas.DBEntry(
    MDSPLUS_BACKEND, database, pulse, run, user, data_version="3"
)
data_entry.open()
# Enable lazy loading with `lazy=True`:
core_profiles = data_entry.get("core_profiles", lazy=True)

# No data has been read from the lowlevel backend yet
# The time array is loaded only when we access it on the following lines:
time = core_profiles.time
print(f"Time has {len(time)} elements, between {time[0]} and {time[-1]}")

# Find the electron temperature at rho=0 for all time slices
electron_temperature_0 = numpy.array(
    [p1d.electrons.temperature[0] for p1d in core_profiles.profiles_1d]
)

# Plot the figure
fig, ax = plt.subplots()
ax.plot(time, electron_temperature_0)
ax.set_ylabel("$T_e$")
ax.set_xlabel("$t$")
plt.show()

In this example, using lazy loading with the MDSPLUS backend is about 12 times faster than a regular get(). When using the HDF5 backend, lazy loading is about 300 times faster for this example.
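The speed-up depends on the backend and on how much of the IDS you actually touch, so it can be worth measuring on your own data. The following is a minimal sketch, assuming `entry` is an already opened DBEntry. `time_get` and `first_time_value` are hypothetical helpers introduced here for illustration; they are not part of IMAS-Python.

```python
import time


def time_get(entry, ids_name, access, lazy):
    """Return the seconds needed to get `ids_name` and run `access` on it.

    `entry` is assumed to be an opened DBEntry (any object with a compatible
    get() method works). `access` is a callable that touches the data you need.
    """
    start = time.perf_counter()
    ids = entry.get(ids_name, lazy=lazy)
    access(ids)
    return time.perf_counter() - start


def first_time_value(ids):
    # Only the time array is touched, so a lazy get has little else to read
    return ids.time[0]


# Usage with the data_entry from the example above:
# full = time_get(data_entry, "core_profiles", first_time_value, lazy=False)
# lazy = time_get(data_entry, "core_profiles", first_time_value, lazy=True)
# print(f"full get: {full:.2f} s, lazy get: {lazy:.2f} s")
```

Passing the access pattern as a callable keeps the comparison fair: both measurements include the time spent reading exactly the data your program needs.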

Caveats of lazy loaded IDSs

Lazy loading of data may speed up your programs, but also comes with some limitations.

Some functionality is not implemented or works differently for lazy-loaded IDSs:
- Iterating over non-empty nodes works differently; see the API documentation for imas.ids_structure.IDSStructure.iter_nonempty_().
- has_value() is not implemented for lazy-loaded structure elements.
- validate() will only validate loaded data. Additional data might be loaded from the backend to validate coordinate sizes.
- imas.util.print_tree() will only print data that is loaded when hide_empty_nodes is True.
- imas.util.visit_children():
  - When visit_empty is False (default), this method uses iter_nonempty_(). This raises an error for lazy-loaded IDSs, unless you set accept_lazy to True.
  - When visit_empty is True, this will iteratively load all data from the backend. This is effectively a full, but less efficient, get()/get_slice(). It will be faster if you don’t use lazy loading in this case.
- IDS conversion through imas.convert_ids is not implemented for lazy-loaded IDSs. Note that automatic conversion between DD versions also applies when lazy loading.
- Lazy-loaded IDSs are read-only: setting or changing values, resizing arrays of structures, etc. is not allowed.
- You cannot put(), put_slice() or serialize() lazy-loaded IDSs.
- Copying lazy-loaded IDSs (through copy.deepcopy()) is not implemented.

IMAS-Python assumes that the underlying data entry is not modified. When you (or another user) overwrite or add data to the same data entry, you may end up with a mix of old and new data in the lazy-loaded IDS.
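To see how such a mix can arise, consider the toy model below in plain Python. It is not how IMAS-Python is implemented: `ToyBackend` and `ToyLazyIDS` are hypothetical names that only mimic the behaviour described here, assuming that a value is fetched on first access and kept afterwards.

```python
class ToyBackend:
    """Stands in for a data entry: a dict of stored values."""

    def __init__(self, data):
        self.data = dict(data)


class ToyLazyIDS:
    """Fetches each field from the backend on first access, then keeps it."""

    def __init__(self, backend):
        self._backend = backend
        self._loaded = {}

    def __getattr__(self, name):
        # Only called for attributes that are not found the normal way
        if name.startswith("_"):
            raise AttributeError(name)
        if name not in self._loaded:
            try:
                self._loaded[name] = self._backend.data[name]
            except KeyError:
                raise AttributeError(name) from None
        return self._loaded[name]


backend = ToyBackend({"time": [0.0, 1.0], "comment": "old"})
ids = ToyLazyIDS(backend)

time_before = ids.time  # "time" is fetched now
# Someone overwrites the entry while the lazy object is still in use
backend.data.update({"time": [0.0, 1.0, 2.0], "comment": "new"})

print(time_before, ids.time)  # both show the old time array
print(ids.comment)  # fetched after the overwrite: the new value
```

In this toy model the object ends up holding the old time array next to the new comment, which is the kind of inconsistency to expect when a data entry changes underneath a lazy-loaded IDS.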

After you close the data entry, no new elements can be loaded.

>>> core_profiles = data_entry.get("core_profiles", lazy=True)
>>> data_entry.close()
>>> print(core_profiles.time)
Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
RuntimeError: Cannot lazy load the requested data: the data entry is no longer
available for reading. Hint: did you close() the DBEntry?
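
One way to avoid this error is to read everything you need while the data entry is still open and keep it as plain Python values. The sketch below assumes `entry` is an opened DBEntry that holds a core_profiles IDS like the one in the example above. `load_te0` is a hypothetical helper, and converting the elements with `float()` is an assumption about how the values are best copied out.

```python
def load_te0(entry):
    """Return (time, electron temperature at rho=0) as lists of floats.

    `entry` is assumed to be an opened DBEntry. All values are read before
    returning, so the caller is free to close the entry afterwards.
    """
    core_profiles = entry.get("core_profiles", lazy=True)
    # Accessing the elements triggers the actual reads from the backend
    time = [float(t) for t in core_profiles.time]
    te0 = [
        float(p1d.electrons.temperature[0])
        for p1d in core_profiles.profiles_1d
    ]
    return time, te0


# Usage with the data_entry from the example above:
# time, te0 = load_te0(data_entry)
# data_entry.close()
# print(time[-1], te0[-1])  # plain floats, no backend access needed
```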

Lazy loading adds overhead to each read from the lowlevel backend, so a full get() or get_slice() is more efficient when you intend to use most of the data stored in an IDS.

When using IMAS-Python with remote data access (i.e. the UDA backend), a full get() or get_slice() may be more efficient than using lazy loading.

It is recommended to add the parameter ;cache_mode=none [1] to the end of a UDA IMAS URI when using lazy loading: otherwise the UDA backend will still load the full IDS from the remote server.
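
Appending the parameter can be done with a small string helper, sketched below. `disable_uda_cache` is a hypothetical function, and the URI is a made-up placeholder whose layout is an assumption; substitute the UDA IMAS URI you actually use.

```python
def disable_uda_cache(uri):
    """Append ';cache_mode=none' to `uri` unless it is already present."""
    option = "cache_mode=none"
    if option in uri.split(";"):
        return uri
    return f"{uri};{option}"


# Placeholder URI for illustration only, not a real server or path
uri = "imas://uda.example.org/uda?path=/placeholder;backend=hdf5"
lazy_uri = disable_uda_cache(uri)
print(lazy_uri)

# The result would then be used to open the entry, for example:
# entry = imas.DBEntry(lazy_uri, "r")
# core_profiles = entry.get("core_profiles", lazy=True)
```

Checking for the option first makes the helper safe to call on a URI that already carries it.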

Last update: 2026-01-28