Using Data Dictionary metadata

IMAS-Python provides convenient access to Data Dictionary metadata of any IDS node through the metadata attribute:

>>> import imas
>>> core_profiles = imas.IDSFactory().core_profiles()
>>> core_profiles.metadata
<IDSMetadata for 'core_profiles'>
>>> core_profiles.time.metadata
<IDSMetadata for 'time'>
>>> # etc.

In this lesson we will show how to work with this metadata by exploring a couple of use cases.

Overview of available metadata

The data dictionary metadata that is parsed by IMAS-Python is listed in the API documentation for IDSMetadata.

Note that IMAS-Python does not parse all metadata from the IMAS Data Dictionary. The unparsed metadata is still accessible on the metadata attribute. You can use imas.util.inspect() to get an overview of all metadata associated with an element in an IDS.

Example showing all metadata for some core_profiles elements.
>>> import imas
>>> core_profiles = imas.IDSFactory().core_profiles()
>>> imas.util.inspect(core_profiles.metadata)
╭---- <class 'imas.ids_metadata.IDSMetadata'> -----╮
│ Container for IDS Metadata                         │
│                                                    │
│ ╭------------------------------------------------╮ │
│ │ <IDSMetadata for 'core_profiles'>              │ │
│ ╰------------------------------------------------╯ │
│                                                    │
│   alternative_coordinates = ()                     │
│               coordinates = ()                     │
│       coordinates_same_as = ()                     │
│                 data_type = None                   │
│             documentation = 'Core plasma profiles' │
│     lifecycle_last_change = '3.39.0'               │
│          lifecycle_status = 'active'               │
│         lifecycle_version = '3.1.0'                │
│                  maxoccur = 15                     │
│                      name = 'core_profiles'        │
│                      ndim = 0                      │
│                      path = IDSPath('')            │
│                  path_doc = ''                     │
│               path_string = ''                     │
│ specific_validation_rules = 'yes'                  │
│              timebasepath = ''                     │
│                      type = <IDSType.NONE: None>   │
│                     units = ''                     │
╰----------------------------------------------------╯
>>> imas.util.inspect(core_profiles.time.metadata)
╭------ <class 'imas.ids_metadata.IDSMetadata'> -------╮
│ Container for IDS Metadata                             │
│                                                        │
│ ╭----------------------------------------------------╮ │
│ │ <IDSMetadata for 'time'>                           │ │
│ ╰----------------------------------------------------╯ │
│                                                        │
│ alternative_coordinates = ()                           │
│             coordinate1 = IDSCoordinate('1...N')       │
│             coordinates = (IDSCoordinate('1...N'),)    │
│     coordinates_same_as = (IDSCoordinate(''),)         │
│               data_type = <IDSDataType.FLT: 'FLT'>     │
│           documentation = 'Generic time'               │
│                maxoccur = None                         │
│                    name = 'time'                       │
│                    ndim = 1                            │
│                    path = IDSPath('time')              │
│                path_doc = 'time(:)'                    │
│             path_string = 'time'                       │
│            timebasepath = 'time'                       │
│                    type = <IDSType.DYNAMIC: 'dynamic'> │
│                   units = 's'                          │
╰--------------------------------------------------------╯

Coordinate metadata

The Data Dictionary has coordinate information on all non-scalar nodes: arrays of structures and data nodes that are not 0D. These coordinate descriptions can become quite complicated, but in summary they fall into two categories:

  1. Coordinates are indices.

    This is indicated by the Data Dictionary as coordinate = 1...{x}. Here {x} can be a number (e.g. 1...3), which means that this dimension should have exactly x elements. {x} can also be a literal N: 1...N, meaning that this dimension does not have a predetermined size.

    Sometimes multiple variables have index coordinates that are nevertheless linked. For example, image sensors could have one variable indicating raw observed values per pixel, and another variable storing some processed quantities per pixel. In this case, the coordinates are indices (line / column index of the pixel), but these must be the same for both quantities. This information is stored in the coordinates_same_as metadata.

  2. Coordinates are other quantities in the Data Dictionary.

    This is indicated by the Data Dictionary by specifying the path to the coordinate. There are multiple scenarios here, which are described in more detail in the section Using coordinates of quantities.
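The two categories above can be told apart from the coordinate string alone. The sketch below is purely illustrative and is not how IMAS-Python parses coordinates (it represents them with IDSCoordinate objects); the function name classify_coordinate and the returned dictionary layout are assumptions made for this example.

```python
import re

# Matches index coordinates such as "1...N" or "1...3"
_INDEX_RE = re.compile(r"^1\.\.\.(N|\d+)$")


def classify_coordinate(spec):
    """Hypothetical helper: describe a Data Dictionary coordinate string."""
    match = _INDEX_RE.match(spec)
    if match:
        size = match.group(1)
        # "N" means no predetermined size, a number means an exact size
        return {"kind": "index", "size": None if size == "N" else int(size)}
    # Anything else is treated as one or more paths to other quantities;
    # multiple options are separated by " OR "
    return {"kind": "path", "options": [p.strip() for p in spec.split(" OR ")]}


print(classify_coordinate("1...N"))
print(classify_coordinate("1...3"))
print(classify_coordinate("profiles_1d(itime)/grid/rho_tor_norm"))
```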

For most use cases it is not necessary to become an expert in all intricacies of Data Dictionary coordinates. Instead, you can use the coordinates attribute of arrays of structures and data nodes. For example, <ids node>.coordinates[0] will give you the data to use for the first coordinate.

Exercise 1: Using coordinates

  1. Load the training data for the core_profiles IDS. You can refresh how to do this in the following section of the basic training material: Open an IMAS database entry.

    1. Print the coordinate of profiles_1d[0].electrons.temperature. This is a 1D array, so there is only one coordinate. It can be accessed with <node>.coordinates[0]. Do you recognize the coordinate?

    2. Print the coordinate of the profiles_1d array of structures. What do you notice?

    3. Change the time mode of the IDS from homogeneous time to heterogeneous time. You do this by setting ids_properties.homogeneous_time = imas.ids_defs.IDS_TIME_MODE_HETEROGENEOUS. Print the coordinate of the profiles_1d array of structures again. What has changed?

  2. Load the training data for the equilibrium IDS.

    1. What is the coordinate of time_slice[0]/profiles_2d?

    2. What are the coordinates of time_slice[0]/profiles_2d[0]/b_field_r?

import imas.training

# 1. Load the training data for the core_profiles IDS:
entry = imas.training.get_training_db_entry()
core_profiles = entry.get("core_profiles")

# 1a. Print the coordinate of profiles_1d[0].electrons.temperature
print(core_profiles.profiles_1d[0].electrons.temperature.coordinates[0])
# Do you recognize the coordinate? Yes, as shown in the first line of the output, this
# is "profiles_1d[0]/grid/rho_tor_norm".

# 1b. Print the coordinate of profiles_1d:
print(core_profiles.profiles_1d.coordinates[0])
# What do you notice? This prints the core_profiles.time array:
#   <IDSNumericArray (IDS:core_profiles, time, FLT_1D)>
#   numpy.ndarray([  3.98722186, 432.93759781, 792.        ])

# 1c. Change the time mode and print again
core_profiles.ids_properties.homogeneous_time = \
    imas.ids_defs.IDS_TIME_MODE_HETEROGENEOUS
print(core_profiles.profiles_1d.coordinates[0])
# What has changed? Now we get a numpy array with values -9e+40:
#   [-9.e+40 -9.e+40 -9.e+40]
#
# In heterogeneous time, the coordinate of profiles_1d is profiles_1d/time, which is a
# scalar. IMAS-Python will construct a numpy array for you where
#   array[i] := profiles_1d[i]/time
# Since we didn't set these values, they are set to the default EMPTY_FLOAT, which is
# -9e+40.

# 2. Load the training data for the equilibrium IDS:
equilibrium = entry.get("equilibrium")

# 2a. What is the coordinate of time_slice/profiles_2d?
slice0 = equilibrium.time_slice[0]
print(slice0.profiles_2d.metadata.coordinates)
# This will output:
#   (IDSCoordinate('1...N'),)
# The coordinate of profiles_2d is an index. When requesting the coordinate values,
# IMAS-Python will generate an index array for you:
print(slice0.profiles_2d.coordinates[0])
# -> array([0])

# 2b. What are the coordinates of ``time_slice/profiles_2d/b_field_r``?
print(slice0.profiles_2d[0].b_field_r.metadata.coordinates)
# This is a 2D array and therefore there are two coordinates:
#   (IDSCoordinate('time_slice(itime)/profiles_2d(i1)/grid/dim1'),
#    IDSCoordinate('time_slice(itime)/profiles_2d(i1)/grid/dim2'))
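To make the behaviour seen in steps 1c and 2a concrete, here is a small stand-alone sketch of how such coordinate values could be assembled. It does not use IMAS-Python and is not its implementation: the Slice class and both helper functions are hypothetical, plain lists stand in for numpy arrays, and EMPTY_FLOAT is redefined locally with the value quoted in the solution above.

```python
from dataclasses import dataclass

EMPTY_FLOAT = -9e40  # default for unset floats, as quoted in the solution above


@dataclass
class Slice:
    """Hypothetical stand-in for one element of an array of structures."""

    time: float = EMPTY_FLOAT


def index_coordinate(n):
    """Values for a '1...N' coordinate of length n: zero-based indices."""
    return list(range(n))


def heterogeneous_time_coordinate(slices):
    """Values in heterogeneous time mode: element i is slices[i].time."""
    return [s.time for s in slices]


# Only the second slice has its time filled in
slices = [Slice(), Slice(time=1.5), Slice()]
print(index_coordinate(len(slices)))
print(heterogeneous_time_coordinate(slices))
```

Unset entries show up as the EMPTY_FLOAT placeholder, which is why the solution to 1c prints an array of -9e+40 values.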

Exercise 2: Alternative coordinates

  1. Create an empty distributions IDS.

  2. Use the metadata attribute to find the coordinates of distribution[]/profiles_2d[]/density. What do you notice?

    Hint

    distribution and profiles_2d are arrays of structures. When creating an empty IDS, these arrays of structures are empty as well.

    To access the metadata of the structures inside, you have two options:

    1. Resize the array of structures so you can access the metadata of the elements.

    2. Use the indexing operator on IDSMetadata. For example, distributions.metadata["distribution/wave"] to get the metadata of the distribution[]/wave array of structures.

  3. Resize the distribution and distribution[0].profiles_2d arrays of structures. Retrieve the coordinate values through the distribution[0].profiles_2d[0].density.coordinates attribute. What do you notice?

  4. You can still use the metadata to go to the coordinate node options:

    1. Use the references attribute of the IDSCoordinate objects in the metadata to get the paths to each of the coordinate options. This will give you the IDSPath objects for each coordinate option.

    2. Then, use IDSPath.goto to go to the corresponding IDS node.

import imas

# 1. Create an empty distributions IDS
distributions = imas.IDSFactory().distributions()

# 2. Use the metadata attribute to find the coordinates of
#    distribution/profiles_2d/density
print(distributions.metadata["distribution/profiles_2d/density"].coordinates)
# Alternatively, by resizing the Arrays of Structures:
distributions.distribution.resize(1)
distributions.distribution[0].profiles_2d.resize(1)
p2d = distributions.distribution[0].profiles_2d[0]
print(p2d.density.metadata.coordinates)
# This outputs (newlines added for clarity):
#  (IDSCoordinate('distribution(i1)/profiles_2d(itime)/grid/r
#                  OR distribution(i1)/profiles_2d(itime)/grid/rho_tor_norm'),
#   IDSCoordinate('distribution(i1)/profiles_2d(itime)/grid/z
#                  OR distribution(i1)/profiles_2d(itime)/grid/theta_geometric
#                  OR distribution(i1)/profiles_2d(itime)/grid/theta_straight'))
#
# What do you notice: in both dimensions there are multiple options for the coordinate.

# 3. Retrieve the coordinate values through the ``coordinates`` attribute.
# This will raise a coordinate lookup error because IMAS-Python cannot choose which of the
# coordinates to use:
try:
    print(p2d.density.coordinates[0])
except Exception as exc:
    print(exc)

# 4a. Use the IDSCoordinate.references attribute:
# Example for the first dimension:
coordinate_options = p2d.density.metadata.coordinates[0].references
# 4b. Use IDSPath.goto:
for option in coordinate_options:
    coordinate_node = option.goto(p2d.density)
    print(coordinate_node)
# This will print:
#   <IDSNumericArray (IDS:distributions, distribution[0]/profiles_2d[0]/grid/r, empty FLT_1D)>
#   <IDSNumericArray (IDS:distributions, distribution[0]/profiles_2d[0]/grid/rho_tor_norm, empty FLT_1D)>
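Once the candidate coordinate nodes are known, one possible way to choose between them is to take the first option that actually contains data. The sketch below shows that selection logic on plain Python data. The function name first_filled_option, the shortened paths, and the example values are all hypothetical; with real IDS nodes you would apply the same check to the nodes returned by IDSPath.goto.

```python
def first_filled_option(options):
    """Return (path, values) of the first option holding data, or None."""
    for path, values in options.items():
        if len(values) > 0:
            return path, values
    return None


# Hypothetical contents of the two options for the first dimension:
# grid/r is empty, grid/rho_tor_norm is filled
options = {
    "grid/r": [],
    "grid/rho_tor_norm": [0.0, 0.5, 1.0],
}

selected = first_filled_option(options)
print(selected)
```

Returning None when nothing is filled leaves the decision to the caller, which mirrors the fact that IMAS-Python itself refuses to guess.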

Units and dimensional analysis with Pint

Note

This section uses the Python package Pint to perform calculations with units. This package can be installed by following the instructions on its website.

The Data Dictionary specifies the units of stored quantities. This metadata is accessible in IMAS-Python via metadata.units. In most cases, these units are in a format that pint can understand (for example T, Wb, m^-3, m.s^-1).

There are some exceptions. The main ones are - (indicating that a quantity is dimensionless), Atomic Mass Unit and Elementary Charge Unit. There are also cases where the units depend on the context in which a quantity is used, but we will not go into that in this lesson.

For conversion of units from the Data Dictionary format to pint units, we recommend creating a custom function, such as the following:

Convert DD units to Pint Units
import pint

# Create pint UnitRegistry
ureg = pint.UnitRegistry()

# Convert DD units to Pint Units
_dd_to_pint = {
    "-": ureg("dimensionless"),
    "Atomic Mass Unit": ureg("unified_atomic_mass_unit"),
    "Elementary Charge Unit": ureg("elementary_charge"),
}
def dd_to_pint(dd_unit):
    if dd_unit in _dd_to_pint:
        return _dd_to_pint[dd_unit]
    return ureg(dd_unit)

Exercise 3: Calculate the mass density from core_profiles/profiles_1d

  1. Load the training data for the core_profiles IDS.

  2. Select the first time slice of profiles_1d for the calculation.

  3. Create a pint.UnitRegistry and conversion function from DD units to pint units.

  4. Calculate the mass density:

    1. Create the result variable with the correct unit (kg.m^-3): mass_density = ureg("0 kg.m^-3").

    2. Loop over all ion and neutral species in profiles_1d. For each one, calculate the mass of the species (the sum of the masses of the elements that comprise the species) and multiply it by the species density to get the mass density of the species.

      Use the metadata.units and dd_to_pint conversion function to get the correct units during the calculation.

    3. Print the total mass density (the sum of all species mass densities) in SI units (kg.m^-3).

import itertools  # python standard library iteration tools

import imas
import imas.training
import pint

# 1. Load core_profiles IDS from training DBEntry
entry = imas.training.get_training_db_entry()
cp = entry.get("core_profiles")

# 2. Select the first time slice of profiles_1d
p1d = cp.profiles_1d[0]

# 3.
# Create pint UnitRegistry
ureg = pint.UnitRegistry()

# Convert DD units to Pint Units
_dd_to_pint = {
    "-": ureg("dimensionless"),
    "Atomic Mass Unit": ureg("unified_atomic_mass_unit"),
    "Elementary Charge Unit": ureg("elementary_charge"),
}
def dd_to_pint(dd_unit):
    if dd_unit in _dd_to_pint:
        return _dd_to_pint[dd_unit]
    return ureg(dd_unit)
# End of translation

# 4. Calculate mass density:
# 4a. Create mass_density variable with units:
mass_density = ureg("0 kg.m^-3")
# 4b. Loop over all ion and neutral species
for species in itertools.chain(p1d.ion, p1d.neutral):
    mass = sum(
        element.a * dd_to_pint(element.a.metadata.units)
        for element in species.element
    )
    density = species.density * dd_to_pint(species.density.metadata.units)
    mass_density += mass * density

# 4c. Print the total mass density
print(mass_density)
# Note that the species mass is given in Atomic Mass Units, but pint
# automatically converted this to kilograms for us, because we defined
# mass_density in kg/m^3!
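To see what the automatic conversion amounts to, the same arithmetic can be written out by hand for a single species. This sketch uses neither pint nor IMAS-Python. The species values (a deuterium-like mass of 2.014 atomic mass units and a density of 1e19 m^-3) are made-up illustration values, not training data, and the conversion factor is the CODATA 2018 atomic mass constant.

```python
# Atomic mass constant in kilograms (CODATA 2018)
ATOMIC_MASS_UNIT_KG = 1.66053906660e-27

# Hypothetical species: values chosen for illustration only
species_mass_amu = 2.014   # mass in atomic mass units
species_density = 1.0e19   # number density in m^-3

# Convert the mass to kilograms, then multiply by the number density
species_mass_kg = species_mass_amu * ATOMIC_MASS_UNIT_KG
mass_density = species_mass_kg * species_density  # kg.m^-3

print(f"{mass_density:.4e} kg.m^-3")
```

With pint, the multiplication by the conversion factor happens implicitly, which removes one common source of unit mistakes.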

Last update: 2026-01-28