Using Data Dictionary metadata¶
IMAS-Python provides convenient access to Data Dictionary metadata of any IDS node through
the metadata attribute:
>>> import imas
>>> core_profiles = imas.IDSFactory().core_profiles()
>>> core_profiles.metadata
<IDSMetadata for 'core_profiles'>
>>> core_profiles.time.metadata
<IDSMetadata for 'time'>
>>> # etc.
In this lesson we will show how to work with this metadata by exploring a couple of use cases.
Overview of available metadata¶
The data dictionary metadata that is parsed by IMAS-Python is listed in the API
documentation for IDSMetadata.
Note that not all metadata from the IMAS Data Dictionary is parsed by IMAS-Python.
Unparsed metadata is still accessible on the metadata attribute. You can use
imas.util.inspect() to get an overview of all metadata associated with an
element in an IDS.
Example: inspecting the metadata of core_profiles elements.
>>> import imas
>>> core_profiles = imas.IDSFactory().core_profiles()
>>> imas.util.inspect(core_profiles.metadata)
╭---- <class 'imas.ids_metadata.IDSMetadata'> -----╮
│ Container for IDS Metadata │
│ │
│ ╭------------------------------------------------╮ │
│ │ <IDSMetadata for 'core_profiles'> │ │
│ ╰------------------------------------------------╯ │
│ │
│ alternative_coordinates = () │
│ coordinates = () │
│ coordinates_same_as = () │
│ data_type = None │
│ documentation = 'Core plasma profiles' │
│ lifecycle_last_change = '3.39.0' │
│ lifecycle_status = 'active' │
│ lifecycle_version = '3.1.0' │
│ maxoccur = 15 │
│ name = 'core_profiles' │
│ ndim = 0 │
│ path = IDSPath('') │
│ path_doc = '' │
│ path_string = '' │
│ specific_validation_rules = 'yes' │
│ timebasepath = '' │
│ type = <IDSType.NONE: None> │
│ units = '' │
╰----------------------------------------------------╯
>>> imas.util.inspect(core_profiles.time.metadata)
╭------ <class 'imas.ids_metadata.IDSMetadata'> -------╮
│ Container for IDS Metadata │
│ │
│ ╭----------------------------------------------------╮ │
│ │ <IDSMetadata for 'time'> │ │
│ ╰----------------------------------------------------╯ │
│ │
│ alternative_coordinates = () │
│ coordinate1 = IDSCoordinate('1...N') │
│ coordinates = (IDSCoordinate('1...N'),) │
│ coordinates_same_as = (IDSCoordinate(''),) │
│ data_type = <IDSDataType.FLT: 'FLT'> │
│ documentation = 'Generic time' │
│ maxoccur = None │
│ name = 'time' │
│ ndim = 1 │
│ path = IDSPath('time') │
│ path_doc = 'time(:)' │
│ path_string = 'time' │
│ timebasepath = 'time' │
│ type = <IDSType.DYNAMIC: 'dynamic'> │
│ units = 's' │
╰--------------------------------------------------------╯
Coordinate metadata¶
The Data Dictionary has coordinate information on all non-scalar nodes: arrays of structures and data nodes that are not 0D. These coordinate descriptions can become quite complicated, but in summary they fall into two categories:
1. Coordinates are indices.
   This is indicated by the Data Dictionary as coordinate = 1...{x}. Here {x} can be a number (e.g. 1...3), which means that this dimension should have exactly x elements. {x} can also be a literal N: 1...N, meaning that this dimension does not have a predetermined size.
   Sometimes multiple variables have index coordinates that are still linked. For example, image sensors could have one variable indicating raw observed values per pixel, and another variable storing some processed quantities per pixel. In this case, the coordinates are indices (line / column index of the pixel), but these must be the same for both quantities. This information is stored in the coordinates_same_as metadata.
2. Coordinates are other quantities in the Data Dictionary.
   This is indicated by the Data Dictionary by specifying the path to the coordinate. There are multiple scenarios here, which are described in more detail in the section Using coordinates of quantities.
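The two categories above can be told apart from the coordinate string alone. The following is a minimal sketch in plain Python, assuming the string formats shown in this lesson (1...N, 1...3, a path, or several paths joined by " OR "). The helper classify_coordinate is hypothetical and is not part of IMAS-Python, which already exposes this information through IDSCoordinate objects.

```python
import re


def classify_coordinate(spec: str) -> dict:
    """Classify a Data Dictionary coordinate string.

    Hypothetical helper for illustration only; not an IMAS-Python function.
    """
    spec = spec.strip()
    # Several alternatives are separated by " OR " in the coordinate string.
    options = [part.strip() for part in spec.split(" OR ")]
    if len(options) > 1:
        return {"kind": "alternatives", "options": options}
    # Index coordinates look like "1...N" (any size) or "1...3" (fixed size).
    match = re.fullmatch(r"1\.\.\.(N|\d+)", spec)
    if match:
        size = None if match.group(1) == "N" else int(match.group(1))
        return {"kind": "index", "size": size}
    # Anything else is treated as a path to another quantity.
    return {"kind": "path", "path": spec}


for example in (
    "1...N",
    "1...3",
    "time_slice(itime)/profiles_2d(i1)/grid/dim1",
    "distribution(i1)/profiles_2d(itime)/grid/r"
    " OR distribution(i1)/profiles_2d(itime)/grid/rho_tor_norm",
):
    print(example, "->", classify_coordinate(example)["kind"])
```

A size of None marks an index dimension without a predetermined size, mirroring the literal N in the Data Dictionary.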
For most use cases it is not necessary to become an expert in all
intricacies of Data Dictionary coordinates. Instead, you can use the coordinates
attribute of array of structures and data nodes. For example <ids
node>.coordinates[0] will give you the data to use for the first coordinate.
Exercise 1: Using coordinates¶
1. Load the training data for the core_profiles IDS. You can refresh your memory on how to do this in the following section of the basic training material: Open an IMAS database entry.
   a. Print the coordinate of profiles_1d[0].electrons.temperature. This is a 1D array, so there is only one coordinate. It can be accessed with <node>.coordinates[0]. Do you recognize the coordinate?
   b. Print the coordinate of the profiles_1d array of structures. What do you notice?
   c. Change the time mode of the IDS from homogeneous time to heterogeneous time. You do this by setting ids_properties.homogeneous_time = imas.ids_defs.IDS_TIME_MODE_HETEROGENEOUS. Print the coordinate of the profiles_1d array of structures again. What has changed?
2. Load the training data for the equilibrium IDS.
   a. What is the coordinate of time_slice[0]/profiles_2d?
   b. What are the coordinates of time_slice[0]/profiles_2d[0]/b_field_r?
import imas.training
# 1. Load the training data for the core_profiles IDS:
entry = imas.training.get_training_db_entry()
core_profiles = entry.get("core_profiles")
# 1a. Print the coordinate of profiles_1d[0].electrons.temperature
print(core_profiles.profiles_1d[0].electrons.temperature.coordinates[0])
# Do you recognize the coordinate? Yes, as shown in the first line of the output, this
# is "profiles_1d[0]/grid/rho_tor_norm".
# 1b. Print the coordinate of profiles_1d:
print(core_profiles.profiles_1d.coordinates[0])
# What do you notice? This prints the core_profiles.time array:
# <IDSNumericArray (IDS:core_profiles, time, FLT_1D)>
# numpy.ndarray([ 3.98722186, 432.93759781, 792. ])
# 1c. Change the time mode and print again
core_profiles.ids_properties.homogeneous_time = \
    imas.ids_defs.IDS_TIME_MODE_HETEROGENEOUS
print(core_profiles.profiles_1d.coordinates[0])
# What has changed? Now we get a numpy array with values -9e+40:
# [-9.e+40 -9.e+40 -9.e+40]
#
# In heterogeneous time, the coordinate of profiles_1d is profiles_1d/time, which is a
# scalar. IMAS-Python will construct a numpy array for you where
# array[i] := profiles_1d[i]/time
# Since we didn't set these values, they are set to the default EMPTY_FLOAT, which is
# -9e+40.
# 2. Load the training data for the equilibrium IDS:
equilibrium = entry.get("equilibrium")
# 2a. What is the coordinate of time_slice/profiles_2d?
slice0 = equilibrium.time_slice[0]
print(slice0.profiles_2d.metadata.coordinates)
# This will output:
# (IDSCoordinate('1...N'),)
# The coordinate of profiles_2d is an index. When requesting the coordinate values,
# IMAS-Python will generate an index array for you:
print(slice0.profiles_2d.coordinates[0])
# -> array([0])
# 2b. What are the coordinates of ``time_slice/profiles_2d/b_field_r``?
print(slice0.profiles_2d[0].b_field_r.metadata.coordinates)
# This is a 2D array and therefore there are two coordinates:
# (IDSCoordinate('time_slice(itime)/profiles_2d(i1)/grid/dim1'),
# IDSCoordinate('time_slice(itime)/profiles_2d(i1)/grid/dim2'))
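The heterogeneous-time behaviour from step 1c (array[i] := profiles_1d[i]/time) can be pictured with a small stand-alone sketch. It uses a mock class instead of a real IDS, so MockSlice and heterogeneous_time_coordinate are hypothetical names, and the code only illustrates the idea; it is not how IMAS-Python implements the lookup.

```python
from dataclasses import dataclass

# Default value for unset floats, as mentioned in the solution above.
EMPTY_FLOAT = -9e40


@dataclass
class MockSlice:
    """Hypothetical stand-in for one profiles_1d element (not an IMAS-Python class)."""

    time: float = EMPTY_FLOAT


def heterogeneous_time_coordinate(slices):
    """Collect the scalar time of every slice into one coordinate list."""
    return [time_slice.time for time_slice in slices]


slices = [MockSlice(), MockSlice(), MockSlice()]
# All three times are unset, so every entry equals EMPTY_FLOAT.
print(heterogeneous_time_coordinate(slices))

# After filling in the per-slice times, the coordinate reflects them.
for index, time_slice in enumerate(slices):
    time_slice.time = 0.5 * index
print(heterogeneous_time_coordinate(slices))
```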
Exercise 2: Alternative coordinates¶
1. Create an empty distributions IDS.
2. Use the metadata attribute to find the coordinates of distribution[]/profiles_2d[]/density. What do you notice?
   Hint: distribution and profiles_2d are arrays of structures. When creating an empty IDS, these arrays of structures are empty as well. To access the metadata of the structures inside, you have two options:
   - Resize the array of structures so you can access the metadata of the elements.
   - Use the indexing operator on IDSMetadata. For example, distributions.metadata["distribution/wave"] to get the metadata of the distribution[]/wave array of structures.
3. Resize the distribution and distribution[0].profiles_2d arrays of structures. Retrieve the coordinate values through the distribution[0].profiles_2d[0].density.coordinates attribute. What do you notice?
4. You can still use the metadata to go to the coordinate node options:
   a. Use the references attribute of the IDSCoordinate objects in the metadata to get the paths to each of the coordinate options. This will give you the IDSPath objects for each coordinate option.
   b. Then, use IDSPath.goto to go to the corresponding IDS node.
import imas
# 1. Create an empty distributions IDS
distributions = imas.IDSFactory().distributions()
# 2. Use the metadata attribute to find the coordinates of
# distribution/profiles_2d/density
print(distributions.metadata["distribution/profiles_2d/density"].coordinates)
# Alternatively, by resizing the Arrays of Structures:
distributions.distribution.resize(1)
distributions.distribution[0].profiles_2d.resize(1)
p2d = distributions.distribution[0].profiles_2d[0]
print(p2d.density.metadata.coordinates)
# This outputs (newlines added for clarity):
# (IDSCoordinate('distribution(i1)/profiles_2d(itime)/grid/r
# OR distribution(i1)/profiles_2d(itime)/grid/rho_tor_norm'),
# IDSCoordinate('distribution(i1)/profiles_2d(itime)/grid/z
# OR distribution(i1)/profiles_2d(itime)/grid/theta_geometric
# OR distribution(i1)/profiles_2d(itime)/grid/theta_straight'))
#
# What do you notice: in both dimensions there are multiple options for the coordinate.
# 3. Retrieve the coordinate values through the ``coordinates`` attribute.
# This will raise a coordinate lookup error because IMAS-Python cannot choose which of the
# coordinates to use:
try:
    print(p2d.density.coordinates[0])
except Exception as exc:
    print(exc)
# 4a. Use the IDSCoordinate.references attribute:
# Example for the first dimension:
coordinate_options = p2d.density.metadata.coordinates[0].references
# 4b. Use IDSPath.goto:
for option in coordinate_options:
    coordinate_node = option.goto(p2d.density)
    print(coordinate_node)
# This will print:
# <IDSNumericArray (IDS:distributions, distribution[0]/profiles_2d[0]/grid/r, empty FLT_1D)>
# <IDSNumericArray (IDS:distributions, distribution[0]/profiles_2d[0]/grid/rho_tor_norm, empty FLT_1D)>
Units and dimensional analysis with Pint¶
Note
This section uses the Python package Pint to perform calculations with units. This package can be installed by following the instructions on its website.
The Data Dictionary specifies the units of stored quantities. This metadata is
accessible in IMAS-Python via metadata.units. In most cases, these units are in a format
that pint can understand (for example T, Wb, m^-3, m.s^-1).
There are some exceptions, the main ones being - (indicating that a quantity is
dimensionless), Atomic Mass Unit and Elementary Charge Unit. There are also
cases where the units depend on the context in which a quantity is used, but we will not
go into that in this lesson.
For conversion of units from the Data Dictionary format to pint units, we recommend creating a custom function, such as the following:
import pint

# Create pint UnitRegistry
ureg = pint.UnitRegistry()
# Convert DD units to Pint Units
_dd_to_pint = {
    "-": ureg("dimensionless"),
    "Atomic Mass Unit": ureg("unified_atomic_mass_unit"),
    "Elementary Charge Unit": ureg("elementary_charge"),
}
def dd_to_pint(dd_unit):
    # Use the special-case translation when there is one, else let pint parse it
    if dd_unit in _dd_to_pint:
        return _dd_to_pint[dd_unit]
    return ureg(dd_unit)
Exercise 3: Calculate the mass density from core_profiles/profiles_1d¶
1. Load the training data for the core_profiles IDS.
2. Select the first time slice of profiles_1d for the calculation.
3. Create a pint.UnitRegistry and conversion function from DD units to pint units.
4. Calculate the mass density:
   a. Create the result variable with the correct unit (kg.m^-3): mass_density = ureg("0 kg.m^-3").
   b. Loop over all ion and neutral species in profiles_1d. For each one, calculate the mass of the species (the sum of the masses of the elements that comprise the species) and multiply it with the species density to get the mass density of the species. Use the metadata.units and dd_to_pint conversion function to get the correct units during the calculation.
   c. Print the total mass density (the sum of all species mass densities) in SI units (kg.m^-3).
import itertools # python standard library iteration tools
import imas
import imas.training
import pint
# 1. Load core_profiles IDS from training DBEntry
entry = imas.training.get_training_db_entry()
cp = entry.get("core_profiles")
# 2. Select the first time slice of profiles_1d
p1d = cp.profiles_1d[0]
# 3.
# Create pint UnitRegistry
ureg = pint.UnitRegistry()
# Convert DD units to Pint Units
_dd_to_pint = {
"-": ureg("dimensionless"),
"Atomic Mass Unit": ureg("unified_atomic_mass_unit"),
"Elementary Charge Unit": ureg("elementary_charge"),
}
def dd_to_pint(dd_unit):
    if dd_unit in _dd_to_pint:
        return _dd_to_pint[dd_unit]
    return ureg(dd_unit)
# End of translation
# 4. Calculate mass density:
# 4a. Create mass_density variable with units:
mass_density = ureg("0 kg.m^-3")
# 4b. Loop over all ion and neutral species
for species in itertools.chain(p1d.ion, p1d.neutral):
    mass = sum(
        element.a * dd_to_pint(element.a.metadata.units)
        for element in species.element
    )
    density = species.density * dd_to_pint(species.density.metadata.units)
    mass_density += mass * density
# 4c. Print the total mass density
print(mass_density)
# Note that the species mass is given in Atomic Mass Units, but pint
# automatically converted this to kilograms for us, because we defined
# mass_density in kg/m^3!
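To see what the unit conversion does numerically, the same arithmetic can be written out without pint. The sketch below is only a plausibility check under stated assumptions: the species list is a made-up single-species deuterium plasma (not the training data), and the conversion factor is the CODATA 2018 value of the unified atomic mass unit. The names AMU_IN_KG and mass_density_kg_m3 are illustrative.

```python
# CODATA 2018 value of the unified atomic mass unit, in kilograms.
AMU_IN_KG = 1.66053906660e-27

# Hypothetical input: one deuterium species at a uniform density of 1e19 m^-3.
# These numbers are illustrative and are not taken from the training data.
species_list = [
    {"element_masses_amu": [2.014], "density_m3": 1.0e19},
]


def mass_density_kg_m3(species):
    """Sum mass [kg] times number density [m^-3] over all species."""
    total = 0.0
    for entry in species:
        # Species mass: sum of its element masses, converted from amu to kg.
        mass_kg = sum(entry["element_masses_amu"]) * AMU_IN_KG
        total += mass_kg * entry["density_m3"]
    return total


result = mass_density_kg_m3(species_list)
print(f"{result:.4e} kg/m^3")
```

With pint, the factor AMU_IN_KG is applied implicitly when a quantity in atomic mass units is added to a quantity defined in kg.m^-3.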