IMAS-Python architecture¶
This document provides a brief overview of the components of IMAS-Python, grouped into different functional areas.
We don’t aim to give detailed explanations of the code or the algorithms in it. These should be annotated in more detail in docstrings and inline comments.
Data Dictionary metadata¶
These classes are used to parse and represent IDS metadata from the Data Dictionary. Metadata objects are generated from a Data Dictionary XML and are (supposed to be) immutable.
imas.ids_metadata contains the main metadata class IDSMetadata. This class is generated from an <IDS> or <field> element in the Data Dictionary XML and contains all (parsed) data belonging to that <IDS> or <field>. Most of the (Python) attributes correspond directly to an attribute of the XML element.
    This module also contains the IDSType enum. This enum corresponds to the Data Dictionary notion of type, which can be dynamic, constant, static or unavailable on a Data Dictionary element.
imas.ids_coordinates contains two classes: IDSCoordinate, which handles the parsing of coordinate identifiers from the Data Dictionary, and IDSCoordinates, which handles coordinate retrieval and validation of IDS nodes.
    IDSCoordinate objects are created for each coordinate attribute of a Data Dictionary element: coordinate1, coordinate2, … coordinate1_same_as, etc.
    IDSCoordinates is created and assigned as the coordinates attribute of IDSStructArray and IDSPrimitive objects. This class is responsible for retrieving coordinate values and for checking the coordinate consistency in validate().
imas.ids_data_type handles parsing Data Dictionary data_type attributes (see method parse()) to an IDSDataType and a number of dimensions.
    IDSDataType also has attributes for default values and mappings to Python / NumPy / Access Layer type identifiers.
imas.ids_path handles parsing of IDS paths to IDSPath objects. Paths can occur as the path attribute of Data Dictionary elements, and inside coordinate identifiers.
    Caution
    Although an IDSPath in IMAS-Python implements roughly the same concept as the “IDS Path syntax” in the Data Dictionary, they are not necessarily the same thing!
    At the moment of writing this (January 2024), the IDS path definition in the Data Dictionary is not yet finalized. Be aware that the syntax of IMAS-Python’s IDSPath may differ slightly and might be incompatible with the definition from the Data Dictionary.
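To make the data_type parsing described for imas.ids_data_type more concrete, here is a minimal standalone sketch. It does not use IMAS-Python: ToyDataType and parse_data_type are hypothetical names, and the sketch assumes data_type strings of the form FLT_1D, INT_0D, STR_1D or CPX_2D plus the two structural types. The real parse() method may accept other spellings and behave differently.

```python
import re
from enum import Enum


class ToyDataType(Enum):
    """Hypothetical stand-in for IDSDataType (not the IMAS-Python class)."""

    STRUCTURE = "structure"
    STRUCT_ARRAY = "struct_array"
    STR = "STR"
    INT = "INT"
    FLT = "FLT"
    CPX = "CPX"


# Assumed format: a type prefix, an underscore and the number of dimensions.
_PATTERN = re.compile(r"^(STR|INT|FLT|CPX)_(\d)D$")


def parse_data_type(data_type):
    """Split an assumed data_type string into (type, number of dimensions)."""
    if data_type == "structure":
        return ToyDataType.STRUCTURE, 0
    if data_type == "struct_array":
        return ToyDataType.STRUCT_ARRAY, 1
    match = _PATTERN.match(data_type)
    if match is None:
        raise ValueError(f"Unknown data_type: {data_type!r}")
    return ToyDataType(match.group(1)), int(match.group(2))
```

Returning the type and the dimension count as separate values mirrors the description above, where parsing yields an IDSDataType and a number of dimensions.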
Data Dictionary building and loading¶
The following submodules are responsible for building the Data Dictionary and loading DD definitions at runtime.
imas.dd_zip handles loading the Data Dictionary definitions at runtime.
IDS nodes¶
The following submodules and classes represent IDS nodes.
imas.ids_base defines the base class for all IDS nodes: IDSBase. This class is an abstract class and shouldn’t be instantiated directly.
    Several useful properties are defined in this class, which are therefore available on any IDS node:
    _time_mode returns the ids_properties/homogeneous_time node.
    _parent returns the parent object. Some examples:
        >>> import imas
        >>> core_profiles = imas.IDSFactory().core_profiles()
        >>> core_profiles._parent
        <imas.ids_factory.IDSFactory object at 0x7faa06bfac70>
        >>> core_profiles.ids_properties._parent
        <IDSToplevel (IDS:core_profiles)>
        >>> core_profiles.ids_properties.homogeneous_time._parent
        <IDSStructure (IDS:core_profiles, ids_properties)>
        >>> core_profiles.profiles_1d.resize(1)
        >>> core_profiles.profiles_1d[0]._parent
        <IDSStructArray (IDS:core_profiles, profiles_1d with 1 items)>
        >>> core_profiles.profiles_1d[0].time._parent
        <IDSStructure (IDS:core_profiles, profiles_1d[0])>
    _dd_parent returns the “data-dictionary” parent. This is usually the same as the _parent, except for elements of Arrays of Structures:
        >>> core_profiles = imas.IDSFactory().core_profiles()
        >>> core_profiles._dd_parent
        <imas.ids_factory.IDSFactory object at 0x7faa06bfac70>
        >>> core_profiles.ids_properties._dd_parent
        <IDSToplevel (IDS:core_profiles)>
        >>> core_profiles.ids_properties.homogeneous_time._dd_parent
        <IDSStructure (IDS:core_profiles, ids_properties)>
        >>> core_profiles.profiles_1d.resize(1)
        >>> # Note: _dd_parent for this structure is different from its parent:
        >>> core_profiles.profiles_1d[0]._dd_parent
        <IDSToplevel (IDS:core_profiles)>
        >>> core_profiles.profiles_1d[0].time._dd_parent
        <IDSStructure (IDS:core_profiles, profiles_1d[0])>
    _path gives the path to this IDS node, including Array of Structures indices.
    _lazy indicates if the IDS is lazy loaded.
    _version is the Data Dictionary version of this node.
    _toplevel is a shortcut to the IDSToplevel element that this node is a descendant of.
imas.ids_primitive contains all data node classes, which are child classes of IDSPrimitive. IDSPrimitive implements all functionality that is common to every data type, whereas the classes in the list below are specific to a data type.
    Assignment-time data type checking is handled by the setter of the value property and the _cast_value methods on each of the type specialization classes.
    IDSString0D is the type specialization for 0D strings. It can be used as if it is a Python str object.
    IDSString1D is the type specialization for 1D strings. It behaves as if it is a Python list of str.
    IDSNumeric0D is the base class for 0D numerical types:
        IDSComplex0D is the type specialization for 0D complex numbers. It can be used as if it is a Python complex.
        IDSFloat0D is the type specialization for 0D floating point numbers. It can be used as if it is a Python float.
        IDSInt0D is the type specialization for 0D integers. It can be used as if it is a Python int.
    IDSNumericArray is the type specialization for any numeric type with at least one dimension. It can be used as if it is a numpy.ndarray.
imas.ids_struct_array contains the IDSStructArray class, which models Arrays of Structures. It also contains some Lazy loading logic.
imas.ids_structure contains the IDSStructure class, which models Structures. It contains the Lazy instantiation logic and some of the Lazy loading logic.
imas.ids_toplevel contains the IDSToplevel class, which is a subclass of IDSStructure and models toplevel IDSs.
    It implements some API methods that are only available on IDSs, such as validate and (de)serialize, and overrides the implementations of some properties.
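The difference between _parent and _dd_parent can be sketched with a small toy model. The classes below are hypothetical and unrelated to the real IDSBase hierarchy. The sketch assumes that the Data Dictionary parent of an Array of Structures element is the parent of the array itself, because the Data Dictionary has no separate element for the individual items of an array.

```python
class Node:
    """Hypothetical toy node; not the IMAS-Python IDSBase class."""

    def __init__(self, name, parent=None):
        self.name = name
        self._parent = parent

    @property
    def _dd_parent(self):
        # By default the Data Dictionary parent is the direct parent.
        return self._parent


class Element(Node):
    """Toy structure that lives inside a StructArray."""

    @property
    def _dd_parent(self):
        # Assumption: skip the array and return the parent of the array.
        return self._parent._parent


class StructArray(Node):
    """Toy Array of Structures holding indexed Element children."""

    def __init__(self, name, parent=None):
        super().__init__(name, parent)
        self.items = []

    def resize(self, size):
        self.items = [Element(f"{self.name}[{i}]", self) for i in range(size)]

    def __getitem__(self, index):
        return self.items[index]


toplevel = Node("core_profiles")
profiles_1d = StructArray("profiles_1d", toplevel)
profiles_1d.resize(1)
```

In this toy model, profiles_1d[0]._parent is the array, while profiles_1d[0]._dd_parent is the toplevel node.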
Lazy instantiation¶
IDS nodes are instantiated only when needed. This is handled by
IDSStructure.__getattr__. When a new IDS Structure is created, it initially doesn’t
have any IDS child nodes instantiated:
>>> import imas
>>> # Create an empty IDS
>>> cp = imas.IDSFactory().core_profiles()
>>> # Show which elements are already created:
>>> list(cp.__dict__)
['_lazy', '_children', '_parent', 'metadata', '__doc__', '_lazy_context']
>>> # When we request a child element, it is automatically created:
>>> cp.time
<IDSNumericArray (IDS:core_profiles, time, empty FLT_1D)>
>>> list(cp.__dict__)
['_lazy', '_children', '_parent', 'metadata', '__doc__', '_lazy_context',
'time', '_toplevel']
This improves performance by creating fewer Python objects: in most use cases, only a subset of the nodes in an IDS is used, and these use cases benefit substantially from lazy instantiation.
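The mechanism can be illustrated with a standalone sketch that relies only on Python's __getattr__ hook. LazyStructure and its fixed list of child names are hypothetical. IMAS-Python derives the available children from the Data Dictionary metadata and creates properly typed nodes, which this sketch does not attempt.

```python
class LazyStructure:
    """Hypothetical toy structure; not the IMAS-Python IDSStructure class."""

    # Assumed child names; IMAS-Python takes these from the DD metadata.
    _child_names = ("time", "ids_properties")

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails,
        # which is the case the first time a child is requested.
        if name not in self._child_names:
            raise AttributeError(name)
        child = LazyStructure()
        # Store the child in __dict__ so later lookups bypass __getattr__.
        self.__dict__[name] = child
        return child


node = LazyStructure()
before = list(node.__dict__)  # no children instantiated yet
first = node.time             # triggers creation of the child
after = list(node.__dict__)   # the child is now stored on the instance
```

Because the created child is stored in the instance __dict__, a second access to node.time returns the same object without calling __getattr__ again.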
Lazy loading¶
Lazy loading defers reading the data from the backend in a
get() or get_slice()
until the data is requested. This is handled in two places:
1. IDSStructure.__getattr__ implements the lazy loading alongside the lazy instantiation. When a new element is created by lazy instantiation, it will call imas.db_entry_helpers._get_child to lazy load this element:
    When the element is a data node (IDSPrimitive subclass), the data for this element is loaded from the backend.
    When the element is another structure, nothing needs to be loaded from the backend. Instead, we store the context on the created IDSStructure and data loading is handled recursively when needed.
    When the element is an Array of Structures, we also only store the context on the created IDSStructArray. Loading is handled as described in point 2.
2. IDSStructArray._load implements the lazy loading of Arrays of Structures and their elements. This is triggered whenever an element is accessed (__getitem__) or the size of the Array of Structures is requested (__len__).
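The following toy sketch illustrates the general idea of deferring backend reads until a data node is requested. FakeBackend, LazyNode and the nested layout dictionary are all hypothetical, and the sketch covers only structures and data nodes, not Arrays of Structures. It is not how _get_child or the Access Layer contexts are implemented.

```python
class FakeBackend:
    """Hypothetical backend that counts how often data is read."""

    def __init__(self, data):
        self._data = data
        self.reads = 0

    def read(self, path):
        self.reads += 1
        return self._data[path]


class LazyNode:
    """Toy structure that keeps only its context until a child is requested."""

    def __init__(self, backend, path, children):
        self._backend = backend
        self._path = path
        self._children = children  # child name -> nested dict, or None for data

    def __getattr__(self, name):
        if name.startswith("_") or name not in self._children:
            raise AttributeError(name)
        path = f"{self._path}/{name}" if self._path else name
        spec = self._children[name]
        if spec is None:
            # Data node: read the value from the backend now.
            value = self._backend.read(path)
        else:
            # Structure: pass on the context only, nothing is read yet.
            value = LazyNode(self._backend, path, spec)
        self.__dict__[name] = value
        return value


backend = FakeBackend({"ids_properties/comment": "test", "time": [0.0, 1.0]})
layout = {"ids_properties": {"comment": None}, "time": None}
ids = LazyNode(backend, "", layout)
properties = ids.ids_properties        # creates a structure, reads nothing
reads_after_structure = backend.reads
comment = properties.comment           # first backend read happens here
```

Requesting the structure leaves the read counter at zero. Only the access to the data node contacts the backend, and the loaded value is cached on the node afterwards.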
Creating and loading IDSs¶
imas.db_entry contains the DBEntry class. This class represents an on-disk Data Entry and can be used to store (put(), put_slice()) or load (get(), get_slice()) IDSs. The actual implementation of data storage and retrieval is handled by the backends in the imas.backends.* subpackages.
    DBEntry handles the automatic conversion between IDS versions as described in Automatic conversion between DD versions.
imas.ids_factory contains the IDSFactory class. This class is responsible for creating IDS toplevels from a given Data Dictionary definition, and can list all IDS names inside a DD definition.
Access Layer interfaces¶
imas.backends.imas_core.al_context provides an object-oriented interface when working with Lowlevel contexts. The contexts returned by the lowlevel are integer identifiers and need to be provided to several LL methods (e.g. read_data), some of which may create new contexts.
    The ALContext class implements this object-oriented interface.
    A second class (LazyALContext) implements the same interface, but is used when Lazy loading.
imas.ids_defs provides access to Access Layer constants.
imas.backends.imas_core.imas_interface provides a version-independent interface to the Access Layer through LowlevelInterface. It defines all known methods of the Access Layer and defers to the correct implementation if it is available in the loaded AL version (and raises a descriptive exception if the function is not available).
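One way to picture such a version-independent interface is the sketch below. It is a hedged illustration of the general pattern, not the LowlevelInterface implementation: ToyInterface, OldLibrary, bind, FunctionUnavailableError and list_entries are hypothetical names, and only read_data is taken from the text above.

```python
class FunctionUnavailableError(RuntimeError):
    """Hypothetical exception for functions missing from the loaded version."""


class ToyInterface:
    """Declares every known function; each raises until it is bound."""

    def read_data(self, ctx, path):
        raise FunctionUnavailableError("read_data is not available")

    def list_entries(self, ctx):
        raise FunctionUnavailableError("list_entries is not available")


class OldLibrary:
    """Stand-in for an older low-level library that only has read_data."""

    @staticmethod
    def read_data(ctx, path):
        return f"data at {path} (context {ctx})"


def bind(interface, library):
    """Replace each declared method by the library's implementation, if any."""
    for name in ("read_data", "list_entries"):
        implementation = getattr(library, name, None)
        if implementation is not None:
            setattr(interface, name, implementation)
    return interface


ll = bind(ToyInterface(), OldLibrary)
```

With this binding, ll.read_data(1, "time") is served by the library, while ll.list_entries(1) raises the descriptive exception because the stand-in library does not provide it.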
MDSplus support¶
imas.backends.imas_core.mdsplus_model is responsible for creating MDSplus models. These models are specific to a DD version and are required when using the MDSplus backend for creating new Data Entries.
Versioning¶
IMAS-Python uses setuptools-scm for
versioning. An IMAS-Python release has a corresponding tag (which sets the version).
The imas._version module implements this logic for editable installs. It is
generated by setuptools-scm when building Python packages.
Conversion between Data Dictionary versions¶
imas.ids_convert contains logic for converting an IDS between DD versions.
The DDVersionMap class creates and contains mappings for
an IDS between two Data Dictionary versions. It creates two mappings: one to be used
when converting from the newer version of the two to the older version (new_to_old)
and a map for the reverse (old_to_new). These mappings are of type
NBCPathMap. See its API documentation for more details.
convert_ids() is the main API method for converting IDSs
between versions. It works as follows:
1. It builds a DDVersionMap between the two DD versions and selects the correct NBCPathMap (new_to_old or old_to_new).
2. If needed, it creates a target IDS of the destination DD version.
3. It then uses the NBCPathMap to convert data and store it in the target IDS.
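The three steps can be sketched with plain dictionaries. This is a toy model under loud assumptions: ToyPathMap, ToyVersionMap and convert are hypothetical, IDSs are represented as flat {path: value} dictionaries, and the only supported change between versions is a rename. The real NBCPathMap handles considerably more than that.

```python
class ToyPathMap:
    """Hypothetical one-directional map; not the IMAS-Python NBCPathMap."""

    def __init__(self, renames):
        self.renames = dict(renames)

    def target_path(self, path):
        # Paths without an entry are assumed unchanged between versions.
        return self.renames.get(path, path)


class ToyVersionMap:
    """Builds both directions from renames given as {old_path: new_path}."""

    def __init__(self, renames_old_to_new):
        self.old_to_new = ToyPathMap(renames_old_to_new)
        self.new_to_old = ToyPathMap(
            {new: old for old, new in renames_old_to_new.items()}
        )


def convert(data, version_map, to_newer):
    """Copy a flat {path: value} dict into the layout of the other version."""
    # Step 1: select the map for the requested direction.
    path_map = version_map.old_to_new if to_newer else version_map.new_to_old
    # Step 2: create the target container.
    target = {}
    # Step 3: copy every value to its path in the target version.
    for path, value in data.items():
        target[path_map.target_path(path)] = value
    return target


version_map = ToyVersionMap({"old/name": "new/name"})  # made-up paths
old_ids = {"old/name": 1.5, "time": [0.0]}
new_ids = convert(old_ids, version_map, to_newer=True)
```

Converting new_ids back with to_newer=False recovers the original dictionary, which shows why the version map keeps one path map per direction.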
DBEntry can also handle automatic DD version conversion. It
uses the same DDVersionMap and NBCPathMap as
convert_ids(). When reading data from the backends, the
NBCPathMap is used to translate between the old and the new DD version. See the
implementation in imas.backends.imas_core.db_entry_helpers.
Miscellaneous¶
The following is a list of miscellaneous modules, which don’t belong to any of the other categories on this page.
imas.exception contains all Exception classes that IMAS-Python may raise.
imas.setup_logging initializes a logging handler for IMAS-Python.
imas.training contains helper methods for making training data available.
imas.util contains useful utility methods. It is imported automatically.
    All methods requiring third party libraries (rich and scipy) are implemented in imas._util. This avoids importing these libraries immediately when a user imports imas (which can take a couple hundred milliseconds). Instead, this module is only loaded when a user needs this functionality.
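The deferred-import pattern described for imas.util can be sketched as follows. The deferred helper is hypothetical, and the standard library module statistics stands in for a heavier third party dependency. How imas.util actually forwards to imas._util is not shown in this document.

```python
import importlib


def deferred(module_name, function_name):
    """Return a wrapper that imports module_name only when it is called."""

    def wrapper(*args, **kwargs):
        # The import happens here, not when the wrapper is defined. Python
        # caches imported modules, so repeated calls stay cheap.
        module = importlib.import_module(module_name)
        return getattr(module, function_name)(*args, **kwargs)

    return wrapper


# `statistics` stands in for a heavier third party dependency.
mean = deferred("statistics", "mean")
result = mean([1, 2, 3, 4])
```

Defining mean costs nothing at import time. The cost of loading the dependency is paid on the first call, and only by users who need that functionality.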