Working with multiple data dictionary versions¶
Unlike most high-level interfaces for IMAS, IMAS-Python code is not tied to a specific version of the Data Dictionary (DD). In this lesson we will explore how IMAS-Python handles different DD versions (including development builds of the DD), and how we can convert IDSs between different versions of the Data Dictionary.
Note
Most of the time you won’t need to worry about DD versions and the default IMAS-Python behaviour should be fine.
The default Data Dictionary version¶
In the other training lessons, we didn’t explicitly work with Data Dictionary versions. Therefore IMAS-Python was always using the default DD version. Let’s find out what that version is:
Exercise 1: The default DD version¶
1. Create an imas.IDSFactory().
2. Print the version of the DD that is used.
3. Create an empty IDS with this IDSFactory (any IDS is fine) and print the DD version of the IDS, see get_data_dictionary_version(). What do you notice?
4. Create an imas.DBEntry, you may use the MEMORY_BACKEND. Print the DD version that is used. What do you notice?
import imas
from imas.util import get_data_dictionary_version
# 1. Create an IDSFactory
default_factory = imas.IDSFactory()
# 2. Print the DD version used by the IDSFactory
#
# This factory will use the default DD version, because we didn't explicitly indicate
# which version of the DD we want to use:
print("Default DD version:", default_factory.version)
# 3. Create an empty IDS
pf_active = default_factory.new("pf_active")
print("DD version used for pf_active:", get_data_dictionary_version(pf_active))
# What do you notice? This is the same version as the IDSFactory that was used to create
# it.
# 4. Create a new DBEntry
default_entry = imas.DBEntry(imas.ids_defs.MEMORY_BACKEND, "test", 0, 0)
default_entry.create()
# Alternative URI syntax when using AL5.0.0:
# default_entry = imas.DBEntry("imas:memory?path=.")
print("DD version used for the DBEntry:", get_data_dictionary_version(default_entry))
# What do you notice? It is the same default version again.
Okay, so now you know what your default DD version is. But how is it determined? IMAS-Python
first checks if you have an IMAS environment loaded by checking the environment variable
IMAS_VERSION. If you are on a cluster and have used module load IMAS or similar,
this environment variable will indicate what data dictionary version this module is
using. IMAS-Python will use that version as its default.
If the IMAS_VERSION environment variable is not set, IMAS-Python will take the newest version of
the Data Dictionary that came bundled with it. Which brings us to the following topic:
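The selection rule described above can be sketched in a few lines of plain Python. This is only an illustration of the rule, not IMAS-Python's actual implementation: `resolve_default_version` and `parse_version` are hypothetical helpers, the list of bundled versions is made up, and the sketch assumes plain release strings such as `3.39.0`.

```python
import os


def parse_version(version: str) -> tuple:
    """Turn a release string such as '3.39.0' into (3, 39, 0) for comparison."""
    return tuple(int(part) for part in version.split("."))


def resolve_default_version(bundled_versions, environ=os.environ):
    """Hypothetical helper: pick a default DD version as described in the text."""
    # 1. A loaded IMAS environment (IMAS_VERSION is set) takes precedence
    env_version = environ.get("IMAS_VERSION")
    if env_version:
        return env_version
    # 2. Otherwise fall back to the newest bundled version
    return max(bundled_versions, key=parse_version)


bundled = ["3.25.0", "3.39.0", "3.9.1"]  # made-up list for illustration
print(resolve_default_version(bundled, environ={}))  # 3.39.0
print(resolve_default_version(bundled, environ={"IMAS_VERSION": "3.25.0"}))  # 3.25.0
```

Comparing the versions as tuples of integers matters here: a plain string comparison would rank `3.9.1` above `3.39.0`.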
Bundled Data Dictionary definitions¶
IMAS-Python comes bundled [1] with many versions of the Data Dictionary definitions.
You can find out which versions are available by calling
imas.dd_zip.dd_xml_versions.
Converting an IDS between Data Dictionary versions¶
Newer versions of the Data Dictionary may introduce changes in IDS definitions, for example:
Introducing a new IDS node
Removing an IDS node
Changing the data type of an IDS node
Renaming an IDS node
IMAS-Python can convert between different versions of the DD and will migrate the data as much as possible. Let’s see how this works in the following exercise.
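To make the idea of "migrating as much as possible" concrete, here is a toy model in plain Python. It is a sketch under strong assumptions and does not reflect how IMAS-Python stores or converts data: an IDS is modelled as a flat dictionary of node paths, `toy_convert`, `RENAMES` and `NEW_PATHS` are hypothetical names, the `ec/launcher/frequency` node is invented for the example, and data type changes are not modelled at all.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("toy_convert")

# Hypothetical "old DD" data: flat mapping of node path -> value
old_ids = {
    "time": [1.0, 1.1, 1.2],
    "ec/antenna/name": "my antenna",
    "ec/antenna/phase": "no longer defined",
}

# Hypothetical rename table and the set of paths defined by the "new DD"
RENAMES = {"ec/antenna/name": "ec/launcher/name"}
NEW_PATHS = {"time", "ec/launcher/name", "ec/launcher/frequency"}


def toy_convert(source, renames, target_paths):
    """Copy every node that still exists in the target, following renames."""
    target = {}
    for path, value in source.items():
        new_path = renames.get(path, path)
        if new_path in target_paths:
            target[new_path] = value
        else:
            # Removed nodes cannot be migrated: report and skip them
            logger.info("Element %r does not exist in the target.", path)
    return target


new_ids = toy_convert(old_ids, RENAMES, NEW_PATHS)
print(new_ids)
```

The renamed node ends up under its new path, the removed node is reported and skipped, and the node that only exists in the new version simply stays empty.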
Exercise 2: Convert an IDS between DD versions¶
In this exercise we will work with a really old version of the data dictionary
for the pulse_schedule IDS because a number of IDS nodes were renamed for
this IDS.
1. Create an imas.IDSFactory() for DD version 3.25.0.
2. Create a pulse_schedule IDS with this IDSFactory and verify that it is using DD version 3.25.0.
3. Fill the IDS with some test data:

    pulse_schedule.ids_properties.homogeneous_time = \
        imas.ids_defs.IDS_TIME_MODE_HOMOGENEOUS
    pulse_schedule.ids_properties.comment = \
        "Testing renamed IDS nodes with IMAS-Python"
    pulse_schedule.time = [1., 1.1, 1.2]
    pulse_schedule.ec.antenna.resize(1)
    antenna = pulse_schedule.ec.antenna[0]
    antenna.name = "ec.antenna[0].name in DD 3.25.0"
    antenna.launching_angle_pol.reference_name = \
        "ec.antenna[0].launching_angle_pol.reference_name in DD 3.25.0"
    antenna.launching_angle_pol.reference.data = [2.1, 2.2, 2.3]
    antenna.launching_angle_tor.reference_name = \
        "ec.antenna[0].launching_angle_tor.reference_name in DD 3.25.0"
    antenna.launching_angle_tor.reference.data = [3.1, 3.2, 3.3]

4. Use imas.convert_ids to convert the IDS to DD version 3.39.0. The antenna structure that we filled in the old version of the DD has since been renamed to launcher, and the launching_angle_* structures to steering_angle. Check that IMAS-Python has converted the data successfully (for example with imas.util.print_tree()).
5. By default, IMAS-Python creates a shallow copy of the data, which means that the underlying data arrays are shared between the IDSs of both versions. Update the time data of the original IDS (for example: pulse_schedule.time[1] = 3) and print the time data of the converted IDS. Are they the same?
   Note
   imas.convert_ids has an optional keyword argument deep_copy. If you set this to True, the converted IDS will not share data with the original IDS.
6. Update the ids_properties/comment in one version and print it in the other version. What do you notice?
7. Sometimes data cannot be converted, for example when a node was added or removed, or when data types have changed. For example, set pulse_schedule.ec.antenna[0].phase.reference_name = "Test refname" and perform the conversion to DD 3.39.0 again. What do you notice?
import imas
from imas.util import get_data_dictionary_version
# 1. Create an IDSFactory for DD 3.25.0
factory = imas.IDSFactory("3.25.0")
# 2. Create a pulse_schedule IDS
pulse_schedule = factory.new("pulse_schedule")
print(get_data_dictionary_version(pulse_schedule)) # This should print 3.25.0
# 3. Fill the IDS with some test data
pulse_schedule.ids_properties.homogeneous_time = \
imas.ids_defs.IDS_TIME_MODE_HOMOGENEOUS
pulse_schedule.ids_properties.comment = \
"Testing renamed IDS nodes with IMAS-Python"
pulse_schedule.time = [1., 1.1, 1.2]
pulse_schedule.ec.antenna.resize(1)
antenna = pulse_schedule.ec.antenna[0]
antenna.name = "ec.antenna[0].name in DD 3.25.0"
antenna.launching_angle_pol.reference_name = \
"ec.antenna[0].launching_angle_pol.reference_name in DD 3.25.0"
antenna.launching_angle_pol.reference.data = [2.1, 2.2, 2.3]
antenna.launching_angle_tor.reference_name = \
"ec.antenna[0].launching_angle_tor.reference_name in DD 3.25.0"
antenna.launching_angle_tor.reference.data = [3.1, 3.2, 3.3]
# 4. Convert the IDS from version 3.25.0 to 3.39.0
pulse_schedule_3_39 = imas.convert_ids(pulse_schedule, "3.39.0")
# Check that the data is converted
imas.util.print_tree(pulse_schedule_3_39)
# 5. Update time data
pulse_schedule.time[1] = 3
# Yes, the time array of the converted IDS is updated as well:
print(pulse_schedule_3_39.time) # [1., 3., 1.2]
# 6. Update ids_properties/comment
pulse_schedule.ids_properties.comment = "Updated comment"
print(pulse_schedule_3_39.ids_properties.comment)
# What do you notice?
# This prints the original value of the comment ("Testing renamed IDS
# nodes with IMAS-Python").
# This is the same behaviour that you get when creating a shallow copy
# of a regular Python dictionary with ``copy.copy``:
import copy
dict1 = {"a list": [1, 1.1, 1.2], "a string": "Some text"}
dict2 = copy.copy(dict1)
print(dict2) # {"a list": [1, 1.1, 1.2], "a string": "Some text"}
# dict2 is a shallow copy, so dict1["a list"] and dict2["a list"] are
# the exact same object, and updating it is reflected in both dicts:
dict1["a list"][1] = 3
print(dict2) # {"a list": [1, 3, 1.2], "a string": "Some text"}
# Replacing a value in one dict doesn't update the other:
dict1["a string"] = "Some different text"
print(dict2) # {"a list": [1, 3, 1.2], "a string": "Some text"}
# 7. Set phase.reference_name:
pulse_schedule.ec.antenna[0].phase.reference_name = "Test refname"
# And convert again
pulse_schedule_3_39 = imas.convert_ids(pulse_schedule, "3.39.0")
imas.util.print_tree(pulse_schedule_3_39)
# What do you notice?
# Element 'ec/antenna/phase' does not exist in the target IDS. Data is not copied.
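The dictionary analogy from step 6 also covers the `deep_copy` note in step 5. The snippet below is only an analogy using the standard `copy` module: plain lists stand in for the data arrays of an IDS, and `copy.deepcopy` plays the role that `deep_copy=True` is described to have for `imas.convert_ids`.

```python
import copy

original = {"time": [1.0, 1.1, 1.2], "comment": "Some text"}

shallow = copy.copy(original)   # analogous to the default (shallow) conversion
deep = copy.deepcopy(original)  # analogous to converting with deep_copy=True

# Modify the data of the original in place
original["time"][1] = 3

print(shallow["time"])  # [1.0, 3, 1.2] -> shares the list with the original
print(deep["time"])     # [1.0, 1.1, 1.2] -> has its own, independent list
```

A shallow copy is cheaper because no data is duplicated. A deep copy is the safer choice when the original and the converted object are modified independently afterwards.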
Automatic conversion between DD versions¶
When loading data (with get() or
get_slice()) or storing data (with
put() or
put_slice()), IMAS-Python automatically converts the DD
version for you. In this section we will see how that works.
The DBEntry DD version¶
A DBEntry object is tied to a specific version of the Data
Dictionary. We have already briefly seen this in Exercise 1: The default DD version.
The DD version can be selected when constructing a new DBEntry object, through the
dd_version or
xml_path (see also Using custom builds of the Data Dictionary) parameters. If you provide neither, the default DD
version is used.
When storing IDSs (put or put_slice), the DBEntry always converts the data
to its version before writing it to the backend. When loading IDSs (get or
get_slice) an option exists to disable autoconversion. Let’s see in the following
two exercises how this works exactly.
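Before starting the exercises, the two rules above can be summarised in a toy model. Everything in this sketch is hypothetical (`ToyIDS`, `ToyEntry` and `relabel` are not part of IMAS-Python), and the "conversion" only changes a version tag. It is meant to show when a conversion happens, not how it is done.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ToyIDS:
    """Hypothetical stand-in for an IDS: a payload tagged with a DD version."""

    dd_version: str
    payload: str


def relabel(ids: ToyIDS, version: str) -> ToyIDS:
    """Placeholder for a real conversion: only the version tag changes."""
    return replace(ids, dd_version=version)


@dataclass
class ToyEntry:
    """Hypothetical stand-in for a DBEntry tied to a single DD version."""

    dd_version: str
    backend: dict = field(default_factory=dict)

    def put(self, name: str, ids: ToyIDS) -> None:
        # Storing always converts to the DD version of the entry
        self.backend[name] = relabel(ids, self.dd_version)

    def get(self, name: str, autoconvert: bool = True) -> ToyIDS:
        stored = self.backend[name]
        # Loading converts to the entry's version unless autoconvert is off
        return relabel(stored, self.dd_version) if autoconvert else stored


backend = {}  # plays the role of the data stored on disk
writer = ToyEntry("3.25.0", backend)
writer.put("pulse_schedule", ToyIDS("3.25.0", "test data"))

reader = ToyEntry("3.42.0", backend)
print(reader.get("pulse_schedule").dd_version)  # 3.42.0
print(reader.get("pulse_schedule", autoconvert=False).dd_version)  # 3.25.0
```

The stored data keeps the version it was written with. Only the object handed back by `get` is converted, and only when autoconversion is enabled.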
Exercise 3: Automatic conversion when storing IDSs¶
1. Load the training data for the core_profiles IDS. You can refresh how to do this in the following section of the basic training material: Open an IMAS database entry.
2. Print the DD version for the loaded core_profiles IDS.
3. Create a new DBEntry with DD version 3.37.0.

    new_entry = imas.DBEntry(
        imas.ids_defs.MEMORY_BACKEND, "test", 0, 0, dd_version="3.37.0"
    )

4. Put the core_profiles IDS in the new DBEntry.
5. Print the core_profiles.ids_properties.version_put.data_dictionary. What do you notice?
import imas
import imas.training
from imas.util import get_data_dictionary_version
# 1. Load the training data for the ``core_profiles`` IDS
entry = imas.training.get_training_db_entry()
core_profiles = entry.get("core_profiles")
# 2. Print the DD version:
print(get_data_dictionary_version(core_profiles))
# 3. Create a new DBEntry with DD version 3.37.0
new_entry = imas.DBEntry(
imas.ids_defs.MEMORY_BACKEND, "test", 0, 0, dd_version="3.37.0"
)
new_entry.create()
# 4. Put the core_profiles IDS in the new DBEntry
new_entry.put(core_profiles)
# 5. Print version_put.data_dictionary
print(core_profiles.ids_properties.version_put.data_dictionary)
# -> 3.37.0
# What do you notice?
# The IDS was converted to the DD version of the DBEntry (3.37.0) when writing the
# data to the backend.
Exercise 4: Automatic conversion when loading IDSs¶
1. For this exercise we will first create some test data:

    import imas
    from imas.ids_defs import ASCII_BACKEND, IDS_TIME_MODE_HOMOGENEOUS

    # Create an IDSFactory for DD 3.25.0
    factory = imas.IDSFactory("3.25.0")
    # Create a pulse_schedule IDS
    pulse_schedule = factory.new("pulse_schedule")
    # Fill the IDS with some test data
    pulse_schedule.ids_properties.homogeneous_time = IDS_TIME_MODE_HOMOGENEOUS
    pulse_schedule.ids_properties.comment = "Testing renamed IDS nodes with IMAS-Python"
    pulse_schedule.time = [1.0, 1.1, 1.2]
    pulse_schedule.ec.antenna.resize(1)
    antenna = pulse_schedule.ec.antenna[0]
    antenna.name = "ec.antenna[0].name in DD 3.25.0"
    antenna.launching_angle_pol.reference_name = (
        "ec.antenna[0].launching_angle_pol.reference_name in DD 3.25.0"
    )
    antenna.launching_angle_pol.reference.data = [2.1, 2.2, 2.3]
    antenna.launching_angle_tor.reference_name = (
        "ec.antenna[0].launching_angle_tor.reference_name in DD 3.25.0"
    )
    antenna.launching_angle_tor.reference.data = [3.1, 3.2, 3.3]
    antenna.phase.reference_name = "Phase reference name"
    # And store the IDS in a DBEntry using DD 3.25.0
    entry = imas.DBEntry(ASCII_BACKEND, "autoconvert", 1, 1, dd_version="3.25.0")
    entry.create()
    entry.put(pulse_schedule)
    entry.close()

2. Reopen the DBEntry with the default DD version.
3. get the pulse schedule IDS. Print its version_put/data_dictionary and Data Dictionary version (with get_data_dictionary_version()). What do you notice?
4. Use imas.util.print_tree to print all data in the loaded IDS. What do you notice?
5. Repeat steps 3 and 4, but set autoconvert to False. What do you notice this time?
import imas
from imas.ids_defs import ASCII_BACKEND, IDS_TIME_MODE_HOMOGENEOUS
from imas.util import get_data_dictionary_version
# 1. Create test data
# Create an IDSFactory for DD 3.25.0
factory = imas.IDSFactory("3.25.0")
# Create a pulse_schedule IDS
pulse_schedule = factory.new("pulse_schedule")
# Fill the IDS with some test data
pulse_schedule.ids_properties.homogeneous_time = IDS_TIME_MODE_HOMOGENEOUS
pulse_schedule.ids_properties.comment = "Testing renamed IDS nodes with IMAS-Python"
pulse_schedule.time = [1.0, 1.1, 1.2]
pulse_schedule.ec.antenna.resize(1)
antenna = pulse_schedule.ec.antenna[0]
antenna.name = "ec.antenna[0].name in DD 3.25.0"
antenna.launching_angle_pol.reference_name = (
"ec.antenna[0].launching_angle_pol.reference_name in DD 3.25.0"
)
antenna.launching_angle_pol.reference.data = [2.1, 2.2, 2.3]
antenna.launching_angle_tor.reference_name = (
"ec.antenna[0].launching_angle_tor.reference_name in DD 3.25.0"
)
antenna.launching_angle_tor.reference.data = [3.1, 3.2, 3.3]
antenna.phase.reference_name = "Phase reference name"
# And store the IDS in a DBEntry using DD 3.25.0
entry = imas.DBEntry(ASCII_BACKEND, "autoconvert", 1, 1, dd_version="3.25.0")
entry.create()
entry.put(pulse_schedule)
entry.close()
# 2. Reopen the DBEntry with DD 3.42.0:
entry = imas.DBEntry(ASCII_BACKEND, "autoconvert", 1, 1, dd_version="3.42.0")
entry.open()
# 3. Get the pulse schedule IDS
ps_autoconvert = entry.get("pulse_schedule")
print(f"{ps_autoconvert.ids_properties.version_put.data_dictionary=!s}")
print(f"{get_data_dictionary_version(ps_autoconvert)=!s}")
# What do you notice?
# version_put: 3.25.0
# get_data_dictionary_version: 3.42.0 -> the IDS was automatically converted
# 4. Print the data in the loaded IDS
imas.util.print_tree(ps_autoconvert)
# What do you notice?
# 1. The antenna AoS was renamed
# 2. Several nodes no longer exist!
print()
print("Disable autoconvert:")
print("====================")
# 5. Repeat steps 3 and 4 with autoconvert disabled:
ps_noconvert = entry.get("pulse_schedule", autoconvert=False)
print(f"{ps_noconvert.ids_properties.version_put.data_dictionary=!s}")
print(f"{get_data_dictionary_version(ps_noconvert)=!s}")
# What do you notice?
# version_put: 3.25.0
# get_data_dictionary_version: 3.25.0 -> the IDS was not converted!
# Print the data in the loaded IDS
imas.util.print_tree(ps_noconvert)
# What do you notice?
# All data is here exactly as it was put at the beginning of this exercise.
Use cases for disabling autoconvert¶
As you saw in the exercise, disabling autoconvert lets you retrieve all data exactly as it was stored. This is especially useful for non-active IDSs, which may contain large changes between DD versions. Example use cases:
Interactive plotting tools
Exploration of all stored data in a Data Entry
Etc.
Caution
The convert_ids() method warns you when data is not
converted. Due to technical constraints, the autoconvert logic doesn’t log any
such warnings.
You can work around this by explicitly converting the IDS:
>>> # Continuing with the example from Exercise 4:
>>> ps_noconvert = entry.get("pulse_schedule", autoconvert=False)
>>> imas.convert_ids(ps_noconvert, "3.40.0")
15:32:32 INFO Parsing data dictionary version 3.40.0 @dd_zip.py:129
15:32:32 INFO Starting conversion of IDS pulse_schedule from version 3.25.0 to version 3.40.0. @ids_convert.py:350
15:32:32 INFO Element 'ec/antenna/phase' does not exist in the target IDS. Data is not copied. @ids_convert.py:396
15:32:32 INFO Element 'ec/antenna/launching_angle_pol/reference/data' does not exist in the target IDS. Data is not copied. @ids_convert.py:396
15:32:32 INFO Element 'ec/antenna/launching_angle_tor/reference/data' does not exist in the target IDS. Data is not copied. @ids_convert.py:396
15:32:32 INFO Conversion of IDS pulse_schedule finished. @ids_convert.py:366
<IDSToplevel (IDS:pulse_schedule)>
Using custom builds of the Data Dictionary¶
In the previous sections we showed how you can direct IMAS-Python to use a specific released version of the Data Dictionary definitions. Sometimes it is useful to work with unreleased (development or custom) versions of the Data Dictionary as well.
Caution
Unreleased versions of the Data Dictionary should only be used for testing.
Do not use an unreleased Data Dictionary version for long-term storage: data might not be read properly in the future.
If you build the Data Dictionary, a file called IDSDef.xml is created. This file
contains all IDS definitions. To work with a custom DD build, you need to point IMAS-Python
to this IDSDef.xml file:
my_idsdef_file = "path/to/IDSDef.xml" # Replace with the actual path
# Point IDSFactory to this path:
my_factory = imas.IDSFactory(xml_path=my_idsdef_file)
# Now you can create IDSs using your custom DD build:
my_ids = my_factory.new("...")
# If you need a DBEntry to put / get IDSs in the custom version:
my_entry = imas.DBEntry("imas:hdf5?path=my-testdb", "w", xml_path=my_idsdef_file)
Once you have created the IDSFactory and/or DBEntry pointing to your custom DD
build, you can use them like you normally would.
Footnotes