Calculating hashes of IMAS data

IMAS-Python can calculate hashes of IMAS data. Wikipedia defines a hash function as follows:

A hash function is any function that can be used to map data of arbitrary size to fixed-size values, […]. The values returned by a hash function are called hash values, hash codes, hash digests, digests, or simply hashes.

IMAS-Python uses the XXH3 hash function from the xxHash project. This is a non-cryptographic hash function that returns 64-bit hashes.

Use cases

Hashes of IMAS data are probably most useful as checksums: when the hashes of two IDSs match, it is very likely that they contain identical data. [1] This can be useful to verify data integrity and to detect whether data has been accidentally corrupted or altered.
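The checksum workflow can be sketched with the standard library alone. The snippet below is not IMAS-Python code: it uses hashlib.blake2b with an 8-byte digest as a stand-in for XXH3, and the helper name digest64 and the sample payloads are made up for illustration.

```python
import hashlib


def digest64(data: bytes) -> bytes:
    """Return a 64-bit (8-byte) digest of data.

    Stand-in only: BLAKE2b shortened to 8 bytes, not the XXH3 function
    that IMAS-Python uses.
    """
    return hashlib.blake2b(data, digest_size=8).digest()


# Hypothetical payload standing in for stored data
original = b"psi=1.0,2.0,3.0"
stored_checksum = digest64(original)

# Later: recompute the digest and compare it with the stored one
intact = b"psi=1.0,2.0,3.0"
corrupted = b"psi=1.0,2.0,3.1"

print(digest64(intact) == stored_checksum)     # True: identical data, identical digest
print(digest64(corrupted) == stored_checksum)  # matches only if a collision occurs
print(len(stored_checksum))                    # 8 bytes, whatever the input size
```

With IMAS-Python the same pattern would presumably call imas.util.calc_hash() on the IDS instead of digest64 on raw bytes.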

Exercise 1: Calculate some hashes

In this exercise we will use imas.util.calc_hash() to calculate hashes of some IDSs. Use bytes.hex() to show the hash in a more readable hexadecimal format.
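As a quick illustration of the formatting step, bytes.hex() accepts an optional separator and group size (Python 3.8 and later). The bytes below are typed in by hand from a hash value printed in this exercise; no IMAS-Python call is involved.

```python
# 8 raw bytes, as a 64-bit hash would be returned
raw = bytes.fromhex("2d06800538d394c2")

plain = raw.hex()          # one run of 16 hex digits
grouped = raw.hex(" ", 2)  # a space between every group of 2 bytes

print(plain)    # 2d06800538d394c2
print(grouped)  # 2d06 8005 38d3 94c2
```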

  1. Create an empty equilibrium IDS and print its hash.

  2. Now fill ids_properties.homogeneous_time and print the hash. Did it change?

  3. Resize the time_slice Array of Structures to size 2. Calculate the hash of time_slice[0] and time_slice[1]. What do you notice?

  4. Resize time_slice[0].profiles_2d to size 1. For convenience, you can create a variable p2d = time_slice[0].profiles_2d[0].

  5. Fill p2d.r = [[1., 2.]] and p2d.z = p2d.r, then calculate their hashes. What do you notice?

  6. del p2d.z and calculate the hash of p2d. Then set p2d.z = p2d.r and del p2d.r. What do you notice?

import imas

# 1. Create IDS
eq = imas.IDSFactory().equilibrium()
print(imas.util.calc_hash(eq).hex(' ', 2))  # 2d06 8005 38d3 94c2

# 2. Update homogeneous_time
eq.ids_properties.homogeneous_time = 0
print(imas.util.calc_hash(eq).hex(' ', 2))  # 3b9b 9297 56a2 42fd
# Yes: the hash changed (significantly!). This was expected, because the data is no
# longer the same

# 3. Resize time_slice
eq.time_slice.resize(2)
print(imas.util.calc_hash(eq.time_slice[0]).hex(' ', 2))  # 2d06 8005 38d3 94c2
print(imas.util.calc_hash(eq.time_slice[1]).hex(' ', 2))  # 2d06 8005 38d3 94c2
# What do you notice?
#
#   The hashes of both time_slice[0] and time_slice[1] are identical, because both
#   contain no data.
#
#   The hashes are also identical to the empty IDS hash from step 1. An IDS, or a
#   structure within an IDS, that has no fields filled will always have this hash value.

# 4. Resize profiles_2d
eq.time_slice[0].profiles_2d.resize(1)
p2d = eq.time_slice[0].profiles_2d[0]

# 5. Fill data
p2d.r = [[1., 2.]]
p2d.z = p2d.r
print(imas.util.calc_hash(p2d.r).hex(' ', 2))  # 352b a6a6 b40c 708d
print(imas.util.calc_hash(p2d.z).hex(' ', 2))  # 352b a6a6 b40c 708d
# These hashes are identical, because they contain the same data

# 6. Only r or z
del p2d.z
print(imas.util.calc_hash(p2d).hex(' ', 2))  # 0dcb ddaa 78ea 83a3
p2d.z = p2d.r
del p2d.r
print(imas.util.calc_hash(p2d).hex(' ', 2))  # f86b 8ea8 9652 3768
# Although the data inside `r` and `z` is identical, we get different hashes because the
# data is in a different attribute.

Properties of IMAS-Python’s hashes

The implementation of the hash function has the following properties:

  • Only fields that are filled are included in the hash.

    If a newer version of the Data Dictionary introduces additional data fields, then this won’t affect the hash of your data.

    As long as there are no Non Backwards Compatible changes in the Data Dictionary for the filled fields, the data hashes should not change.

  • The ids_properties/version_put structure is not included in the hash.

    This means that the precise Access Layer version, Data Dictionary version or high-level interface that was used to store the data does not affect the hash of the data.

  • Hashes are different for ND arrays with different shapes that share the same underlying data.

    For example, the following arrays are stored identically in memory, but they result in different hashes:

    array1 = [1, 2]
    array2 = [[1, 2]]
    array3 = [[1],
              [2]]
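One way to obtain the shape sensitivity described above is to feed the shape into the hash before the data. The sketch below is a toy model under that assumption, not the algorithm of imas.util.calc_hash(): the function toy_array_hash is hypothetical, BLAKE2b stands in for XXH3, and plain lists packed with struct stand in for ND arrays.

```python
import hashlib
import struct


def toy_array_hash(shape: tuple, values: list) -> bytes:
    """Toy 64-bit hash that mixes the shape in with the flattened data.

    Illustration only: not the imas.util.calc_hash() specification.
    """
    h = hashlib.blake2b(digest_size=8)
    # Hash the number of dimensions and each dimension size first ...
    h.update(struct.pack(f"<{len(shape) + 1}q", len(shape), *shape))
    # ... then the flattened values as 64-bit floats
    h.update(struct.pack(f"<{len(values)}d", *values))
    return h.digest()


flat = [1.0, 2.0]  # the same flattened values for all three arrays
hashes = {
    "array1": toy_array_hash((2,), flat),
    "array2": toy_array_hash((1, 2), flat),
    "array3": toy_array_hash((2, 1), flat),
}
for name, value in hashes.items():
    print(name, value.hex(" ", 2))
```

Because the shape bytes differ, the three digests differ even though the packed values are byte-for-byte the same.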
    

Technical details and specification

You can find the technical details, and a specification for calculating the hashes, in the documentation of imas.util.calc_hash().


Last update: 2026-01-28