Season 2
Navigating MOM6 as a non-oceanographer: a MOM6 mini-app for GPU work.
Date: 7/05/2026.
Presenter: Jorge Gálvez Vallejo (@JorgeG94).
Link to repo: hackathon_mom6_miniapp
Brief Intro
Joining a new project in a field you did not study can be extremely daunting! But fear not, for fortune favors the brave. I'd say there are three options here:
- You are an oceanographer that has little knowledge of high performance computing
- You are a high performance computing person that has no knowledge of oceanography
- You are a new student and don't know either
The third one naturally has the most to overcome, but they also have the most time. The first one is simply "we have to learn how to code", the second one is "I have to learn oceanography". However, the second one might not need to learn a lot of oceanography since their contributions don't depend on them understanding the physics of the ocean dynamics.
I am a stubborn person that doesn't like to write code I don't understand, so I basically like to create more work for myself.
Detailed notes (flow of consciousness)
As an HPC person whose goal is to either optimize code or port it to different architectures the key aspect to understand is what are the bottlenecks? in the current code. In ocean simulations, to my annoyance, there is no one routine that dominates 80% of the time. This makes my work harder, usually one can quickly say "ah yes, the calculation of this physical thing takes most of the time" so one spends the most time there. Then naturally you go ask the people that work on the codebase about the bottlenecks and because everyone is using it for a different thing you will hear different things, this is where the oceanography knowledge becomes useful!
What is the best way to understand a gigantic codebase with thousands of lines of code? Simplify it! I.e. the creation of a mini-app. This is also the best thing to do for a hackathon, because you can capture algorithmic complexity, main pain points, and you end up with a smaller application that is easy to build, easy to verify, and low stakes if you completely break it!
How does one create a mini-app? First, ask the oceanographers what are the pain points! They'll quickly tell you about the
"dynamical core", which is where the basic physics are solved, i.e. the fluid mechanics part of the code. The crucial bit
here is to adopt a "how hard can it be?" mentality, be positive. I was quickly pointed to the file MOM_dyn_split_RK2.F90,
which contains the main RK2 timestepping procedure. Fortran is our friend here, there is not a lot of object orientation
shenanigans happening, i.e. what you read is what you get (most of the time). The RK2 step can be summarized as:
!! The split RK2 scheme:
!! 1. Predictor phase:
!! - Compute horizontal viscosity
!! - Compute Coriolis/momentum advection (CAu, CAv)
!! - Apply vertical viscosity
!! - Advance barotropic mode (fast 2D dynamics)
!! - Update layer thicknesses via continuity
!! 2. Corrector phase:
!! - Recompute tendencies with updated state
!! - Apply vertical viscosity
!! - Final barotropic step
!! - Final continuity update
!!
Don't get too distracted by initialization procedures, those might look ugly but once you pay attention they are simply setting arrays to something. The above algorithm has done most of the work for us now! We now know what routines we need to look for:
- Horizontal/Vertical viscosities
- Coriolis
- Barotropic solver
- Continuity
A key concept here can be extracted from MOM, the first M means "modular". Each physics engine should be a module, i.e. it should be able to be run as standalone. Why? Because if I only want to optimize the Coriolis routines, why should I have to run everything else? So when creating a mini-app always try to have a separation of concerns.
We go into the magical place that is MOM6/src/core, which is where most of the routines we're interested in are. The viscosities are parametrizations, i.e.
approximations to extremely annoying physics. If you're feeling adventurous code up a non-hydrostatic solver that tries to resolve the vertical physics. It is
_ a w f u l_. Here we have:
MOM_barotropic.F90
MOM_continuity_PPM.F90
MOM_CoriolisAdv.F90
So then we follow the MOM_dyn_split_RK2.F90 to find the function names that are the most interesting, for simplicity we will do the call btstep(..). Inside
a MOM6 subroutine, we can think of them like:
subroutine btstep(....)
real :: x
integer :: y
logical :: a
! indices mapping
is = G%isc ; ie = G%iec ; js = G%jsc ; je = G%jec ; nz = GV%ke
! bookkeeping (timing, profiling)
if(OBC) then
endif
! halo updates (look for do_group_pass)
call create_group_pass(...)
! begin real work
...
end subroutine btstep
OBCs etc. will make your life less overwhelming. Once in, the magic is just realizing the loops are just loops!
do j=js,je ; do I=is-1,ie
uhbt0(I,j) = uhbt(I,j) - find_uhbt(dt*ubt(I,j), BTCL_u(I,j)) * Idt
enddo ; enddo
As you go into the routine you'll realize that there are patterns. Certain routines loop over certain indices, there's us and vs and you start quickly
noticing how the algorithms are structured. This is the way, you simply just continue doing this with a lot of patience.
The mini-app is a work of hackathon and obsession, trying to compare lots of stuff (serial, openmp, openacc, cuda-fortran) just to get knowledge about the advantages and disadvantages of each approach. Testing performance without having to worry too much about breaking MOM6.
Fortran 101 - control structure, where parameters are defined, some keywords e.g. private, intent, submodule, use module etc
Presenter: @edoyango (Edward Yang).
Date: 14/05/2026
Note
Today we'll be building a simple Fortran example program that uses some of the concepts that MOM6 is built with. Next week, we'll apply these ideas to some real MOM6 code. Our program today, takes a 3d array (first two indices represent the lateral domain, and the last represents the columns), and performs either a sum or max operation along the column, reducing the result to a 2d array that represents the lateral domain only. If you are comfortable with Python and would like to see today's end goal, the final version of the code is given in Python below.
Contributing to MOM6 can be extra daunting if you're not used to programming in Fortran. These notes aim to introduce Fortran to someone who might already be familiar with Python. And thankfully, most of the Fortran features exercised in MOM6 have a Python equivalent. Here, we won't be looking at MOM6 code directly, because the code itself is quite long -- even if the language features used aren't too complicated.
Python equivalent of the example program to be built
"""
Python code that emulates the example Fortran program.
It uses Python functions and classes to represent the equivalent Fortran
subroutines and derived types.
"""
import numpy as np
class Grid:
def __init__(self, is_: int, js: int, ie: int, je: int, nz: int):
self.is_ = is_
self.js = js
self.ie = ie
self.je = je
self.nz = nz
class ControlStructure:
def __init__(self, initialized: bool = False, which_op: int = 1):
self.initialized = initialized
self.which_op = which_op
def do_something_along_column(cs: ControlStructure, g: Grid, arr1: np.ndarray) -> np.ndarray:
if not cs.initialized:
raise RuntimeError("Control structure not initialized!")
if cs.which_op == 1:
return _sum_along_column(g, arr1)
if cs.which_op == 2:
return _max_along_column(g, arr1)
raise ValueError("Invalid operation provided! must be either 1 or 2")
def _sum_along_column(g: Grid, arr1: np.ndarray) -> np.ndarray:
arr2 = arr1[:, :, 0].copy()
for j in range(g.je - g.js + 1):
for k in range(1, g.nz):
for i in range(g.ie - g.is_ + 1):
arr2[i, j] += arr1[i, j, k]
return arr2
def _max_along_column(g: Grid, arr1: np.ndarray) -> np.ndarray:
arr2 = arr1[:, :, 0].copy()
for j in range(g.je - g.js + 1):
for k in range(1, g.nz):
for i in range(g.ie - g.is_ + 1):
arr2[i, j] = max(arr2[i, j], arr1[i, j, k])
return arr2
import numpy as np
from my_module import ControlStructure, Grid, do_something_along_column
if __name__ == "__main__":
g = Grid(is_=1, js=2, ie=3, je=4, nz=5)
cs = ControlStructure(initialized=True, which_op=1)
input_array = np.ones((g.ie - g.is_ + 1, g.je - g.js + 1, g.nz))
output_array = do_something_along_column(cs, g, input_array)
print(np.sum(output_array))
Programs and modules
Most of the code in MOM6 is organised into "modules" which usually relate to a certain area of the ocean physics. For example,
MOM_barotropic.F90 contains the barotropic solver code,
and MOM_tracer_advect.F90 contains the tracer advection code
and so on. Modules contains code that can be reused in other modules or "programs". Modules cannot be directly compiled and run, and
so modules' code must be "used" from a program. The skeleton of this arrangement can look like:
module my_module ! my_module is the name of the module
implicit none
end module my_module
program my_program ! my_program is the name of the program
use my_module
implicit none
end program my_program
The module and program's boundaries in the code are deliniated by program/module and end program/module couples. The
program/module's name must follow the first program/module and matching end program/module (including the program/module name is
optional, but it's a requirement to include the name in modules in).
implicit none is a "quirk" of Fortran. It says that all variables' type must be declared. Otherwise, the compiler can make an
"educated guess" as to what the type it is, which can result in unexpected behaviour. Hence, it is best practice to always include
implicit none in every module and program.
Subroutines and declaring variables
Programs can have runnable code. But as your codebase gets larger, it's likely that you will:
-
want to organise the code in some way to make it easier to understand and maintain (e.g. group code related to certain physics together), and
-
store code that is reused in multiple places. Subroutines help facilitate this. Subroutines are similar to Python functions except that subroutines don't return anything, and instead, they modify its arguments instead. Fortran does have functions also, but are used less frequently in MOM6.
Like programs and modules, subroutines are bounded by subroutine <name> and end subroutine <name>. The subroutine's arguments
follow the name, followed by the type declaration of the arguments and local variables. Unlike Python, the types of all variables
must be declared and cannot change. In the example below, the variable attributes commonly used in MOM6 are shown (type, dimension,
and intent). Variables can be declared in any order, but a common convention (that MOM6 follows) is to declare the arguments first,
followed by local variables.
The example above is trivial as both the program or module have no code. So let's create our first subroutine (Fortran comments are
prefixed with !):
module my_module
implicit none
! says that subroutines/functions are declared after
contains
! this is declaring the subroutine's signature
subroutine sum_along_column(is, ie, js, je, nz, arr1, arr2)
! each argument's type must be declared. Here we have:
! * type (integer/real)
! * real is equivalent to np.float32. However, MOM6 opts to control the precision at compile
! time.
! * dimension aka shape. No dimension means that variable is scalar. dimension(...) means the
! variable is an array.
! * dimension(a:b) means that for the given index, only indices a to b are defined.
! This a unique feature of Fortran that is utilised heavily in MOM6.
! * intent
! * `in`: the variable will only be read
! * `out`: the variable will be written to
! * `inout`: not shown here, but means that the variable may be read and/or written to.
! once an argument has been declared, it can also be used to declare others
integer, intent(in) :: is, ie, js, je, nz
real, dimension(is:ie, js:je, nz), intent(in) :: arr1
real, dimension(is:ie, js:je), intent(out) :: arr2
! all local variables must also be declared
integer :: i, j, k
do j=js,je
! copies can be done using array slicing
arr2(:,j) = arr1(:,j,1) ! initialize the sum along columns
! MOM6 keeps nested loops on a single line, dilineated by colons.
do k=2,nz ; do i=is,ie
arr2(i,j) = arr2(i,j) + arr1(i,j,k) ! do the sum along columns
enddo ; enddo
enddo
end subroutine sum_along_column
end module my_module
Note that subroutines are invoked with call <subroutine>(...)
Array copies
Like with Python NumPy, arrays' contents can be copied between each other. If the arrays are of the same shape, they can be copied
with specifying array indices (a = b), or you may specify which slices to copy e.g. a(1:10) = b(1:10), or a(:, 1) = b(:) etc.
Noting that Fortran accesses array elements/slices using round brackets () instead of square brackets [] common in other
languages. It is also worth noting that array assignments/copies copies the array's values (different from Python where a = b
means something different from a[:] = b[:]).
Loops
Loops are dilineated by do variable=start,end,step and enddo - which is like for variable in range(start,end+1,step): in
Python. The main difference between fortran loop ranges and Python ranges is that the end is included in the range.
You may have also noticed that the loop ordering is a bit strange - where the outer loop iterates of the middle index, with the outer index in the middle, and the inner-most loop iterating over the first. This is quite common in MOM6!
Derived types
Derived types are similar to Python classes in that they can be instantiated and group related information. Like Python classes,
Fortran derived types has a name and attributes (or members). One of the key derived types in MOM6 is the
ocean_grid_type which describe the grid extents
(including the computational and halo extents). It also stores other grid information like lateral dimensions of the columns,
masking etc. We can create a simple version of the grid type and use it in our subroutine:
module my_module
implicit none
! grid type to hold grid bounds Python analogue:
type grid_type ! class gridType:
integer :: is, js ! starting indices ! def __init__(self, is, js, ie, je, nz):
integer :: ie, je ! ending indices ! self.is = is ; self.js = js
integer :: nz ! number of vertical layers ! self.ie = ie ; self.je = je
end type grid_type ! self.nz = nz
contains
! we can replace our grid indices with a grid_type
subroutine sum_along_column(g, arr1, arr2)
! We can still use the grid_type's members to define subsequent variables
! below, `g` is an instance of the `grid_type`. This `g`'s members are being accessed with `%`,
! which in python would be `g.is`. `g`'s members are then used to size the input/output arrays.
type(grid_type), intent(in) :: g ! g is an instance of grid_type
real, dimension(g%is:g%ie, g%js:g%je, g%nz), intent(in) :: arr1
real, dimension(g%is:g%ie, g%js:g%je), intent(out) :: arr2
integer :: i, j, k
do j=g%js,g%je
arr2(:,j) = arr1(:,j,1) ! initialize the sum along columns
do k=2,g%nz ; do i=g%is,g%ie
arr2(i,j) = arr2(i,j) + arr1(i,j,k) ! do the sum along columns
enddo ; enddo
enddo
end subroutine sum_along_column
end module my_module
Let's also introduce another pattern used in MOM6: the "control structure". Each module will have its own control structure that mostly stores information to control MOM6's behaviour e.g. which algorithm to use or whether a certain physics is turned on or not. In our simple example, the control structure will simply control whether to do a sum or max along columns.
module my_module
implicit none
private ! this says that by default, contents of this module aren't accessible.
!< grid type to hold grid bounds
type grid_type
integer :: is, js !< starting indices
integer :: ie, je !< ending indices
integer :: nz !< number of vertical layers
end type grid_type
!< control structure
type control_structure_type
logical :: initialized = .false. !< whether the control structure is initialized - defaults to .false.
integer :: which_op = 1 !< which operation to do - default is 1 (sum)
end type control_structure_type
! Explicitly say which things can be used - do_sum/max_along_column cannot be directly used
public :: grid_type, control_structure_type, do_something_along_column
contains
!< this subroutine either does a max or sum along columns
subroutine do_something_along_column(cs, g, arr1, arr2)
type(control_structure_type), intent(in) :: cs !< control structure
type(grid_type), intent(in) :: g !< grid type
real, dimension(g%is:g%ie, g%js:g%je, g%nz), intent(in) :: arr1 !< input array
real, dimension(g%is:g%ie, g%js:g%je), intent(out) :: arr2 !< output array
if (.not.CS%initialized) error stop "Control structure not initialized!"
if (CS%which_op == 1) then
call sum_along_column(g, arr1, arr2)
elseif (CS%which_op == 2) then
call max_along_column(g, arr1, arr2)
else
error stop "Invalid operation provided! must be either 1 or 2"
endif
end subroutine do_something_along_column
!< performs a sum reduction along columns
subroutine sum_along_column(g, arr1, arr2)
type(grid_type), intent(in) :: g !< grid type
real, dimension(g%is:g%ie, g%js:g%je, g%nz), intent(in) :: arr1 !< input array
real, dimension(g%is:g%ie, g%js:g%je), intent(out) :: arr2 !< output array
integer :: i, j, k ! loop indices
do j=g%js,g%je
arr2(:,j) = arr1(:,j,1) ! initialize the sum along columns
do k=2,g%nz ; do i=g%is,g%ie
arr2(i,j) = arr2(i,j) + arr1(i,j,k) ! do the sum along columns
enddo ; enddo
enddo
end subroutine sum_along_column
!< performs a max reduction along columns
subroutine max_along_column(g, arr1, arr2)
type(grid_type), intent(in) :: g !< grid type
real, dimension(g%is:g%ie, g%js:g%je, g%nz), intent(in) :: arr1 !< input array
real, dimension(g%is:g%ie, g%js:g%je), intent(out) :: arr2 !< output array
integer :: i, j, k ! loop indices
do j=g%js,g%je
arr2(:,j) = arr1(:,j,1) ! initialize the sum along columns
do k=2,g%nz ; do i=g%is,g%ie
arr2(i,j) = max(arr2(i,j), arr1(i,j,k)) ! do the max along columns
enddo ; enddo
enddo
end subroutine max_along_column
end module my_module
The module is much larger now - the control structure type has been added and two subroutines have also been added.
max_along_column is almost identical to sum_along_column except that it does a max operation instead of +.
If statements
In the module, we've also added a 3-branch if statement. If statements look quite similar to Python's except that the evaluation
must be put it into brackets and is followe by then: i.e. if (statement) then. Otherwise the semantics are identical.
Module public and private
To hide details, MOM6 likes to leverage "public" and "private" statements in modules. MOM6 modules will declare everything as
private by default (by having an unqualified private clause), and then explicitly list the objects that should be accessible
outside of the module with public :: list, of, objects, and, procedures. Private things are visible to other things within the
module, but not outside.
Documenting comments
MOM6 uses Doxygen comments that automatically generate documentation for the code. These type of comments are sentineled with !< (as opposed to only !). Procedures (functions and subroutines), types and members, and arguments must be documented.
Using modules
Let's finish our program and use the module code! The program will be very simple -- uses hardcoded values to initialize everything.
program my_program
! specify which things we want from the module
! Python analogue: from my_module import grid_type, control_structure_type, do_something_along_column
! note that because of the privacy in the module, `sum_along_column` is not included in the
! `public :: ...` list. This means that importing or calling `sum_along_column` from this
! program would fail. Adding `sum_along_column` to the `public :: ...` list would make it
! available here.
! Note that excluding `only` would lead to everything public being available. However, MOM6
! ensures to make imports explicit. This is, in general, good practice as it makes it easier to
! identify where something used inside the program/module comes from.
use my_module, only: grid_type, control_structure_type, do_something_along_column
implicit none
! variable declaration
type(grid_type) :: g
type(control_structure_type) :: cs
! use allocatable arrays for dynamic array sizes in the program
real, allocatable :: input_array(:, :, :), output_array(:, :)
! derived type instances don't have to be constructed (the construction is implied in the declaration)
! but they can be initialized with a default type constructor, or simply setting the members.
g = grid_type(is=1, js=2, ie=3, je=4, nz=5)
cs%initialized = .true.
cs%which_op = 1
allocate( &
input_array(g%is:g%ie, g%js:g%je, g%nz), &
output_array(g%is:g%ie, g%is:g%ie) &
)
input_array = 1.0
call do_something_along_column(cs, g, input_array, output_array)
! print the sum of the array to the terminal
write(*, *) sum(output_array)
end program my_program
A program itself looks quite similar the subroutines above, except there aren't any arguments. Near the top we use the module and
select which things we want. This convention isn't mandatory, but is closely followed by MOM6.
Allocatable arrays
Unlike subroutines, programs cannot get array sizes by simply passing them in. So, often arrays are made given the "allocatable" attribute, which lets the program "allocate" the array based on some user input or similar. Here, we allocate the arrays' based on hardcoded values, but that could be changed to use CLI arguments or something.
Compiling and running the example
To compile the program, you can put the above module and program into the same file, say example.f90, and compile it with
gfortran example.f90 -o example.x. You can then execute it with ./example.x. You should get 45 with many zeros printed!
MOM6 is a much more complex codebase with many more dependencies and consequently more complex build system.
Fortran 101 re-visited: MOM6 intrinsics
Date: 21/05/2026.
Presenter: Jorge Gálvez Vallejo (@JorgeG94).
After some careful deliberation I have decided to show a very fun bit of MOM6 that is crucial to all of our simulations! We
know that MOM6 enforces bitwise reproducbility across runs. To achieve this there are careful safeguards while writing and reviewing
code. Bitwise reproducibility means that two answers at identical to the bits that compose them, i.e. a good example is that:
a + (b+c) != (a+b) + c in terms of bit, they both resolve to (a+b+c) but the order of operations is different!
So MOM6 enforces very tight checks on how the code is written, this is why if you want to refactor a routine in MOM6 that does intermediate computations you can end up changing answers! For example one that I looked at (this routine is in the vertical viscosity):
botfn = 1. / (1. + 0.09 * z2 * z2 * z2 * z2 * z2 * z2)
I thought, "oh my god, this is ugly", let's make this more readable by doing:
botfn = 1. / (1. + 0.09*z2**6)
There are two potential things here. One, this changes the BITS of the answer although they are
the same number. This is because z2**6 can be evaluated differently by the compiler! So I thought, well,
how about:
z2_six = z2 * z2 * z2 * z2 * z2 * z2
botfn = 1. / (1. + 0.09*z2_six)
You can see now how subtle a small change can break bitwise reproducibility! So, today we are going into a super
fun bit of the code: MOM6/src/framework/MOM_intrinsic_functions.F90! This file contains some routines that
are not quite trustworthy across compilers. We will focus on the cuberoot!
The module signature:
module MOM_intrinsic_functions
use iso_fortran_env, only : stdout => output_unit, stderr => error_unit
use iso_fortran_env, only : int64, real64
implicit none ; private
public :: invcosh, cuberoot
public :: intrinsic_functions_unit_tests
!...
end module MOM_intrinsic_functions
!> Returns the cube root of a real argument at roundoff accuracy, in a form that works properly with
!! rescaling of the argument by integer powers of 8. If the argument is a NaN, a NaN is returned.
elemental function cuberoot(x) result(root)
real, intent(in) :: x !< The argument of cuberoot in arbitrary units cubed [A3]
real :: root !< The real cube root of x in arbitrary units [A]
real :: asx ! The absolute value of x rescaled by an integer power of 8 to put it into
! the range from 0.125 < asx <= 1.0, in ambiguous units cubed [B3]
real :: root_asx ! The cube root of asx [B]
real :: ra_3 ! root_asx cubed [B3]
real :: num ! The numerator of an expression for the evolving estimate of the cube root of asx
! in arbitrary units that can grow or shrink with each iteration [B C]
real :: den ! The denominator of an expression for the evolving estimate of the cube root of asx
! in arbitrary units that can grow or shrink with each iteration [C]
real :: num_prev ! The numerator of an expression for the previous iteration of the evolving estimate
! of the cube root of asx in arbitrary units that can grow or shrink with each iteration [B D]
real :: np_3 ! num_prev cubed [B3 D3]
real :: den_prev ! The denominator of an expression for the previous iteration of the evolving estimate of
! the cube root of asx in arbitrary units that can grow or shrink with each iteration [D]
real :: dp_3 ! den_prev cubed [C3]
real :: r0 ! Initial value of the iterative solver. [B C]
real :: r0_3 ! r0 cubed [B3 C3]
integer :: itt
integer(kind=int64) :: e_x, s_x
if ((x >= 0.0) .eqv. (x <= 0.0)) then
! Return 0 for an input of 0, or NaN for a NaN input.
root = x
else
call rescale_cbrt(x, asx, e_x, s_x)
! Iteratively determine root_asx = asx**1/3 using Halley's method and then Newton's method,
! noting that Halley's method onverges monotonically and needs no bounding. Halley's method is
! slightly more complicated that Newton's method, but converges in a third fewer iterations.
! Keeping the estimates in a fractional form Root = num / den allows this calculation with
! no real divisions during the iterations before doing a single real division at the end,
! and it is therefore more computationally efficient.
! This first estimate gives the same magnitude of errors for 0.125 and 1.0 after two iterations.
! The first iteration is applied explicitly.
r0 = 0.707106
r0_3 = r0 * r0 * r0
num = r0 * (r0_3 + 2.0 * asx)
den = 2.0 * r0_3 + asx
do itt=1,2
! Halley's method iterates estimates as Root = Root * (Root**3 + 2.*asx) / (2.*Root**3 + asx).
num_prev = num ; den_prev = den
! Pre-compute these as integer powers, to avoid `pow()`-like intrinsics.
np_3 = num_prev * num_prev * num_prev
dp_3 = den_prev * den_prev * den_prev
num = num_prev * (np_3 + 2.0 * asx * dp_3)
den = den_prev * (2.0 * np_3 + asx * dp_3)
! Equivalent to: root_asx = root_asx * (root_asx**3 + 2.*asx) / (2.*root_asx**3 + asx)
enddo
! At this point the error in root_asx is better than 1 part in 3e14.
root_asx = num / den
! One final iteration with Newton's method polishes up the root and gives a solution
! that is within the last bit of the true solution.
ra_3 = root_asx * root_asx * root_asx
root_asx = root_asx - (ra_3 - asx) / (3.0 * (root_asx * root_asx))
root = descale(root_asx, e_x, s_x)
endif
end function cuberoot
From Ed's last session, here we can see a couple of the things he mentioned. We have a function
that is called. Functions return a single value, similar to how they work in Python. Here however,
we have an extra attribute in Fortran, the elemental attribute. Elemental makes a routine
callable/applicable inside an array. So the cuberoot routine can be used to obtain the cuberoot
of a scalar, i.e. real :: my_scalar or an allocatable array of n dimensions! i.e. we can call
cuberoot to operate on real, allocatable :: my_array(:,:,:,:,:) with no problems whatsoever!
Next, we have a call rescale_cbrt:
!> Rescale `a` to the range [0.125, 1) and compute its cube-root exponent.
pure subroutine rescale_cbrt(a, x, e_r, s_a)
real, intent(in) :: a
!< The number to be rescaled for cube-root computation [A3]
real, intent(out) :: x
!< The rescaled value of a in the range from 0.125 < asx <= 1.0, in ambiguous units cubed [B3]
integer(kind=int64), intent(out) :: e_r
!< Cube root of the exponent of the rescaling of `a`
integer(kind=int64), intent(out) :: s_a
!< The sign bit of a
integer(kind=int64) :: xb
! Floating point value of a, bit-packed as an integer
integer(kind=int64) :: e_a
! Unscaled exponent of a
integer(kind=int64) :: e_x
! Exponent of x
integer(kind=int64) :: e_div, e_mod
! Quotient and remainder of e in e = 3*(e/3) + modulo(e,3).
! Pack bits of a into xb and extract its exponent and sign.
xb = transfer(a, 1_int64)
s_a = ibits(xb, signbit, 1)
e_a = ibits(xb, expbit, explen) - bias
! Compute terms of exponent decomposition e = 3*(e/3) + modulo(e,3).
! (Fortran division is round-to-zero, so we must emulate floor division.)
e_mod = modulo(e_a, 3_int64)
e_div = (e_a - e_mod)/3
! Our scaling decomposes e_a into e = {3*(e/3) + 3} + {modulo(e,3) - 3}.
! The first term is a perfect cube, whose cube root is computed below.
e_r = e_div + 1
! The second term ensures that x is shifted to [0.125, 1).
e_x = e_mod - 3
! Insert the new 11-bit exponent into xb and write to x and extend the
! bitcount to 12, so that the sign bit is zero and x is always positive.
call mvbits(e_x + bias, 0, explen + 1, xb, fraclen)
x = transfer(xb, 1.)
end subroutine rescale_cbrt
As we learned last week, a subroutine can be invoked via a call procedure.
Subroutines take multiple arguments and can return multiple values at once. You can
see this in:
integer(kind=int64), intent(out) :: e_r
!< Cube root of the exponent of the rescaling of `a`
integer(kind=int64), intent(out) :: s_a
e_r and s_a are both being written to inside the subroutine and being sent back. Also,
you might notice the pure attribtue in the subroutine. A pure function in Fortran is a
function with no side effects, meaning: it does not write to file/console, it does not alter
any variable outside its function body, and it does not read from or modify any external state
(such as global variables, SAVEd local variables, or I/O streams). Given the same inputs, it
must always produce the same outputs, which allows the compiler to safely call it from
within FORALL constructs, DO CONCURRENT loops, and other pure procedures. This makes it
threadsafe.
Elemental functions are pure by design, so any elemental function you write is also
pure, you can be verbose and say pure elemental function cuberoot but that's a mouthful!
Why did I show you this? Well, you cannot go and start editing MOM6 code without understanding the core tenet of MOM6: keep bitwise reproducbility! I hope this small entry helps you and encourages you to go deep into the codebase!
Navier Stokes -> stacked shallow water (adiabatic)
Date: 28/05/2026.
Presenter: Andrew Kiss (@aekiss).
Navier-Stokes equation: fully 3D
$ % shorthand for \partial \newcommand {\PD}{\partial} % 1st partial derivative \newcommand {\pd}[2]{\frac{\PD #1}{\PD #2}} % 2nd partial derivative \newcommand {\pdTwo}[2]{\frac{\PD^{2} #1}{\PD #2^{2}}} % 2nd partial derivative, allows more general denominator (eg dxdy) \newcommand {\pdTwoX}[2]{\frac{\PD^{2} #1}{ #2}} % 3rd partial derivative \newcommand {\pdThree}[2]{\frac{\PD^{3} #1}{\PD #2^{3}}} % 3rd partial derivative, allows more general denominator (eg dx^{2}dy) \newcommand {\pdThreeX}[2]{\frac{\PD^{3} #1}{ #2}} % 4th partial derivative \newcommand {\pdFour}[2]{\frac{\PD^{4} #1}{\PD #2^{4}}} % 4th partial derivative, allows more general denominator (eg dx^{3}dy) \newcommand {\pdFourX}[2]{\frac{\PD^{4} #1}{ #2}} % small 1st partial derivative \newcommand {\pds}[2]{\PD #1/\PD #2} % small 2nd partial derivative \newcommand {\pdTwos}[2]{\PD^{2} #1/\PD #2^{2}} \underbrace{ \underbrace{\underbrace{\pd{\mathbf{u}}{t}}{\text{Eulerian acceleration}} \underbrace{+(\mathbf{u}\cdot\nabla)\mathbf{u}}}}{\frac{D\mathbf{u}}{Dt}\text{: material (Lagrangian) acceleration relative to Earth}} \quad \underbrace{+2\bm{\Omega}\times\mathbf{u}}}}\text{Lagrangian acceleration relative to non-rotating frame} \quad =\underbrace{\underbrace{-\frac{1}{\rho}\nabla p} \underbrace{+A\nabla^2\mathbf{u}}}{\text{eddy viscosity}}\quad \underbrace{+\mathbf{g}} $}}}_{\frac{\mathbf{F}}{\rho}\text{: Force per unit density}