March 2023
Newsletter
A message from our Principal Investigators

The Climate Modeling Alliance was born out of a vision that a step change in the accuracy of climate model projections was within reach by leveraging advances in the computational and data sciences. We wanted to build a next-generation climate model that learns directly from a wealth of Earth observations and high-resolution regional simulations. But a vision means little until it is implemented in practice. In this first of what will be quarterly newsletters, you can read about the progress the CliMA team is making toward fulfilling that vision, from new approaches to learning about climate models from data, to more accurate numerical schemes, to software interfaces that make the model output easy to analyze.


We are looking forward to the progress over the next year, when we will couple the atmosphere, ocean, and land components we have developed so far and see the birth of the CliMA climate model. It is a very exciting time for the CliMA project, and we are sure there will be much to report in upcoming newsletters.

Tapio picture
Tapio Schneider
Theodore Y. Wu Professor of Environmental Science and Engineering
Raffaele picture
Raffaele Ferrari
Cecil & Ida Green Professor of Oceanography
CliMA Science
A Discussion with CliMA's Lead Software Developer Simon Byrne on his Team's Parallel Input/Output Work
figure of space-filling curve
Space-filling curve

CliMA's software team, led by Simon Byrne, added a software interface to ClimaCore.jl for saving and loading data from distributed simulations. We caught up with Dr. Byrne near the Gong inside the CliMA conference room; an edited version of our interview is reproduced below.


Leilani Rivera-Dotson: Why did your team implement this interface?


Dr. Byrne: There are two main reasons we need to be able to save and load data: first, saving the quantities of interest, such as temperature and precipitation, for analysis and other post-processing tasks; and second, capturing the state of the model so that we can reproduce or resume the simulation from a particular point in time. The latter is important if there is a hardware fault or some other anomaly that we need to investigate. The saved files are commonly called restart files.
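The restart-file idea Dr. Byrne describes can be sketched in a few lines. The Python example below is purely illustrative, using NumPy's .npz format and made-up names; it is not ClimaCore.jl's actual parallel HDF5-based interface.

```python
import os
import tempfile
import numpy as np

def save_restart(path, step, state):
    # Write a restart file capturing the time step and the full model state.
    np.savez(path, step=step, state=state)

def load_restart(path):
    # Reload a restart file and return (step, state) so the run can resume.
    data = np.load(path)
    return int(data["step"]), data["state"]

def advance(state):
    # Stand-in for one model time step (purely illustrative).
    return state + 1.0

# Run three steps, write a checkpoint, then reload it.
state = np.zeros(4)
for _ in range(3):
    state = advance(state)

path = os.path.join(tempfile.mkdtemp(), "restart.npz")
save_restart(path, 3, state)
step, resumed = load_restart(path)
```

In a distributed simulation the extra difficulty, which the real interface handles, is that each process owns only a slice of the state, so the saved file must stitch those slices together consistently.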


Read more

ocean surface vorticity figure
A New Numerical Scheme for Nonlinear Ocean Dynamics
Simone Silvestri and Greg Wagner have made substantial progress in improving the numerical accuracy of the ocean component of the CliMA model, Clima-Ocean. Typical finite-volume ocean models for climate prediction use "second-order" schemes to discretize terms in the ocean's momentum balance. But second-order momentum advection schemes are noisy, and therefore must be paired with artificial viscous terms to suppress spurious oscillations and prevent numerical instability. Artificial viscosity has the unfortunate side effect of artificially suppressing ocean mesoscale turbulence, which plays a fundamental role in Earth's climate system by transporting heat poleward and determining the ocean's density stratification.

Read more
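The tradeoff described above can be seen in a toy one-dimensional advection problem. This Python sketch is purely illustrative and is not Clima-Ocean's discretization: a second-order centered scheme produces spurious oscillations at a sharp front, while adding dissipation (here, via a first-order upwind scheme) suppresses them at the cost of smearing the front.

```python
import numpy as np

# 1-D linear advection u_t + c u_x = 0 on a periodic grid, advecting a step.
n = 100
dx = 1.0 / n
c, cfl = 1.0, 0.4
dt = cfl * dx / c
x = np.arange(n) * dx
u0 = np.where((x > 0.3) & (x < 0.5), 1.0, 0.0)

def step_centered(u):
    # Second-order centered differences: accurate for smooth fields but
    # dispersive, so sharp gradients develop spurious over/undershoots.
    return u - c * dt / (2 * dx) * (np.roll(u, -1) - np.roll(u, 1))

def step_upwind(u):
    # First-order upwind: its implicit numerical diffusion suppresses the
    # oscillations, at the cost of artificially smearing the front.
    return u - c * dt / dx * (u - np.roll(u, 1))

u_cen, u_up = u0.copy(), u0.copy()
for _ in range(50):
    u_cen = step_centered(u_cen)
    u_up = step_upwind(u_up)

# u_cen overshoots above 1 and undershoots below 0; u_up stays in [0, 1].
```

Higher-order advection schemes aim to get the best of both worlds: little spurious oscillation without the heavy-handed damping that suppresses mesoscale turbulence.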
CliMA Researchers Upgrade their Uncertainty Pipeline

CliMA provides state-of-the-art data assimilation and machine learning tools that enable users to calibrate their models using large amounts of data. CliMA's research scientists and engineers continually upgrade these tools and have made several breakthroughs in recent months.


uncertainty quantification figure

CliMA's data assimilation and machine learning (DA/ML) team, led by Oliver Dunbar, improved the scalability of one of their machine learning emulators. Think of emulators as accelerated data-driven models, trained to replicate the evolution of a complex system (in our case, a climate model). Once trained, emulators can be run millions of times using few computational resources. Such large samples are essential for ascertaining the spread, or uncertainty, of climate model parameters that comes from calibrating a model with noisy climate statistics. In the past, we used Gaussian processes as emulators, which have notoriously poor scaling to high-dimensional problems. We improved the scalability of the emulators with random feature models. This allows CliMA’s DA/ML tools to learn the uncertainty of many more parameters simultaneously, and with higher-dimensional data.
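As a toy illustration of the idea (not CliMA's implementation), a random Fourier feature map approximates a Gaussian process with a squared-exponential kernel: fitting reduces to ridge regression on a finite set of cosine features, so the cost scales with the number of features rather than the number of training points. All names and hyperparameters in this Python sketch are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data: noisy samples of a smooth function we want to emulate.
x_train = rng.uniform(-3, 3, size=(80, 1))
y_train = np.sin(x_train[:, 0]) + 0.05 * rng.normal(size=80)

# Random Fourier features approximating an RBF (squared-exponential) kernel:
# frequencies are drawn from the kernel's spectral density.
n_features, lengthscale = 200, 1.0
w = rng.normal(scale=1.0 / lengthscale, size=(n_features, 1))
b = rng.uniform(0, 2 * np.pi, size=n_features)

def features(x):
    return np.sqrt(2.0 / n_features) * np.cos(x @ w.T + b)

# Ridge regression in feature space: solving a system whose size is set by
# n_features, independent of how many training points we have.
Phi = features(x_train)
coef = np.linalg.solve(Phi.T @ Phi + 1e-3 * np.eye(n_features), Phi.T @ y_train)

# The trained emulator is now cheap to evaluate at many new inputs.
x_test = np.linspace(-3, 3, 50)[:, None]
y_pred = features(x_test) @ coef
error = np.max(np.abs(y_pred - np.sin(x_test[:, 0])))
```

Because evaluating the fitted emulator is just a matrix-vector product, drawing the millions of samples needed for uncertainty quantification becomes inexpensive.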

Commenting on how the breakthrough was made, Dr. Dunbar, citing work from more than a decade ago, said that the team had long been aware of the inherent limitations of Gaussian process emulators but had used them as a necessary stepping stone. The breakthrough came last quarter, when Dr. Dunbar and graduate student Nicholas Nelsen developed new automated training procedures for random feature emulators and showed that they can replace Gaussian process emulators within CliMA’s DA/ML tools. Interestingly, random feature emulators approximate Gaussian processes, but in a way that scales with reduced cost (in the number of parameters and the dimension of the data) while maintaining the desirable features of Gaussian processes (e.g., smoothing) for emulation of climate statistics. The random feature emulator removes the key computational bottleneck in CliMA’s DA/ML and uncertainty quantification pipeline.


Dr. Dunbar pointed out that no random feature package existed in the Julia registry (Julia is a programming language developed at MIT and used to build the CliMA climate model) until CliMA’s DA/ML team registered their open-source package, RandomFeatures.jl.


“We’ve created a novel, gradient-free training procedure,” confirmed Dr. Dunbar. “We hope that this package and its automated training algorithm will promote the use of both random features and gradient-free training for machine learning tools in the community. We are documenting this work in a forthcoming article (Dunbar, Nelsen, Mutic, 2023).”


Work in the DA/ML team continues at a remarkable pace, with advances in other machine learning tools as well, such as the use of generative AI for downscaling climate simulations to impact-relevant local scales. Combined with the team’s open approach to software development, these efforts stand to benefit scientists and engineers across a wide array of fields.