Earth system models (ESMs) are invaluable tools for studying the climate system's response to specific greenhouse gas emission pathways. Large single-model initial-condition and multi-model ensembles are used to investigate the range of possible responses and serve as input to climate impact and integrated assessment models. Thereby, climate signal uncertainty is propagated along the uncertainty chain and its effect on interactions between humans and the Earth system can be quantified. However, generating both single-model initial-condition and multi-model ensembles is computationally expensive. In this study, we assess the feasibility of geographically explicit climate model emulation, i.e., of statistically producing large ensembles of land temperature field time series that closely resemble ESM runs at a negligible computational cost. For this purpose, we develop a modular emulation framework which consists of (i) a global mean temperature module, (ii) a local temperature response module, and (iii) a local residual temperature variability module. Based on this framework, MESMER, a Modular Earth System Model Emulator with spatially Resolved output, is built. We first show that to successfully mimic single-model initial-condition ensembles of yearly temperature from 1870 to 2100 on grid-point to regional scales with MESMER, it is sufficient to train on a single ESM run, but separate emulators need to be calibrated for individual ESMs given fundamental inter-model differences. We then emulate 40 climate models of the Coupled Model Intercomparison Project Phase 5 (CMIP5) to create a “superensemble”, i.e., a large ensemble which closely resembles a multi-model initial-condition ensemble. The resulting ESM-specific emulator parameters provide essential insights into inter-model differences across a broad range of scales and characterize core properties of each ESM.
Our results highlight that, for temperature at the spatiotemporal scales considered here, it is likely more advantageous to invest computational resources into generating multi-model ensembles rather than large single-model initial-condition ensembles. Such multi-model ensembles can be extended to superensembles with emulators like the one presented here.
The range of simulated climate responses to external radiative forcing is affected by both internal variability and inter-model differences
As climate model ensembles are inherently expensive to run, there is an interest in approximating Earth system model (ESM) output with computationally cheap emulators. In the field of climate science, the term emulator is used for a variety of statistical models which learn from existing runs of complex climate models to infer properties of runs which have not been generated yet. This makes it possible to explore the phase space at a lower computational cost. ESM emulators target different aspects of the climate system. For example, some emulators focus on the impacts of sub-grid-scale parameterizations
In this study, the term emulator is used to refer to a computationally cheap statistical tool which generates additional realizations of land temperature field time series for a specific greenhouse gas emission pathway at a yearly resolution. The presented emulator thus produces realizations which closely resemble initial-condition ensemble members of the considered ESMs. In the context of large multi-model ensembles, our computationally cheap emulator can be used to produce look-alikes of large initial-condition ensembles for every model within the multi-model ensemble, resulting in a “superensemble”, i.e., a large ensemble which closely resembles a multi-model initial-condition ensemble.
To build this statistical temperature emulator, an overarching modular framework is proposed and put into the context of previous work in Sect.
We propose an additive framework for temperature emulation at the yearly scale for a specific greenhouse gas emission pathway, which can be summarized as
Our framework requires three modules: a global mean temperature module, a module for the grid-point-level temperature response to the global mean temperature, and a local residual temperature variability module. In the following, we place existing literature within these modules before discussing the connections to our emulator. As this study is primarily concerned with temperature, we focus solely on this variable in our literature review. However, several of the referenced studies also treat additional variables such as precipitation
Global mean temperature is often an output of computationally efficient simple energy balance climate models
Pattern scaling is a frequently employed approach to relate local temperature to global mean temperature and is also used to emulate warming patterns across emission scenarios
More complex local response emulation methods are rare and often directly conditioned on
While the focus is usually on emulating the pattern associated with the global mean temperature trend, patterns associated with physical modes of variability such as the El Niño–Southern Oscillation and the Pacific Decadal Oscillation can additionally be derived
Several approaches exist to emulate local residual temperature variability based on observations and climate model simulations
When employing ESMs instead, longer time series and multiple realizations are available to derive the statistical properties of the local residual temperature variability. Several authors fit autoregressive (AR) models to a set of climate model runs to account for temporal autocorrelation when emulating local residual temperature variability
All approaches listed so far rely on the assumption that local residual temperature variability is stationary in time, which is known not to be fulfilled everywhere.
While most studies focus on one or two of the modules required to mimic an initial-condition ensemble, this study proposes a framework which incorporates all three components. Since only 12 out of 40 CMIP5 models provide several initial-condition members, it is essential to test to what extent an emulator trained on a single run is able to approximate both its training run and additional independent initial-condition members. We thus emulate the full CMIP5 multi-model ensemble based on single training runs and create a superensemble which accounts for inter-model uncertainty across all 40 climate models. To the best of our knowledge, this study is the first to implement an emulator which mimics an initial-condition ensemble based on a single training run and applies it to such a large multi-model ensemble.
Runs from 40 CMIP5 climate models
Terms used to refer to different climate model runs throughout this study.
Additionally, stratospheric aerosol optical depth is used as a proxy for volcanic activity during the historical time period. This aerosol dataset was originally described by
Here, we focus on surface temperature anomaly at a yearly resolution. Temperature fields were bilinearly interpolated onto a
All regional averages shown in this study are area-weighted means. The regions employed in this study are 26 land regions defined in the Special Report on Managing the Risks of Extreme Events (SREX)
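As an illustration of the area-weighted regional means mentioned above, the following minimal sketch weights each grid point by the cosine of its latitude. This is a common approximation of grid-cell area on a regular latitude-longitude grid. The weighting scheme and the function name `area_weighted_mean` are assumptions made for this example and are not taken from the study.

```python
import math


def area_weighted_mean(values, lats_deg):
    """Average grid-point values with cos(latitude) weights.

    Assumption: cells lie on a regular lat-lon grid, so cell area is
    approximately proportional to the cosine of the cell latitude.
    """
    weights = [math.cos(math.radians(lat)) for lat in lats_deg]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical example: two grid points at the equator and at 60 degrees N.
regional_mean = area_weighted_mean([0.0, 2.0], [0.0, 60.0])
```

Because the weight at 60° is half the weight at the equator, the high-latitude value contributes only one third of the regional mean in this example.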
Map of the SREX regions and their abbreviations. The considered land grid points are shown in gray.
We follow the framework introduced in Sect.
Illustration of the emulation framework with the MESMER implementation.
To calibrate the emulator, a single run spanning 231 years (1870–2100) per model is used. For the calibration, the global mean temperature trajectory and the associated land temperature fields are required.
In the global mean temperature module, additional realizations of global mean temperature time series
In
First,
In a next step,
The time series of global mean temperature variability
In this study, the LOWESS smoothing window length is 50 years with weights decaying with increasing distance according to a tricube weight function. The regression coefficients for the forced response to volcanic eruptions are obtained with the ordinary least squares (OLS) algorithm. The coefficients of the AR process are fit by means of maximum likelihood, and the Bayesian information criterion (BIC) is employed to select its order
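The smoothing step described above can be sketched as a locally weighted linear regression with tricube weights. This is a minimal illustration and not the study's implementation. It assumes that a 50-year window corresponds to the 50 nearest yearly values, and it omits the robustness iterations that some LOWESS implementations apply. The function name `lowess_trend` is hypothetical.

```python
def lowess_trend(years, temps, window=50):
    """Smooth a yearly series with tricube-weighted local linear fits.

    Assumptions: one value per year, the window is the `window` nearest
    points in time, and no robustness iterations are applied.
    """
    n = len(years)
    k = min(window, n)
    trend = []
    for t0 in years:
        # Indices of the k points closest in time to the target year.
        nearest = sorted(range(n), key=lambda i: abs(years[i] - t0))[:k]
        d_max = max(abs(years[i] - t0) for i in nearest) or 1.0
        # Tricube weights decay from 1 at the target year to 0 at d_max.
        w = [(1 - (abs(years[i] - t0) / d_max) ** 3) ** 3 for i in nearest]
        sw = sum(w)
        xm = sum(wi * years[i] for wi, i in zip(w, nearest)) / sw
        ym = sum(wi * temps[i] for wi, i in zip(w, nearest)) / sw
        sxx = sum(wi * (years[i] - xm) ** 2 for wi, i in zip(w, nearest))
        sxy = sum(
            wi * (years[i] - xm) * (temps[i] - ym) for wi, i in zip(w, nearest)
        )
        slope = sxy / sxx if sxx > 0 else 0.0
        # Evaluate the local weighted regression line at the target year.
        trend.append(ym + slope * (t0 - xm))
    return trend
```

Subtracting the smoothed series from the original global mean temperature leaves a variability time series, to which an AR process could then be fitted in a separate step.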
The local temperature response module translates the global mean temperature signal into a grid-point-level response
In this study, the linear regression coefficients are estimated with OLS at each grid point.
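A grid-point-wise OLS fit of this kind can be sketched as follows. The sketch assumes two predictors, the global mean temperature trend and the global mean temperature variability, plus an intercept, and solves the normal equations in closed form. All function and variable names are hypothetical, and the code illustrates the technique rather than reproducing the study's implementation.

```python
def fit_local_response(trend, variability, local_temp):
    """OLS fit of local_temp = b0 + b1 * trend + b2 * variability."""
    n = len(local_temp)
    m1 = sum(trend) / n
    m2 = sum(variability) / n
    my = sum(local_temp) / n
    # Centre all series so the intercept can be recovered afterwards.
    x1 = [v - m1 for v in trend]
    x2 = [v - m2 for v in variability]
    y = [v - my for v in local_temp]
    s11 = sum(a * a for a in x1)
    s22 = sum(a * a for a in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * b for a, b in zip(x1, y))
    s2y = sum(a * b for a, b in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are collinear")
    # Solve the 2x2 normal equations with Cramer's rule.
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


def fit_all_grid_points(trend, variability, grid_series):
    """Apply the same regression independently at every grid point."""
    return [fit_local_response(trend, variability, s) for s in grid_series]
```

Fitting each grid point independently keeps the calibration cheap and makes the resulting coefficient maps easy to compare across climate models.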
The local residual temperature variability
For an AR(1) process,
To estimate
In this study, the AR(1) coefficients are fit at each grid point by means of maximum likelihood. In our framework implementation, the obtained intercept terms
The emulator's performance is evaluated on the training run and – where available – on test runs. While the evaluation on the training run indicates how successfully this framework implementation captures the training run, the evaluation on the test runs serves as a proxy for the emulator's capability in mimicking true ESM initial-condition ensembles. For the evaluation, 1000 emulations are generated for each climate model.
The local trends
To evaluate how well the emulated local trends capture true climate model runs, the Pearson correlation of
The local variability
To compare the emulated
To evaluate
On regional scales, the emulated temperatures
The parameters obtained from training the emulator on four example ESMs reveal distinct inter-ESM differences in every emulator module (Fig.
Emulator calibration parameters (rows) for four example ESMs (columns).
In the local response module (Eq.
The local residual variability (Eq.
Emulated temperature fields are visually indistinguishable from ESM test runs that were not used during training (Fig.
Temporal snapshots depicting temperature field realizations in 2100 (rows) for four example ESMs (columns).
Regionally averaged temperature time series (rows) for four example ESMs (columns). The regions are (from top to bottom) global land, Central Europe (CEU), and Southeastern South America (SSA). In each panel, one emulation (EMU) is highlighted in dark gray and 49 other emulations are shown in light gray. Additionally, all available ESM test runs are plotted in color.
Time series of emulations and ESM test runs averaged over global land, CEU, and SSA highlight the emulators' capability to reproduce the regionally characteristic behavior of the climate system (Fig.
Figure
Time series of 50 emulations (EMUs) from the CESM1(CAM5) emulator (light gray) overlaid with runs from the three other example ESMs for three regions: global land
Figure
Emulator properties (rows) of the 40 CMIP5 climate models (columns).
In the global mean temperature module (Eq.
In the local response module (Eq.
In the local residual variability module (Eq.
Figure
Regionally averaged time series as 2-D histograms for 40 CMIP5 model training runs and 1000 emulations per model
Correlation between the emulated local trends and the true climate model runs is very high in both training and test runs in all CMIP5 models, indicating that the forced trends are successfully extracted from each training run (Fig.
Local trend verification for the CMIP5 models by means of Pearson correlation between the emulated local trends and the training runs (gray bars). The example shows the associated 2-D histogram for CESM1(CAM5). For the CMIP5 models with test runs, the correlation between the emulated local trends and each individual test run is indicated by a black cross. Since these correlations are nearly identical for each test run of a specific climate model, the individual black crosses cannot be visually distinguished from one another. For all climate models with test runs, the number of available test runs is given in brackets after the model name.
To evaluate the local variability at the grid-point level, lag-1 temporal autocorrelations and standard deviations are considered (Fig.
Local variability verification for the CMIP5 models (columns) by means of the Pearson correlation of grid-point-level lag-1 temporal autocorrelations
Local variability verification for the CMIP5 models (columns) by means of the Pearson correlation of cross-correlations between grid points in three geographical bands (rows) between the 1000 individual emulations and the training runs (box plots). The geographical bands cover distances below 2000 km
To evaluate the spatial cross-correlations between grid points, three geographical bands are considered (Fig.
When considering full emulations, i.e., the local trends plus the local variability, the median is successfully emulated, but the emulations are slightly underdispersive compared to the training run for the vast majority of CMIP5 models and SREX regions (Fig
Deviation of climate model runs from the emulated 5 %
A modular framework is chosen for the climate model emulation because of its manifold advantages. First, the calibrated parameters of each emulator module can be used for climate model intercomparison over a wide range of scales since they can be readily visualized and easily interpreted (Sects.
In this study, an estimate of
To translate
Spatially coherent local variability is introduced in two emulator modules, namely in the local response module as the local response to
Local residual variability is modeled as an AR process with spatially correlated innovations (Sect.
We demonstrated that, for yearly temperature at grid-point to regional scales, training on a single run per climate model is sufficient to learn the key properties of the climate system of this climate model. Early results furthermore indicated that larger single-model initial-condition ensembles, in that case a 21-member CESM ensemble, can also be successfully emulated when training on a single ESM run
Our results highlight fundamental differences between large single-model initial-condition ensembles
We introduce a modular framework for climate model emulation of yearly land temperatures and present a specific, computationally cheap implementation called MESMER, which can create plausible temperature field time series within seconds based on a single climate model training run. Our emulator consists of (i) a global mean temperature module, (ii) a local temperature response module, and (iii) a local residual temperature variability module. The global mean temperature module contains a global mean temperature trend, which is shared by all emulations, and a global mean temperature variability term, which is modeled as an AR process and varies between individual emulations. The local response module is linear in nature and consists of a separate response to the global mean temperature trend and the global mean temperature variability. The local residual variability module generates spatiotemporally correlated fields by means of locally fit AR(1) processes with spatially correlated innovations.
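To make the local residual variability module more concrete, the following minimal sketch draws spatially correlated innovations through a Cholesky factor of an innovation covariance matrix and feeds them into one AR(1) process per grid point. The function names, the zero initial state, and the example covariance matrix are assumptions made for this illustration. The regularization of the covariance matrix discussed in this study is not included.

```python
import math
import random


def cholesky(cov):
    """Return lower-triangular L with L L^T = cov (cov must be positive definite)."""
    n = len(cov)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(cov[i][i] - s)
            else:
                low[i][j] = (cov[i][j] - s) / low[j][j]
    return low


def emulate_residual_variability(ar1_coefs, innovation_cov, n_years, seed=None):
    """Generate n_years of AR(1) fields with spatially correlated innovations."""
    rng = random.Random(seed)
    n = len(ar1_coefs)
    low = cholesky(innovation_cov)
    state = [0.0] * n  # assumption: start at zero, no spin-up period
    fields = []
    for _ in range(n_years):
        white = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Correlate the white noise across grid points: innov = L @ white.
        innov = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]
        # One AR(1) update per grid point with its own coefficient.
        state = [ar1_coefs[i] * state[i] + innov[i] for i in range(n)]
        fields.append(state)
    return fields


# Hypothetical two-grid-point example covering 1870-2100 (231 years).
example_cov = [[1.0, 0.6], [0.6, 1.0]]
example_fields = emulate_residual_variability([0.3, 0.5], example_cov, 231, seed=0)
```

Each call with a different seed yields a new realization, which is what makes it cheap to generate many emulations from a single set of calibrated parameters.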
Since emulators approximate complex ESMs in a simplified manner, they are not able to accurately reproduce all spatiotemporal ESM characteristics. For example, the emulator presented here dampens covariations between grid points as a function of distance in the local residual variability module due to regularization. Thus, our emulator reliably reproduces climate model variability at the grid-point level, but the emulations are increasingly underdispersive for larger regional averages, and intermediate-range spatial teleconnections cannot be accounted for. This caveat could be addressed by further improving the local residual variability module implementation with a focus on such teleconnections. Alternatively, training on several ESM runs would increase the robustness of the estimated parameters and make it possible to reproduce farther-reaching teleconnections within the current emulator setup. Nevertheless, calibrating our emulator on a single training run is sufficient to generate emulations which are visually indistinguishable from true ESM runs.
Inherent inter-ESM differences in warming trends and spatiotemporal variability make it necessary to calibrate a separate emulator for each of the 40 considered CMIP5 models. The resulting emulations successfully approximate the training run for each climate model on grid-point to regional scales. For CMIP5 models with more than one initial-condition ensemble member, it was furthermore demonstrated that the ensemble of emulations is generally able to mimic true climate model initial-condition ensembles at these scales. Hence, we argue that to sample climate signal uncertainty for yearly temperature at grid-point to regional scales, it is more advantageous to invest computational resources into generating multi-model ensembles rather than large single-model ensembles, since the latter can be readily approximated by our emulator.
Superensembles such as the one generated in this study, which contains 1000 emulations per climate model, are expected to be particularly helpful in regions with large interannual variability. There, the very sparse sampling of the temperature phase space by the CMIP5 ensemble may result in biased conclusions when solely employing the CMIP5 ensemble as an input to impact or integrated assessment models which estimate the effect of climate signal uncertainty on their quantity of interest.
The emulator is designed to be flexible enough to emulate whatever climate model run it is provided with. Hence, it is not part of the emulator's tasks to judge the realism of individual climate models. Instead, the choice of considered ESMs will depend on the scope of different applications. For example, results from emergent constraints analyses
In conclusion, we have presented a novel ESM emulator, MESMER, that can be trained on a single realization of each ESM and that has been shown to emulate and expand multi-model ensembles such as CMIP5. We expect the emulator to serve as a training ground for investigating the phase space of multi-model ensembles in new applications, e.g., the derivation of emission scenarios or the assessment of impacts under different emission pathways.
List of the 40 employed CMIP5 models, the modeling groups providing them, and the number of initial-condition ensemble members used.
The employed CMIP5 data are available from the public CMIP archive at:
The supplement related to this article is available online at:
LB, LG, and SIS designed the study based on an initial idea from SIS. LB carried out the analysis and drafted the text. LG provided statistical support for the analysis. All authors contributed to interpreting the results and refining the text.
The authors declare that they have no conflict of interest.
We acknowledge the World Climate Research Program's Working Group on Coupled Modeling, which is responsible for the Coupled Model Intercomparison Project (CMIP), and we thank the climate modeling groups (listed in Table
This research has been supported by Horizon 2020 (CRESCENDO (grant no. 641816)) and FP7 Ideas: European Research Council (DROUGHT-HEAT (grant no. 617518)).
This paper was edited by Raghavan Krishnan and reviewed by two anonymous referees.