The concept of independence has been frequently mentioned in climate science research, but has rarely been defined and discussed in a theoretically robust and quantifiable manner. In this paper we argue that any discussion must start from a clear and unambiguous definition of what independence means and how it can be determined. We introduce an approach based on the statistical definition of independence, and illustrate with simple examples how it can be applied to practical questions. Firstly, we apply these ideas to climate models, which are frequently argued to not be independent of each other, raising questions as to the robustness of results from multi-model ensembles. We explore the dependence between models in a multi-model ensemble, and suggest a possible way forward for future weighting strategies. Secondly, we discuss the issue of independence in relation to the synthesis of multiple observationally based constraints on the climate system, using equilibrium climate sensitivity as an example. We show that the same statistical theory applies to this problem, and illustrate this with a test case, indicating how researchers may estimate dependence between multiple constraints.

Approximately 30 climate models contributed to recent iterations of the CMIP
databases, and they generally agree, at least on broad statements: the world
is warming, anthropogenic emissions of CO

But should this consensus between models really lead to confidence in these
results? If we were to re-run the same scenario with the same model 30 times,
we would get the same answer 30 times, whether it be a good or bad model.
This repetition of one experiment would not tell us how good the model is,
and the behaviour of the real climate system would almost certainly lie
outside this narrow range of results. Different model development teams share
code, and even if the code is rewritten from scratch, the underlying
algorithms and methods are often linked

In Part 1 of this paper, we consider this question of model independence and
discuss how it may be addressed in a mathematically precise and well-founded
manner. We present an approach which links the usage in climate science to
the statistical definition of independence. We start by reviewing, in
Sect.

In Part 2, we illustrate how the theoretical basis for statistical
independence can also apply to the question of synthesising observational
constraints on the behaviour of the climate system, particularly the
equilibrium climate sensitivity. The equilibrium climate sensitivity

The question of independence has featured widely in climate research, but the research community has not yet arrived at a clear and unambiguous definition. Different authors have approached the question of independence in different ways, and their approaches are often mutually inconsistent.

One common approach has been to interpret model independence as meaning that
the model errors can be considered independent, identically distributed
(i.i.d., in common statistical parlance) samples drawn from some distribution
(typically Gaussian) with zero
mean
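In symbols (our notation): writing $T$ for the true climate and $X_i$ for the output of model $i$, this truth-centred hypothesis amounts to

```latex
X_i = T + \epsilon_i, \qquad \epsilon_i \stackrel{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2),
```

under which the ensemble mean would converge to the truth as the ensemble size grows.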

However, the truth-centred hypothesis is clearly refuted by numerous analyses
of the ensemble. In particular, the errors of different models are observed
to be strongly related, as can be shown by the positive correlations between
spatial patterns of biases in
climatology

Some approaches to model independence have been less quantitative in nature.

Perhaps the most constructive and complete approach to date is that
of

To summarise, the literature presents a strong consensus that the models are
not independent, but offers no comparably clear viewpoint on what to do about
this, or even on the precise meaning of the term.
Given this lack of clarity, it is perhaps unsurprising that the IPCC does not
address this topic in detail, while nevertheless acknowledging its
importance

In probability theory, independence has a straightforward definition. Two
events,
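In standard notation, writing $A$ and $B$ for the two events, the definition (and its conditional version, given a third event $C$) is

```latex
P(A \cap B) = P(A)\,P(B), \qquad
P(A \cap B \mid C) = P(A \mid C)\,P(B \mid C).
```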

As we have seen in Sect.

Bayes' theorem tells us how to update a prior probabilistic estimate of an
unknown,

If we have two events

If
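In symbols (our notation, with $\theta$ the unknown and $O_1$, $O_2$ two observations), the Bayesian update and the role of conditional independence can be written as

```latex
P(\theta \mid O_1, O_2) \propto P(O_2 \mid O_1, \theta)\, P(O_1 \mid \theta)\, P(\theta)
= P(O_2 \mid \theta)\, P(O_1 \mid \theta)\, P(\theta),
```

where the second equality holds exactly when $O_1$ and $O_2$ are conditionally independent given $\theta$, in which case the two likelihoods simply multiply.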

The above elementary probability theory applies equally to the frequentist
and Bayesian paradigms. Within the frequentist paradigm, the probability of
an event is defined as the limit of its relative frequency over a large
number of repeated but random trials. Within the Bayesian paradigm, the
probability calculus may be used to describe the subjective beliefs of the
researcher. In the remainder of this paper, we exclusively adopt this
paradigm, since all the relevant uncertainties discussed here are epistemic
in nature (relating to imperfect knowledge) and not aleatory (arising from
some intrinsic source of “randomness”). Thus, rather than considering “the
pdf of

It should be noted that Bayesian probabilities, being personal in nature, are
in general conditional on some personal “background” set of beliefs of the
researcher

As we have seen, the question of (conditional) independence boils down to the
question of whether

We now explore how this Bayesian framework can be applied to the question of
model independence. We first consider the “truth-centred” hypothesis which is
perhaps most clearly presented by

In light of this failure of the truth-centred approach, we now present two alternative interpretations of statistical independence that we believe could be more relevant and appropriate in application to the ensemble of climate models. We use CMIP3 here, rather than CMIP5, primarily in order that the ideas developed here can in the future be tested against a somewhat new sample, so as to defend against the risk of data mining.

Consider firstly the case where the outputs of a subset of the models which
contributed to CMIP3 are labelled as

While the subjective nature of Bayesian probability precludes a definitive answer, we expect that, for most researchers and for most model pairs with no clear institutional or historical link, the models will indeed be judged independent in this manner (i.e. conditional on the unlabelled ensemble of outputs). Conversely, if the pair of models differs in only some very limited manner, such as being different resolutions of the same underlying code (consider, for example, the T63 and T42 versions of CCMA which were submitted to CMIP3), then it might be sensible for a researcher to instead update their prediction of the unknown model, assigning increased probabilities to outputs which are closer (according to some reasonable measure) to the named model, and decreased probabilities to more distant outputs. The extent to which the probabilities are changed is a direct indication of the strength of the dependence between the models, as judged by the researcher.

An alternative but similar approach can be formulated if, instead of using
the discrete distribution of actual climate model outputs, we parameterise
their distribution, for example as a multivariate Gaussian. If given the
parameters of a Gaussian distribution based on the outputs of
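A sketch of this parametric idea, on synthetic numbers (the ensemble, the near-replicate, and the centre and spread of the conditioned prediction are all assumed, illustrative choices, not a prescribed rule):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_gauss(x, mean, cov):
    # Log-density of a multivariate Gaussian, via a linear solve for stability.
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (len(d) * np.log(2 * np.pi) + logdet
                   + d @ np.linalg.solve(cov, d))

# Hypothetical ensemble: 20 "models", each summarised by 3 output statistics.
ensemble = rng.normal(size=(20, 3))
sibling = ensemble[0]
# A near-replicate of model 0, standing in for e.g. a different-resolution
# version of the same underlying code.
twin = sibling + 0.01 * rng.normal(size=3)

# Unconditional prediction: Gaussian fitted to the whole ensemble.
mean, cov = ensemble.mean(axis=0), np.cov(ensemble, rowvar=False)
lp_uncond = log_gauss(twin, mean, cov)

# Prediction conditioned on knowing the sibling's identity: shift the centre
# to the sibling and shrink the spread (an ad hoc choice for illustration).
lp_cond = log_gauss(twin, sibling, 0.01**2 * np.eye(3))

print(lp_cond > lp_uncond)  # the conditioned prediction scores far higher
```

The gap between the two log-densities is one quantitative expression of how strongly the researcher judges the pair to be dependent.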

These approaches, we believe, encapsulate many of the same ideas as the model
similarity analyses of

To provide a concrete demonstration of the previous ideas, we analyse the
models which contributed to the CMIP3 database. Several modelling centres
contributed more than one model version and we expect, based on the existing
literature such as

We use as a simple distance metric the area-weighted root mean square (RMS)
difference between the climatological data fields (of commensurate variables)
after regridding to a common 5
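This metric can be sketched as follows (the grid and field values here are arbitrary placeholders, not CMIP data):

```python
import numpy as np

def area_weighted_rms(field_a, field_b, lats):
    """Area-weighted RMS difference between two fields on a regular
    latitude-longitude grid; rows correspond to latitudes (degrees)."""
    # On a regular grid, cell areas scale with the cosine of latitude.
    w = np.cos(np.deg2rad(lats))[:, None] * np.ones_like(field_a)
    return np.sqrt(np.sum(w * (field_a - field_b) ** 2) / np.sum(w))

# Toy 5-degree grid and two "climatologies" differing by a constant offset.
lats = np.arange(-87.5, 90, 5.0)
a = np.zeros((len(lats), 72))
b = a + 2.0
print(area_weighted_rms(a, b, lats))  # a uniform offset of 2 gives RMS ~2
```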

A researcher who applied this probabilistic strategy to each of the 9 pairs
of models identified as coming from the same centre would assign a typical
(geometric mean) probability of around 0.09 to the correct field of outputs,
when averaged over all model pairs and over the three types of fields TAS,
PREC and PSL. The naive uniform distribution would in contrast only assign a
probability of

Similar results can be obtained when the analysis is performed in parametric
terms, where, rather than using the sets of model outputs directly, only a
statistical summary of the ensemble of outputs is provided, in the form of a
multivariate Gaussian approximation to their distribution

Analysis of CMIP3 models. The

A natural question to ask is whether some weighting scheme could be developed
to account for model dependence of this type. If we anticipate that a pair of
models will be particularly similar, then including both in the ensemble
without downweighting either of them will tend to shift the ensemble mean
towards this pair of models. The correct weight to prevent this can easily be
calculated according to the interpolation formula in the following manner. If
we anticipate that a particular model

For example, if a second identical replicate of an existing model were to be
contributed to the ensemble (in which case
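The replicate case can be checked numerically (illustrative scalar outputs, not CMIP data):

```python
import numpy as np

# Hypothetical scalar outputs from a 5-member ensemble.
outputs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
mean_orig = outputs.mean()

# Contribute an exact replicate of the last model.
dup = np.append(outputs, outputs[-1])
mean_dup = dup.mean()  # pulled towards the duplicated model

# Downweight the pair: each copy gets half the weight of the other members,
# so the pair jointly counts as a single model.
w = np.ones(len(dup))
w[-2:] = 0.5
mean_w = np.sum(w * dup) / np.sum(w)

print(mean_orig, mean_dup, mean_w)  # the weighted mean recovers the original
```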

We have presented a coherent statistical framework for understanding model independence, and demonstrated how this framework can be applied in practice. Climate models cannot sensibly be considered independent estimates of reality, but fortunately this strong assumption is not required in order to make use of them. A more plausible, though still optimistic, assumption might be to interpret the ensemble as merely constituting independent samples of a distribution which represents our collective understanding of the climate system. This assumption is challenged by the near-replication of some climate models within the ensemble, and therefore sub-sampling or re-weighting the ensemble might improve its usefulness. We have shown how the statistical definition of (conditional) independence can apply and how it helps in defining independence in a quantifiable and testable manner.

The definition we have presented is certainly not the only possible one and we expect that others may be able to suggest improvements within this framework. For instance, experts with knowledge of the model structures might be able to predict more detailed similarities between the outputs of model pairs. Moreover, there is no requirement that, in applying our principles, a researcher would use the most naive ignorant prediction of uniform probabilities across the ensemble of outputs, or the Gaussian summary of the distribution, as their predictions of the target model. However, our result here is sufficient to illustrate how the concept of statistical independence can be directly applied in a quantitative mathematical sense to the question of model independence, while encapsulating much of what is discussed in the literature.

An important point to note is that this interpretation of independence is
entirely unrelated to model and indeed ensemble
performance

While the question of model similarity and ensemble member selection has
already been considered by others

In Part 1, we discussed how the concept of independence applies to the sets
of models which form the CMIP ensembles of opportunity. In Part 2, we discuss
estimation of climate sensitivity, although the principles presented here
apply more generally to observational constraints on climate system
behaviour. While initially it may seem that this topic has little in common
with that of Part 1, we will show how the concept of probabilistic
independence also relates directly to this question. Thus, the probabilistic
background of Sect.

The magnitude of the equilibrium climate sensitivity

The question naturally arises as to whether these different constraints
could, and should, be synthesised. In most of the Bayesian analyses, the
prior is typically chosen to be vague, though there is some debate concerning
this choice

It should be clear from the discussion in Sect.

In Sect.

While the subjective nature of Bayesian priors (i.e.

Here we explore these ideas in a little more detail in order to illustrate how it is
possible to provide a credible basis for what are fundamentally subjective
judgements. Typically, a likelihood

A simple example is used to illustrate the point. We use a zero-dimensional
energy balance model to simulate the climate changes of both the 20th century
and the LGM. For simplicity, we only consider a subset of the relevant
uncertain parameters: the equilibrium sensitivity

For the warming of the 20th century, we assume the total forcing
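A minimal sketch of such a zero-dimensional balance follows; the forcing constant and all numerical values are assumed, illustrative choices, not the values used in our experiments:

```python
F2X = 3.7  # assumed forcing from doubled CO2 (W m-2)

def equil_warming(S, forcing):
    # Equilibrium temperature change (K) for sensitivity S and a forcing
    # that has been fully equilibrated, as for the LGM state.
    return S * forcing / F2X

def transient_warming(S, forcing, heat_uptake):
    # Realised warming when part of the forcing is still being taken up
    # by the ocean (heat_uptake, W m-2), as for the 20th century.
    return S * (forcing - heat_uptake) / F2X

# Illustrative numbers only:
dT_lgm = equil_warming(S=3.0, forcing=-8.0)
dT_20c = transient_warming(S=3.0, forcing=2.3, heat_uptake=0.7)
print(dT_lgm, dT_20c)
```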

By construction, we already know that the two constraints are independent
given

In the context of our example, we first create an ensemble with an arbitrary
but fixed value of

Outputs of ensemble simulations (red dots) and linear regression
fits (black lines):

We now make a small change to the model, and substitute

The linear regressions are not necessarily the best way to represent a
relationship that may in practice be more complex. However, such an approach
may be expected to capture any first-order effect. The central point of these
numerical experiments is to demonstrate that this dependence can in principle
be diagnosed from model outputs directly, without the need for detailed
knowledge or understanding of causal relationships embedded in the model
structure. Furthermore, a conditional likelihood
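This kind of ensemble diagnosis can be sketched as follows, using a synthetic ensemble in which a shared "efficacy" parameter is the assumed (and deliberately inserted) source of dependence between the two simulated observables; all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
F2X = 3.7
S = 3.0  # sensitivity held fixed across the ensemble

n = 200
# A shared uncertain parameter, perturbed across members, that scales the
# response to forcing in both periods.
eff = 1.0 + 0.2 * rng.normal(size=n)
# Independent "observational-style" noise on each simulated observable.
noise_20c = 0.05 * rng.normal(size=n)
noise_lgm = 0.3 * rng.normal(size=n)

dT_20c = S * eff * 1.6 / F2X + noise_20c   # 20th-century warming
dT_lgm = -S * eff * 8.0 / F2X + noise_lgm  # LGM cooling

# Regress one observable on the other at fixed S: a nonzero slope
# diagnoses conditional dependence between the two constraints.
slope = np.polyfit(dT_20c, dT_lgm, 1)[0]
print(slope)  # clearly negative here, since both outputs scale with eff
```

Had the two observables shared no uncertain parameter beyond S itself, the fitted slope would be indistinguishable from zero, and the constraints could be treated as conditionally independent given S.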

Such analyses may be impractical for the outputs of small ensembles such as
those arising from the CMIP multi-model experiments which explore structural
uncertainties. However, they may well be plausible for larger ensembles where
parameters are varied within a single model structure. The key requirement is
that the simulations relating to different observables are performed with the
same model in order that any dependence between constraints can be explored.
The results obtained will of course depend on the model used, but this is as
expected: the likelihood is not a property of reality, but rather a
consequence of the modelling assumptions, as was discussed in
Sect.

The question of how to combine multiple constraints on climate sensitivity
has been occasionally raised, but more commonly ignored, in analyses of this
parameter. It is well known that combining constraints should lead to more
confident conclusions, but the difficulty of accounting for possible
dependency appears to have widely discouraged researchers from attempting
this

We have discussed and presented a coherent statistical framework for understanding independence, and explained how this applies in two distinct applications. Climate models cannot sensibly be considered independent estimates of reality, but fortunately this strong assumption is not required in order to make use of them. A more plausible, though still optimistic, assumption might be to interpret the ensemble as merely constituting independent samples of a distribution which represents our collective understanding of the climate system. This assumption is challenged by the near-replication of some climate models within the ensemble, and therefore re-weighting or sub-sampling the ensemble could improve its usefulness. We have shown how the statistical definition of (conditional) independence can apply and how it helps in defining independence in a quantifiable manner. The definition we have presented is certainly not the only possible one and we expect that others may be able to suggest improvements within this framework.

When considering the use of observational evidence in constraining climate system behaviour (including the specific example of the equilibrium climate sensitivity), observational uncertainties themselves can generally be regarded as independent. However, the independence of the resulting likelihood functions is not so immediately clear, as it typically also rests on a number of modelling assumptions and uncertainties. Here we have shown how the question of independence can be readily interpreted and understood in terms of the conditional prediction of observations. These ideas may be useful in the design and analysis of ensemble experiments underpinning the analysis of observational constraints.

While our examples do not provide complete solutions to the questions raised, we have shown how the statistical framework can be usefully applied. Further, we see little prospect for progress to be made unless it is underpinned by a rigorous mathematical framework. Therefore, we hope that other researchers will be able to make use of these ideas in their future work.

The CMIP3 data used in this
paper are available at

Both authors contributed to the research and writing.

The authors declare that they have no conflict of interest.

We acknowledge the modelling groups, the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and the WCRP's Working Group on Coupled Modelling (WGCM) for their roles in making available the WCRP CMIP3 multi-model data set. Support of this data set is provided by the Office of Science, US Department of Energy. We are also particularly grateful for the many helpful suggestions made both by the reviewers and by many participants at a recent meeting at NCAR concerning this topic.

Edited by: F. Sun
Reviewed by: G. Abramowitz, B. Sanderson, and one anonymous referee