Scaling regimes and linear/nonlinear responses of last millennium climate to volcanic and solar forcings

At scales much longer than the deterministic predictability limits (about 10 days), the statistics of the atmosphere undergoes a drastic transition, the high-frequency weather acts as a random forcing on the lowerfrequency macroweather. In addition, up to decadal and centennial scales the equivalent radiative forcings of solar, volcanic and anthropogenic perturbations are small compared to the mean incoming solar flux. This justifies the common practice of reducing forcings to radiative equivalents (which are assumed to combine linearly), as well as the development of linear stochastic models, including for forecasting at monthly to decadal scales. In order to clarify the validity of the linearity assumption and determine its scale range, we use last millennium simulations, with both the simplified Zebiak–Cane (ZC) model and the NASA GISS E2-R fully coupled GCM. We systematically compare the statistical properties of solar-only, volcanic-only and combined solar and volcanic forcings over the range of timescales from 1 to 1000 years. We also compare the statistics to multiproxy temperature reconstructions. The main findings are (a) that the variability in the ZC and GCM models is too weak at centennial and longer scales; (b) for longer than ≈ 50 years, the solar and volcanic forcings combine subadditively (nonlinearly) compounding the weakness of the response; and (c) the models display another nonlinear effect at shorter timescales: their sensitivities are much higher for weak forcing than for strong forcing (their intermittencies are different) and we quantify this with statistical scaling exponents.


Linearity versus nonlinearity
The general circulation model (GCM) approach to climate modelling is based on the idea that whereas weather is an initial value problem, the climate is a boundary value problem (Bryson, 1997;Pielke, 1998).This means that although the weather's sensitive dependence on initial conditions (chaos, the "butterfly effect") leads to a loss of predictability at timescales of about 10 days, averaging over enough "weather" nevertheless leads to a convergence to the model's "climate".This climate is thus the state to which averages of model outputs converge for fixed atmospheric compositions and boundary conditions (i.e.control runs).
The question then arises as to the response of the system to small changes in the boundary conditions: for example, anthropogenic forcings are less than 2 W m −2 and, at least over scales of several years, solar and volcanic forcings are of similar magnitude or smaller (see e.g.Fig. 1a and the quantification in Fig. 2).These numbers are of the order of 1 % of the mean solar radiative flux; thus, we may anticipate that the atmosphere responds fairly linearly.This is indeed that usual assumption, and it justifies the reduction of potentially complex forcings to overall radiative forcings (see Meehl et al., 2004, for GCM investigations at annual scales and Hansen et al., 2005, for greenhouse gases).However, at long enough scales, linearity clearly breaks down; indeed, starting with the celebrated "Daisyworld" model (Watson and Lovelock, 1983), there is a whole body of literature that uses S. Lovejoy and C. Varotsos: Scaling regimes and linear/nonlinear responses of last millennium climate energy balance models to study the strongly nonlinear interactions/feedbacks between global temperatures and albedos.There is no debate that temperature-albedo feedbacks are important at the multimillennial scales of the glacialinterglacial transitions.Whereas some authors (e.g.Roques et al., 2014) use timescales as short as 200 years for the critical ice-albedo feedbacks, others have assumed that the temperature response to solar and volcanic forcings over the last millennium is reasonably linear (e.g.Østvand et al., 2014;Rypdal and Rypdal, 2014), while Pelletier (1998) and Fraedrich et al. (2009) assume linearity to even longer scales.
It is therefore important to establish the timescales over which linear responses are a reasonable assumption.However, clearly even over scales where typical responses to small forcings are relatively linear, the response may be nonlinear if the forcing is volcanic or volcanic-like, i.e. if it is sufficiently "spikey" or intermittent.

Atmospheric variability: scaling regimes
Before turning our attention to models, what can we learn empirically?Certainly, at high enough frequencies (the weather regime), the atmosphere is highly nonlinear.However, at about 10 days, the atmosphere undergoes a drastic transition to a lower-frequency regime, and this "macroweather" regime is potentially quasi-linear in its responses.Indeed, the basic atmospheric scaling regimes were identified some time ago -primarily using spectral analysis (Lovejoy and Schertzer, 1986;Pelletier, 1998;Shackleton and Imbrie, 1990;Huybers and Curry, 2006).However, the use of real space fluctuations provided a clearer picture and a simpler interpretation.It also showed that the usual view of atmospheric variability, as a sequence of narrow-scale range processes (e.g.nonlinear oscillators), has seriously neglected the main source of variability, namely the scaling "background spectrum" (Lovejoy, 2014b).What was found is that, for virtually all atmospheric fields, there was a transition from the behaviour of the mean temperature fluctuations scaling T ( t) ≈ t H with H > 0 to a lower-frequency scaling regime with H < 0 at scales t > ≈ 10 days -the macroweather regime.The transition scale of around 10 days can be theoretically predicted on the basis of the scaling of the turbulent wind due to solar forcing (via the imposed energy rate density; see Lovejoy andSchertzer, 2010, 2013;Lovejoy et al., 2014).Whereas the weather is naturally identified with the high-frequency H > 0 regime and with temperature values "wandering" up and down like in a drunkard's walk, the lower-frequency H < 0 regime is characterized by fluctuations tending to cancel out -effectively starting to converge.This converging regime is a low-frequency type of weather, described as "macroweather" (Lovejoy, 2013;Lovejoy et al., 2014).For the GCM control runs, macroweather effectively continues to asymptotically long times; in the real world, it continues to timescales of 10-30 years (industrial) and 50-100 years (pre-industrial), af-ter which a new H > 0 regime is observed.It is natural to associate this new regime with the climate (see Fig. 5 of Lovejoy et al., 2013; see also Franzke et al., 2013).Other papers analyzing macroweather scaling include Koscielny- Bunde et al. (1998), Eichner et al. (2003), Kantelhardt et al. (2006), Rybski et al. (2006), Bunde et al. (2005), Østvand et al. (2014), Rypdal and Rypdal (2014) and Fredriksen and Rypdal (2016).
The explanation for the "macroweather" to climate transition (at scale τ c ) appears to be that over the "macroweather" timescales -where the fluctuations are "cancelling" -other, slow processes which presumably include both external climate forcings and other slow (internal) land-ice or biogeochemical processes become stronger and stronger.At some point (τ c ) their variability dominates.A significant point where opinions diverge is the value of the global transition scale τ c during the pre-industrial Holocene, as well as the possibility that there are large regional variations in τ c during the Holocene, so that Greenland ice core data may not be globally representative; see Lovejoy (2015a) for a discussion.

Scaling in the numerical models
There have been several studies on the low-frequency control run responses of GCMs (Vyushin et al., 2004;Zhu et al., 2006;Fraedrich et al., 2009;Lovejoy et al., 2013;Fredriksen and Rypdal, 2016); the responses were found to be scaling down to their lowest frequencies.This scaling is a consequence of the absence of a characteristic timescale for the long-time model convergence; it turns out that the relevant scaling exponents are very small: empirically the GCM convergence is "ultra-slow" (Lovejoy et al., 2013) (Sect. 3.4).Most earlier studies have focused on the implications of the long-range statistical dependencies implicit in the scaling statistics.Unfortunately, due to this rather technical focus, the broader implications of the scaling have not been widely appreciated.
More recently, using scaling fluctuation analysis, behaviour has been put into the general theoretical framework of GCM climate modelling (Lovejoy et al., 2013).From the scaling point of view, it appears that the climate arises as a consequence of slow internal climate processes combined with external forcings (especially volcanic and solar, as well as -in the recent period -anthropogenic forcings).From the point of view of the GCMs, the low-frequency (multicentennial) variability arises exclusively as a response to external forcings, although potentially -with the addition of (known or currently unknown) slow processes such as landice or biogeochemical processes -new internal sources of low-frequency variability could be included.Ignoring the recent (industrial) period, and confining ourselves to the last millennium, the key question for GCMs is whether or not they can reproduce the climate regime where the decline of the "macroweather" fluctuations (H < 0) is arrested and the increasing H > 0 climate regime fluctuations begin.In a re-cent publication (Lovejoy et al., 2013), four GCMs simulating the last millennium were statistically analyzed and it was found that their low-frequency variability (especially below (100 yr) −1 ) was somewhat weak, and this was linked to both the weakness of the solar forcings (when using sunspot-based solar reconstructions with H > 0) and -for strong volcanic forcings -the statistical type of the forcing (H < 0; Lovejoy and Schertzer, 2012a;Bothe et al., 2013a, b;Zanchettin et al., 2013; see also Zanchettin et al., 2010, for the dynamics on centennial timescales).

This paper
The weakness of the responses to solar and volcanic forcings at multicentennial scales raises a question of linearity: is the response of the combined (solar plus volcanic) forcing roughly the sum of the individual responses?Additivity is often implicitly assumed when climate forcings are reduced to their equivalent radiative forcings, and Mann et al. (2005) have already pointed out that, in the Zebiak-Cane (ZC) model discussed below, they are not additive.
Here we more precisely analyze this question and quantify the degree of sub-additivity as a function of temporal scale (Sect.3.4).A related linear/nonlinear issue pointed out by Clement et al. (1996) is that, due to the nonlinear model response, there is a high sensitivity to a small forcing and a low sensitivity to a large forcing.Systems in which strong and weak events have different statistical behaviours display stronger or weaker "clustering" and are often termed "intermittent" (from turbulence).When they are also scaling, the weak and strong events are characterized by different scaling exponents that quantify how the respective clustering changes with timescale.In Sect.4, we investigate this quantitatively and confirm that it is particularly strong for volcanic forcing, and that for the ZC model the response (including that of a GCM) is much less intermittent, implying that the model strongly (and nonlinearly) smooths the forcing.
In this paper, we establish analysis methodologies that can address these issues and apply them to model outputs that cover the required range of timescales: last millennium model outputs.Unfortunately -although we consider the NASA GISS E2-R last millennium simulations -there seem to be no full last millennium GCM simulations that have the entire suite of volcanic-only, solar-only and solar plus volcanic forcings and responses; therefore we have used the simplified ZC model outputs published by Mann et al. (2005) (and even this lacked control runs to directly quantify the internal variability).
Although the ZC model lacks several important mechanisms -notably, for our purposes, deep ocean dynamicsthere are clearly sources of low-frequency variability present in the model.For example, Goswami and Shukla (1991), using 360-year control runs, found multidecadal and multicentennial nonlinear variability due to the feedbacks between Sea Surface Temperature (SST) anomalies, low-level conver-gence and atmospheric heating.In addition, in justifying their millennium ZC simulations, Mann et al. (2005) specifically cited model centennial-scale variability as a factor motivating their study.

Discussion
During the pre-industrial part of the last millennium, the atmospheric composition was roughly constant, and the Earth's orbital parameters varied by only a small amount.The main forcings used in GCM climate models over this period are thus solar and volcanic (in the GISS-E2-R simulations discussed below, reconstructed land use changes are also simulated but the corresponding forcings are comparatively weak and will not be discussed further).In particular, the importance of volcanic forcings was demonstrated by Minnis et al. (1993), who investigated the volcanic radiative forcing caused by the 1991 eruption of Mount Pinatubo, and found that volcanic aerosols produced a strong cooling effect.Later, Shindell et al. (2003) used a stratosphere-resolving GCM to examine the effect of the volcanic aerosols and solar irradiance variability on pre-industrial climate change.They found that the best agreement with historical and proxy data was obtained using both forcings.However, solar and volcanic forcings induce different responses because the stratospheric and surface influences in the solar case reinforce one another, but in the volcanic case they are opposed.In addition, there are important differences in solar and volcanic temporal variabilities (including seasonality) that statistically link volcanic eruptions with the onset of El Niño-Southern Oscillation events (Mann et al., 2005).Decreased solar irradiance cools the surface and stratosphere (Cracknell andVarotsos, 2007, 2011;Kondratyev and Varotsos, 1995a, b).In contrast, volcanic eruptions cool the surface, but aerosol heating warms the sunlit lower stratosphere (Shindell et al., 2003;Miller et al., 2012).This leads to an increased meridional gradient in the lower stratosphere but a reduced gradient in the tropopause region (Chandra et al., 1996;Varotsos et al., 1994Varotsos et al., , 2009)).Vyushin et al. (2004) suggested that volcanic forcings improve the low-frequency variability scaling performance of atmosphere-ocean models compared to all other forcings (see, however, the comment by Blender and Fraedrich, 2004, which also discusses earlier papers on the field) and Blender and Fraedrich (2004).Weber (2005) used a set of simulations with a climate model, driven by reconstructed forcings, in order to study the Northern Hemisphere temperature response to volcanic and solar forcing during 1000-1850.It was concluded that the response to solar forcing equilibrates at interdecadal timescales, while the response to volcanic forcing never equilibrates due to the fact that the time interval between volcanic eruptions is typically shorter than the dissipation timescale of the climate system (in fact, they are scal-136 S. Lovejoy and C. Varotsos: Scaling regimes and linear/nonlinear responses of last millennium climate ing, so that eruptions occur over all observed timescales; see below).
At the same time, Mann et al. (2005) investigated the response of El Niño to natural radiative forcing changes during 1000-1999 by employing the ZC model for the coupled ocean-atmosphere system in the tropical Pacific.They found that the composite feedback of the volcanic and solar radiative forcing to past changes, reproduces the fluctuations in the variability in the historic El Niño records (e.g.Efstathiou et al., 2011;Varotsos, 2013;Varotsos et al., 2015a, b).
Finally, as discussed below, Lovejoy and Schertzer (2012a) analyzed the timescale dependence of several solar reconstructions (Lean, 2000;Wang et al., 2005;Krivova et al., 2007;Steinhilber et al., 2009;Shapiro et al., 2011) and the two main volcanic reconstructions (Crowley, 2000, andGao et al., 2008; referred to as "Crowley" and "Gao" in the following).The solar forcings were found to be qualitatively quite different depending on whether the reconstructions were based on sunspots or 10 Be isotopes from ice cores, with the former increasing with timescale and the latter decreasing with timescale.This quantitative and qualitative difference brings into question the reliability of the solar reconstructions.By comparison, the two volcanic reconstructions were both statistically similar in type; they were very strong at annual and sometimes multiannual scales, but they quickly decrease with timescale (H < 0), explaining why they are weak at centennial and millennial scales.We re-examine these findings below.Mann et al. (2005) using the ZC model Mann et al. (2005) used the ZC model of the tropical Pacific coupled ocean-atmosphere system (Zebiak and Cane, 1987) to produce a 100-realization ensemble for solar forcing only, volcanic forcing only and combined forcings over the last millennium.Figure 1a shows the forcings and mean responses of the model which were obtained from ftp://ftp.ncdc.noaa.gov/pub/data/paleo/climate_forcing/mann2005/mann2005.txt.No anthropogenic effects were included.Mann et al. (2005) modelled the region between 30 and −30 • latitude by scaling the Crowley volcanic forcing reconstruction with a geometric factor 1.57 to take the limited range of latitudes into account.Figure 1b shows the corresponding GISS-E2-R simulation responses for three different forcings as discussed in Schmidt et al. (2014) and Lovejoy et al. (2013).Although these were averaged over the Northern Hemisphere land only (a somewhat different geography than the ZC simulations), one can see that the low frequencies seem similar even if the high frequencies are somewhat different.We quantify this below.(2005), integrated over the entire simulation region.The forcings are reconstructed solar (brown), solar blown up by a factor 5 (orange) and volcanic (black).For the solar forcing (top series), note the higher-resolution and wandering character for the recent centuries -this part is based on sunspots, not 10 Be.Bottom graph: the responses are for the solar forcing only (top), volcanic forcing only (middle) and both (bottom); they have been offset in the vertical for clarity by 2.5, 1.5, and 0.5 K, respectively.(b) GISS-ER-2 responses averaged over land (the Northern Hemisphereonly) at annual resolution.The industrial part since 1900 was excluded due to the dominance of the anthropogenic forcings.The solar forcing is the same as for the ZC model and is sunspot-based since 1610.The top row is for the solar forcing only, the middle series is the response to the solar and Crowley reconstructed volcanic forcing series (i.e. the same as used in the ZC model), and the bottom series uses the solar and reconstructed volcanic forcing series from Gao et al. (2008).Each series has been offset in the vertical by 1 K for clarity (these are anomalies, so the absolute temperature values are unimportant).

Comparing simulations with observations as functions of scale
The ultimate goal of weather and climate modelling (including forecasting) is to make simulations T sim (t) as close as possible to observations T obs (t).Ignoring measurement errors and simplifying the discussion by only considering a single spatial location (i.e. a single time series), the goal is to achieve simulations with T sim (t) = T obs (t).However, not only is this very ambitious for the simulations, but even when considering the observations, T obs (t) is often difficult to evaluate, if only because data are often sparse or inadequate in various ways.However, a necessary condition for T sim (t) = T obs (t) is the weaker statistical equality: T sim (t) , where "Pr" indicates "probability").Although T sim (t) d = T obs (t) is only a necessary (but not sufficient) condition for T sim (t) = T obs (t), it is much easier to empirically verify.
Starting in the 1990s, with the advent of ensemble forecasting systems, the rank histogram (RH) method was proposed (Anderson, 1996) as a simple non-parametric test of , and this has led to a large body of literature, including recently Bothe et al. (2013a, b).From our perspective there are two limitations of the RH method.First, it is non-parametric, so that its statistical power is low.More importantly, it essentially tests the equation T sim (t) d = T obs (t) at a single unique timescale/resolution.This is troublesome since the statistics of both T sim (t) and T obs (t) series will depend on their space-time resolutions; averaging in space alters the temporal statistics, so that e.g. 5 • × 5 • data are not only spatially but also effectively temporally smoothed with respect to 1 • × 1 • data.This means that, even if T sim (t) and T obs (t) have nominally the same temporal resolutions, they may easily have different high-frequency variability.Possibly more importantly -as claimed in Lovejoy et al. (2013) and below -the main difference between T sim (t) and T obs (t) may be that the latter has more low-frequency variability than the former, and this will not be captured by the RH technique, which operates only at the highest frequency available.This problem is indirectly acknowledged; see, for example, the discussion of correlations in Marzban et al. (2011).The potential significance of the low frequencies becomes obvious when H > 0 for the low-frequency range.In this case, since the series tends to "wander", small differences in the low frequencies may translate into very large differences in RH, even if the high frequencies are relatively accurate.
A straightforward solution is to use the same basic idea -i.e. to change the sense of equality from deterministic to probabilistic ("=" to " d =") -but compare the statistics systematically over a range of timescales.The simplest way to do this is to check the equality T sim ( t) d = T obs ( t), where T is the fluctuation of the temperature over a time period t (see the discussion in Lovejoy and Schertzer, 2013, Box 11.1).In general, knowledge of the probabilities is equivalent to knowledge of (all) the statistical moments (including the non-integer ones), and for technical reasons it turns out to be easier to check T sim ( t) d = T obs ( t) by considering the statistical moments.

Scaling fluctuation analysis
In order to isolate the variability as a function of timescale t, we estimated the fluctuations F ( t) (forcings, W m −2 ) and T ( t) (responses, K).Although it is traditional (and often adequate) to define fluctuations by absolute differences for our purposes this is not sufficient.Instead, we should use the absolute difference of the means from t to t + t/2 and from t + t/2 to t + t.Technically, the latter corresponds to defining fluctuations using Haar wavelets rather than "poor man's" wavelets (differences).In a scaling regime, the fluctuations vary with the time lag in a power law manner: where ϕ is a controlling dynamical variable (e.g. a dynamical flux) whose mean ϕ is independent of the lag t (i.e.independent of the timescale).This means that the behaviour of the mean fluctuation is < T > ≈ t H , so that when H > 0 fluctuations on average tend to grow with scale, whereas when H < 0 they tend to decrease.Note that the symbol "H " is in honour of Harold Edwin Hurst (Hurst, 1951).Although, in the case of quasi-Gaussian statistics, it is equal to his eponymous exponent, the H used here is valid in the more general multifractal case and is generally different.Fluctuations defined as differences are adequate for fluctuations increasing with scale (H > 0).When H > 0, the rate at which average differences increase with time lag t directly reflects the increasing importance of low frequencies with respect to high frequencies.However, in physical systems the differences tend to increase even when H < 0. This is because correlations T (t + t)T (t) tend to decrease with the time lag t and this directly implies that the mean square differences ( T ( t) 2 ) increase (mathematically, for a stationary process, . This means that when H < 0, differences cannot correctly characterize the fluctuations.For H < 0 the highfrequency details dominate the differences and prevent these differences from decreasing with increasing scale t. The Haar fluctuation, which is useful for −1 < H < 1, is particularly easy to understand since, with proper "calibration" in regions where H > 0, its value can be made to be very close to the difference fluctuation, while in regions where H < 0, it can be made close to another simple to interpret "anomaly fluctuation".The latter is simply the temporal average of the series over a duration t of the series with its overall mean removed (in Lovejoy and Schertzer, 2012b, this was termed a "tendency" fluctuation which is a less intuitive term).In this case, the decrease in the Haar fluctuations for increasing lag t characterizes how effectively averaging a (mean zero) process (the anomaly) over longer timescales reduces its variability.Here, the calibration is affected by multiplying the raw Haar fluctuation by a factor of 2, which brings the values of the Haar fluctuations very close to both the corresponding difference and anomaly fluctuations (over timescales with H > 0 and H < 0, respectively).This means that in regions where H > 0, to good accuracy, the Haar fluctuations can be treated as differences, whereas in regions where H < 0 they can be treated as anomalies.While other techniques such as detrended fluctuation analysis (Peng et al., 1994) perform just as well for determining exponents, they have the disadvantage that their fluctuations are not at all easy to interpret (they are the standard deviations of the residues of polynomial regressions on the running sum of the original series).Indeed, the DFA fluctuation function is typically presented without any units.
Once estimated, the variation in the fluctuations with timescale can be quantified by using the fluctuations' statistics; the qth order structure function S q ( t) is particularly convenient: where " " indicates ensemble averaging (here, we average over all disjoint intervals of length t).Note that although q can in principle be any value, here we restrict it to q > 0 since divergences may occur -indeed for multifractals, divergences are expected -for any q < 0. In a scaling regime, S q ( t) is a power law: where the exponent ξ (q) has a linear part qH and a generally nonlinear and convex part K(q) with K(1) = 0. K(q) characterizes the strong non-Gaussian, multifractal variability: the "intermittency".Gaussian processes have K(q) = 0.The root-mean-square (RMS) variation S 2 ( t) 1/2 (denoted simply It is only when the intermittency is small (K(q) ≈ 0) that we have ξ (2)/2 ≈ H = ξ (1).Note that since the spectrum is a second-order statistic, we have the useful exact relationship for the exponent β of the power law spectra: (this is a corollary of the Wiener-Khintchin theorem).Again, only when K(2) is small do we have the commonly used relation β ≈ 1 + 2H ; in this special case, H > 0, H < 0 corresponds to β > 1, β < 1.To get an idea of the implications of the nonlinear K(q), note that a high q value characterizes the scaling of the strong events, whereas a low q characterizes the scaling of the weak events (q is not restricted to integer values).The scalings are different whenever the strong and weak events cluster to different degrees; the clustering, in turn, is precisely determined by another exponent -the codimension -which itself is uniquely determined by K(q).We return to the phenomenon of "intermittency" in Sect.4; it is particularly pronounced in the case of volcanic forcings.
Figure 2a shows the result of estimating the Haar fluctuations for the solar and volcanic forcings.The solar reconstruction that was used is a hybrid obtained by "splicing" the annual resolution sunspot-based reconstruction (Fig. 2b, top; back to 1610, although only the more recent part was used by Mann et al., 2005) with a 10 Be-based reconstruction (Fig. 2b, bottom) at much lower resolution (≈ 40-50 years).In Fig. 2a, the two rightmost curves are for two different 10 Be reconstructions; at any given timescale, their amplitudes differ by nearly a factor of 10, yet they both have Haar fluctuations that diminish with scale (H ≈ −0.3). Figure 2b (top) clearly shows the qualitative difference with "wandering" (H > 0, sunspot-based) and Fig. 2b (bottom), the cancelling (H < 0, 10 Be-based) solar reconstructions (Lovejoy and Schertzer, 2012a).In the "spliced" reconstruction used here, the early 10 Be part (1000-1610) at low resolution was interpolated to annual resolution; the interpolation was close to linear, so that we find H ≈ 1 over the scale range 1-50 years, with the H < 0 part barely visible over the range of 100-600 years (roughly the length of the 10 Be part of the reconstruction).
The reference lines in Fig. 2a have slopes −0.4,−0.3, and 0.4, showing that both solar and volcanic forcings are fairly accurately scaling (although, because of the "splicing" for the solar, only up until ≈ 200-300 years) but with exactly opposite behaviours: whereas the solar fluctuations increase with timescale, the volcanic fluctuations decrease with scale.For timescales beyond 200-300 years, the solar forcing is stronger than the volcanic forcing (they "cross" at roughly 0.3 W m −2 ).

Linearity and nonlinearity
There is no question that -at least in the usual deterministic sense -the atmosphere is turbulent and nonlinear.Indeed, the ratio of the nonlinear to the linear terms in the dynamical equations -the Reynolds number -is typically about 10 12 .Due to the smaller range of scales, in the numerical models it is much lower, but it is still ≈ 10 3 to 10 4 .Indeed, it turns out that the variability builds up scale by scale from large to small scales, so that -since the dissipation scale is about 10 −3 m -the resulting (millimetre-scale) variability can be enormous; the statistics of this buildup are quite accurately modelled by multifractal cascades (see the review by Lovejoy and Schertzer, 2013, especially Ch. 4 for cascade analyses of data and model outputs).The cascade-based Fractionally Integrated Flux model (FIF; Schertzer and Lovejoy, 1987) is a nonlinear stochastic model of the weather scale dynamics, and can be extended to provide nonlinear stochastic models of the macroweather and climate regimes (Lovejoy and Schertzer, 2013, Ch. 10).The RMS Haar fluctuation S( t) for the solar and volcanic reconstructions used in the ZC simulation for lags t from 2 to 1000 years (left).The solar is a "hybrid" obtained by "splicing" the sunspot-based reconstruction (b, top) with a 10 Be-based reconstruction (b, bottom).The two rightmost curves are for two different 10 Be reconstructions (Shapiro et al., 2011;Steinhilber et al., 2009).Although at any given scale, their different assumptions lead to amplitudes differing by nearly a factor of 10, their exponents are virtually identical and the amplitudes diminish rapidly with scale.(b) A comparison of the sunspot derived total solar irradiance (TSI) anomaly (top, used in the ZC and GISS simulations back to 1610, H ≈ 0.4) with a recent 10 Be reconstruction (bottom, total TSImean plus anomaly -since 7362 BC; see (a) for a fluctuation analysis, H ≈ −0.3) similar to that "spliced" onto the sunspot reconstruction for the period 1000-1610.We can see that the statistical characteristics are totally different with the sunspot variations "wandering" (H > 0), whereas the 10 Be reconstruction is "cancelling" (H < 0).The sunspot data were for the "background" (i.e. with no 11-year cycle; see Wang et al., 2005, for details); the data for the 10 Be curve were from Shapiro et al. (2011).
However, ever since Hasselmann (1976), it has been proposed that sufficiently space-time-averaged variables may respond linearly to sufficiently space-time-averaged forcings.In the resulting (low-frequency) phenomenological models, the nonlinear deterministic (high-frequency) dynamics act as a source of random perturbations; the resulting stochastic model is usually taken as being linear.Such models are only justified if there is a physical-scale separation between the high-and low-frequency processes.The existence of a relevant break (at 2-10-day scales) has been known since Panofsky and Van der Hoven (1955) and was variously theorized as the "scale of migratory pressure systems of synoptic weather map scale" (Van der Hoven, 1957) and later as the "synoptic maximum" (Kolesnikov and Monin, 1965).From the point of view of Hasselman-type linear stochastic modelling (now often referred to as "linear inverse modelling" (LIM), e.g.Penland and Sardeshmuhk, 1995;Newman et al., 2003;Sardeshmukh and Sura, 2009), the system is regarded as a multivariate Ornstein-Uhlenbeck (OU) process.At high frequencies, an OU process is essentially the integral of a white noise (with spectrum ω −β h with β h = 2), whereas at low frequencies it is a white noise (i.e.ω −β l with β l = 0).In the LIMs, these regimes correspond to the weather and macroweather, respectively.Recently, Newman (2013) showed that predictive skill for global temperature hindcasts is somewhat superior to GCMs for 1-2-year horizons.
In the more general scaling picture going back to Lovejoy and Schertzer (1986), the transition corresponds to the lifetime of planetary structures.This interpretation was quantitatively justified in Lovejoy and Schertzer (2010) by using the turbulent energy rate density.The low-and high-frequency regimes were scaling and had spectra significantly different than those of OU processes (notably with 0.2 < β l < 0.8), with the two regimes now being referred to as "weather" and "macroweather" (Lovejoy and Schertzer, 2013).Indeed, the main difference with respect to the classical LIM is at low frequencies.Although the difference in β l may not seem so important, the LIM value β l = 0 (white noise) has no low-frequency predictability whereas the actual values 0.2 < β l < 0.8 (depending mostly on the land or ocean location) corresponds to potentially huge predictability (the latter can diverge as β l approaches 1).The new "ScaLIng Macroweather Model" (SLIMM) has been proposed as a set of fractional-order (but still linear) stochastic differential equations with predictive skill for global mean temperatures out to at least 10 years (Lovejoy et al., 2015;Lovejoy, 2015b).However, irrespective of the exact statistical nature of the weather and macroweather regimes, a linear stochastic model may still be a valid approximation over significant ranges.
These linear stochastic models (whether LIM or SLIMM) explicitly exploit the weather/macroweather transition and may have some skill up to macroweather scales perhaps as large as decades.However, at long enough timescales, another class of phenomenological model is often used, S. Lovejoy and C. Varotsos: Scaling regimes and linear/nonlinear responses of last millennium climate wherein the dynamics are determined by radiative energy balances.Energy balance models focus on slower (true) climate scale processes such as sea ice-albedo feedbacks and are generally quite nonlinear, being associated with nonlinear features such as tipping points and bifurcations (Budyko, 1969).These models are typically zero-or one-dimensional in space (i.e. they are averaged over the whole Earth or over latitude bands) and may be deterministic or stochastic (see Nicolis, 1988, for an early comparison of the two approaches).See Dijkstra (2013) for a survey of the classical deterministic dynamical systems approach as well as the more recent stochastic "random dynamical systems" approach (see also Ragone et al., 2014).Although energy balance models are almost always nonlinear, there have been several suggestions that linear energy balance models are in fact valid up to millennial and even multimillennial scales.
Finally, we could mention the existence of empirical evidence of stochastic linearity between forcings and responses in the macroweather regime.Such evidence comes, for example, from the apparent ability of linear regressions to "remove" the effects of volcanic, solar and anthropogenic forcings (Lean and Rind, 2008).This has perhaps been quantitatively demonstrated in the case of anthropogenic forcing, where use is made of the globally, annually averaged CO 2 radiative forcings (as a linear surrogate for all anthropogenic forcings).When this radiative forcing was regressed against similarly averaged temperatures, it gave residues with amplitudes ±0.109K (Lovejoy, 2014a), which is almost exactly the same as GCM estimates of the natural variability (e.g.Laepple et al., 2008).Notice that in this case the identification of the global temperature T globe as the sum of a regression determined anthropogenic component (T anth ) with residues as natural variability (T nat ) is in fact only a confirmation of stochastic linearity (i.e.T globe d = T anth + T nat ).This is because the actual residues would presumably have been different if there had been no anthropogenic forcing.Indeed, when the residues were analyzed using fluctuation analysis, it was only their statistics that were close to the pre-industrial multiproxy statistics.

Testing linearity: the additivity of the responses
We can now test the linearity of the model responses to solar and volcanic forcings.First, consider the model responses (Fig. 3a).Compare the response to the volcanic-only forcing (green) curve with the response from the solar-only forcing (black).As expected from Fig. 2a, the former is stronger than the latter up until centennial scales, reflecting the stronger volcanic forcing.At scales t ≈ > 100 years, however, we see that the solar-only forcing has a stronger response, also as expected from Fig. 2a.Now consider the response to the combined volcanic and solar forcing (brown).Unsurprisingly, it is very close to the volcanic-only forcing until t ≈ 100 years; however, at longer timescales, the combined response seems to decrease following the volcanic forcing curve; it seems that at these longer timescales the volcanic and solar forcings have negative feedbacks, so that the combined response to solar plus volcanic forcing is actually less than for pure solar forcing -that is, they are "subadditive".
In order to quantify this we can easily determine the expected solar and volcanic response if the two were combined additively (linearly).In the latter case, the solar and volcanic fluctuations would not interfere with each other, and since these forcings are statistically independent, the responses would also be statistically independent, the response variances would add.
A linear response means that temperature fluctuations due to only solar forcing ( T s ( t)) and only volcanic forcing ( T ν ( t)) would be related to the temperature fluctuations of the response to the combined solar plus volcanic forcings ( T s,ν ( t)) as (4) This is true regardless of the exact definition of the fluctuation: as long as the fluctuation is defined by a linear operation on the temperature series, any wavelet will do.Therefore, squaring both sides and averaging (" ") and assuming that the fluctuations in the solar and volcanic forcings are statistically independent of each other (i.e.T s ( t) T ν ( t) = 0), we obtain The implied additive response structure function S( t) = ( T s ( t) 2 + T ν ( t) 2 ) 1/2 is shown in Fig. 3b along with the ratio of the latter to the actual (nonlinear) solar plus volcanic response (top: ( T s ( t) 2 + T ν ( t) 2 ) 1/2 / T s,ν ( t) 2 1/2 ).It can be seen that the ratio is fairly close to unity for timescales below about 50 years.However, beyond 50 years there is indeed a strong negative feedback between the solar and volcanic forcings.This is seen more clearly in Fig. 3c, which shows that, at t ≈ 400 years, the negative feedback is strong enough to reduce the theoretical additive fluctuation amplitudes by a factor of ≈ 2 (the fall-off at the largest t is probably an artefact of the poor statistics at these scales).It should be noted that, in addition to linearity, the latter holds assuming statistical independence (top curve in Fig. 3c) of the solar and volcanic forcing.For comparison, the bottom curve in Fig. 3c illustrates the results obtained when analyzing the series constructed by directly summing the two response series (instead of assuming statistical independence).It is clearly seen that the basic result still holds but it is a little less strong (a factor of ≈ 1.5).The reason for the difference is that the cancellation of the cross terms assumed by statistical independence is only approximately valid on single realizations, especially at the lower frequencies, where the statistics are worse (even on a single realization, at any given scale -except the very longest -there are several fluctuations, so that there is still some averaging).Wang et al., 2005), and both (brown).No anthropogenic effects were modelled.Also shown for reference are the fluctuations for three multiproxy series (blue, dashed, from 1500 to 1900, pre-industrial; the fluctuations statistics from the three series were averaged, and this curve was taken from Lovejoy and Schertzer, 2012b).We see that the combined volcanic and solar response of the model reproduces the statistics until scales of ≈ 50-100 years; however, at longer timescales, the model fluctuations are much too weak -roughly 0.1 K (corresponding to ±0.05 K) and constant or falling -whereas at 400-year scales, the temperature fluctuations are ≈ 0.25 K (±0.125) and rising.(b) A comparison of the RMS fluctuations of the ZC model response to combined solar and volcanic forcings (brown, bottom, from a), with the theoretical additive responses (black, bottom) as well as their ratio (S additive /S actual black, top).The additive response was determined from the RMS of the solar-only and volcaniconly response variances (from a): additivity implies that the fluctuation variances add (assuming that the solar and volcanic forcings are statistically independent).We can see that, after about 50-200 years, there are strong negative feedbacks, and the solar and volcanic forcings are subadditive; see (c) for an enlargement of the ratio.(c) An enlarged view of the ratio of the linear to nonlinear responses (from b).The top curve assumes for the combined forcing, the linearity of the response and statistical independence of the solar and volcanic forcings, whereas the bottom curve uses the actual response to the combined forcings.The maximum at around 400 years (top curve) corresponds to a factor ≈ 2 (≈ 1.5, bottom curve) of negative feedback between the solar and volcanic forcings.The decline at longer durations ( t's) -especially the single 1000-year fluctuation -is likely to be an artefact of the limited statistics at these scales.
The calculations above ignored the model's internal variability; this was considered small due to the averaging over 100 realizations of the ZC model with the same forcings: the internal variability is expected to largely cancel out.While it is true that a definitive answer to this requires running the model in "control mode" so as to capture only the internal variability (as was done in for the GISS model; see Fig. 4), there are nevertheless several reasons why the internal variability is almost certainly smaller than the response due to the forcings: i.We can get a typical order of magnitude of the internal variability from the GISS model, Fig. 4; we see that for a single realization -without averaging over 100 realizations as in Fig. 3a -the typical centennial variability is ≈ ±0.05 K and decreasing with a power law with exponent ≈ ξ (2)/2 ≈ −0.2.After averaging for 100 realizations, we expect this to decrease by (100) 0.5 = 10, i.e. to ±0.005 K.This is much smaller than the centennialscale variability in the ZC responses in Fig. 3a (from the graph, these are about ≈ ±(10 −1.2 )/2 ≈ ±0.03 K).
ii.We can use the fact that (a) the observed responses are upper bounds on the internal variability and (b) that the internal variability must decrease with scale (otherwise the model's climate diverges rather than converges for long times).Exponents near the GISS value ξ (2)/2 ≈ −0.2 are common; see e.g.Lovejoy et al. (2013).From Fig. 2, we see that the ZC solar response at ≈ 20 years is ±0.03K, so this is an upper bound for the internal variability at all scales longer than ≈ 20 years.However, over the range ≈ 50-500 years (relevant for the subadditivity conclusion), the solar response variability is considerably larger than this noise value: from the graph, ≈ ±(10 −0.8 )/2 ≈ ±0.08 K.
We conclude that it is unlikely that the internal variability is strong enough to account for the results.
In the ZC model, all forcings are input at the surface so that here the subadditivity is due to the differing seasonality, fluctuation intensities and spatial distributions of the solar and volcanic forcings.In the GISS-E2-R GCM simulations, the response to the solar forcing is too small to allow us to determine whether it involves a similar solar-volcanic negative feedback (Fig. 4).In vertically stratified atmospheres, i.e. in GCMs or in the real atmosphere, non-additivity is perhaps not surprising given the difference between the solar and volcanic vertical heating profiles.If such negative feedbacks are substantiated in further simulations, it would enhance the credibility of the idea that current GCMs are missing critical slow (multi-centennial, multi-millennial) climate processes.
No matter what the exact explanation, non-additivity underlines the limitations of the convenient reduction of climate forcings to radiative forcing equivalents.It also indicates that, at scales longer than about 50-200 years, energy budget models must nonlinearly account for albedo-temperature interactions (i.e. that linear energy budget models are inadequate at these timescales, and that albedo-temperature interactions must at least be correctly parametrized).Also shown for reference in Fig. 3a are the fluctuations for three multiproxy estimates of annual Northern Hemisphere temperatures (1500-1900, pre-industrial;Moberg et al., 2005;Huang, 2004;Ljungqvist, 2010; analysis taken from Lovejoy and Schertzer, 2012c).Although it should be borne in mind that the ZC model region (the Pacific) does not coincide with the proxy region (the Northern Hemisphere), the latter is the best model validation available.In addition, since we compare model and proxy fluctuation statistics as functions of timescale, the fact that the spatial regions are somewhat different is less important than if we had attempted a direct year-by-year comparison of model outputs with the multiproxy reconstructions.
In Fig. 3a, we see that the responses of the volcanic-only and the combined volcanic and solar forcings fairly well reproduce the RMS multiproxy statistics until ≈ 50 years; however, at longer timescales, the model fluctuations are substantially too weak -roughly 0.1 K (corresponding to ±0.05 K) and constant or falling, whereas at 400-year scales, the RMS multiproxy temperature fluctuations are ≈ 0.25 K (±0.125) and rising.Indeed, in order to account for the ice ages, they must continue to rise until ≈ 5 K (±2.5 K) at glacialinterglacial scales of 50-100 kyr (the "glacial-interglacial window": according to palaeodata, this rise continues in a smooth, power law manner with H > 0 until roughly 100 kyr; see Lovejoy and Schertzer, 1986;Shackleton and Imbrie, 1990;Pelletier, 1998;Schmitt et al., 1995;Ashkenazy et al., 2003;Huybers and Curry, 2006;Lovejoy et al., 2013).
In Fig. 4, we compare the RMS Haar fluctuations from the ZC model combined (volcanic and solar forcing) response Earth Syst.Dynam., 7, 133-150, 2016 www.earth-syst-dynam.net/7/133/2016/ with those from simulations from the GISS-E2-R GCM with solar-only forcing and a control run (no forcings, black; see Lovejoy et al., 2013, for details; the GISS-E2-R solar forcing was the same as the spliced series used in the ZC simulations).We see that the three are remarkably close over the entire range; for the GISS model, this indicates that the solar-only forcing is so small that the response is nearly the same as for the unforced (control) run.The ZC combined solar and volcanic forcing is clearly much weaker than the pre-industrial multiproxies (dashed blue, same as in Fig. 3a).
The reference line with slope −0.2 shows the convergence of the control to the model climate; the shallowness of the slope (−0.2) implies that the convergence is ultra-slow.For example, fluctuations from a 10-year run control run are only reduced by a factor of (10/3000) −0.2 ≈ 3 if the run is extended to 3 kyr.
Finally, in Fig. 5, we compare the responses to the volcanic forcings for the ZC model and for the GISS-E2-R GCM for two different volcanic reconstructions (Gao et al., 2008;Crowley, 2000; the latter reconstruction was used in the ZC simulation.For reference, we again show the combined ZC response and the pre-industrial multiproxies.We see that the GISS GCM is much more sensitive to the volcanic forcing than the ZC model; indeed, it is too sensitive at scales t < ≈ 100, but nevertheless becomes too weak at scales t ≈ > 200 years.Indeed, since the volcanic forcings continue to decrease with scale, we expect the responses to keep diminishing with scale at larger t. Note that for the spatial regions covered by the ZC simulation, the GISS outputs and the multiproxy reconstructions are not the same.For the latter, the reason is that there is no perfectly appropriate (regionally defined) multiproxy series, whereas for the GISS outputs we reproduced the structure function analysis from a published source.Yet, the differences in the regions may not be so important since we are only making statistical comparisons.This is especially true since all the series are for planetary-scale temperatures (even if they are not identical global-sized regions) and, in addition, we are mostly interested in the 50-year (and longer) statistics, and these may be quite similar.

The trace moment analysis technique
In the previous sections we considered the implications of linearity when climate models were forced separately with two different forcings compared with the response to the combined forcing; we showed that the ZC model was subadditive.However, linearity also constrains the relation between the fluctuations in the forcings and the responses.For example, at least since the work of Clement et al. (1996), in the context of volcanic eruptions, it has been recognized that the models are typically sensitive to weak forcing events but A comparison of the volcanic forcings for the ZC model (bottom, green) and for the GISS-E2-R GCM for two different volcanic reconstructions (Gao et al., 2008;Crowley, 2000) (top green curves, reproduced from Lovejoy et al., 2013).Also shown is the combined response (ZC, brown) and the pre-industrial multiproxies (dashed blue).
insensitive to strong ones (i.e. they are nonlinear), and Mann et al. (2005) noticed this in their ZC simulations.In a scaling regime, both forcings and responses will be characterized by a hierarchy of exponents (i.e. the function ξ (q) in Eq. (3) or equivalently by the exponent H and the function K(q)), the differences in the statistics of weak and strong events are reflected in these different exponents; highorder moments (large q) are dominated by large fluctuations, and vice versa for low-order moments.The degree of convexity of K(q) quantifies the degree of these nonlinear effects (indeed, how they vary over timescales t).Such "intermittent" behaviour was first studied in the context of turbulence (Kolmogorov, 1962;Mandelbrot, 1974).
In order to quantify this, recall that if the system is linear, the response is a convolution of the system Green's function with the forcing; in spectral terms it acts as a filter.If it is also scaling, then the filter is a power law: ω −H , where ω is the frequency (mathematically, if T (ω) and F (ω) are the Fourier transforms of the response and forcing, for a scaling linear system, we have T (ω) ∝ ω −H F (ω); such a filter corresponds to a fractional integration of order H ). In terms of fluctuations this implies T ( t) = t H F ( t) (assuming that the fluctuations are appropriately defined).Therefore, by taking qth powers of both sides and ensemble averaging, we see that in linear scaling systems we have ξ T (q) = q H + ξ F (q) (compare Eq. 3 with ξ T (q) and ξ F (q), the structure function exponents for the response and the forcing, respectively).If ξ T (q) and ξ F (q) only differ by a term linear in q, then K T (q) = K F (q), so that if we empirically find K T (q) = K F (q) (i.e. the intermittencies are different) over some regime, then we may conclude that the system www.earth-syst-dynam.net/7/133/2016/Earth Syst.Dynam., 7, 133-150, 2016 is nonlinear (note that this result is independent of whether the linearity is deterministic or only statistical in nature).
Let us investigate the nonlinearity of the exponents by returning to Eqs. ( 1)-(3) in more detail.Up until now we have studied the statistical properties of the forcings and responses using the RMS fluctuations; for example, we have used the following equation but only for the value q = 2: (See Eq. 1.) The exponent K(q) (implicitly defined in Eq. 3) is given explicitly by where τ eff is the effective outer scale of the multifractal cascade process, ϕ gives rise to the strong variability and λ is the cascade ratio from this outer scale to the scale of interest t.
If the driving flux ϕ was quasi-Gaussian, then K(q) = 0, ξ (q) = q H and the exponent ξ (2) = 2H = β −1 would be sufficient for a complete characterization of the statistics.However, geophysical series are often far from Gaussian; even without statistical analysis, a visual inspection (the "sharp spike" of varying amplitudes; see Fig. 1a) of the volcanic series makes it obvious that it is particularly extreme in this regard.We expect -at least in this case -that the K(q) term will readily be quite large (although note the constraint K(1) = 0 and the mean of ϕ (the q = 1 statistic) is independent of scale).To characterize this, note that since K(1) = 0, we have ξ (1) = H and then use the first two derivatives of ξ (q) at q = 1 to estimate the tangent (linear approximation) to K(q) near the mean (C 1 ) and the curvature of K(q) near the mean characterized by α.This gives The parameters C 1 and α are particularly convenient since -thanks to a kind of multiplicative central limit theoremthere exist multifractal universality classes (Schertzer and Lovejoy, 1987).For such universal multifractal processes, the exponent function K(q) can be entirely (i.e.not only near q = 1) characterized by the same two parameters: In the universality case (Eq.9), it can be checked that the estimate in Eq. ( 8) (near the mean) is satisfied, so that C 1 and α characterize all the statistical moments (actually, Eqs.6 and 7 are only valid for q < q c ; for q > q c , the above will break down due to multifractal phase transitions; the critical q c is typically > 2, so that here we confine our analyses to q ≤ 2 and do not discuss the corresponding extreme -large q -behaviour).
A drawback of the above fluctuation method for using ξ (q) to estimate K(q) (Eq. 6) is that if C 1 is not too big, then for the low-order moments q, the exponent ξ (q) may be dominated by the linear (q H ) term, so that the multifractal part (K(q)) of the scaling is not too apparent.A simple way of directly studying K(q) is to transform the original series so as to estimate the flux ϕ at a small scale, essentially removing the (q H ) part of the exponent.It can then be degraded by temporal averaging and the scaling of the various statistical moments -the exponents K(q) -can be estimated directly.
To do this, we divide Eq. ( 1) by its ensemble average so as to estimate the normalized flux at the highest resolution by where the ensemble average (" ") is estimated by averaging over the available data (here a single series), and the fluctuations t are estimated at the finest resolution (here 1 year).

Trace moment analysis of forcings, responses and multiproxies
We now test Eq.( 7); for convenience, we use the symbol λ as the ratio of a convenient reference scale -here the length of the series, τ ref = 1000 years, to the resolution scale t (for some analyses, 400 years was used instead; see the captions in Fig. 6).In an empirical study, the outer scale τ eff is not known a priori and must be empirically estimated; denote the scale at which the cascade starts by λ .Starting with Eq. ( 7), the basic prediction of multiplicative cascades is that the normalized moments ϕ (Eq.10) obey the generic multiscaling relation: We can see that τ eff can readily be empirically estimated since a plot of log 10 M versus log 10 λ will have lines (one for each q, slope K(q)) converging at the outer scale λ = λ eff (although, for a single realization such as here, the outer scale will be poorly estimated.This is because for a single sample (series) there is clearly no variability at the longest timescales; there is a single long-term value that generally poorly represents the ensemble mean).Figure 6a shows the results when T is estimated by the absolute second difference at the finest resolution.The solar forcing (upper right) was only shown for the recent period (1600-2000) over which the higher-resolution sunspot-based reconstruction was used; the earlier 1000-1600 part was based on a (too) low-resolution 10 Be "splice" as discussed above (see Fig. 2b).In the solar plot (upper left), but especially in the volcanic forcing plot (upper right), we see that the scaling is excellent over nearly the entire range (the points are nearly  5 5 5  2.5 2 5  2. 2.5 2. 2.5 5  2.5 2.5 5  2 5 5 5 5 5 5  1.5  0.5  2.5 5 5 5  2 5 5 5  2 5 5 5 5  2 5 5  2 5 5 5 5  2 5 5  2 5 5 5 5 5 5  1 (Eq. 11) are plotted for q = 2, 1.9, 1.8, 1.7, 1.6, . . .0.1.Upper left is solar forcing (last 400 years only, mostly sunspot-based), upper right is volcanic, middle left is solar response (last 400 years), and middle right (volcanic response) and lower left are response to combined forcings (last 1000 years).Note that all axes are the same except for volcanic.For solar, only the last 400 years were used since this was reconstructed using the more reliable sunspot-based method.The earlier 10 Be-based reconstruction had relatively poor resolution and is not shown.Since the volcanic variability was so dominant, for the combined response (bottom left) the entire series was used.The red points and lines are the empirical values, and the blue lines are regressions constrained to go through a single outer scale point (see Eq. 11).In comparing the different parts of the figure, note in particular (i) the log-log linearity for different statistical moments, (ii) the fact that the lines for different moments reasonably cross at a single outer scale, and (iii) the overall amplitude of the fluctuationsfor example by visually comparing the range of the q = 2 moments (the top series) as we move from one graph to another.Lovejoy and Schertzer (2013), where full details and references are given.All were for the pre-industrial period AD 1500-1900; λ = 1 corresponds to 400 years.The curve shows the generic convergence of the envelope of curves to a quasi-Gaussian process; the proximity of the curve to the envelope indicates that, with the possible exception of the Mann curve, the intermittency is low.  1 shows the scaling exponent estimates for the forcings and ZC model responses.For solar (forcing and response), only the recent 400-year (sunspot-based) series were used; for the others, the entire 1000-year range was used (see Fig. 6a).The RMS exponent was estimated from Eqs. ( 6) and ( 9): H was estimated from the Haar fluctuations, and α and C 1 were estimated from the trace moments (Fig. 6a).Note that the external cascade scales are unreliable since they were estimated from a single realization.The control runs on the right are for the GISS-E2-R model discussed in the text and (ECHAM5) from the fully coupled COSMOS-ASOB millennium long-term simulations based on the Hamburg ECHAM5 model for AD 800-4000.
linear) and, in addition, the lines plausibly "point" (i.e.cross) at a unique outer scale λ = λ eff , which is not far from the length of the series; see Table 1 for estimates of the corresponding timescales.From these plots we see that the responses to the volcanic forcing "spikiness" (intermittency) are much stronger than to the corresponding responses to the weaker solar "spikiness".The model atmosphere therefore considerably dampens the intermittency, but in addition this effect is highly nonlinear, so that the intermittency of the combined volcanic and solar forcing (bottom left) is actually a little less than the volcanic-only intermittency (bottom right).Table 1 gives a quantitative characterization of the intermittency strength near the mean, using the C 1 parameter.
It is interesting at this stage to compare the intermittency of the ZC outputs with those of the GISS-E2-R GCM (Fig. 6b) and with multiproxy temperature reconstructions (Fig. 6c).In Fig. 6b, we see that the GISS-E2-R trace moments rapidly die off at large scales (small λ), so that the intermittency is limited to small scales to the right of the convergence point.In this figure, we see that the lines converge at log 10 λ ≈ 1.1-1.5, corresponding to τ eff in the range roughly 10-30 years.Since the intermittency builds up scale by scale from large scales modulating smaller scales in a hierarchical manner, and since this range of scales is small, the intermittency will be small.The partial exception is for the upper right plot, which is for the GISS-E2-R response to the large Gao volcanic forcing (recall that the ZC model uses the weaker, Crowley volcanic reconstruction, whose response is strongly intermittent; see Fig. 6b, the upper left plot).This result shows that, contrary to the ZC model, whose response is strongly intermittent (highly non-Gaussian) over most of the range of timescales, the GISS-E2-R response is nearly Gaussian, implying that the (highly non-Gaussian) forcings are quite heavily (nonlinearly) damped.
This difference in the model responses to the forcing intermittency is already interesting, but it does not settle the question as to which model is more realistic.To attempt to answer this question, we turn to Fig. 6c, which shows the trace moment analysis for six multiproxy temperature reconstructions over the same (pre-industrial) period as the GISS-E2-R model (1500-1900; unlike the ZC model, the GISS-E2-R included anthropogenic forcings, so that the period since 1900 was not used in the GISS-E2-R analysis).Statistical comparisons of nine multiproxies were made in Ch. 11 of Lovejoy and Schertzer (2013) (for reasons of space, only six of these are shown in Fig. 6c), where it was found that the pre-2003 multiproxies had significantly smaller multicentennial and lower-frequency variability than the more recent multiproxies used as reference in Figs. 4 and 5.However, Fig. 6c shows that the intermittencies are all quite low (with the partial exception of the Mann series; see the upper right plot).This conclusion is supported by the comparison with the red curves.These curves indicate the generic envelope of trace moments of quasi-Gaussian processes; for q ≤ 2 it shows how the latter converge (at large scales, small λ, to the left) to the flat (K(q) = 0) Gaussian limit.We see that the actual lines are only slightly outside this envelope showing that they are only marginally more variable than quasi-Gaussian processes.
The comparison of the GISS-E2-R outputs (Fig. 6b) with the multiproxies (Fig. 6c) indicates that they are both of low intermittency and are more similar to each other than to the ZC multiproxy statistics.One is therefore tempted to conclude that the GISS-E2-R model is more realistic than the ZC model with its much stronger intermittency.However, this conclusion may be premature since the low multiproxy and GISS intermittencies may be due to limitations of both the multiproxies and the GISS-E2-R model.Multicentennial-and multimillennial-scale ice core analyses display significant palaeotemperature intermittency (C 1 ≈ 0.05-0.1;Schmitt et al., 1995; see the discussion in Ch. 11 of Lovejoy and Schertzer, 2013), so that the multiproxies may be insufficiently intermittent.

Conclusions
From the point of view of GCMs, climate change is a consequence of changing boundary conditions (including composition); the latter are the climate forcings.Since forcings of interest (such as anthropogenic forcings) are typically of the order of 1 % of the mean solar input, the responses are plausibly linear.This justifies the reduction of the forcings to a convenient common denominator: the "equivalent radiative forcing" -a concept which is useful only if different forcings add linearly, if they are "additive".An additional consequence of linearity is that the climate sensitivities are independent of whether the fluctuations in the forcings are weak or strong.Both consequences of linearity clearly have their limits.For example, at millennial and longer scales, energy balance models commonly discard linearity altogether and assume that nonlinear albedo responses to orbital changes are dominant.Similarly, at monthly and annual scales, the linearity of the climate sensitivity has been questioned in the context of sharp, strong volcanic forcings.
In view of the widespread use of the linearity assumption, it is important to quantitatively establish its limits, and this can best be done using numerical climate models.A particularly convenient context is provided by the last millennium simulations, which (in the pre-industrial epoch) are primarily driven by the physically distinct solar and volcanic forcings (forcings due to land use changes are very weak).The ideal case would be to have a suite of the responses of fully coupled GCMs which include solar-only, volcanic-only and combined solar and volcanic forcings and control runs (for the internal variability) so that the responses could be evaluated both individually and when combined.Unfortunately, the optimal set of GCM products consists of the GISS E2-R millennium simulations with solar-only and solar plus volcanic forcing and a control run (this suite is missing the volcanic-only responses).We therefore also considered the outputs of a simplified climate model, the ZC model (Mann et al., 2005), for which the full suite of external forcing response was available.
Following a previous study, we first quantified the variability in the forcings as a function of timescale by considering fluctuations.These were estimated by using the difference between the averages of the first and second halves of intervals t ("Haar" fluctuations).This definition was necessary in order to capture the two qualitatively different regimes, namely those in which the average fluctuations increase with timescale (H > 0) and those in which they decrease with scale (H < 0).Whereas the solar forcing was small at annual scales, it generally increased with scale.In comparison, the volcanic forcing was very strong at annual scales but rapidly decreased, with the two becoming roughly equal at about 200 years.By considering the response to the combined forcing we were then able to examine and quantify their non-additivity (nonlinearity).By direct analysis (Fig. 3b  and c), it was found that, in the ZC model, additivity of the radiative forcings only works up until roughly 50-year scales; at 400-year scales, there are negative feedback interactions between the solar and volcanic forcings that reduce the combined effect by a factor of ≈ 1.5-2.This "subadditivity" makes their combined effects particularly weak at these scales.Although this result seems statistically robust for the ZC millennium simulations, until the source of the nonlinearity is pin-pointed and the results reproduced with full-blown coupled GCMs, they must be considered tentative (the conclusions would also be strengthened if ZC control runs output were available to estimate the internal variability); many more simulations with diverse forcings are needed to completely settle the issue.
In order to investigate possible nonlinear responses to sharp, strong events (such as volcanic eruptions), we used the fact that if the system is linear and scaling, then the difference between the structure function exponents (ξ (q)) for the forcings and responses is itself a linear function of the order of moment q (moments with large q are mostly sensitive to the rare large values, and small q moments are dominated by the frequent low values).By using the trace moment analysis technique, we isolated the nonlinear part of ξ (q) (i.e. the function K(q)), which quantifies the intermittent (multifractal, highly non-Gaussian) part of the variability (associated with the "spikiness" of the signal).Unsurprisingly we showed that the volcanic intermittency was much stronger than the solar intermittency but that, in both cases, the model responses were highly smoothed and were practically nonintermittent (close to Gaussian).We concluded that the model responses to sharp, strong events were not characterized by the same sensitivity as responses to the more common weaker forcing events.
By examining model outputs, we have found evidence that the response of the climate system is reasonably linear with respect to the forcing up to timescales of 50 years at least for weak (i.e.not sharp, intermittent) events.But the sharp, intermittent events such as volcanic eruptions that occasionally disrupt the linearity at shorter timescales become rapidly weaker at longer and longer timescales (with scaling exponent H ≈ −0.3).In practice, linear stochastic models may therefore be valid over most of the macroweather range, from ≈ 10 days to over 50 years.However, given their potential importance, it would be worth designing specific coupled climate model experiments in order to investigate this further.

Figure 1 .
Figure 1.(a)Top graph: the radiative forcings R F (top, W m 2 ) and responses T (K) from AD 1000 to 2000 for the Zebiak-Cane model, fromMann et al. (2005), integrated over the entire simulation region.The forcings are reconstructed solar (brown), solar blown up by a factor 5 (orange) and volcanic (black).For the solar forcing (top series), note the higher-resolution and wandering character for the recent centuries -this part is based on sunspots, not 10 Be.Bottom graph: the responses are for the solar forcing only (top), volcanic forcing only (middle) and both (bottom); they have been offset in the vertical for clarity by 2.5, 1.5, and 0.5 K, respectively.(b) GISS-ER-2 responses averaged over land (the Northern Hemisphereonly) at annual resolution.The industrial part since 1900 was excluded due to the dominance of the anthropogenic forcings.The solar forcing is the same as for the ZC model and is sunspot-based since 1610.The top row is for the solar forcing only, the middle series is the response to the solar and Crowley reconstructed volcanic forcing series (i.e. the same as used in the ZC model), and the bottom series uses the solar and reconstructed volcanic forcing series fromGao et al. (2008).Each series has been offset in the vertical by 1 K for clarity (these are anomalies, so the absolute temperature values are unimportant).

d=
T obs (t), where " d =" means equal in probability distributions (we can say that

Figure 2 .
Figure 2. (a)The RMS Haar fluctuation S( t) for the solar and volcanic reconstructions used in the ZC simulation for lags t from 2 to 1000 years (left).The solar is a "hybrid" obtained by "splicing" the sunspot-based reconstruction (b, top) with a 10 Be-based reconstruction (b, bottom).The two rightmost curves are for two different 10 Be reconstructions(Shapiro et al., 2011;Steinhilber et al., 2009).Although at any given scale, their different assumptions lead to amplitudes differing by nearly a factor of 10, their exponents are virtually identical and the amplitudes diminish rapidly with scale.(b) A comparison of the sunspot derived total solar irradiance (TSI) anomaly (top, used in the ZC and GISS simulations back to 1610, H ≈ 0.4) with a recent 10 Be reconstruction (bottom, total TSImean plus anomaly -since 7362 BC; see (a) for a fluctuation analysis, H ≈ −0.3) similar to that "spliced" onto the sunspot reconstruction for the period 1000-1610.We can see that the statistical characteristics are totally different with the sunspot variations "wandering" (H > 0), whereas the 10 Be reconstruction is "cancelling" (H < 0).The sunspot data were for the "background" (i.e. with no 11-year cycle; seeWang et al., 2005, for details); the data for the 10 Be curve were fromShapiro et al. (2011).

Figure 3 .
Figure 3. (a) The RMS Haar fluctuations of the Zebiak-Cane (ZC) model responses (from an ensemble of 100 realizations) with volcanic only (green, from the updated Crowley reconstruction), solar only (black, using the sunspot-based background;Wang et al., 2005), and both (brown).No anthropogenic effects were modelled.Also shown for reference are the fluctuations for three multiproxy series (blue, dashed, from 1500 to 1900, pre-industrial; the fluctuations statistics from the three series were averaged, and this curve was taken fromLovejoy and Schertzer, 2012b).We see that the combined volcanic and solar response of the model reproduces the statistics until scales of ≈ 50-100 years; however, at longer timescales, the model fluctuations are much too weak -roughly 0.1 K (corresponding to ±0.05 K) and constant or falling -whereas at 400-year scales, the temperature fluctuations are ≈ 0.25 K (±0.125) and rising.(b) A comparison of the RMS fluctuations of the ZC model response to combined solar and volcanic forcings (brown, bottom, from a), with the theoretical additive responses (black, bottom) as well as their ratio (S additive /S actual black, top).The additive response was determined from the RMS of the solar-only and volcaniconly response variances (from a): additivity implies that the fluctuation variances add (assuming that the solar and volcanic forcings are statistically independent).We can see that, after about 50-200 years, there are strong negative feedbacks, and the solar and volcanic forcings are subadditive; see (c) for an enlargement of the ratio.(c) An enlarged view of the ratio of the linear to nonlinear responses (from b).The top curve assumes for the combined forcing, the linearity of the response and statistical independence of the solar and volcanic forcings, whereas the bottom curve uses the actual response to the combined forcings.The maximum at around 400 years (top curve) corresponds to a factor

Figure 4 .
Figure 4.A comparison of the Zebiak-Cane (ZC) model combined (volcanic and solar forcing) response (bold, brown) with GISS-E2-R simulations with solar-only forcing (red) and a control run (no forcings, black), the GISS structure functions are for land, Northern Hemisphere, reproduced from Lovejoy et al. (2013).
Figure5.A comparison of the volcanic forcings for the ZC model (bottom, green) and for the GISS-E2-R GCM for two different volcanic reconstructions(Gao et al., 2008;Crowley, 2000) (top green curves, reproduced fromLovejoy et al., 2013).Also shown is the combined response (ZC, brown) and the pre-industrial multiproxies (dashed blue).

Figure 6 .
Figure 6.(a) Analysis of the flux/cascade structures of the ZC forcings (top row) and ZC temperature responses (middle, bottom rows); the normalized trace moments(Eq.11)  are plotted for q = 2, 1.9, 1.8, 1.7, 1.6, . . .0.1.Upper left is solar forcing (last 400 years only, mostly sunspot-based), upper right is volcanic, middle left is solar response (last 400 years), and middle right (volcanic response) and lower left are response to combined forcings (last 1000 years).Note that all axes are the same except for volcanic.For solar, only the last 400 years were used since this was reconstructed using the more reliable sunspot-based method.The earlier 10 Be-based reconstruction had relatively poor resolution and is not shown.Since the volcanic variability was so dominant, for the combined response (bottom left) the entire series was used.The red points and lines are the empirical values, and the blue lines are regressions constrained to go through a single outer scale point (see Eq. 11).In comparing the different parts of the figure, note in particular (i) the log-log linearity for different statistical moments, (ii) the fact that the lines for different moments reasonably cross at a single outer scale, and (iii) the overall amplitude of the fluctuationsfor example by visually comparing the range of the q = 2 moments (the top series) as we move from one graph to another.(b) The above shows the responses for the GISS-E2-R simulations(Northern Hemisphere, land, 1500-1900); λ = 1 corresponds to 400 years.The upper left is for the response to the Crowley reconstructed volcanic forcings (same as used in the ZC simulations, not the change in the vertical scale), the upper right for the Gao reconstructed volcanic forcings and the lower left is for the solar-only forcing (mostly sunspot-based, same as used in the ZC simulations).(c) Trace moment analysis of six annual resolution multiproxies, J: Jones; Ma: Mann 98; B: Briffa; C: Crowley; Mo: Moberg; H: Huang.The curves are reproduced, with permission, from Fig.11.8 ofLovejoy and Schertzer (2013), where full details and references are given.All were for the pre-industrial period AD 1500-1900; λ = 1 corresponds to 400 years.The curve shows the generic convergence of the envelope of curves to a quasi-Gaussian process; the proximity of the curve to the envelope indicates that, with the possible exception of the Mann curve, the intermittency is low.
Figure 6.(a) Analysis of the flux/cascade structures of the ZC forcings (top row) and ZC temperature responses (middle, bottom rows); the normalized trace moments(Eq.11)  are plotted for q = 2, 1.9, 1.8, 1.7, 1.6, . . .0.1.Upper left is solar forcing (last 400 years only, mostly sunspot-based), upper right is volcanic, middle left is solar response (last 400 years), and middle right (volcanic response) and lower left are response to combined forcings (last 1000 years).Note that all axes are the same except for volcanic.For solar, only the last 400 years were used since this was reconstructed using the more reliable sunspot-based method.The earlier 10 Be-based reconstruction had relatively poor resolution and is not shown.Since the volcanic variability was so dominant, for the combined response (bottom left) the entire series was used.The red points and lines are the empirical values, and the blue lines are regressions constrained to go through a single outer scale point (see Eq. 11).In comparing the different parts of the figure, note in particular (i) the log-log linearity for different statistical moments, (ii) the fact that the lines for different moments reasonably cross at a single outer scale, and (iii) the overall amplitude of the fluctuationsfor example by visually comparing the range of the q = 2 moments (the top series) as we move from one graph to another.(b) The above shows the responses for the GISS-E2-R simulations(Northern Hemisphere, land, 1500-1900); λ = 1 corresponds to 400 years.The upper left is for the response to the Crowley reconstructed volcanic forcings (same as used in the ZC simulations, not the change in the vertical scale), the upper right for the Gao reconstructed volcanic forcings and the lower left is for the solar-only forcing (mostly sunspot-based, same as used in the ZC simulations).(c) Trace moment analysis of six annual resolution multiproxies, J: Jones; Ma: Mann 98; B: Briffa; C: Crowley; Mo: Moberg; H: Huang.The curves are reproduced, with permission, from Fig.11.8 ofLovejoy and Schertzer (2013), where full details and references are given.All were for the pre-industrial period AD 1500-1900; λ = 1 corresponds to 400 years.The curve shows the generic convergence of the envelope of curves to a quasi-Gaussian process; the proximity of the curve to the envelope indicates that, with the possible exception of the Mann curve, the intermittency is low.

Table 1 .
The scaling exponent estimates for the forcings and ZC model responses.