Late Quaternary temperature variability described as abrupt transitions on a 1 / f noise background

In order to have a scaling description of the climate system that is not inherently nonstationary, the rapid shifts between stadials and interstadials during the last glaciation (the DansgaardOeschger events) cannot be included in the scaling law. The same is true for the shifts between the glacial and interglacial states in the quaternary climate. When these events are omitted from a scaling analysis the climate noise is consistent with a 1/f law on time scales from months to 10 years. If 5 the records analysed include the shift events, the effect is to create a break the scaling from a 1/f law to a 1/f law, with 1< β < 2. No evidence of multifractal intermittency have been found in any of the temperature records investigated, and the events are not a natural consequence of multifractal scaling.


Introduction
The temporal variations in Earth's surface temperature are well described as scaling on an extended range of timescales.In this parsimonious characterisation, a parameter β describes how the fluctuation levels on the different timescales are related to each other.The β-parameter can be defined via the scaling of the spectral density function of the signal by the relation where T (f ) is the Fourier transform of the time record T (t) and . . .denotes an ensemble average.An alternative is to measure the range of the variability on the longest timescales within a time window of length t by and to define β via the following relation (Lovejoy and Schertzer, 2013): In this description, the temperature fluctuations would decrease with scale if β < 1, implying that the climate fluctuations become less prominent as we consider longer timescales, a picture which is somewhat different from the rich long-range variability indicated by proxy reconstructions of past climate.On the other hand, a value β > 0 would imply that variability increases with scale, a property that (if it were valid on a large range of timescales) would lead to levels of temperature variability inconsistent with reality.It is therefore a natural a priori working hypothesis that Earth's typical temperature fluctuations, the climate noise, is characterised by β 1.Such a process is called a 1/f noise.The 1/f description of Earth's temperature is of course an idealised model.The reality is that the climate system consists of many components that respond to perturbations on different characteristic timescales, and the temperature signal can be seen as an aggregation of signals with different timescale characteristics.Since it is difficult to recognise pronounced timescales in the temperature records, a scaling description is both convenient and accurate.However, we are aware that the scaling is not perfect, and that there are structures in the climate system that deviate from the scaling law.One example is the El Niño Southern Oscillation Published by Copernicus Publications on behalf of the European Geosciences Union.
(ENSO), which places larger fluctuations on the times scales of a few years than what can be expected from a scaling model.Other examples are the Dansgaard-Oeschger (DO) cycles in the Greenland climate during the last glacial period, encompassing repeated and rapids shifts between a cold stadial state and a much warmer interstadial state.The result of this phenomenon is that the glacial climate in Greenland has much larger millennial-scale fluctuations than what can be expected from a 1/f description.However, as we demonstrate in this paper, the temperature variations of both the stadial and interstadial climate states fit well with the 1/fscaling, telling us that the deviation from 1/f scaling in the glacial climate arise from these regime shifting events.As we go to even longer timescales, we also observe anomalous fluctuation levels on timescales from 10 4 to 10 5 years that can be identified with the shifting between glacial and interglacial conditions.
One could argue that the DO cycles and the glaciation cycles are intrinsic to the climate system and should not be treated as special events, and their variations should be reflected in a scaling description of the climate.This idea was forwarded by Lovejoy and Schertzer (1986), elaborated in many later papers, and expanded to timescales up to almost 1 Gyr in Lovejoy (2014).Here several scaling regimes are proposed, including a "break" in the scaling law with an exponent β ≈ 1.8 on timescales longer than a century.A scaling model invoking two scaling regimes can account for the millennial-scale temperature fluctuations that are produced by the DO cycles, which are anomalous with respect to a 1/f model.However, the estimated scaling exponent will depend on the average "density" of DO events in the ice-core record used for the estimate, and since the events are not uniformly distributed over time, there is no uniquely defined scaling exponent for the last glacial period.Moreover, the scaling law would not be useful as a climate-noise model to use as a null hypothesis for determining the significance of particular trends and events, such as the anthropogenic warming over the last centuries.
The main message of this paper is that the 1/f noise characterisation of the temporal fluctuations in global mean surface temperature is very robust.It is an accurate description for the Holocene climate, but it is also valid under both stadial and interstadial conditions during glaciations, and during both glacial and interglacial conditions in the quaternary climate.The 1/f character of the climate noise provides us with robust estimates of future natural climate variability, even in the present state of global warming.Such an estimate would of course be invalidated by a future regime shift (a tipping point) to a warmer climate state provoked by anthropogenic forcing.A future observed change in the 1/f character of the noise could therefore be taken as an early warning signal for such a shift.

Data, methods and results
The analysis in this work is based on four data sets for temperature fluctuations: the HadCRUT4 monthly global mean surface temperature (Morice et al., 2012) in the period 1880-2011 CE (Common Era), the Moberg Northern Hemisphere reconstruction for annual mean temperatures in the years 1-1978 CE (Moberg et al., 2005), as well as temperature reconstructions from the North Greenland Ice Core Project (NGRIP) (Andersen et al., 2004) and the European Project for Ice Coring in Antarctica (EPICA) (Augustin et al., 2004).For the NGRIP ice core we have used 20-year means of δ 18 O going back 60 kyr.For the EPICA ice core we have temperature reconstructions going back over 300 kyr, but the data are sampled at uneven time intervals and the time between subsequent data points becomes very large as we go back more than 200 kyr.In addition we have used annual data for radiative forcing in the time period 1880-2011 CE (Hansen, 2005) to remove the anthropogenic component in HadCRUT4 data.Plots of all four data records are shown in Fig. 1.

Global versus local scaling
On the face of it, it is difficult to discern scaling laws for the climate noise on timescales longer than millennia, since we do not have high-resolution global (or hemispheric) temperature reconstructions for time periods longer than two kyr.The ice core data available only allow us to reconstruct temperatures locally in Greenland and Antarctica, and we know from the instrumental record that local and regional continental temperatures scale differently from the global mean surface temperatures on timescales shorter than millennial.The differences we find are that local temperature scaling exponents β l are smaller than global temperature exponents β g , and that the ocean temperatures scale with higher exponents than land temperatures.Since there are strong spatial correlations in the climate system, it is possible that all local temperatures are scaling with a lower exponent than the global.In (Rypdal et al., 2015) this phenomenon is illustrated in an explicit stochastic spatio-temporal model.In this model, which is fitted to observational instrumental data, we find the relationship β g = 2β l .This relationship is derived under the highly inaccurate assumption that all local temperatures scale with the same exponent, but it is still a useful approximation in the following sections, where we will argue that we can use local and regional temperature records to discern the scaling of the global mean surface temperature on timescales of 10 kyr and longer.We do this by showing that the assumption that β g /β l > 1 is valid on very long times scales leads to the impossible result that the variance of global averages becomes larger than the mean variance of local averages.Thus we conclude that β l converges to β g on a sufficiently long timescale, and we estimate an upper limit for that timescale.
Let us denote by σ g and σ l the standard deviations of the global surface temperature and a local temperature respec- tively, on a monthly timescale.From Eq. (3) it follows that the ratio between the variances for the global and local temperatures at timescale t is where τ = 1 month.Unless we expect global temperatures to have larger variations than the local temperature at timescale t (the global temperature can not have a larger standard deviation than the average standard deviation of the local temperatures), we must have ρ > 1, or equivalently, On the timescale of months, the fluctuation levels of local continental temperatures is about two orders of magnitude larger than the fluctuation level for the global mean temperature.If we also use β g = 1 and β l = 1/2 we obtain the condition t < 10 5 months ∼ 10 kyr, i.e. on timescales longer than 10 kyr the ratio β g /β l can no longer be larger than unity.
A similar estimate can be obtained from the NGRIP ice core data.In the Holocene the 20-year resolution temperature reconstructions from Greenland has a standard deviation which is about five times greater than the 20-year moving average of the Moberg reconstruction for the Northern hemisphere.
Applying the same argument restricts the timescale for which Greenland scaling exponent is smaller than the global scaling exponent to approximately 10 kyr.
Based on the reasoning above, we expect scaling of the ice core data to be similar to the global scaling on sufficiently long timescales.In the remainder of this paper we demonstrate that the scaling in the ice core data on timescales up to hundreds of kyr is similar to the 1/f scaling we observe in global temperature up to a few millennia.This suggests that the 1/f scaling on very long timescales in ice core data is a reflection of the scaling in global temperatures on these scales.

Methods for estimation of scaling
We use two methods to analyse the scaling of temperature records.The first is a simple periodogram estimation of the spectral power density S(f ).This estimator can also be applied to data with uneven time sampling using the Lomb-Scargle method (Lomb, 1976).The other method is to take the wavelet transform of the temperature data: and construct the mean square of the wavelet coefficients: the wavelet variance.This is a standard technique for estimating the scaling exponent β (Malamud and Turcotte, 1999), and it is known that We choose to use the so-called Haar wavelet , and the integral in Eq. ( 4) is computed as a sum.With this wavelet we have the relation between the wavelet transform and the Haar fluctuation of Eq. ( 2).The power spectral density and the wavelet variance are equivalent representations of the second order statistics of the time record, one in frequency domain and the other in time domain, and Eqs. ( 1) and ( 5) show that they are characterised by the same exponent β if there is scaling of the second moment.By the Wiener-Khinchin theorem, these second-order moments are also equivalent to the autocorrelation function.Hence, scaling in the second-order statistics plays a special role, irrespective of the scaling or non-scaling of other moments.
The wavelet variance method can be adapted to the case of unevenly sampled data using the method described in Lovejoy ( 2014).In the present work, we obtain very similar results using the periodogram and the wavelet variance estimators.Claims have been made that higher-order statistics in the form of a multifractal characterisation are an essential part of the statistical description of these data (e.g.Lovejoy and Schertzer (2013), Chapter 11).For this reason we include a brief analysis of higher moments of the data in Sect.2.4, and discuss their significance in Sect.2.5.

Results of second-order analysis
In Fig. 2 we show the wavelet fluctuation |W (t, t)| 2 estimated for two different segments of the NGRIP data.Both time series have the same number of data points and both represent time intervals of 8500 years.The differences between the two time series is that one contains DO cycles, whereas the other does not.The estimated wavelet fluctuations and the spectral density scale very differently for the two time series, and this motivates us to separate stadial and interstadial conditions when we analyse the scaling in NGRIP data.This separation is shown in Fig. 1a, where the red curve represents the δ 18 O concentration in interstadial periods and the blue curve represents the δ 18 O concentration in stadial periods.We have followed Svensson et al. (2008) in defining the dates for the onsets of the interstadials and we have defined the start dates for the stadial periods to be just after the rapid temperature decrease that typically follows the slow cooling in the interstadial periods.In Fig. 3 we show the spectral density function and the wavelet scaling function for the stadial data (red diamonds) and the interstadial periods (purple triangles), which both display an approximate 1/f scaling, but where the fluctuation variance in the stadial data is larger than in the interstadial data.These results are different from what is obtained when considering the NGRIP data (during the last glaciation) as a single time series (shown as blue diamonds).If we were to define a single scaling exponent for the whole time series, then we would obtain an estimate β ≈ 1.4.
Figure 3 shows that the scaling of the stadial and interstadial NGRIP data are similar to the scaling of global temperatures on shorter timescales during the Holocene.We have included an analysis of the instrumental temperature record both with (green triangles) and without the anthropogenic component (green disks).The anthropogenic component can be removed by subtracting the response to the anthropogenic forcing in a simple linear response model of the type considered in Rypdal and Rypdal (2014).We have also included an analysis of the Moberg Northern Hemisphere reconstruction (black squares), and we observe that the composite scaling wavelet variance function and the composite spectral density function obtained by combining the instrumental data with the Moberg reconstruction, is consistent with a 1/f model on timescales from months to centuries.Since the NGRIP data also show 1/f scaling, and since we believe that the scaling of the NGRIP data is a reflection of global scaling on timescales longer than a millennium, it is illustrative to adjust the fluctuation levels of the NGRIP data so that its Holocene part has a standard deviation close to that of the standard deviation of the 20-year means of the Moberg reconstruction in the same time period.This means that we use the adjusted NGRIP data as a proxy for global temperature on millennial scales.The effect of this adjustment is only a vertical shift of the wavelet scaling function and the spectral density functions in the double-logarithmic plots, so that it becomes easier to compare the scaling of the NGRIP data with the Moberg reconstruction and the instrumental data.We do not apply any adjustments of the fluctuation levels of the stadial and interstadial periods relative to each other.The same adjustment is applied to the EPICA data, and here we also consider the scaling of the glacials and interglacials separately as shown in Fig. 1b.The scaling estimated from the EPICA data for glacial periods (black crosses in Fig. 3) follows closely the scaling of the NGRIP data analysed as a single time series (blue diamonds).This shows that the glacial climates have similar characteristics in Greenland and in Antarctica.Careful examination of the figure shows that the fluctuations grow slightly faster with the scale t in the NGRIP time series than for the glacial periods of the EPICA time series.This is expected since the regime shifting events in Antarctica associated with the DO cycles are much less pronounced than in Greenland (WAIS Divide Project Members, 2015).In the EPICA data we cannot estimate a scaling exponent for the dynamics in periods without regime shifts, but our results for the EPICA data are consistent with a description of the climate as a 1/f climate noise plus regime shifts.If we analyse the EPICA data without omitting the interglacials, then the fluctuations increase even faster with the scale t (orange stars in Fig. 3).This effect is completely analogous to the effect of shifting between the stadial and interstadial conditions during glaciations.

A note on multifractal processes
The exponent β is well-defined as long as the power spectral density function S(f ) is a power law in f , or equivalently if the wavelet variance |W (t, t)| 2 is a power law in t.As mentioned in Sect.2.3, if well-defined, the β exponent is related to the temporal correlations in the signal via simple formulas.In fact, for a (zero-mean) stationary process T (t) with −1 < β < 1 we have T (t)T (t + t) ∼ (β + 1)β t β−1 and for (a zero-mean) process with stationary increments and 1 < β < 3 we have T (t) T (t + t) ∼ (β − 1)(β − 2) t β−3 , where T (t) is the increment process of T (t) (Rypdal and Rypdal, 2012).Thus, the results presented so far in this paper do not rely on any assumptions of self-similar or multifractal scaling.It is only assumed that the second-order moments |W (t, t)| 2 are well approximated by power-laws over an extended range of timescales.
A more complete scaling analysis can be performed if one imposes the more restrictive assumption that the waveletbased structure functions |W (t, t)| q are power-laws in t, not only for q = 2, but for an interval of q values.The time record can be classified as multifractal only if this is true.It is then possible to define a scaling function η(q) via the relation Lovejoy and Schertzer (2013) defines a scaling function ξ ( t) from the moment |T t (t)| q .From Eq. ( 5) we observe that their scaling function is related to ours by η(q) = ξ (q) + q/2.By Eqs. ( 5) and ( 7) we observe that η(2) = β.If T (t) is self-similar (or if T (t) is the increment process of a selfsimilar process in the case β < 1) we have η(q) = βq/2, but in general, the η(q) may be concave (it bends down).Processes that exhibit power-law structure functions and strictly concave scaling functions can be characterised as multifractal intermittent.A monofractal is a special case of the multifractal class.For a monofractal (monoscaling) process, the scaling function is linear in q.
In Fig. 4 we present a crude multifractal analysis of the data sets considered in this paper using q values in the range from 0.1 to 4. For the Holocene we find linear scaling functions for both the instrumental record and the Moberg Northern Hemisphere reconstruction, and in the NGRIP data we find linear scaling functions for the stadial periods and the interstadial periods when these are analysed separately, although, as we have already seen, there is a deviation from the 1/f scaling in the stadial periods for timescales shorter than about 200 years.If the NGRIP record is analysed with both stadial and interstadial stages included, then it is not clear how to define the scaling function since the shifts between the two types of stages cause a "break" in the power-law scaling of the wavelet-based structure functions.If we define η(q) using the timescales shorter than 2 kyr we obtain a linear scaling function corresponding to β = 1.14, and if we use the timescales longer than 4 kyr we obtain a linear scaling function corresponding to β = 1.78.In neither case do we obtain a strictly concave scaling function.A linear scaling function is also obtained if we disregard the "break" in the scaling and fit power laws using all the available timescales.In this case the scaling function corresponds to β = 1.26.For the periods of the EPICA record that corresponds to ice ages, we find wavelet-based structure functions that are closer to power-laws than what is observed in the NGRIP record.This is expected since the abrupt transitions between cold and warm periods is much less pronounced in Antarctica than in Greenland.The scaling function for the ice-age periods in the EPICA data is linear and corresponds to β = 1.18.
The results discussed above show that from this analysis there is no evidence of multifractal intermittency in the temperature records analysed in this paper.This is not very surprising and could be suspected by direct inspection of the data record.The trained observer would use the fact that if η(q) is strictly concave, then the kurtosis of W t (t), is decreasing as a power-law function of t, and is therefore leptokurtic 1 on the shorter timescales t.Multifractal intermittency in addition implies that the amplitudes of the random fluctuations are clustered in time, on all timescales, as observed in intermittent turbulence or financial time series (see e.g.Bouchaud and Muzy, 2003).These are not prominent features in the time series analysed in this paper.For the NGRIP data, the δ 18 O ratio slightly deviates from a normal distribution as a result of the DO events, but this is not well described by a multifractal model since that would require the wavelet-based structure functions to be power-laws in t.In fact, what we show in this paper is that the effect of DO events is to break the scaling, rather than to produce multifractal scaling.
Admittedly this multifractal analysis is a crude firstorder characterisation.Our crude analysis suggests that the records analysed are most reasonably modelled as monofractal.However, to establish this with confidence we need to perform statistical hypothesis testing.The strategy for such testing must consist of two elements.First, we have to test whether we can reject the hypothesis that the observed records are realisations of a multifractal (with monofractal as a special case) stochastic process.If this hypothesis can be rejected, there is no point in discussing whether the process is multifractal or monofractal.If we cannot reject the multifractal hypothesis, we must test if we can reject that this multifractal is a monofractal.The outcome of these tests depends on the lengths of the observed records, since rejection of the various null hypotheses depends on the statistical uncertainty associated with realisations of the null models.Monte Carlo simulations of these null models is the simplest tool to establish these uncertainties.In a forthcoming paper we will perform this rigorous testing of the multifractal hypothesis for the data analysed in the present paper, in addition to a wide selection of forcing data climate model data.The results presented here should therefore be taken as preliminary.

A note on non-fractal processes that scale in the second moment
In this paper we have focused on scaling in second-order statistics, or more precisely, on modelling the temperature records as stochastic processes that exhibit scaling of the second moment, but not necessarily of other moments.Reviewer Shaun Lovejoy strongly opposes this approach.He considers it as a return to old quasi-Gaussian ideas that disregards the developments of multifractal formalism and multiplicative cascades, and in his last referee comment he raises doubts about the existence of processes that exhibit scaling www.earth-syst-dynam.net/7/281/2016/Earth Syst.Dynam., 7, 281-293, 2016 in the second moment, but not in other moments.Here we will not only demonstrate the existence of such processes, but explain that the serious fallacy of Lovejoy's approach is that he fails to distinguish between multifractal noises and non-Gaussian noises that cannot be modelled within the multiplicative cascade paradigm.Examples of the latter is the large class of Lévy noises.In less technical terms, the issue is that a multifractal noise may consist of uncorrelated random variables (e.g.their signs may be uncorrelated), but they will never be independent (e.g.their squares will be correlated).
A Lévy noise, on the other hand, consists of independent random variables, which implies that all powers of the variables will be uncorrelated.Empirical multifractal analysis methods typically fail to distinguish between these different classes of processes because they implicitly assume a multifractal model.Often, the distinction is not easy to make, because a non-Gaussian Lévy noise may have a bursty (intermittent) appearance, and analysis must be designed to separate multifractal clustering (correlation in higher powers) from intermittency of non-Gaussian independent variables.Long-range memory in the process does not make the distinction less relevant.Such processes may easily be produced from those discussed above by convolving the zero-memory processes with a memory response kernel.From a physical viewpoint, it is very important to distinguish between these two classes of stochastic processes.The multifractal processes are based on a turbulent cascade paradigm and the dynamical description is fundamentally nonlinear.The Lévy noises, and their long-memory cousins, may arise from non-Gaussian, independent fluctuations on the short timescales, e.g.jumps with randomly distributed waiting times.
We distinguish between a Lévy noise T (t) and a Lévy process X(t).The latter is a continuous-time stochastic process with stationary, identical and independently distributed (i.i.d.) increments, i.e. for any τ the increments X(t + τ ) − X(t) have a well-defined distribution which is independent of t.The discrete-time process T (t) = X(t +1)−X(t), where t is the set of nonnegative integers, is a Lévy noise.Theoretical results on the fluctuation statistics on varying timescales of Lévy noises are most conveniently obtained by means of the standard structure functions of the underlying Lévy process (rather than the wavelet structure function defined in Sect.2.4), i.e. we define For a process which belongs to the multifractal class we have where the scaling function ζ (q) is related to η(q) for the wavelet moments and ξ (q) for the Haar fluctuation by ζ (q) = η(q) + q/2 = ξ (q) + q.In Appendix A we show that for a Lévy process the following relations hold for the second and fourth moments; where Y ≡ X(1) and kurt[Y ] ≡ Y 4 / Y 2 2 is the kurtosis (flatness) of Y .For a Gaussian process kurt[Y ] = 3, and hence S 4 ( t) ∝ t 2 .In this case S q ( t) ∝ t q/2 , X(t) is a Wiener process, and T (t) is a Gaussian white noise.For a non-Gaussian Lévy noise, Eq. ( 9) provides the key to distinguish it from multifractal noise.For t ∼ kurt[Y ]/3 − 1 there is a break in the scaling of S 4 ( t).In fact, for t kurt[Y ]/3 − 1 moments higher than q = 2 will scale more or less like t 1 (i.e.ζ (q) → 1 for large q), while for t kurt/3 − 1 they will scale like t q/2 .The latter corresponds to the scaling of a Gaussian white noise, which is quite obvious, since the random variables are independent and the central limit theorem implies that the fluctuations are Gaussian on the long timescales.On the other hand, on the short timescales when the fluctuations are still non-Gaussian, the scaling function ζ (q) bends over to become flat for large q, which is just the behaviour we find for multifractals.Hence, by leaving out the scales t > kurt[Y ]/3 − 1 from the analysis we will be led to the conclusion that the non-Gaussian Lévy process is multifractal.The trace-moment analysis employed by Shaun Lovejoy (Schertzer and Lovejoy, 1987;Lovejoy and Schertzer, 2013) is designed to conceal the scaling behaviour on these scales and is not suitable as a model selection test to distinguish multifractals from non-Gaussian Lévy noises or their long-memory derivatives.
In Fig. 5 we present an analysis of a synthetic jumpdiffusion process, which belongs to the class of Lévy noises.The details of this process are explained in Appendix B. The second-order structure function is a power law (a straightline in the log-log plot), but the other structure functions are not.If a scaling function is produced by fitting a straight line to the structure functions on the long timescales, and computing the slopes, we find the scaling function of a white Gaussian noise (the red line in Fig. 5d).If the same is done on the short timescales, the estimated scaling function is concave as one would expect for a multifractal (the blue curve).In Fig. 6, we show the same for a jump-diffusion process with memory, produced by convolving the Lévy noise with a memory kernel.Hence, the difficulties related to distinguishing multifractals from other types of non-Gaussian processes are not something that is limited to processes of independent random variables.change.For instance, when we apply standard statistical methods for estimating the significance of a temperature trend, the result depends crucially on the so-called error model, i.e. the model for the climate noise that is used as a null hypothesis.There is strong evidence that the temperature fluctuations are better described by scaling models than by so-called red-noise models (or AR(1)-type models).However, simply characterising the climate noise as scaling does not specify an error model.The exponent in the scaling law (the β parameter) must also be determined, and it is usually determined from the same signal as we are testing for trends.
If we do that without detrending we risk estimating a too high β for the error model, which yields a trend test with weak statistical power, i.e. we may fail to detect a trend even if it is present.It is possible to improve the statistical power in a logically consistent way by detrending prior to estimating β, but the approach is often (incorrectly) criticised for being circular, since β should be estimated under the assumption that the error model (null hypothesis) is true.However, since de-trending only has a small effect if the null hypothesis is true, de-trending is valid under both the null hypothesis and the alternative hypothesis.
Another approach, which is the motivation for this paper, is to characterise the scaling of the climate noise from preindustrial temperature records.If we are to use the scaling exponent estimated from pre-industrial records to demonstrate the anomalous climate event associated anthropogenic influence, we must be confident that the temperature scaling does not change significantly over time.We must also be confident that the scaling is robust, in the sense that it is not too sensitive to moderate changes in the climate state.The results presented in this paper suggest that, unless the climate system experiences dramatic regime shifting events, we can be confident that the natural fluctuations in global surface temperature is approximated by 1/f -type scaling on a large range of timescales.This result makes it easy to determine, on any timescale, if the observed increase in global mean surface temperature is inconsistent with the natural variability, and by how much.
The 1/f scaling described here is the same scaling observed in numerous publications by Shaun Lovejoy and Daniel Schertzer, e.g.Lovejoy et al. (2013); Lovejoy and Schertzer (2013); Lovejoy (2014).The important difference in our interpretations is that they believe this scaling is limited to the scale range from a few months to a century (the "macroweather regime").Our analysis suggests that this scaling is an expression of Nature's internal "humming" in Quaternary surface temperature variability on all scales up to d = X(1), where d = denotes equality in distribution for random variables.A random walk of this type is called a Lévy process (see Appelbaum (2004) for a review), and for it to be well-defined in continuous time one must assume that X(1) belongs to the class of infinitely divisible random variables.This is a technical requirement that ensures that an increment X(t + t) − X(t) has a well-defined distribution for arbitrary small t.Since a Lévy process has stationary and independent increments, it is uniquely defined by the (infinitely divisible) distribution of X(1).In the following we denote Y = X(1).
The characteristic function of the random variable Y is defined as the φ Y (u) = e iuY .If Y has a probability density function p Y , then φ Y (u) = e iuy p Y (y) dy is the Fourier transform of p Y .When working with Lévy processes it is common to define the function ψ(u) via the relation φ Y (u) = e ψ(u) , and ψ(u) is usually called the Lévy exponent.Note that since φ Y (0) = 1 we have ψ(0) = 0. Since ψ(u) defines the random variable Y it also determines the Lévy process uniquely.If t is an integer the value of X(t) is a random variable that can be written as a sum of lag-1 increments: X(t) = (X(1)−X(0))+(X(2)−X(1))+. ..+(X(t)−X(t −1)).This is a sum of t independent copies of the random variable Y , and therefore the characteristic function of X(t) is φ X(t) (u) = e iuY t = e tψ(u) .
In general there is a simple relation between the nth moment of a random variable and the nth derivative of its characteristic function evaluated in u = 0. Using this relation we can express the moments of X(t) via the formula (Gardiner, 2009), (u) .

Figure 1 .
Figure 1.(a) The δ 18 O concentration in the NGRIP ice core dating back to 60 kyr before present (BP).Here present means AD 2000 (= 2000 CE).The data are given as 20-year mean values.The time series are split into stadial (blue) and interstadial (red) periods.(b) The temperature reconstruction from the EPICA ice core.The shown time series are sampled with a time resolution of roughly 200 years.The temperature curve in the glacial periods is given in a blue colour.(c) The Moberg reconstruction for the mean surface temperature in the Northern Hemisphere.The data are given with annual resolution.(d) The HadCRUT4 monthly global mean surface temperature where the anthropogenic component has been removed using a linear-response model.

Figure 2 .
Figure 2. (a) The δ 18 O concentration in the NGRIP ice core.The data are given as 20-year mean values.Two different parts of the times series are shown.The blue curve represents the δ 18 O concentration in a time period starting approximately 50 kyr before present (BP) and has a duration of approximately 8500 years.As in Fig. 1, present means AD 2000 (= 2000 CE).The black curve represents the δ 18 O concentration in a long stadial period that started about 22 kyrs BP and has a duration of approximately 8500 years.(b) The wavelet scaling functions estimated from the two parts of the NGRIP data set.The blue points are the estimates from the part of the NGRIP ice core that is shown as a blue curve in (a), and which contains DO cycles.The black points are the estimates from the part of the NGRIP ice core that is shown as a black curve in (a), and which does not contain any DO cycles.

Figure 3 .
Figure 3. (a) For each time series considered in this paper we show double-logarithmic plots of the wavelet fluctuation |W (t, t)| 2 as a function of the timescale t.The green triangles and the green circles represent the HadCRUT4 monthly global mean surface temperatures with and without the anthropogenic component respectively.The black circles are the analysis of the Moberg Northern Hemisphere reconstruction.The analysis of the 20-year mean NGRIP data is shown as the blue diamonds, the purple triangles and the red diamonds.The blue diamonds show the results of the analysis of the entire data set dating back to 60 kyrs BP.The red diamonds are the results of the analysis preformed on the stadial periods only, and the purple triangles are the results of the analysis of the interstadial periods only.The results for the EPICA ice core data are shown as the orange stars and the black crosses.The orange stars are obtained by analysis of the entire data set dating back 200 kyrs, and the black crosses are obtained by only analysing the two most recent glaciations.The two solid lines have slopes β = 1 and β = 1.8.(b) As in (a), but instead of the wavelet fluctuation function we show the spectral density function S(f ).The two solid lines have slopes −β with β = 1 and β = 1.8.

Figure 4 .
Figure 4. (a)The estimated wavelet-based structure functions |W (t, t)| q for the HadCRUT4 monthly global mean surface temperature where the anthropogenic component has been removed using a linear-response model.The lines show the fitted power-law functions c q t τ (q) .The q values are q = 0.1, 1.0, 1.5, . .., 4.0.(b) The scaling function τ (q) obtained from the fitted power laws in (a).The line is a linear fit to the estimated scaling function, and the slope of this line is β/2 with β = 0.88.(c-d) As (a) and (b) but in this case for the Moberg Northern Hemisphere reconstruction.(e-f) As (a) and (b) but for the interstadial periods in the NGRIP record.(g-h) As (a) and (b) but for the stadial periods in the NGRIP record.(i-j) As (a) and (b) but for the ice age periods in the EPICA record.(k-l) As (a) and (b) but for the NGRIP record including both stadial and interstadial periods.The red curve in (l) is the scaling function estimated from the longest timescales, the blue curve is the scaling function estimated from the shortest timescales, and the green curve is the scaling function estimated using all the available timescales.

3Figure 5 .
Figure 5. (a)The increments of a jump-diffusion process shown in (b).This is a non-Gaussian independent noise process.(b) A realisation of a jump-diffusion process, and the cumulative sum of the signal in (a).This process is the sum of a Brownian motion and a Poisson jump process as described in Appendix B. The jump distribution is Gaussian with a standard deviation that is 10 times greater than the standard deviation of the increments of the Brownian motion.(c) S q ( t for q = 1, 2, 3 for the jump-diffusion process as computed from a large ensemble of realisations of the process.(d) Scaling function ζ (q) estimated from structure functions like those in (c).The red line is estimated by computing the slope of the structure-function curves on the longest timescale ( t = 500).The blue curve is estimated from the slopes at the shortest timescale (( t = 1).The black curve by estimating the slope of the straight line drawn between the end points of the structure-function curves.

Figure 6 .
Figure 6.As Fig. 5, but for a jump-diffusion process with memory as described in Appendix C. The parameter value β = 0.4 is used.