Monthly near-surface temperature anomalies from several gridded data sets (GISTEMP, Berkeley Earth, MLOST, HadCRUT4, 20th Century Reanalysis) were investigated and compared with regard to the presence of components attributable to external climate forcings (associated with anthropogenic greenhouse gases, as well as solar and volcanic activity) and to major internal climate variability modes (El Niño/Southern Oscillation, North Atlantic Oscillation, Atlantic Multidecadal Oscillation, Pacific Decadal Oscillation and variability characterized by the Trans-Polar Index). Multiple linear regression was used to separate components related to individual explanatory variables in local monthly temperatures as well as in their global means, over the 1901–2010 period. Strong correlations of temperature and anthropogenic forcing were confirmed for most of the globe, whereas only weaker and mostly statistically insignificant connections to solar activity were indicated. Imprints of volcanic forcing were found to be largely insignificant in the local temperatures, in contrast to the clear volcanic signature in their global averages. Attention was also paid to the manifestations of short-term time shifts in the responses to the forcings, and to differences in the spatial fingerprints detected from individual temperature data sets. It is shown that although the resemblance of the response patterns is usually strong, some regional contrasts appear. Noteworthy differences from the other data sets were found especially for the 20th Century Reanalysis, particularly for the components attributable to anthropogenic forcing over land, but also in the response to volcanism and in some of the teleconnection patterns related to the internal climate variability modes.
Temporal variability within the climate system results from a complex interaction of diverse processes, both exogenous and arising from internal climate dynamics. To identify and quantify the effects of individual climate-forming agents, two complementary approaches are typically employed (e.g., IPCC, 2013): numerical simulations based on general circulation models (GCMs) and statistical techniques. While the statistical methods do not offer the physical insight provided by the GCM-based simulations, they are potentially able to capture relations omitted or distorted within GCMs due to the need for simplified representation of the relevant physical processes. A number of authors have investigated the presence of relations between climate forcings and time series of climate variables by statistical means, often involving multivariable regression analysis or related techniques. The resulting studies typically show a strong link between temperature and anthropogenic forcing (e.g., Pasini et al., 2006; Lean and Rind, 2008; Schönwiese et al., 2010; Rohde et al., 2013b; Canty et al., 2013; Chylek et al., 2014b), although linear change with time is also often used to approximate the long-term temperature evolution (e.g., Foster and Rahmstorf, 2011; Gray et al., 2013; Zhou and Tung, 2013). The imprint of solar activity is usually quite weak in the near-surface temperature series (e.g., Lockwood, 2012, and references therein) and the spatial patterns of any response tend to be quite complex (Lockwood, 2012; Gray et al., 2013; Hood et al., 2013; Xu and Powell, 2013). Major volcanic eruptions typically manifest as temporary cooling in the globally averaged temperature, although its magnitude differs somewhat among individual temperature data sets as well as between ocean and land (Canty et al., 2013) and the geographic fingerprint of the temperature response is far from trivial (Stenchikov et al., 2006; Driscoll et al., 2012; Gray et al., 2013).
Compared to the often pan-planetary reach of the external forcings, major manifestations of internal climate variability modes tend to be more localized, though sometimes with ample projection of weaker influences through teleconnections. Relatively well understood is the El Niño/Southern Oscillation (ENSO) system, dominating in the tropical Pacific, but also affecting various aspects of weather patterns in many regions across the globe and leaving a distinct imprint in globally averaged temperature as well (e.g., Trenberth et al., 2002). The effect of the North Atlantic Oscillation (NAO) is prominent particularly in the areas around the northern Atlantic (e.g., Hurrell et al., 2003). The northern Atlantic is also the primary area of activity of the Atlantic Multidecadal Oscillation (AMO), with potential imprints noticeable in local temperatures as well as their global means (e.g., Tung and Zhou, 2013; Zhou and Tung, 2013; Rohde et al., 2013b; Muller et al., 2013; Chylek et al., 2014b; van der Werf and Dolman, 2014; Rypdal, 2015). A related (pseudo)oscillatory system manifests in the northern Pacific in the form of the Pacific Decadal Oscillation (PDO: Zhang et al., 1997), although its direct link with global temperature seems to be less pronounced than that of AMO (e.g., Canty et al., 2013). Other potentially influential variability modes can be identified in the climate system, though their exact mechanisms and effects are not always completely known. Selection and preparation of explanatory variables representing individual climate-forming factors is a critical part of statistical attribution analysis; more details on their choice and specific form in our tests are provided in Sect. 2.1.
Of the descriptors of the climate system, temperature-related characteristics are arguably the most intensely investigated. Over the recent years, various research groups have developed and gradually evolved data sets of near-surface global gridded temperature (including MLOST: Smith et al., 2008; GISTEMP: Hansen et al., 2010; HadCRUT4: Morice et al., 2012; Berkeley Earth: Rohde et al., 2013a, b), which now provide more than a century of mid-to-high resolution data for a substantial portion of the globe. In addition to these temperature analyses, created primarily by interpolation and/or averaging techniques, reanalysis data are also used to approximate past climate. Of particular interest regarding the longer-term variability is the 20th Century Reanalysis (20CR: Compo et al., 2011), currently providing global gridded data from the mid-19th century on. While all these data sets approximate the same historical evolution of the climate system and share much of their basic temporal variability on pan-planetary scale (e.g., Hansen et al., 2010; Foster and Rahmstorf, 2011; Compo et al., 2013; Rohde et al., 2013b), the respective temperature fields do differ to some, regionally dependent, degree. In this paper, we aim to investigate and compare selected aspects of spatio-temporal variability in several gridded data sets of monthly temperature, introduced in Sect. 2.2, with emphasis on identification of temperature responses attributable to climate forcings and major modes of internal climate variability.
Our methodology of attribution analysis is largely based on multiple linear regression, as detailed in Sect. 3. Basic match of temporal variability between the temperature data sets is quantified through linear correlations, with results shown in Sect. 4.1. Presence, magnitude and statistical significance of components attributable to individual explanatory variables in globally averaged temperatures are investigated in Sect. 4.2, including an analysis of potential time-delayed responses. An analysis of the geographical response patterns is then carried out in Sect. 4.3, followed by an assessment of local time-delayed responses in Sect. 4.4 and discussion of the results in Sect. 5. Only the key outcomes of our analysis are presented in the paper itself – additional materials are provided in the Supplement, particularly results derived for shorter sub-periods of the time series studied.
Time series of the explanatory variables employed in the
attribution analysis. Bars to the right of individual panels illustrate the
pre-selected characteristic variations of the predictors, used for
calculation of the temperature responses: increase of CO
Although many of the statistical attribution studies pursue a similar goal and share much of their basic methodology, substantial diversity exists in the selection of the explanatory factors employed and their specific variants. Here, we used eight predictors with proven or reasonably suspected influence on climate on global or continental scale, representing effects of various external forcings and climatic oscillations (Fig. 1).
Among the external influences on the climate system, the role of greenhouse
gases (GHGs) is relatively well understood (e.g., IPCC, 2013). Due to
their positive contribution to radiative forcing, man-made GHGs are believed
to be responsible for much of the near-surface global temperature rise during the
later stages of the instrumental period. Anthropogenic influences on climate
also manifest through the formation of various aerosols, including sulfates
or black carbon, or through the production of tropospheric ozone, although the
uncertainties regarding their direct and especially indirect impacts are
still profound (e.g., Skeie et al., 2011; IPCC, 2013). Furthermore,
due to the limited lifespan of the aerosols, their amounts are highly
variable in time and space, unlike the concentrations of the relatively
long-lived GHGs. From the perspective of statistical analysis, the often
strong temporal correlation of the amounts of GHGs and aerosols is also
problematic, making it difficult for a regression mapping to distinguish
between their respective effects. For these reasons, anthropogenic aerosol
forcings were not directly considered here, and global CO
Global monthly series of stratospheric aerosol optical depth provided by
NASA GISS at
In addition to the external forcings tied to exogenous factors, temporal
variability of the climate system is also shaped by various internal
oscillations. Southern Oscillation index (SOI), provided by CRU at
Not all of the predictors here can be considered mutually independent, from
either a physical or a statistical perspective. In Table 1, formal similarity
of the series of individual explanatory variables is illustrated through
values of Pearson correlation coefficient
Pearson correlation coefficient between series of individual
predictors (Fig. 1) in the 1901–2010 period. The upper-right segment of the
matrix contains values for the original concurrent series, the lower-left
segment values for their time-shifted versions (as specified in Fig. 4's
caption). The bottom-most row shows values of the variance inflation factor
(VIF) for individual time-shifted predictors, calculated as
1/(1–
Monthly series of near-surface temperature on a (semi-)regular
longitude-latitude grid from four temperature analyses and one reanalysis
were studied:
GISTEMP of NASA's Goddard Institute for Space Studies, available at
Temperature analysis of the Berkeley Earth group, obtained from
Merged Land-Ocean Surface Temperature Analysis (MLOST) by NOAA, from
HadCRUT4, a combined land (CRUTEM4) and sea (HadSST3) temperature data set by Climatic Research Unit (University of East Anglia) and Hadley Centre (UK Met Office) from
20th Century Reanalysis (20CR) by NOAA ESRL PSD, obtained in version V2 from
All four gridded temperature analysis data sets (GISTEMP, BERK, MLOST,
HadCRUT4; hereinafter also referred to as observational data sets) are
natively provided as monthly anomalies, and were analyzed as such. For 20CR
temperatures, anomalies were constructed by subtracting the mean annual cycle
for the period 1951–1980. In addition to gridded temperatures, global
temperature means (representing either land-only or fully global spatial
averages) were also studied. The respective global monthly series were
obtained from the web pages of the individual research groups, with the
exception of 20CR, for which the global average was calculated as a
latitude-adjusted weighted mean from the gridded data for the full globe or
for the area between 60
Despite the inherently nonlinear and deterministically chaotic nature of the climate system, the interaction of external climate forcings in temperature signals can often be approximated quite well by a simple linear superposition (e.g., Shiogama et al., 2013). Even when effects of internal climatic oscillations are studied in the frame of multivariable statistical attribution analysis, nonlinearities are generally not dominant, albeit sometimes detectable (e.g., Pasini et al., 2006; Schönwiese et al., 2010; Mikšovský et al., 2014). Further considering the increased computational costs and more complicated interpretation for the nonlinear regression techniques, only multiple linear regression (MLR) was applied here to separate contributions from individual predictors, subject to a calibration procedure minimizing the sum of squared regression residuals.
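As an illustration of the regression step described above, the following Python sketch fits an MLR model by ordinary least squares and converts the coefficients into temperature responses to chosen characteristic predictor variations. The function names (`fit_mlr`, `responses`), the synthetic predictor series and the characteristic variations are assumptions made for this example only and are not taken from the analysis itself.

```python
import numpy as np

def fit_mlr(predictors, temperature):
    """Least-squares fit of temperature on the predictors plus an intercept.

    predictors: array of shape (n_months, n_predictors)
    temperature: array of shape (n_months,)
    Returns coefficients; element 0 is the intercept.
    """
    design = np.column_stack([np.ones(len(temperature)), predictors])
    # lstsq minimizes the sum of squared regression residuals
    coeffs, *_ = np.linalg.lstsq(design, temperature, rcond=None)
    return coeffs

def responses(coeffs, characteristic_variations):
    """Temperature response = regression coefficient x predictor variation."""
    return coeffs[1:] * np.asarray(characteristic_variations)

# Synthetic demonstration (hypothetical data, 110 years of monthly values)
rng = np.random.default_rng(0)
n = 1320
trend = np.linspace(0.0, 1.0, n)                  # stand-in for a slowly rising forcing
osc = np.sin(2.0 * np.pi * np.arange(n) / 60.0)   # stand-in for an oscillatory index
predictors = np.column_stack([trend, osc])
temperature = 0.8 * trend + 0.1 * osc + rng.normal(0.0, 0.05, n)

coeffs = fit_mlr(predictors, temperature)
resp = responses(coeffs, [1.0, 2.0])  # assumed characteristic variations
```

Expressing each contribution as a response to a fixed predictor variation, rather than as a raw coefficient, makes contributions from predictors with different units directly comparable.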
Although application of MLR-based mappings is quite straightforward in itself, potential challenges await when estimating the statistical significance of the regression coefficients, particularly due to non-Gaussianity and serial correlations in the data. For construction of the confidence intervals in Sect. 4.2, bootstrapping was used. The basic form of bootstrap (resampling data for individual months as fully independent cases) does not account for autocorrelation structures in the data, and these cannot be ignored in the monthly temperatures: lag-1-month autocorrelations in the regression residuals ranged between 0.32 and 0.61 for different versions of globally averaged temperature. A moving-block bootstrap was therefore used (e.g., Fitzenberger, 1998).
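A minimal Python sketch of a moving-block bootstrap for regression coefficients is given below. It is one possible variant, not necessarily the one applied in the analysis: it assumes that blocks of consecutive (predictor, temperature) pairs are resampled jointly, that confidence intervals are taken as percentiles of the bootstrap distribution, and that the block length is 24 months. The function names and the synthetic AR(1) test series are likewise hypothetical.

```python
import numpy as np

def moving_block_indices(n, block_len, rng):
    """Indices for one resample built from overlapping blocks of consecutive months."""
    n_blocks = -(-n // block_len)  # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_len)[None, :]).ravel()
    return idx[:n]  # trim to the original series length

def block_bootstrap_ci(X, y, block_len=24, n_boot=500, alpha=0.05, seed=0):
    """Percentile confidence intervals for MLR coefficients (intercept first)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    samples = np.empty((n_boot, design.shape[1]))
    for b in range(n_boot):
        idx = moving_block_indices(n, block_len, rng)
        # refit the regression on the block-resampled pairs
        samples[b], *_ = np.linalg.lstsq(design[idx], y[idx], rcond=None)
    lo, hi = np.percentile(samples, [100 * alpha / 2, 100 * (1 - alpha / 2)], axis=0)
    return lo, hi

# Synthetic demonstration: linear signal plus AR(1) noise (hypothetical data)
rng = np.random.default_rng(1)
n = 600
trend = np.linspace(0.0, 1.0, n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.5 * noise[t - 1] + rng.normal(0.0, 0.1)
y = 0.8 * trend + noise
lo, hi = block_bootstrap_ci(trend[:, None], y, block_len=24, n_boot=200)
```

Because each block preserves the ordering of neighboring months, short-range autocorrelation survives the resampling, which the month-by-month bootstrap would destroy.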
Pair-wise Pearson correlation coefficients between local monthly temperature anomaly series from different data sets for the 1901–2010 period. See Fig. S1 for correlations during the 1901–1955 and 1956–2010 sub-periods.
In an effort to alleviate the high computational costs of full bootstrap, an
alternative approach to assessment of statistical significance was also
explored: Monte Carlo-style tests designed to estimate thresholds of the
regression coefficients, consistent with the null hypothesis of the absence
of regressor-related component(s) in the regressand. Our experiments have
shown that the effect of autocorrelation structures on the coefficient
thresholds is approximated quite well by the predictor-specific expansion
factors
The analysis has been carried out over the 1901–2010 period, chosen as a compromise between maximizing the length of the signals studied and limited availability and reliability of data for the earlier parts of the instrumental period. Additional results for the first (1901–1955) and second (1956–2010) half of the target period are provided in the Supplement. To facilitate comparison of the contributions from individual explanatory variables with one another and with temperature variability itself, outcomes of the regression analysis are presented in the form of temperature responses to pre-selected characteristic variations of individual predictors, illustrated in Fig. 1 and specified in its caption. To limit biases due to incompleteness of the temperature series in some locations and data sets, only results for predictands with fewer than 10 % missing values are shown.
Ideally, all the temperature data sets should follow the same, historical,
trajectory of the climate system. In reality, differences appear among
individual representatives of the climatic past, due to variations in the
structure of the source data and specifics of their processing. While we
obviously cannot make a comparison to a perfect embodiment of the past
states of the atmosphere, the existing temperature approximations can be
compared mutually, to assess which regions and/or periods exhibit a higher degree of
match (signaling lower uncertainty due to the data set choice), and where
stronger contrasts emerge. The basic structure of these differences is
illustrated in Figs. 2 and S1 (in the Supplement) through pair-wise Pearson
correlations (
The above-specified general tendencies in regional correlation patterns also hold for the relation between the analysis-type data sets and 20CR (bottom row in Fig. 2): a relatively good match of the temperature anomalies in Europe and eastern US contrasts with more profound differences in the tropical parts of Africa and much of South America. The question remains whether the disparities detected can be attributed to misrepresentation of any specific source(s) of temperature variability – an issue that is further investigated in the following sections.
Temperature responses (
Much of the existing research on temperature variability and its attribution by statistical means focuses on globally averaged data. Aside from limiting the number of signals to be analyzed (and thus allowing for more detailed examination of each of them), the world-wide averaging suppresses regional variations and allows factors associated with global-reaching forcings to become more reliably detectable. On the other hand, effects contributing responses of opposite sign in different regions (such as ENSO or NAO) may be obscured in pan-planetary representation. In this section, global and global land temperature signals are investigated for the presence of the imprints of individual internal and external forcing factors.
It has been shown on various occasions that responses in climate variables
(including temperature) are not necessarily perfectly synchronized with the
variables representing the climate forcings, and time-offset relations may
manifest (e.g., Canty et al., 2013 and references therein). In Fig. 3, this
is illustrated via application of MLR mappings with individual predictors
offset by
All four analysis-type data sets exhibit a high degree of similarity of the
features in the globally averaged series. On the other hand, some noteworthy
distinctions appear for 20CR. Most notably, the volcanism response curve is
similar in shape to the ones characterizing the observational data, but
shifted towards positive values. Furthermore, NAO response peaks at
To facilitate mutual comparability of the results, and also to consider that
the physical links between predictors and temperature should be the same for
all data sets, a unified set of time shifts was employed for the tests in
Sects. 4.2 and 4.3. Lead time of
Our analysis suggests the GHG-attributed rise in global temperature to be
approximately 0.8
Regression-estimated responses (
The response of global temperature to volcanic forcing is clear,
statistically significant and of similar magnitude in all analysis-type
data sets: drop of 0.36 to 0.44
While our results show the well-known tendency towards higher global
temperature anomalies during the El Niño phases of ENSO (e.g., Trenberth
et al., 2002), the respective components tested close to the threshold of
statistical significance at
Conforming to several previous studies concerned with the association between global temperature and AMO (e.g., Rohde et al., 2013b; Zhou and Tung, 2013; Chylek et al., 2014b) and using a similar (i.e., linearly detrended) version of its index, our results suggest a formally strong link between detrended mean North Atlantic temperature and its global counterpart, distinct for land-based temperatures as well. The question remains, however, of how representative AMOI really is of internal variability in the climate system, as further discussed in Sect. 5.
The imprint of PDO in global temperature is quite clear and, for our combination of predictors, actually about as strong as SO's. It should be considered though that SOI and PDOI series are not independent and, as predictors, they partly compete for the same variability component in the temperature signals. When included alone among the explanatory variables (i.e., either SOI or PDOI, but not both), the respective responses are generally strengthened, as is their statistical significance. Considering that SOI and PDOI are only partly collinear and that their temperature response patterns do differ in many regions (Sect. 4.3), both were included as formally independent predictors in our analysis.
The final predictor considered in our setup, TPI, does not project much influence upon global temperature, though the respective component is borderline statistically significant for some of the data sets. Just as in the case of SOI, NAOI or PDOI, the relatively weak global response can be traced to the presence of mutually opposite contributions from different regions, as demonstrated in the next section.
Even clear and strong presence of a component associated with a particular forcing factor in globally averaged temperature does not automatically imply its universal relevance on a local scale. Conversely, locally dominant factors may be marginal in a global perspective. Here, we present an overview of geographic patterns of temperature response to external and internal forcing, for the set of eight predictors identical to that in Sect. 4.2. Only results for the data sets with mostly complete data coverage in the 1901–2010 period (GISTEMP, BERK, 20CR) are shown (Fig. 5); see the Supplement (Fig. S5) for the full set of results including MLOST and HadCRUT4.
Geographic patterns of regression-estimated contributions to local
temperature (
While positive correlation between GHG concentration and temperature is typical for most regions of the world, the strength of the component formally attributed to greenhouse gases (or, more generally, to anthropogenic forcing) varies substantially, and insignificant links or even anticorrelations appear in some smaller areas. Most prominently, the oceanic region south of Greenland, known for a negative temperature trend since 1901 (e.g., IPCC, 2013), displays high contrast to the rest of the world. A relatively good match between the analysis-type data sets is found in most regions. However, notable differences between the gridded observations and 20CR appear in a few geographically limited locations. Aside from mild contrasts in some oceanic regions (particularly central and eastern equatorial Pacific), distinctly negative temperature responses appear in 20CR over land in the eastern Mediterranean, central South America and Texas. On the other hand, the warming response over northern China is overestimated in 20CR. A similar pattern of discrepancy between the observed data and 20CR has already been reported and discussed by Compo et al. (2013) in their analysis of linear trends in the temperature series for 1901–2010, with various potential explanations suggested. Generally, although long-term components (whether expressed by match with anthropogenic forcing, or by linear trends) in 20CR are characterized consistently with the analysis-type data in many regions, their representativeness cannot be assumed universally.
The local temperature responses to solar irradiance are arranged in a complex pattern, encompassing both positive and negative links, combining in a near-neutral contribution to global land average. Statistically significant responses are rarely indicated and influence of solar variability therefore seems largely inconclusive at a local scale (Figs. 5b, S5b). Nonetheless, sign and magnitude of the links appear to be similar across individual data sets, including 20CR. From the results for the oceanic areas, it is revealed that main contributions to the borderline significant link between global temperature and irradiance come from southern extratropical areas and the northern Pacific. The response patterns shown by Lean (2010), Zhou and Tung (2010) or Gray et al. (2013) do differ somewhat from our results; however, direct comparison is problematic due to distinctions between time periods analyzed as well as detection methodology employed. The outcomes for the 1901–1955 and 1956–2010 sub-periods (Fig. S6) suggest some degree of stability of the response patterns, though with enough differences to explain the mismatch in contributions to globally averaged land temperature (Sect. 4.2). Overall, our analysis confirms that solar activity does not leave a strong, unambiguous imprint in lower tropospheric temperature.
While the cooling effect of volcanic forcing was clearly apparent in global mean temperature, its local influence is less ubiquitous (Figs. 5c, S5c). Regions with negative response do slightly prevail in the observational data sets, but positive contributions are detected in several areas, too. Only a few locations show statistically significant responses of either sign. The pattern revealed bears basic resemblance to the ones shown by Lean and Rind (2008) and Lean (2010), with post-eruption cooling indicated in North America and warming over northern Asia. Some differences emerge, however, emphasizing the sensitivity of the forcing response patterns to analysis details such as the specific choice of the predictor(s) or the time period considered. In the 20CR, positive responses are more numerous and stronger in magnitude, pushing the global mean volcanism-attributed signal towards positive values and statistical non-significance. This tendency is noticeable especially during the first half of the analysis period (Fig. S6), although it should be noted again that the relative lack of global-reaching volcanic events renders the results rather uncertain for the 1901–1955 period.
The canonical pattern of temperature response associated with SO/ENSO activity (e.g., Trenberth et al., 2002; Lean and Rind, 2008; Gray et al., 2013) also emerged in our analysis, including the teleconnections extending beyond the tropical Pacific region (Figs. 5d, S5d). While some minor differences exist among individual data sets, the resemblance of the respective patterns is high; some minor exceptions are found for 20CR over land, such as weaker projection of SOI influence over eastern Africa. The effect of North Atlantic Oscillation, too, is shown very clearly for its primary area of activity encompassing much of Eurasia and North America (Figs. 5e, S5e). 20CR data show a generally good match with the gridded observations, though minor differences emerge, such as weakened teleconnections to easternmost Asia or altered links to southern Africa.
Unlike the multipolar geographical responses associated with SO and NAO, the regression coefficients between AMOI and local temperature are predominantly positive worldwide, and significant connections extend across the globe (Figs. 5f, S5f). This largely unidirectional link, previously pointed out through correlation analysis by Muller et al. (2013), results in a much stronger AMO-correlated component in global temperature. On the other hand, it also raises the question of what exactly the relation between temperatures worldwide and those in the northern Atlantic is (beyond the obvious fact that Atlantic SST is one of the components averaged into global temperature, and thus not completely independent). While many of the recent studies employed the (linearly detrended) AMO index in the role of an independent explanatory variable, arguments have been made for the use of different forms of the index (see Canty et al., 2013 and the references therein) or for questioning the nature of AMO itself (e.g., Booth et al., 2012; Mann et al., 2014). In our analysis, we focused rather on formal connections in the data studied and mutual (in)consistency of various data sets; the issue of exact physical nature and stability of AMO was not central. The imprint of AMOI is similar across individual data sets; noticeable differences appear especially over central and eastern Eurasia.
PDO's influence pattern shows both positive and negative connections, strongest in the Pacific area (e.g., Deser et al., 2010), but with some significant teleconnections extending to more distant regions as well (including Africa or Scandinavia). PDO's imprint in 20CR is relatively close to that in the analysis-type data; differences appear especially over parts of Africa (Figs. 5g, S5g).
The relation between temperature and TPI manifests in a semi-regular pattern of alternating positive and negative sectors over the southern oceans and nearby continents, though only in the segments near South America and Australia do the relations test as statistically significant (Figs. 5h, S5h). The 20CR-based response resembles the observational pattern in shape, but is generally stronger magnitude-wise.
Geographic distribution of the predictor offset time
The homogeneously timed predictors employed in Sect. 4.3 do provide a robust
basis for an assessment of the superposition of their effects in globally
averaged temperature, but overlook the possibility of geographically
dependent delays. To reveal the characteristic patterns of locally specific
asynchronous responses to the explanatory variables, regression analysis of
local temperature was also carried out with individual predictors shifted in
time by
For the GHG amount, the results exhibit little sensitivity within our time
window, and the magnitude of temperature responses is virtually identical to
the
The spatiotemporal variability of temperature response to ENSO phase is well known (e.g., Trenberth et al., 2002) and reflected in our results as well: the occurrence of the strongest temperature response leads SOI by a few months in the eastern equatorial Pacific, whereas largely concurrent variability is indicated for the western Pacific. In the Indian Ocean, the strongest temperature response lags SOI by a few months, and a delay of 6 to 8 months is indicated around south-east Asia as well as in northern Australia. 20CR reproduces these patterns quite well over the oceans, but noticeable differences appear for teleconnections over land, most notably in less consistently expressed links to Africa and the southern part of South America.
Geographic distribution of the strongest temperature response
(
The strongest statistically significant temperature responses to NAO are
instantaneous in most areas, or delayed by 1 month (mostly over northern
Atlantic). The pattern detected from the observational data sets is
reproduced quite well in 20CR, with the most notable exception again being
the breakdown of transcontinental teleconnection over eastern Asia and
appearance of a link to southern Africa. The reason for the temporal shift
of NAO-attributed signal in 20CR global temperature (Fig. 3) therefore does
not seem to be the misrepresentation of timing of the local temperature
responses. Rather, it can be traced to the perturbed balance between the
opposite-in-sign responses from different regions (note especially the
overly negative contribution from northern Africa). Though these deviations
are relatively small, they vary for different
There is a distinct connection between the AMO index and local temperature
in many regions of the world even without a time shift (Fig. 5f), but the
timing of the maximum strength of this association varies distinctly within
our
Finally, in the case of TPI, the results indicate concurrence of the oscillations or a delay of 1 month for most locations with a statistically significant response. The pattern is reproduced quite well by 20CR, though the magnitude of the temperature variations is again somewhat exaggerated.
The primary objective of our analysis was twofold. Firstly, we aimed to provide a unified outlook into the local temperature responses associated with activity of multiple climate-forming agents, exogenous and endogenous, and the way they combine in pan-planetary temperature signals. While various past studies already dealt with a similar kind of statistical attribution analysis, their scope was typically more focused, phenomenon- or region-wise, but also regarding the temperature data source. Our second objective therefore consisted in assessing the robustness of the attribution analysis results among several commonly employed representations of monthly temperature throughout the 20th and early 21st century. To this end, four observational temperature data sets and one reanalysis were studied through linear regression, extracting components synchronized with temporal variability of eight predictors representing external climate forcings and internal variability modes.
The basic correlation analysis in Sect. 4.1 revealed the general geographical patterns of temperature (mis)match among different observational data sets. Unsurprisingly, the best agreement was found for regions with the best coverage by measurements (most notably Europe and eastern North America, where the Pearson correlations of monthly temperature anomalies typically exceeded 0.9), leaving relatively little room for uncertainty in the gridded data. Regions with sparser observations, such as the interiors of Africa or South America, exhibited more disparity, and coverage by the gridded data was often incomplete in these locations. Of even greater interest was the resemblance between the analysis-type data sets and the 20th Century Reanalysis (20CR). Since 20CR does not directly utilize temperature measurements over land, greater deviations from “reality” may be expected, especially for the continental areas. While the correlation analysis indeed indicated a somewhat looser relation to the analysis-type data, the match was still quite good in most regions, with the poorest agreement again found in Africa and South America. Major differences between the temperature anomaly series were seldom observed over oceans (the most notable exception being the higher latitudes of the southern hemisphere). Since all the data sets (including 20CR) employ sea surface temperature as input, temperatures are tied more closely to the historical trajectory of the climate system, and any remaining contrasts can be largely ascribed to differences among individual SST representations (assessed in detail by Yasunaka and Hanawa, 2011).
While the correlation analysis pointed out the basic patterns of differences
between individual data sets, the question remains how much these can affect
the outcomes of the attribution analysis. The match among the GHG-attributed
temperature changes was generally strong in most locations, but certain
smaller regions were highlighted in 20CR where this trend-like component
diverged substantially from the analysis-type data. These local
discrepancies, previously pointed out by Compo et al. (2013), also somewhat
decrease the magnitude of the GHG-attributed component in the global land
temperature for 20CR. Furthermore, when drawing conclusions from the results
presented, it is essential to consider the limitations of the statistical
approach to the attribution analysis. First of all, even formally
statistically significant connections are not proof of physically
meaningful relations, as the regression analysis only seeks formal
similarities among the time series, unable to verify causality of the links.
For the attribution of the temperature trends to GHGs, this is particularly
critical. Although the significance level is generally high for the
GHG-related regression coefficients, it would be similarly high for any
explanatory signal of similar structure (including a plain linear trend). While it is
physically justified to associate the increase in GHGs with warming
tendencies, there are other potential anthropogenic forcing factors sharing
similar temporal evolution, yet intentionally omitted in our analysis.
Various anthropogenic aerosols can contribute to either local warming
(e.g., black carbon) or cooling (e.g., sulfate aerosols; see, e.g., Skeie et al.,
2011). In many areas, the temporal progression of aerosol-related predictors
closely mimics that of GHG concentration (for instance, the Pearson
correlation between GHG concentration and regional SO
Of the natural forcings, the imprints of solar activity seem to be represented in quite a similar manner by all the data sets studied, including 20CR. The component attributed to variations of solar irradiance (involving both the 11-year cycle and longer-term variability) was quite weak, in most individual regions as well as in globally averaged temperature. These results are largely consistent with previous assessments of the impacts of solar activity on temperature (e.g., Lockwood, 2012; Gray et al., 2013). Still, the spatial patterns of solar influence exhibit some degree of temporal stability, suggesting that even though the fingerprints detected largely do not test as statistically significant, they are not just an artifact of stochastic components in the temperature series.
An interesting contrast between the results for globally averaged temperature series and for their local counterparts was found in the case of the effects of volcanic activity. The well-known near-surface cooling following major volcanic eruptions was clear in all versions of globally averaged observed temperature, but a rather complex pattern emerged from the gridded temperature data. Post-eruption warming was indicated in several regions. There might be dynamical reasons for such behavior (e.g., Stenchikov et al., 2006; Driscoll et al., 2012), but the structures detected were quite ambiguous, exhibiting both poor temporal stability and low statistical significance (an uncertainty partly ascribable to the distinctiveness of individual volcanic events and their relatively brief periods of effect within the time frame of our analysis). Furthermore, aliasing of volcanic and ENSO activity (with major late-20th century eruptions coinciding with El Niño phases of ENSO) also needs to be considered when attributing temperature responses to volcanic activity, as does the possibility of its influence on the AMO phase (Knudsen et al., 2014). Interpretational pitfalls aside, there was strong agreement between the observational data sets in their representation of the volcanism-attributed spatial pattern. The 20CR data showed a tendency toward more positive post-eruption temperature anomalies in several regions, resulting also in a more neutral response to volcanism in the globally averaged 20CR data (largely due to the anomalous response of 20CR-based global land temperature during the first half of our analysis period).
The temperature variability patterns related to the climate oscillations considered (SO, NAO, AMO, PDO, TPI) were generally captured similarly by individual data sets. This also applies to 20CR for the most part, though there seem to be some breakdowns in the representation of trans-continental and trans-oceanic teleconnections in the reanalysis data, most noticeable in the influence of NAO over eastern Asia, of AMO over northern parts of Eurasia, and in weakened links to SO and PDO in parts of Africa. One might speculate that this distinction is rooted in the specific behavior of the reanalysis engine, distorting the complex mechanisms propagating the teleconnections. However, an unrealistic representation of the long-distance links by 20CR cannot be blamed automatically. Note that the differences detected are generally more prominent in the first half of the analysis period, and less striking (though still noticeable) during the later half-period (Fig. S6). The reanalysis may thus simply struggle to recreate the observed patterns in regions where the assimilable data are rare and relatively unreliable, just as the procedures generating the analysis-type gridded data are burdened with increased errors when faced with a lack of reliable inputs. Neither of these data sources can thus be considered consistently superior, and increased attention to the effects of data uncertainty is needed when investigating climate variability in regions and periods with sparse observations. Keeping these limitations and specifics in mind, the 20th Century Reanalysis seems to provide a satisfactory approximation of past temperatures during the 20th and early 21st century, and thus a suitable tool for studies concerned with the validity of climate simulations.
Potential pitfalls related to the attribution of temperature changes to trend-like predictors were already discussed above, but even the interpretation of the components associated with faster-varying explanatory factors needs to be done with caution. Some of the internal climate oscillatory modes are interconnected, and their respective indices partly collinear. Variability assigned to a certain predictor therefore need not originate from the respective forcing factor alone; for instance, the relationship between SO/ENSO and PDO implies that the effects of the variability modes in the Pacific area cannot be entirely separated, on either the physical or the statistical level. The issue of interdependent predictors is not limited to pair-wise relationships: it has been shown that various variability modes in the climate system are intertwined in quite complex networks, with nontrivial time-delayed relations among oscillations in different regions (e.g., Wyatt et al., 2012). The intricacy of such structures becomes even more apparent when generalized links are studied, unrestricted to just the conventional variability modes (e.g., Hlinka et al., 2013, 2014a, b).
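One simple way to gauge how strongly partly collinear predictors can blur the attribution is the variance inflation factor (VIF), sketched below. This diagnostic was not part of our analysis and is shown only as an illustration; the three series `a`, `b` and `c` are synthetic, with `b` constructed to correlate with `a` at roughly 0.8, and the function name is hypothetical.

```python
import numpy as np

def variance_inflation_factors(predictors):
    """VIF of each column: 1 / (1 - R^2) from regressing it on all remaining columns."""
    n, k = predictors.shape
    vifs = np.empty(k)
    for j in range(k):
        target = predictors[:, j]
        others = np.delete(predictors, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, _, _, _ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r_squared = 1.0 - resid.var() / target.var()
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs

# Synthetic example (assumption): two mutually correlated indices and one independent index.
rng = np.random.default_rng(1)
a = rng.standard_normal(1320)
b = 0.8 * a + 0.6 * rng.standard_normal(1320)
c = rng.standard_normal(1320)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
```

A VIF close to 1 indicates a predictor that is nearly independent of the others, whereas larger values signal that the uncertainty of its regression coefficient is inflated by shared variability. Note that the VIF captures only simultaneous linear dependence, not the time-delayed relations mentioned above.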
Caution is also needed when interpreting the outcomes of the tests of statistical significance. The AR(1) model of residual autocorrelations, assumed here when assessing the significance of the predictors' connections to the gridded temperatures, provides a basic approximation of the short-term persistence. Often, such an approach seems sufficient, especially over land, where the residual autocorrelations generally approach zero rapidly. In other cases (particularly for tropical oceans and global averages encompassing oceanic areas), longer-term autocorrelations of various shapes appear in the residuals. Their presence is indicative of unaccounted-for components in the data, long-term memory and/or the presence of biases and inhomogeneities, potentially affecting temperature analyses and reanalyses alike (e.g., Cowtan and Way, 2014; Ferguson and Villarini, 2014). To further assess the validity of our significance tests, bootstrap-based estimates of statistical significance for the gridded temperature data were also implemented, using a variable-sized moving block, reflecting the magnitude of residual autocorrelation (Politis and White, 2004; Bravo and Godfrey, 2012). Little difference in the regression outcomes was found compared to the other test designs in this paper. Artifacts of the annual cycle were also often found in the residuals, traceable (at least in part) to non-stationary representation of the seasonal variations (Foster and Rahmstorf, 2011). A treatment by inclusion of components approximating the 12-month periodicity among the predictors was attempted, but resulted in no major changes to the regression coefficients or their significance.
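The two ingredients mentioned above, an AR(1) description of residual persistence and block-wise resampling, can be sketched as follows. This is a simplified illustration, not the procedure implemented in this study: the effective sample size formula n(1 - r1)/(1 + r1) is one common approximation, the block length is fixed at an arbitrary 24 months rather than selected adaptively as in Politis and White (2004), and the residual series is a synthetic AR(1) process with an assumed coefficient of 0.6.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a series."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def effective_sample_size(n, r1):
    """Approximate number of independent samples under AR(1) persistence."""
    r1 = max(r1, 0.0)  # conservative choice: never inflate n for negative r1
    return n * (1.0 - r1) / (1.0 + r1)

def moving_block_resample(x, block_length, rng):
    """One moving-block bootstrap replicate built from randomly placed contiguous blocks."""
    x = np.asarray(x)
    n = x.shape[0]
    n_blocks = -(-n // block_length)  # ceiling division
    starts = rng.integers(0, n - block_length + 1, size=n_blocks)
    return np.concatenate([x[s:s + block_length] for s in starts])[:n]

# Synthetic AR(1) residuals (assumption): phi = 0.6 over 1320 months.
rng = np.random.default_rng(2)
n, phi = 1320, 0.6
noise = rng.standard_normal(n)
resid = np.empty(n)
resid[0] = noise[0]
for t in range(1, n):
    resid[t] = phi * resid[t - 1] + noise[t]

r1 = lag1_autocorrelation(resid)
n_eff = effective_sample_size(n, r1)
resampled = moving_block_resample(resid, block_length=24, rng=rng)
```

Repeating the resampling many times and refitting the regression to each replicate would yield an empirical distribution of the coefficients. Because whole blocks are drawn, persistence on timescales shorter than the block length is retained within each replicate.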
Another important aspect shaping the outcomes of the regression mappings is the choice of the explanatory variables. Most of the predictors applied here exist in alternative variants, differing in their definition or method of (re)construction. A sizable discussion could be devoted to the specifics of each of them. While we did not study this issue in such depth, partial experiments were carried out to assess the degree of variability of the analysis outcomes if alternative predictors were used. First, the robustness of the imprints of volcanic forcing was assessed, with GISS aerosol optical depth (Sato et al., 1993) replaced by Crowley and Unterman's (2013) data. The resulting change to the global temperature response and the corresponding spatial fingerprints proved to be minor, generally smaller than the uncertainties associated with the regression coefficients themselves. Use of hemisphere-specific volcanic aerosol amounts instead of their global representation also induced only minor changes to the respective response patterns.
Of the multiple definitions of the indices characterizing the climatic
oscillations studied, we prioritized the forms not directly involving
temperature itself, to avoid explicit contribution of the temperature signal
to the explanatory variables. This was not a problem for NAO and TPI, as
their descriptors are derived from the baric characteristics. In the case of
ENSO, the pressure-based SOI was preferred over the SST-based NINO indices
or multivariate ENSO index. On the other hand, the usual forms of AMOI and
PDOI are calculated from areal SSTs, and thus likely interrelated with the
temperature signals. For PDOI, which exhibits comparatively weaker
correlation with globally averaged temperatures (at least partly due to the
fact that PDOI is, by its definition, detrended by global sea-surface
temperature), this issue seems less serious. However, it is still worthwhile
to see how much the outcomes change from employing another version of the
index. Use of the PDO index from JISAO (
Finally, it should be emphasized once again that the issue of attribution of climate variability cannot be completely resolved by a statistical approach alone. Statistical solutions to this multifaceted problem therefore need to be considered alongside the GCM-based simulations, which are conceptually more universal than purely statistical approaches, yet still only partly successful in reproducing the observed features of the climate system (IPCC, 2013). We hope that our results will contribute to future efforts in this field: by showing the character and variability of temperature components formally attributable to various forcings across several data sets, their robustness (or lack thereof) was illustrated, providing a picture of the respective fingerprints, as well as supporting guidelines for the use of the respective data in the validation of climate models.
Several publicly available data sets were employed in our analysis. The specific references and internet links to the individual data sources are given in the text; all their authors and providers have our gratitude.
We gratefully acknowledge the support of the Czech Science Foundation (GACR), through project P209/11/0956, of the Ministry of Education, Youth and Sports of CR, through National Sustainability Program I (NPU I), grant number LO1415, and of Charles University, through project UNCE 204020/2012. We would also like to thank the two anonymous reviewers of the discussion version of the manuscript for their valuable comments and suggestions.
Edited by: B. Kravitz