Evaluation of global teleconnections in CMIP6 climate projections using complex networks
 Deutscher Wetterdienst, Frankfurter Str. 135, 63067 Offenbach, Germany
 Deutscher Wetterdienst, Frankfurter Str. 135, 63067 Offenbach, Germany
Correspondence: Clementine Dalelane (clementin.dalelane@dwd.de)
Hide author detailsCorrespondence: Clementine Dalelane (clementin.dalelane@dwd.de)
In climatological research, the evaluation of climate models is one of the central research subjects. As an expression of largescale dynamical processes, global teleconnections play a major role in interannual to decadal climate variability. Their realistic representation is an indispensable requirement for the simulation of climate change, both natural and anthropogenic. Therefore, the evaluation of global teleconnections is of utmost importance when assessing the physical plausibility of climate projections.
We present an application of the graphtheoretical analysis tool δMAPS, which constructs complex networks on the basis of spatiotemporal gridded data sets, here sea surface temperature and geopotential height at 500 hPa. Complex networks complement more traditional methods in the analysis of climate variability, like the classification of circulation regimes or empirical orthogonal functions, assuming a new nonlinear perspective. While doing so, a number of technical tools and metrics, borrowed from different fields of data science, are implemented into the δMAPS framework in order to overcome specific challenges posed by our target problem. Those are trend empirical orthogonal functions (EOFs), distance correlation and distance multicorrelation, and the structural similarity index.
δMAPS is a twostage algorithm. In the first place, it assembles grid cells with highly coherent temporal evolution into socalled domains. In a second step, the teleconnections between the domains are inferred by means of the nonlinear distance correlation. We construct 2 unipartite and 1 bipartite network for 22 historical CMIP6 climate projections and 2 centurylong coupled reanalyses (CERA20C and 20CRv3). Potential nonstationarity is taken into account by the use of moving time windows. The networks derived from projection data are compared to those from reanalyses. Our results indicate that no single climate projection outperforms all others in every aspect of the evaluation. But there are indeed models which tend to perform better/worse in many aspects. Differences in model performance are generally low within the geopotential height unipartite networks but higher in sea surface temperature and most pronounced in the bipartite network representing the interaction between ocean and atmosphere.
 Article
(4622 KB) 
Supplement
(10631 KB)  BibTeX
 EndNote
The evaluation of general circulation models (GCMs) is one of the key topics of climate sciences. This evaluation is indispensable in the assessment of uncertainties in the projection of climate change. At the same time, it serves as a guideline for further model development.
Established methods of climate model evaluation include comparison of spatial and temporal means, and often also the variability, of important climate parameters such as air temperature, precipitation, wind speed, geopotential height, radiation, and energy fluxes between model output and observational/reanalysis data (Zhang et al., 2021). More elaborate evaluation techniques assess the temporal evolution of global mean/sea surface/hemispheric temperature (Papalexiou et al., 2020) with respect to increasing greenhouse gas concentration or regional trends (Duan et al., 2021).
Acknowledging its importance for consistent climate simulation, Simpson et al. (2020) evaluate the atmospheric circulation in terms of mean atmospheric fields, in combination with dynamical features like the jet stream, stationary waves, and blocking. In contrast, Kristóf et al. (2020) evaluated the positions of potential action centres of atmospheric teleconnections as a proxy for circulation.
Another approach is taken by Brands (2022) and Cannon (2020), who both assess circulation biases in correspondence to the representation of circulation types. Whereas Brands (2022) uses Lamb weather types, the analysis in Cannon (2020) is based on principal component analysis (PCA)derived modes of variability. Such modes of variability, extracted by eigentechniques from spatiotemporal gridded data, have been the objective of evaluation efforts in recent years as their spatial patterns are supposed to reflect largescale dynamical processes in the climate system. For example, Fasullo et al. (2020) and Coburn and Pryor (2021) have assessed the representation of six oceanic and atmospheric modes in terms of spatial and spectral accuracy, including an evaluation of the interaction between modes. Still, it has been recognised that eigenmethods suffer from a number of limitations because geometric constraints such as linearity and normality, orthogonality, and simultaneity do not correspond to physical properties of the climate system (Monahan et al., 2009; Fulton and Hegerl, 2021; Hynčica and Huth, 2020; Lee et al., 2019) and hinder their interpretation.
Besides, the evaluation of climate modes, such as El Niño–Southern Oscillation (ENSO) or North Atlantic Oscillation (NAO), is usually done at the component level. But it is the coupling among those components which defines the largescale variability in climate at interannual and decadal timescales (Tsonis et al., 2008; Steinhäuser and Tsonis, 2014).
Complex network methods are able to account for nonlinear, timelagged, and highorder interactions in highdimensional data and were introduced in climate sciences by the beginning of the 21st century (for an overview see Dijkstra et al., 2019). Such networks investigate the interdependencies between all their constituent components, thereby unveiling dynamical features that could remain hidden to traditional analysis techniques. A rather fundamental property of climate networks is their organisation in terms of communities – clusters of strongly connected nodes forming semiautonomous subcomponents of the climate system with nonaccidental similarity to many known modes of variability (Steinhäuser et al., 2009; Tsonis et al., 2011; Tantet and Dijkstra, 2014) that interact dynamically in multiple ways. Such an emergent property has been ascribed to the mismatch between spatial and temporal scales on a sphere, which allows only a finite number of degrees of freedom (Yang et al., 2021).
The comparison of such complex networkderived communities between climate simulations and observation/reanalysis data sets was used for evaluation purposes first by Steinhäuser and Tsonis (2014). They assessed the community structure in climatic fields finding rather low consistency between the model runs and the reference data set. Likewise, Fountalis et al. (2015) assessed the community structure of model simulations but complemented it with an evaluation of the interaction strength of the communities with ENSO. The idea was further developed by Fountalis et al. (2018) and Falasca et al. (2019) in their socalled δMAPS approach to comprise a whole network of all communities, which is evaluated with regards to the distribution and size of communities, the interaction strength, and the distribution of the links.
Note that there is another line of research into the evaluation of causal networks (for instance VázquezPatiño et al., 2019, or Nowack et al., 2020) which is somewhat different to the approach followed here.
In the present article, we explain (Sect. 3) and apply (Sect. 4) δMAPS (Fountalis et al., 2018) to construct functional networks for sea surface temperature (SST) and geopotential height at 500 hPa (Z500) fields, as well as a crossnetwork between SST and Z500, using GCM output data from the Coupled Model Intercomparison Project Phase 6 (CMIP6). We compare the derived networks to analogous networks from reanalysis data, namely CERA20C (Laloyaux et al., 2018) and 20CRv3 (Slivinski et al., 2019), to evaluate the capacity of the GCMs in reproducing complex nonlinear processes in the atmosphere and the ocean.
This assessment is all the more instructive as it is not possible to tune the teleconnections directly. In nature and in models, teleconnections emerge from the interplay of the governing equations under the condition of the boundaries. A model gets them right if and only if the model specifications are sufficiently well approximated and well balanced between model components.
The objective of the present study is to compare the interaction networks derived from CMIP6 GCM output from historical simulations to reference networks derived from two centurylong reanalyses in order to account for uncertainties in observations and differences in construction methods as recommended by Hynčica and Huth (2020), Lee et al. (2019), and others: (i) the Coupled Reanalysis for the 20th Century (CERA20C) provided by the European Centre for MediumRange Weather Forecasts (ECMWF; Laloyaux et al., 2018; 10 ensemble members and ensemble mean) and (ii) the NOAA–CIRES–DOE Twentieth Century Reanalysis version 3 (20CRv3) provided by the National Oceanic and Atmospheric Administration (NOAA)/Physics Science Laboratory (PSL) (Slivinski et al., 2019; best estimate).
The presented study is intended to help the selection of physically plausible GCM runs for further dynamical downscaling in the Coordinated Downscaling Experiment–European Domain (https://www.eurocordex.net/, last access: 3 February 2022). Therefore, the CMIP6 model ensemble evaluated here follows the list of model runs under consideration in EUROCORDEX for which all necessary forcing data had been provided at the time of writing, plus some extra models (Table 1).
Bi et al. (2020)Ziehn et al. (2020)Wu et al. (2019)Swart et al. (2019)Danabasoglu et al. (2020)Cherchi et al. (2019)Cherchi et al. (2019)Voldoire et al. (2019)Séférian et al. (2019)Döscher et al. (2022)Döscher et al. (2022)Roberts et al. (2019)Boucher et al. (2020)Tatebe et al. (2019)Hajima et al. (2020)Gutjahr et al. (2019)Müller et al. (2018)Yukimoto et al. (2019)Seland et al. (2020)Seland et al. (2020)Lee et al. (2020)Sellar et al. (2019)We consider the parameters sea surface temperature (SST) and geopotential height at 500 hPa (Z500). These are relatively wellobserved and smoothly varying fields suitable for the construction of networks. Steinhäuser et al. (2012) confirm good network properties for SST and Z500 with many proximitybased correlation links as well as a large number of teleconnections. In accordance, Donges et al. (2011) found the maximal link density for geopotential height at about 4 to 6 km height, and Wiedermann et al. (2017) detected the highest transitivity between SST and geopotential height at 500–300 hPa.
From the coupled network perspective, it would be highly desirable to include further parameters into the analysis like sea surface salinity or, more interestingly, variables from the stratosphere and the deep ocean. Unfortunately, the observations of such parameters have only recently become more reliable and less sparse, such that the fidelity of their reanalysis fields is impossible to verify.
The SST (Z500) data were remapped to a common grid of $\mathrm{2.25}{}^{\circ}\times \mathrm{2.25}{}^{\circ}$ ($\mathrm{2.5}{}^{\circ}\times \mathrm{2.5}{}^{\circ}$) resolution. Regions with sea ice are avoided in SST as well as circles of 5^{∘} radius around the poles at Z500 because of possibly biased representation of the polar vortices. The analysis is carried out for seasonal anomalies in the overlapping time period from 1901 to 2010.
The procedure used to assign an assessment score to each model run comprises a number of algorithmic stages that build on each other. As they are not yet well known in the climatological community, we present them in detail in the following subsections:

Detrending with trend EOF (Sect. 3.1)

Network construction with δMAPS (Sect. 3.2)

Domain identification (Sect. 3.2.1)

Network of domains (Sect. 3.2.2)


Distance covariance and distance correlation (Sect. 3.3)

Distance multivariance and distance multicorrelation (Sect. 3.3.1)


Comparison of networks with structural similarity index and multivariate network quality score (Sect. 3.4).
3.1 Detrending with trend EOF
Prior to the construction of the δMAPS networks, the data have to be detrended to avoid the correlations being distorted by longterm trends. Although it is still the most widely used technique, linear detrending has been shown to be hardly appropriate to remove the effects of external forcing (anthropogenic and natural) from climatic time series (Frankignoul et al., 2017), given its nonlinear structure and the dynamical response mechanisms including longrange memory. Conventional empirical orthogonal function (EOF) decomposition is not well suited for trend detection either for a number of reasons (Hannachi, 2007), which often cause the spreading of longterm trends between several modes of internal variability. Instead, we apply a nonparametric technique, socalled trend EOF (Hannachi, 2007), which identifies spatial patterns of trends defined as a common nonlinear, but monotone increase. The method is based on the singular value decomposition (SVD) of the matrix of inverse ranks, instead of the direct observations as in conventional EOF analysis. Since sequences of inverse ranks provide a robust measure of monotonicity, trend EOFs are able to separate patterns associated with monotone (nonlinear) trends, albeit small, from patterns not associated with trends.
Trend EOFs have been applied since in a number of studies (e.g. Barbosa and Andersen, 2009, Li et al., 2011, Meegan Kumar et al., 2021, among others). Fisher (2015) compared trend EOFs, along with conventional EOFs, to a selection of other PCAbased techniques, which are designed to extract space–time patterns maximising criteria like persistence, predictability, or autocorrelation. In contrast to conventional EOFs, all the tested methods very robustly detect a leading EOF pattern with a respective principal component (PC) that presents a distinct nonlinearly increasing trend. We consider trend EOFs therefore to be an appropriate technique for identifying anthropogenic greenhouse gas (GHG)forced trends.
Let X=((x_{it})) be the matrix of anomaly data at grid cells i=1…n (numbered consecutively) and times t=1…T. The time series x_{i} at grid cell i is transformed to the vector of inverse ranks q_{i} by setting q_{it} equal to the time position of the tthlargest value in x_{i}. The sequence q_{i} indeed reflects the total monotonicity of x_{i}: in monotone series the inverse ranks are ordered according to the trend. The stronger the trend in x_{i}, the stronger the pattern in q_{i}. By maximising the correlation in Q=((q_{it})), we find a common trend that is shared (to some extent) by all grid cells, which makes sense in light of GHGforced warming.
After centring and cosine weighting of Q with respect to the corresponding latitude, the principal components and the loading patterns are obtained by SVD: Q=UΣV^{T}. The trend is now concentrated in the first (few) principal component(s), strongly distinguished by high eigenvalue(s) standing out over the remaining low and slowly descending spectrum. If second or thirdorder outstanding eigenvalues should be detected, they indicate additional, regionally confined independent trends, which are generated by internal dynamical feedback processes. For our purpose of identifying regions with coherent time evolution, we would therefore want to retain such regional trends and eliminate only the trend associated with the first trend PC. Likewise, regional trends caused by volcanic eruption are most probably not filtered either by the first trend EOF. However, the impacts of 20thcentury eruptions lasted only for short time periods, and on the other hand they are not well represented in surfaceinput reanalyses like CERA20C and 20CRv3 (Fujiwara et al., 2015). We therefore assume that our evaluations remain valid. The first trend PC u_{1} is now transformed back to physical space by projection w_{1}=Xu_{1}, and the corresponding spatial pattern is composed of the regression coefficients between the trend PC w_{1} and the anomaly time series of the original field ${\mathit{x}}_{i},\phantom{\rule{0.33em}{0ex}}i=\mathrm{1}\mathrm{\dots}n$.
To allow for an annual cycle in the trend patterns, we extend the trend EOFs in analogy to seasonreliant EOFs (Wang and An, 2005; see also cyclostationary EOFs in Yeo et al., 2017), $\mathbf{Q}=\left({\mathbf{Q}}_{\text{MAM}}\right{\mathbf{Q}}_{\text{JJA}}\left{\mathbf{Q}}_{\text{SON}}\right{\mathbf{Q}}_{\text{DJF}})$ (seasonally centred, inverse ranks calculated for each season individually), which extract a recurrent sequence of seasonal trend patterns with one associated trend PC for the magnitude of the whole cycle as opposed to one common pattern for all seasons as in nonseasonal EOF analysis or four individual patterns with their associated individual PCs as in seasonal EOFs, respectively. At this stage it would be possible to apply a secondary SVD to the seasonal warming patterns to obtain a smoother annual cycle. While such a procedure seems undue for seasonal data, it would be a reasonable approach in the case of monthly data. Instead of applying two sequential EOFs to Q, a tensor decomposition like higherorder singular value decomposition (HOSVD; De Lathauwer et al., 2000) would serve this purpose more elegantly.
After having detrended the time series, we are able to standardise the seasonal variances without the interference of the seasonal trends, which would otherwise bias our estimates. On their part, seasonally varying variances could degrade the estimated correlations between grid cells in the first stage of the δMAPS algorithm, giving increased weight to seasons with higher variance. In turn, the spatial component of the variance will be important in the second stage of δMAPS; therefore we augment the deseasonalised time series again with their overall (nonseasonal) variance.
3.2 Network construction with δMAPS
3.2.1 Domain identification
To infer the functional interactions within and between spatiotemporal gridded data sets of climatological parameters, we adopt the δMAPS algorithm proposed by Fountalis et al. (2018). This algorithm is rooted in network sciences/graphical modelling, in which graphs are used to express the dependence structure between random variables. A graph or network consists of a set of nodes connected by a set of edges, which describe the interactions between the nodes. Networks can be classified depending on their topology: simple networks like lattices and fully connected networks or complex networks like scalefree and smallworld networks. Smallworld networks are often observed in climate and other earth sciences, in the human brain, and in social networks. Their nodes are strongly clustered into semiautonomous components, and the average shortest path length between any two nodes is small.
In contrast to structural networks or flow networks, where the edges are physically observable (like wired connections or trajectories of particles, respectively), functional networks are inferred from the behaviour of the nodes. We consider the grid cells of a selected climatological field as the nodes of the graph. The spatial embedding is naturally given by the locations of the grid cells. In Fountalis et al. (2018) the edges of a fully connected gridcelllevel network are defined using the unpruned Pearson correlation ϱ of the time series as an association measure between any pair of nodes. Based on this weighted network, the δMAPS algorithm identifies semiautonomous components D_{1}…D_{K}, called domains. A domain is a spatially contiguous set of grid cells with highly correlated temporal activity. Fountalis et al. (2018) propose an iterative algorithm that alternately expands and merges a preliminary set of domain seeds S (neighbourhoods with locally maximal correlation, 3×3 grid cells in our case) so as to find the maximum possible sets of grid cells that satisfy the homogeneity constraint δ: let D be a spatially contiguous set of grid cells with cardinality $\leftD\right$
where ϱ_{ij} is the correlation between the time series at grid cells i and j, and δ is a chosen parameter to regulate the number and size of the domains. The domains are expanded to neighbouring grid cells (one at a time) as long as ϱ_{D}≤δ. Two domains D_{i} and D_{j} are merged if they contain at least one pair of adjacent grid cells, and their union still satisfies the threshold δ. The algorithm stops when no more domains can be merged or expanded.
The number of domains K generated by this algorithm is not predefined. Overlapping domains are allowed in δMAPS because grid cells might be influenced by more than one physical process. If a grid cell does not satisfy the homogeneity constraint with any of its neighbours, it remains unassigned. Deviating from Fountalis et al. (2018), we use Spearman's rank correlation to determine the similarity between grid cells to allow for monotone, yet nonlinear association. Furthermore, we set the threshold δ for minimal average correlation within a domain to equal a selected high quantile of all pairwise correlations (our δ is not based on a significance test; therefore there is no need to correct for autocorrelation). Lower thresholds allow the domains to expand and merge, further resulting in a smaller number of spatially larger domains, which means lower parcellation, and vice versa. In Sect. 4, we choose δ so as to produce “intuitive” domains evocative of known teleconnection patterns.
In Falasca et al. (2020), the identification of domains was further refined: grid cells are assigned to a common domain if their timevarying complexity (quantified by recurrence entropy) evolves coherently. Coherent evolution of complexity reflects coherent dynamical evolution and is thus an even stronger indicator of semiautonomous component organisation than correlation between the original climatological time series. But for complexity time series to be constructed, the proposed recurrence measure has to be evaluated on moving time windows (100year windows over 6000 years of monthly values in Falasca et al., 2020). Unfortunately, our time series are not long enough to detect complexity changes by means of recurrence entropy (nor to actually occur in the real climatological fields), so we have to stick to the original definition of δMAPS in Fountalis et al. (2018).
The first stage of δMAPS is a local community detection algorithm, where the criterion to maximise is the number of grid cells assigned to a minimum number of communities under the conditions (i) ϱ_{D}≥δ, (ii) D being spatially contiguous, and (iii) D containing a seed s∈S (Fortunato and Hric, 2016). As this problem is NPhard (solvable in polynomial time; Fountalis et al., 2018), the greedy algorithm of Fountalis et al. (2018) only approximates one possible solution. Despite this, it is able to detect meaningful communities of any size (no preferred scale) and independently from the network structure in other spatial regions.
3.2.2 Network of domains
Subsequently, the domains identified above serve as supernodes in the second stage of δMAPS. A functional weighted network is inferred between the domains on the basis of a dependence measure (in Fountalis et al., 2018, the lagged Pearson correlation is used; we use distance correlation; see Sect. 3.3). The time series of a domain is defined as
where φ_{i} is the latitude of grid cell i. In contrast to Falasca et al. (2019), we use the means instead of the sums of the grid cells for domain time series. We do so because otherwise the variances of the domains would grow with their size, something that would hinder interpretation. On the other hand, the spatial correlation within the domains, the precondition for grid cells to form a domain, impedes the decrease in the variance of the domain mean following the central limit theorem at the rate of $\sqrt{\leftD\right}$. Instead, the variances of the domain means are of comparable magnitude regardless of the domain size.
Every possible link with every possible lag $L\le l\le L$ is tested for significance, which constitutes a multipletesting problem such that the cumulative probability of type I errors increases. One way to control the false discovery rate FDR to be smaller than a predefined level α was proposed by Benjamini (2010): the p levels of the individual tests are in ascending order, ${p}_{\left(\mathrm{1}\right)}\le \mathrm{\dots}\le {p}_{\left(\frac{\mathrm{1}}{\mathrm{2}}K\right(K\mathrm{1}\left)\right(\mathrm{2}L+\mathrm{1}\left)\right)}$, and the hypothesis (H_{0}; link is insignificant) is rejected only for those tests where ${p}_{\left(k\right)}<\frac{\mathrm{2}k\mathit{\alpha}}{K(K\mathrm{1})(\mathrm{2}L+\mathrm{1})}$.
The network consists of two maps, D (D: set of nodes (grid cells) ⟶ power set of domains 𝒫(D_{1}…D_{K}), which assigns one/several/no domains to every grid cell) and W (W: set of pairs of domains $\mathit{\left\{}{D}_{\mathrm{1}}\mathrm{\dots}{D}_{K}\mathit{\right\}}\times \mathit{\left\{}{D}_{\mathrm{1}}\mathrm{\dots}{D}_{K}\mathit{\right\}}\phantom{\rule{0.33em}{0ex}}\u27f6$ maximal (lagged) dependence ∈ℝ, which assigns every pair of domains a link that equals the maximal (lagged) dependency between them; we allow lags up to 10 seasons).
The distinction between grid cells that are dependent within the same domain and grid cells that are dependent across two different domains allows δMAPS to differentiate between local diffusion phenomena and remote interactions as for instance an atmospheric bridge or an oceanic tunnel (Liu and Alexander, 2007).
Since the techniques to construct the δMAPS network are statistical, long time series are convenient in order to obtain robust estimates of the dependence measures. In the case of nonstationarity, such estimates would be biased and reflect only a temporal average connectivity between the components of the network. The time dependence can be addressed using evolving networks, which are constructed over sliding time windows (see for instance Kittel et al., 2021, and Novi et al., 2021). The present study considers a timeconstant network for the period 1901–2010 and a shorterperiod network for 1951–2010, where more observations are available for assimilation into the reanalyses. To investigate the temporal evolution, a third network is constructed for 1901–1955.
The complex network framework offers a lot more approaches in order to exploit the richness of the data, as for instance multiscale, causal, and multilayer networks. Wavelet multiscale networks were proposed for investigating interactions in the climate system simultaneously at different temporal scales, revealing features which usually remain hidden when looking at one particular timescale only (Agarwal et al., 2018, 2019). Interactions between processes evolving on different timescales are investigated by Jajcay et al. (2018). Moreover, as the number of identified domains within a climatological field is drastically smaller than the number of original grid cells, this also opens up the possibility of investigating the causal relationships between them (Nowack et al., 2020), although the basic assumption of causal network inference that the dependence structure can be represented by a directed acyclic graph is questionable in the climate context. The construction of both dependencebased and causal networks can naturally be extended to crossnetworks, which include multiple fields (Feng et al., 2012; Ekhtiari et al., 2021).
3.3 Distance covariance and distance correlation
As physical processes in climate are highly dynamical and mostly nonlinear (Donges et al., 2009), we decided to substitute the Pearson correlation in the second step of network inference by a nonlinear dependence measure: distance correlation proposed by Székely et al. (2007). To begin with, distance covariance, calculated from the pairwise Euclidean distances within each sample, is an analogue to the productmoment covariance, but it is zero if and only if the random vectors are independent. The intuition of distance covariance is that if there exists a dependence between the random variables X and Y, then for two similar realisations of X, say x_{s} and x_{t}, the two corresponding realisations of Y, y_{s} and y_{t}, should be similar as well. Note that the opposite (x_{s}, x_{t} unsimilar ⟹ y_{s}, y_{t} unsimilar) is true for linear dependence, but not true in general.
Unlike the widely used information measures, distance covariance has a compact representation, is computationally fast, and is reliable in a statistical sense for sample sizes common in climatology because it is not necessary to estimate the density of the samples. We use the unbiased version of distance covariance given in Székely and Rizzo (2014). Let (x_{t}),(y_{t}), t=1…T be a statistical sample from a pair of real or vectorvalued random variables X and Y. First, compute all pairwise Euclidean distances:
Then perform a double centring for all s≠t:
Then distance covariance dCov is defined as
Distance variance dVar and distance correlation dCor are defined analogously to moment variance and moment correlation, respectively:
Distance correlation has a number of desirable properties:

0≤ dCor$(X,Y)\le \mathrm{1}$;

dCor$(X,Y)=\mathrm{0}\u27faX,\phantom{\rule{0.33em}{0ex}}Y$ independent;

$\text{dCor}(X,Y)=\mathrm{1}\u27faY$ is a linear transformation of X.
Distance correlation is furthermore robust against autodependence (Fokianos and Pitsillou, 2018), which eliminates the need to correct for autocorrelation, as was done in Fountalis et al. (2018). The correction of autocorrelation involves the estimation of a rather large number of autocorrelation coefficients. This might add to statistical uncertainty, and its expendability is therefore statistically advantageous.
An efficient test of distance correlation based on the χ^{2} distribution was proposed by Shen et al. (2022), which is universally consistent and valid for α≤0.05:
Distance correlation is defined between vectors of arbitrary dimension. One way to take advantage of this property in the construction of networks would be to assign the measurement of more than one climatological variable to every node, e.g. sea surface temperature and salinity or 500 hPa geopotential height and temperature.
We apply distance correlation in the network inference between the domains, but not in the construction of the domains. The reason is that in domain construction we are looking for similar temporal behaviour between grid cells. We choose Spearman's rank correlation because it accounts for nonlinear, yet monotone association. In contrast, in network inference we are expressly interested in nonlinear dependence including nonmonotonicity.
3.3.1 Distance multivariance and distance multicorrelation
Distance correlation has also been generalised to distance multivariance/multicorrelation by Böttcher et al. (2019) to measure the dependence between an arbitrary number n of random variables in the sense of Lancaster interaction (Lancaster, 1969; Streitberg, 1990). The Lancaster interaction ΔF quantifies the fraction of dependence between them that is not explained by factorisation, their synergy. For n=3, let F_{123} be the threedimensional joint distribution function of X_{1}, X_{2}, and X_{3}; F_{12}, F_{13}, and F_{23} the pairwise joint distributions; and F_{1}, F_{2}, and F_{3} the marginal distribution functions. Then the Lancaster interaction is defined as
the fraction of F_{123} that is not explained by pairwise dependence. Lancaster interaction excludes, in particular, linear dependence as this is indeed explained by pairwise dependence.
The concept of higherorder dependence is related to joint cumulants and higherorder moments in that ${\mathit{\kappa}}_{n}\left({X}_{\mathrm{1}}\mathrm{\dots}{X}_{n}\right)=\int {x}_{\mathrm{1}}\mathrm{\dots}{x}_{n}d\mathrm{\Delta}F$ (Streitberg, 1990). Joint cumulants are traditionally applied in multiplepoint statistics and hyperspectral analysis to describe nonlinear interaction and nonGaussian multidimensional distributions. Climate science has seen only a small number of implementations, including the contributions of Carlos A. L. Pires related to teleconnections (e.g. Pires and Hannachi, 2017, 2021). As a feature of complex systems, higherorder interactions have already been recognised as critical for the emergence of complex behaviour such as synchronisation and bifurcation in scientific fields as diverse as social networks science, ecology, molecular biology, quantum physics, neurosciences, epidemics, geodesy, image processing, and genetics (Battiston et al., 2020), and tools for the construction of hypergraphs (graphs with links that comprise more than two nodes) are increasingly available. To our knowledge, hypergraphs have not yet been introduced in climatology.
Distance multivariance is defined analogously to distance variance (Eq. 1) and is a strongly consistent estimator of Lancaster interaction (Böttcher et al., 2019). For n=3 and C_{st} the analogue to A_{st} and B_{st} for a third random variable Z is
Distance multicorrelation is defined likewise, with a slightly different normalisation:
Obviously, distance covariance between two random variables is covered by distance multivariance for n=2. Significance tests for distance multivariance are also given in Böttcher et al. (2019). As the asymptotic test is conservative, and furthermore, in the case of nonzero pairwise dependence, the test statistic is not guaranteed to diverge, it is convenient to choose a larger FDR level than the usually employed significance levels between 0.1 and 0.01.
3.4 Comparison of networks with structural similarity index and multivariate network quality score
This study aims at comparing the interaction networks derived from CMIP6 model output to the selected reference networks. Our metric of comparison netSSIM is a modification of the netCorr criterion for functional networks developed by Falasca et al. (2019). The netCorr is a sophisticated metric which evaluates the differences in topology and connectivity, combined in the adjacency matrix M of each network, simultaneously. Let $\mathbf{M}={\left(\left({M}_{ij}\right)\right)}_{i,j=\mathrm{1}}^{n}$ be a square matrix of dimension n (number of grid cells) with
where we set $W({D}_{k},{D}_{l})=\text{dCor}({\mathit{x}}_{{D}_{k}},{\mathit{x}}_{{D}_{l}})$. Alternatively, M could be rearranged in a fourmodal hypermatrix or tensor made of the Kronecker product of the lat–long field times itself containing the dependencies between the grid cells.
Apart from replacing Pearson with distance correlation, our definition of M differs from the one in Falasca et al. (2019) in three aspects. Firstly, our links are undirectional because distance correlation is much less sensitive to temporal lag than Pearson correlation. The distance correlation coefficients for lags $\mathrm{10}\le L\le \mathrm{10}$ differ only marginally from the value for L=0. So although we do construct M using maximum lagged distance correlation, we do not venture to infer the direction of the interaction from it. Secondly, we have defined $W({D}_{k},{D}_{k})=\mathrm{1}$, causing M_{ij}=1 if x_{i} and x_{j} pertain to the same domain (and no other) to emphasise that grid cells within one domain are more strongly linked to each other than to the grid cells of other domains. Thirdly, we set M_{ij} the average of the links between domains that x_{i} and x_{j} belong to instead of the maximum as a means to account for overlapping domains. We do not apply any weighting to this average because the mean internal rank correlation within each domain, i.e. the bond of a grid cell to its domains, is equally ≈δ by construction.
The netCorr between two networks measures the spatial correlation between the respective adjacency matrices, not considering the overall level and variability within the networks. We propose to augment netCorr to netSSIM. SSIM is the structural similarity index, a measure very popular in image processing, which combines terms for brightness (mean), contrast (variance), and structure (pattern correlation) of images (Wang et al., 2004). It was introduced to the hydrological/meteorological community by Mo et al. (2014). Let X and Y be two gridded fields:
where μ_{X} and μ_{Y} are the means, ${\mathit{\sigma}}_{X}^{\mathrm{2}}$ and ${\mathit{\sigma}}_{Y}^{\mathrm{2}}$, are the variances, σ_{XY} is the Pearson covariance between X and Y, and small constants ${c}_{\mathrm{1}}={c}_{\mathrm{2}}={c}_{\mathrm{3}}$ (we choose 0.00001) ensure regularity. The SSIM ranges from −1 to 1; it equals 1 only in the case of identity and −1 for an antianalogue (equal mean and variance, but correlation $=\mathrm{1}$). SSIM =0 means no similarity. Note that the SSIM is not invariant under translation and rotation, which corresponds to our requirements because we want the teleconnections to sit in the right place. SSIM is not a distance metric, but a distance metric can be constructed from it (Brunet et al., 2011).
Falasca et al. (2019) recommend the use of their netCorr criterion always in combination with a criterion comparing the strength of the interaction, which they define as the sum of the links of a particular domain in terms of covariance. We argue that the strength is a criterion that intermingles the distribution of interactions between the domains with the variances of the domains, which, in turn, are determined by the size of the domains and the variances of the included nodes. We therefore prefer to evaluate the interactions on their own using the netSSIM. The evaluation of the variances (or standard deviations) of model output data is a task that is already routinely performed in conventional evaluation setups.
We apply the (latitudeweighted) SSIM to two adjacency matrices M (Eq. 8) constructed from the significant distance correlations in two reference and/or model networks. In this way, we calculate netSSIM indices for the unipartite networks for SST and Z500 and for the crossnetworks between the SST and Z500 domains.
Alternatively, we could calculate the SSIM between adjacency matrices in a pointwise manner, comparing the slices of the fourmodal hypermatrices that correspond to the links of one individual grid cell to all others and then taking the weighted mean of all pointwise SSIMs.
Finally, we define a network quality score (NQS) by applying an exponential transform to the netSSIMs, which projects them to the interval [0,1] (recall that the netSSIM lives on $[\mathrm{1},\mathrm{1}]$). The same transform was used in Sanderson et al. (2015) and Brunner et al. (2020) to construct quality scores from error measures, which are later fed into a model selection algorithm.
In order to combine the three NQSs with respect to SST, Z500, and SST–Z500, we take the geometric mean (equal to the exponential of the arithmetic mean of the squared differences (1−netSSIM)^{2}). This shall be the multivariate network quality score MNQS:
The MNQS corresponds to the exponential transform of the squared Euclidean distance between the threedimensional vectornetSSIM and the ideal vectornetSSIM value $(\mathrm{1},\mathrm{1},\mathrm{1})$, which would be attained by a network identical to the reference, normalised with the distance between $(\mathrm{1},\mathrm{1},\mathrm{1})$ and $(\mathrm{0},\mathrm{0},\mathrm{0})$, $(\mathrm{0},\mathrm{0},\mathrm{0})$ being the value which indicates no similarity.
Any other vector norm could be utilised for the construction of MNQS, for instance an L_{p}norm with p≠2 or some weighting of the directions. The netSSIMs for additional parameters can be incorporated into the MNQS in a straightforward way. Finally, the considered models can be ranked with respect to these scores.
The netSSIM is also useful when exploring the differences between networks in more detail. As mentioned above, the slices of M with respect to a single grid cell or domain can be compared one by one. It is further possible to calculate the netSSIM for all pairwise links in a certain region, excluding the rest of the globe, or for all links from one region to another. This way, differences across models or time periods can be tracked down directly to their origin.
We demonstrate the functioning of every subprocedure considering the CERA20C ensemble mean over the whole period 1901–2010 as an example. All procedures are furthermore applied to the periods 1901–1955 and 1951–2010. Individual runs of CERA20C as well as 20CRv3 and CMIP6 model realisations are discussed depending on special interest.
4.1 Detrending with trend EOF
Trend EOFs (Hannachi, 2007), as introduced in Sect. 3.1, produce time series of common change (in SST and Z500) generated from the trend PCs in the inverserank space and the respective trendloading patterns (four seasonal trendloading patterns per trend PC in the case of seasonreliant/cyclostationary trend EOFs), indicating regions of stronger/weaker change. As expected, the increase in SST is concentrated in the first trend PC (the leading eigenvalues are 30 to 50 times higher than the trailing ones), the other trend PCs showing no secular trend. Figure 1a depicts the global mean sea surface temperature anomaly (GMSSTa) (with respect to the base period 1961–1990) in the CERA20C ensemble mean, the forced temperature increase estimated by the first trend EOF and the detrended anomalies. For comparison, we show the same plot for linearly detrended SSTs in Fig. S1 in the Supplement. The gridcellwise detrended anomalies are deseasonalised with regard to variance.
The GMSSTa derived from trend EOFs in all runs of CERA20C (not shown) as well as in the ensemble mean show a very similar evolution among each other and to Zhu et al. (2018), the breakpoints in temperature increase postulated therein at 1942, 1975, and 2004 clearly discernible. Likewise, the physical spaceloading patterns of the ensemble mean (Fig. 1b) and all runs of CERA20C are very similar to each other and resemble the leading modes extracted using slow feature analysis and dynamical mode decomposition in Fulton and Hegerl (2021), identified as warming trends.
Analogous plots for geopotential height anomalies at 500 hPa for the CERA20C ensemble mean can be found in Fig. 1c and d. Unfortunately, we were not able to find any comparable study in the literature, where Z500 was analysed for trend over the 20th century. Gillett et al. (2013), Knutson and Ploshay (2021), Garreaud et al. (2021), and Raible et al. (2005) considered sea level pressure (SLP) trends over different time periods and regions. Although not fully comparable, there is a certain similarity.
The projected trends as well as the loading patterns in the 20CRv3 best estimate are somewhat different for the period 1901–2010 (Fig. S2) but agree much better for 1951–2010 (not shown). This might well be related to low observational coverage during the first half of the century; we thus take this disagreement as a signal for caution.
When subject to the same procedure, the CMIP6 model output SST and Z500 anomalies produce trend EOFs and loading patterns roughly similar to CERA20C and 20CRv3 (not shown). Differences are more or less obvious, though, such that an evaluation of the GMSSTa time series in the spirit of Papalexiou et al. (2020) would be an obvious choice but is out of the scope of this paper.
4.2 δMAPS for CERA20C on 1901–2010
4.2.1 Domain identification
Our algorithm, presented in Sect. 3.2.1, combines grid cells with highly rankcorrelated time evolution into domains. Domains have to be contiguous but may overlap; grid cells may remain unassigned. Average mutual rank correlation within a domain has to be higher than a selected threshold δ; we examined the quantiles ${q}_{\mathrm{0.9}}\left({\mathit{\varrho}}_{ij}\righti\ne j=\mathrm{1}\mathrm{\dots}n)\le \mathit{\delta}\le {q}_{\mathrm{0.99}}({\mathit{\varrho}}_{ij}i\ne j=\mathrm{1}\mathrm{\dots}n)$ of all pairwise rank correlations. The plots included in this paper refer to thresholds q_{0.95} for SST and δ=q_{0.93} for Z500, chosen for their intuitive parcellation of the fields evocative of known teleconnection patterns. As varying the threshold affects the networks for different data sets in a similar way, the choice of δ changes the results only marginally.
The domains constructed this way from the detrended, deseasonalised SST anomalies of the CERA20C ensemble mean include all important SST teleconnection patterns with interannual to decadal timescales (see for example Messié and Chavez, 2011). The map of the CERA20C SST domains (Fig. 2a) resembles the corresponding maps for COBEv2 and HadISST in Falasca et al. (2019) reasonably well, taking into account the differing data sets and time periods. Their main domains are clearly identifiable: El Niño–Southern Oscillation (ENSO; o11, for its broad extension also reminiscent of region 2 of the Interdecadal Pacific Oscillation (IPO) tripole in Henley et al., 2015), the horseshoe pattern (o7), the South Pacific (o9), the Indian Ocean (o3), the North Tropical Atlantic (o15, with extension to the extratropics), the South Tropical Atlantic (o1). Furthermore there are domains in the extratropical southern (o2) and eastern Indian Ocean (o4), the extratropical southern (o12) and northeastern (o14) Atlantic and the Norwegian Sea (o16), the Gulf Stream (o13), the North Pacific Current (o8, region 1 of the IPO tripole), a domain corresponding to region 3 of the IPO tripole (o10), the Kuroshio Extension (o5), and a domain south of Australia including the Great Australian Bight (o6). Areas where sea ice occurs are omitted because of the confounding effect on SST.
In the CERA20C Z500 map of domains (Fig. 2b), the seasonally migrating Tropical Belt (TB; a15) formed by the Hadley circulation and the two polar cells (Arctic a1 and a13 largely overlapping and Antarctic a3) stand out, stretching around the whole globe. The midlatitudes are populated by numerous domains with more (over ocean) or less (over land) pronounced zonal extension (cyclone tracks). The missing segmentation of the tropical belt into several domains probably results from the seasonal time resolution.
4.2.2 Network of domains
The domains of SST and Z500 are now ready for network construction (see Sect. 3.2.2). Figure 2c illustrates the maximum lagged ($\mathrm{10}\le L\le \mathrm{10}$) distance correlations (Sect. 3.3) for all pairs of SST domains in the CERA20C ensemble mean, omitting the geographic information for enhanced clarity. Only significant links to the FDR level α=0.05 (Sect. 3.2.2) are shown. However, even weak links are assessed as significant because the time series are long enough (4 seasons × 110 years) to allow the distance correlation to be estimated accurately. The darkest shades (except of the self links) correspond to the links between ENSO (o11), the horseshoe (o7), and IPO3 (o10): o11↔o7, o11↔o10, o7↔o10. We see enhanced connectivity of o7, o10, and o11 to the northern and southern Pacific Ocean (o8, o9) and from the Pacific to the Indian Ocean (o7/o10/o11↔o3), corresponding to known ENSO teleconnections, but not to the Kuroshio Extension (o5). The southern Indian Ocean domain is furthermore linked to the South Atlantic (o2↔o12).
The intraAtlantic links are much weaker: o14, o15, and o16 are largely overlapping domains and together conceivably form the Atlantic Multidecadal Oscillation (AMO); the Gulf Stream is linked to the northeastern Atlantic (o13↔o14) as well as the tropical to the extratropical South Atlantic (o1↔o12). The South Atlantic is also weakly connected to all North Atlantic domains (o12↔o13/o14/o15/o16), but there is no link between the North Atlantic and the South Tropical Atlantic (o1). It might be hypothesised that this bypass is related to the thermohaline circulation that tunnels the shallow subtropical cell (Liu and Alexander, 2007). According to the network, the Atlantic is connected to the other oceans only via the Southern Ocean, with links o12/o13/o14/o16↔o9, o14/o15/o16↔o6, o1/o12↔o2, and o16↔o2 that appear rather weak, although visible against their virtually zero background. A link between the South Tropical Atlantic (o1) and ENSO (o11) as proposed in Falasca et al. (2019) and RodríguezFonseca et al. (2009) is not apparent in our network. This absence is likely caused by the nonstationarity of this link, which was not observed before 1970. Nevertheless, it does appear when a network is constructed for the period 1971–2010 (not shown).
Note that allowing for lagged dependence changes the network only marginally compared to a network with only instantaneous links. Few connections are increased in strength of distance correlation by more than 0.05 and none by more than 0.1. All links already exist in the instantaneous network, and the structure of the network remains unchanged.
The network between CERA20C Z500 domains (Fig. 2e) is considerably weaker than the SST network, possibly a consequence of the stronger highfrequency variability in the Z500 time series in response to seasonally varying solar forcing combined with weaker lowfrequency variability caused by stronger mixing of the freely flowing air masses. Moreover, many of the known atmospheric teleconnections vary considerably throughout the year, which weakens the allseason dependence between the involved domains. Apart from the overlapping domains a1/a13, a9/a10, and a7/a12, the Tropical Belt (a15) is the most strongly connected domain with links to the midlatitudinal Ferrel cell domains, enveloping the cyclone tracks, over all oceans (a15↔a4/a7/a9/a10/a12/a14), to which the undisturbed Hadley circulation releases a substantial amount of energy. Domains over land have fewer and weaker links. Known atmospheric teleconnections are clearly identifiable: the Pacific North America Pattern (PNA) with links a10↔a11, a10↔a14, but interestingly not a11↔a14, and the North Atlantic Oscillation (NAO) with a link a13↔a14 (and much weaker a1↔a14). Other complex teleconnections also seem to involve the Arctic domains: a1↔a2/a6/a8/a16 and a13↔a8. In contrast, the Antarctic domain (a3) is largely autonomous, as discussed in Spensberger et al. (2020). Lagged dependence is irrelevant in the Z500 network.
We notice that many known atmospheric teleconnections are defined as higherorder modes of some EOF decomposition. As such they exist only as additive modulations of their corresponding leading modes. We would therefore not expect to find many of them in our networks.
Network methods allow the investigation of interactions between different climatological fields in a straightforward way, constructing crossnetworks between (in our case) SST and Z500 domains that describe the coupled ocean–atmosphere variability (Liu and Alexander, 2007). We notice that the inference of links between the domains of two unipartite networks is different from the construction of bipartite communities in multilayer networks as in Ekhtiari et al. (2021). Here, we just calculate the distance correlations between pairs of one SST and one Z500 domain. The inferred CERA20C SST–Z500 crosslinks are shown in Fig. 2d. The connectivity is mostly quite weak, except for the crosslinks from the Tropical Belt (a15) and the northern and southern Pacific Z500 domains (a9, a10, a12) to the ENSOrelated SST domains (o7, o8, o9, o10, o11) and the tropical Indian Ocean (o3), but also the Great Australian Bight (o6). This feature was also observed by Feng et al. (2012), who related it to the Walker circulation. Z500 domains a4 and a7 participate in this pattern, but to a lesser extent. Z500 domains over oceans are usually connected to their underlying SST counterparts (a4↔o2, a7↔o6, a9/a10↔o8, a14↔o13 (SST modulating the NAO), a15↔o3/o11), although in the Atlantic this dependence is exceptionally weak (a15↔o1/o15, a16↔o14, a3↔o12). But teleconnections to more distant SST domains are, in some instances, as strong as or even stronger than those proximate crosslinks (a4↔o3/o10/o11/o12, a7↔o12, a14↔o15). Interestingly, the Arctic Z500 domain (a13) is weakly linked to the AMO domain (o15), but not to the North Pacific. Except for slightly increased overall connectivity levels, supposedly mediated by the SST, allowing for lagged dependence does not change the network.
The analogous plot for 20CRv3 can be found in Fig. S3.
4.2.3 Thirdorder interactions
As before, this subsection presents only results for CERA20C over the time period 1901–2010. The overall high level of connectivity between SST domains motivated us to take a deeper look into the dependence structure of the climate system. In a modest first attempt, we search for interacting triples in the sense of Lancaster, in graph theory termed as 2hyperedges, taking all combinations of three SST domains and calculating their thirdorder distance multicorrelation as introduced in Eq. (7) in Sect. 3.3.1. As discussed there, we choose a large FDR level α=0.2 in order to not suppress too many distance multicorrelations. To avoid cumbersome evaluations with different lag combinations, we stick to instantaneous networks.
Only a small number (13) of significant thirdorder dependencies are detected (we list them in Table 2 instead of plotting them), all somehow related to the ENSO phenomenon, one of them the IPO tripole itself. The hyperedges also include the tropical Indian Ocean (o3) and the Great Australian Bight (o6). As the nature of Lancaster interaction is inherently nonlinear, this concentration on ENSO corresponds to the findings in Hlinka et al. (2014), who detect substantial nonlinear contributions to mutual information in SST (apart from trends and seasonal variance) mainly in the central tropical Pacific. Likewise, Pires and Hannachi (2017) find synchronised extremes of uncorrelated PCs of SST in the Pacific that cannot be explained by linear interaction. Despite this, one distance multicorrelation is also detected in the North Atlantic: the SST triple (o14, o15, o16), which corresponds to the AMO.
Note that not every triple with strong pairwise dependencies also has a significant thirdorder dependence. Table 2 shows the hyperedges along with their distance multicorrelation and the sum of their pairwise distance correlations. As distance multicorrelation is symmetric, every significant hyperedge is listed only once in the table. Note also that the sum of pairwise distance correlations is not bounded by 1 because the pairwise dependencies are not mutually exclusive. Although the detected distance multicorrelations are significant, they are at most 20 % of the sum of the respective pairwise distance correlations. That means thirdorder interactions complement but not outweigh pairwise dependence in the threedimensional joint dependence.
The same comments essentially apply to crosshyperedges consisting of two SST domains and one Z500 domain or one SST domain and two Z500 domains. We detected 15 and 5 significant crosshyperedges, respectively, in the Pacific, which all resemble some ENSO interaction. The Z500 domains a13 and a14 (NAO) have no notable distance multicorrelation with North Atlantic SST domains, indicating that the North Atlantic is linked to the NAO domains on a pairwise basis (o15↔a13/a14), but no higherorder interaction is taking place. There is no hyperedge of three Z500 domains with significant multicorrelation. Known atmospheric tripoles like the Arctic Oscillation (a9, a13, a14) and the Pacific North America Pattern (a10, a11, a14) apparently lack significant thirdorder dependence.
We believe that the construction of higherorder networks including hyperedges by means of distance multicorrelation might well be one step towards understanding the synergies emerging from multivariate coupling of largescale oceanic/atmospheric teleconnections.
4.3 Comparison of networks
4.3.1 Reference networks
We turn to the comparison of reference networks in terms of the NQS and MNQS criteria (see Sect. 3.4), calculated from the adjacency matrices M containing the regionally distributed distance correlation links between all pairs of domains (Eq. 8). As CERA20C was produced as a 10member ensemble representing the inevitable sampling and modelling uncertainty inherent in the production process, we take this opportunity to construct the δMAPS networks individually for each member. The results are matched to the networks derived for the CERA20C ensemble mean.
The CERA20C individual networks for the complete time period 1901–2010 are very similar to each other, with average NQSs close to 1 for all three parameters (average NQS_SST = 0.98, average NQS_Z500 = 0.94, average NQS_SST–Z500 = 0.96), such that the MNQSs have a mean of 0.96 with only a small spread. The average MNQS between the individual CERA20C runs and the CERA20C ensemble mean is 0.96. The small differences are brought about by the pattern correlation factor in netSSIM, the mean and variance factor being virtually equal to 1.
The networks for the shorter periods 1901–1955 and 1951–2010 are equally similar with average MNQS = 0.95 and 0.95 between runs and 0.95 and 0.96 for the ensemble mean, respectively. Because the networks for individual CERA20C runs and the CERA20C ensemble mean are nearly indistinguishable, we only take the CERA20C ensemble mean networks for reference in the following comparisons.
When analysing the temporal evolution of the connectivity in the CERA20C ensemble mean, we find good agreement between the first and second half of the century (MNQS = 0.87; Table 3), resulting from comparable differences in the SST and SST–Z500 networks and higher similarity at Z500 (NQS_SST = 0.84, NQS_Z500 = 0.93, NQS_SST–Z500 = 0.84). In contrast, the full period is more similar to the first half in all networks (MNQS = 0.96, NQS_SST = 0.95, NQS_Z500 = 0.96, NQS_SST–Z500 = 0.95) than to the second half because especially the SST–Z500 networks bear more differences (MNQS = 0.92, NQS_SST = 0.92, NQS_Z500 = 0.94, NQS_SST–Z500 = 0.89). We emphasise that the networks contain only information about the strength of the dependencies between the domains and not about their functional form.
Because of deviating domain extension and numbering, comparing the networks by means of the rectangular network plots (like in Fig. 2c–e) is cumbersome. In Fig. 3 we have plotted twomodal slices of the spatially distributed adjacency hypermatrices M with respect to grid cells in the ENSO domain, in the AMO domain and in the Tropical Belt, respectively. The comparison of these slices is evidently not exhaustive but may give a hint regarding the nature of the differences between the networks.
The domains in the three CERA20C SST network slices for the ENSO domain (Fig. 3a–c) are very similar in shape and size, but the links between the domains are differently distributed. The networks most obviously disagree in link strength from ENSO to the tropical Indian Ocean, but also from ENSO to the North Tropical Atlantic, to the North Pacific, and to the Southern Ocean. The same is visible in the network slices for the AMO domain (Fig. 3d–f).
In contrast, the CERA20C Z500 network slices for the Tropical Belt (Fig. 3g–i) bear more apparent similarity than the SST network slices, which was already apparent in the network quality scores above. Although the shape of the tropical belt differs slightly more than the shape of the ENSO domain, the links to the rest of the globe resemble each other more strongly. However, the domains over the North Pacific and the southern Indian Ocean seem somewhat ambiguous.
The crosslinks from ENSO to the Z500 domains (Fig. 3j–l) and from the Tropical Belt to the SST domains (not shown) show differences similar to the unipartite networks. Yet, the stabilising effect of the selflinks (large patches with distance correlation 1) does not apply to the SST–Z500 crossnetworks, such that the network scores may turn out a little lower.
As regards the second reanalysis 20CRv3, we observe strong similarity to the CERA20C ensemble mean in the two shorter time periods 1951–2010 and 1901–1955 (MNQS = 0.89 and MNQS = 0.88; Table 3 and Fig. S4), where disagreement within the same time period is mainly restricted to higher southern latitudes (remember that the SSIM includes an area weighting). But dissimilarities between the first and the second half of the century are stronger in 20CRv3 than in CERA20C (MNQS = 0.81 and MNQS = 0.87; Table 3). Notably, in 1901–1955 20CRv3 shows the same strong connection between SST domains around the whole tropics as CERA20C, which is lost in 1951–2010 in both reanalyses. In contrast, the similarity between 20CRv3 and CERA20C is slightly reduced in 1901–2010 (MNQS = 0.82; Table 3 and Fig. S4) mainly due to differing atmospheric interactions and the weaker crosslinks in 20CRv3 compared to CERA20C (NQS_SST = 0.94, NQS_Z500 = 0.81, NQS_SST–Z500 = 0.72). In all three networks (SST, Z500, SST–Z500) we observe that regional unsimilarity increases with latitude. Table S1 shows pairs of most similar domains between CERA20C and 20CRv3 along with their domainwise network quality score.
Besides, the similarity between the different time periods in 20CRv3 is not the same as in CERA20C, with 1901–2010 more similar to 1951–2010 than to 1901–1955 (MNQS = 0.84 and MNQS = 0.90; Table 3 and Fig. S4). For example, in contrast to CERA20C, the link between ENSO and the South Pacific vanishes after 1950 in 20CRv3. This might be a consequence of sparse observations in the first half of the century and thus a stronger dynamical heritage from the models used to produce the reanalyses. On the other hand, there might have been changes in connectivity driven by increasing GHG levels, which are not equally reflected in CERA20C and 20CRv3 (they are model results after all). Caution leads us therefore to restrict the comparison of CMIP6 data sets to reanalyses in the period 1951–2010.
4.3.2 CMIP6 networks
The networks belonging to the CMIP6 historical projections (listed in Table 1) are compared in Fig. 4 to the CERA20C ensemble mean (bold black cross marks) and to the 20CRv3 best estimate (bold red cross marks) in the time period 1951–2010 in terms of individual network NQSs (for SST networks (a), for Z500 networks (b), and for the crossnetworks (c)) and in terms of MNQSs for each reference, respectively (d). Finally we take the average of both MNQSs to account for the uncertainty inherent in the reanalyses: $\mathrm{1}/\mathrm{2}\left(\text{MNQS}\right(\text{CERA20C})+\text{MNQS}(\text{20CRv3}\left)\right)$ ((e), bold cross marks). As expected, the similarity between models and references is generally weaker than between references, although in the Z500 networks some models reach a comparable level. Network quality scores are highest for Z500, followed by SST and SST–Z500. SST–Z500 crossnetworks show the greatest deviations across models as well as across references. The seemingly contradictory scores for Z500 with respect to CERA20C and 20CRv3 have to be put into perspective with their very high values and can be traced back to the differences between the reanalyses.
When applying the alternative, pointwise SSIM calculation (Fig. 4, thin black and red cross marks), the final average MNQS values are somewhat lower in their overall level, but similar in spread, and the model ranking suffers only minor changes.
The differences between the reanalyses are also reflected in the MNQSs of the models, where the reanalyses agree very well upon some models (HadGEM3GC31LL, IPSLCM6ALR, MPIESM112HR, MIROCES2L, MIROC6) but less upon others (MRIESM20, TaiESM1, CNRMCM61, CNRMESM21). But altogether, a tendency to differentiate between more/less similar models with respect to reanalyses is clearly visible. We conclude that, when combining several references from independent sources, the average MNQS over these references is a valid evaluation instrument for assessing whether the teleconnections between large climate components in a general circulation model are realistically represented. Still, as our evaluation is restricted to a single run per model, we are not able to differentiate between good runs and good models as such.
Using the example of four of the highestranking GCM runs with respect to MNQS, we illustrate in short the opportunities offered by the δMAPS approach to detect model deficiencies. We examine some of the pointwise adjacency maps of ECEarth3, UKESM10LL, MPIESM12HR, and IPSLCM6ALR in comparison to CERA20C and 20CRv3 over 1951–2010 (Figs. S5–S9). In the SST networks, we notice that differences are not restricted to higher latitudes, as was the case for the two reanalyses. Even in the main feature of interannual variability, ENSO, spatial connectivity deviates significantly. In all models the tropical Indian Ocean depends much more strongly, although to varying degrees, on ENSO than in both reanalyses (Fig. S5). ECEarth3 and IPSLCM6ALR do not at all reproduce the northern extension of the ENSO domain seen in both reanalyses (Fig. S5a, d, e, f), which reflects the widely recognised lowfrequency interdependency between ENSO and the Pacific Decadal Oscillation (PDO) (Henley et al., 2015). The links to the southern Indian Ocean and the South Atlantic differ considerably across models, but no model shows better performance in all domains. In MPIESM12HR the dependence between AMO and ENSO is exaggerated, whereas in IPSLCM6ALR the Norwegian Sea is nearly disconnected from the Tropical North Atlantic, which is not consistent with AMO (Fig. S6c and d). As regards Z500, UKESM10LL shows an unrealistic link between the Tropical Belt and the Antarctic domain (Fig. S7b). At the same time, the dependence of the Arctic domain is matched well only in UKESM10LL (Fig. S8b). In contrast, the crosslinks from ENSO to Z500 are well represented in all four models (Fig. S9).
Continuing the analysis of all pointwise adjacency maps, it would be possible to identify regions/climate phenomena of higher and lower confidence in any model, an exercise that might be instructive for both modelling groups and downstream users of climate projections.
In order to evaluate the physical plausibility of CMIP6 GCM output, we have constructed functional interaction networks within and between the SST and Z500 multivariate time series of 2 reanalyses (CERA20C and 20CRv3) and 22 GCM output data sets using the δMAPS procedure. In response to several theoretical challenges related to the nature of longterm climate data, a number of innovations were introduced into δMAPS:

detrending with seasonreliant trend EOFs

network construction using distance correlation

distance multicorrelation for higherorder interactions

network comparison with the structural similarity index

construction of a multireference multivariate network quality score.
First of all, the two reanalyses were compared to one another in considerable detail, including the temporal evolution of the interactions in the course of the 20th century. It could not be excluded that inconsistencies between the first and second half of the century arise at least partly from data uncertainty. The evaluation of CMIP6 model output against the references revealed a very high general similarity of the atmospheric connectivity, though with gradual differences. Oceanic teleconnections are less accurately reflected and the model differences more pronounced. The strongest deviations are found in the crossnetworks between Z500 and SST, which cooccur sometimes, but not always, with lower network quality scores in the unipartite networks. We combined the three network quality scores for each CMIP6 model on an equal basis, emphasising the equivalent importance of all considered geophysical subsystems in the generation of the earth's climate. Taking into account the uncertainty inherent in any reference, the average multivariate network quality score over several, preferably independent, references can certainly be considered a suitable criterion to assess the similarity of physical interactions between climate components in a model to those in observations.
In addition, the proposed complex network framework combined with the distance correlation measure offers many promising multivariate extensions of δMAPS as, for example, node definition based on multivariate time series, consideration of higherorder dependence, interactions on multiple timescales, and timeevolving networks. Such comparisons could be very useful to investigate subtle differences between various reanalyses. Besides, the characterisation of network evolution from past to future could add a new facet to the understanding of climate change.
The δMAPS software can be obtained in https://github.com/FabriFalasca/deltaMAPS (last access: 10 May 2020, Falasca, 2020). The GCM data used in this study are part of the World Climate Research Programme's (WCRP) 6th Coupled Model Intercomparison Project (CMIP6) openaccess data. It was accessed through the Earth System Grid Federation (ESGF; https://esgfnode.llnl.gov/search/cmip6/, last access: 6 December 2021, Deutsches Klimarechenzentrum, 2021). CERA20C data are available at https://www.ecmwf.int/en/forecasts/datasets/reanalysisdatasets/cera20c (last access: 9 May 2020, European Centre for MediumRange Weather Forecasts, 2020). The 20CRv3 data are available at https://psl.noaa.gov/data/gridded/data.20thC_ReanV3.html (last access: 21 April 2021, NOAA Physics Science Laboratory, 2021).
The supplement related to this article is available online at: https://doi.org/10.5194/esd14172023supplement.
CD developed the concept, processed the data, prepared the manuscript, and produced all figures; KW and AW contributed with indepth discussions, interpretation, and review.
The contact author has declared that neither of the authors has any competing interests.
Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
We thank the editor and two anonymous referees for their insightful comments on the paper.
Support for the Twentieth Century Reanalysis Project version 3 data set is provided by the US Department of Energy, Office of Science Biological and Environmental Research (BER); by the NOAA Climate Program Office; and by the NOAA Physical Sciences Laboratory.
This paper was edited by Kira Rehfeld and reviewed by two anonymous referees.
Agarwal, A., Maheswaran, R., Marwan, N., Caesar, L., and Kurths, J.: Waveletbased multiscale similarity measure for complex networks, Eur. Phys. J. B, 91, 296, https://doi.org/10.1140/epjb/e2018904606, 2018. a
Agarwal, A., Caesar, L., Marwan, N., Maheswaran, R., Merz, B., and Kurths, J.: Networkbased identification and characterization of teleconnections on different scales, Sci. Rep., 9, 8808, https://doi.org/10.1038/s41598019454235, 2019. a
Barbosa, S. M. and Andersen, O. B.: Trend patterns in global sea surface temperature, Int. J. Climatol., 29, 2049–2055, https://doi.org/10.1002/joc.1855, 2009. a
Battiston, F., Cencetti, G., Iacopini, I., Latora, V., M., L., Patania, A., Young, J.G., and Petri, G.: Networks beyond pairwise interactions: Structure and dynamics, Phys. Rep., 874, 1–92, https://doi.org/10.1016/j.physrep.2020.05.004, 2020. a
Benjamini, Y.: Discovering the false discovery rate, J. R. Statist. Soc. B, 72, 405–416, https://doi.org/10.1111/j.14679868.2010.00746.x, 2010. a
Bi, D., Dix, M., Marsland, S., O’Farrell, S., Sullivan, A., Bodman, R., Law, R., Harman, I., Srbinovsky, J., Rashid, H. A., Dobrohotoff, P., Mackallah, C., Yan, H., Hirst, A., Savita, A., Boeira Dias, F., Woodhouse, M., Fiedler, R., and Heerdegen, A.: Configuration and spinup of ACCESSCM2, the new generation Australian Community Climate and Earth System Simulator Coupled Model, J. South. Hemisphere Earth Syst. Sci., 70, 225–251, https://doi.org/10.1071/ES19040, 2020. a
Böttcher, B., KellerRessel, M., and Schilling, R.: Distance multivariance: New dependence measures for random vectors, Ann. Stat., 47, 2757–2789, https://doi.org/10.1214/18AOS1764, 2019. a, b, c
Boucher, O., Servonnat, J., Albright, A. L., Aumont, O., Balkanski, Y., Bastrikov, V., Bekki, S., Bonnet, R., Bony, S., Bopp, L., Braconnot, P., Brockmann, P., Cadule, P., Caubel, A., Cheruy, F., Codron, F., Cozic, A., Cugnet, D., D'Andrea, F., Davini, P., de Lavergne, C., Denvil, S., Deshayes, J., Devilliers, M., Ducharne, A., Dufresne, J.L., Dupont, E., Éthé, C., Fairhead, L., Falletti, L., Flavoni, S., Foujols, M.A., Gardoll, S., Gastineau, G., Ghattas, J., Grandpeix, J.Y., Guenet, B., Guez, L. E., Guilyardi, E., Guimberteau, M., Hauglustaine, D., Hourdin, F., Idelkadi, A., Joussaume, S., Kageyama, M., Khodri, M., Krinner, G., Lebas, N., Levavasseur, G., Lévy, C., Li, L., Lott, F., Lurton, T., Luyssaert, S., Madec, G., Madeleine, J.B., Maignan, F., Marchand, M., Marti, O., Mellul, L., Meurdesoif, Y., Mignot, J., Musat, I., Ottlé, C., Peylin, P., Planton, Y., Polcher, J., Rio, C., Rochetin, N., Rousset, C., Sepulchre, P., Sima, A., Swingedouw, D., Thiéblemont, R., Traore, A. K., Vancoppenolle, M., Vial, J., Vialard, J., Viovy, N., and Vuichard, N.: Presentation and Evaluation of the IPSLCM6ALR Climate Model, J. Adv. Model. Earth Sy., 12, e2019MS002010, https://doi.org/10.1029/2019MS002010, 2020. a
Brands, S.: A circulationbased performance atlas of the CMIP5 and 6 models for regional climate studies in the Northern Hemisphere midtohigh latitudes, Geosci. Model Dev., 15, 1375–1411, https://doi.org/10.5194/gmd1513752022, 2022. a, b
Brunet, D., Vrscay, E. R., and Wang, Z.: A Class of Image Metrics Based on the Structural Similarity Quality Index, in: Image Analysis and Recognition, edited by: Kamel, M. and Campilho, A., vol. 6753 Part 1, 100–110, Springer, Berlin, Heidelberg, https://doi.org/10.1007/9783642215933_11, 2011. a
Brunner, L., Pendergrass, A. G., Lehner, F., Merrifield, A. L., Lorenz, R., and Knutti, R.: Reduced global warming from CMIP6 projections when weighting models by performance and independence, Earth Syst. Dynam., 11, 995–1012, https://doi.org/10.5194/esd119952020, 2020. a
Cannon, A. J.: Reductions in daily continentalscale atmospheric circulation biases between generations of global climate models: CMIP5 to CMIP6, Environ. Res. Lett., 15, 064006, https://doi.org/10.1088/17489326/ab7e4f, 2020. a, b
Cherchi, A., Fogli, P. G., Lovato, T., Peano, D., Iovino, D., Gualdi, S., Masina, S., Scoccimarro, E., Materia, S., Bellucci, A., and Navarra, A.: Global Mean Climate and Main Patterns of Variability in the CMCCCM2 Coupled Model, J. Adv. Model. Earth Sy., 11, 185–209, https://doi.org/10.1029/2018MS001369, 2019. a, b
Coburn, J. and Pryor, S. C.: Differential Credibility of Climate Modes in CMIP6, J. Climate, 34, 8145–8164, https://doi.org/10.1175/JCLID210359.1, 2021. a
Danabasoglu, G., Lamarque, J.F., Bacmeister, J., Bailey, D. A., DuVivier, A. K., Edwards, J., Emmons, L. K., Fasullo, J., Garcia, R., Gettelman, A., Hannay, C., Holland, M. M., Large, W. G., Lauritzen, P. H., Lawrence, D. M., Lenaerts, J. T. M., Lindsay, K., Lipscomb, W. H., Mills, M. J., Neale, R., Oleson, K. W., OttoBliesner, B., Phillips, A. S., Sacks, W., Tilmes, S., van Kampenhout, L., Vertenstein, M., Bertini, A., Dennis, J., Deser, C., Fischer, C., FoxKemper, B., Kay, J. E., Kinnison, D., Kushner, P. J., Larson, V. E., Long, M. C., Mickelson, S., Moore, J. K., Nienhouse, E., Polvani, L., Rasch, P. J., and Strand, W. G.: The Community Earth System Model Version 2 (CESM2), J. Adv. Model. Earth Sy., 12, e2019MS001916, https://doi.org/10.1029/2019MS001916, 2020. a
De Lathauwer, L., De Moor, B., and Vandewalle, J.: A Multilinear Singular Value Decomposition, SIAM J. Matrix Anal. A., 21, 1253–1278, https://doi.org/10.1137/S0895479896305696, 2000. a
Deutsches Klimarechenzentrum: ESGFData, [data set], https://esgfnode.llnl.gov/search/cmip6/, last access: 6 December 2021. a
Dijkstra, H. A., HernándezGarcía, E., Masoller, C., and Barreiro, M.: Networks in Climate, Cambridge University Press, Cambridge, https://doi.org/10.1017/9781316275757, 2019. a
Donges, F. J., Zou, Y., Marwan, N., and Kurths, J.: Complex networks in climate dynamics: Comparing linear and nonlinear network construction methods, Eur. Phys. J. Spec. Top., 174, 157–179, https://doi.org/10.1140/epjst/e2009010982, 2009. a
Donges, J., Schultz, H., Marwan, N., Zou, Y., and Kurths, J.: Investigating the topology of interacting networks: Theory and application to coupled climate subnetworks, Eur. Phys. J. B, 84, 635–651, https://doi.org/10.1140/epjb/e2011107958, 2011. a
Döscher, R., Acosta, M., Alessandri, A., Anthoni, P., Arsouze, T., Bergman, T., Bernardello, R., Boussetta, S., Caron, L.P., Carver, G., Castrillo, M., Catalano, F., Cvijanovic, I., Davini, P., Dekker, E., DoblasReyes, F. J., Docquier, D., Echevarria, P., Fladrich, U., FuentesFranco, R., Gröger, M., v. Hardenberg, J., Hieronymus, J., Karami, M. P., Keskinen, J.P., Koenigk, T., Makkonen, R., Massonnet, F., Ménégoz, M., Miller, P. A., MorenoChamarro, E., Nieradzik, L., van Noije, T., Nolan, P., O'Donnell, D., Ollinaho, P., van den Oord, G., Ortega, P., Prims, O. T., Ramos, A., Reerink, T., Rousset, C., RuprichRobert, Y., Le Sager, P., Schmith, T., Schrödner, R., Serva, F., Sicardi, V., Sloth Madsen, M., Smith, B., Tian, T., Tourigny, E., Uotila, P., Vancoppenolle, M., Wang, S., Wårlind, D., Willén, U., Wyser, K., Yang, S., YepesArbós, X., and Zhang, Q.: The ECEarth3 Earth system model for the Coupled Model Intercomparison Project 6, Geosci. Model Dev., 15, 2973–3020, https://doi.org/10.5194/gmd1529732022, 2022. a, b
Duan, Y., Kumar, S., and Kinter, J. L.: Evaluation of LongTerm Temperature Trend and Variability in CMIP6 Multimodel Ensemble, Geophys. Res. Lett., 48, e2021GL093227, https://doi.org/10.1029/2021GL093227, 2021. a
European Centre for MediumRange Weather Forecasts: CERA20C, [data set], https://www.ecmwf.int/en/forecasts/datasets/reanalysisdatasets/cera20c, last access: 9 May 2020. a
Ekhtiari, N., Ciemer, C., Kirsch, C., and Donner, R.: Coupled network analysis revealing global monthly scale covariability patterns between seasurface temperatures and precipitation in dependence on the ENSO state, Eur. Phys. J.Spec. Top., 230, 3019–3032, https://doi.org/10.1140/epjs/s1173402100168z, 2021. a, b
Falasca, F.: deltaMAPS, [code], https://github.com/FabriFalasca/deltaMAPS, last access: 10 May 2020. a
Falasca, F., Bracco, A., Nenes, A., and Fountalis, I.: Dimensionality reduction and network inference for climate data using δMAPS: Application to the CESM Large Ensemble sea surface temperature, J. Adv. Model. Earth Sy., 11, 1479–1515, https://doi.org/10.1029/2019MS001654, 2019. a, b, c, d, e, f, g
Falasca, F., Crétat, J., and Braconnot, P. Bracco, A.: Spatiotemporal complexity and timedependent networks in sea surface temperature from mid to late Holocene, Eur. Phys. J. Plus, 135, 392, https://doi.org/10.1140/epjp/s1336002000403x, 2020. a, b
Fasullo, J. T., Phillips, A. S., and Deser, C.: Evaluation of Leading Modes of Climate Variability in the CMIP Archives, J. Climate, 33, 5527–5545, https://doi.org/10.1175/JCLID191024.1, 2020. a
Feng, A., Gong, Z., Wang, Q., and Feng, G.: Threedimensional air–sea interactions investigated with bilayer networks, Theor. Appl. Climatol., 109, 635–643, https://doi.org/10.1007/s0070401206007, 2012. a, b
Fisher, M. J.: Predictable Components in Australian Daily Temperature Data, J. Climate, 28, 5969–5984, https://doi.org/10.1175/JCLID1400713.1, 2015. a
Fortunato, S. and Hric, D.: Community detection in networks: A user guide, Phys. Rep., 659, 1–44, https://doi.org/10.1016/j.physrep.2016.09.002, 2016. a
Fountalis, I., Bracco, A., and Dovrolis, C.: ENSO in CMIP5 simulations: network connectivity from the recent past to the twentythird century, Clim. Dynam., 45, 511–538, https://doi.org/10.1007/s0038201424121, 2015. a
Fountalis, I., Dovrolis, C., Bracco, A., Dilkina, B., and Keilholz, S.: δMAPS: from spatiotemporal data to a weighted and lagged network between functional domains, Appl. Netw. Sci., 3, 21, https://doi.org/10.1007/s411090180078z, 2018. a, b, c, d, e, f, g, h, i, j, k
Frankignoul, C., Gastineau, G., and Kwon, Y.O.: Estimation of the SST Response to Anthropogenic and External Forcing and Its Impact on the Atlantic Multidecadal Oscillation and the Pacific Decadal Oscillation, J. Climate, 30, 9871–9895, https://doi.org/10.1175/JCLID170009.1, 2017. a
Fujiwara, M., Hibino, T., Mehta, S. K., Gray, L., Mitchell, D., and Anstey, J.: Global temperature response to the major volcanic eruptions in multiple reanalysis data sets, Atmos. Chem. Phys., 15, 13507–13518, https://doi.org/10.5194/acp15135072015, 2015. a
Fulton, D. J. and Hegerl, G. C.: Testing Methods of Pattern Extraction for Climate Data Using Synthetic Modes, J. Climate, 34, 7645–7660, https://doi.org/10.1175/JCLID200871.1, 2021. a, b
Garreaud, R., Clem, K., and Veloso, J.: The south pacific pressure trend dipole and the southern blob, J. Climate, 34, 7661–7676, https://doi.org/10.1175/JCLID200886.1, 2021. a
Gillett, N. P., Fyfe, J. C., and Parker, D.: Attribution of observed sea level pressure trends to greenhouse gas, aerosol, and ozone changes, Geophys. Res. Lett., 40, 2302–2306, https://doi.org/10.1002/grl.50500, 2013. a
Gutjahr, O., Putrasahan, D., Lohmann, K., Jungclaus, J. H., von Storch, J.S., Brüggemann, N., Haak, H., and Stössel, A.: Max Planck Institute Earth System Model (MPIESM1.2) for the HighResolution Model Intercomparison Project (HighResMIP), Geosci. Model Dev., 12, 3241–3281, https://doi.org/10.5194/gmd1232412019, 2019. a
Hajima, T., Watanabe, M., Yamamoto, A., Tatebe, H., Noguchi, M. A., Abe, M., Ohgaito, R., Ito, A., Yamazaki, D., Okajima, H., Ito, A., Takata, K., Ogochi, K., Watanabe, S., and Kawamiya, M.: Development of the MIROCES2L Earth system model and the evaluation of biogeochemical processes and feedbacks, Geosci. Model Dev., 13, 2197–2244, https://doi.org/10.5194/gmd1321972020, 2020. a
Hannachi, A.: Pattern hunting in climate: a new method for finding trends in gridded climate data, Int. J. Climatol., 27, 1–15, https://doi.org/10.1002/joc.1375, 2007. a, b, c
Henley, B. J., Gergis, J., Karoly, D. J., Power, S., Kennedy, J., and Folland, C. K.: A Tripole Index for the Interdecadal Pacific Oscillation, Clim. Dynam., 45, 3077–3090, https://doi.org/10.1007/s0038201525251, 2015. a, b
Hlinka, J., Hartman, D., Vejmelka, M., Novotná, D., and Paluš, M.: Nonlinear dependence and teleconnections in climate data: sources, relevance, nonstationarity, Clim. Dynam., 42, 1873–1886, https://doi.org/10.1007/s0038201317802, 2014. a
Hynčica, M. and Huth, R.: Modes of atmospheric circulation variability in the Northern Extratropics: A comparison of five reanalyses, J. Climate, 33, 10707–10726, https://doi.org/10.1175/JCLID190904.1, 2020. a, b
Jajcay, N., Kravtsov, S., Sugihara, G., Tsonis, A. A., and Paluš, M.: Synchronization and causality across time scales in El Niño Southern Oscillation, npj Clim. Atmos. Sci., 1, 33, https://doi.org/10.1038/s4161201800437, 2018. a
Fokianos, K. and Pitsillou, M.: Testing independence for multivariate time series via the autodistance correlation matrix, Biometrika, 105, 337–352, https://doi.org/10.1093/biomet/asx082, 2018. a
Kittel, T., Ciemer, C., Lotfi, N., Peron, T., Rodrigues, F., Kurths, J., and Donner, R.: Evolving climate network perspectives on global surface air temperature effects of ENSO and strong volcanic eruptions, Eur. Phys. J.Spec. Top., 230, 3075–3100, https://doi.org/10.1140/epjs/s11734021002699, 2021. a
Knutson, T. R. and Ploshay, J.: Sea Level Pressure Trends: ModelBased Assessment of Detection, Attribution, and Consistency with CMIP5 Historical Simulations, J. Climate, 34, 327–346, https://doi.org/10.1175/JCLID190997.1, 2021. a
Kristóf, E., Barcza, Z., Hollós, R., Bartholy, J., and Pongrácz, R.: Evaluation of Historical CMIP5 GCM Simulation Results Based on Detected Atmospheric Teleconnections, Atmosphere, 11, 723, https://doi.org/10.3390/atmos11070723, 2020. a
Laloyaux, P., Balmaseda, M., Bidlot, J.R., Broennimann, S., Buizza, R., Boisseson, E., Dalhgren, P., Dee, D., Haimberger, L., Hersbach, H., Kosaka, Y., Martin, M., Poli, P., Rayner, N., Rustemeier, E., and Schepers, D.: CERA20C: A coupled reanalysis of the twentieth century, J. Adv. Model. Earth Sy., 10, 1172–1195, https://doi.org/10.1029/2018MS001273, 2018. a, b
Lancaster, H. O.: The Chisquared Distribution, Wiley & Sons, Inc., New York, ISBN 9780471512301, 1969. a
Lee, J., Sperber, K., Gleckler, P., Bonfils, C., and Taylor, K.: Quantifying the agreement between observed and simulated extratropical modes of interannual variability, Clim. Dynam., 52, 4057–4089, https://doi.org/10.1007/s0038201843554, 2019. a, b
Lee, W.L., Wang, Y.C., Shiu, C.J., Tsai, I., Tu, C.Y., Lan, Y.Y., Chen, J.P., Pan, H.L., and Hsu, H.H.: Taiwan Earth System Model Version 1: description and evaluation of mean state, Geosci. Model Dev., 13, 3887–3904, https://doi.org/10.5194/gmd1338872020, 2020. a
Li, G., Ren, B., Zheng, J., and Yang, C.: Trend Singular Value Decomposition Analysis and Its Application to the Global Ocean Surface Latent Heat Flux and SST Anomalies, J. Climate, 24, 2931–2948, https://doi.org/10.1175/2010JCLI3743.1, 2011. a
Liu, Z. and Alexander, M.: Atmospheric bridge, oceanic tunnel, and global climatic teleconnections, Rev. Geophys., 45, RG2005, https://doi.org/10.1029/2005RG000172, 2007. a, b, c
Meegan Kumar, D., Tierney, J. E., Bhattacharya, T., Zhu, J., McCarty, L., and Murray, J. W.: Climatic drivers of deglacial SST variability in the eastern Pacific, Paleoceanogr. Paleoclimatol., 36, e2021PA004264, https://doi.org/10.1029/2021PA004264, 2021. a
Messié, M. and Chavez, F.: Global Modes of Sea Surface Temperature Variability in Relation to Regional Climate Indices, J. Climate, 24, 4314–4331, https://doi.org/10.1175/2011JCLI3941.1, 2011. a
Mo, R., Ye, C., and Whitfield, P. H.: Application Potential of Four Nontraditional Similarity Metrics in Hydrometeorology, J. Hydrometeorol., 15, 1862–1880, https://doi.org/10.1175/JHMD130140.1, 2014. a
Monahan, A. H., Fyfe, J. C., Ambaum, M. H. P., B., S. D., and North, G. R.: Empirical Orthogonal Functions: The Medium is the Message, J. Climate, 22, 6501–6514, https://doi.org/10.1175/2009JCLI3062.1, 2009. a
Müller, W. A., Jungclaus, J. H., Mauritsen, T., Baehr, J., Bittner, M., Budich, R., Esch, F. B. M., Ghosh, R., Haak, H., Ilyina, T., Kleine, T., Kornblueh, L., Li, H., Modali, K., Notz, D., Pohlmann, H., Roeckner, E., Stemmler, I., Tian, F., and Marotzke, J.: A Higherresolution Version of the Max Planck Institute Earth System Model (MPIESM1.2HR), J. Adv. Model. Earth Sy., 10, 1383–1413, https://doi.org/10.1029/2017MS001217, 2018. a
NOAA Physics Science Laboratory: 20CRv3, [data set], https://psl.noaa.gov/data/gridded/data.20thC_ReanV3.html, last access: 21 April 2021. a
Novi, L., Bracco, A., and Falasca, F.: Uncovering marine connectivity through sea surface temperature, Sci. Rep., 11, 8839, https://doi.org/10.1038/s4159802187711z, 2021. a
Nowack, P., Runge, J., Eyring, V., and Haigh, J. D.: Causal networks for climate model evaluation and constrained projections, Nat. Commun., 11, 1415, https://doi.org/10.1038/s4146702015195y, 2020. a, b
Papalexiou, S. M., Rajulapati, C. R., Clark, M. P., and F., L.: Robustness of CMIP6 Historical Global Mean Temperature Simulations: Trends, LongTerm Persistence, Autocorrelation, and Distributional Shape, Earth's Future, 8, e2020EF001667, https://doi.org/10.1029/2020EF001667, 2020. a, b
Pires, C. A. L. and Hannachi, A.: Independent subspace analysis of the sea surface temperature variability: NonGaussian sources and sensitivity to sampling and dimensionality, Complexity, 3076810, https://doi.org/10.1155/2017/3076810, 2017. a, b
Pires, C. A. L. and Hannachi, A.: Bispectral analysis of nonlinear interaction, predictability and stochastic modelling with application to ENSO, Tellus A, 73, 1–30, https://doi.org/10.1080/16000870.2020.1866393, 2021. a
Raible, C., Stocker, T., Yoshimori, M., Renold, M., Beyerle, U., Casty, C., and Luterbacher, J.: Northern Hemispheric trends of pressure indices and atmospheric circulation patterns in observations, reconstructions, and coupled GCM simulations, J. Climate, 18, 3968–3982, https://doi.org/10.1175/JCLI3511.1, 2005. a
Roberts, M. J., Baker, A., Blockley, E. W., Calvert, D., Coward, A., Hewitt, H. T., Jackson, L. C., Kuhlbrodt, T., Mathiot, P., Roberts, C. D., Schiemann, R., Seddon, J., Vannière, B., and Vidale, P. L.: Description of the resolution hierarchy of the global coupled HadGEM3GC3.1 model as used in CMIP6 HighResMIP experiments, Geosci. Model Dev., 12, 4999–5028, https://doi.org/10.5194/gmd1249992019, 2019. a
RodríguezFonseca, B., Polo, I., GarcíaSerrano, J., Losada, T., Mohino, E., Mechoso, C. R., and Kucharski, F.: Are Atlantic Ninños enhancing Pacific ENSO events in recent decades?, Geophys. Res. Lett., 36, L20705, https://doi.org/10.1029/2009GL040048, 2009. a
Sanderson, B. M., Knutti, R., and Caldwell, P.: A Representative Democracy to Reduce Interdependency in a Multimodel Ensemble, J. Climate, 28, 5171–5194, https://doi.org/10.1175/JCLID1400362.1, 2015. a
Séférian, R., Nabat, P., Michou, M., SaintMartin, D., Voldoire, A., Colin, J., Decharme, B., Delire, C., Berthet, S., Chevallier, M., Sénési, S., Franchisteguy, L., Vial, J., Mallet, M., Joetzjer, E., Geoffroy, O., Guérémy, J.F., Moine, M.P., Msadek, R., Ribes, A., Rocher, M., Roehrig, R., Salasy Mélia, D., Sanchez, E., Terray, L., Valcke, S., Waldman, R., Aumont, O., Bopp, L., Deshayes, J., Éthé, C., and Madec, G.: Evaluation of CNRM Earth System Model, CNRMESM21: Role of Earth System Processes in PresentDay and Future Climate, J. Adv. Model. Earth Sy., 11, 4182–4227, https://doi.org/10.1029/2019MS001791, 2019. a
Seland, Ø., Bentsen, M., Olivié, D., Toniazzo, T., Gjermundsen, A., Graff, L. S., Debernard, J. B., Gupta, A. K., He, Y.C., Kirkevåg, A., Schwinger, J., Tjiputra, J., Aas, K. S., Bethke, I., Fan, Y., Griesfeller, J., Grini, A., Guo, C., Ilicak, M., Karset, I. H. H., Landgren, O., Liakka, J., Moseid, K. O., Nummelin, A., Spensberger, C., Tang, H., Zhang, Z., Heinze, C., Iversen, T., and Schulz, M.: Overview of the Norwegian Earth System Model (NorESM2) and key climate response of CMIP6 DECK, historical, and scenario simulations, Geosci. Model Dev., 13, 6165–6200, https://doi.org/10.5194/gmd1361652020, 2020. a, b
Sellar, A. A., Jones, C. G., Mulcahy, J. P., Tang, Y., Yool, A., Wiltshire, A., O'Connor, F. M., Stringer, M., Hill, R., Palmieri, J., Woodward, S., de Mora, L., Kuhlbrodt, T., Rumbold, S. T., Kelley, D. I., Ellis, R., Johnson, C. E., Walton, J., Abraham, N. L., Andrews, M. B., Andrews, T., Archibald, A. T., Berthou, S., Burke, E., Blockley, E., Carslaw, K., Dalvi, M., Edwards, J., Folberth, G. A., Gedney, N., Griffiths, P. T., Harper, A. B., Hendry, M. A., Hewitt, A. J., Johnson, B., Jones, A., Jones, C. D., Keeble, J., Liddicoat, S., Morgenstern, O., Parker, R. J., Predoi, V., Robertson, E., Siahaan, A., Smith, R. S., Swaminathan, R., Woodhouse, M. T., Zeng, G., and Zerroukat, M.: UKESM1: Description and Evaluation of the U.K. Earth System Model, J. Adv. Model. Earth Sy., 11, 4513–4558, https://doi.org/10.1029/2019MS001739, 2019. a
Shen, C., Panda, S., and Vogelstein, J.: The ChiSquare Test of Distance Correlation, J. Comput. Graph. Stat., 31, 254–262, https://doi.org/10.1080/10618600.2021.1938585, 2022. a
Simpson, I. R., Bacmeister, J., Neale, R. B., Hannay, C., Gettelman, A., Garcia, R. R., Lauritzen, P. H., Marsh, D. R., Mills, M. J., Medeiros, B., and Richter, J. B.: An Evaluation of the Large‐Scale Atmospheric Circulation and Its Variability in CESM2 and Other CMIP Models, J. Geophys. Res.Atmos., 125, e2020JD032835, https://doi.org/10.1029/2020JD032835, 2020. a
Slivinski, L. C., Compo, G. P., Whitaker, J. S., Sardeshmukh, P. D., Giese, B. S., McColl, C., Allan, R., Yin, X., Vose, R., Titchner, H., Kennedy, J., Spencer, L. J., Ashcroft, L., Brönnimann, S., Brunet, M., Camuffo, D., Cornes, R., Cram, T. A., Crouthamel, R., DomínguezCastro, F., Freeman, J. E., Gergis, J., Hawkins, E., Jones, P. D., Jourdain, S., Kaplan, A., Kubota, H., Le Blancq, F., Lee, T.C., Lorrey, A., Luterbacher, J., Maugeri, M., Mock, C. J., Moore, G. W. K., Przybylak, R., Pudmenzky, C., Reason, C. nd Slonosky, V. C., Smith, C. A., Tinz, B., Trewin, B., Valente, M. A., Wang, X. L., Wilkinson, C., Wood, K., and Wyszyński, P.: Towards a more reliable historical reanalysis: Improvements for version 3 of the Twentieth Century Reanalysis system, Q. J. Roy. Meteor. Soc., 145, 2876–2908, https://doi.org/10.1002/qj.3598, 2019. a, b
Spensberger, C., Reeder, M. J., Spengler, T., and Patterson, M.: The Connection between the Southern Annular Mode and a FeatureBased Perspective on Southern Hemisphere Midlatitude Winter Variability, J. Climate, 33, 115–129, https://doi.org/10.1175/JCLID190224.1, 2020. a
Steinhäuser, K. and Tsonis, A.: A climate model intercomparison at the dynamics level, Clim. Dynam., 42, 1665–1670, https://doi.org/10.1007/s0038201317615, 2014. a, b
Steinhäuser, K., Chawla, N. V., and Ganguly, A. R.: An Exploration of Climate Data Using Complex Networks, in: Proceedings of the Third International Workshop on Knowledge Discovery from Sensor Data, SensorKDD'09, 23–31, Association for Computing Machinery, Paris, https://doi.org/10.1145/1601966.1601973, 2009. a
Steinhäuser, K., Ganguly, A. R., and Chawla, N. V.: Multivariate and multiscale dependence in the global climate system revealed through complex networks, Clim. Dynam., 39, 889–895, https://doi.org/10.1007/s0038201111359, 2012. a
Streitberg, B.: Lancaster Interactions Revisited, Ann. Stat., 18, 1878–1885, https://doi.org/10.1214/aos/1176347885, 1990. a, b
Swart, N. C., Cole, J. N. S., Kharin, V. V., Lazare, M., Scinocca, J. F., Gillett, N. P., Anstey, J., Arora, V., Christian, J. R., Hanna, S., Jiao, Y., Lee, W. G., Majaess, F., Saenko, O. A., Seiler, C., Seinen, C., Shao, A., Sigmond, M., Solheim, L., von Salzen, K., Yang, D., and Winter, B.: The Canadian Earth System Model version 5 (CanESM5.0.3), Geosci. Model Dev., 12, 4823–4873, https://doi.org/10.5194/gmd1248232019, 2019. a
Székely, G. J. and Rizzo, M. L.: Partial distance correlation with methods for dissimilarities, Ann. Stat., 42, 2382–2412, https://doi.org/10.1214/14AOS1255, 2014. a
Székely, G. J., Rizzo, M. L., and Bakirov, N. K.: Measuring and Testing Dependence by Correlation of Distances, Ann. Stat., 35, 2769–2794, https://doi.org/10.1214/009053607000000505, 2007. a
Tantet, A. and Dijkstra, H. A.: An interaction network perspective on the relation between patterns of sea surface temperature variability and global mean surface temperature, Earth Syst. Dynam., 5, 1–14, https://doi.org/10.5194/esd512014, 2014. a
Tatebe, H., Ogura, T., Nitta, T., Komuro, Y., Ogochi, K., Takemura, T., Sudo, K., Sekiguchi, M., Abe, M., Saito, F., Chikira, M., Watanabe, S., Mori, M., Hirota, N., Kawatani, Y., Mochizuki, T., Yoshimura, K., Takata, K., O'ishi, R., Yamazaki, D., Suzuki, T., Kurogi, M., Kataoka, T., Watanabe, M., and Kimoto, M.: Description and basic evaluation of simulated mean state, internal variability, and climate sensitivity in MIROC6, Geosci. Model Dev., 12, 2727–2765, https://doi.org/10.5194/gmd1227272019, 2019. a
Tsonis, A. A., Swanson, K. L., and Wang, G.: On the Role of Atmospheric Teleconnections in Climate, J. Climate, 21, 2990–3001, https://doi.org/10.1175/2007JCLI1907.1, 2008. a
Tsonis, A. A., Wang, G., Swanson, K. L., Rodrigues, F. A., and da Fontura Costa, L.: Community structure and dynamics in climate networks, Clim. Dynam., 37, 933–940, https://doi.org/10.1007/s0038201008743, 2011. a
VázquezPatiño, A., Campozano, L., Mendoza, D., and Samaniego, E.: A causal flow approach for the evaluation of global climate models, Int. J. Climatol., 40, 4497–4517, https://doi.org/10.1002/joc.6470, 2019. a
Voldoire, A., Saint‐Martin, D., Sénési, S., Decharme, B., Alias, A., Chevallier, M., Colin, J., Guérémy, J., Michou, M., Moine, M., Nabat, P., Roehrig, R., Salas y Mélia, D., Séférian, R., Valcke, S., Beau, I., Belamari, S., Berthet, S., Cassou, C., Cattiaux, J., Deshayes, J., Douville, H., Ethé, C., Franchistéguy, L., Geoffroy, O., Lévy, C., Madec, G., Meurdesoif, Y., Msadek, R., Ribes, A., Sanchez‐Gomez, E., Terray, L., and Waldman, R.: Evaluation of CMIP6 DECK Experiments With CNRM‐CM61, J. Adv. Model. Earth Sy., 11, 2177–2213, https://doi.org/10.1029/2019MS001683, 2019. a
Wang, B. and An, S.I.: A method for detecting seasondependent modes of climate variability: SEOF analysis, Geophys. Res. Lett., 32, L15710, https://doi.org/10.1029/2005GL022709, 2005. a
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.: Image quality assessment: From error visibility to structural similarity, IEEE T. Image Process., 13, 600–612, https://doi.org/10.1109/TIP.2003.819861, 2004. a
Wiedermann, M., Donges, J. F., Handorf, D., Kurths, J., and Donner, R. V.: Hierarchical structures in Northern Hemispheric extratropical winter ocean–atmosphere interactions, Int. J. Climatol., 37, 3821–3836, https://doi.org/10.1002/joc.4956, 2017. a
Wu, T., Lu, Y., Fang, Y., Xin, X., Li, L., Li, W., Jie, W., Zhang, J., Liu, Y., Zhang, L., Zhang, F., Zhang, Y., Wu, F., Li, J., Chu, M., Wang, Z., Shi, X., Liu, X., Wei, M., Huang, A., Zhang, Y., and Liu, X.: The Beijing Climate Center Climate System Model (BCCCSM): the main progress from CMIP5 to CMIP6, Geosci. Model Dev., 12, 1573–1600, https://doi.org/10.5194/gmd1215732019, 2019. a
Yang, P., Wang, G., Xiao, Z., Tsonis, A. A., Feng, G., Liu, S., and Zhou, X.: Climate: a dynamical system with mismatched space and time domains, Clim. Dynam., 56, 3305–3311, https://doi.org/10.1007/s00382021056467, 2021. a
Yeo, S.R., Yeh, S.W., Kim, K.Y., and Kim, W.M.: The role of lowfrequency variation in the manifestation of warming trend and ENSO amplitude, Clim. Dynam., 49, 1197–1213, https://doi.org/10.1007/s0038201633760, 2017. a
Yukimoto, S., Kawai, H., Koshiro, T., Oshima, N., Yoshida, K., Urakawa, S., Tsujino, H., Deushi, M., Tanaka, T., Hosaka, M., Yabu, S., Yoshimura, H., Shindo, E., Mizuta, R., Obata, A., Adachi, Y., and Ishii, M.: The Meteorological Research Institute Earth System Model Version 2.0, MRIESM2.0: Description and Basic Evaluation of the Physical Component, J. Meteorol. Soc. Jpn., 97, 931–965, https://doi.org/10.2151/jmsj.2019051, 2019. a
Zhang, M.Z., Xu, Z., Han, Y., and Guo, W.: An improved multivariable integrated evaluation method and tool (MVIETool) v1.0 for multimodel intercomparison, Geosci. Model Dev., 14, 3079–3094, https://doi.org/10.5194/gmd1430792021, 2021. a
Zhu, X., Dong, W., Wei, Z., Guo, Y., Gao, X., Wen, X., Yang, S., Zheng, Z., Yan, D., Zhu, Y., and Chen, J.: Multidecadal evolution characteristics of global surface temperature anomaly data shown by observation and CMIP5 models, Int. J. Climatol., 38, 1533–1542, https://doi.org/10.1002/joc.5264, 2018. a
Ziehn, T., Chamberlain, M. A., Law, R. M., Lenton, A., Bodman, R. W., Dix, M., Stevens, L., Wang, Y.P., and Srbinovsky, J.: The Australian Earth System Model: ACCESSESM1.5, J. South. Hemisphere Earth Syst. Sci., 70, 193–214, https://doi.org/10.1071/ES19035, 2020. a