A climate network perspective of the intertropical convergence zone

The intertropical convergence zone (ITCZ) is an important component of the tropical rain belt. Climate models continue to struggle to adequately represent the ITCZ and differ substantially in its simulated response to climate change. Here we employ complex network approaches, which extract spatio-temporal variability patterns from climate data, to better understand differences in the dynamics of the ITCZ in state-of-the-art global circulation models (GCMs). For this purpose, we study simulations with 14 GCMs in an idealized slab-ocean aquaplanet setup from TRACMIP – the Tropical Rain belts 5 with an Annual cycle and a Continent Model Intercomparison Project. We construct network representations based on the spatial correlation pattern of monthly surface temperature anomalies and study the zonal mean patterns of different topological and spatial network characteristics. Specifically, we cluster the GCMs by means of their zonal network measure distribution utilizing hierarchical clustering. We find that in the control simulation, the zonal network measure distribution is able to pick up model differences in the tropical SST contrast, the ITCZ position and the strength of the Southern Hemisphere Hadley cell. 10 Although we do not find evidence for consistent modifications in the network structure tracing the response of the ITCZ to global warming in the considered model ensemble, our analysis demonstrates that coherent variations of the global SST field are linked with ITCZ dynamics. This suggests that climate networks can provide a new perspective on ITCZ dynamics and model differences therein.

and a self-consistent classification of different flavors of El Niño and La Niña based on their corresponding global imprints (Radebach et al., 2013;Wiedermann et al., 2016). Other successful applications include the development of early warning indicators for a possible collapse of the Atlantic Meridional Overturning Circulation based on ocean temperature correlations (van der Mheen et al., 2013), uncovering of key spatiotemporal patterns associated 60 with heavy precipitation formation in different monsoon regions (Malik et al., 2012;Boers et al., 2013Boers et al., , 2014Stolbova et al., 2014), improved forecasting of the Indian summer monsoon onset and rainfall amount (Stolbova et al., 2016;Fan et al., 2020), and identification of teleconnection pathways (Zhou et al., 2015;Boers et al., 2019). In all these examples, spatio-temporal climate data have been transformed into a network representation based on different concepts of statistical association used for identifying mutually dependent climate time series. 65 To better understand the effect of spatio-temporal SST patterns on ITCZ dynamics, we present the first application of functional climate network analysis to idealized aquaplanet simulations from the TRACMIP model ensemble (Voigt et al., 2016) that is freely available via the Earth System Grid Federation as well as the Pangeo project. The simulation setup and the corresponding data are briefly introduced in Sect. 2, which also details our analysis methodology that combines functional climate network analysis with a hierarchical clustering method to classify the TRACMIP models according to their climate 70 network topology. We focus on two questions. First, to what extent is the network-based classification related to the models' ITCZ position in the control simulation? And second, to what extent is the response of the ITCZ to quadrupled atmospheric carbon dioxide related to changes in the climate network? The obtained results are presented in Sect. 3. The paper closes with a discussion and conclusion in Sect. 4. TRACMIP was designed to study fundamental aspects of the ITCZ and its response to climate change, e.g., the link of the ITCZ with SST and cross-equatorial atmospheric energy transport (Biasutti and Voigt, 2020). TRACMIP can also be used in 80 a much broader sense, including studies of phenomena such as the Arctic amplification (Russotto and Biasutti, 2020). The TRACMIP protocol has been described in detail in Voigt et al. (2016), which includes references for the participating models.
The most salient features of TRACMIP compared to other aquaplanet studies is the use of a slab ocean with a present-daylike ocean heat transport and seasonally-varying insolation. In TRACMIP, sea surface temperatures are thus interactive and the surface energy balance is closed, the ITCZ migrates north and south during the year, and the ITCZ is located in the Northern 85 Hemisphere in the zonal-mean and time-average, consistent with the present-day climate.
In this work, we use aquaplanet simulations performed by all 14 models contributing to the intercomparison project. The AquaControl simulation is run with a present-day like CO 2 concentration of 348 ppmv. In the Aqua4xCO2 simulation, CO 2 is quadrupled to 1392 ppmv, leading to a model-dependent increase of global-mean surface temperature by 3-10 K, and changes in the ITCZ location that range from a slight southward shift to strong northward shifts by up to 8 • in latitude. Following Biasutti 90 and Voigt (2020), the ITCZ position is calculated by the precipitation centroid as defined in Adam et al. (2016) over latitudes within 20 • N/S. This locates the ITCZ somewhat closer to the equator compared to the values diagnosed in Voigt et al. (2016), but the results of our analysis are insensitive to the definition of the ITCZ position. Figure 1 illustrates the precipitation and sea surface temperature pattern of the Aquacontrol simulations. This figure also demonstrates the tight correlation of the ITCZ position with the tropical SST contrast (correlation coefficient > 0.99). Again following Biasutti and Voigt (2020), the latter 95 is defined as the SST difference between the Northern and Southern Hemisphere tropical means (Eq.-25 • N and 25 • S-Eq., respectively). Figure 1 confirms that SST is a prime control on the ITCZ, a fact that has been exploited in many previous studies. However, in contrast to previous work that used the time-average SST, we here apply functional climate network representations to study the relation between the ITCZ and the internal variability of the SST field. An illustration of the temporal SST variability is 100 shown in Fig. 2 for the AquaControl simulation. The SST variability is minimal near the equator, is relatively constant in the subtropics and midlatitudes, and drops off near the poles. This pattern is broadly in line with the variability in the sum of surface turbulent fluxes (sensible and latent heat flux) and surface winds (not shown).
The TRACMIP model output is provided on regular latitude-longitude grids that have a model-dependent horizontal resolution ranging from 1 to 3 • in latitude and longitude. For the AquaControl simulations, we restrict our analysis to the last 105 30 years and for the Aqua4xCO2 simulations to the last 25 years to ensure that the models are in statistical equilibrium. As described below, we focus on monthly-mean SST fields, from which we construct functional climate networks and study their relation to the time-average ITCZ position and tropical SST contrast.

Functional climate networks
Functional climate networks constitute an application of complex network theory to understand functional relations in the ties. In climate networks nodes are identified with geographical locations at which climate variability information is available in the form of time series (in our case, the grid points of climate model outputs), and links connect pairs of nodes whose climate time series exhibit a certain level of statistical association. In this work, we analyze the set of SST time series on the 115 longitude-latitude grid separately for each model and simulation run using the methodological setup described in the following.
First, we calculate the linear Pearson correlation coefficient for the monthly SST anomalies of all pairs of grid points. The time series of the SST anomalies are computed by subtracting the monthly-mean climatology from the original SST time series.
In principle, we could utilize any statistical association measure in this step. We chose the Pearson correlation here as it has been successfully applied in previous studies (Donges et al., 2009;Radebach et al., 2013;Wiedermann et al., 2016) and is 120 suitable for relatively short time series (here, 300 to 360 monthly anomaly values). For a model output with N grid points, this leads to a correlation matrix S of dimension N × N .
Second, we threshold the correlation matrix and transform it into a binary matrix A. A is the adjacency matrix of the associated climate network representation of the underlying data set and has the same dimension as S. If the correlation value s ij between two grid point time series (nodes) i and j is larger than the prescribed threshold, we set the corresponding 125 matrix element of A to a ij = 1 (i.e., there is a link between nodes i and j), and to a ij = 0 otherwise (no link). We choose the correlation threshold in such a way as to obtain a link density of ρ = 0.005. This implies that the network only includes the strongest 0.5% of all possible N (N − 1)/2 undirected links (when excluding self-correlations, i.e., a ii = 0). For global networks with a comparable resolution, this order of magnitude has been a common choice in previous studies (Radebach et al., 2013;Donner et al., 2017). It has to be noted, however, that changes in the link density of a climate network have been 130 previously shown to potentially result in qualitatively different behaviors of the resulting network characteristics (Radebach et al., 2013;Wiedermann et al., 2017) (depending on the specific network property under study). For this reason, we keep the link density fixed for all analyses performed in this work. However, since the individual models differ in their spatial grid resolution, the total number of links will differ between models, with a lower number of links in case of coarser grids. This has direct consequences for the quantitative analysis of the resulting networks. In the following, we will choose our analysis setup such that the effect of a different number of links on the final results is minimized.
Third, based on the adjacency matrix we calculate the values of two network characteristics: the degree and the average link distance. The degree k i encodes the number of links of each node i, where the index i corresponds to a specific latitude-longitude grid point. As our climate networks are spatially embedded on 140 regular spherical grids, the spatial density of nodes on a spherical surface is not homogeneous and increases towards the poles.
To make all nodes equally representative, we utilize the concept of the so-called node splitting invariance (n.s.i., Heitzig et al. (2012)) and define the accordingly node-weighted n.s.i.-degree as Here, w i is the n.s.i. node weight, which in our case is given by the cosine of the latitude. For simplicity, in the remainder 145 of this work, we will briefly refer to the n.s.i. degree (sometimes also termed area-weighted connectivity (Tsonis et al., 2006;Tsonis and Swanson, 2008)) as the degree. Moreover, instead of discussing the full spatial pattern of degree on the whole sphere, we will focus on the zonal-mean degree. The meaningful calculation and interpretation of this property are ensured by the zonally-uniform boundary conditions of the aquaplanet setup. To compare the zonal-mean degree pattern between models despite their differences in grid resolution, we normalize the zonal-mean degree such that its sum over all latitudes is 1.

150
The information provided by the purely topological measure of node degree is complemented by the average link distance, which is a spatial (geometric) characteristic. The average link distance is the mean great circle distance between a given node and all its connected nodes (i.e., the mean spatial length of all links of a given node). For simplicity, the average link distance is given here in units of radians; physical distances could be easily obtained by scaling the values with the Earth's radius.
We emphasize that we have also studied a suite of additional local (such as the local clustering coefficient) as well as global 155 network properties (such as the network transitivity) (Radebach et al., 2013;Donner et al., 2017). These measures have not provided additional insights and are therefore not reported in the following.

Hierarchical Clustering
After having constructed the network for each individual model and simulation, we study to what extent the models can be grouped according to their resulting zonal-mean network measure patterns. For this purpose, we employ a hierarchical cluster 160 analysis.
As a first step, we resample the zonal-mean network measure distribution to the lowest latitudinal resolution of all models (CALTECH, 2.8°: ::: 2.8 • ) utilizing linear interpolation. Then, we calculate the Pearson correlation coefficient between (selected parts of) the values of the resampled zonal-mean network measures as a function of the latitude between all pairs of models.
Since the Pearson correlation is invariant under rescaling and possible additive terms, offsets between the individual models' with C new being the rescaled inter-model similarity matrix that is then used as an input for the hierarchical cluster analysis.
Note that min C = min ij c ij and max C = max ij c ij are the minimal and maximal values among all inter-model correlations 175 and that min C can hence be negative.
Finally, we cluster models by means of the hierarchical cluster analysis as implemented in the python package scipy, where we use the single linkage method for successively combining groups of models according to the highest pairwise correlation among the respectively included models. This methodological choice ensures that the most similar models are grouped into the same cluster, yet allows for the possibility to produce outliers. With the number of considered models being quite low, 180 we can at every iteration of the algorithm identify such outliers and explain this by the properties of the employed cluster analysis method while ensuring that the most similar models indeed end up in the same cluster. Specifically, based on the models' mutual similarity, our hierarchical cluster analysis method iteratively identifies pairs of models that are successively merged into clusters until all models belong to a single cluster. The order of this clustering is depicted in a dendrogram. In the visual representation used in this work, the resulting dendrograms are meant to be read from left (where each individual model 185 constitutes a single cluster) to right (all models are combined in one cluster). Vertical lines indicate a merge of two models or clusters of models, while the horizontal lines represent the increasing cophenetic distance between the clusters (capturing the degree of similarity between the two groups based on the rescaled inter-model similarity matrix). Cutting the dendrogram at a certain level of similarity, i.e., cophenetic distance, leads to certain clusters of models that will then be used in the remainder of this study. 190 We are aware that there are alternative ways both to quantify the similarity between different models based on the values of the different network measures and to cluster the models. Our methodological choices reflect the need to account for differences in the grid resolution of the 14 global climate models, and our preference for a simple and intuitive analysis setup.

Robustness tests
We acknowledge that our analysis setup is based on several specific choices of methodological parameters or variants. To

Results
In the following, we present the results of our functional network analysis. We first make use of the AquaControl simulations and start by investigating climate networks constructed from the complete (global) SST field that include both, tropical and extratropical nodes (Sect. 3.1). Subsequently, we study networks for which some of the regional connections were excluded

AquaControl simulations: Global network properties
The global network analysis is shown in Fig. 3 and separates the models into four clusters (left panel), with cluster 1 and 2 each consisting of 6 models, and cluster 3 and cluster 4 each containing only a single model. The zonal-mean network measures that underlie this clustering are shown in Fig. 4 and discussed in more detail below. Importantly, Fig. 3 demonstrates where both models fall into the same multi-model cluster are colored in red (cluster 1) and blue (cluster2), respectively. The difference of the ITCZ position is marked by circles, the difference in the SST contrast is marked by "x". the hierarchical clustering is indeed climatologically meaningful. Despite some inter-model variability in each cluster, clusters 1 and 2 exhibit systematic differences. For cluster 1 (Fig. 4, first row), all models show a coherent degree and link distance minimum around the position of the ITCZ and a marked peak of both measures related to a strong Southern Hemisphere Hadley cell. This finding has already been reported by Wolf et al. (2019) and reflects the fact that the models of cluster 1 have the strongest Southern Hemisphere Hadley cell across the model ensemble. For cluster 2, the zonal-mean average link 235 distance exhibits a broad maximum around the ITCZ position instead of the minimum found for cluster 1 (Fig. 4, second row).
Moreover, the zonal-mean network measures of cluster 2 are more symmetric with respect to the equator than for the models in cluster 1. The comparison between clusters 1 and 2 reveals that a more northward ITCZ position tends to be associated with less symmetric network properties and minima in degree and link distance near the ITCZ, whereas an ITCZ closer to the equator is accompanied by a more symmetric pattern of zonal-mean network characteristics with only small meridional 240 contrasts in the tropical degree and a near-equatorial maximum in the average link distance.
As opposed to the aforementioned two groups of models, the networks derived from the CALTECH and AM2.1 models, which form cluster 3 and cluster 4, respectively, show zonal-mean network measures that resemble extreme versions of cluster 1 (for AM2.1) and cluster 2 (for CALTECH). CALTECH (cluster 3) has the ITCZ closest to the equator among all models, and its network characteristics exhibit near-perfect symmetry with respect to the equator. By contrast, for AM2.1 (cluster 4) the 245 ITCZ is shifted far into the Northern Hemisphere and its zonal-mean network characteristics exhibit hardly any symmetry with respect to the equator, yet marked minima of degree and average link distance at about the latitudinal position of the ITCZ. We further demonstrate the success of the functional climate network analysis along with the hierarchical cluster analysis in the following manner. For all pairs of models, we plot the inter-model difference in the ITCZ position and tropical SST contrast as a function of the similarity of the models' zonal-mean network characteristics. The latter is measured by the elements of 250 the original (non-rescaled) inter-model similarity matrix c ij , which represent the combined similarity of the patterns of the two considered zonal-mean network characteristics (see Sect. 2.3). Here, we note again that c ij is bounded to the interval [−2, 2] because the values of the Pearson correlation coefficients for zonal-mean degree and zonal-mean average link distance are both restricted to [−1, 1]. Figure 5 shows the resulting scatter plots of the pairwise similarity coefficient versus the ITCZ position and tropical SST contrast, respectively.

255
Despite the generally large scatter, there is a clear tendency towards smaller differences in the ITCZ position and SST contrast for models whose climate networks are more similar (Fig. 5). Put differently, this underlines that for models that are identified as more similar by the network and cluster analysis (larger correlation score on the x-axis), the pairwise difference in the ITCZ position and SST contrast tends to be smaller than for models with less similar network characteristics. In addition, all combinations of model pairs that belong to cluster 1 or cluster 2 (recall that cluster 3 and cluster 4 only include only one 260 model each) are colored in red and blue, respectively. This shows that indeed the clustering only groups models together that have similar network characteristics.
3.2 AquaControl simulations: Networks with stepwise exclusion of regional connections The global network analysis in the previous subsection included both tropical and extratropical connections. To disentangle the relative importance of tropical and extratropical connections, we repeat the analysis but successively exclude different classes 285 of links by setting the corresponding elements of the adjacency matrix to zero. Notably, this strategy removes links from the network whereas the zonal network measures are attributes of nodes and each link contributes to the degree and average link distance of two different nodes. Hence, we still retain complete zonal-mean characteristics of networks with a specific subset of links. The analysis of the residual network structures thereby allows us to quantify the importance of, e.g., connections within the tropics or between the tropics and the extratropics.   Hadley cell. The network measures for models in cluster 2 (second row) exhibit rather heterogeneous distributions, consistent with the relatively large model spread in SST contrast and ITCZ position in that cluster. Finally, the network properties of the models in cluster 4 (bottom row) feature some level of symmetry with respect to the equator, although their distributions differ substantially and their cophenetic distance in the dendrogram is large (Fig. 6 left). In line with the previous results, we notice 320 that a decrease in the level of symmetry in the network measures is associated with an ITCZ that is more strongly shifted into the Northern Hemisphere. This is expected as for the ITCZ to be shifted away from the equator, there needs to be some level of hemispheric asymmetry in the SST pattern.

Aqua4xCO2 -AquaControl simulations: Climate response to quadrupling carbon dioxide
In the previous two subsections, we have demonstrated that functional climate network analysis of monthly SST anomalies can identify model differences in the time-average SST contrast and ITCZ position based on the AquaControl simulations. In 335 the following, we study if the analysis can also help to understand model differences in the ITCZ response to global warming triggered by an increase in atmospheric carbon dioxide content in the Aqua4xCO2 simulations. Across the model ensemble, the ITCZ response varies from a slight southward shift by less than 1 • in latitude to a strong northward shift by up to 8 • .
We have constructed complex network representations based on the correlation pattern of monthly SST anomalies in the control simulation (AquaControl) and a global-warming simulation (Aqua4xCO2). We found that the zonal-mean node degree and average link distance of functional climate networks can separate models in terms of their ITCZ position, SST contrast 370 and Hadley cell strength in the control simulation. This separation also holds when extratropical-extratropical connections are excluded, but breaks down when further connections are excluded. This shows that the network approach correctly identifies that extratropical-tropical connections are important in setting the tropical climate (Kang, 2020). The network analysis is also consistent with known mechanisms such as a strong correlation between Hadley cell strength and ITCZ position. Overall, the climate network analysis indicates that the time-mean ITCZ is connected to spatiotemporal variability of the monthly SST, a 375 finding that is not obvious from previous work. However, we also note that the network analysis was unable to separate model differences in the ITCZ response to global warming.
Because the aquaplanet setup is zonally symmetric in a statistical sense, we have restricted our analysis to zonal mean network properties. This simplification implied the loss of local (single-node) information, which prohibited us to identify possible local connectivity structures and fully exploit the complete distribution of links in the network. For example, we have 380 not analyzed if and how individual nodes are connected in the meridional and zonal direction. Future work could look at this aspect in more detail to reveal the role of, e.g., zonally-propagating tropical waves associated with large-scale patterns of SST, wind and atmospheric energy transport. This could also involve other network characteristics like betweenness centrality (Donges et al., 2009) or edge directionality (Wolf et al., 2019), and would complement the traditional empirical orthogonal function (EOF) analysis or more sophisticated pattern recognition approaches such as self-organizing maps (SOM).

395
Code and data availability. The python package pyunicorn (Donges et al., 2015a) used for the functional climate network generation and analysis is freely available at https://github.com/pik-copan/pyunicorn. The TRACMIP data sets are available in cmorized format via the Earth System Grid Federation that also holds CMIP data.
Author contributions. All authors contributed to the design of the study. FW implemented the network analysis and analyzed the data. All authors analyzed the results and wrote the manuscript. All authors read and approved the final manuscript.