Climate as a complex, self-regulating system

5 This paper explores whether climate is complicated or complex by examining the performance of a heat engine in the tropical Pacific, the Pacific Ocean heat engine, which is linked to a teleconnected network of circulation and oscillations. Sustained radiative forcing is widely expected to produce gradual change but instead produces step-wise regime shifts. The engine is a heat pump with cold-to-hot circulation maintained by kinetic energy produced by the Coriolis Effect. It is a fundamental response of a coupled ocean-atmosphere system to asymmetric circulation. This paper surveys emergent 10 behaviours in climate models linked to such shifts. It explores how well models represent the heat engine, compares regime changes in models and observations, and examines how geostrophic controls on meridional heat transport set critical boundary conditions. The results reinforce the description of climate as a self-regulating system governed by the principle of least action. Teleconnected steady-state regimes are physically-induced by the need to maintain boundary-limited dissipation rates between the hemispheres, the equator and the poles. A sufficient imbalance of energy at the planetary surface produces 15 regime shifts that switch between slow and fast dissipation pathways. The strength of coupling measured via heat engine characteristics is weaker in models than in the observed climate, failing to distinguish clearly between free and forced modes. The capacity of the coupled ocean-atmosphere system to maintain homeostasis allows Earth’s climate to be classified physically rather than statistically, the basic unit of climate being the steady-state regime.


Introduction 20
Is the climate system complicated or complex, and does it matter? In 1999, Rind asked a similar question, relating it mainly to long-range (decadal-scale) forecasts. He answered yes, climate is complex, and as to whether it matters, depends how important nonlinearities are.
In everyday language complicated and complex are synonymous, but when used to classify systems they differ. A complicated system is symmetrical and can be deconstructed and reconstructed from its constituent parts. Anderson (1972) 25 described this as the constructionist hypothesis. The current methods used to understand and analyse climate, such as the hierarchy of simple one-dimensional to complicated four-dimensional earth system models (Schneider and Dickinson, 1974;Petersen, 2000;Held, 2005) or separating the different processes involved in the recent climate 'hiatus' to understand its cause(es) (Yao et al., 2016;Medhaug et al., 2017) follow this hypothesis. Complex systems have emergent behaviours that cannot be predicted from the sum of their parts (Nicolis and Nicolis, 30 2012;Broad, 2014;O'Connor and Wong, 2019) so are asymmetrical. Large, complex systems contain a hierarchical multilevel structure with different sets of scale-dependent rules (e.g., different trophic levels in an ecosystem). These cannot be understood through traditional methods (Anderson, 1972), conforming to what might be called an emergentist hypothesis (Clayton and Davies, 2011;O'Connor, 2020).
If climate change is complicated, the current simple-to-complex model hierarchy based on the scalability of radiative forcing 35 is appropriate. However, if important emergent behaviours are being overlooked, then the change process should be treated as complex. For example, rather than being the slowing of a trend by a combination of reduced forcing and decadal climate variability, the recent hiatus is a non-equilibrium steady state of unusual but unexceptional length bookended by externallyforced regime shifts. In this case, we can answer, yes and yes to Rind's (1999) question.
In a companion paper (JR21; Jones and Ricketts, 2021b), we describe a heat engine situated in the tropical Pacific and its 40 relationship with the broader climate network. This heat engine displays behaviours typical of complex systems. Features include: • A heat engine in the tropical Pacific consisting of the eastern-central Pacific (TEP) and western Pacific warm pool (TWP) forms a network with climatic oscillations including the Pacific Decadal Oscillation (PDO) and Atlantic Multidecadal Oscillation (AMO). 45 • The cold-to-warm structure of the heat engine describes a heat pump sustained by kinetic energy capable of shifting warmer or cooler depending on the direction of forcing.
• The heat engine and broader climate network maintains steady-state regimes with the shallow ocean absorbing all the additional heat from radiative forcing not taken up by irreversible processes.
• Heat stored in the ocean is released as part of a regime shift when an energy imbalance destabilises the existing 50 dissipation process. Regime shifts occur in response to a combination of internal and external forcing.
• Historically, climate has occupied two modes: o a free mode from the pre-industrial era to the late 1950s and 1960s, characterised by circulation-dominated dissipation and a relatively weak ENSO. After a cooling episode in the early 1900s the ocean absorbed excess heat through to the late 1930s, most regime shifts appear to be generated from the NH extratropics, 55 warming the tropics and SH. o a forced mode from the late 1960s. Dissipation rates accelerate and the heat engine works harder, generating white noise in climate data, the climate network becomes more teleconnected switching from slow to fast modes, regime shifts are generated in the tropics, most mediated by the heat engine and propagated poleward. 60 • During regime shifts in forced mode, mean sea surface temperatures shift upward in the tropics, followed by the extratropics and land with El Niño being the major distributor of heat.
• Rather than a complicated arrangement of signal and noise, climate change is a complex system process best understood as enhanced climate variability.
This paper builds on this by addressing this behaviour within a complex systems context. Section 2 sets the scene with a 65 brief description of standard theory and practice, the characteristics of complex systems as they apply to climate and the methods used in the paper. Section 3 discusses complex behaviour in climate models. It begins with a description of how model underdetermination arises when simulating complex system behaviour. This is followed by examples of emergence due to coupling, including regimes shifts, ENSO and the self-organisation of circulatory features, and ended with a list of the factors that inhibit gradual warming in the coupled system. The final part of Section 3 describes how the heat engine is 70 represented in an ensemble of coupled models from the CMIP5 archive. Section 4 addresses boundary conditions that constrain complex system behaviour then presents a conceptual framework for the Pacific Ocean heat engine and selfregulating climate network. The paper concludes with a brief summary and some wider implications of the findings.

Standard theory and practice 75
Standard practice treats natural climate variability and anthropogenic forcing as separate processes (Solomon et al., 2011;Deser et al., 2012;Hawkins and Sutton, 2012;Kirtman et al., 2013;Lehner et al., 2020). This is encapsulated in the signal-to-noise model of trend-like change upon which variability is superimposed. This model dominates the statistical analysis of observational and model data, and detection and attribution methods; it shapes how future projections are developed and how climate-related risk is characterised. 80 In its Fifth Assessment Report, the IPCC endorsed linear, trend-like warming with medium confidence, stating that "On decadal to interdecadal timescales and under continually increasing effective radiative forcing (ERF), the forced component of the GMST (global mean surface temperature) trend responds to the ERF trend relatively rapidly and almost linearly ." Two areas that combine theory and models are used to support the standard approach. The first is radiative transfer theory 85 applied within radiative-convective models (Manabe and Wetherald, 1967;Ramanathan and Coakley, 1978;Manabe, 2019).
Additional heat from the ocean is needed to close the energy balance relationship (Raper et al., 2001;Hansen et al., 2005;Meehl et al., 2005;Trenberth et al., 2016). Such models have been very successful in explaining the deterministic response to radiative forcing, so are still supported in their simple one-dimensional form as more complex models have been developed (Anderson et al., 2016;Forster, 2017;Manabe, 2019). 90 The other area of theory focuses on the thermodynamic aspect of climate and the role of nonlinear behaviour. The two main theoretical approaches addressing the relationship between forced change and internal climate variability are the linear https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. stochastic approach (Leith, 1975;Hasselmann, 1976) and deterministic nonlinear dynamics, often based on Lorenz attractors (Lorenz, 1963(Lorenz, , 1968(Lorenz, , 1975.
The linear stochastic approach utilises the fluctuation dissipation theorem, which assumes a system will produce a response 95 proportional to a small perturbation as it moves towards equilibrium (Ghil, 2014;Franzke et al., 2015;Lucarini et al., 2017;Christensen and Berner, 2019;Ghil, 2019). A system's movement towards equilibrium is the same whether a small external force, or an internal, random fluctuation is applied (Ghil and Lucarini, 2020). The stochastic element considered to be produced by dynamic interactions in weather and climate inducing nonlinear responses over expanding timescales (Hasselmann, 1976). These are superimposed on each other to form the standard signal-to-noise statistical model but aim to 100 produce the probability distribution of variability.
The linear stochastic response as laid out by Hasselmann (1976) combines fast and slower elements; e.g., where white noise from the atmosphere perturbs the slower ocean producing a red noise random walk. The focus is on negative feedbacks within the system acting as a constraint and its initial aim was to produce a continuous and realistic spectrum of climatic variability. Although the number of methods for generating variability have since expanded (Chekroun et al., 2011;Lucarini, 105 2012;Franzke et al., 2015;Franzke and O'Kane, 2017), the basic idea of generating stochastic behaviour superimposed on gradual change persists. Ghil and Lucarini (2020) combine observations, modelling, dynamical systems theory and statistical physics in an extensive review that aims to integrate the climate science and climate physics research streams. These are largely separate areas of scientific activity. Ghil and Lucarini (2020) present a unified approach based on the fluctuation dissipation theorem that combines with a range of approaches that include nonlinear and chaotic phenomena drawing from 110 dynamical systems and statistical physics.
Alternatives that explicitly represent nonlinear responses at current and near future levels of forcing are less common. A key example, Corti et al. (1999) analysed changes in multi-modal climate attractors developed from winter 500 hP geopotential height over the NH 1949NH -1994 to see whether forcing was being projected onto the principal patterns of natural variability.
They detected two patterns coinciding with changes in temperature suggesting this may be the case. Updated data from the 115 same region shows regime shifts occurring in 1977, 1998 and 2015. These coincide with regime shifts in the PDO, and closely match three of four shifts in NH surface temperature from NCDC v4 data, although 1987 takes precedence over 1977 (p<0.05) in the latter.
Most efforts exploring how attractors may respond nonlinearly to forcing have analysed shorter timescales (e.g., Palmer, 1993;Hannachi et al., 2017). These involve phenomena such as synoptic and seasonal regimes, and interannual 120 oscillations, mapped onto attractors to see how they change over time (Corti et al., 1999;Bartsev et al., 2017;Hannachi et al., 2017). Investigations in this area have been sporadic, largely because of the many possibilities as to the end result (e.g., Bartsev et al., 2017). With a lack of sound empirical evidence, there is little shared understanding as to what a meaningful result would look like. The failure of gradual warming under severe testing, passed by regime shifts in Jones and Ricketts (2017) concentrated the focus on understanding nonlinear responses to forcing on decadal timescales.

Climate approached as a complex dynamic system
The constructionist hypothesis of building the whole from the sum of its parts is widely applied in climate science, described by the simple to complex model hierarchy. However, "the reductionist hypothesis does not by any means imply a "constructionist" one: The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe (Anderson, 1972)". While this restriction could apply to the climate system (e.g., 130 Harrison and Stainforth, 2009;Katzav and Parker, 2018), it has not been fully tested.
For complex systems, Champion et al. (2019) recommend a data-driven approach: "in many modern systems, governing equations are unknown or only partially known, and recourse to first-principles derivations is untenable. Instead, many of these systems have rich time-series data due to emerging sensor and measurement technologies (e.g., in biology and climate science). This has given rise to the new paradigm of data-driven model discovery." Champion et al. (2019) describe the 135 identification of nonlinear dynamics to diagnose parsimonious governing equations from sparse data measured in complex dynamic systems. Such methods can also incorporate partially known physics and constraints. Models diagnosed in this way include those containing Lorenz attractors, a reaction diffusion system and nonlinear pendulum (Champion et al., 2019).
Most of the climate literature that discusses complexity uses the term as a synonym for complicated. For example, the IPCC Working Group I Fifth assessment report mentions complex or complexity 270 times and most are descriptors meaning 140 complicated or describing the degree of model structure and detail. Fewer mentions are of subsystems known to be complex (e.g., soils, chemistry) and the only direct references to system complexity are when climate system and nonlinearity are defined in the glossary (IPCC, 2013).
Explicit reference to complexity is becoming more common, especially in the physics-related climate literature (Lucarini et al., 2014;Franzke et al., 2015;Ghil and Lucarini, 2020). Guidance for model application is also moving beyond the 145 constructionist approach, exploring how to model complex responses such as emergence (Harrison and Stainforth, 2009;Katzav and Parker, 2015;Jeevanjee et al., 2017). Papers on complex system science often use climate as an example (Champion et al., 2019;Toledo-Roy et al., 2019), or apply methods from complex system science to climate (Donner et al., 2009;Boschetti et al., 2011;Tsonis, 2018;Agarwal, 2019) but fewer deal with climate as a fundamentally complex system. Recent examples include a review of power laws and scaling on climate variability (Franzke et al., 2020), bifurcation 150 analysis looking at climate oscillations (Dijkstra, 2019), statistical mechanics of climate .
Work on specific phenomena known to be the produce of complex system behaviour while allowing the deterministic element to dominate is more common. For example, abrupt climate change or tipping points (Ditlevsen and Johnsen, 2010;McNeall et al., 2011;Bathiany et al., 2018;Lenton et al., 2019), extreme events (Boers et al., 2019) and on the impact of climate on social and ecological systems are more common (Lenton and Williams, 2013;Moore, 2018;Sterner et al., 2019). 155 The deterministic relationship resulting in a gradual response to forcing is generally accepted to the point where higher rates of forcing could potentially precipitate an abrupt change. Self-organisation or self-regulation is accepted in the biosphere which includes climate (Lovelock, 1986;Lenton, 2009)  The evidence presented in JR21 describes a self-organising system. For example, climate evolving from free to forced mode in response to increasing radiative forcing, switching from slow to fast dissipating processes across a network and a heat engine that remains in a metastable state while exhibiting oscillatory behaviour and dominating the warming process. Table   1 outlines a number of examples that collectively provide reasons as to why climate may be considered as complex rather than complicated. The challenge is to bring these into a unified understanding. 165

Methodology
JR21 presents the results of data-driven model discovery based on observations. Tracking the timing and location of regime shifts using the multistep bivariate test (Ricketts, 2015;Jones and Ricketts, 2017;Ricketts and Jones, 2018;Ricketts, 2019) we identified the Pacific Ocean heat engine, then traced its links to the broader climate network via ENSO and indices of 170 decadal variability. These patterns are the result of physical induction, the process of emergence in systems arising from autonomous exchanges of information via currently unknown organising principles. Here we address the issue of emergence in climate models to explore how well they reproduce the behaviour seen in observations. This is introduced by a brief description of model underdetermination.
The integrity of the shift results is described in the SI Sect S2.1 and Table S1. Annual average shift size for TWP and TEP is 175 estimated from an ensemble of 30 RCP4.5 GCMs from CMIP5 (Table S2). GMST from 29 models is available, allowing two https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License.
issues to be addressed: (1) whether GCMs contain a recognisable Pacific Ocean heat engine and (2) whether the behaviour of those heat engines can be linked to shifts in GMST. Tests from the ensemble include measurement of the TWP-TEP gradient, step sizes and frequency, and relationship of step size to model equilibrium climate sensitivity (ECS). The probability of each matching shift in the heat engine and GMST with ±1 year accuracy against being random is calculated 180 for each model. This allows for a time gap of seven years between shifts in each of TEP and TWP (the sampling interval in the multistep bivariate test). There is no additional benefit if shifts in both TEP and TWP match a shift in GMST (Sect. S2.2 for details). These measures are also correlated with 23 skill scores from CMIP5 models (Fasullo, 2020;Fasullo et al., 2020).
Shifts are tracked in greater detail in three models: CESM1-CAM5, NorESM1-M and MIROC-ESM. These have a larger number of shifts for sampling purposes and a sample of hits and misses in aligning shifts in TWP and TEP with GMST, so 185 are not the best or the worst performers available. These criteria are not very strict and other models could have been used.
Selected measures of radiation from measurement, reanalysis and models are investigated because it was unclear as to whether the deficit at the top of atmosphere, imbalance at the surface or a combination of both may influence and modelled observed changes. One question was whether changes in radiative responses to forcing within the climate are regime-like and whether regime shifts are present in climate models. We test NOAA Interpolated Outgoing Longwave Radiation (1979-190 2019), and variables from the NCAR-NCEP Reanalysis 1 (R1 1948-2019) and NOAA/CIRES/DOE 20th Century Reanalysis (V3 1836-2015) along with CESM1-CAM5. More details about data and data sources are in the SI. The early coupled model data used to show the emergence of regime shifts is archived from Jones (2004).
The remainder of the paper explores boundary conditions, especially meridional heat transport using examples form the literature before compiling a conceptual model for climate as a complex, self-organising system using findings drawn from 195 JR21 and this paper.

Model underdetermination
Werndl (2016a) discusses how to distinguish between determinism and indeterminism using observations and other indirect evidence. Determinism describes a state within a system that fixes past and future states or a model containing deterministic 200 functions that does so, an indeterministic model is one that cannot (Werndl, 2016a). This is related to transitive and intransitive change where transitive systems are those that will end up with the same statistics regardless of the starting point, intransitive systems may reach different states and those that may pass through different states over a finite time are almost intransitive (Lorenz, 1968(Lorenz, , 1976Lockwood, 2001;Lohmann, 2009). Climate projections where GCMs start with different initial conditions but end up in a predetermined state are considered transitive (Daron and Stainforth, 2013), whereas a 205 climate that passes through a series of 'rapid step-like changes' (Lockwood, 2001) will be almost intransitive.
Deterministic models are physically-based, e.g., equations of motion, and indeterministic models are probabilistic and/or parameterised, such as stochastic models. When the results of either cannot be distinguished with respect to independent https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. data, they are said to be observationally equivalent (Werndl, 2016b). Climate science has many such examples, global mean surface temperature (GMST) being prominent. 210 For chaos theory, outcomes are a combination of deterministic and probabilistic behaviour (Werndl, 2016a). Because these can combine differently across scales, the results need to be examined in a context appropriate to that scale. Observational equivalence can be overcome by providing auxiliary information that supports one alternative over others. This is the principle behind severe testing (Mayo, 1996(Mayo, , 2005Mayo andSpanos, 2010, 2011;Mayo, 2018) where probative criteria are applied to probabilistic testing so that one hypothesis able to pass test its rivals fail with high probability. We used this 215 approach to conclude that decadal climate change is nonlinear rather than gradual (Jones and Ricketts, 2017).
Model underdetermination can occur when different models are observationally equivalent, or when multiple outcomes present from a common starting point due to factors such as process and parameter uncertainty, incomplete knowledge and complex system behaviour, and there is no straightforward way to distinguish between the outcomes. In complex systems, multiple pathways to potential solutions will emerge for systems and processes with many degrees of freedom and strong 220 boundary conditions. Betz (2009) argues that climate models (GCMs) are underdetermined because there is no way to rationally distinguish between them, and because they are an ensemble of opportunity (Allen and Stainforth, 2002) rather than being systematically assembled. This limitation will always be present in some form, so adequacy becomes important, to avoid underdetermination leading to unsatisfactory outcomes. Examples include (i) simplifying models below levels of efficacy, especially for complex system behaviour; (ii) where there are multiple explanations for a result and the models 225 supporting them are all considered to be plausible, and (iii) where models consistently obtain an acceptable result through compensating errors. We cite examples of these and examine skill scores later in the paper.
ENSO is one case where a wide range of conceptual models and theories have been proposed (Section 3.2.2). Similar issues accompany thermodynamic and dynamic approaches to understanding climate variability (Palmer, 2013;Wehrli et al., 2018;Ghil and Lucarini, 2020), and the role of entropy in the climate system (Ozawa et al., 2003;Kleidon, 2009;Nicolis and 230 Nicolis, 2010;. Dijkstra (2019) summarises the challenges of dealing explicitly with emergence in model studies of complex system behaviour. Finding the appropriate level where specific phenomena emerge is vital. The climate system is rich in emergent structures yet the question as to why they are there is rarely asked (e.g., Clement et al., 2005;Dijkstra, 2019). Examples of robust emergent structures include regime shifts, ENSO, the tropical warm pools and the Atlantic Meridional Overturning Circulation (AMOC), discussed in the following section. 235

Emergence
Two levels of emergence are of interest: self-organisation and self-regulation. Self-organisation is widely accepted for climate, whereas self-regulation is less so. The former is associated with both temporary and permanent structures, whereas self-regulation implies the maintenance of preferred states. The following sections outline evidence for the latter.

Regime shifts 240
Decadal regime shifts and ENSO both emerged early in the development of coupled ocean-atmosphere models. The first generation became available during the late 1990s and the IPCC Third Assessment Report (TAR; 2001) drew on both coupled and mixed-layer models. ENSO emergence was the product of atmospheric and ocean coupling with sufficient resolution to distinguish features such as the tropical Pacific thermocline (Timmermann et al., 1999;Kim et al., 2014).
Some of the coupled models that contributed to IPCC TAR contain shifts in temperature whereas others warm gradually. In 245 Fig. 1, GMST from the HadCM3 model shows the most pronounced regime-like structure and the CCC model the least, producing an almost monotonic curve. Of a suite of five, the MPI model was the only one that produced stationary conditions at the regional scale in SE Australia, whereas the HadCM1 model contained shifts but warmed from the mid-19 th century (Jones, 2004). Observed climate shows a stationary period reflecting free to forced mode change. The two models showing shifts were amongst the first generation to produce an identifiable ENSO signal. Shifts are ubiquitous in the CMIP3 250 and CMIP5 model archives so they may originate from the same processes.

255
On this evidence, decadal regime shifts are a basic response forcing in the presence of a coupled ocean-atmosphere. The timing and location of shifts in JR21 suggests they originate in regions of surface energy imbalance. This is explored further in Section 3.2.2 where regime shifts are tracked in three GCMS under historical and projected radiative forcing.

The role of ENSO
The Granger analysis in JR21 identified El Niño as the major vehicle for dissipating heat during regime shifts. It may not be 260 the cause of a shift, but following an El Niño event, climate in affected regions cools to a warmer mean state than previously.
El Niño has also intensified over the historical period (Timmermann et al., 2018;Wang et al., 2019;An et al., 2020). This is consistent with palaeoclimate modelling undertaken by Clement and Cane (1999), where gradual changes in forcing produce a nonlinear response in the tropical Pacific, influencing temporal patterns over multiple scales while leaving spatial patterns intact. These changes then alter global climate via teleconnections (Cane and . 265 ENSO is a robust feature of the climate but is affected by model underdetermination, given the many different explanations for its behaviour. These can be grouped into two theoretical frameworks: an unstable, self-sustained natural oscillatory mode of the coupled system and a stochastically-forced stable mode triggered by climate noise (Wang and Picaut, 2004;Wang et al., 2017b;Timmermann et al., 2018). Four oscillators have been proposed: delayed, recharge, Western Pacific and advective-reflective (Wang et al., 2017b;Wang, 2018) and one that integrates all four (Wang, 2001). ENSO is produced in 270 climate models that are noise driven, and those that are neither noise-driven or chaotic, but show some skill (Ramesh and Cane, 2019), as well as in the Zebiack-Cane (ZC) intermediate complexity ENSO model (Zebiak and Cane, 1987), which is capable of representing chaotic behaviour (Tziperman et al., 1994;Ramesh and Cane, 2019). The choice between nonlinear deterministic chaos and linear, stochastic forcing remains unresolved . ENSO has undergone nonlinear dynamic warming, promoting El Niño events and suppressing La Niña Timmermann et al., 2018). Wang et al. (2019) reclassify El Niño events according to their evolutionary characteristics rather than peak expression, recognising strong basin-wide, moderate eastern Pacific and moderate central Pacific types. The 285 moderate eastern events initiate in the eastern Pacific and the central in the west, while the strong basin-wide events originate in the west but combine characteristics of both. Eastern events occur to 1978 and central events thereafter, the change p<0.001 . They measure an increased temperature gradient across the Pacific, but as we note, the TWP and TEP regions have maintained a constant gradient. They also note increased shoaling of the thermocline and increased vertical gradients, which also favour a western origin . 290 The intensification of ENSO in the past fifty years relative to historical and reconstructed pre-industrial records focuses attention on the role of external forcing (McGregor et al., 2013;Freund et al., 2019). The change around 1978 from eastern to central moderate events

coincides with regime shifts in TEP and TWP. A shift in the Southern
Oscillation Index in 1976-77 (Power andSmith, 2007) supports the idea that this is a response to regime change in the tropical Pacific. The first moderate central Pacific event in 1986-88 also coincided with a NH warming shift. This is 295 consistent with climate moves from free to forced mode in a series of regime shifts rather than in a single event.
According to Timmermann et al. (2018), both central and eastern Pacific El Niño operate in states of near criticality. The relationship between TWP and TEP may be the source of this near-criticality. The see-saw described in JR21 is characterised by the dynamic maintenance of a ~2 °C difference between TWP and TEP; the rapid transition to from positive current to negative lagged correlations on monthly timescales; switching between positive, negative and neutral sea level correlations 300 on a weekly to monthly timescale; switching between positive and negative temperature correlations on annual timescales as part of an intensification from free and forced climate modes (Jones and Ricketts, 2021b).

Self-organisation in coupled model studies
In JR21, we proposed that the two-sided nature of the heat engine as a cold-to-warm heat pump sustained by kinetic energy could play a governing role in a self-regulating system. Assessing the emergence of various parts of the heat engine and 305 broader climate network in idealised model studies, preferably using coupled models, may provide examples of how fundamental the various components are.
Aquaplanet experiments show that complex structures can emerge with no topography, up to five planetary-scale attractors in a recent example (Brunetti et al., 2019). The order of emergence shows that the simplest representation of an aquaplanet leads to mainly zonal dissipation but the introduction of zonal asymmetry is enough to introduce a reverse heat engine 310 structure and extratropical gyres (Wu et al., 2021). Wu et al. (2021) compare an aquaplanet and ridge planet in a coupled model. Both have polar land above 80° and the second a 1° wide ridge running N-S. Equatorial deep upwelling emerges on the aquaplanet but the ridge leads to the development of a cold tongue and warm pool, boundary upwelling and Earth-like patterns of meridional transport (Wu et al., 2021). A similar result is found with asymmetric thermodynamic forcing in a coupled mix-layer and atmospheric GCM along with the propagation of extratropical signals into the tropics (Zhang et al., 315 2016). The warm pool -cold tongue has the reverse structure of the heat pump. The simulation also shows a 4 °C difference between east and west, negative feedback on the surface typical of observations and pronounced decadal variability (Zhang et al., 2016).
An experiment altering topography in 20% increments from 0% to 140% assessed how ENSO is affected (Kitoh, 2007). No topography leads to the warm pool moving into the central Pacific, ENSO periodicity lengthening and becoming regular. 320 Increasing the altitude moves the warm pool to the west, ENSO becomes more aperiodic and frequency increases. The behaviour of the thermocline changes with topography exceeding 100% as the warm pool moves through the maritime continent. In another experiment, zero topography leaves the location of the warm pool unchanged but the area of uplift and the Walker Circulation extend all the way across the Indian Ocean (Naiman et al., 2017). ENSO frequency slows and becomes more regular as above. The Walker Circulation declines by 45% but the Hadley circulation by only 2% (Naiman et  325 al., 2017), leaving atmospheric meridional heat transport almost unchanged. Yang and Wen (2020) conduct an experiment where they remove the Tibetan Plateau, a significant heat pump during the northern summer in the CESM.1 coupled climate model. A fast and slow response on the AMOC is initiated. The fest response enhances westerlies over the north Atlantic, intensifying AMOC for a few decades. The slow response involves https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. movement of moisture from the Pacific to the Atlantic, freshening the surface and leading to the almost total shutdown of 330 AMOC in about 300 years (Yang and Wen, 2020). Other studies that remove all topography show that deep water formation moves to the Pacific Ocean (Schmittner et al., 2011;Sinha et al., 2012;Maffre et al., 2018). Idealised experiments show that meridional overturning is also readily produced in aquaplanet oceans (Ferreira et al., 2011;Jamet et al., 2016).
The location and structure of the heat engine in the current climate and of AMOC (Naiman et al., 2017;Yang and Wen, 2020) both seem to be a product of geologically recent orogeny in the Tibetan Plateau (Zhisheng et al., 2001). However, 335 Ferreira et al. (2011) argue that the overall structure is more fundamental and is intransitive, leading to multiple equilibria in a coupled aquaplanet experiment.
The heat pump is a basic product of asymmetric distributions of energy for dissipation on an ocean-dominated planet and decadal variability is emergent. Both ENSO and regime shifts are a product of coupling in current climate conditions and it is possible that they are more fundamental responses to asymmetry, emergent properties that respond to forcing. 340

Physical limits to gradual change
The radiative-convective model and linear response and related theories require warming to be gradual. The theory and practice supporting this have deep historical roots. Measurement methods were developed when the Earth system was thought to respond gradually to change, reflecting the cosmology of the time (Laplace, 1830;Stigler, 1986;Baker, 1998;Rudwick, 2005Rudwick, , 2010. In the earliest models, radiative transfer warmed the atmosphere and ground surface, 345 responding proportionally to the amount of CO2 (Arrhenius, 1896;Hulburt, 1931;Callendar, 1938). Successive assessments verified those early findings, subject to the buffering effect of the ocean and of well-mixed greenhouse gases on temperature distribution Landsberg et al., 1961), and the presence of atmospheric feedbacks (London, 1962;Schneider and Dickinson, 1974). The increasing complexity of climate models reinforced this view, as the core theory remained sound (Weart, 2008;Anderson et al., 2016;Benestad, 2017;Manabe, 2019). Models continued to support this position until the 350 development of coupled models, when nonlinear responses to forcing emerged as described above.
The simpler models therefore produce gradual change, but the complex models do not. Because of the earlier success of the core theory in addition to the additive nature of the forcing-response relationship (Marvel et al., 2015), the role of a dynamic ocean acting as a heat storage has been overlooked as climate models have improved. The following factors influence surface heat balance of the oceans and together do not support gradual warming over ocean or land: 355 Water has roughly 3,200 times the heat capacity by volume and 24 times the heat conductivity of air. When an airmass passes over a water surface, sensible and latent heat reach equilibrium within a 300 m distance (Morton, 1986). The heat capacity of the atmosphere is equivalent to the top 2.5 m of ocean (Deser et al., 2003) and the annual average mixing depth of the atmosphere into the ocean is 90 m (Hartmann, 1994). Therefore, the shallow ocean has ample capacity to absorb available atmospheric heat. The ocean constitutes 70.8% of the 360 global surface area and approximately 75% of the tropical surface area, so dominates surface heat exchange at the planetary scale. (ii) Circulation of the atmosphere over the ocean provides a strong negative feedback absorbing heat (Frankignoul and Hasselmann, 1977;Frankignoul and Kestenare, 2002). This returns anomalies of >1 W m -2 in the atmosphere (e.g., those due to ENSO) to the mean steady state within months -an effect especially strong over 365 the eastern equatorial Pacific (Park et al., 2005;Yoshimori et al., 2016). This is reflected in the stability of TEP between shifts. The monthly Granger analysis undertaken for JR21, also shows that lagged correlations between most regions return to background noise within months.  (Schmidt et al., 2010)). Combining points (i) to (iv) this poses no physical barrier for the ocean to absorb the small amount that is considered to warm the atmosphere in order to maintain steady state. (v) Heat content in the upper ocean shows strongly regime-like behaviour becoming more gradual with depth 385 (Ricketts, 2019). The shift/total warming ratio for the ocean globally is 0.88, and is between 0.73-1.05 for most zones except for the northern hemisphere extra tropics (0.58 30° N-60° N, 0.48 60° N-90° N), so most of the ocean is close to steady state between regime shifts.

(vi)
Under sustained radiative forcing, the ocean warms the land, even though the land warms by a greater amount (Dommenget, 2009;Boer, 2011;Byrne and O'Gorman, 2018). This is consistent with the Granger analysis in 390 JR21.
Warming is controlled by processes at the ocean surface, a product of the coupled atmosphere-ocean. Ozawa et al. (2003) describe climate in two parts: the radiative-transfer component, which is additive and responds linearly to forcing (Meehl et al., 2004;Marvel et al., 2015), and the dissipative system, which is multiplicative and responds nonlinearly. With the ocean https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License.
acting as a steady-state heat storage and release system, the amount of heat available for dissipation increases gradually, but 395 its release is a nonlinear process.
This changes the accepted order of the warming process. The current consensus is that the atmosphere warms directly, the rate being mediated by ocean heat uptake and other influences, such as natural forcing, land-use and land cover change, and decadal variability (Deser et al., 2012;Hawkins and Sutton, 2012;Anderson et al., 2016;Lehner et al., 2020). Instead, all heat not taken up by land and the cryosphere is absorbed by the ocean. The relationship between forcing and GMST δT = λδF, 400 where T is temperature, F is forcing and λ is a constant related to atmospheric feedback processes (Ramaswamy et al., 2001), therefore becomes a measure of potential and acts as a boundary condition. shown are feedbacks that transfer information back up this system at every level (and are essential parts of any complex system). Under standard practice, the first chevron would be labelled radiative forcing and link directly to the third chevron, 405 where heat is dissipated through mainly dynamic processes. A delay is factored in for ocean buffering but the relationship between forcing and changing temperature via direct atmospheric warming, ocean heat uptake and release is estimated from more complex models and applied to simpler models in a highly parameterised form. The simplest models, one-dimensional energy balance models, apply such parameters to estimate changes in GMST.
The missing step represents the storage and release process involved in regime shifts: the thermodynamic forcing 410 component. Two types of decadal regime shift are possible: the first is internally-driven and involves a redistribution of dissipative processes in steady state. The second is associated with warming or cooling depending on the direction of forcing and involves a change in energetic state.

Relationships between TWP, TEP and GMST
Two issues are explored: (1) how realistic the heat engine structure in coupled GCMs is and (2) can shifts in TWP and TEP be linked to those in GMST? TWP and TEP is calculated from an ensemble of 30 RCP4.5 GCMs from CMIP5 (Table S8), 420 and 29 available GMST records.  Figure S1). The lower average gradient may be due to systematic biases, with temperatures in the eastern tropics being too warm (Richter, 2015) 425 and the equatorial cold tongue being too cold (Zheng et al., 2012;Seager et al., 2019). CMIP5 models achieve the right ENSO amplitude due to a weak Bjerknes feedback (east-west atmospheric circulation via the Walker Circulation) and a weak negative surface heat flux feedback (Guilyardi et al., 2020).
The total number of shifts in TEP (n=122) is roughly half that in TWP (n=242). Warming is dominated by shifts rather than trends, with shift/total warming ratios averaging 0.70±0.24 in TEP and 0.67±0.14 in TWP. The average step size in TEP is 430 0.49±0.18 °C and in TWP 0.26±0.14 °C, both slightly higher than observations (0.46 °C, 0.24 °C) and showing a similar ratio east to west (Table S2, Figure S2).
From 1861-2100, 47% of shifts in TWP and/or TEP (167 from a total of 352) match those in GMST within ±1 year and 133 out of 279 or 48% of shifts in GMST are matched. The probability of shifts in TWP and TEP, matching GMST ±1 year calculated for each model ranges from 0.43 to 0.0005 with a median of 0.10, with 8 of 29 models p<0.05. This is calculated 435 from binomial probabilities compared with a random null, with further details provided in the SI (Section S2.1). The likelihood for observations using the same rules is p=0.049 because the ±1 year window limits shift initiated earlier. When surface area is taken in to account (5.7% of the globe for each of TEP and TWP) and no spatial skill is assumed, median probability of the null case reduces to p<0.01 and all outcomes are p<0.05 (Table S2). When weighted to tropical area, 25 out of 29 are p<0.05. Compared to the totally random case, the models do well. 440 Of the 27 GCM records with available estimates of ECS, the correlation between average TEP step size in each of the models with ECS to 2100 is 0.66 (p=0.0003) and TWP step size and ECS is 0.39 (p=0.06). This is higher than the same correlations with change in GMST (TEP 0.41 (p=0.04, n=29), TWP 0.19 (p=0.37, n=29)). The average TEP step size -ECS correlation in the models 1861-2018 is 0.56 (p=0.004). This latter correlation is higher than any of the diagnostics for comparing historical warming with ECS in the 94-member ensemble updated from JR2017, including total warming to 2018. 445 This is because the positive and negative forcing in the historical runs largely cancel each other out, but this is not the case for TEP, which is far more stable.
Shifts in both TWP and TEP produce an immediate atmospheric feedback response, with TEP being the stronger. This contrast is due to regions of subsidence (e.g., TEP) having high climate sensitivity and regions of divergence (e.g., TWP) having neutral to negative sensitivity (Zhou et al., 2017;. A patch experiment forcing SST in the 450 SE Pacific produced a warming response highly correlated with cloud feedback (0.93) compared to the western Pacific (-0.43) . Precise locations of forced regions forced was less important than maintaining an E-W gradient (Zhou et al., 2017;. A patch experiment applying historical climate over the warm pool transported heat to the upper troposphere, which spread to the whole troposphere, increasing outgoing longwave radiation https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. (Dong et al., 2019). Local inversion strength and low cloud decreased, but increased globally, raising albedo and resulting in 455 negative feedbacks. This led Dong et al. (2019) to suggest the warm pool plays a self-regulating role.
Average TEP and TWP from the model sample (n=28) is 0.49 °C and 0.28 °C for TEP and TWP respectively, but from the <3.5 °C sample (n=11) is 0.46±0.17 °C and 0.26±0.10 °C, respectively, similar to observations. Application of the regression relationship between TEP and ECS in the models to 2018 to observations estimates an ECS of 3.1±0.6 °C (adj r 2 =0.34, p=0.002) and 3.2 °C for TEP and TWP singly (Table S3). Using the full record to 2100, estimates 3.1±0.5 °C (adj r 2 =0.42, 460 p=0.001). A more sophisticated analysis of historical climate using spatial representations of sensitivity estimates 3.2 °C, but with a larger range of uncertainty (1.5 °C to 8.1 °C, 5% to 95% confidence) .
Skill scores for the CMIP5 models have recently become available (Fasullo, 2020;Fasullo et al., 2020), covering overall scores, energy, water, dynamics, annual and seasonal and contributing variables (n-23). These are available for 23 of the climate models with 20 having accompanying ECS estimates. Correlations are presented for measures including TEP and 465 TWP shift number and size, matches with GMST, the TWP-TEP gradient and ECS. Some results are strongly negative, especially for models with ECS>3.5 °C (Table 2; Tables S4 and S10). TWP size and TWP-TEP gradient are related to ECS rather than skill. The TEP shift/warming ratio (SWR) is affected by outliers. Skill has little effect on the TWP shift warming ratio, TWP number of shifts or in the probabilities of TWP and TEP matching GMST. Negative correlations with skill are most associated with TEP. Greater skill is associated with higher average TEP and fewer shifts, neither partitioned by ECS. It also leads to fewer matches with GMST, but this may be 475 because greater skill results in fewer shifts in TEP; models with more shifts in TEP tend to have higher matching rates.
The strongest relationships with model skill are for frequency of TEP shifts compared to TWP and the magnitude of TWP shifts compared to TEP. Both ratios decline with greater skill and are p<0.05 for 22 of the 23 skill measures (Table 2). Fig.   3a shows these as a limit relationship. Given higher skill for energy and overall metrics, the TEP/TWP ratio declines to an average of 0.4 equivalent to 2.5 shifts in TWP for every shift in TEP. The ratio of average shift size for TWP compared to 480 TEP has a similar relationship with skill scores, moving towards a limit of about 0.45, slightly lower than in observations (Fig. 3b). The small sample size and inhomogeneities in observed TWP makes direct comparison with observations difficult but greater model skill produces ratios similar to observations. Strong negative feedbacks in TEP maintain steady-state conditions. High atmospheric feedbacks accompanying regime shifts in TEP are influenced by skill measures for the energy and water cycles and with ENSO. TWP is associated with neutral to negative atmospheric feedback and responds weakly to 485 skill measures. This shows that skill resides in TEP, which also strongly influences TWP up the TEP-TWP gradient. ECS has a positive relationship with skills associated with the water cycle and longwave radiation. Skills most strongly 490 associated with heat engine metrics are energy, water, shortwave cloud forcing, overall, ENSO, precipitable water and longwave net radiation at the top of the atmosphere. This order changes slightly when ECS is a factor, but energy stands out further and vertical velocity at 500 hPa becomes more important. The probability of TWP and TEP matching GMST does not show any relationship with skill, so is not shown. This negative result suggests either that shifts propagate fairly randomly, perhaps following the path of least resistance, or that the areas of skill which represent shift propagation have not 495 been included in the assessment; e.g., modes of climate variability.

Tracking shifts in climate models
Shifts are tracked in three models, CESM1-CAM5, NorESM1-M and MIROC-ESM, using temperature data for ten regions (see SI S2.2.1), fewer than the 29 used for observations. Detection is carried out on annual and monthly timescales. Shifts on observations are mostly initiated in the extratropics during free mode and in the tropics in forced mode. In the models, some 500 shifts are initiated by TWP or TEP but others are not. The smaller sample size does not generate a reliable statistic. The ocean leads most of the time, land on a few occasions and where sequences are staggered, the source cannot be reliably identified.
Shifts in TWP and six-month anomalies subtracted from the regime mean are also tested to see whether the resulting peaks However, to 2018, none of the models are able to reproduce the influence of TEP on GMST seen in observations, the CESM-CAM1 model being closest. The curve to 2100 for the MIROC-ESM model is almost the same as for observations to 2018. Results for the stationary data are also not as closely aligned (see SI Fig. S3). Some relationships are quite well 520 reproduced by a single model but no model performs best across all relationships. The most problematic is between TWP and TEP, which none of the models capture well. The rapid reversals between positive and negative lagged correlations seen observations (Fig. 16, JR21) are less pronounced in model data, however, they intensify somewhat during the 21 st century.
The models show clear regime-like behaviour and the ratios of magnitude and frequency for TWP and TEP are similar to those for observations. These measures have close a relationship with skill metrics, while the timing of shifts for TWP, TEP 525 and GMST do not. Shifts are clustered together showing that climate regimes are episodic, but their order and timing is less organised than for observations. Heat engine behaviour intensifies with forcing but model climates do not appear to exhibit free and forced modes. And as many have observed, models respond more slowly to forcing in a number of areas, particularly with the representation of the tropical Pacific and ENSO structure (Flato et al., 2014;Guilyardi et al., 2020).

The heat engine pacemaker
Kosaka and Xie (2016)  After removing quadratic trends, the intermember variance in the POGA ensemble is 68% lower than the free-running historical ensemble (Kosaka and Xie, 2016). The ensemble average and members closely follow observations, as shown in their Figure 1a. Discrepancies with observations occur in the 1880s, WWII and with Arctic warming in the 1930s. The first is partly due to an over-estimation of the Krakatoa eruption, and the second is due to data quality during WWII (Kosaka and 550 Xie, 2016), which is discussed in JR21. The Arctic result is consistent with solar forcing in the higher northern latitudes being transmitted to the tropics in the data that is not well-captured by the model. Allowing for this, their close results to observations, especially after 1950, means they are capturing very similar regime changes as an ensemble. While the historical ensemble recreates the overall pattern of change, the timing of shifts in conventional historical runs has a stochastic element due to internally generated model uncertainty resulting in greater variance (Jones and Ricketts, 2017). 555 Kosaka and Xie (2016) find that a 1 °C rise in the eastern Pacific leads to a 0.33 °C increase in GMST, identifying decadal variability as more influential than ENSO variability (Kosaka and Xie, 2016;Wang et al., 2017a). This region covers <10% of global surface area so is highly influential (Kosaka and Xie, 2016), as our tracking analyses in JR21 and model analyses in Sect 3.3.1 show. The difference between the historical and POGA runs contains a clear signal of the IPO, and a series of EOFs shows the POGA ensemble mean captures observed decadal variability better than the historical simulation (Kosaka 560 and Xie, 2016;Zhang et al., 2018). This led Kosaka and Xie (2016) to conclude that the IPO is the main contributor to the GMST staircase, imposing variations on a secular trend. Further work has strengthened this conclusion (Xie and Kosaka, Zhang et al., 2018). The 2016 paper is highly cited, so this interpretation is widely accepted. Our interpretation is that the POGA pacemaker region pace-makes the heat engine, making this result even more significant.
For the IPO (and thus PDO) to be reproduced, TEP needs to be coupled with TWP and the broader Pacific (Henley et al., 565 2017;Zhang et al., 2018;Kim et al., 2020) -TEP cannot do it alone. The close correspondence between observations and the POGA ensemble shown in Figure 1a of Kosaka and Xie (2016) indicates that the model is reproducing regime shifts in close accord with observations. This suggests that not only is TEP pace-making the heat engine via the TEP-TWP relationship, but that the heat engine is also engaging with the network of teleconnections in the broader climate.

Regime changes in energy transport 570
This section explores whether regime shifts occur in selected measures of energy flux at the surface and top of the atmosphere in observations, reanalysis and climate models.

Satellite observations and model reanalyses
Outgoing longwave radiation from observations and reanalysis is analysed for shifts and compared. Data sources are NOAA The tropics are largely neutral except for 0-30° N, which shifts down by 5.9 W m -2 in 2015.
The reanalyses for the same period do not capture the large increases in the high latitudes. Correlations of zonal regions with observations are generally <0.5, with the tropics being poorest. The highest regional correlation is 0.76 between observations and NCAR-NCEP v1 for 60°-90° N 1979-2019 but the former increased by 6.4 W m -2 compared to 3.8 W m -2 . Correlation is high over TWP and TEP but estimating top of the atmosphere radiation from surface heat content from small areas reduces 585 the margin for error. Largely stationary diffusive processes and characterisation of surface temperatures and feedbacks also contribute to higher correlations (Trenberth and Stepaniak, 2003a;Gastineau et al., 2014). OLR in observations reduce slightly over TWP in 1999 and increase slightly over TEP in 2002, both statistically minor. The reanalyses reproduce the direction of change but not the magnitude. extratropics by 0.5 W m -2 . There are large differences in interannual variability between the observations and reanalyses by a 595 factor of 2.6 and 6.6. Within the reanalyses, heat flux changes do not follow shifts in temperature.
Shifts in both latent and sensible surface heat fluxes are also present in both reanalysis data sets looked at. The direction of change resembles those in the models, with surface latent heat increasing in the higher latitudes and responding to decadal variability in the lower latitudes, especially over the ocean. Surface sensible heat flux mostly shifts negatively. However, due to similar issues with the OLR data, these analyses were taken no further. 600 These analyses reflect the problems of reconciling different measures of energy-related variables (Li et al., 2013;Hinkelman, 2019;Liang et al., 2019). Most measures cannot be directly observed, depend on either remote sensing, modelling or both.
Observations are affected by instrumental inhomogeneities and changes in technology (Loeb et al., 2018a) and is an area where more confident estimates are urgently needed . 610 To overcome the issue of absolute measurement from satellites, Loeb et al. (2021) use relative measurements of surface and toa shortwave, longwave and net radiation from the CERES ED4.1 product mid-2005 to mid-2019. The absolute measurement providing the reference state is an estimate of energy uptake over that period from different sources (Johnson   von Schuckmann et al., 2020b), with the satellite measurements providing the relative changes. The average rate of net heat uptake is 0.77±0.6 W m -2 , the correlation the top 2,000 m of ocean heat uptake is 0.70 (Loeb et al., 2021). The 615 trend is 0.50±0.47 W m -2 decade -1 and for ocean heat uptake it is 0.43±0.40 W m -2 decade -1 . They attribute top of the atmosphere changes to increased absorbed shortwave radiation (ASR) due to clouds and surface albedo changes, and greater OLR (net toa 0.41±0.22, 0.65±0.17 less -0.24±13, all W m -2 decade -1 (Loeb et al., 2021)). Loeb et al. (2021) also attribute much of the change to the phase change in PDO. They link the higher SSTs in the eastern Pacific, which we attribute to a regime shift, to less low cloud and greater ASR, a factor of 4 increase compared to before 2014, 2.5 times when offset by the 620 increase in OLR (this is not the case in the NOAA record above, though it is for 60-90 °N). They note that SST began to warm in 2012 and attribute these changes to variability (Loeb et al., 2018b;Loeb et al., 2021). According to Loeb et al. (2021), the trend does not register below the p<0.05 threshold, which would also be the case for a regime shift (probabilities for the two methods are similar because they both utilise serial independence). A preliminary look at the data shows a shift occurring in 2012 from around 0.5 W m -2 to 0.95 W m -2 . Given the close correspondence with shallow ocean heat content 625 and SST, we conclude this change is likely to be a regime shift involving a thermodynamic response to forcing. A shift due to variability alone is likely to be accompanied by changes in ASR and OLR that compensate each other, which is not the case here.

Climate model output
Energy fluxes in the CESM1-CAM5 model are analysed for regime changes, one of the better performed models for these 630 variables in the CMIP5 ensemble (Fasullo, 2020). Global OLR exhibits negative shifts in 1883, 1963 and 1982 that can be associated with volcanic forcing. Positive shifts occur in 1996, 2030, 2043, 2064and 2080. This timing matches regime shifts in model GMST of 1998GMST of , 2014GMST of , 2029GMST of , 2043GMST of , 2054GMST of , 2066GMST of and 2081GMST of , with only 2054 From 2000, they occur within one year of shifts in tropical ocean (1970,1998,2013,2029,2043,2065,2081).
Outgoing shortwave radiation changes in almost mirror image to the longwave data, with historical increases in 1883 and 635 1959, then shifting downward in 2016, 2033, 2053, 2062 and 2079 (Fig. 5, purple). The 21 st century component forced by RCP4.5 is dominated by gradual changes, with shifts playing a secondary role. When net radiation is analysed, the two almost cancel each out, leaving a very regime-like response (Fig. 7, green).
Surface heat flux is represented by sensible and latent heat components. Sensible heat flux declines gradually over the century from about 1930 (Fig. 7, orange). Change in latent heat dominates surface heat flux (surface latent heat flux; SLHF, 640 Fig. 7, blue), declining gradually over the 20 th century and increasing in the 21 st . The record is dominated by regime shifts, increases occurring in 1998 (p<0.05), 2014, 2030, 2043, 2064 and 2080. These changes coincide with OLR. The decline over the 20th century in OLR to 1995 is almost three times the surface latest heat flux (0.14 W m -2 per decade compared to 0.05 W m -2 per decade) and the increase after 1995 is 3.8 W m -2 compared to 4.4 W m -2 , so the surface flux increases faster than OLR during this period. The difference is mediated by absorbed shortwave radiation.
The timing of surface latent heat flux and OLR follows shifts in GMST in CESM1-CAM5, sometimes by only a month.
Surface latent heat flux also follows shifts in the tropical ocean, except for one instance where it leads by a month. This shows closer internal coordination than the reanalysis data does with observations. Neither TWP or TEP show consistent behaviour with respect to these changes. Both dry and moist static energy in the atmosphere respond to forcing in a stepwise fashion, moist energy more strongly. 650 This is an illustrative example, and other models show similar behaviour. NorESM1, also one of the better ranked models, has a similar response for net top of the atmosphere radiation balance, but varying responses for outward longwave and outgoing shortwave radiation. In both cases there is a decrease of -0.54 W m -2 (CESM1) and -0.62 W m -2 (NorESM) in 1993 and 1995, followed by -0.54 and -0.49 in 2026 and 2034, respectively. More analysis would be needed to see whether this is a common response, but the longwave and shortwave radiation may be compensating for each other while maintaining 655 steady-state net radiation. This appears to be consistent with the relatively brief historical analysis provided by Loeb et al. (2021), where in their Figure 1, both ocean heat content and net top of the atmosphere radiation shift around 2012, moving net radiation from ~0.5 W m -2 to ~0.95 W m -2 . If that is the case the models produce slightly larger shifts than those in observations, similar to the results in Section 3.3.1 for TEP and TWP.

Boundary conditions -meridional heat transport
Strong boundary conditions provide physical limits for complex system behaviour. Here we look at meridional heat transport (MHT), which includes different rates of ocean transport and the dissipation of sensible and latent heat in the atmosphere.
Total ocean and atmosphere heat transport provide a constraint where ocean and atmospheric transport compensate for each 670 other (Bjerknes, 1964;Stone, 1978). Heat transport can be separated into quasi-stationary and transient, and dry static, latent and kinetic heat energy (Trenberth and Stepaniak, 2003b). The latter is fairly small. The dynamic aspect covers the transport of energy from the equator to the poles, partially controlled by the fundamental constraints of an atmosphere and ocean in orbit.
Total energy involved is equal to excess of heat at the equator counterbalanced by the loss of longwave radiation from the 675 top of the atmosphere in the high latitudes. High albedo towards the poles also decreases towards the tropics, producing a much steeper gradient of absorbed solar radiation compared to outward longwave radiation (Trenberth and Stepaniak, 2004).
The limits to atmospheric heat transport can be considered energetically, dynamically or dissipatively Donohoe et al., 2020). The energetic approach seeks to balance surface and top of the atmosphere radiation, assessed zonally and accounting for feedbacks. The dynamic approach measures energy transport by distinct atmospheric circulations 680 at different latitudes. Both vertical and horizontal transport occur in the atmosphere. Vertical transport also contributes to horizontal transport through mechanisms such as Hadley Cells and along the pressure gradient from the equator to poles (Wallace et al., 2015). The diffusive approach takes meridional gradients in temperature, separating out the dry and moist components . Total energy transport from pole to pole produces a smooth sinusoidal curve but the contribution of each varies both spatially and temporally (Fig. 6). The interannual variability of both dry static and latent energy is much greater for each than 690 the variability of the two combined, appearing 'remarkably seamless' (Trenberth and Stepaniak, 2003b). When ocean meridional transport is added, the relationship is even more smoothly sinusoidal (Fig. 4) (Fasullo and Trenberth, 2008;Yang et al., 2015). The relationship also leads to energy flux being rebalanced across the hemispheres, maintained through both circulation and teleconnection (Liu and Alexander, 2007;Liu et al., 2017;Hu and Fedorov, 2018). This is also the case in CMIP3 models, which contain quite large errors, but maintain hemispheric symmetry . 695 Donohoe et al. (2020) compare MHT for the last glacial maximum, pre-industrial control and abrupt 4 x CO2. The effect of the latter on total meridional heat transport (MHT) is shown in Fig. 7. Marginal changes in MHT amounting to 2% of the total for the difference between 4 x CO2 and pre-industrial in an ensemble of 7 models (Donohoe et al., 2020). The largest differences are due to incoming radiation and the equator-to-pole temperature. Lower incoming radiation measured as effective forcing results in a larger temperature deficit. Cooler climates preference sensible heat transport in the atmosphere, 700 largely because of the limited effect of the tropics in dissipating moist static energy. These results show that MHT provides relatively constant boundary conditions under a wide range of climates. of-the-atmosphere radiation, both dry and moist static energy respond. The diffusive structure gives the opposite result -715 moist static energy and temperature control produce a top-of-the-atmosphere response. Armour et al. (2019) tested these results further, finding: • Moist static energy is needed in the diffusion process to produce arctic amplification, dry static energy cannot do it alone, emphasising the importance of water vapour.
• AHT changes are strongly constrained by the meridional patterns of forcing, feedbacks and ocean heat uptake but 720 are largely insensitive to the details. It is energetically constrained while being mediated by moist static energy, with temperature adjusting to the relationship between the two.
• Meridional AHT responses are energetically constrained rather than being a product of meridional variations in forcing. Radiative feedback uncertainty at the equator affects equatorial processes whereas polar uncertainties depend on all latitudes. 725 They make two qualifications: the pattern of atmospheric feedback depends on ocean heat uptake and surface warming, and circulation must obey energy constraints . Their suggestions for the latter are the maximised conversion of available potential energy to kinetic energy (Lorenz, 1955) or maximised entropy production (Ozawa et al., 2003).
As shown here, the pattern of atmospheric feedback depends on regime changes surface warming and heat transport. Shifts in latent heat transport occur within the HadISDH marine (Willett et al., 2020) land data (Willett et al., 2014(Willett et al., ) 1973(Willett et al., -2020 For specific humidity, the marine global and tropical records shift in 1987, 1997 and 2015, the NH similar except the last date is 2019 and the SH 2015 only. For land, global dates are the same, the tropics shift in 1997, 1995 and 2015, the NH in 1987 and 2015 with a modest response in 1997 (<p=0.1) and the SH not at all. The changes coincide with temperature and mainly concentrated in the NH. This is consistent with observed OLR increasing in the NH but not the SH, where there is little increase in moisture transport measured as specific humidity. Shifts involving both dry and moist energy will cause a 735 rapid response in cloud feedback, affecting absorbed solar radiation. Our interpretation of the Loeb et al. (2021)

results in
Sect. 3.4.1 is consistent with this, although that record is brief and inconclusive. However, the two climate models looked are consistent with this. The limits to meridional heat transfer therefore provide very strong boundary conditions where internally climate has to adjust to internal and external energy imbalance.

The Pacific Ocean heat engine and self-regulating climate network 740
This section combines the analyses and findings from both papers to provide a conceptual model for climate as a selfregulating complex system. Complex system science provides the overarching framework and the reasoning is guided by the least action principle (Nicolis and Nicolis, 2012). The robust components of a complex dissipative system are physicallyinduced and self-organised by following paths of least resistance. Least action is a catch-all strategy for considering efficiency where the governing principles are unknown, multiple boundary conditions and critical limits are in play, and 745 responses to forcing involve a complex mix of dynamic and thermodynamic processes. The principal response to historical forcing, decadal climate regime shifts, emerged in coupled models at roughly the same time as ENSO, so are a basic feature of coupling. Earlier predictions of theory and models were for a gradual response to forcing, but with the advent of coupled models, nonlinear responses emerged (Fig. 1). Under severe testing, decadal regime shifts have been confirmed as the principal medium for climate change under sustained radiative forcing (Jones and Ricketts, 750 2017). This produces step-ladder-like change that over the long term, forms a complex trend.
The change process can be separated into radiative and dissipative components following Ozawa et al. (2003). The radiative process governs the energy balance component of forcing which is additive, and the dissipative process, the thermodynamic component, is multiplicative. We propose adding thermodynamic forcing to the explanatory sequence (Fig. 2). The following subsections describe the historical evolution of climate modes (4.2.1), the heat engine structure and performance 755 (4.2.2) which builds on the description in JR21 Section 3.5, decadal oscillations and network behaviour (4.2.3) and selfregulation (4.2.4).

Climate modes
Global climate occupies different states that reflect the strength of external forcing and degree of coupling between the heat engine and broader climate network. Historically, climate has occupied free and forced modes that depend on the rate of 760 forcing and the capacity of the system to store and dissipate heat. These terms follow Lorenz (1979), but instead of referring to specific processes, internal and external influences are considered at the whole-of-climate scale. From the pre-industrial period to the late 1950s to early 1970s, climate was in free mode. Coupling was weaker, with major dissipative activity mainly restricted to the tropics, temperature was subject to random walks influenced by the AMO and both positive and negative shifts are recorded. 765 The forced state emerged as a series of regime shifts between 1957 and 1972, led by the global ocean and manifesting as white noise. Ocean temperatures had earlier whitened 1914-38 during the recovery from cooling, but then relaxed through to 1957. Land temperatures were unaffected during this period despite experience greater warming than oceans, whitening in 1972 under the influence of the oceans. By this time, climate was fully in forced mode, characterised by increased teleconnections, a more intense ENSO, the relationship between TWP and TEP intensified, resulting in successive positive 770 regime changes in response to thermodynamic forcing, caused by an excess of heat in the tropical ocean available for dissipation. Selected characteristics for free and forced modes are listed in Table 2.
No similar evolution is detected in the three climate models described in Section 3.2, which seem to be in a state somewhere between the two modes. Granger analysis shows that El Niño is less responsive to forcing in these models compared to observations, which is likely to be linked to broader concerns about the performance of GCMs in the tropics (Richter, 775 2015;Xiang et al., 2017;Eyring et al., 2019) and possibly coupling in general.
While the climate is in free mode, both negative and positive shifts can occur. The looser ocean-atmosphere coupling and more relaxed state of the heat pump in free mode may allow shifts to readily follow the direction of forcing, whereas in forced mode the momentum is such that a return to a free state may be required before a reversal is possible. Palaeoclimatic https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. evidence of rapid shifts suggests there may also be a cold forced mode during periods of strong negative forcing, such as the 780 Younger Dryas. A shift in response to internal forcing or combined internal and external forcing is also more likely in free mode, whereas competing directions in forced mode will tend to extend periods of steady-state conditions, as was the case for the 1997-2014 hiatus. This shows the advantage of a heat pump that can work in two directions, in contrast with a conventional heat engine.

Heat engine structure and performance
With a reverse structure of cold-to-warm reservoirs with throughflow sustained by kinetic energy, the Pacific Ocean heat engine is technically a heat pump. It sits within a tropical-hot to polar-cold framework of a more conventional heat engine, so is a heat pump within a heat engine. The heat pump is linked to the broader climate network through teleconnections so governs the performance of the heat engine in transporting energy from the tropics to the poles. 790 The two reservoirs of the Pacific Ocean heat engine have different thermodynamic characteristics. TEP, the cold reservoir, is characterised by high negative feedback at the ocean surface and is very stable. TWP, the hot reservoir has a lower negative feedback at the surface (Song and Yu, 2013). Its shallow thermocline, limited circulation and high divergence makes TWP much more sensitive to changes in surface heat flux than TEP (Seager et al., 1988;Harrison, 1991). The maintenance of a ~2 °C gradient under two climate modes, with negative regime shifts followed by a succession of positive regime shifts, shows 795 a capacity for self-regulation. This is supported by the maintenance of a constant gradient between TEP and TWP within climate models (Sect. 3.3.1). A Pacific Ocean heat engine-like structure emerges in a coupled aquaplanet model when zonal asymmetry in the form of a simple land ridge is introduced (Wu et al., 2021). The maintenance of Earth-like meridional heat transport suggests it plays a similar role in dissipating heat, so the heat pump emerges when asymmetry requires the capacity https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License.
to balance meridional heat transfer across both hemispheres. Deep ocean overturning and upwelling also plays a part. The 800 location of land and its topography influences where these features are located.
Shorter-term variability also plays a part in these responses. The intensification of El Niño sees rapid reversals in lagged correlations between TWP and TEP and between these and the global ocean, land and combined temperature. The locus of El Niño evolution has transferred from the eastern to western Pacific and pan-Pacific events have intensified . El Niño is the main vehicle for the propagation of heat through the climate system but this role is not causal: regime 805 shifts are caused by an energy imbalance (see next section) while the specific mechanism remains unknown.
The evidence from tracking shifts in observations and some models implies that shifts are initiated in the region where that imbalance is most acute. Historically, some shifts in free mode originated in the mid-latitudes, whereas in forced mode they originate in the tropics where energy is accumulating. A downward shift in the early 1900s in the S Pacific Ocean 30°-60° S was echoed in the NH mid-latitude oceans, the Atlantic and TWP followed by the Pacific and global ocean. This created a 810 deficit not made up until 1936-39. All shifts in the historical record originated in the oceans except 1920-21, traced to 60°-90° N land before propagating to the tropics. Solar forcing and/or a delayed response to staggered mid-latitude warming in the oceans in the preceding decade are possible drivers. Shifts in forced mode originate in the tropics, all but one in the heat engine. For models, 47% of all shifts in the heat engine coincide with GMST ±1 year, but not all shifts originate in the tropics during the 21 st century of the three models tested. There was no clear sign of change from free to forced mode in the 815 models, although El Niño intensifies in the latter part of the record.
The following aspects are robust in the CMIP5 models: • The TEP region is represented by shifts that are about half as frequent and double the size as those in the west. In models, these reflect the historical pattern and the TEP/TWP frequency and size ratios are strongly related to model skill. 820 • Shifts in TEP are more highly correlated with ECS. This fits in with patch and pacemaker experiments, which show that warming in the eastern Pacific has high feedback and in the western Pacific has low or negative feedback.
Applying regressions between TEP and TWP shift size and ECS suggests a climate sensitivity of 3.1±0.6 °C.
• The gradient between TEP and TWP in the models remains constant under warming.

Decadal oscillations and network behaviour 825
The interactions between patterns of SST across different ocean basins form a "network of teleconnections" (Cassou et al., 2018), with the heat engine at its centre. The major shifts in temperature coincide with phase shifts in the PDO and AMO, positive phase shifts generally leading and negative shifts lagging. Phase shifts in the major oscillations usually coincide with an ENSO event of the same sign.
The PDO phase modulates ENSO, the TWP-TEP gradient and the rate of regime shifts. The positive and negative phases 830 oscillate between atmospheric zonal transport and meridional subsurface transport of heat. Since 1880, the PDO has been positive 57% of the time for 80% of total warming, negative for 43% of the time for 20% of total warming. Granger analysis https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. detects no influence between PDO and other regions or with the AMO. The exception is a lag-15 influence on the ocean to the PDO and from the PDO to ocean and land that may represent the 1998 and 2014 coincident PDO mode and regime shifts (1997 and 2013), rather than being directly causal. The recent negative mode lasted for 15 years compared to previous 835 periods of 22 and 29-years suggesting that PDO periodicity is influenced by forcing. A major role of the PDO may be to maintain symmetry across both hemispheres via the IPO. The north and south sides of the warm pool have a similar role (Feng et al., 2018;Zhang et al., 2018). Therefore, while the PDO shows the influence of teleconnections in real time, no delayed influences have been detected.
Linkages between the major ocean basins play an essential role as part of climate variability (JR21 Sec. 3.3). If considered as 840 part of a self-regulating complex system, this role becomes even more important. The AMO is understood to be linked to AMOC (Kim et al., 2021;Sun et al., 2021) and telecommunicates with the northern and tropical Pacific (Sun et al., 2017).
Forcing may have influenced the AMO more recently, notably during the last phase shift in 1995, but little external influence is detected in earlier free mode. In forced mode, the AMO has an ENSO-like link to TEP during stationary periods peaking at lag-2-3 and TEP forces AMO at lag-2 during regime shifts. The NH hemisphere also has an influence at lag-1, 845 presumably a flow-on effect. The AMO has a flow-on effect to TWP at lag-1. Cross-basin linkages between the Atlantic and Pacific suggest that the influences in the PDO are on a ratio of two internal to one from the Atlantic for a total of 52-73% although some of that may be due to interdependency (Johnson et al., 2020).
The timing of shifts in the AMOC index in 1969, around the time that climate moved from free to forced mode and to PDO phases in 1997 and 2014, suggests an information exchange between the Pacific and Atlantic. AMOC is usually seen as 850 influencing the AMO (Marini and Frankignoul, 2014;Zhang et al., 2019) but two-way oscillatory relationships have been observed in model studies (Zhang and Wang, 2013;Kim et al., 2021). The recent changes in a range of surface variables that show a relationship between Atlantic multidecadal variability and AMOC are consistent with recent direct estimates of AMOC flow volumes (Sun et al., 2021).
Regime changes switch between slow and fast dissipation paths. The largest increase is in the transport of moist static energy 855 at the expense of slower ocean heat transport. Circulation also makes a transition from direct circulation to being more teleconnected, ENSO plays an increasingly important role and the behaviour of the PDO is more pronounced (shifts are more distinct). Upward mass transport is reduced (Wu et al., 2019). The AMOC Index shifts in 1968, consistent with slowing and also shifts with the last two phase changes in the PDO. Shifts in specific humidity are also regime-like and coincide with temperature with a strong NH bias. Regime shifts from slow to fast dissipation pathways include from the 860 ocean to the atmosphere, from dry to moist heat transport, from direct circulation to teleconnection and the increased involvement of radiative feedbacks such as those involved in cloud forcing. Shifts in temperature in the ocean transmitted from the tropics move into the extratropics and from ocean to land. These result in rapid adjustments to atmospheric feedback and absorbed solar radiation subject to cloud feedbacks and increases in water vapour. Although not included in the data archive for JR21, Granger analyses for zonal regions show interhemispheric influences in response to forcing; for 865 example, 60-90° N to Australia in the SH mid latitudes. The increased transport of atmospheric heat poleward results in https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. polar amplification. As the high northern latitudes warm, ocean buffering in the southern hemisphere requires complementary warming over land to maintain equivalent dissipation rates in both hemispheres.

Self-regulation
Based on the analysis of regime changes in observed climate, JR21 propose that climate is a self-regulating system, governed 870 by the heat pump configuration of a heat engine in the tropical Pacific, teleconnected to a broader climate network. The difference with a conventional heat engine is the application of work to a permanent cold reservoir which is heated on the way through to a hot reservoir, allowing it to shift warmer or cooler depending on the supply of kinetic and potential energy.
These are the hallmarks of a self-regulating system. Climate regimes occupy non-equilibrium steady states. Perturbations over the ocean return to steady-state in a matter of months, even at basin scale where negative feedbacks are weaker 875 (Frankignoul and Kestenare, 2002;Park et al., 2005). The central Pacific is the most stable of the regions tested (note we have not included 60-90° S because of data quality and availability). It forms a highly stable attractor and the size of the shifts in observations of 0.46 °C can be used to represent the level of instability required to shift it into a warmer state. TWP is less stable, requiring up to 0.25 °C. For comparison, in models TEP is 0.49±0.18 °C and TWP 0.26±0.14 °C. The ratio of frequency and magnitude between the two shows a close relationship with model skill. Observations overall average 0.22 for 880 steps in SST and 0.37 °C for steps over land. Shifts in SST precede those over land in most cases, When moving from free to forced climate. the Granger analyses in JR21 show stronger coupling, with increasing influence of temperature from higher latitudes moving through the heat engine, which becomes the epicentre for regime shifts. Weaker coupling in the climate models leads to weaker relationships between the heat engine and GMST compared to observations, but all models are p<0.05 when links between TEP, TWP and the global domain ±1 year are tested against a spatially 885 random null, with area weighting.
The order of emergence in coupled models, levels of complexity and specific features provides a guide as to which aspects of complex behaviour are more fundamental. The heat engine arises in a symmetric aquaplanet with dissipation forming zonal bands. With asymmetry, a heat pump structure arises (Zhang et al., 2016;Wu et al., 2021), and decadal variability and meridional overturning has been identified (Ferreira et al., 2011;Jamet et al., 2016). Whether regimes emerge at the same 890 time is yet to be determined. The location of deep ocean overturning depends on the placement of continents and basins. The current configuration of the Pacific Ocean heat engine, ENSO AMOC are due to the positions of land and its topography, especially the Tibetan Plateau (Kitoh, 2007;Naiman et al., 2017;Yang and Wen, 2020).
In a perfectly zonal world both hemispheres would be identical, with symmetrical meridional heat transport driven by a conventional equator-to-poles heat engine. Any orbital tilt, moon, equatorial barrier, land or asymmetric sea floor appears to 895 cause a warm pool to form, prompting the emergence of upwelling, a cold tongue and a heat pump structure. The presence of decadal variability and attractors in such models (Zhang et al., 2016;Brunetti et al., 2019) shows that complex system behaviour is induced by the need to maintain symmetrical radiative balance between the hemispheres through meridional https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License. heat transport, which induces a teleconnected network with non-equilibrium steady states. On these criteria any orbiting body with an atmosphere and ocean in internal disequilibrium will be self-regulating. 900 The current configuration of meridional heat transport is due to tectonic processes influencing land, oceans and ice. There is a bias towards the NH because of high albedo in the southern polar regions, and a bias towards northward atmospheric heat transport because of the larger area of land in the NH. Oscillations that help maintain energy symmetry in the dissipative process include the IPO, Atlantic decadal variability and AMOC, and potentially many of the other shorter-and longer-lived The two GCMs analysed for net change in top of the atmosphere radiation decrease by 0.62 W m -2 in 1996 and 0.52 W m -2 in 1993 from the pre-industrial steady state. Both models maintain steady-state to 2100 with two and three shifts, despite 910 different pathways in top-of-the-atmosphere changes for shortwave and longwave radiation. It is possible that the observed deficit was established at around the time climate moved from free to forced mode.
The organising principle in terms of how energy is governed is unclear but needs to be able to explain both free and forced states. One possibility is energy rate density where regimes shift from high-mass, slow-moving transport to low-mass, fastmoving transport such as latent-heat transfer, teleconnections and cloud feedbacks. This has been proposed as an organising 915 principle for the atmosphere where it shows scaling properties over space and time (Lovejoy, 2019). It has also been proposed as a more general metric for complexity in energetic systems (Chaisson, 2011).
Climate is currently defined statistically by both the IPCC and World Meteorological Organization, which is unsatisfactory if it is to be considered as more than the collective statistics of weather (Frigg et al., 2015;Werndl, 2016b;Katzav and Parker, 2018). These findings offer the possibility to develop a physical explanation that focuses on the self-organisation of weather 920 into a series of states, with decadal regimes forming the basic unit, within a hierarchy of longer-lasting states to those potentially lasting millions of years. These mega-states include snowball, waterbelt, melancholia, temperate and moist greenhouse states (Lucarini and Bódai, 2017;Wolf et al., 2017). Self-regulation also invites comparison with the Gaia hypothesis, which proposes that life has modified the climate to fall into the more favourable latter states favouring the development of a complex biosphere (Lovelock and Margulis, 1974;Margulis, 1981;Watson and Lovelock, 1983). Lovelock 925 and Margulis originally referred to Gaia as homeostatic but later changed this to homeorhetic, where the system has the capacity evolve to a more favourable stable state (Lovelock, 1986) and self-regulates for function rather than form (Margulis, 1990).
In comparison, climate is homeostatic, occupying an energetically-preferred state until forcing is sufficient to trigger a shift to a new state. To distinguish this from the Gaia hypothesis/principle we propose Gharma, a Sanskrit word. This has many 930 meanings relevant to the heat engine including heat, warmth, the hot season, cauldron, sweat, a cavity in the earth shaped like a boiler (Hiemstra, 2021). This is fitting for a mechanism that collects and distributes heat energy. General descriptions https://doi.org/10.5194/esd-2021-62 Preprint. Discussion started: 22 July 2021 c Author(s) 2021. CC BY 4.0 License.
of Gaia tend to make life the active participant that acts on climate but have largely limited that to cybernetic control (Rubin and Crucifix, 2019). The strong biological version of Gaia is guided internally by a regenerative network that leads to autopoiesis -self regeneration. Rubin and Crucifix (2019) address this through a biological lens but it could equally apply to 935 ecosystems, where large ecosystems are very stable and the biosphere as a system of ecosystems. Alternatively, Gharma is representative of a cybernetic system where feedback closes the loop between incoming and outgoing energy via the heat engine and network of circulation and teleconnection. In this sense, there is a two-way relationship between Gaia and Gharma, where Gharma provides steady-state conditions that fit in with most life cycles, and Gaia has contributed to a planetary atmosphere and biogeochemical environment that favours a temperate greenhouse state much more than a 940 snowball or waterbelt greenhouse state.

Conclusion
JR21 began by distinguishing between complicated and complex systems and how they apply to the scientific understanding of climate. The standard approach treats climate as a complicated system. Decadal-scale climate responses to forcing, feedbacks and internal variability are considered to be additive, allowing them to be deconstructed and reconstructed to 945 explain observed fluctuations (e.g., Schmidt et al., 2014;Risbey et al., 2018). However, if the climate system exhibits complex system behaviour on these time scales, this approach may no longer be valid. Rind (1999) asks whether complexity is a matter of concern. He describes climate as self-organising into quasi-stable states, but also separates the complex and forced responses in his commentary, addressing possible interactions between the two. This is a widely shared view that considers variability to be complex, and radiatively-forced change to be deterministic. 950 Interactions between change and variability are increasingly being recognised as complex system behaviour (e.g., Nicolis and Nicolis, 2010;Tantet and Dijkstra, 2014;Tsonis, 2018;Agarwal, 2019;Dijkstra, 2019;Ghil and Lucarini, 2020), but this is mostly being applied to versions of linear response theory (Ragone et al., 2016;Lucarini et al., 2017;Tsonis, 2018;Ghil and Lucarini, 2020). This allows the constructionist hypothesis to be applied, except where it is disrupted by emergent behaviour; for example, where forcing results in a tipping point being crossed (Lenton and Williams, 2013;Drijfhout et al., 955 2015;Bathiany et al., 2018;Lenton et al., 2019).
These papers show that the climate system is fundamentally complex. The identification of the Pacific Ocean heat engine, the importance of its reverse heat-pump structure and links with the broader climate network via teleconnection reveals climate as a self-regulating, homeostatic system. The entire climate system resembles a conventional heat engine bridging from the hot tropics to the colder poles, but its inner structure produces more complex behaviour. JR21 concentrates on 960 observations and this paper addresses heat engine performance and complex system behaviour in climate models. It describes how the presence of emergent behaviour challenges standard analytic approaches, identifies the importance of thermodynamic forcing and explores boundary conditions that constrain dissipative processes. Sect. 3.2.4 lays out the reasons why change and variability cannot be divided into different physical pathways. In response to internal and external forcing, climate operates as a single system. The distinction between change and variability on 965 historical and statistical grounds, or for policy purposes all have their place, but they do not provide an adequate physical description. Climate consists of a succession of non-equilibrium steady states that operate within an envelope of available potential energy. If the energy available for dissipation falls above or below a critical limit, climate will shift to a new steady-state, its location and size depending on where and how much an imbalance exists.
Understanding the role of emergence and how it can be analysed and understood needs to guide the overall research 970 approach. This is understood through the emergentist hypothesis, where weak emergence is dependent on basic physical phenomena and amenable to reasoning via first principles, and strong emergence by the presence of phenomena and rules that cannot be predicted from their base (O'Connor, 2020). The above association with variability is weak emergence whereas climate self-regulating between non-equilibrium steady states in response to forcing, nominated as the Gharma hypothesis, qualifies as strong emergence. 975 The emergence of complex system behaviour in climate is a product of coupling between the atmosphere and oceans.
Understanding the process of emergence provides a more fundamental understanding of climate behaviour. Boundary conditions provide critical limits to this behaviour. These include energy limits provided by external forcing that are linearly proportional to radiative forcing, dissipative limits (thermodynamic) set by meridional heat transfer limits and physical limits set by the position of oceans, continents and topography. 980 Emergent phenomena include the heat engine, ENSO, decadal variability and regime shifts in various forms. Coupled aquaplanet experiments show that the heat pump, ocean upwelling and decadal variability emerge with asymmetric circulation. They provide the capacity to balance meridional heat transfer between the hemispheres in order to ensure equal amounts of energy loss to space. Thermodynamic forcing is driven by energy imbalance at the planetary surface. A deficit at the top of the atmosphere may result in the heat engine 'working' harder, much like an engine moving to a higher gear (free 985 to forced 1960s-2014, to forced post-2014), but this remains to be determined.
Model underdetermination and observational equivalence both affect the understanding of climate change.
Underdetermination occurs when models provide more than one explanation for an event, phenomenon or process. Current methods cannot readily distinguish between dynamic and thermodynamic influences for many phenomena (e.g., ENSO Sect. 3.2.2). The explicit recognition of complex system behaviour would aid this. For example, instead of climate being separated 990 into linear and random/stochastic components, assessing thermodynamic forcing through regime shifts can help distinguish between dynamic and thermodynamic influences. Thermodynamic influences contribute to energetic states and state change, and dynamics are distributive. This subtly redefines the free and forced variations of Lorenz (1979) from internal and external to forced and distributive. For example, they both may show stochastic behaviour, but the 1946-47 PDO phase change was redistributive and the following two phase changes associated with shifts in climate (1996-98, 2014-15) 995 involving both dynamic and thermodynamic influences. Full recognition of complex system behaviour in climate would require the following issues to be considered: • Ensure that the observed climate is the principal area of interest rather than models. Models are often preferred 1000 because observational data is limited but important behaviour may be overlooked because it is not in the model or not well simulated.
• Pre-industrial climate runs with coupled models currently have limited utility as the null climate case because of insufficient internal sensitivity. Consequently, the behaviour of teleconnected network and ensuing variability is subdued. 1005 • Efforts to improve climate model performance should focus on improving heat engine behaviour in the tropics and linkages with the extratropics as a greater priority. Reproducing the transition from free to forced mode is an important goal. Tighter internal coupling will improve the hydrological cycle and make tropical expansion more realistic.
• Use the principles of complex system behaviour to inform idealised model studies, rather than the reductive-1010 constructive methods that currently dominate.
• Place less emphasis on model ensemble consensus except for long-run results and instead investigate regime shifts and complex system behaviour likely to result in rapid changes in climate-related risk over the coming century.
• Investigate emission reduction and overshoot scenarios to see what radiative forcing pathways support stabilisation in the tropics and a potential reversal to free mode, as opposed to those that are unlikely to. 1015 • Change the emphasis on decadal prediction from producing trends and develop methods to detect and predict regime shifts.
• Urgently understand what these changes mean for risks being experienced now and those that may affect socialecological systems (including the economy) within their strategic planning horizons.

Code availability 1020
The experimental version of the MSBV code used here can be found in Jones and Ricketts (2021a), in addition to a manual version in Microsoft Excel©. As this material is still in review, reviewer access is via: https://datadryad.org/stash/share/7OmkMfNbdgFdPIxvD5qyqhmrzh69ug_tT5MNZ2JMPTQ