Multi-century dynamics of the climate and carbon cycle under both high and net negative emissions scenarios

. Future climate projections from Earth system models (ESMs) typically focus on the timescale of this century. We use a set of five ESMs and one Earth system model of intermediate complexity (EMIC) to explore the dynamics of the Earth’s climate and carbon cycles under contrasting emissions trajectories beyond this century, to the year 2300. The trajectories include a very high emissions, unmitigated fossil-fuel driven scenario, as well as a mitigation scenario that diverges and in one model in the overshoot scenario. While ocean carbon cycle responses qualitatively agree both in globally integrated and zonal-mean dynamics in both scenarios, the land models qualitatively disagree in zonal-mean dynamics, in the relative roles of vegetation and soil in driving C fluxes, in the response of the sink to CO 2 , and in the timing of the sink-source transition, particularly in the high emissions scenario. The lack of agreement among land models on the mechanisms and geographic 40 patterns of carbon cycle feedbacks, alongside the potential for lagged physical climate dynamics to cause warming long after CO 2 concentrations have stabilized, point to the possibility of surprises in the climate system beyond the 21st century time horizon, even under relatively mitigated global warming scenarios, which should be taken into consideration when setting global climate policy.

and in one model in the overshoot scenario. While ocean carbon cycle responses qualitatively agree both in globally integrated and zonal-mean dynamics in both scenarios, the land models qualitatively disagree in zonal-mean dynamics, in the relative roles of vegetation and soil in driving C fluxes, in the response of the sink to CO2, and in the timing of the sink-source transition, particularly in the high emissions scenario. The lack of agreement among land models on the mechanisms and geographic 40 patterns of carbon cycle feedbacks, alongside the potential for lagged physical climate dynamics to cause warming long after CO2 concentrations have stabilized, point to the possibility of surprises in the climate system beyond the 21st century time horizon, even under relatively mitigated global warming scenarios, which should be taken into consideration when setting global climate policy.

Introduction 45
Climate change is characterized by long timescales, associated with the accumulation of carbon in the atmosphere and other reservoirs of the Earth system due to emissions of CO2 by anthropogenic activities, and the response of the climate system to the accumulated atmospheric CO2 burden. The long lifetime of CO2 in the atmosphere (Archer et al., 2009;Joos et al., 2013) and the proportionality between global warming and long-term cumulative CO2 emissions are central features of the dynamics of the climate system (Matthews et al., 2009;Allen et al., 2009). These features underlie the widely used policy framework 50 that proposes a 'budget' of remaining carbon emissions that would enable the climate system to remain below a given temperature . Future transient climate change scenarios using comprehensive Earth system models (ESMs) have typically focused on the timescale to the end of the 21st century, in order to inform near-term policy actions that may mitigate climate change. This end date of 2100 for these simulations has remained fixed, even though over 30 years have elapsed since the first IPCC assessment (IPCC, 1990).The start date of future scenarios has accordingly progressed from 1990 55 to 2015, shortening the length of these scenarios. Longer-term dynamics have been explored mainly using Earth system models of intermediate complexity (EMICs) (Zickfeld et al., 2013) and climate system emulators (Meinshausen et al., 2011(Meinshausen et al., , 2020Nicholls et al., 2020b), which allow exploring such dynamics without the computational costs of resolving the full physical and biogeochemical dynamics of an ESM. EMICs (and even more so emulators) typically represent land and ocean biogeochemical processes relevant to the long-term carbon cycle with less detail than comprehensive Earth system models, 60 and so risk missing critical interactions and feedbacks. In contrast, ESMs have prioritized representing processes relevant on timescales to 2100, and may exclude or simplify processes important on longer timescales, such as permafrost carbon feedbacks on land, or sediment biogeochemistry in the ocean.
Initial studies using ESMs on this longer time horizon suggest that the proportionality of warming to carbon emissions that is 65 both historically observed and projected on shorter timescales also holds on multi-century timescales, in unmitigated high-end warming scenarios (Randerson et al., 2015;Tokarska et al., 2016). It is less clear whether these relationships hold under mitigated or overshoot scenarios, where net negative carbon emissions are assumed later in the scenario, but the expectation is that the proportional relationship approximately holds for negative carbon emission as well (Zickfeld et al., 2016). Simple models show that the existence of a cumulative emissions to warming proportionality in such scenarios is sensitive to the 70 response timescales of physical and biogeochemical feedbacks in the Earth system (Sanderson, 2020). Existing experiments using ESMs and EMICs suggest that during a positive emissions phase, marine and terrestrial carbon cycles tend to absorb some fraction of added CO2. During a removal phase, however, they tend to release CO2 and thus partially offset the decline in atmospheric CO2. As a result, we expect that under a scenario with positive emissions followed by net-negative emissions, warming remains approximately proportional to cumulative CO2 emissions, but with an additional delay possible due to lags 75 in the carbon cycle and thermal response to changing CO2 (Tokarska and Zickfeld, 2015;Jones et al., 2016;Zickfeld et al., 2016;Tokarska et al., 2019).
To better understand the long-term dynamics of the carbon and climate systems, here we compare a set of five ESMs and one EMIC under a pair of high emissions and overshoot future climate scenarios that diverge in emissions in the mid-21st century, 80 and explore the dynamics of carbon and climate under these contrasting trajectories. Further, because these models all report more detailed information that can allow some degree of process attribution to the dynamics, we separate the carbon cycle responses geographically, land from ocean, and on land we separate the soil and vegetation responses. We thus also explore whether and where the predicted carbon and climate responses are relatively robust, both within any one model over time and between scenarios, as well as across models for any given scenario and time period. This allows us to explore where model 85 agreement does and does not exist, both in the globally integrated response, as well as in the regional and process drivers of that response. Where ESM or EMIC behavior shows either fundamental disagreement on geographic or process drivers for feedbacks, or shows global dynamics deviating significantly from the expected linearity between warming and cumulative emissions, we interpret it as showing a potential for surprises in the future dynamics of the Earth system.

Scenario Descriptions
All models were forced using the SSP5-8.5 and SSP5-3.4-overshoot scenarios (Riahi et al., 2017;Gidden et al., 2019;Meinshausen et al., 2020). These scenarios were constructed as part of the CMIP6 set of coordinated experiments for ESMs , and arose out of the ScenarioMIP and SSP design effort (O'Neill et al., 2014(O'Neill et al., , 2016 to cover a wide range of socioeconomic and policy scenarios and resulting trajectories of greenhouse gas forcings to the Earth system. The simple 95 extensions beyond 2100 were adapted from those originally conceived in O'Neill (2016), as described in Meinshausen et al., (2020). Both of these scenarios follow the SSP5 21st century 'storyline' , which is premised on strong economic growth, relying largely on fossil fuels in the no-climate-policy baseline. However, they diverge in the year 2040: the SSP5-8.5 scenario continues on to long-term emissions growth, an 8.5 W m -2 anthropogenic greenhouse gas radiative forcing by 2100, and CO2 increases until the mid-23rd century. The SSP5-3.4 scenario dramatically changes course in 2040 after 100 emissions peak, and continues with sustained net-negative CO2 emissions until the mid-22nd century before stabilizing to nearzero emissions thereafter; the scale of negative emissions required for this overshoot (>5 PgC/yr for several decades) is much larger than in the RCP2.6 scenario reported by Jones et al. (2016). These net negative emissions are largely driven by biomass energy with carbon capture and sequestration (BECCS), which in the land-use drivers of the scenarios is associated with a large conversion of pasture to crop lands (O'Neill et al., 2016). However none of the models explicitly track BECCS-related 105 harvest fluxes.
All models were forced with specified atmospheric greenhouse gas concentrations (not greenhouse gas emissions) (Meinshausen et al., 2020), and land use change forcings (Ma et al., 2020;Hurtt et al., 2020). In a concentration-forced ESM simulation, the land and ocean carbon cycles respond to the CO2 concentrations in the atmosphere, which are specified via a 110 global-mean timeseries (Fig. 1a), but do not feed back on atmospheric CO2. By integrating the total atmospheric CO2 reservoir, alongside the prognostic carbon reservoirs on land and ocean, compatible fossil fuel and industrial CO2 emissions can be calculated to satisfy the conservation of carbon within the Earth system, such that anthropogenic fossil fuel and industrial emissions equal the sum of total changes in land, atmosphere, and ocean carbon stocks (Liddicoat et al., 2021). Land-usedriven carbon emissions are directly reflected in changes in the terrestrial carbon inventories and thus cannot be separately 115 inferred based on terrestrial model dynamics themselves, as they are mixed with the model responses to changing climate and CO2. For the analysis of global temperature change as a function of cumulative emissions, we include a land-use term as well as the inferred fossil fuel term in the emissions; this land use emissions term comes from the REMIND-MAgPIE IAM used to specify the SSP5-8.5 and SSP5-3.4-overshoot scenarios  as harmonized in Gidden et al., (2019), and thus does not differ between the models. Because of differences in carbon cycle feedbacks and in the representation of land use 120 fluxes, the CO2 emissions inferred by ESMs to be consistent with a given CO2 concentration pathway will not generally be equal to the CO2 emissions that were provided by the IAM community for each scenario (Riahi et al., 2017;Gidden et al., 2019;Meinshausen et al., 2020), as shown below. The reason for this is that the representation of C cycle in ESMs is different from the model (MAGICC7) used to convert the IAM emissions into atmospheric CO2 concentrations in the first place (Meinshausen et al., 2020). In the long-term extensions, land use is held constant after 2100, and the land-use fluxes used to 125 calculate atmospheric CO2 concentrations in the scenario specification go linearly from their 2100 value to zero at 2150, as described in Meinshausen et al., (2020).

Model Descriptions
Here we use results from five ESMs and one EMIC to explore the responses of the Earth system to the two long term scenarios.
The models used here were the only models that had performed and archived the necessary experiments as of the time of 130 writing. The five ESMs, all from the CMIP6 generation of models, are the Canadian Centre for Climate Modelling and Analysis fifth-generation Earth System Model, (CanESM5) (Swart et al., 2019d), Community Earth System Model, version two, Whole Atmosphere Configuration (CESM2-WACCM6) (Danabasoglu et al., 2020), the Centre National de Recherches Météorologique (CNRM) CNRM-ESM2-1 , the Institut Pierre Simon Laplace (IPSL) IPSL-CM6A-LR (Boucher et al., 2020), and the U.K. Earth System Model (UKESM1) . The EMIC is the University of 135 Victoria Earth System Climate Model, version 2.10 (UVic-ESCM) (Mengis et al., 2020). Below we list some salient features of these models, and include more detailed model descriptions in Appendix A. In addition, further details on the ocean and marine biogeochemical components of these ESMs can be found in Arora et al., 2020;Canadell et al., 2021).

140
Of the models used here, there are several key differences in their land surface representations that may in principle govern the responses under these scenarios. Dynamic vegetation maybe particularly important both on longer timescales and in response to larger climate forcings, as ecosystems shift and reorganize in response to the changes; of the models here, only two (UKESM1 and UVic-ESCM) include a dynamic vegetation component while the rest assume fixed vegetation distributions. A terrestrial nitrogen cycle is particularly important in governing the response to both CO2 and warming, as 145 nutrients may limit the ability of plant productivity to increase under CO2, and nutrient release due to warming soils may increase productivity; here the CESM2-WACCM6 and UKESM1 models both include a nitrogen cycle. A representation of carbon in permafrost layers may allow for large carbon releases from high latitudes in response to warming, and here two of the models (CESM2 and UVic-ESCM) include some representation of this process. Three of the models here (CESM2-WACCM, CNRM-ESM2-1, and UKESM1) distinguish between crop and pasture lands, which is relevant to the overshoot 150 scenario and its large expansion of croplands from pasture.
We apply a seven-year running mean to all global timeseries in order to remove the short-term dynamics and focus on longerterm variability.

Climate Responses
In the historical period and SSP5-8.5 scenario, global mean temperature change relative to the preindustrial (Fig. 1b) increases monotonically in all models, with a wide range of responses by 2300 from ~18 ºC in the CanESM5 to ~8 ºC in the UVic-ESCM model. Here, a notable difference arises between the ESMs and the UVic-ESCM EMIC, with much higher transient warming in the ESMs than in the EMIC. This is at least in part due to a sampling bias related to the set of models that have 160 performed these long-term scenarios: four of the five ESMs used here report transient climate response (TCR) greater than 2.3 ºC and a transient climate response to emissions (TCRE) greater than 2 o C/EgC (mean of 2.16 o C/EgC) versus the CMIP6 mean of 2.0 ºC TCR and 1.8 o C/EgC TCRE ; whereas the specific version of UVic-ESCM used here reports a TCR of 1.8 ºC and a TCRE of 1.8 o C/EgC , closer to the CMIP6 mean. The one ESM with lower sensitivity, CNRM-ESM2-1, reports a TCR of 1.84 o C and a TCRE of 1.63 o C/EgC . During the period of 165 CO2 stabilization and decline in the 23rd century, four of the ESMs continue to warm substantially, while in one ESM (UKESM1) and the EMIC, the global temperature stabilizes. Since these are concentration-forced experiments, this divergence in long-term warming after stabilization of CO2 concentration implies a substantial slow component to the physical climate feedback in the models that continue to warm, beyond the effective transient values reported above, which reflect short-tomedium term feedback processes that dominate the TCR (and implicitly the TCRE) (Proistosescu and Huybers, 2017). 170 In contrast, in the SSP5-3.4-overshoot scenario, global temperatures follow the CO2 concentration trajectory to first peak and then cool during the 21st century in all models. Subsequent dynamics vary between the models: most stabilize at a cooler temperature than the peak 21st century value, while one model (CESM2-WACCM) reaches a minimum temperature at ~2200, and then resumes warming, albeit at a slower rate, during the 23rd century. As in the very high emissions scenario, there is a 175 separation in the amount of warming between the relatively less sensitive EMIC and more sensitive ESMs both at the peak and in the subsequent overshoot and stabilization period.

Carbon Cycle Responses
Responses of the globally-integrated terrestrial and marine carbon cycle to the two scenarios for all models are shown in fig.   1c-d, and reported in table B1. Under both the SSP5-8.5 and SSP5-3.4-overshoot scenarios, the terrestrial carbon cycle (Fig  180   1c) in all models shifts at some point from being a net sink of carbon from the atmosphere to a neutral or net source of carbon to the atmosphere. In the SSP5-8.5 scenario, the timing of this transition varies widely between models, from ~2100 in UVic-ESCM to ~2220 in CESM2-WACCM. The magnitude of the carbon fluxes also varies widely between models, with CanESM5 showing strongest terrestrial uptake, peaking around 2100, and then reversing to become the strongest terrestrial carbon source out of the models examined here during the 23rd century. Model spread of the land sink increases substantially from the 21st 185 to the 22nd century in the SSP5-8.5 scenario, as indicated by the increasing standard deviation across the ensemble of cumulative sink from 264 ±172 Pg C for the period 2015-2100 to -29 ±264 Pg C for the period 2100-2200, shown in table B1.
Overall, the pattern of terrestrial sink-to-source transition under long-term high emissions is qualitatively consistent with the results of Tokarska et al. (2016), which show a similar transition in all of the models examined in the RCP8.5 extension 190 experiment. This pattern follows from the dynamics described by Randerson et al. (2015) whereby terrestrial carbon-climate feedbacks strengthen over time, at the same time that the terrestrial carbon-concentration feedbacks weaken, although the experimental protocol followed here, which does not separate CO2 climate and physical effects as in , does not allow this feedback decomposition to be performed.

195
For the SSP5-3.4-overshoot scenario, model agreement of the terrestrial carbon cycle is much higher, with all models transitioning from sink to source during the late 21st or early 22nd centuries, which counteracts some of the net-negative anthropogenic emissions by that time in terms of their effect on lowering atmospheric CO2 concentrations. The ensemble spread in cumulative carbon uptake also narrows from the 21st century (146 ±78 Pg C) to the 22nd century (-60 ±48 Pg C).
This change in sign is consistent with the CMIP5 RCP2.6 results shown in Jones et al. (2016). The timing of the biospheric 200 switch from sink to source follows by decades the change in the sign of CO2 emissions from net positive to net negative. All of the models then revert to a roughly carbon-neutral terrestrial biosphere during the 23rd century. Notably, models across the ensemble show a reduced range of variation in the magnitude of carbon fluxes for the SSP5-3.4-overshoot scenario, relative to the SSP5-8.5 scenario.

205
Over ocean (Fig 1d, table B1), inter-model agreement is in general much higher than over land, although ensemble spread does increase beyond 2100 in the SSP5-8.5 scenario, from a cumulative uptake of 392 ±31 Pg C in the period 2015-2100 to 445 ±71 Pg C in the period 2100-2200. Peak carbon uptake for both scenarios occurs prior to 2100 in all models, with an earlier and smaller-magnitude peak in the SSP5-3.4-overshoot than the SSP5-8.5 scenario. In the models, the ocean carbon uptake then gradually weakens but remains positive through the 22nd and 23rd centuries in the SSP5-8.5 scenario, while in the SSP5-210 3.4-overshoot scenario, uptake rapidly reverses to become a source through most of the 22nd century (lagging behind the change in the sign of net CO2 emissions by decades), before then reversing again in the late 22nd century to become a weak sink again through the remainder of the scenario.

Diagnosed CO2 Emissions
Annual ( concentration pathway in these simulations, follow the overall trajectory of the fossil fuel emissions used to generate the concentrations scenario using the MAGICC7 model, as was also found by Liddicoat et al. (2021) for the full set of SSP scenarios through 2100. The additional spread in ESMs and the UVic EMIC is due to the difference between the models' carbon cycles and the carbon cycle in the MAGICC7 model (section 2.1).

220
In the SSP5-8.5 scenario, the model ensemble spread in compatible emissions is widest at the end of the 21st century when emissions also peak, and declines during the 22nd century. In one model under SSP5-8.5 (CanESM5), negative emissions are required in the 23rd century to balance the strong and sustained terrestrial carbon source active at that time in that scenario, whereas in the rest of the models slightly positive or roughly zero emissions are inferred for the scenario. In the SSP5-3.4overshoot scenario, the ensemble spread in compatible emissions peaks first at the time of peak positive CO2 emissions, and 225 then increases again during the period of strongest negative CO2 emissions, as models disagree on the magnitude of carbon cycle responses to each of these phases. carbon cycles in each model to positive and negative CO2 emissions, with, e.g. the IPSL-CM6A-LR model requiring higher cumulative emissions to balance its stronger sink throughout the entirety of the SSP5-3.4-overshoot scenario, and the CanESM5 model requiring higher amounts of cumulative CO2 emissions to balance its high sink in the SSP5-8.5 scenario until ~2200, when that model's terrestrial system reverses from a strong sink to a strong source.

235
Each of the fluxes, averaged across the models, are shown together in figure 2c-d. Here, for each scenario, the pink line shows the inferred emissions time series, the black line shows the change in atmospheric CO2, and the accumulation in land (green) and ocean (blue) are shown by the area of sinks (hatching) or sources (stippling). This shows lags in the land and ocean carbon fluxes in response to changes in emissions, particularly for the SSP5-3.4-overshoot scenario ( fig. 2d), where the terrestrial and ocean systems remain sinks for several decades during the period of declining and negative CO2 emissions, before they switch 240 to become sources, which partially offset the negative emissions. In the SSP5-8.5 scenario, lags are less evident, but the net behavior of the ocean is to at least partially offset the net carbon losses on land during the period after the mid-22nd century.

Temperature Response to Cumulative Emissions
Plotting global mean temperature change as a function of diagnosed cumulative CO2 emissions ( fig. 3) reproduces the near linear relationship between temperature change and cumulative CO2 emissions described in (Tokarska and Zickfeld, 2015;245 Jones et al., 2016;Zickfeld et al., 2016). Note, however, that the temperature change shown here includes the response to non-CO2 forcings, whereas the linear relationship is strictly defined only for CO2 (Matthews et al., 2009;Collins et al., 2013), and thus the relationship shown here represents an "effective TCRE" (Matthews et al., 2017) that includes these non-CO2 forcings.
Further, following, e.g., Canadell et al. (2021), we add an estimated land use CO2 flux from the IAM-derived scenario specifications to the diagnosed fossil CO2 emissions for each model. 250 There is some deviation from linearity in the cumulative carbon emissions to temperature relationship in SSP5-8.5 scenario.
Initially, up to approximately 2000 PgC, temperature increases less than linearly with cumulative CO2 emissions. There are two potential explanations for this curvature. The first potential explanation is the role of non-CO2 forcers, which contribute a larger fraction of the total greenhouse gas forcing in the early than late part of this scenario ( fig. C1). The second potential 255 explanation is lags in the carbon and climate systems relative to emissions. An analysis with CO2-only experiments up to 2100 (Nicholls et al., 2020a) found a similar slight negative curvature as observed here, suggesting that lags in the carbon and climate systems are dominant up to around 2000 PgC. The temperature vs cumulative emissions relationship is approximately linear for cumulative emissions between 2000 PgC and 4000 PgC, except for the UVic ESCM. The less-than-linear relationship in EMICs was noted before and attributed to more efficient ocean heat uptake and/or a stronger saturation of CO2 radiative 260 forcing at high cumulative emissions (Herrington and Zickfeld, 2014;Tokarska et al., 2016). In the final half-century of the SSP5-8.5 scenario, temperature in the ESMs continues to increase in response to approximately stable radiative forcing. This continued warming reflects the lags in the carbon and thermal response to CO2 emissions and non-CO2 forcings. The one EMIC shows a more linear response in this part of the scenario than the ESMs. In the ESMs shown here for the SSP5-8.5 scenario, this lagged-warming tail is larger, particularly in the case of CanESM5, than the corresponding behavior shown in 265 Tokarska et al., (2016).
By breaking the cumulative emissions plots into roughly centennial-length segments, figure 3 shows the dynamics for the two scenarios over time for all of the models. This underscores the continuity of the cumulative emissions curve through the 22nd century in the SSP5-8.5 scenario ( fig. 3a), and the break in that relationship for several of the models during the 23rd century. 270 For the SSP5-3.4-overshoot scenario ( fig. 3b), the separation by centuries further shows the slight nonlinearity evident in some-but not all-of the models during the peak and initial overshoot period. In the overshoot scenario, there is not a consistent deviation from linearity at the point of overshoot and negative CO2 emissions. Some models (UVic-ESCM, CanESM5) follow roughly the same trajectory in temperature vs cumulative emissions space in the initial period of negative emissions, while others (IPSL-CM6A-LR, CESM2) follow a lower-temperature trajectory after peak warming, and one model 275 (UKESM1) follows a higher-temperature curve in temperature vs cumulative space after peak emissions. This could be due to different roles of carbon versus thermal inertia in the declining CO2 phase (Zickfeld et al., 2016). CESM2-WACCM also shows a distinctly different 23rd-century response to the other models, with a significant increase in temperatures in response to nearconstant radiative forcing, and nearly zero inferred emissions, over this period.

280
As these scenarios are concentration-driven rather than emissions-driven, the uncertainty due to carbon cycle processes shows up in figure 3 as a spread in cumulative CO2 emissions (horizontal axis) between ensemble members, rather than a vertical divergence as it would appear in an emissions-driven scenario. However, the self-consistency between the climate and carbon cycles that results from the inferred-emissions approach, as well as the qualitative consistency between the models and the emulator that was used to translate scenario fluxes to atmospheric CO2 concentrations in the scenario specification, together 285 ensure that the behavior will be similar between concentration-driven and emissions-driven dynamics, even under these extreme scenarios with either very high or net negative emissions.

Terrestrial carbon cycle
Aggregated globally, there is some commonality and a large degree of divergence between models across these two contrasting 290 scenarios. While all models show some consistent patterns (e.g., a shift on land from sink to source), individual models also show differing dynamics both in the patterns, the timing, and the magnitudes of the carbon and temperature response. It is possible to disaggregate these dynamics regionally to better understand the mechanistic basis of the carbon and temperature response, and to explore whether any qualitative similarity holds at the more regional scales. We thus focus on zonal-mean trajectories of carbon and temperature as a way to further understand the degree of similarity in results across models and 295 within any model over time and scenarios. Fig. 4 shows zonal-mean terrestrial carbon flux dynamics for the five models and two scenarios over the full historical to future period. The value for a given latitude is the average over all land cells in that latitude, regardless of the fractional coverage of land in grid cells. In the historical and near-future (prior to 2040) time period that are shared between the scenarios, the five 300 models already show a strong divergence in behavior: CanESM5 projects 21st century carbon sinks in both the tropics and northern high latitudes; CESM2-WACCM has one main sink area in the tropics and a much weaker sink at northern mid and high latitudes; IPSL-CM6A-LR and UKESM1 also have one main sink area, but in the northern mid-and high latitudes, and UVic-ESCM shows a weak sink in the tropics and growing carbon source in the higher latitudes. Over time in the SSP5-8.5 scenario ( fig. 4a), each of these models show further divergent results: in CanESM5, both the tropical and northern high latitude 305 regions shift from sinks to sources at roughly the same time, becoming sources by the mid-22nd century; in CESM2-WACCM, the tropics remain a sink through the end of the 22nd century, while the northern high latitudes shift to become a source by the end of the 21st century, with the source peaking during the 22nd century and weakening thereafter; in IPSL-CM6A-LR the northern sink weakens gradually over time to become neutral by the mid-22nd century, while the tropics become a strong source of carbon during the 22nd century; in UK-ESM1, the northern sink is sustained while the tropics shift to become a 310 source, and in UVic-ESCM the northern high latitudes become a strong source and the tropics a weak source. Thus the regionally-disaggregated dynamics show even greater divergence than the global integral, with differing locations-and thus mechanisms-driving the overall shift from sink to source across models. Further, the areas of most active terrestrial carbon cycle dynamics shift from one region to the next across centuries within any one model.

315
Zonal-mean dynamics are both more muted in magnitude and more similar between models for the SSP5-3.4-overshoot scenario ( fig. 4b). In both the CanESM5 and CESM2-WACCM models, the early sinks weaken in favor of a source of carbon in the tropics during the net negative CO2 emissions period from roughly 2050 to 2150. The IPSL-CM6A-LR and UKESM1 models show similar dynamics, but with a larger overlay of interannual variability. The UVic-ESCM model shows a relatively brief but strong loss of carbon from northern high latitudes during the period of peak warming, and a slower and weaker loss 320 of carbon from the tropics during the subsequent period of net negative CO2 emissions.
Further disaggregating the dynamics into zonal-mean vegetation and soil carbon pools ( fig. 5) shows even greater divergence between the models. Vegetation carbon pools accumulate in both the tropics and northern mid-high latitudes in the SSP5-8.5 scenario in both CanESM5 and CESM2-WACCM; in IPSL-CM6A-LR, the northern latitudes gain vegetation carbon but the 325 tropical latitudes lose large amounts of vegetation carbon; in UKESM1 vegetation carbon accumulates at mid-high latitudes of both hemispheres but is roughly neutral in the tropics; in UVic-ESCM, northern vegetation is a weak sink and tropical vegetation is roughly neutral. For soils, CanESM5 gains carbon in both the tropical and mid-high latitude belts, albeit with a delay relative to vegetation pools, through the mid 22nd century, but then shifts to lose carbon in soils from both belts by the end of the 23rd century; CESM2-WACCM gains soil carbon through most of the world but also projects substantial carbon 330 losses from the northern high latitude soils beginning in the late 21st century; IPSL-CM6A-LR loses soil carbon, mainly from the tropics, starting mainly during the 22nd century; UKESM1 shows a stronger tropical soil carbon loss and a higher latitude soil carbon gain; and UVic-ESCM shows strong losses of carbon at northern high latitudes and gains of soil carbon in the northern mid-latitudes. Thus, under the SSP5-8.5 scenario, both soil and vegetation dynamics differ markedly across the models, as well as regionally within each model. 335 For the SSP5-3.4-overshoot scenario, zonal-mean disaggregation of vegetation and soil carbon ( fig. 5c-d) shows some greater degree of similarity between model dynamics. All five models agree that northern mid-high latitudes would gain carbon in vegetation in this scenario. In the tropics, three models (CanESM5, CESM2-WACCM, and IPSL-CM6A-LR) predict that carbon gains in tropical vegetation peak by the end of the 21st century, while UKESM1 projects sustained tropical vegetation 340 carbon losses from the historical through the end of the scenario, and UVic-ESCM shows more neutral behavior of vegetation globally. CESM2-WACCM and UKESM1 also show substantial losses of vegetation carbon in subtropical ecosystems. For soil carbon dynamics, the patterns are much more muted than in the high-emissions case, with weaker but sustained carbon gains in soils of northern high latitudes in the CanESM5, IPSL-CM6A-LR, and UKESM1 models, and a weaker loss of carbon from northern high latitudes and gain of carbon in the northern mid-latitudes in the CESM2-WACCM model, and for the 345 UVic-ESCM model, northern soil carbon losses are weaker than in the very high emissions case but still stronger than any of the other models.

Ocean carbon cycle
Zonal-mean breakdowns of the ocean carbon cycle are much more consistent between models (Fig. 6). All models show nearterm sinks in the mid and high latitudes of both hemispheres, with sources in the tropics. Under the SSP5-8.5 scenario, all 350 models show a poleward migration of the Southern Ocean sink, and a weakening followed by a strengthening of the tropical source. In five of the six models, the northern mid-and high-latitude sinks weaken during the 22nd century, while they remain strong in the IPSL-CM6A-LR model. While the zonal-mean patterns of the ocean carbon flux are broadly consistent across the models, the magnitudes and meridional extents of the source and sink regions vary significantly, leading to the large spread in net global fluxes across models seen under the SSP5-8.5 scenario (Fig 1d). In the SSP5-3.4-overshoot scenario, all models 355 show roughly similar dynamics: the tropical source strengthens, the northern mid-and high-latitudes sinks weaken, and the southern ocean shifts from sink to source, although differences in the timing, strength and meridional extent of these transitions are again evident between models.

Distribution of ensemble-mean land and ocean carbon changes
Spatial patterns of the ensemble-mean time-integrated carbon changes over both land and ocean (Fig. 7) exhibit some 360 consistent patterns across the ensemble over successive periods of time; hatching in the figure is indicated where two or more of the models disagree in sign with the ensemble mean. During the historical period, models agree on a carbon sink in the tropical forests of all three continental regions, as well as a sink in the northern mid-high latitudes, and an ocean sink in most regions with a higher sink strength in Under the SSP5-8.5 scenario, for the 21st century, tropical forest sink strength is projected to increase in the ensemble mean, but the area of model agreement decreases relative to the historical period. Boreal forest sink strength also is projected to increase strongly in the 21st century under SSP5-8.5, with high model agreement on sign. In the 22nd century under SSP5-8.5, South American and African tropical forest regions switch from sink to source in the ensemble mean, with high agreement, while southeast Asian tropical forests remain a sink in the ensemble mean, but with low model agreement; high latitude 370 terrestrial regions also lose any consistent signal. In the 23rd century under SSP5-8.5, South American and African tropical forests continue as sources, with the African tropical forest region becoming a stronger source than the South American region, and the Asian forest region now also switches from sink to source in the ensemble mean. Overall ocean sinks strengthen, particularly in the Southern Ocean, from the 21st century to the 22nd, and stay roughly constant into the 23rd, while the North Atlantic sink weakens and becomes a slight source by the 23rd century. 375 Under the SSP5-3.4-overshoot scenario, 21st century integrated uptake is weaker in the Amazon forest region, southeast Asian tropical forests, and northern mid-high latitude forests, while the African tropical forest region acts as a source. In the 22nd century under SSP5-3.4-overshoot, all tropical forest regions transition from sink to source, and model agreement elsewhere is low. In the 23rd century under SSP5-3.4-overshoot, the entire land surface has a roughly neutral carbon balance, with low 380 agreement on sign. The ocean carbon cycle acts as a progressively weakening sink from one century to the next in the SSP5-3.4-overshoot scenario.

Temperature
Regional temperature dynamics are roughly similar between ESMs (Fig. 8). All models show polar amplification, and thus warming proceeds faster at high latitudes, particularly in the northern hemisphere. Under the high emissions scenario, global 385 warming is overwhelming, with >10 degrees C warming at all latitudes and much higher warming at the poles, with warming reaching 10 degrees at the northern polar region within this century in all five models. Under the overshoot scenario, polar amplification is still present in all models, and global warming peaks and then declines and stabilizes after the peak CO2 period in all models except CESM2-WACCM.

390
In CESM2-WACCM, for the overshoot scenario, the northern mid-to high latitudes return to almost the preindustrial temperature during the 22nd century, and then subsequently warm again in the 23rd century, despite no further CO2 concentration increases; this area is responsible for the vertical tail to the cumulative emissions-temperature change plot in the 23rd century in that model (Fig. 3b). Plotting the 100-year mean temperature difference between the 23rd and 22nd centuries ( fig. 9a), shows that the 23rd century warming in the model is centered on the Northern Atlantic, suggesting a control by 395 Atlantic Meridional Overturning Circulation (AMOC). To explore this hypothesis, we calculate AMOC as the maximum value of the annual-mean meridional mass flow stream function in the Atlantic basin north of 20N, and plot time series of this for all ESMs and scenarios in fig. 9b. This shows that CESM2 AMOC starts out with stronger AMOC than the other ESMs, which partially collapses during the period of warming to reach a minimum at around 2100 and recovers thereafter in the SSP5-3.4overshoot scenario, with a much stronger rebound than other ESMs considered here. This AMOC recovery response is 400 consistent with earlier long-term overshoot scenarios (Nakashiki et al., 2006) as well as long-term constant 2xCO2 experiments (Manabe and Stouffer, 1994). Thus this supports the interpretation that the AMOC recovery in that model drives the 23rdcentury warming. An additional piece of evidence to support the transient shutdown and subsequent recovery of AMOC as being the key driver of the CESM-WACCM temperature vs cumulative emissions nonlinearity shown here comes from the comparison of Hu et al. (2020) between CESM2 and a closely related model, E3SM. They show that both models have similar 405 ECS but CESM has a substantially lower TCR, which they attribute to its higher sensitivity of AMOC strength to warming.
The 23rd century warming in CESM2-WACCM thus appears to reflect an AMOC that is transiently weakened during the 22nd century due to freshwater influx associated with warming, leading to relative cooling around the North Atlantic and throughout the high latitudes, but which then recovers and removes that cooling anomaly that was present during the weakened-AMOC period. 410

Discussion
The concept of proportionality of global warming to cumulative emissions and the related metric of transient climate response to cumulative CO2 emissions (TCRE) are enormously valuable in understanding the expected response of global temperature change to anthropogenic emissions. At the same time, the utility of this framework is limited by both the persistent spread in TCRE across model ensembles Jones et al., 2013), as well as the possibilities of behavior in the coupled 415 climate and carbon cycle systems that may give rise to nonlinear trajectories of temperature as a function of cumulative CO2 emissions. Recent IPCC reports (Canadell et al. 2021;Rogelj et al., 2018) use a framework described in  to identify the remaining carbon budget consistent with stabilization of global temperatures at or below a given level. This abstraction of the climate system allows for two additional terms beyond the TCRE: a zero emission commitment (ZEC), which is any warming which arises after the point that CO2 emissions reach net zero, and thus would lead to vertical tails 420 (either positive or negative) in the cumulative emissions plots (although part of the tail warming shown here may be in response to near-constant non-CO2 forcing), and an allowance for Earth system feedbacks that are unrepresented or underrepresented in existing Earth system models and thus not included in the spread of TCRE from ESMs. Physical or biogeochemical lags in the Earth system, beyond those quantified by the ZEC or specifically enumerated as unrepresented feedbacks, are not accounted for in the Rogelj et al., (2019) framework, though the updated framework of Nicholls et al., (2020a) allows for non-linearities 425 between cumulative CO2 emissions and CO2-induced warming in remaining carbon budget calculations. Longer-duration and overshoot scenarios may be useful in identifying whether further complexity in the relationship between global temperature and cumulative CO2 emissions exists and should be considered in remaining carbon budget or other policy frameworks.
The pair of scenarios explored here bracket a wide range of possible dynamics in the Earth system over the next few centuries, 430 from a high CO2 concentration world with continuous and overwhelming global warming over the coming centuries to one in which CO2 is stabilized and reduced following a peak warming during this century (Figs. 1-2). In each of these scenarios, the models studied in general follow the expected linearity in TCRE, before 2200 (Fig. 3). At the same time their internal dynamics vary widely from each other, particularly in the terrestrial carbon cycle and under high levels of global warming, where there is little agreement on the geographic and mechanistic drivers of the terrestrial carbon cycle responses to the warming . Possibly this is due to some degree of tuning (either implicit or explicit) to capture the observed globally-integrated 20th century carbon balance trajectory, a constraint whose influence weakens at regional levels and over time into the future (Hoffman et al., 2014).
The five models used here vary widely in the representation of their terrestrial biospheres: two (UKESM1, and UVic-ESCM) 440 include vegetation dynamics, while the other three (CanESM5, CESM2-WACCM, IPSL-CM6A-LR) use prescribed and static distributions of plant functional types. Two models (CESM2-WACCM and UVic-ESCM) include the dynamics of deep and frozen soil carbon, while the others drive soil biogeochemistry using near-surface soil temperatures and thus exclude the possibility of permafrost carbon feedbacks to climate change. Two models (CESM2-WACCM and UKESM1) include the nitrogen cycle on land, while the others do not. While there does not appear to be a general signature associated with the 445 inclusion of vegetation dynamics or nitrogen here, the inclusion of permafrost carbon in both models here does lead to a signature of large soil carbon losses at high latitudes under the high-warming scenario. Overall, the models differ widely in the aggregated magnitude of their responses to climate and elevated CO2 . The structural differences likely underlie the diversity of global and regional responses, although given the myriad structural and parametric differences between the models it is not possible to attribute the dynamics in a more rigorous way (Fisher and Koven, 2020). It is also not 450 possible with the limited sample size considered here to assess whether model agreement in shorter term response arises due to common representation of relevant processes or calibration constraints imposed by historical global carbon-climate dynamics. Nonetheless, the diverse potential for global and regional carbon cycle dynamics to change sign under these scenarios highlights the continued need for improved comprehension of the major drivers of terrestrial carbon cycle dynamics.

455
The ocean carbon cycles of the models, and the thermal response of the climate system to greenhouse gas forcing, in general shows better qualitative agreement with each other (Figs. 1c, 6), but again ensemble spread increases after the 21st century in the high-emission scenario, and other surprises may be in store. In particular, two distinct types of lags in the physical system may lead to further warming beyond the time period in which greenhouse gases increase: if carbon emissions cease without overshoot, lags in the physical climate may lead to continued warming after cessation, while in the overshoot case, mechanisms 460 such as AMOC slowdown may temporarily obscure some of the warming, but then upon recovery of AMOC this temporary regional cooling may dissipate, leading to a resumption of warming long after the CO2 has stabilized ( fig. 9). If such dynamics are real features of the Earth system, this would be of critical concern -even if we deploy large negative emissions, we still would have to have a plan for a world in which all they do is stabilise, rather than reduce, temperatures. Of particular note is that CESM2 showed a negative zero emission commitment in , despite showing the large 23rd-465 century warming with near-zero inferred emissions in the overshoot and stabilization scenario here, indicating that the ZEC framework as currently defined by MacDougall et al. (2020) as the temperature change evaluated 50 years following net-zero emissions may be insufficient for quantifying such lagged effects of CO2 on climate. Further, we note that CESM2 shows the highest effective ECS of any of the models whose transient climate response is within the "likely range" as constrained by observed warming trends (Nijsse et al., 2020), because of this role of AMOC sensitivity acting to separate transient from 470 equilibrium sensitivity (Hu et al., 2020). That the model satisfies the transient constraint underscores the possibility for nonlinearities in temperature versus cumulative emissions, although the long-term sensitivity may separately be constrained by paleoclimate evidence (Sanderson, 2020;Tierney et al., 2020).
Given that the plant-physiological and other CO2 concentration-dependant processes represented in models are not routinely 475 tested against observations from the highly out-of-sample conditions experienced under each of these scenarios (e.g. very high atmospheric CO2 concentrations under SSP5-8.5 or rapidly decreasing CO2 concentrations under SSP5-3.4-overshoot), it is to be expected that model differences will be large. Despite this, it is important to note that the ensemble spread in compatible emissions, which include all these uncertain carbon-cycle feedbacks, is relatively small when compared to the mean magnitude of the emissions themselves, particularly under the overshoot scenario ( fig. 2a-b). Thus the uncertainty associated with carbon 480 cycle feedbacks is relatively small compared to the anthropogenic emissions themselves. Further, the uncertainty does not conflict with the central result that warming is roughly proportional to cumulative emissions, at least for this century and the following one, and these results support the need for rapid reductions in CO2 emissions to prevent the extreme impacts associated with warming.

485
Nonetheless, these results also suggest that there may be longer-term surprises in the coupled climate-carbon system to be encountered both in high-emissions and overshoot warming scenarios. The evident lack of consistent predictions in the terrestrial models, combined with the known structural differences, and the fact that none of the models include a complete set of processes that may be considered likely to affect the terrestrial carbon cycle, support the approach of accounting for feedbacks present in the Earth system but not included in ESMs, at least until greater convergence in terrestrial carbon cycle 490 models can be shown. The ocean carbon cycle as well shows greater uncertainty on this time horizon than on the pre-2100 dynamics, for which there is greater agreement. The wide range of results shown here, despite a small number of models analyzed, also underscore the need for further testing of model dynamics on these longer timescales, the inclusion of more models and more systematic exploration of parameter and structural uncertainty on these longer-term dynamics, as well as the identification and use of observational constraints that are relevant to these longer-term dynamics of the coupled carbon and 495 climate systems. At the same time, unanticipated physical dynamics, such as the transient weakened-AMOC-driven cooling and its subsequent reversal may also be relevant on long timescales. Thus, we should continue to anticipate that surprises in the long timescale climate response are possible, even under relatively mitigated global warming scenarios, and which should be taken into consideration when setting global climate policy.

Conclusions 500
We examine five CMIP6 ESMs, alongside a reduced-complexity EMIC, in a pair of experiments that extend to the year 2300, to explore the dynamics of the coupled carbon and climate systems on this timescale, which is longer than those typically considered in ESM analysis. We show that under contrasting high-emissions and overshoot scenarios warming is approximately proportional to total cumulative CO2 emissions, but also that distinct deviations from linearity arise in some of the ESMs on post-2200 timescales. These deviations underscore the limits to our ability to coherently project the dynamics of 505 the Earth system on these longer timescales. We note that, as on shorter timescales, the projections of terrestrial carbon dynamics differ most strongly between models, and that on the longer timescales there is still enormous uncertainty in projected carbon dynamics. This uncertainty is evident in multiple ways: between the model projections of global carbon changes; between the model projections of the geographical regions contributing to feedbacks; between the pools responsible for the basic mechanisms of carbon cycle variability; and between one century to the next within models. We also show that lagged 510 temperature effects leading to warming after cessation or reversal of emissions, beyond what has been shown in earlier or simpler models, may be possible outcomes in these projections. These results show that a greater emphasis on identifying, attributing, and reducing uncertainty is needed on the wider range of possible futures that can be explored on these longer timescales, and that until such uncertainty can be reduced, we must anticipate and allow for surprises such as these in formulating global climate policy over these longer timescales. 515

Author Contributions
C.K. and K.Z. conceived of study, with contributions from D.L.. All authors contributed to the design and/or use of one of the

Competing Interests
The authors declare that they have no conflict of interest. 530    (a) 910  in CESM2, which left a set of grid cells in the High Arctic with unrealistically high values, here we apply a mask to exclude all the grid cells where vegetation productivity was equal to zero during a 100-year period of the preindustrial control 970 simulation. This, alongside other model differences, including snow biases in the coupled model, also had the effect of reducing the permafrost carbon pool in CESM. Thus, while permafrost dynamics are permitted in CESM2 and CLM5, their feedback to warming is weaker than in the earlier CLM4.5 model as described in Koven et al. (2015).

A.3 CNRM-ESM2-1
CNRM-ESM2-1 is the second generation Earth System model developed by CNRM-CERFACS for CMIP6 (Séférian et al., 975 2019). The atmosphere component of CNRM-ESM2-1 is based on version 6.3 of the global spectral model ARPEGE-Climat (ARPEGE-Climat_v6.3). ARPEGE-Climat resolves atmospheric dynamics and thermodynamics on a T127 triangular grid truncation that offers a spatial resolution of about 150 km in both longitude and latitude. CNRM-ESM2-1 employs a ''hightop'' configuration with 91 vertical levels that extend from the surface to 0.01 hPa in the mesosphere; 15 hybrid σ-pressure levels are available below 1500 m. The surface state variables and fluxes at the surface-atmosphere interface are simulated by 980 the SURFEX modeling platform version 8.0 over the same grid and with the same time-step as the atmosphere model.
SURFEXv8.0 encompasses several submodules for modeling the interactions between the atmosphere, the ocean, the lakes and the land surface.
Over the land surface, CNRM-ESM2-1 uses the ISBA-CTRIP land surface modeling system (Decharme et al., 2019;Delire et 985 al., 2020) to solve energy, carbon and water budgets at the land surface. To simulate the land carbon cycle and vegetationclimate interactions, ISBA-CTRIP simulates plant physiology, carbon allocation and turnover, and carbon cycling through litter and soil. It includes a module for wildfires, land use and land cover changes, and carbon leaching through the soil and transport of dissolved organic carbon to the ocean. In the absence of nitrogen cycling within the vegetation, an implicit nitrogen limitation scheme that reduces specific leaf area with increasing CO2 concentration was implemented in ISBA following the 990 meta-analysis of (Yin, 2002). Additionally, there is an ad-hoc representation of photosynthesis down-regulation. During the decomposition process, some carbon is dissolved by water slowly percolating through the soil column. This dissolved organic carbon is transported by the rivers to the ocean. A detailed description of the terrestrial carbon cycle can be found in (Delire et al., 2020). Ecosystem Studies model volume 2 version trace gases (PISCESv2-gas), which derives from PISCESv2 .