Articles | Volume 12, issue 4
Research article
 | Highlight paper
15 Nov 2021
Research article | Highlight paper |  | 15 Nov 2021

Trivial improvements in predictive skill due to direct reconstruction of the global carbon cycle

Aaron Spring, István Dunkl, Hongmei Li, Victor Brovkin, and Tatiana Ilyina

State-of-the art climate prediction systems have recently included a carbon component. While physical-state variables are assimilated in reconstruction simulations, land and ocean biogeochemical state variables adjust to the state acquired through this assimilation indirectly instead of being assimilated themselves. In the absence of comprehensive biogeochemical reanalysis products, such an approach is pragmatic. Here we evaluate a potential advantage of having perfect carbon cycle observational products to be used for direct carbon cycle reconstruction.

Within an idealized perfect-model framework, we reconstruct a 50-year target period from a control simulation. We nudge variables from this target onto arbitrary initial conditions, mimicking an assimilation simulation generating initial conditions for hindcast experiments of prediction systems. Interested in the ability to reconstruct global atmospheric CO2, we focus on the global carbon cycle reconstruction performance and predictive skill.

We find that indirect carbon cycle reconstruction through physical fields reproduces the target variations. While reproducing the large-scale variations, nudging introduces systematic regional biases in the physical-state variables to which biogeochemical cycles react very sensitively. Initial conditions in the oceanic carbon cycle are sufficiently well reconstructed indirectly. Direct reconstruction slightly improves initial conditions. Indirect reconstruction of global terrestrial carbon cycle initial conditions are also sufficiently well reconstructed by the physics reconstruction alone. Direct reconstruction negligibly improves air–land CO2 flux. Atmospheric CO2 is indirectly very well reconstructed. Direct reconstruction of the marine and terrestrial carbon cycles slightly improves reconstruction while establishing persistent biases. We find improvements in global carbon cycle predictive skill from direct reconstruction compared to indirect reconstruction. After correcting for mean bias, indirect and direct reconstruction both predict the target similarly well and only moderately worse than perfect initialization after the first lead year.

Our perfect-model study shows that indirect carbon cycle reconstruction yields satisfying initial conditions for global CO2 flux and atmospheric CO2. Direct carbon cycle reconstruction adds little improvement to the global carbon cycle because imperfect reconstruction of the physical climate state impedes better biogeochemical reconstruction. These minor improvements in initial conditions yield little improvement in initialized perfect-model predictive skill. We label these minor improvements due to direct carbon cycle reconstruction “trivial”, as mean bias reduction yields similar improvements. As reconstruction biases in real-world prediction systems are likely stronger, our results add confidence to the current practice of indirect reconstruction in carbon cycle prediction systems.

1 Introduction

Predicting variations in weather and climate yields numerous benefits for economic, social, and environmental decision-making (Merryfield et al.2020). Carbon cycle prediction systems have the ability to predict the near-term evolution of CO2 fluxes (Li et al.2019; Lovenduski et al.2019a, b) and atmospheric CO2 (Spring and Ilyina2020; Ilyina et al.2021) to constrain the large internal variability of the global carbon cycle (Spring et al.2020). Predictions require a forecasting model and initial conditions representing observations. However, due to sparse and temporally incomplete records, there is currently no global biogeochemical reanalysis product to initialize Earth system models (ESMs). Therefore, direct initialization of the carbon cycle, i.e., assimilating carbon cycle variables in ESMs, is not possible. State-of-the-art carbon prediction systems initialize the carbon cycle indirectly by nudging the physical climate only, assuming that carbon cycle follows the initialized climate indirectly. However, this indirect carbon cycle initialization leaves the initial conditions of the carbon cycle unconstrained.

Here, we test how well indirect and direct carbon cycle reconstructions in an ESM initialize the carbon cycle in a perfect-model framework (Table 1 presents an overview of which variables are reconstructed in which simulation). We use the term reconstruction to describe methods of initialization of climate and the carbon cycle. Reconstructions aim to reproduce the evolution of the target, like a reanalysis product, in the ESM. Furthermore, we use the term “carbon cycle” to describe the processes exchanging carbon across the surface boundary between land, atmosphere and ocean, represented here by the air–land and air–sea CO2 fluxes. We ask the following research questions.

  • How well can initial conditions be reconstructed in the global carbon cycle?

  • Can initialization of the carbon cycle improve the predictive skill of the carbon cycle?

In this perfect-model framework, we have perfect knowledge about the ground truth and a perfect model. Literally speaking, this study ask how well perfect observations can be reconstructed in an ESM.

Originally, data assimilation is used to align the model state to an observations-based state, generally a reanalysis product (Schneider and Griffies1999; Meehl et al.2009). However, here we use the same data assimilation technique to assess how well variables can be reconstructed in an idealized setup.

Thus, reconstruction in a climate model interferes with the freely running climate model, yielding gains and drawbacks. The main advantage of climate reconstruction is that the reconstruction forces the climate model to follow the target (Jeuken et al.1996; Meehl et al.2009). The main handicap associated with reconstruction is that the mass conservation is violated and that the model dynamics and feedbacks are obstructed (Zhu and Kumar2018). Consequently, circulation fields may change, and this has severe consequences for the biogeochemical tracer distributions in the ocean and carbon pools on land because they are so sensitive and adapted to the previous climate state (Toggweiler et al.1989). Therefore, reconstructions often lead to biases. A partial solution can be bias removal by post-processing, which is feasible if the bias does not change the climate or ecosystem regime all together. Another solution is omitting nudging in regions strongly biased by reconstruction such as the tropics, as demonstrated by (Park et al.2018). Even if biogeochemical reanalysis products were available, it is unclear whether the reconstruction benefits correct these handicaps.

The lack of reanalysis products available for the reconstruction of carbon cycle initial conditions is often assumed to be a weakness of current predictions systems (Li et al.2016; Séférian et al.2018; Lovenduski et al.2019b, a; Li et al.2019; Ilyina et al.2021), but to our knowledge an elaborate assessment is missing. The literature presents two alternative approaches to test the quality of reconstructed initial conditions.

In a perfect-model study, Servonnat et al. (2015) nudge only ocean surface temperature, salinity and sea ice and assess how well this surface reconstruction penetrates into the subsurface ocean physics without addressing biogeochemistry in their analysis. This target reconstruction approach allows us to directly assess the quality of reconstructed initial conditions, which is useful and practical to know for forecasters issuing a forecast. Luo et al. (2017) use an equivalent simulation design, so-called observing system simulation experiments (OSSEs), in which they assimilate sea surface temperature, sea surface salinity and sea surface height.

In a recent study, Fransner et al. (2020) ask whether the initial conditions of ocean biogeochemistry or the initial conditions of ocean physics have a stronger influence on multi-year predictions using perfect-model twin perturbed initial conditions experiments. In the first set of hindcasts, they take identical initial conditions of ocean physics to ensure identical climate evolution but completely different states from different members for ocean biogeochemistry. In the other set of hindcasts, they slightly perturb the ocean physics to force members on differing climate evolutions while keeping the ocean biogeochemistry initial conditions identical. They find that ocean biogeochemistry initial conditions did not affect predictive skill later than the first lead year. Their approach asks the more theoretical question of whether initial conditions of ocean biogeochemistry matter compared to ocean physics initial conditions.

We go beyond previous studies by using the methodology of Servonnat et al. (2015), with the aim of understanding the quality of initial condition reconstructions. In contrast to Fransner et al. (2020), we aim to answer the questions about the quality of the initial conditions produced by different reanalysis approaches. We expand the scope by addressing the global carbon cycle, including the land, ocean and atmospheric compartments and the interactive exchange of CO2 fluxes between them. We then assess the influence of these previously reconstructed carbon cycle initial conditions for initialized predictions of the natural carbon sinks and atmospheric CO2. We focus on the global carbon cycle because the land and ocean carbon cycle control the internal variability of atmospheric CO2 (Friedlingstein et al.2020).

After explaining the approach of target reconstruction in Sect. 2, we separate reconstruction and its implication on predictive skill into two parts. We first evaluate reconstruction performance. We start with the physical reconstruction in Sect. 3.1. Then we show how the ocean and land carbon cycles are reconstructed indirectly and how direct reconstruction can improve initialization in Sect. 3.2 and 3.3. We analyze the combined effects of the ocean and land reconstruction in the atmosphere in Sect. 3.4. Following this, we assess the impact of different reconstruction methods on initial condition predictive skill in Sect. 4. Finally, the main findings and conclusions of this study are summarized in Sect. 5.

2 Methods

2.1 Model description

We use the Max Planck Institute ESM (Mauritsen et al.2019, MPI-ESM), which was also used in the Coupled Model Intercomparison Project Phase 6 (CMIP6) framework (Eyring et al.2016). We run the model MPI-ESM1-2-LR, the low-resolution configuration with 63 spherical harmonics in the atmosphere, a horizontal resolution of about 1.8 on land and about 1.5 in the ocean with daily coupling of the compartments. The time steps of the atmosphere–land and the ocean are 600 and 4320 s, respectively. We run the model with a prognostic atmospheric CO2 mixing ratio under preindustrial conditions (esm-piControl).

The marine biogeochemical cycle model HAMOCC (Ilyina et al.2013) is embedded in the ocean general circulation model MPIOM (Jungclaus et al.2013). HAMOCC includes carbonate chemistry and an extended NPZD-type cycle, including nutrient–light–temperature co-limitation and nitrogen-fixating cyanobacteria (Paulsen et al.2017). The land carbon cycle model JSBACH includes dynamic vegetation, wildfires, and soil carbon decomposition and storage (Schneck et al.2013). The atmospheric general circulation model ECHAM6 transports the three-dimensional atmospheric prognostic atmospheric CO2 tracer with a flux-form semi-Lagrangian scheme (Lin and Rood1996; Stevens et al.2013).

2.2 Perfect-model target reconstruction framework

Simulations in a perfect-model target reconstruction framework aim to reproduce the target climate evolution (Griffies and Bryan1997; Servonnat et al.2015) but are started from an independent initial state. Therefore, the initial conditions of the reconstruction simulation and the target do not match, but both the target and initial conditions share the same climatology. We choose a 50-year target period from model years 1850 to 1900 and an uncorrelated restart file from model year 2005 from the preindustrial control simulation (esm-piControl) submitted for the MPI-ESM1-2-LR model for C4MIP (Jones et al.2016) in CMIP6 (Eyring et al.2016).

Table 1Overview of different reconstruction simulations. The first column title marks the labels of the experiments as used in the paper. The reconstruction strength of relaxation timescales is noted in brackets, where h denotes hours and d days. The land carbon cycle is not dynamically reconstructed at each time step but by a hard reset of restart files each 1 January from the target run. These land restart files include carbon and nitrogen pools, soil physics (moisture, temperature, snow cover), vegetation cover (plant functional types distribution) and canopy (leaf area index).

Download Print Version | Download XLSX

In order to assess how many variables are needed to sufficiently reconstruct climate and biogeochemical cycles, we first perform reconstruction simulations only reconstructing physical state variables in atmosphere and/or ocean (Table 1). In these simulations, the carbon cycle is only indirectly affected by the reconstruction of physical variables. In further simulations, we test how much carbon cycle states improve with respect to the target when carbon cycle state variables are reconstructed directly.

Newtonian or Haney (1974) relaxation, which is often called “nudging”, is a simple four-dimensional assimilation technique that dynamically reconstructs variables in an ESM. A non-physical relaxation term with relaxation coefficient R (units  1 s−1) is added to the prognostic equation to drag the model variable X, which is subject to model forcing Fm towards its target Xt:

(1) δ X δ t = F m ( X ) + R ( X t - X ) .

For reconstruction of the dynamics of the ocean, we reconstruct three-dimensional temperature, salinity, and sea ice concentration and thickness (Table 1). We label this reconstruction as indirect (Table 1) from the carbon cycle's perspective, as the carbon cycle is not reconstructed directly but instead indirectly follows the reconstructed physical climate. Observational ocean data are often not available at each model time step. Therefore, we interpolate (without adjustments preserving the temporal mean) monthly model target output to daily frequency as has been done in previous studies (Pohlmann et al.2009). We choose a 60 d ocean relaxation time (converted to units 1 s−1) like Servonnat et al. (2015) did in their perfect-model target reconstruction study. Reconstructions towards observations usually choose a stronger nudging strength (Pohlmann et al.2009; Keenlyside et al.2008).

We reconstruct the physics of the atmosphere by nudging temperature, vorticity, divergence and the logarithm of surface pressure (Pohlmann et al.2019). The high-frequency 6-hourly output serves as the target and is nudged into all 63 spherical harmonics. Temperature and the logarithm of surface pressure are nudged with a relaxation timescale of 24 h, vorticity is nudged with a relaxation timescale of 6 h, and divergence is nudged with a relaxation timescale of 48 h. Relaxation coefficients are converted to units of 1 s−1 and are taken from previously used setups (Rast et al.2012; Pohlmann et al.2019; Li et al.2019). Nudging the atmosphere with these quite short relaxation times is similar to the forced simulations, such as the Model Intercomparison Projects for ocean (OMIP) (Griffies et al.2016; Orr et al.2017) and land (LMIP) (van den Hurk et al.2016) and Global Carbon Budget (Friedlingstein et al.2019) simulations, where (atmospheric) external boundary forcing drives the carbon cycle.

For reconstructions of oceanic carbon cycle, we use the same nudging approach and strength as for physical ocean reconstruction but on different variables. To reconstruct the components of the carbonate system, we nudge three-dimensional dissolved inorganic carbon (DIC) and total alkalinity (Table 1).

Unfortunately, there is no nudging module available in the land surface model JSBACH. The current structure of JSBACH code is not flexible enough to allow frequent rewriting of physical variable fields, such as soil moisture or temperature, with external data. Here, we choose to manually reset the initial conditions every 1 January to the target values instead of the dynamic reconstruction at each time step. We thereby reconstruct land biogeochemistry and land surface physics such as soil moisture by resetting all restart variables every year. In Appendix Sect. D, we provide several sensitivity analyses by resetting land only every 2 or 5 years and resetting the ocean every year in the same way.

We compare the target with reconstructions in the various metrics showing different attributes of tracking performance: bias, anomaly correlation coefficient and root-mean-square error. The non-physical relaxation terms in the prognostic equations can disturb the dynamics in the ESM and introduce biases defined as the differences in the reconstruction compared to the freely running target over time. The anomaly correlation coefficient skill score (ACC) shows the linear association between the reconstruction and the target over time and therefore measures synchronous evolution while ignoring bias. The root-mean-square error (RMSE) takes into account bias and measures the second-order Euclidian distance between reconstruction and target simulation over time. Under the assumption that persistent biases can be removed by post-processing, we also assess RMSE after having the mean monthly bias removed. For equations please consult Appendix A. We calculate tracking performance over running 10-year chunks to capture the variability within tracking performance and reduce the influence of drifts over time.

How do we evaluate that a reconstruction is good enough? While good enough is a subjective judgment, we resample the target simulation along the time dimension with a block length of 10 years to check the metric of two randomly compared 10-year chunks. We consider the 95th quantile threshold for ACC and 5th quantile threshold for the remaining distance-based metrics as a baseline of internal variability to be a good-enough reconstruction (Efron and Tibshirani1993), which we will refer to as “resampling threshold” in the following.

2.3 Perfect-model predictive skill framework

In the second part of this study, we perform initialized perfect-model experiments (as in Spring and Ilyina2020). The simulations in the perfect-model framework are started from the indirect and direct reconstructions as well the target representing perfect initial conditions. We take 19 initialization states chosen every second 1 January between 1860 and 1896, after allowing a 10-year adjustment phase after reconstructions were started. From each of those states from different reconstruction simulations, we fork five ensemble members and simulate 3 lead years. The perfectly initialized ensembles are started from the target initial conditions without any previous reconstruction simulation. We generate ensemble members by perturbing the stratospheric horizontal diffusion by a factor of 1.0000{member} in the first year, e.g., the factor is 1.00005 for the fifth ensemble member. This member-generating approach provokes only tiny initial perturbations to the climate system as the ocean and land initial conditions remain identical.

We compute predictive skill as the RMSE between the ensemble mean and the target as verification (Wilks2006; Jolliffe and Stephenson2011) (Appendix A). Please find additional details about the predictive skill metrics and the uninitialized bootstrapping in Spring and Ilyina (2020). Acknowledging that our reconstruction simulation developed biases and that biases are commonly reduced by post-processing in predictability research, we also apply a simple lead-time-dependent mean bias reduction to the initialized ensembles to show whether skill improvements go beyond what a simple post-processing could deliver. For each initialization in turn, we first calculate the mean bias for all but that given initialization and then remove that mean bias from the given initialization. This implies using information about future initializations as in bias-reduced hindcasts (Marotzke et al.2016). We also evaluate predictive skill from a perfectly initialized ensemble, which are started from the perfect initial conditions taken from target simulation, whereas the ensembles from reconstructed initial conditions are biased with respect to the target (Fig. 5). This initialized predictive skill is also compared with uninitialized ensembles randomly generated from the target simulation representing ensembles without common initialization and hence no memory. This uninitialized reference skill is used in predictability research community to assign whether the skill increase stems from initialization.

3 Reconstruction in an Earth system model

As the carbon cycle is sensitive to the climate evolution, we first assess how well the physical climate is reconstructed. Therefore, we first evaluate the physical climate state after reconstruction in Sect. 3.1. Afterwards, we assess how these different reconstructions of physical climate indirectly reconstruct the ocean, land and atmospheric carbon cycle in subsections and how direct reconstruction could improve initial conditions in Sects. 3.23.4.

Figure 1Spatial distribution of the bias (construction–target) (a–f) and anomaly correlation coefficient (ACC) (g–l) of different indirect carbon cycle reconstructions relative to the target over 10-year running windows of annual means (see Appendix A). The reconstruction metrics for 2 m temperature are shown for the indirectATMonly (d, j), indirectOCEAN only (e, k) and indirect reconstruction (f, l). Because of identical reconstruction skill for all indirect methods, only one indirect reconstruction is shown for other variables, zonal westward 10 m wind (a, g) and meridional northward 10 m wind (b, h) and precipitation (c, i). Gray stippling shows where the metric exceeds the 5th-percentile (for a–f) or 95th-percentile (for g–l) threshold from random target block resampling, i.e., the reconstruction is not significantly better than internal variability.

3.1 Reconstruction of physical climate

Reconstructing the ocean and/or the atmosphere systematically disturbs the freely evolving model, which leads to annual mean biases with respect to the original target. We identify atmospheric circulation represented by winds and resulting precipitation and temperature to be descriptive for the impact of circulation on the carbon cycle. The gray stippling in Fig. 1 shows where this reconstruction bias is larger than the randomly resampling fifth-percentile mean absolute error threshold and therefore labels the reconstruction as being not significantly better than internal variability.

All reconstructions yield identical results for winds and precipitation tracking performance. Reconstructing the ocean and/or the atmosphere introduces biases of up to 0.6 m/s in zonal and 0.9 m/s in meridional 10 m wind speed, depicting a southward shift of the Intertropical Convergence Zone (ITCZ). This bias results in a significant weakening of the Equator-ward latitudinal winds, whereas extratropical latitudinal winds intensify (Fig. 1a). The intensification and Equator-ward shift of the easterly trade winds and weakening of the Southern Hemisphere westerlies are both not significant (Fig. 1b). Precipitation is heavily impacted by these biases in atmospheric transport across many regions of the globe. Precipitation significantly shifts southward at the Equator with changes of more than 1 mm d−1 and increases in western Canada, western Russia and southern Australia (Fig. 1c). Unlike the previously described variables, the 2 m temperature bias depends on whether the ocean is reconstructed or not. Just reconstructing the ocean temperature and salinity (indirectOCEAN only) leads to small, negative and significant biases in the tropical Atlantic and West Pacific. In addition, northern and southern Africa, as well as the Amazon and China, are subject to a small cold bias, whereas Saharan Africa and Southeast Asia get substantially warmer. The polar regions cool significantly (Fig. 1d). Only reconstructing the atmosphere (indirectATMonly) leads to a warm bias across nearly all oceans but less cold bias over northern and southern Africa and China (Fig. 1e). Combining atmosphere and ocean reconstruction (indirect) reduces the overall temperature bias, especially over the oceans (Fig. 1f).

While the biases explained above are liabilities of reconstructions, the linear association measured by the anomaly correlation coefficient (ACC) benefits from reconstruction. Reconstruction recreates climate variability of the target (Fig. 1g–l). The running 10-year correlation between the target and the reconstruction in atmospheric variables is in most grid cells above 0.4 and significantly better than the randomly resampling threshold. Reconstruction over the oceans is more successful in the tropics than in the extratropics, where the Northern Hemisphere and Southern Hemisphere midlatitude westerlies have low but still significant correlation. Generally, the atmosphere above the ocean is better reconstructed than above land, showing the stabilizing effect of an internally consistent ocean reconstruction on the atmosphere (Fig. 1g–l). The Southern Hemisphere tropical convergence of winds is well reconstructed, but the meridional winds in central Canada and tropical Africa are not significantly reconstructed (Fig. 1g). In addition, zonal winds across North America, southern Africa and Siberia have low correlation with the target, but the tropical zonal winds are very well reconstructed (Fig. 1h). Precipitation from the central Atlantic over central Africa is reconstructed worse than the resampling threshold, and the extratropical westerlies have low correlation with the target (Fig. 1i). Temperature is well reconstructed in the tropical oceans (Fig. 1j–l). Reconstructing both atmosphere and ocean (indirect) improves 2 m temperature correlation better than only reconstructing a single realm. The indirect carbon cycle reconstruction is significantly better than the resampling threshold except in central Africa, where the ITCZ shift changes the climate regime (Fig. 1l).

This physical bias due to reconstruction, especially in the tropics, can be explained by the sensitivity of atmosphere–ocean coupling to perturbation induced by nudging (Milinski et al.2016). Additionally nudging sea surface height might improve the El Niño–Southern Oscillation (ENSO) thermocline feedback (Luo et al.2017). The reconstruction of ocean and atmospheric variables is perfectly aligned with the model climatology into that same model. Hence, the reconstruction error does not arise from inconsistent observations but from the perturbed interaction of atmospheric and oceanic dynamics. While reconstructing an increasing set of variables shows that nudging can be an efficient way to reconstruct variability (Jeuken et al.1996), this reconstruction is biasing the climate state in the tropics at the same time (also explained in Zhu and Kumar2018).

Nudging atmospheric and ocean dynamics including sea ice all at once (indirect reconstruction), as is often done in state-of-the-art carbon cycle prediction systems, brings large-scale improvements over random resampling and atmosphere-only (indirectATM only) reconstruction, but strong regional biases remain (Fig. 1).

3.2 Reconstruction of the oceanic carbon cycle

How do these regional physical biases affect the reconstruction of oceanic carbon cycle? In order to assess the tracking performance in the indirect reconstruction of the oceanic carbon cycle, we focus on air–sea CO2 flux and surface oceanic pCO2 as the state variable of the ocean carbon sink, which is the oceanic driver of air–sea CO2 flux (Lovenduski et al.2019b).

Figure 2Spatial distribution of the bias between the target and different indirect carbon cycle reconstruction methods over 10-year running windows of annual means (see Appendix A). Columns show the different carbon cycle reconstruction methods (see Table 1). Rows show the different variables: the ocean carbon cycle is represented by (a–c) the partial pressure of surface CO2 in the ocean (pCO2) and (d–f) surface air–sea CO2 flux (negative values indicate carbon uptake by the ocean); the land carbon cycle is represented by (g–i) the vegetation carbon pools and (j–l) air–land surface CO2 flux (negative values indicate carbon uptake by land); and the atmospheric carbon is represented by (m–o) the atmospheric CO2 mixing ratio (XCO2). Gray stippling shows where the bias exceeds the 5th-percentile mean absolute error threshold from random target block resampling, i.e., the reconstruction is not significantly better than internal variability.

Reconstructing only the atmospheric dynamics (indirectATMonly) leads to strong positive biases across large parts of the global ocean, which can be reduced by also reconstructing oceanic temperature and salinity (indirect) (Fig. 2a, b, d, and e). The weakening of the Southern Hemisphere westerly winds decreases the magnitude of air–sea CO2 flux, but more importantly reduces the Southern Hemisphere overturning circulation and upwelling of carbon-rich waters, which leads to increased Southern Ocean carbon uptake (Fig. 2b and e). The intensification of easterly trade winds (Fig. 1b) strengthens upwelling and therefore higher pCO2 in the tropical Atlantic (Fig. 2b) (Lefèvre et al.2013). The bias pattern of air–sea CO2 flux is dominated by the bias of pCO2 (Lovenduski et al.2019b) (Fig. 2b and e).

Figure 3The same as Fig. 2 but for the anomaly correlation coefficient (ACC). Gray stippling shows where the ACC is lower than the 95th-percentile ACC threshold from random target block resampling, i.e., the reconstruction is not significantly better than a resampling internal variability threshold.

The variations in the oceanic carbon cycle, described by the correlation coefficient, are better reconstructed than the resampling threshold. Indirect reconstruction of oceanic and atmospheric dynamics greatly improves tracking performance over atmosphere-only indirectATMonly reconstruction. The additional reconstruction of the physical ocean (Fig. 1e and f) largely enables a correlation above 0.7 (Fig. 3b and e). Only the carbon cycle in the tropical oceans remains difficult to reconstruct due to the strong biases in atmospheric circulation (Fig. 1a–c). Note that the land and atmospheric carbon bias due to indirect reconstruction are discussed in Fig. 2g–o in Sects. 3.3 and 3.4).

Next, we compare the previously shown indirect carbon cycle reconstruction with direct carbon cycle reconstruction by nudging dissolved inorganic carbon (DIC) and alkalinity (ALK) towards the target.

While direct oceanic carbon cycle reconstruction reduces the magnitudes of the bias across the ocean, biases are still evident (Fig. 2c and f). These biases are caused by the physical biases, which the dynamical oceanic carbon cycle model is sensitive to. Hence, the biased ocean physics inhibits additional improvements in tracking performance from direct ocean carbon reconstruction.

Direct oceanic carbon cycle reconstruction improves the already high correlations across the oceans (Fig. 3c and f). The resampling threshold is surpassed nearly everywhere. Only coastal areas, especially those in the eastern tropical Atlantic with strong wind and precipitation biases, have a correlation below 0.7.

Section 3.2 shows how well indirect and direct reconstruction of the ocean carbon cycle work overall. While the direct reconstruction has slightly larger biases in air–sea CO2 flux, direct reconstruction also brings higher correlation. Note that the land and atmospheric carbon biases due to direct reconstruction are discussed in Fig. 2g–o in Sect. 3.3 and 3.4).

3.3 Reconstruction of the land carbon cycle

How do these regional physical biases affect the reconstruction of the land carbon cycle? In order to assess the tracking performance in the best indirect reconstruction of the land carbon cycle, we focus on the state variable cVeg, which represents carbon storage in vegetation (leaves, stems, roots) and drives air–land CO2 flux and hence the land carbon sink.

Figure 4Evolution in global annual mean of (a) surface ocean pCO2 (b), air–sea surface CO2 flux (negative values indicate carbon uptake by the ocean) (c), vegetation carbon pools (g–i), air–land surface CO2 flux (negative values indicate carbon uptake by land) (d) and atmospheric CO2 mixing ratio (e). The target (gray) is quite well tracked by the indirect (green) and direct (orange) carbon cycle reconstruction. The solid line shows the different reconstruction simulations, the dashed lines show the initialized ensembles started from the different reconstructions.


For the land carbon cycle, the reconstruction of the ocean temperature and salinity did not matter, when atmospheric temperature was also reconstructed (Figs. 2 and 3). Indirect reconstruction leads to biases compared to the target in carbon storage, and in particular cVeg (Fig. 2g and h), as the land carbon cycle is very sensitive to changes in atmospheric circulation, which are strongest in the tropics due to the ITCZ shift. In the Amazon and southern Africa, the air–land CO2 bias increases, most likely caused by the strong positive precipitation bias in these regions (Figs. 1c and 2j and k). Conversely, the carbon sink in Southeast Asia and central Africa has a carbon release bias due to less precipitation and a warm bias (Fig. 2j and k).

The reconstruction correlations in the land carbon cycle are much lower than for the oceanic carbon cycle. cVeg is well reconstructed in the extratropics, but the biases in the tropics result in correlations with the target lower than the resampling threshold (Fig. 3g and h). Air–land CO2 shows the same patterns with lower correlations, which are below the resampling threshold in the tropics (Fig. 3j and k).

Direct reconstruction of the land carbon cycle, which is here performed by resetting all restart files of the land carbon sub-model to the target every 1 January, greatly enhances tracking performance of cVeg by simulation design. A sensitivity analysis for less frequent resetting can be found in the supplementary information (Sect. D).

This direct resetting reconstructs cVeg much better than the resampling threshold in the extratropics. However, the physical climate biases during the course of a year even introduce cVeg biases stronger than the resampling threshold in the tropics (Fig. 2i). In addition, the biases in the air–land CO2 flux are not improved (Fig. 2l), which indicates that this hard reset of restart files introduces a shock to the dynamical land model.

On the other hand, correlations in cVeg and air–land CO2 flux increased to above 0.5 everywhere except in the tropics, where the ITCZ shift changes the climate regime (Fig. 3i and l).

Section 3.3 shows the direct land carbon cycle reconstruction yields stronger correlation improvements than ocean direct carbon cycle reconstruction because the indirect reconstruction of the ocean was already quite good. Direct reconstruction reduces biases in land carbon cycle state variables, but the resulting air–land CO2 flux biases becomes worse.

Figure 5The 10-year running mean annual reconstruction skill in bias (a, d, g, j, and m), anomaly correlation coefficient (ACC, b, e, h, k and n) and root-mean-square error (RMSE, c, f, i, l, and o) for global aggregation of carbon cycle variables: (a–c) surface oceanic partial pressure of CO2, (d–f) air–sea CO2 flux (negative values indicate carbon uptake by the ocean), (g–i) vegetation carbon pools, (j–l) air–land CO2 flux (negative values indicate carbon uptake by land) and (m–o) mixing ratio of atmospheric CO2. Error bars show ±σ standard deviation of the running skill over time. Columns show different reconstruction methods: indirect (green) and direct (orange). The gray bar marks the magnitude of the 95th percentile for ACC and 5th percentile for bias and RMSE of a random reconstruction skill block-bootstrapped from the target control simulation as an unskillful reference. Gray stars indicate perfect skill. Thin black error bars with crosses show RMSE skill after a mean bias reduction.


3.4 Reconstruction of the global carbon cycle and atmospheric CO2

Tracking performance for prognostic atmospheric CO2 integrates the air–sea and air–land CO2 fluxes over time (Spring and Ilyina2020; Spring et al.2020). As atmospheric CO2 mixes quickly across the globe, we first examine globally aggregated quantities driving globally averaged atmospheric CO2 (Fig. 4).

We first examine the indirect reconstruction represented by the green error bars in Figs. 5 and C1. The indirect reconstruction has a negative bias in global pCO2 in the annual mean (Figs. 4a and 5a). This bias is slightly higher than the magnitude of the resampling mean absolute error threshold, which resembles the temporal standard deviation (Fig. 5a). The global oceanic CO2 flux is biased low but within the resampling threshold magnitude range (Figs. 4b and 5d).

On the other hand, the variations of the global oceanic carbon cycle measured by ACC are well reconstructed, surpassing the resampling threshold (Fig. 5b and e).

When biases are persistent, they can be reduced by a bias reduction procedure, which is often done when applying climate model output to a real-world application. After applying a simple mean bias reduction, RMSE is well below the resampling threshold (Fig. 5c and f).

The indirect reconstruction also leads to biases in the land carbon cycle (Fig. 4c and d). Vegetation carbon pools (cVeg) have a strong positive bias that is much larger than the resampling threshold (Fig. 4c). The bias of global air–land CO2 flux is very small in the annual mean.

Global annual cVeg has a 0.5 correlation with the target, which is lower than the resampling threshold. Global air–land CO2 variations are well reconstructed, surpassing the resampling threshold (Fig. 5h and k).

Without bias reduction, accuracy measured by RMSE is worse than the resampling cVeg threshold. After bias reduction, cVeg accuracy is still slightly worse than the threshold, but accuracy improves from 5 Pg C to below 1 Pg C, which is the magnitude of the resampling threshold (Fig. 5i). Global air–land CO2 flux accuracy is below the resampling threshold (Fig. C1l).

Global atmospheric CO2 has larger variations in reconstruction skill depending on which 10-year chunk is used to calculate the metric. And the skill has a nearly constant level throughout the year (Fig. C1m–o). The mean bias is close to zero (Figs. 4e and 5m). Correlation with the target is above 0.7 and in the range of the resampling threshold (Fig. C1n). accuracy is at 0.7 ppm in the range of the resampling threshold. Mean bias reduction improves accuracy to below 0.5 ppm (Fig. 5o).

Understanding the tracking performance of the ocean and land carbon cycle, we can now evaluate the spatial distribution of globally averaged atmospheric CO2. Reconstructing only the atmosphere warmed the globe and also increased atmospheric CO2 globally (Figs. 1k and 2m). Additionally reconstructing the ocean keeps the temperature stable but introduces a less than 1 ppm low bias across the Southern Hemisphere, reflecting the higher uptake of the Southern Ocean carbon sink and the Southern Hemisphere land carbon sink (Fig. 2e, k, and n). The variations in atmospheric CO2 are well reconstructed with correlation coefficients above 0.6 in the Southern Hemisphere, but across the Northern Hemisphere extratropics and land regions with strong physics biases correlation is at 0.5 below the resampling threshold (Fig. 2m and n).

Now, we assess the potential improvements in the global carbon cycle due to direct reconstruction of the global carbon cycle variables shown in orange in Figs. 5 and C1.

The global ocean carbon cycle improves after direct DIC and alkalinity reconstruction (Fig. 5a). Monthly biases remain but are now within the resampling threshold (Fig. C1a). Correlation improves from 0.8 to above 0.9 in surface pCO2. Air–sea CO2 correlation does not improve but only because correlations above 0.9 for the indirect reconstruction were already very high (Fig. 5b). Correlation for boreal winter is above 0.95, indicating that initial conditions in winter are well reconstructible for initializing forecasts of oceanic carbon sink (Fig. C1b). Direct reconstruction improves pCO2 accuracy to 0.2 ppm. Mean bias reduction can hardly improve accuracy after direct reconstruction (Fig. 5c). Air–sea CO2 flux accuracy degrades in comparison to indirect reconstruction. This degradation is removed by the mean bias reduction (Fig. 5f).

All results for the direct reconstruction of the land carbon cycle must be understood in the context of the method chosen for the direct reconstruction. Because we reset the restart files in 1 January to the target, the metrics are near perfect in January by design. However, then the biogeochemistry is not modified directly for 12 months and only follows the physical climate reconstruction indirectly, and thus biases triggered by physical biases unaligned with the reset land biogeochemistry pools quickly build up and may approach the metric of the indirect reconstruction. Likewise, there is no bias in global cVeg in January by design. The bias increases with the physical biases until it surpasses the resampling threshold in August and continues increasing until the end of the year (Fig. C1g). Annual cVeg bias is strongly improved by direct reconstruction (Fig. 5g). Global air–land CO2 flux has a stronger bias than the indirect reconstruction (Fig. 5j). Correlation in the global cVeg is near perfect in January by design and slowly decreases to 0.8 in December while still being better than the resampling threshold (Fig. C1h). Annual cVeg variations are reconstructed much better by the direct method compared to the indirect method (Fig. 5h). Global air–land CO2 flux variations increase by 0.2 (Fig. 5k). Direct reconstruction improves global cVeg accuracy. Accuracy is better than the resampling threshold after mean bias reduction. Direct reconstruction slightly improves CO2 flux accuracy. Furthermore, a mean bias reduction slightly improves accuracy (Fig. 5i and l).

The global CO2 bias in the direct reconstruction increases to +1.8 ppm (Fig. 5m), but correlation increases from 0.7 to 0.9 (Fig. C1n). The direct reconstruction has worse accuracy than the indirect due to established bias, but after mean bias reduction the accuracy is below 0.3 ppm (Fig. 5o).

How does direct carbon cycle reconstruction affect tracking performance in prognostic atmospheric CO2? The time series already indicate that there is a 1–2 ppm atmospheric CO2 positive bias in the direct reconstruction (Fig. 4e). This bias is very homogeneous over the oceans (Fig. 2o). However, correlation strongly increased to 0.9 above the oceans and above 0.7 on land, except for central Africa with its persistent biases, where the reconstruction is not better than the resampling threshold.

Section 3.4 shows that atmospheric CO2 follows the reconstructed land and ocean carbon cycle, integrating their respective fluxes over time. The direct carbon cycle reconstruction introduces a large bias in the atmospheric CO2 distribution that the indirect reconstruction did not suffer from even after mean bias reduction (Fig. B3). Globally averaged atmospheric CO2 after direct reconstruction had a better accuracy tracking performance after the mean bias reduction, showing how global aggregation can balance regional biases. The direct land and ocean carbon cycle reconstructions track targets much better than the indirect reconstruction when measured by correlation.

Figure 6Predictive skill measured by (a–e) root-mean-square error (RMSE), (f–j) RMSE after bias reduction, and (k–o) anomaly correlation coefficient (ACC) between the initialized ensemble mean and the target as a function of lead year for different initialization setups: perfect indicating no reconstruction and hence perfect initial conditions to predict the target (gray), indirect (green), and direct (orange) reconstructions. Columns show global variables for the ocean carbon cycle, i.e., (a) oceanic surface pCO2 and (b) air–sea CO2 flux; for the land carbon cycle, i.e., (c) total land carbon pools and (d) air–land CO2 flux; and in the atmosphere, i.e., (e) atmospheric CO2 mixing ratio. Initialized ensembles are resampled with replacement (N= 500) along the initialization dimension to account for initialization sampling uncertainty (see Spring and Ilyina2020), where error bars show the resampled initialization skill uncertainty (± 1σ). Uninitialized ensembles, shown at lead 0, are resampled from the target control simulation and show the reference skill without initialization.


Generally speaking, this first part showed how direct carbon cycle reconstruction improves linear association between reconstruction and target (measured by ACC) but often increases biases degrading accuracy (measured by RMSE). Only after bias reduction does accuracy improve with respect to the indirect carbon cycle reconstruction.

4 Impact of reconstruction on global carbon cycle predictive skill

The second part of the paper assesses how predictive skill improves due to direct initialization of global carbon cycle variables. Specifically, we verify the RMSE between the five ensemble members initialized from the indirect and direct reconstructions across all initializations based on raw and lead-time-dependent bias-corrected time series (Figs. 4 and 6).

4.1 Oceanic carbon cycle

The RMSE between the initialized ensembles and the target simulations in annual globally averaged pCO2 continuously increases from lead year 1 to lead year 3 as expected. While perfectly and indirectly initialized ensembles stay below the resampling uninitialized threshold for the first 2 lead years, indicating that global pCO2 is predictable due to initialization (Fig. 6a), the direct initialization has a larger error due to the offsets in global atmospheric CO2, which pCO2 tries to equilibrate to (Fig. 4e). Therefore, this persistent bias causes lead year 3 to be not predictable. A simple mean bias reduction resolves this issue, making all 3 lead years predictable. Direct initialization only beats indirect initialization for lead year 1 with RMSE of 0.35 ± 0.05 versus 0.45 ± 0.05 ppm (Fig. 6f).

Global air–sea CO2 flux is predictable for 3 years in all initialization methods, which is 1 year longer than in Spring and Ilyina (2020), possibly because here we use more and more equally distributed initialization dates. Direct initialization is advantageous over the indirect initialization because the initial lead offset is smaller (0.14 ± 0.01 versus 0.18 ± 0.02 Pg C yr−1) (Fig. 6b). The simple mean bias reduction improves the skill of the non-perfect initializations to identical magnitudes (Fig. 6g).

4.2 Land carbon cycle

Indirect initialization makes cVeg not predictable. The physical reconstruction biases drive larger errors in lead year 1 than in later lead years and also to a lesser extent for the direct reconstruction, where some biases are corrected. But both reconstructed initialized ensembles show decreasing distances towards the target, whereas increasing distances are expected for vanishing predictive skill as in the perfectly initialized ensembles (Fig. 6c). Mean bias reduction eliminates the differences between direct and perfect reconstruction, making both predictable unlike the indirect reconstruction (Fig. 6h).

Global air–land CO2 flux is predictable for 3 years, again 1 year longer than found in Spring and Ilyina (2020). Both reconstructed initializations start with a higher error of 1.1 ± 0.2 Pg C yr−1 in lead year 1 compared to perfectly initialized 0.7 ± 0.1 Pg C yr−1 (Fig. 6d). Mean bias reduction brings non-perfect initializations within the error bars of the perfect initialization after lead year 1 (Fig. 6i). A recent analysis focused on process-based understanding of land carbon predictability using JSBACH indicates that soil moisture and soil carbon storage, both reconstructed by the direct method, influence the air–land CO2 flux the most (Dunkl et al.2021).

4.3 Atmospheric CO2

Perfect and indirect initialization atmospheric CO2 predict the target for 3 years, as found in Spring and Ilyina (2020). While the perfect initialization error grows continuously from zero, the indirect initialization error stays nearly constant at 0.7 ± 0.1 ppm, but the error stays below the direct initialization error, which suffers from the bias in the direct reconstruction simulation (Fig. 6e). Mean bias reduction improves RMSE, making direct initialization better but still within the margins of the indirect initialization. After lead year 1, indirect and direct initializations are similar to perfect-initialization predictive skill at 0.7 ppm (Fig. 6j).

The anomaly correlation coefficient measures how predictable variations are and is independent of the mean bias (Fig. 6k–o) (Jolliffe and Stephenson2011). Measuring predictive skill with ACC shows very similar behavior across all variables. While perfect initialization is the most predictable, indirect and direct carbon cycle initialization are fairly similar. Predictive ACC skill seems to saturate after lead year 2.

These initialized predictive skill results show that indirectly initialized ensembles predict the target quite reasonably. Direct initialization suffers strong shocks in some variables, when reconstruction is started and stopped, but these shocks can be partly reduced by a mean bias reduction. The improvements of direct reconstruction over indirect reconstruction in the global carbon cycle predictive skill after bias reduction are not significant, except for vegetation carbon pools (cVeg) (Fig. 6f–j).

5 Summary and conclusions

In this study, we assess how well the global carbon cycle is reconstructed in an ESM and how well a ground truth target simulation can be predicted by these initializations.

The main limitation of land carbon cycle reconstruction potential is the hard reset of restart files which is fundamentally different to the dynamical nudging applied for ocean and atmospheric physics. Our study represents a first attempt to quantify whether reconstruction of initial conditions in the land carbon cycle is indeed needed for addressing predictive skill of the global carbon sinks and atmospheric CO2 concentration. For a real-world application, our direct land carbon reconstruction method should not be used. In practice, satellite products of carbon cycle variables could be assimilated into the model periodically or at each time step. However, strong interference with the model alone will likely result in strong drifts, especially in dependent variables. For useful real-world applications of land carbon cycle assimilation, sequential (Evensen1994; Balmaseda et al.2007; Zhang et al.2007) or variational (Han et al.2004) data assimilation techniques could be used for initialization. However, the problem of data availability for the reforecast period still remains. Haney (1974) reconstruction is the simplest approach to data assimilation, allowing little flexibility in the model. Many centers are now transitioning towards ensemble Kalman filter data assimilation, which allows more variability (Park et al.2019; Brune and Baehr2020). Applying such techniques to the carbon cycle may lead to better reconstructions. A final limitation of the method is that we use a model to reconstruct itself. Therefore, we do not have any structural uncertainty other than the reconstruction method itself or processes missing in our framework. When reconstructing the real world, our model lacks processes and resolution contributing to structural uncertainty.

We find that reconstruction, which is an interference with the freely evolving model, leads to biases in physical climate. Because of its sensitivity to physical climate, the global carbon cycle is itself heavily biased by these physical biases. In ESMs, first the atmosphere then the ocean and only then the carbon cycle is equilibrated and tuned for preindustrial control conditions. Once reconstruction slightly modifies the mean state in the physical climate, the sensitive carbon cycle deviates from the near-equilibrium state. A previous study reported biases after reconstruction (Zhu and Kumar2018). However, to our knowledge, we present the first attempt at reconstructing it in a perfect-model framework, where no biases due to climatology differences are expectable. Zhu and Kumar (2018) also mention that reconstruction ability likely depends on the model and application area, and hence there seems to be no out-of-the-box solution for all ESMs. However, additionally nudging sea surface height might improve the ENSO thermocline feedback (Luo et al.2017).

We furthermore find that the commonly used indirect reconstruction of carbon cycle, in which only climate physics are reconstructed and the carbon cycle follows indirectly, tracks the target reasonably well. A resampling threshold corresponding to internal variability is surpassed across large parts of the globe. Only the areas with strong physical biases and consequently carbon cycle biases miss that benchmark occasionally. For the ocean carbon cycle, the reconstruction of the physical ocean fields is critical to reconstruct carbon cycle initial conditions, which explains why current state-of-the-art carbon cycle prediction systems have skill despite not initializing the ocean carbon cycle with ocean carbon cycle observations (Séférian et al.2014; Park et al.2018; Li et al.2019; Lovenduski et al.2019b).

Direct reconstruction of ocean and land carbon cycle improves bias, association and accuracy on a grid cell level; however, when aggregated on the global scale, direct reconstruction does not significantly improve over the indirect reconstruction. In addition, after a mean bias reduction, which is a common post-processing technique applied to model output for real-word use, accuracy measured in RMSE after direct reconstruction is only slightly better, often still overlapping with indirect reconstruction. Because the advantage of direct reconstruction can similarly be achieved by a simple mean bias reduction, we label these direct reconstruction improvements as trivial with respect to the indirect method on the global scale. More advanced data assimilation methods may yield better reconstruction skill for the carbon cycle (Han et al.2004; Balmaseda et al.2007; Zhang et al.2007).

When the success of atmospheric CO2 reconstruction is evaluated, caution is needed. Reconstruction of the ocean and land carbon sink can easily introduce offsets from the target because reconstruction violates conservation of mass by creating or erasing carbon. This can easily lead to offsets in the sinks that quickly accumulate in atmospheric CO2. If CO2 reconstruction is the focus, i.e., in reconstructing the transient climate from CO2 emission, and offsets appear, adjustments of atmospheric CO2 might be needed to correct for these offsets. However, we find that these offset biases are only of the order of 1–2 ppm in a perfect-model framework, which is small compared to the range of carbon feedbacks seen in atmospheric CO2 in transient simulations. Hence, these offsets due to the restart files are not in our focus. Instead, equilibrated land and ocean carbon sinks with reconstructed climate determine realistic reconstructed atmospheric CO2.

In the second part, we find that predictive skill after indirect initialization is similar in quality to after direct initialization. This means that oceanic carbon cycle initial conditions are much less important than physical ocean initial conditions for oceanic carbon cycle predictions, which confirms the findings of (Fransner et al.2020). Reconstructed initialized predictive skill is close to perfectly initialized predictive skill after mean bias reduction, especially after lead year one.

Because the improved global predictive skill after direct reconstruction can similarly be achieved by a simple mean bias reduction and predictive skill after both reconstructions mostly overlaps, we label these direct reconstruction predictive skill improvements trivial with respect to the indirect method on the global scale. This result is similar to Fransner et al. (2020), who find that ocean carbon cycle initial conditions matter much less than physical ocean initial conditions for annual carbon cycle predictions.

We conclude that the indirect carbon cycle reconstruction serves its purpose of reconstructing variation in the global carbon cycle. However, our study is designed and conducted in an idealized framework. When transferring our results into assimilation of real-world observations and its implications on predictability, structural uncertainties (model resolution in space and time) and missing ecosystem processes additionally need to be dealt with. Future studies, especially those aiming to address regional marine ecosystems, could consider a wider range of assimilation techniques and data breadth. Furthermore, more advanced data assimilation techniques (Evensen1994; Han et al.2004; Balmaseda et al.2007; Zhang et al.2007) should be explored. Reducing the physical climate bias with its consequences for the carbon cycle holds more potential for improvements in initial conditions and predictive skill than direct carbon cycle initialization (Saito et al.2011; Lee and Biasutti2014; Hua et al.2019).

Nevertheless, our results add confidence to the current practice of indirect reconstruction in carbon cycle prediction systems (Ilyina et al.2021).

Appendix A: Metrics


The anomaly correlation coefficient (ACC) assesses the synchronous evolution over time of the forecast, here reconstruction x(t), and the reference, here target x^(t), (Jolliffe and Stephenson2011) and is defined as follows:

(A1) ACC ( x ( t ) , x ^ ( t ) ) = cov ( x ( t ) , x ^ ( t ) ) var ( x ( t ) ) var ( x ^ ( t ) ) = 1 T t = 1 T ( x ( t ) - x ( t ) ) ( x ^ ( t ) - x ^ ( t ) ) t = 1 T ( x ( t ) - x ( t ) ) 2 T t = 1 T ( x ^ ( t ) - x ^ ( t ) ) 2 T .


In the initial conditions reconstruction component, the root-mean-square error (RMSE) measures the second-order distance between forecast x(t), here reconstruction x(t), and the reference, here target x^(t), (Jolliffe and Stephenson2011) and is defined as follows:

(A2) RMSE ( x ( t ) , x ^ ( t ) ) = T = 1 T ( x ( t ) - x ^ ( t ) ) 2 T .

As a predictability metric, the RMSE measures the second-order distance between forecast x(t) and the target x^(t) over lead time t (Jolliffe and Stephenson2011). RMSE is calculated over all initializations N, and every member M is used as a forecast and verified against the target. RMSE is defined as follows:

(A3) RMSE ( x ( t ) , x ^ ( t ) ) = i , j = 1 N , M ( x i , j ( t ) - x ^ j ( t ) ) 2 N M .

A3 Bias

We set the target as the ground truth. Therefore, any deviation from the reconstructions x(t) to the target x^(t) is seen as a bias, analogous to the bias between a model simulation (reconstruction) and observations (ground truth).

(A4) bias(t ) = x ( t ) - x ^ ( t )

A4 Removing the bias

After removing the mean bias from reconstruction x(t) and target x^(t), the RMSE is also calculated as debiased RMSE.

(A5) RMSE debiased ( t ) = RMSE ( x ( t ) - x ( t ) , x ^ ( t ) - x ^ ( t )

A5 Running metric

We calculate the mean tracking performance (mtp) over time for all metrics as a running mean over s= 10 years. This reflects that reconstructions are supposed to reconstruct the given climate states for periods from months to a couple of years, and the metric should not be prone to long-term trends that are not captured by the reconstruction. We ignore the first c= 10 years (out of tmax= 48 years) of reconstruction, where the model experiences an initial shock after adjusting to the new reconstructed climate (Kröger et al.2017).

(A6) tpm ( metric ) = 1 t max - s - c t = c t max - s metric ( x ( t = t . . t + s ) , x ^ ( t = t . . t + s ) )

A6 Resampling threshold

To get an estimate of random tracking performance due to internal variability, i.e., how well one 10-year chunk tracks another random 10-year chunk, we randomly resample 10-year chunks from the target simulation and apply the same tracking metrics. As a baseline skill from this random resampling in the figures, we take the 95 % threshold for ACC and the 95 % for the remaining distance-based metrics to ensure that the tracking performance from a reconstruction simulation is only worse compared to 1 out of 20 randomly resampled 10-year chunks.

Appendix B: Reconstruction RMSE maps

Figure B1The same as Fig. 1 but for RMSE (a–f) and for RMSE after bias reduction (g–l).

Figure B2The same as Fig. 2 but for the RMSE. Gray stippling shows where the RMSE is worse than the 5th-percentile RMSE threshold from random target block resampling, i.e., the reconstruction is not significantly better than internal variability.

Figure B3The same as Fig. B2 but for RMSE after bias reduction.

Appendix C: Monthly global tracking performance

In order to explain the effect of the direct reconstruction in the land carbon cycle on global reconstruction performance, Fig. C1 shows the tracking performance for monthly time series, whereas Fig. 5 show only results for annual time series.

Figure C1The 10-year running mean reconstruction skill per month in bias (a, d, g, j and m), anomaly correlation coefficient (ACC, b, e, h, k and n) and root-mean-square error (RMSE, c, f, i, l and o) for global aggregation of carbon cycle variables: (a–c) surface oceanic partial pressure of CO2, (d–f) air–sea CO2 flux (negative values indicate carbon uptake by the ocean), (g–i) vegetation carbon pools, (j–l) air–land CO2 flux (negative values indicate carbon uptake by land) and (m–o) mixing ratio of atmospheric CO2. Whiskers show the 5th and 95th percentile of the running skill over time. Colors show different reconstruction methods: indirect (green) and direct (orange). Gray stars indicate perfect skill. Gray dots mark the 95th percentile for ACC and 5th percentile for the remaining distance-based metrics of random reconstruction skill block-bootstrapped from the target control simulation as an unskilled reference skill. Crosses show reconstruction skill of annual mean time series. Thin lines show monthly RMSE skill after a mean bias reduction.


Appendix D: Sensitivity analysis for different reconstruction time steps

D1 Land Carbon Cycle

We perform sensitivity reconstructions of the land restart file to understand how sensitive this reconstruction method is to the frequency of resetting. We performed additional simulations, resetting the land model on 1 January every second or every fifth year (orange triangles in Fig. D1).

Global cVeg starts by definition with perfect skill in January after a reset. When resetting only every second year, the mean January tracking performance is already decreased, and decreases further. The negative correlations for 5-year resetting shows the shock to the system if not immediately balanced by further resetting in the every (second) year case.

The global air–land CO2 flux correlation degrades for less frequent resetting towards the indirect performance, but bias and accuracy improve.

Global atmospheric CO2 aggregates these results and is also sensitive to biases developing in both sinks. Here, less frequent resetting of the land carbon cycle reduces the bias and therefore accuracy.

The tracking accuracy is of similar magnitude after mean bias reduction.

Figure D1The same as Fig. C1 but for sensitivity simulations of the restart file resetting reconstruction. In all simulations the physical climate is nudged as in indirect simulations (Table 1). DirectLR1ON describes land resetting every year and ocean nudging and is the indirect simulation. DirectLR2ON describes land resetting every second year and ocean nudging. DirectLR5ON describes land resetting every fifth year and ocean nudging. DirectLxOR1 describes no land reconstruction and ocean setting every year.


D2 Ocean carbon cycle

We perform the same kind of restart file resetting reconstruction with the ocean model (blue line in Fig. D1). The motivation here is to see whether a resetting of the ocean carbon cycle also yields perfect accuracy (RMSE) skill for January. However, the ocean carbon cycle is sensitive to the physical climate, and hence the direct ocean carbon cycle resetting accuracy degrades compared to the indirect tracking bias and accuracy, and only correlation increases (Fig. D1a–f). Contrary to resetting restart files in the land model, initial conditions accuracy measured by RMSE does not approach a perfect skill of 0 because the physical climate did not experience this hard reset but is nudged dynamically.

In general, this hard reconstruction also seems to work for the ocean carbon cycle because the tracking performances are not very different from the indirect method (Fig. D1a-f).

The tracking accuracy is of similar magnitude after mean bias reduction.

Appendix E: Seasonality

FigureE1 is provided below as a reference for Fig. C1 to allow the reader to better understand reconstruction skill in the context of target seasonality.

Figure E1Seasonality of the target simulation for global aggregated carbon cycle variables.


Appendix F: Schematics

Figure F1(a) Schematic of nudging with relaxation constant. (b) Schematic of reconstruction towards a target, where reconstructions are started from temporally independent restart files from the same simulation but 155 years later in time, i.e., 2005.


Figure F2Schematic overview of perfect-model target reconstruction simulations showing which variables are reconstructed in which simulations.


Appendix G: Climatology

Figure G1Mean climatology of the control simulations for all variables.

Figure G2Temporal internal variability expressed as temporal standard deviation from the control simulations for all variables.

Appendix H: Predictive skill of leaf area index (LAI)

We used LAI in a previous internal iteration of the paper but chose to replace LAI with cVeg. In our model JSBACH, LAI depends on climate, it is not a carbon variable. Therefore, we did not want to use this variable in this paper. However, there is an indirect link from LAI to air–land CO2 flux because LAI reflects droughts and the soil physics. A recent analysis focused on process-based understanding of land carbon predictability using JSBACH indicates that soil moisture and soil carbon storage influence the air–land CO2 flux the most (Dunkl et al.2021).

Figure H1The same as Fig. 6 but with leaf area index (LAI) instead of carbon vegetation pools (cVeg).


Code and data availability

Forecast verification was performed with the Python package CLIMPRED (Brady and Spring2021) (, last access: 13 October 2020, Brady et al.2021), which was co-developed with Riley X. Brady from University of Colorado, Boulder. Scripts and data to reproduce this analysis are archived at (Spring2021).

Author contributions

AS and TI conceived the study. AS performed the simulations and analysis, created the figures, and drafted the manuscript. TI, ID, HL and VB contributed to manuscript editing and provided feedback.

Competing interests

The authors declare that they have no conflict of interest.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


We acknowledge funding from European Union's Horizon 2020 Research and Innovation Programme under grant agreement no. 821003 “Climate-Carbon Interactions in the Current Century (4C)”, no. 820989 “COMFORT” and no. 641816 “CRESCENDO”. Simulations were performed at the German Climate Computing Center (DKRZ). We thank Jürgen Bader for internal review.

Financial support

This research has been supported by Horizon 2020 (grant nos. 821003, 820989, and 641816).

The article processing charges for this open-access publication were covered by the Max Planck Society.

Review statement

This paper was edited by Ning Zeng and reviewed by John Dunne and one anonymous referee.


Balmaseda, M. A., Dee, D., Vidard, A., and Anderson, D. L. T.: A Multivariate Treatment of Bias for Sequential Data Assimilation: Application to the Tropical Oceans, Q. J. Roy. Meteor. Soc., 133, 167–179,, 2007. a, b, c

Brady, R., Spring, A., Huang, A., Banihirwe, A., and Bell, R.: pangeo-data/climpred: Release v2.1.4 (2.1.4), Zenodo,, 2021. a

Brady, R. X. and Spring, A.: Climpred: Verification of Weather and Climate Forecasts, Journal of Open Source Software, 6, 2781,, 2021. a

Brune, S. and Baehr, J.: Preserving the Coupled Atmosphere–Ocean Feedback in Initializations of Decadal Climate Predictions, WIREs Clim. Change, 11, e637,, 2020. a

Dunkl, I., Spring, A., Friedlingstein, P., and Brovkin, V.: Process-based analysis of terrestrial carbon flux predictability, Earth Syst. Dynam. Discuss. [preprint],, in review, 2021. a, b

Efron, B. and Tibshirani, R. J.: An Introduction to the Bootstrap, 1st edn., available at: (last access: 9 November 2021), Chapman and Hall/CRC, New York, 1993. a

Evensen, G.: Sequential Data Assimilation with a Nonlinear Quasi-Geostrophic Model Using Monte Carlo Methods to Forecast Error Statistics, J. Geophys. Res.-Oceans, 99, 10143–10162,, 1994. a, b

Eyring, V., Bony, S., Meehl, G. A., Senior, C. A., Stevens, B., Stouffer, R. J., and Taylor, K. E.: Overview of the Coupled Model Intercomparison Project Phase 6 (CMIP6) experimental design and organization, Geosci. Model Dev., 9, 1937–1958,, 2016. a, b

Fransner, F., Counillon, F., Bethke, I., Tjiputra, J., Samuelsen, A., Nummelin, A., and Olsen, A.: Ocean Biogeochemical Predictions–Initialization and Limits of Predictability, Frontiers in Marine Science, 7, 386,, 2020. a, b, c, d

Friedlingstein, P., Jones, M. W., O'Sullivan, M., Andrew, R. M., Hauck, J., Peters, G. P., Peters, W., Pongratz, J., Sitch, S., Le Quéré, C., Bakker, D. C. E., Canadell, J. G., Ciais, P., Jackson, R. B., Anthoni, P., Barbero, L., Bastos, A., Bastrikov, V., Becker, M., Bopp, L., Buitenhuis, E., Chandra, N., Chevallier, F., Chini, L. P., Currie, K. I., Feely, R. A., Gehlen, M., Gilfillan, D., Gkritzalis, T., Goll, D. S., Gruber, N., Gutekunst, S., Harris, I., Haverd, V., Houghton, R. A., Hurtt, G., Ilyina, T., Jain, A. K., Joetzjer, E., Kaplan, J. O., Kato, E., Klein Goldewijk, K., Korsbakken, J. I., Landschützer, P., Lauvset, S. K., Lefèvre, N., Lenton, A., Lienert, S., Lombardozzi, D., Marland, G., McGuire, P. C., Melton, J. R., Metzl, N., Munro, D. R., Nabel, J. E. M. S., Nakaoka, S.-I., Neill, C., Omar, A. M., Ono, T., Peregon, A., Pierrot, D., Poulter, B., Rehder, G., Resplandy, L., Robertson, E., Rödenbeck, C., Séférian, R., Schwinger, J., Smith, N., Tans, P. P., Tian, H., Tilbrook, B., Tubiello, F. N., van der Werf, G. R., Wiltshire, A. J., and Zaehle, S.: Global Carbon Budget 2019, Earth Syst. Sci. Data, 11, 1783–1838,, 2019. a

Friedlingstein, P., O'Sullivan, M., Jones, M. W., Andrew, R. M., Hauck, J., Olsen, A., Peters, G. P., Peters, W., Pongratz, J., Sitch, S., Le Quéré, C., Canadell, J. G., Ciais, P., Jackson, R. B., Alin, S., Aragão, L. E. O. C., Arneth, A., Arora, V., Bates, N. R., Becker, M., Benoit-Cattin, A., Bittig, H. C., Bopp, L., Bultan, S., Chandra, N., Chevallier, F., Chini, L. P., Evans, W., Florentie, L., Forster, P. M., Gasser, T., Gehlen, M., Gilfillan, D., Gkritzalis, T., Gregor, L., Gruber, N., Harris, I., Hartung, K., Haverd, V., Houghton, R. A., Ilyina, T., Jain, A. K., Joetzjer, E., Kadono, K., Kato, E., Kitidis, V., Korsbakken, J. I., Landschützer, P., Lefèvre, N., Lenton, A., Lienert, S., Liu, Z., Lombardozzi, D., Marland, G., Metzl, N., Munro, D. R., Nabel, J. E. M. S., Nakaoka, S.-I., Niwa, Y., O'Brien, K., Ono, T., Palmer, P. I., Pierrot, D., Poulter, B., Resplandy, L., Robertson, E., Rödenbeck, C., Schwinger, J., Séférian, R., Skjelvan, I., Smith, A. J. P., Sutton, A. J., Tanhua, T., Tans, P. P., Tian, H., Tilbrook, B., van der Werf, G., Vuichard, N., Walker, A. P., Wanninkhof, R., Watson, A. J., Willis, D., Wiltshire, A. J., Yuan, W., Yue, X., and Zaehle, S.: Global Carbon Budget 2020, Earth Syst. Sci. Data, 12, 3269–3340,, 2020. a

Griffies, S. M. and Bryan, K.: A Predictability Study of Simulated North Atlantic Multidecadal Variability, Clim. Dynam., 13, 459–487,, 1997. a

Griffies, S. M., Danabasoglu, G., Durack, P. J., Adcroft, A. J., Balaji, V., Böning, C. W., Chassignet, E. P., Curchitser, E., Deshayes, J., Drange, H., Fox-Kemper, B., Gleckler, P. J., Gregory, J. M., Haak, H., Hallberg, R. W., Heimbach, P., Hewitt, H. T., Holland, D. M., Ilyina, T., Jungclaus, J. H., Komuro, Y., Krasting, J. P., Large, W. G., Marsland, S. J., Masina, S., McDougall, T. J., Nurser, A. J. G., Orr, J. C., Pirani, A., Qiao, F., Stouffer, R. J., Taylor, K. E., Treguier, A. M., Tsujino, H., Uotila, P., Valdivieso, M., Wang, Q., Winton, M., and Yeager, S. G.: OMIP contribution to CMIP6: experimental and diagnostic protocol for the physical component of the Ocean Model Intercomparison Project, Geosci. Model Dev., 9, 3231–3296,, 2016. a

Han, G., Zhu, J., and Zhou, G.: Salinity Estimation Using the TS Relation in the Context of Variational Data Assimilation: SALINITY ESTIMATION, J. Geophys. Res.-Oceans, 109, C03018,, 2004. a, b, c

Haney, R. L.: A Numerical Study of the Response of an Idealized Ocean to Large-Scale Surface Heat and Momentum Flux, J. Phys. Oceanogr., 4, 145–167,, 1974. a, b

Hua, W., Zhou, L., Nicholson, S. E., Chen, H., and Qin, M.: Assessing Reanalysis Data for Understanding Rainfall Climatology and Variability over Central Equatorial Africa, Clim. Dynam., 53, 651–669,, 2019. a

Ilyina, T., Six, K. D., Segschneider, J., Maier-Reimer, E., Li, H., and Núñez-Riboni, I.: Global Ocean Biogeochemistry Model HAMOCC: Model Architecture and Performance as Component of the MPI-Earth System Model in Different CMIP5 Experimental Realizations, J. Adv. Model. Earth Sy., 5, 287–315,, 2013. a

Ilyina, T., Li, H., Spring, A., Müller, W. A., Bopp, L., Chikamoto, M. O., Danabasoglu, G., Dobrynin, M., Dunne, J., Fransner, F., Friedlingstein, P., Lee, W., Lovenduski, N. S., Merryfield, W. J., Mignot, J., Park, J. Y., Séférian, R., Sospedra-Alfonso, R., Watanabe, M., and Yeager, S.: Predictable Variations of the Carbon Sinks and Atmospheric CO2 Growth in a Multi-Model Framework, Geophys. Res. Lett., 48, e2020GL090695,, 2021. a, b, c

Jeuken, A. B. M., Siegmund, P. C., Heijboer, L. C., Feichter, J., and Bengtsson, L.: On the Potential of Assimilating Meteorological Analyses in a Global Climate Model for the Purpose of Model Validation, J. Geophys. Res.-Atmos., 101, 16939–16950,, 1996. a, b

Jolliffe, I. T. and Stephenson, D. B.: Forecast Verification: A Practitioner's Guide in Atmospheric Science, John Wiley & Sons, Ltd, Chichester, UK,, 2011. a, b, c, d, e

Jones, C. D., Arora, V., Friedlingstein, P., Bopp, L., Brovkin, V., Dunne, J., Graven, H., Hoffman, F., Ilyina, T., John, J. G., Jung, M., Kawamiya, M., Koven, C., Pongratz, J., Raddatz, T., Randerson, J. T., and Zaehle, S.: C4MIP – The Coupled Climate–Carbon Cycle Model Intercomparison Project: experimental protocol for CMIP6, Geosci. Model Dev., 9, 2853–2880,, 2016. a

Jungclaus, J. H., Fischer, N., Haak, H., Lohmann, K., Marotzke, J., Matei, D., Mikolajewicz, U., Notz, D., and von Storch, J. S.: Characteristics of the Ocean Simulations in the Max Planck Institute Ocean Model (MPIOM) the Ocean Component of the MPI-Earth System Model, J. Adv. Model. Earth Sy., 5, 422–446,, 2013. a

Keenlyside, N. S., Latif, M., Jungclaus, J., Kornblueh, L., and Roeckner, E.: Advancing Decadal-Scale Climate Prediction in the North Atlantic Sector, Nature, 453, 84–88,, 2008. a

Kröger, J., Pohlmann, H., Sienz, F., Marotzke, J., Baehr, J., Köhl, A., Modali, K., Polkova, I., Stammer, D., Vamborg, F. S. E., and Müller, W. A.: Full-Field Initialized Decadal Predictions with the MPI Earth System Model: An Initial Shock in the North Atlantic, Clim. Dynam., 51, 2593–2608,, 2017. a

Lee, D. E. and Biasutti, M.: Climatology and Variability of Precipitation in the Twentieth-Century Reanalysis, J. Climate, 27, 5964–5981,, 2014. a

Lefèvre, N., Caniaux, G., Janicot, S., and Gueye, A. K.: Increased CO2 Outgassing in February-May 2010 in the Tropical Atlantic Following the 2009 Pacific El Niño, J. Geophys. Res.-Oceans, 118, 1645–1657,, 2013. a

Li, H., Ilyina, T., Müller, W. A., and Sienz, F.: Decadal Predictions of the North Atlantic CO2 Uptake, Nat. Commun., 7, 11076,, 2016. a

Li, H., Ilyina, T., Müller, W. A., and Landschützer, P.: Predicting the Variable Ocean Carbon Sink, Science Advances, 5, eaav6471,, 2019. a, b, c, d

Lin, S.-J. and Rood, R. B.: Multidimensional Flux-Form Semi-Lagrangian Transport Schemes, Mon. Weather Rev., 124, 2046–2070,, 1996. a

Lovenduski, N. S., Bonan, G. B., Yeager, S. G., Lindsay, K., and Lombardozzi, D. L.: High Predictability of Terrestrial Carbon Fluxes from an Initialized Decadal Prediction System, Environ. Res. Lett., 14, 124074,, 2019a. a, b

Lovenduski, N. S., Yeager, S. G., Lindsay, K., and Long, M. C.: Predicting near-term variability in ocean carbon uptake, Earth Syst. Dynam., 10, 45–57,, 2019b. a, b, c, d, e

Luo, H., Zheng, F., and Zhu, J.: Evaluation of Oceanic Surface Observation for Reproducing the Upper Ocean Structure in ECHAM5/MPI-OM, J. Geophys. Res.-Oceans, 122, 9695–9711,, 2017. a, b, c

Marotzke, J., Müller, W. A., Vamborg, F. S. E., Becker, P., Cubasch, U., Feldmann, H., Kaspar, F., Kottmeier, C., Marini, C., Polkova, I., Prömmel, K., Rust, H. W., Stammer, D., Ulbrich, U., Kadow, C., Köhl, A., Kröger, J., Kruschke, T., Pinto, J. G., Pohlmann, H., Reyers, M., Schröder, M., Sienz, F., Timmreck, C., and Ziese, M.: MiKlip: A National Research Project on Decadal Climate Prediction, B. Am. Meteorol. Soc., 97, 2379–2394,, 2016. a

Mauritsen, T., Bader, J., Becker, T., Behrens, J., Bittner, M., Brokopf, R., Brovkin, V., Claussen, M., Crueger, T., Esch, M., Fast, I., Fiedler, S., Fläschner, D., Gayler, V., Giorgetta, M., Goll, D. S., Haak, H., Hagemann, S., Hedemann, C., Hohenegger, C., Ilyina, T., Jahns, T., Jimenéz-de-la Cuesta, D., Jungclaus, J., Kleinen, T., Kloster, S., Kracher, D., Kinne, S., Kleberg, D., Lasslop, G., Kornblueh, L., Marotzke, J., Matei, D., Meraner, K., Mikolajewicz, U., Modali, K., Möbis, B., Müller, W. A., Nabel, J. E. M. S., Nam, C. C. W., Notz, D., Nyawira, S.-S., Paulsen, H., Peters, K., Pincus, R., Pohlmann, H., Pongratz, J., Popp, M., Raddatz, T. J., Rast, S., Redler, R., Reick, C. H., Rohrschneider, T., Schemann, V., Schmidt, H., Schnur, R., Schulzweida, U., Six, K. D., Stein, L., Stemmler, I., Stevens, B., Storch, J.-S., Tian, F., Voigt, A., Vrese, P., Wieners, K.-H., Wilkenskjeld, S., Winkler, A., and Roeckner, E.: Developments in the MPI-M Earth System Model Version 1.2 (MPI-ESM1.2) and Its Response to Increasing CO2, J. Adv. Model. Earth Sy., 11, 998–1038,, 2019. a

Meehl, G. A., Goddard, L., Murphy, J., Stouffer, R. J., Boer, G., Danabasoglu, G., Dixon, K., Giorgetta, M. A., Greene, A. M., Hawkins, E., Hegerl, G., Karoly, D., Keenlyside, N., Kimoto, M., Kirtman, B., Navarra, A., Pulwarty, R., Smith, D., Stammer, D., and Stockdale, T.: Decadal Prediction: Can It Be Skillful?, B. Am. Meteorol. Soc., 90, 1467–1486,, 2009. a, b

Merryfield, W. J., Baehr, J., Batté, L., Becker, E. J., Butler, A. H., Coelho, C. A. S., Danabasoglu, G., Dirmeyer, P. A., Doblas-Reyes, F. J., Domeisen, D. I. V., Ferranti, L., Ilynia, T., Kumar, A., Müller, W. A., Rixen, M., Robertson, A. W., Smith, D. M., Takaya, Y., Tuma, M., Vitart, F., White, C. J., Alvarez, M. S., Ardilouze, C., Attard, H., Baggett, C., Balmaseda, M. A., Beraki, A. F., Bhattacharjee, P. S., Bilbao, R., de Andrade, F. M., DeFlorio, M. J., Díaz, L. B., Ehsan, M. A., Fragkoulidis, G., Grainger, S., Green, B. W., Hell, M. C., Infanti, J. M., Isensee, K., Kataoka, T., Kirtman, B. P., Klingaman, N. P., Lee, J.-Y., Mayer, K., McKay, R., Mecking, J. V., Miller, D. E., Neddermann, N., Justin Ng, C. H., Ossó, A., Pankatz, K., Peatman, S., Pegion, K., Perlwitz, J., Recalde-Coronel, G. C., Reintges, A., Renkl, C., Solaraju-Murali, B., Spring, A., Stan, C., Sun, Y. Q., Tozer, C. R., Vigaud, N., Woolnough, S., and Yeager, S.: Current and Emerging Developments in Subseasonal to Decadal Prediction, B. Am. Meteorol. Soc., 101, 869–896,, 2020. a

Milinski, S., Bader, J., Haak, H., Siongco, A. C., and Jungclaus, J. H.: High Atmospheric Horizontal Resolution Eliminates the Wind-Driven Coastal Warm Bias in the Southeastern Tropical Atlantic, Geophys. Res. Lett., 43, 10455–10462,, 2016. a

Orr, J. C., Najjar, R. G., Aumont, O., Bopp, L., Bullister, J. L., Danabasoglu, G., Doney, S. C., Dunne, J. P., Dutay, J.-C., Graven, H., Griffies, S. M., John, J. G., Joos, F., Levin, I., Lindsay, K., Matear, R. J., McKinley, G. A., Mouchet, A., Oschlies, A., Romanou, A., Schlitzer, R., Tagliabue, A., Tanhua, T., and Yool, A.: Biogeochemical protocols and diagnostics for the CMIP6 Ocean Model Intercomparison Project (OMIP), Geosci. Model Dev., 10, 2169–2199,, 2017. a

Park, J.-Y., Stock, C. A., Yang, X., Dunne, J. P., Rosati, A., John, J., and Zhang, S.: Modeling Global Ocean Biogeochemistry With Physical Data Assimilation: A Pragmatic Solution to the Equatorial Instability, J. Adv. Model. Earth Sy., 10, 891–906,, 2018. a, b

Park, J.-Y., Stock, C. A., Dunne, J. P., Yang, X., and Rosati, A.: Seasonal to Multiannual Marine Ecosystem Prediction with a Global Earth System Model, Science, 365, 284–288,, 2019. a

Paulsen, H., Ilyina, T., Six, K. D., and Stemmler, I.: Incorporating a Prognostic Representation of Marine Nitrogen Fixers into the Global Ocean Biogeochemical Model HAMOCC, J. Adv. Model. Earth Sy., 9, 438–464,, 2017. a

Pohlmann, H., Jungclaus, J. H., Köhl, A., Stammer, D., and Marotzke, J.: Initializing Decadal Climate Predictions with the GECCO Oceanic Synthesis: Effects on the North Atlantic, J. Climate, 22, 3926–3938,, 2009. a, b

Pohlmann, H., Müller, W. A., Bittner, M., Hettrich, S., Modali, K., Pankatz, K., and Marotzke, J.: Realistic Quasi-Biennial Oscillation Variability in Historical and Decadal Hindcast Simulations Using CMIP6 Forcing, Geophys. Res. Lett., 46, 14118–14125,, 2019. a, b

Rast, S., Brokopf, R., Esch, M., Gayler, V., Kirchner, I., Kornblueh, L., Rhodin, A., and Schulzweida, U.: User Manual for ECHAM6, Tech. rep., available at:, (last access: 2 March 2021), Max Planck Institute for Meteorology, Hamburg, 2012. a

Saito, M., Ito, A., and Maksyutov, S.: Evaluation of Biases in JRA-25/JCDAS Precipitation and Their Impact on the Global Terrestrial Carbon Balance, J. Climate, 24, 4109–4125,, 2011. a

Schneck, R., Reick, C. H., and Raddatz, T.: Land Contribution to Natural CO2 Variability on Time Scales of Centuries, J. Adv. Model. Earth Sy., 5, 354–365,, 2013. a

Schneider, T. and Griffies, S. M.: A Conceptual Framework for Predictability Studies, J. Climate, 12, 3133–3155,, 1999. a

Séférian, R., Bopp, L., Gehlen, M., Swingedouw, D., Mignot, J., Guilyardi, E., and Servonnat, J.: Multiyear Predictability of Tropical Marine Productivity, P. Natl. Acad. Sci. USA, 111, 11646–11651,, 2014. a

Séférian, R., Berthet, S., and Chevallier, M.: Assessing the Decadal Predictability of Land and Ocean Carbon Uptake, Geophys. Res. Lett., 45, 2455–2466,, 2018. a

Servonnat, J., Mignot, J., Guilyardi, E., Swingedouw, D., Séférian, R., and Labetoulle, S.: Reconstructing the Subsurface Ocean Decadal Variability Using Surface Nudging in a Perfect Model Framework, Clim. Dynam., 44, 315–338,, 2015. a, b, c, d

Spring, A.: Reproducibility repository for “Spring, A., Dunkl, I., Li, H., Brovkin, V., & Ilyina, T. (in press). Trivial improvements in predictive skill due to direct reconstruction of global carbon cycle. Earth System Dynamics”, available at:, last access: 2 November 2021. a

Spring, A. and Ilyina, T.: Predictability Horizons in the Global Carbon Cycle Inferred From a Perfect-Model Framework, Geophys. Res. Lett., 47, e2019GL085311,, 2020. a, b, c, d, e, f, g, h

Spring, A., Ilyina, T., and Marotzke, J.: Inherent Uncertainty Disguises Attribution of Reduced Atmospheric CO2 Growth to CO2 Emission Reductions for up to a Decade, Environ. Res. Lett., 15, 114058,, 2020. a, b

Stevens, B., Giorgetta, M., Esch, M., Mauritsen, T., Crueger, T., Rast, S., Salzmann, M., Schmidt, H., Bader, J., Block, K., Brokopf, R., Fast, I., Kinne, S., Kornblueh, L., Lohmann, U., Pincus, R., Reichler, T., and Roeckner, E.: Atmospheric Component of the MPI-M Earth System Model: ECHAM6, J. Adv. Model. Earth Sy., 5, 146–172,, 2013. a

Toggweiler, J. R., Dixon, K., and Bryan, K.: Simulations of Radiocarbon in a Coarse-Resolution World Ocean Model: 1. Steady State Prebomb Distributions, J. Geophys. Res.-Oceans, 94, 8217–8242,, 1989. a

van den Hurk, B., Kim, H., Krinner, G., Seneviratne, S. I., Derksen, C., Oki, T., Douville, H., Colin, J., Ducharne, A., Cheruy, F., Viovy, N., Puma, M. J., Wada, Y., Li, W., Jia, B., Alessandri, A., Lawrence, D. M., Weedon, G. P., Ellis, R., Hagemann, S., Mao, J., Flanner, M. G., Zampieri, M., Materia, S., Law, R. M., and Sheffield, J.: LS3MIP (v1.0) contribution to CMIP6: the Land Surface, Snow and Soil moisture Model Intercomparison Project – aims, setup and expected outcome, Geosci. Model Dev., 9, 2809–2832,, 2016.  a

Wilks, D. S.: Statistical Methods in the Atmospheric Sciences, vol. 91 of International Geophysics Series, 2nd edn., Academic Press, Amsterdam, Boston, ISBN 13: 978-0-12-751966-1, ISBN 10: 0-12-751966-1, 2006. a

Zhang, S., Harrison, M. J., Rosati, A., and Wittenberg, A.: System Design and Evaluation of Coupled Ensemble Data Assimilation for Global Oceanic Climate Studies, Mon. Weather Rev., 135, 3541–3564,, 2007. a, b, c

Zhu, J. and Kumar, A.: Influence of Surface Nudging on Climatological Mean and ENSO Feedbacks in a Coupled Model, Clim. Dynam., 50, 571–586,, 2018. a, b, c, d

Short summary
Numerical carbon cycle prediction models usually do not start from observed carbon states due to sparse observations. Instead, only physical climate is reconstructed, assuming that the carbon cycle follows indirectly. Here, we test in an idealized framework how well this indirect and direct reconstruction with perfect observations works. We find that indirect reconstruction works quite well and that improvements from the direct method are limited, strengthening the current indirect use.
Final-revised paper