Comment on esd-2021-4


global carbon cycle" by Spring et al. describe results from reconstruction simulations in which a model is nudged toward its own output for atmospheric physics, ocean physics, and the carbon cycle, as well as perfect-predictability experiments with the original model and these reconstructions. I like this paper very much as a detailed investigation into the limits of ingesting "data" into a model. The text is extremely dense, however: many concepts and model limitations are discussed at once, with a focus on biases but without any description of the mean state, so I had to reread sections several times before I was able to put the pieces together. For example, the degradation of the ITCZ and Southern Ocean winds only becomes clear once one has put together a mental map of the base state. It might help to show the base simulation of each parameter in Figure 1 as a new Figure 1, or to provide it as supplementary material. The language was also cumbersome: vague words such as "indirect", "direct" and "reconstruction" are used where descriptive terms like "physically nudged", "physically nudged atmosphere" and "physically and biogeochemically nudged" would have worked. I am guessing that there is a literature precedent for this choice of terms, but it made the early parts of the manuscript difficult to keep track of. The discussion of Figure 2 is incomplete and extends out through Figure 5. The conclusions neglect opportunities for future investigation. Rather than being satisfied with "We conclude that the indirect carbon cycle reconstruction serves its purpose," it would be much more productive to point out what alternatives to nudging might provide superior options for future work.
It should also be noted in the conclusion that the present work does not address the potential role of structural uncertainty: ecosystems may be more complex than represented in the current model and may thus need external constraint, which would provide a potential advantage to "direct initialization".
Technical comments:
p1, ln8 -"We nudge variables from this target onto arbitrary initial conditions 150 years later mimicking an assimilation simulation generating initial conditions for hindcast experiments of prediction systems" I don't understand how this process works from this description. Instead, it sounds like the authors "nudged variables towards simulations from the same run 150 years earlier" to create a reconstruction of the target dataset. There is also a comma missing after "later".
P1 ln12 -I don't quite understand the distinction between "direct reconstruction" and "indirect reconstruction". It is not defined in the abstract.
Abstract overall -I think this is the longest abstract I have ever seen, yet it only describes concepts vaguely. I recommend the authors strip out the details of internally defined distinctions and spend more time on the implications of "We find improvements in global carbon cycle predictive skill from direct reconstruction compared to indirect reconstruction. After correcting for mean bias, indirect and direct reconstruction both predict the target similarly well and only moderately worse than perfect initialization after the first lead year."
Ln 41 -"where the forecast is started from" is redundant.
Ln 55 -This sentence is a tautology: "In this perfect-model target reconstruction framework, we have perfect knowledge about the ground truth and a perfect model"
Ln 58 -"Originally"? A reference should be provided for the early work that is being invoked.
Ln 60 and 61 -This appears to be describing results and conclusions of the present work. References should be provided to establish the literature context (as is done on ln 62).

Ln63 -comma needed after "change"
Ln 65 -How do you know about these "severe consequences"? What is the citation? I know that this problem is discussed in the following, but there must be others:
Ln 91 -This is a strange justification. One could make the same argument for N2 or O2… presumably the reason for focusing on carbon has more to do with relevance to society. Is the question being answered why land and ocean are being treated together? If so, perhaps "We focus on the combined ocean and land aspects of the carbon cycle because this allows us to explore the implications of flux predictability for atmospheric CO2 as a well-mixed greenhouse gas."
Ln 123 -", when also" should be "when"
Ln 221 -I believe "also" should be "and"
Ln 244-245 -"dominated by the bias of pCO2" instead of the bias in temperature?
Ln 248 -The description of this figure suddenly stops without addressing the XCO2 panels.
Ln 299 -I believe "than" is intended after "larger"
Ln 302 -Not sure why this sentence has its own paragraph.
Ln 347 -It is only here, after Figure 5 is presented, that I get to find out why Figure 2o looks so much like Figure 2m. If I understand correctly, it is a coincidence: Figure 2m is high because the surface temperature is high, while Figure 2o is high because the land releases CO2 over the course of the year due to the climate mismatch. A statement to this effect near Ln 248, before moving on to Figure 3, would help orient the reader.
Ln 352 -"direction" should be "direct"
Ln 362 -I believe "also" should be "even"
Ln 365 -"measuring" should be "measured by"
Ln 407 -"but below the initialized" is unclear. Is "but drifts slightly below the initialized value over the course of the simulation" intended?
Ln 422 -"For a real-world application, our direct land carbon reconstruction method cannot be used." I disagree with this statement and suggest changing "cannot" to "should not". The easiest form of data assimilation for land would be to simply overwrite the vegetation biomass periodically from a satellite product, something very similar in principle to what is being done here. I think the more interesting question answered here is why that is a bad idea. This point is very much worth making as satellite products become more diverse and land initialization approaches are considered.
Ln 424 -This conclusion appears to be the crux of the paper: that the nudging technique introduces such large biases in the climate mean state as to make the "direct" approach incompatible with the original model. I am not an expert on physical data assimilation, but isn't that the reason that an ensemble Kalman filter is used rather than nudging? Would one expect these other techniques, which do not shift the ITCZ or dampen Southern Ocean winds, to also find a "trivial" role for BGC initialization?
Ln 457 -Rather than being satisfied with "We conclude that the indirect carbon cycle reconstruction serves its purpose," it would be much more productive to point out what alternatives to nudging might provide superior options for future work. It should also be noted in the conclusion that the present work does not address the potential for structural uncertainty to provide an advantage to "direct initialization".
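
For readers less familiar with the nudging discussed throughout these comments, the basic idea can be sketched on a toy system. This is a minimal illustration only: it assumes a scalar AR(1) process in place of the Earth system model and a simple relaxation term in place of the authors' nudging setup. The function names (`run_free`, `run_nudged`, `rmse`) and all parameter values are hypothetical and are not taken from the manuscript.

```python
import numpy as np


def run_free(n_steps, x0, a=0.95, sigma=0.1, seed=0):
    """Free-running toy model: x[t+1] = a * x[t] + noise[t]."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        x[t + 1] = a * x[t] + noise[t]
    return x


def run_nudged(target, x0, strength=0.5, a=0.95, sigma=0.1, seed=0):
    """Same toy model, relaxed toward `target` after every model step.

    `strength` is the fraction of the model-minus-target mismatch that is
    removed per step (0 = no nudging, 1 = overwrite with the target).
    """
    n_steps = len(target) - 1
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma, n_steps)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        forecast = a * x[t] + noise[t]  # unconstrained model step
        x[t + 1] = forecast + strength * (target[t + 1] - forecast)
    return x


def rmse(x, y):
    """Root-mean-square difference between two equally long series."""
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(y)) ** 2)))


# "Target" run, plus two attempts to reproduce it from a different initial
# state and a different realization of internal variability.
target = run_free(500, x0=0.0, seed=1)
free = run_free(500, x0=5.0, seed=2)
nudged = run_nudged(target, x0=5.0, seed=2)

print("RMSE free run vs target:  ", rmse(free, target))
print("RMSE nudged run vs target:", rmse(nudged, target))
```

With these settings the nudged run should track the target more closely than the free run started from the same offset initial state, and setting `strength=0.0` recovers the free run. The toy has a single variable, so it cannot show the mean-state biases in un-nudged variables that the manuscript reports; it only illustrates the relaxation step itself.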