Evaluation of terrestrial pan-Arctic carbon cycling using a data-assimilation system

López-Blanco, Efrén; Exbrayat, Jean-François; Lund, Magnus; Christensen, Torben R.; Tamstorf, Mikkel P.; Slevin, Darren; Hugelius, Gustaf; Bloom, Anthony A.; Williams, Mathew

doi:https://doi.org/10.5194/esd-10-233-2019

Articles | Volume 10, issue 2

© Author(s) 2019. This work is distributed under
the Creative Commons Attribution 4.0 License.


Research article | 24 Apr 2019


Download

Final revised paper (published on 24 Apr 2019)
Supplement to the final revised paper
Preprint (discussion started on 22 May 2018)
Supplement to the preprint

Interactive discussion

Status: closed

AC: Author comment | RC: Referee comment | SC: Short comment | EC: Editor comment


RC1: 'Good idea - but needs improvements', Matthias Forkel, 20 Jun 2018
- AC1: 'Reply to Reviewer', Efrén López Blanco, 24 Oct 2018
RC2: 'Review of López-Blanco 2018', Anonymous Referee #2, 07 Aug 2018
- AC2: 'Reply to Reviewer', Efrén López Blanco, 24 Oct 2018

Peer-review completion

AR: Author's response | RR: Referee report | ED: Editor decision

ED: Reconsider after major revisions (05 Nov 2018) by Ning Zeng

AR by Efrén López-Blanco on behalf of the Authors (07 Nov 2018) Author's response Manuscript

ED: Referee Nomination & Report Request started (08 Nov 2018) by Ning Zeng

RR by Matthias Forkel (19 Nov 2018)

RR by Anonymous Referee #2 (06 Feb 2019)

Suggestions for revision or reasons for rejection

I’m going to be upfront that I’m very torn about what to recommend with respect to this paper. On the one hand, I acknowledge the incredible amount of work that went into this project and believe that there is important and interesting science coming out of it. On the other hand, based on the responses to questions raised, it is now clear that there are things here that I don’t think were done correctly. What complicates this is that many of the things done wrong (especially with respect to model process error) were also done wrong in previous papers on the Bayesian calibration of terrestrial carbon models (both by this team and others). This helps explain such mistakes, but it doesn’t justify them, and I worry that continuing to allow papers to make the same mistakes just perpetuates the situation.

The crux of the issue is really in how the authors are treating the error term in their likelihood. First, they are ascribing 100% of the error as coming from the observations, and not acknowledging (statistically) that their model is imperfect (though their own Results and Discussion clearly demonstrate that the model is far from perfect). By incorrectly ascribing 100% of the error to observations, and none to process error (model misspecification, stochastic events, unaccounted-for heterogeneity), the authors are also missing that (unlike observation error) process error propagates forward into model predictions. This means that modeled fluxes and pools are going to be consistently overconfident by an unknown (but potentially nontrivial) amount.

Second, not only do the authors ascribe all the error to observations, but they treat that observation error as a known parameter, despite acknowledging that the data products used don’t have error estimates. This is a significant departure from standard statistical modeling, where the variance is an unknown fit parameter. For example, when you fit a linear regression the model has three unknown parameters (slope, intercept, sigma) and sigma is virtually never treated as an a priori known quantity. While treating sigma as known shouldn’t have large effects on the mean values of the model parameters (though this is far from guaranteed when dealing with nonlinear models; Jensen’s Inequality), more important is that it can have a real effect on the uncertainties about the model parameters. By subjectively choosing the observation error, one is also subjectively choosing the confidence intervals on the parameters. And since in CARDAMOM the only uncertainties that are included in predictions are parameter uncertainties, this also means you are subjectively choosing the uncertainty in the predictive confidence intervals.

Ideally, these models should be refit including an unknown, fitted model process error, and then that process error should be propagated into predictions/hindcasts. This process error ideally should also be in addition to, not instead of, an observation error (which may not be known, but may have an informative prior on it).
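A minimal sketch of the sigma argument, assuming an ordinary linear regression on synthetic data rather than DALEC2 or CARDAMOM. The function name `slope_standard_error`, the synthetic data and the two "assumed" sigma values are all hypothetical and chosen only for illustration. The parameter covariance is sigma² (XᵀX)⁻¹, so the reported uncertainty on the slope scales directly with whatever sigma is plugged in.

```python
import numpy as np


def slope_standard_error(x, y, sigma=None):
    """Return (standard error of the OLS slope, sigma used).

    If sigma is given, it is treated as known a priori.
    If sigma is None, it is estimated from the residuals (n - 2 dof).
    """
    X = np.column_stack([np.ones_like(x), x])  # intercept and slope columns
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    if sigma is None:
        resid = y - X @ beta
        sigma = np.sqrt(resid @ resid / (len(y) - 2))
    cov = sigma**2 * np.linalg.inv(X.T @ X)  # parameter covariance matrix
    return float(np.sqrt(cov[1, 1])), float(sigma)


# Synthetic data (hypothetical): true intercept 1.0, slope 0.5, sigma 2.0
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 1.0 + 0.5 * x + rng.normal(0.0, 2.0, size=x.size)

se_fitted, sigma_hat = slope_standard_error(x, y)        # sigma estimated
se_small, _ = slope_standard_error(x, y, sigma=0.5)      # sigma assumed too small
se_large, _ = slope_standard_error(x, y, sigma=4.0)      # sigma assumed too large

print(f"fitted sigma: {sigma_hat:.2f}, slope SE: {se_fitted:.3f}")
print(f"assumed sigma 0.5, slope SE: {se_small:.3f}")
print(f"assumed sigma 4.0, slope SE: {se_large:.3f}")
```

The slope estimate itself is identical in all three cases; only its reported uncertainty changes, in proportion to the assumed sigma. In a nonlinear model the effect need not be this clean, which is the concern raised above.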

Additional points of concern:

1) Neither the DALEC2 model nor the CARDAMOM system appears to be publicly archived. This means this work can’t be reproduced or expanded upon by others. I don’t know if such lack of openness is within the letter of the law of this journal, but it’s definitely a deviation from the current norms of the community.

2) As noted in my original review, I’m not comfortable with this system being called data assimilation, at least not without some additional qualifier being added (e.g. “parameter data assimilation”) to make it clear that the outputs are deterministic model forward simulations, not a reanalysis. To me, calling this data assimilation is like calling linear regression “machine learning.” Sure, people do it, but it makes the term pretty meaningless.

3) After clearly diagnosing your photosynthesis scheme (ACM) as being at the root of model biases and compensating errors, the decision not to include any ACM parameters in the calibration (and to chalk the issue up to a lack of acclimation rather than simple miscalibration) strikes me as odd, and I cannot understand why the authors are digging in their heels on this.

4) Similar to (3), since NPP in DALEC is very tightly tied to GPP, and TT = Cstock/NPP, it sure seems like systematic biases in GPP will translate to systematic biases in TT. As noted earlier, I find some of the reported TT estimates to be implausible and don’t understand the authors’ resistance to even considering comparing their results to independent field estimates.

5) The differences between DALEC and observations are greater than the differences between DALEC and the ISI-MIP models, so why are the authors so hard on the ISI-MIP models?
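The arithmetic behind point 4 can be sketched as follows, assuming the steady-state definition TT = Cstock/NPP quoted there and holding the carbon stock fixed. The stock and flux values are hypothetical round numbers, not estimates from the paper, and in a calibrated model the stock may shift as well.

```python
def turnover_time(c_stock, npp):
    """Steady-state turnover time (years): stock divided by input flux."""
    if npp <= 0:
        raise ValueError("NPP must be positive")
    return c_stock / npp


c_stock = 5000.0   # hypothetical carbon stock, g C m-2
npp_true = 250.0   # hypothetical unbiased NPP, g C m-2 yr-1

# Multiplicative NPP bias (inherited from GPP) and the resulting TT
tt_by_bias = {
    bias: turnover_time(c_stock, npp_true * bias)
    for bias in (0.8, 1.0, 1.25)
}

for bias, tt in tt_by_bias.items():
    print(f"NPP bias x{bias}: TT = {tt:.1f} yr")
```

With the stock held fixed, a multiplicative bias b in NPP yields a bias of 1/b in TT, so an overestimated productivity makes the system appear to turn over faster.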

Detailed comments:

L60: The authors’ responses suggested that a more complex calculation of TT was actually performed that relaxed the assumption of steady state. I would include that here (along with the steady state calculation) as I suspect a number of readers (myself included) would prefer to know that you’re not relying on a steady state assumption to assess a system that’s clearly not in steady state.

L160: This line refers to DALEC2 as an ‘intermediate complexity’ model, but later arguments actually hinge on it being a simple model, and most of us would consider DALEC to really be on the simple end of the process-model spectrum.

L171: MODIS LAI reports an uncertainty estimate. How did you aggregate those uncertainties when aggregating the observations? This is nontrivial as neither the MODIS products nor the MODIS LAI validation papers report anything about the spatial or temporal autocorrelation in the product’s errors.

L188: Table S2 looks like it just contains a bunch of uniform priors for all other parameters. I think that should be stated here so that readers don’t need to find the supplement to learn that. It’s perfectly fair, however, to make readers go to the supplement to see the exact numerical values of the priors.

L194: This sentence states that MODIS doesn’t report an uncertainty estimate, but that’s not accurate.

L206: I’m concerned about the way the statistics are being reported here. For example, the RMSE of a model is traditionally based on the model error (difference between the model and the observations). Here, the authors are defining the model’s RMSE as the RMSE after applying both a multiplicative and additive bias correction (i.e. the predicted/observed regression). Similarly, the R2 isn’t the variance explained by the model, but the variance jointly explained by the model and a linear bias correction to that model. This results in a very optimistic view of the model’s actual performance.
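The difference described in the L206 comment can be illustrated with a short sketch on synthetic data. The bias values (a factor of 1.5 and an offset of 2) and the helper names `rmse` and `r2_variance_explained` are hypothetical. It compares statistics computed on the raw model output with the same statistics computed after a fitted linear (multiplicative and additive) correction, assuming the correction comes from regressing observed on predicted.

```python
import numpy as np


def rmse(pred, obs):
    """Root-mean-square error of predictions against observations."""
    return float(np.sqrt(np.mean((pred - obs) ** 2)))


def r2_variance_explained(pred, obs):
    """1 - SSE/SST using the predictions as given (no refitting)."""
    sse = np.sum((obs - pred) ** 2)
    sst = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - sse / sst)


# Synthetic observations and a hypothetical biased model
rng = np.random.default_rng(1)
obs = rng.uniform(0.0, 10.0, size=200)
pred = 2.0 + 1.5 * obs + rng.normal(0.0, 0.5, size=obs.size)

# Linear bias correction: regress observed on predicted
slope, intercept = np.polyfit(pred, obs, 1)
corrected = intercept + slope * pred

raw_rmse, raw_r2 = rmse(pred, obs), r2_variance_explained(pred, obs)
corr_rmse, corr_r2 = rmse(corrected, obs), r2_variance_explained(corrected, obs)

print(f"raw model:       RMSE = {raw_rmse:.2f}, R2 = {raw_r2:.2f}")
print(f"after bias fit:  RMSE = {corr_rmse:.2f}, R2 = {corr_r2:.2f}")
```

Because least squares minimizes the error over all linear corrections, including "no correction", the corrected RMSE can never exceed the raw one. Reporting only the corrected statistics therefore hides any systematic bias in the model itself.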

L251: Just want to continue to express my skepticism about some of these pool and flux estimates. For example, in my own experiences in Alaska, the boreal forest has WAY more than 160% more structural tissue than the tundra. There needs to be some independent plot-scale validation of this.

L258: Likewise, this stem turnover time seems much too fast and needs independent validation. I understand that grid cell to plot- or plant-scale validation isn’t perfect, but it’s better to report the performance explicitly, and then cushion it based on possible scale mismatch, rather than to ignore whether these estimates are consistent with prior research.

L294: typo on “uncertainties”

L313: It would be good to have some sort of quantification of spatial coherence beyond RMSE & R2 (which are nonspatial). Look to the GIS and remote sensing literature for examples of what sort of statistics are available to do this.

L328: Don’t introduce new Methods in the Results. Please document what this analysis is and why you are doing it earlier in the paper.

L378: Consistent with my previous concerns, DALEC appears to be running too fast. That said, this is still a comparison to other models, not to data.

L391: Here you say you had a ‘strong prior on photosynthesis’ but as far as I can tell the photosynthetic parameters were fixed at defaults, not assigned priors. According to Eqn 2, the only two parameters assigned non-uniform priors were canopy efficiency (which in Tables 2 and S2 is labeled as a phenology parameter) and autotrophic respiration.

L397: If you’ve demonstrated a bias in your photosynthetic model, I’m not sure I agree that this could be resolved with more precise data if you’re not updating the parameters in the photosynthetic submodel

L427: I fundamentally disagree that models should be benchmarked against highly-derived, model-based data products. But this isn’t the central point of the paper and thus I won’t hold up this paper over that disagreement.

L459: While it’s true that brute-force MCMC is not feasible for complex models, there are other options available that do work with larger models, such as emulators (Fer et al. 2018, Biogeosciences) and ensemble or particle filters.

L477: For the record, if you didn’t fit every grid cell independently then you wouldn’t need to upscale/interpolate field observations.

L495: Where are the DALEC2 and CARDAMOM code repositories?

Table 2: I find it interesting that, given the paper’s focus on turnover times, turnover parameters are the least constrained part of the model.


ED: Reconsider after major revisions (07 Feb 2019) by Ning Zeng

AR by Efrén López-Blanco on behalf of the Authors (14 Mar 2019) Author's response Manuscript

ED: Publish as is (23 Mar 2019) by Ning Zeng

AR by Efrén López-Blanco on behalf of the Authors (31 Mar 2019) Author's response Manuscript


Short summary

The terrestrial CO₂ exchange in Arctic ecosystems plays an important role in the global carbon cycle and is particularly sensitive to the ongoing warming experienced in recent years. To improve our understanding of the atmosphere–biosphere interplay, we evaluated the state of the terrestrial pan-Arctic carbon cycling using a promising data assimilation system in the first 15 years of the 21st century. This is crucial when it comes to making predictions about the future state of the carbon cycle.