Proxy records from Greenland ice cores have been studied for several decades,
yet many open questions remain regarding the climate variability encoded
therein. Here, we use a Bayesian framework for inferring inverse,
stochastic–dynamic models from

Data-driven stochastic difference equation models have recently been successfully applied to a wide range of climatic phenomena

In general terms, stochastic–dynamic models are derived by approximating the discrete-time divided differences of observed time series
by a deterministic function

In the present study, we are specifically interested in fitting low-dimensional stochastic–dynamic models to high-resolution time
series of two paleoclimatic proxy records, namely the

In contrast, we choose here a data-driven, stochastic–dynamic approach: we intend to find a system of stochastic differential equations
(SDEs) to simulate time series that reproduce the statistical and dynamical properties of the observed

Given a multivariate, low-dimensional time series

The temporal evolution of both the

This bistability suggests using a system of two coupled SDEs with a double-well potential as a model of the processes generating the
two time series of

Here

Time series simulated by our empirical model will be compared to the original time series in terms of statistical properties, such as the probability density functions (PDFs) of the time series, their power spectra and the average waiting time between sharp transitions from stadials to interstadials. Furthermore, we will test the relevance of the different model ingredients, such as the nonlinear terms, the memory and the coupling terms, using Bayesian model selection criteria.
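The three diagnostics named above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure used in this study: the function names, the histogram bin count and the raw-periodogram normalisation are our own choices, and a simple upward threshold crossing stands in for whatever criterion is used to identify sharp stadial-to-interstadial transitions.

```python
import numpy as np

def empirical_pdf(x, bins=50):
    """Histogram-based density estimate; returns bin centers and densities."""
    density, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density

def periodogram(x, dt=1.0):
    """Raw one-sided periodogram of the mean-removed series.

    The scaling |FFT|^2 * dt / N is one common convention (an assumption here).
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    freqs = np.fft.rfftfreq(x.size, d=dt)
    power = np.abs(np.fft.rfft(x)) ** 2 * dt / x.size
    return freqs, power

def mean_waiting_time(x, threshold, dt=1.0):
    """Mean time between successive upward crossings of `threshold`.

    Hypothetical stand-in for a proper transition-detection criterion.
    """
    above = np.asarray(x) > threshold
    # Index of the first sample above the threshold after a sample below it.
    ups = np.flatnonzero(~above[:-1] & above[1:]) + 1
    if ups.size < 2:
        return float("nan")
    return float(np.mean(np.diff(ups)) * dt)
```

Applying the same three functions to the observed and to each simulated series gives directly comparable summaries. For noisy records, the threshold-crossing count is sensitive to high-frequency fluctuations, so some smoothing or a hysteresis band would probably be needed in practice.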

The general potential of Bayesian parameter inference such as MLE has recently been discussed for stochastic–dynamic climate models
from incomplete data

Time series of

We employ proxy records of

Dating uncertainties were reported for each measurement of the raw data, and they accumulate toward the more remote past, as
a consequence of the layer-counting procedure, which starts from the top of the core

The

The raw time series (blue curves in Fig.

It has recently been shown in general terms how to approximate the GLE (

Compared to the GLE (

In practical applications, the SDE (

In contrast to all other model parameters, the values for

The explicit coupled SDE system governing our stochastic–dynamic model is hence given by

Recently,

Following, e.g.,

Note that the functional form of the likelihood function in Eq. (

We remark that the approximation of the divided differences

On the other hand, the MLE is asymptotically optimal. In particular, it is asymptotically efficient in the sense that the variance of the parameter
estimates attains the Cramér–Rao lower bound as the number of samples tends to infinity
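To make the link between divided differences and likelihood maximization concrete, the following sketch fits a scalar cubic drift to a time series. It assumes independent Gaussian residuals, in which case the maximum-likelihood coefficients coincide with the ordinary least-squares solution. The function name `fit_cubic_drift` and the scalar, memoryless setting are illustrative assumptions; the coupled model with memory terms considered in this study is not reproduced here.

```python
import numpy as np

def fit_cubic_drift(x, dt):
    """Fit (x[n+1] - x[n]) / dt ~ a0 + a1*x + a2*x^2 + a3*x^3.

    Assumes i.i.d. Gaussian residuals, so least squares equals the MLE.
    Returns the coefficients, the MLE of the residual variance and the
    maximized Gaussian log-likelihood.
    """
    x = np.asarray(x, dtype=float)
    dxdt = np.diff(x) / dt                        # divided differences
    A = np.vander(x[:-1], N=4, increasing=True)   # columns: 1, x, x^2, x^3
    coeffs, *_ = np.linalg.lstsq(A, dxdt, rcond=None)
    resid = dxdt - A @ coeffs
    sigma2 = float(np.mean(resid ** 2))           # MLE divides by n, not n - k
    n = dxdt.size
    loglik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return coeffs, sigma2, loglik
```

The returned log-likelihood is the quantity that enters information criteria such as AICc and BIC when different model versions are compared.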

Simulated

Statistical properties of the observed and simulated

As a training set for the parameter optimization of our stochastic–dynamic model in Eq. (

We use the natural logarithm of the dust time series (i) because of the large range of dust concentration values and (ii) because it
has high co-variability with

Our results indicate that for

The specific values of the coefficients of Eq. (

In order to simulate optimal sample time series for the

Illustrative time series of

The means and standard deviations of the observed time series are reproduced well by the simulations: for the preprocessed

The simulated time series (Fig.

Owing to the sawtooth-shaped transitions between stadials and interstadials, the observed time series are not symmetric under time
reversal. A quantitative measure of the time-reversal asymmetry is provided by the third-order moment

Third-order statistical moment

Sample-size-corrected Akaike information criteria (AICc) and Bayesian information criteria (BIC) for the different model
versions. Note that the model parameters include the standard deviations and correlation that appear in the respective model's noise term. For the
models with memory, we chose

Summary statistics for different memory step sizes

Given the very strong correlations

Following

For the dust concentrations (Fig.

The PDFs of the simulated

The spectra of both the

We studied the

The main features of our inverse model are (i) the nonlinear terms in the Markovian part, (ii) the inclusion of non-Markovian memory
terms and (iii) the coupling terms between the two time series. Cubic terms have previously been used to model the

The nonlinear terms can be physically motivated by the fact that the observed time series oscillate between two quasi-equilibria, namely the stadials and interstadials. If only linear terms were used, the bimodality of the observed time series, and hence the existence of two quasi-stable states, could not be reproduced (see Figs. A2 and A3).
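The role of the cubic term can be illustrated with a toy example. The sketch below integrates two scalar SDEs with the Euler–Maruyama scheme: one with the double-well drift x - x^3 and one with the linear drift -x. All coefficients, the noise amplitude and the step size are hypothetical values chosen for illustration; they are not the fitted parameters of the model in this study, and neither coupling nor memory terms are included.

```python
import numpy as np

def euler_maruyama(drift, x0, sigma, dt, n_steps, seed=0):
    """Integrate dx = drift(x) dt + sigma dW with the Euler-Maruyama scheme."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps) * sigma * np.sqrt(dt)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        x[n + 1] = x[n] + drift(x[n]) * dt + noise[n]
    return x

def mass_in_band(x, lo, hi):
    """Fraction of samples with lo <= |x| < hi."""
    a = np.abs(x)
    return float(np.mean((a >= lo) & (a < hi)))

def cubic(x):
    return x - x ** 3   # drift of the double-well potential V = x^4/4 - x^2/2

def linear(x):
    return -x           # drift of a single-well (Ornstein-Uhlenbeck) process

# Hypothetical settings: sigma = 0.5, dt = 0.01, 200000 steps.
x_cubic = euler_maruyama(cubic, 1.0, 0.5, 0.01, 200_000)
x_linear = euler_maruyama(linear, 0.0, 0.5, 0.01, 200_000)

for name, x in (("cubic", x_cubic), ("linear", x_linear)):
    print(name, mass_in_band(x, 0.0, 0.25), mass_in_band(x, 0.75, 1.25))
```

With the cubic drift, most of the probability mass sits near the two wells at x = -1 and x = 1 and the trajectory switches between them, whereas the linear model concentrates its mass around a single state at zero.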

The purely Markovian form of the model approximates the PDFs of the observed time series less well (Figs.

When removing all coupling terms between the two variables from the inverse-model Eq. (

Following previous authors

For the case at hand, both AICc and BIC consistently favor the full model, which includes nonlinear, memory and coupling terms. This
is followed by the linear coupled model with memory terms, the nonlinear coupled model without memory terms, the nonlinear model with
memory but without coupling terms, and finally the linear model without memory terms (Table

Note that the AICc penalizes higher numbers

We thus suggest interpreting the values in Table

We emphasize that the full model proposed herein has the highest number of parameters among the candidates but is still the one with the lowest AICc and BIC. Since both criteria penalize additional parameters, this suggests that the number of parameters is not too high and that the full model is unlikely to overfit the observed data.
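For reference, the two criteria can be computed from a maximized log-likelihood as in the sketch below, which uses the standard textbook definitions. The function names and the numbers in the usage note are illustrative assumptions and are not taken from the tables of this study.

```python
import math

def aicc(loglik, k, n):
    """Sample-size-corrected Akaike information criterion.

    loglik: maximized log-likelihood, k: number of fitted parameters,
    n: number of samples. Requires n > k + 1.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2.0 * k - 2.0 * loglik
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

def bic(loglik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * loglik."""
    return k * math.log(n) - 2.0 * loglik
```

For example, with made-up values `loglik = -100`, `k = 3` and `n = 50`, `aicc` evaluates to 206 + 24/46 and `bic` to 200 + 3 ln 50. Lower values indicate the preferred model, and the values are only comparable across models fitted to the same data.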

Furthermore, it should be noted that the values of AICc and BIC can only be compared on the basis of the same underlying data. Since we
use a higher-resolution version of the NGRIP data as compared to the previous authors

As noted above, we chose for the memory step size

The model results presented here appear only in the high-resolution version of the NGRIP ice core record, which was originally sampled
every

We have shown that a coupled, two-dimensional stochastic–dynamic model with a cubic drift term and linear delay terms is capable of
reproducing the statistical properties of

Key ingredients for an accurate simulation of the observed time series are as follows.

High-resolution time series have to be used as training data, indicating that the high-frequency variability present in the
records plays a vital role for the overall evolution of the climate processes that generated the NGRIP ice core. Interpolation of the
raw data, which is sampled at depth intervals of

Cubic terms need to be included in the Markovian part of the model. This can be physically motivated by the presence of two
quasi-equilibria in the observed time series – the stadials and interstadials – that could not be modeled without two such
quasi-stable states in the underlying dynamical system. Cubic terms have already been considered in previous attempts to model the

Coupling terms between the

Non-Markovian terms that account for memory effects are helpful. Their inclusion allows the model to reproduce, to some extent, the
time-reversal asymmetry of the dust time series. The main contribution of the memory terms is, however, to
improve the average simulated waiting times between subsequent transitions from stadials to interstadials (cf. Fig. 2e) for

Our results demonstrate that the statistical characteristics of the roughly 40 ka long, high-resolution NGRIP time series of

We note, though, that

The predictive power of the proposed stochastic–dynamic model for the abrupt transitions from stadials to interstadials should be addressed in future work.

The high-resolution NGRIP data used in this study are available online at

The approach to data-driven stochastic–dynamic modeling taken here is rooted in the MZ formalism of statistical mechanics

Assume that

By orthogonally projecting Eq. (

Ergodic-type arguments show that the averaged part can in principle be learned from a time series, assuming the existence of a “nice”
invariant measure

Note that Eq. (

The coefficients of the explicit SDE system, obtained from MLE.

Probability density of the least-squares residuals for

Same as Fig. 2 in the main text but for the model without nonlinear terms.

Same as Fig. 3a and b in the main text but for the model without nonlinear terms.

Same as Fig. 2 in the main text but for the model without memory terms.

Same as Fig. 3a and b in the main text but for the model without memory terms.

Same as Fig. 2 in the main text but for the model without coupling terms. Note that a coevolution of the observed

Same as Fig. 3a and b in the main text but for the model without coupling terms.

MB, D-DR and AS provided the data. NB, MDC, MG, DK and HL conceived of the research. NB conducted the numerical analysis and prepared the manuscript. All authors discussed the results, drew conclusions and edited the manuscript.

The authors declare that they have no competing financial interests.

This research was initiated by a collaboration between D.-D. Rousseau and the late Sigfús Johnsen, to whose memory it is dedicated. N. Boers acknowledges funding by the Alexander von Humboldt Foundation and the German Federal Ministry for Education and Research. M. D. Chekroun, M. Ghil, D. Kondrashov and H. Liu acknowledge support by grant N00014-16-1-2073 from the Multidisciplinary University Research Initiative (MURI) of the Office of Naval Research and by National Science Foundation grant OCE-1243175. M. D. Chekroun and H. Liu also acknowledge support by National Science Foundation grants DMS-1616981 and DMS-1616450, respectively. D. Kondrashov also acknowledges support by the Government of the Russian Federation (agreement no. 14.Z50.31.0033 with the Institute of Applied Physics of RAS). This is LDEO contribution no. 8167. NGRIP is directed and organized by the Ice and Climate research group, Niels Bohr Institute, University of Copenhagen. It is supported by funding agencies in Denmark (FNU), Belgium (FNRS-CFB), France (IPEV and INSU/CNRS), Germany (AWI), Iceland (RannIs), Japan (MEXT), Sweden (SPRS), Switzerland (SNF) and the USA (NSF, Office of Polar Programs).

Edited by: Anders Levermann
Reviewed by: Takahito Mitsui and one anonymous referee