The current state of development and the prospects of the regional MiKlip decadal prediction system
for Europe are analysed. The MiKlip regional system consists of two 10-member hindcast
ensembles computed with the global coupled model MPI-ESM-LR downscaled for the European
region with COSMO-CLM to a horizontal resolution of 0.22
In recent years, interest in climate predictions on timescales from 1 year up to a decade
has increased in the climate science community, since this time span falls within the
planning horizon for a wide variety of decision makers (Meehl et al., 2009, 2014). A
large ensemble of initialized decadal hindcasts has been consolidated in a component of
the Coupled Model Intercomparison Project Phase 5 (CMIP5; Taylor et al., 2012), and the
number of studies aiming at decadal predictions has strongly increased in recent years
(for a review see Meehl et al., 2014). Typically, the North Atlantic is a key region for
decadal predictions and forecast skill is found for various quantities such as heat
content and sea surface temperature (e.g. Kröger et al., 2012; Yeager et al., 2012),
In the German research consortium MiKlip (
The MiKlip consortium is to our best knowledge the first institution worldwide which has established a decadal prediction system for the regional scale. With this aim, considerable efforts were made to downscale the global MPI-ESM hindcasts by developing and/or employing different regionalization techniques. After the second project phase, an exceptionally large ensemble was regionalized by dynamical downscaling with regional climate models. Although being computationally expensive, dynamical downscaling has many advantages compared to other downscaling methods. For example, all output variables are physically consistent in dynamically downscaled model runs, which is particularly important when using decadal predictions for impact modelling, hydrological simulations, or user-oriented parameters. Previous experiences reveal that a skill for regional decadal predictions exists but that the interpretation of the results is quite complex due to the non-linear relationship to the global prediction skill. For example, Mieruch et al. (2014) found rather heterogeneous predictive skill for precipitation and temperature over Europe in the baseline0 generation. The skill differs over space, season, variable, and lead time after initialization. However, a general feature is an improved model spread for precipitation in the downscaled hindcasts when compared to their global counterparts. A potential for predicting regional peak winds and wind energy potentials over central Europe several years ahead was identified in Haas et al. (2016) and Moemken et al. (2016). Specifically, they found highest skill scores for the first years after initialization. All the individual studies analysing the MiKlip prediction system consider different ensembles, variables, lead times, skill metrics, and/or downscaling and data pre-processing methods. Therefore, it is difficult to draw general conclusions regarding skill over Europe in the MiKlip decadal prediction system. In particular, an overall statement for the benefit of regionalization and thus for the prospects of a regional decadal prediction system is hardly possible so far. This motivated us to analyse both the global and the downscaled MiKlip ensemble with respect to different issues.
In this study, the decadal predictive skill for temperature, precipitation, and wind
speed over Europe is analysed for the baseline0 and baseline1 generations of the MiKlip
system. With this aim, we used the same methodologies for all three variables to ensure
comparability. Global MPI-ESM and downscaled hindcast ensembles are considered to address
the following four key questions:
Is there a potential for skilful regional decadal predictions in Europe? Does regional downscaling provide an added value for decadal predictions?
How does the regional decadal predictive skill depend on the ensemble size? How does the number of initializations affect the skill estimates?
The datasets used in this study are described in Sect. 2, followed by the methodologies
for data pre-processing and skill analysis in Sect. 3. The results for the four key
questions are shown in Sect. 4. A summary and discussion, as well as an outlook for
future work, are given in Sect. 5.
The analysed global hindcasts were simulated with the coupled model MPI-ESM in low
resolution (MPI-ESM-LR; Giorgetta et al., 2013). Its atmospheric component is based on
the ECHAM6 model (Stevens et al., 2013) with a horizontal resolution of T63 and
47 vertical levels, which is coupled to the MPI-OM (Jungclaus et al., 2013) with a
horizontal resolution of 1.5
The global hindcasts are dynamically downscaled to the EURO-CORDEX domain (Giorgi et al.,
2009; cf. Fig. 1) at a horizontal grid resolution of 0.22
CCLM modelling domain (EURO-CORDEX domain): Model orography and PRUDENCE regions. 1: British Isles, BI; 2: Iberian Peninsula, IP; 3: France, FR; 4: Mid-Europe, ME; 5: Scandinavia, SC; 6: Alps, AL; 7: Mediterranean, MD; and 8: eastern Europe, EA.
We evaluate the performance of both the global MPI-ESM and the regional CCLM hindcasts
with the following datasets. For temperature and precipitation, we consider the
observational dataset E-OBS (Haylock et al., 2008) based on the ECA & D (European
Climate Assessment & Dataset;
In this study, we want to quantify if the initialization with observed climate states improves the performance of decadal predictions. To address this issue, uninitialized historical CMIP5 runs are usually considered as reference dataset (see also Sect. 3.2). With this aim, a 10-member ensemble of uninitialized MPI-ESM-LR historical runs started from a pre-industrial control simulation are used, which use observed natural and anthropogenic forcings (e.g. aerosol and greenhouse gas concentrations among others) for the period 1850–2005 (e.g. Müller et al., 2012).
All datasets considered in this study are pre-processed in an analogous
manner to enable a direct comparison. First, all data are bilinear
interpolated to the grid used for the E-OBS data (0.25
Following the suggestion of Goddard et al. (2013), the skill analysis is partly performed for spatial means. Spatial averaging of the anomaly time series is performed for eight PRUDENCE regions over Europe (see Fig. 1; Christensen and Christensen, 2007). Note that we only used grid points over land surfaces for the spatial means as E-OBS data are not available over the oceans. Additionally, we calculated the predictive skill on the basis of all individual grid points for specific analysis.
The following metrics are used to evaluate the performance of the global and regional hindcast ensembles and to address the four key questions: the mean squared error skill score (MSESS) and the anomaly correlation coefficient (ACC), which are both measures for the forecast accuracy. The skill metrics are applied to the pre-processed time series described in Sect. 3.1 and are computed for multi-annual means for lead time years 1–5 after initialization. Recent studies analysing the MiKlip decadal prediction system demonstrated that the MiKlip ensemble performs best for the first years after initialization for a wide range of variables, while the skill diminishes for longer forecast periods. For example, Müller et al. (2012) found highest skill scores for years 1–4 and 2–5 for annual mean surface temperature for both the North Atlantic region and global means. The same is true for annual wind speed and wind energy potentials over central Europe, for which skilful predictions are mainly restricted to the first years after initialization (years 1–4), while negative skill scores are found for longer lead time periods (Moemken et al., 2016). Kruschke et al. (2014) provided evidence that the prediction skill for winter cyclones over the North Atlantic region is best for years 2–5 and reduced for longer time periods. Following the recommendation by Goddard et al. (2013), we focus, in the following, on the lead time years 1–5 after initialization.
The deterministic MSESS (Murphy, 1988) is defined as
Following Murphy (1988) and given that anomalies are used (as in this
study), the MSESS with the climatology as reference (i.e.
Hence, MSESS depends on the conditional bias, as well as on the correlation. It is
smaller than the correlation in the case that there is a conditional bias, for which the
optimal value is 0. CB depends on the balance between correlation and the ratio of the
standard deviation between the ensemble and the observation. The improvement or added
value of CB is calculated according to Kadow et al. (2015) as
The significance of the skill scores is determined by using a bootstrapping approach at
the 95 % level (Kadow et al., 2015) and a
In this section, we address the first key question and analyse the general potential for
skilful regional decadal predictions over Europe. Figure 2 shows MSESS plots for
temperature, precipitation, and surface wind speed in CCLM_b0 and CCLM_b1 with the
climatology as reference. Not surprisingly, the MSESS is positive for most of Europe for
temperature. It is significant for most regions in western Europe in CCLM_b0 and large
parts of southern Europe in CCLM_b1 (Fig. 2a and b). This is due to the strong positive
trend in the observed temperature, which is predicted by the hindcasts but not captured
by the climatology. Deviations between both ensembles are larger for precipitation
(Fig. 2c and d), where the MSESS fields are distinctly patchier when compared to
temperature (Fig. 2a and b). While positive and significant skill scores are found over
large parts of western Europe in CCLM_b1 (Fig. 2d), MSESS values are mostly negative
over this region in CCLM_b0 (Fig. 2c). For wind speed (Fig. 2e and f), a positive MSESS
is found only for northern Europe and CCLM_b0, where skill scores are often significant.
As at the same time negative skill scores are found for other European regions in both
ensembles, the climatology is generally closer to the observations than the hindcasts. In
this respect, we have analysed the respective spatial mean wind speed time series for the
Iberian Peninsula (Prudence region 2) and CCLM_b0. The wind speed shows a slight
negative trend in CCLM_b0, while the trend is slightly positive for the observational
dataset (not shown). At the same time, the decadal variability for wind speed is quite
small over this region in all datasets (it ranges from 0.02 to
Spatial distribution of the MSESS for the multi-annual mean anomalies
of lead years 1–5 for
A different picture is revealed when using the uninitialized historical runs as reference dataset for the MSESS computation (Fig. 3). For temperature (Fig. 3a and b), positive skill scores are found in both ensembles over Scandinavia and for southeastern Europe, and at some grid points this prediction skill is significant. A stripe of negative values occurs over the British Isles and central Europe. The analysis of the time series for Mid-Europe (spatial mean over Prudence region 4) reveals that this negative skill mainly results from a strong temperature increase from dec1960 to dec1970 in the observations, while CCLM_b0 and CCLM_b1 depict a decrease in temperature (not shown), which in fact was observed in southern Europe for instance. As a consequence, the temperature in the hindcasts has larger deviations than the uninitialized simulations compared to the observations during the first half of the considered period, but agree well to the observations from dec1980 onwards. The largest deviations between CCLM_b0 and CCLM_b1 are found for Iberia, parts of southern France, and Italy, where the MSESS is positive for CCLM_b1 but neutral to negative for CCLM_b0.
Spatial distribution of the MSESS for the multi-annual mean anomalies
of lead years 1–5 for
Again, deviations in the MSESS fields between both ensembles are larger for precipitation (Fig. 3c and d), reflecting the local character of rainfall. Both ensembles show positive and partly significant MSESS values for regions in Scandinavia and eastern Europe, and to a lesser extent for Iberia and the British Isles (Fig. 3c and d). In CCLM_b1, predictive skill is also identified over western central Europe. Thus for CCLM_b1 positive skill is found for larger areas indicating an added value of the improved initialization procedure in baseline1 compared to baseline0.
Regarding wind speed, the predictive skill in CCLM_b0 (Fig. 3e) shows high and significant MSESS values over Scandinavia, Iberia, southern Italy, and along the coasts of the North Sea and Baltic Sea, while negative values are found, for example, over parts of France, southern Germany, and the Alpine region. In CCLM_b1, the MSESS depicts positive values over most of western and central Europe, while negative values are now identified along the eastern coast of the Baltic Sea (Fig. 3f). Overall the predictive skill of CCLM_b0 is slightly higher and affects a larger area, indicating that the changes in the initialization method do not improve the results for wind speed.
We conclude that in terms of the MSESS accuracy there generally is a potential for skilful decadal predictions over Europe in the regional MiKlip ensembles. However, the skill pattern depends on the region and the variable. For individual regions, the initialization leads to an added value for accurate (retrospective) forecasts several years ahead, while for some regions the uninitialized historical runs deliver better predictions. Also the discrepancies between the two hindcast generations (CCLM_b0 and CCLM_b1) are rather heterogeneous. While for temperature we only found a slight shift in the pattern due to the different initialization methods, discrepancies can be large for precipitation and wind speed depending on the region.
Recent studies document that the application of regional climate models may improve climate simulations, in particular over complex terrain (Berg et al., 2013; Feldmann et al., 2013; Hackenbruch et al., 2016). This is mainly due to a more realistic representation of the topography (e.g. mountain ranges or coastlines) in the regional climate models (RCMs) compared to global-scale general circulation models (GCMs). In this section, we analyse whether the downscaling with a regional climate model also leads to an added value for decadal predictions over Europe. With this aim, we use MPI_b0 and MPI_b1 as reference datasets for the MSESS shown in Fig. 4 (see also Sect. 3.2).
Spatial distribution of the MSESS for the multi-annual mean anomalies
of lead years 1–5 for
MSESS, ACC, and conditional bias (CB) for the CCLM ensembles (left half) and
added value compared to the global MPI ensembles (right half) for b0 (upper half) and
b1 (lower half) for temperature averaged over the eight Prudence regions (cf. Fig. 1) and
the whole of Europe (EU). In the left half, bold (italic) numbers represent MSESS values
above
Generally, significant improvements in the prediction skill by dynamical downscaling are restricted to limited geographical areas, and they strongly depend on the variable. For temperature a significant added value of downscaling is found over Scandinavia and southeast Europe in CCLM_b0, and over southeast Europe and the British Isles in CCLM_b1 (Fig. 4a and b). Therefore, the regionalization typically provides an improvement in regions where the global hindcasts show mostly medium to lower skill. In some regions with already high skill in the MPI-ESM hindcasts there is no improvement by the downscaling. This includes, for example, an area from the British Isles over France to Germany in baseline0 and regions along the western Mediterranean coast in baseline1. However, the global model outperforms the regional model over large parts of northwest Europe in the baseline0 ensemble (see also Table 1).
Again, rather patchy MSESS fields are obtained for precipitation. Nevertheless, there are several regions with significantly improved prediction skills in the CCLM ensembles compared to MPI_b0 and MPI_b1 (Fig. 4c and d). For example, both CCLM_b0 and CCLM_b1 reveal significant positive MSESS over eastern Germany and over parts of Scandinavia.
The added value of downscaling is most pronounced for wind in CCLM_b0 (Fig. 4e). Significant improvements of the MSESS are detected for southeast Europe, Italy, Scandinavia, and for coastal areas of Iberia, France, and England. Areas with an added value of downscaling are also existent in CCLM_b1 (Fig. 4f), but visibly reduced compared to CCLM_b0.
As Table 1, but for precipitation.
As Table 1, but for wind speed.
In Tables 1–3 we summarize the analysis of skill (cf. Fig. 2) and added value
(cf. Fig. 4) of the regional hindcasts compared to the climatology for the three
variables as spatial means over the PRUDENCE regions (cf. Fig. 1). The tables display the
MSESS as well as its components correlation (ACC) and conditional bias (CB), according to
the Murphy decomposition (cf. Sect. 3.2). The formatting of the cells in the left half of
the tables display distinctly positive (bold numbers), negative (italic numbers), or
slightly positive skill (no formatting) derived from the results shown in Fig. 2 for the
MSESS. A common threshold for all three variables is chosen in a way that above this
level a skill score is regarded as significant by the bootstrapping procedure. The
thresholds for the formatting are above (bold) or below (italic)
For temperature (Table 1), the MSESS and ACC are high in all regions except Scandinavia (SC). The correlation is above 0.8 in CCLM_b0 for the northwestern part of Europe – namely the British Isles (BI), France (FR), Mid-Europe (ME), and the Alps (AL). For CCLM_b1 the highest correlation is found more in the southern part of the region, namely in France (FR), the Iberian Peninsula (IP), the Alps (AL), and the Mediterranean (MD). The CB is low for most areas, except France (FR) and Scandinavia (SC) in CCLM_b0. The MSESS is high due to the high ACC and the low CB. MSESS and ACC are higher and CB lower for b1 than for b0 over the whole of Europe (EU). A significant added value of the regionalization of baseline0 occurs for Scandinavia (SC) and Eastern Europe (EA) as well as for Europe (EU). In these regions CB is significantly lower for CCLM compared to the MPI-ESM hindcasts. CCLM_b1 shows an added value for the regions British Isles (BI), the Mediterranean (MD) and Eastern Europe (EA) as well as over the entire domain (EU). The main cause is again a reduced conditional bias.
The MSESS for precipitation (Table 2) is much lower than for temperature. It is mostly negative for CCLM_b0 and only slightly positive for some regions in CCLM_b1. For CCLM_b0 a significant positive correlation is only found for the Iberian Peninsula (IP). The CB is generally negative (except for IP), which is inherited from the global hindcasts, since for MPI_b0 CB is even lower than for CCLM_b0. Over the whole domain the added value is low and not significant. For CCLM_b1, a positive ACC is uncovered for France (FR), Mid-Europe (ME) and Scandinavia (SC). The conditional bias is much lower than for CCLM_b0 and only slightly negative in most areas. An added value in relation to MPI_b1 occurs in those regions where ACC of CCLM_b1 is positive. There is also an added value for the whole domain (EU), but it is only significant for ACC.
The highest ACC for the 10 m wind speed is revealed for northern Europe (BI and SC) as well as in the Mediterranean (MD) for CCLM_b0, and in the southern regions for CCLM_b1 (Table 3). These are all regions with low to moderate conditional bias. CB is strongly negative in regions with a low skill. Interestingly, the added value for the MSESS of CCLM_b0 is significant in all regions, and for most regions a significant improvement is also visible with respect to ACC and CB. For CCLM_b1, on the other hand, significant improvements of MSESS and ACC compared to MPI_b1 are only found for a few regions.
We conclude that regional downscaling may indeed provide an added value for decadal predictions over Europe, both for individual grid points as well as for spatial means. However, this added value is not systematic but depends on variable and region.
Past studies suggest that the ensemble size of a prediction system has an impact on the
forecast skill of a model (Richardson, 2001; Ferro et al., 2008). Generally, there is a
consensus that the prediction skill for both seasonal and decadal predictions is enhanced
when the number of ensemble members is increased. Kadow et al. (2015) analysed the global
MiKlip baseline1 generation and concluded that the forecast accuracy for surface
temperature for lead years 1 and 2–9 is improved for nearly the whole globe when the
ensemble size is increased from 3 to 10 members. This is in line with the findings of
Sienz et al. (2016) who examined the prediction skill for North Atlantic sea surface
temperatures in the same hindcast ensemble. Also for seasonal predictions of the North
Atlantic Oscillation a forecast system profits from increasing size (e.g. Scaife et al.,
2014). However, the ensemble-size-dependent skill bias has never been demonstrated based
on regional decadal climate predictions before. With this aim, we analyse the impact of
the ensemble size on the predictive skill for the eight PRUDENCE regions in Europe in
both the regional and the global MiKlip ensembles. In the following, results are only
shown for the Iberian Peninsula (IP), as the findings are similar for the other PRUDENCE
regions. Figure 5 exhibits the dependency of MSESS and delta_ACC when compared to the
historical simulations for lead years 1–5 (
Skill scores for the multi-annual mean anomalies of lead years 1–5 of the
CCLM_b0 (red), MPI_b0 (yellow), CCLM_b1 (blue), and MPI_b1 (green) ensembles
depending on the ensemble size (
As expected, enhanced predictive skill can be observed when the number of members is
increased stepwise for both the global and the regional hindcast ensembles. MSESS shows a
rather logarithmic relationship with increasing
For ensembles with less than 10 members, the skill scores of all possible
In summary, our findings confirm what is already verified in recent studies for global decadal predictions: the predictive skill of a regional decadal prediction system is generally improved when the size of the hindcast ensembles increases. This is valid for all variables, regions, and hindcast ensembles considered in this study. The skill scores converge towards a certain value in most cases for MSESS in all hindcasts (see Fig. 5a–c). The increments in added value by increasing the number of ensemble members decrease for more than 5 members. Nevertheless, it is recommended to use 10 members or more for the skill assessment of decadal predictions on the regional scale, as is also endorsed in the Decadal Climate Prediction Project (DCPP) contribution to CMIP6 (Boer et al., 2016).
A lesson learned from the CMIP5 decadal experiments is that more starting years and thus a larger number of initializations is beneficial to establish robust skill estimates (Boer et al., 2016). This has been reflected in the progress from the first global MiKlip hindcast generation baseline0 to the second generation baseline1. Whereas baseline0 provides 10 ensemble members every fifth year (compliant with the CMIP5 experimental protocol), baseline1 provides this ensemble size for each starting year of the hindcast period. To assess the impact of using only five initializations (as used elsewhere in the paper) on the robustness of our main conclusions we performed a sensitivity analysis with the global baseline1 ensemble, for which all starting years are available. For this, we compared the sample with ten-yearly starting dates with the full yearly initialized MPI-ESM-LR baseline1 ensemble over the same period from 1960 to 2000.
Figure 6 presents a comparison between the ACC scores for the sample with five initializations (Fig. 6a, c, e) and the sample with all 41 initializations (Fig. 6b, d, f). For all three variables the score maps show in general comparable spatial distributions. The skill maps for the sample with all initializations usually depict a smoother spatial distribution with less extreme skill values and larger areas with significant skill scores. The regional averages over most of the PRUDENCE regions are comparable. However, in some regions larger differences can occur: for temperature over Ireland and Scotland, for precipitation over parts of France and eastern Europe, and for wind from northeastern Spain towards the Alps. Similar results are found for MSESS (not shown) for which not only the number of initializations of MPI_b1 is increased but also the number of uninitialized historical runs.
Spatial distribution of the ACC for the multi-annual mean anomalies of lead
years 1–5 in MPI_b1 for
It is obvious that more initializations increase the robustness of the skill assessment, especially with respect to quantitative estimates and significance of the results. Therefore, this work supports the recommendations made for CMIP6 by Boer et al. (2016) to generate hindcast ensembles with yearly starting dates. Nevertheless, using only five initializations already represents the general features and to some extent the significance of the regional distribution. Therefore, the analysis of the sample with all initializations confirms the qualitative findings from Sect. 4.1. The results regarding the added value and the ensemble size dependence are less affected by the number of initializations. Given the above findings, we conclude that the results obtained here for a limited number of initializations qualitatively comparable to those which would be obtained for samples with distinctly more initializations.
In this study, the decadal prediction skill of the regional MiKlip decadal
prediction system is analysed for temperature, precipitation, and wind speed
over Europe and compared to the forecast skill of the global ensemble. The
goal is to assess the prospect of such a system for the application in
forecasts on decadal timescales. Focus is given to years 1–5 after
initialization. Two skill scores are used to quantify the accuracy of the
two different MiKlip hindcast generations. The main findings of our study
can be summarized as follows:
There is a potential for regional decadal predictability over Europe for
temperature, precipitation, and wind speed in the MiKlip system, but the
predictive skill depends on the variable, the region, and the hindcast generation. The MiKlip prediction system may distinctly benefit from regional
downscaling. An added value in terms of accuracy and to some extend significance of skill
is particularly revealed for temperature over the British Isles (BI), Scandinavia (SC),
the Iberian Peninsula (IP); and for precipitation over the British Isles (BI),
Scandinavia (SC), Mid-Europe (ME), and France (FR) for the b1 generation. Most of these
regions are characterized by complex coastlines and orography, which indicates that the
better representation of topographic structures in the regionalized hindcasts may improve
the predictive skill. The improvement of the initialization procedure from baseline0 to baseline1
as described in Pohlmann et al. (2013b) increases the overall predictive skill in the
downscaled MiKlip hindcasts over Europe, at least for precipitation and temperature.
However, improvement of the skill varies between variable and region. The skill for
temperature increases around the Mediterranean Sea and parts of Scandinavia from b0
to b1. For precipitation the skill of b1 compared to b0 is higher in all regions except
the Iberian Peninsula and eastern Europe. Only for wind speed is there mostly no benefit
from the improved initialization. A systematic enhancement of MSESS skill scores with increasing ensemble
size, as revealed in numerous studies for global predictions, is also found for the
regional MiKlip decadal predictions system, and 10 members is found to be suitable for
regional decadal forecasts. This is valid for all variables and European regions. Based on the MPI_b1 data, it was shown that results derived
from only five initializations used in this study qualitatively agree with
results based on the full set of annual initializations. Nevertheless, such
an increase would improve the robustness and significance of the skill maps.
Müller et al. (2012) and Pohlmann et al. (2013b) found systematic prediction skills
for surface temperature over large parts of the North Atlantic and Europe in both global
generations (baseline0, baseline1). From the results of our study, it is apparent that
the Mediterranean area and the Iberian Peninsula seem to be key European regions for
decadal prediction skill with the regional prediction system. This is in line with
findings from Guemas et al. (2015) and may be related to skilful predictions of the
Atlantic Multi-decadal Oscillation (AMO; Garcia-Serrano et al., 2012; Guemas et al.,
2015). Due to the rather non-linear relationship of these large-scale North Atlantic
features to regional atmospheric conditions over Europe, the mechanisms steering the
decadal variability and predictability of climate variables in European regions are thus
more complex. The decadal variability in regional precipitation, temperature, and wind
speed over most parts of Europe is largely affected by the North Atlantic Oscillation,
but its skilful decadal predictability over the continent is still under debate. With
respect to this, a better understanding of the mechanisms relevant for the regional
climate over Europe on the decadal timescale is required, as was, for example, obtained
for the tropical Atlantic (Dunstone et al., 2011). This is an objective of the ongoing
second phase of the MiKlip project.
The skill scores may strongly vary between neighbouring grid points. Comparable results were found by Guemas et al. (2015), who detected a rather diffuse pattern for the accuracy of decadal predictions over Europe for seasonal temperature and precipitation. This might at least partly be due to spatial and temporal inhomogeneity of the gridded observational references. A more realistic assessment of the prediction skill can be made by considering spatial means (Goddard et al., 2013), which was mostly considered in this study. In line with Kadow et al. (2015), we could show that an enlargement of the ensemble size up to 10 members results in an improvement of the prediction skill over Europe. However, prediction skill could further benefit from even larger ensemble sizes, especially in areas with low signal-to-noise ratio (cf. Sienz et al., 2016).
Bias and drift adjustment (e.g. Boer et al., 2016) provides prospect in skill improvement not only for GCMs but also for RCMs. This is particularly the case for ensemble simulations run with full-field initialization (like the third MiKlip generation prototype, not analysed here; cf. Marotzke et al., 2016). While bias and drift adjustment methods have improved the forecast skill of near-term climate prediction (e.g. Kruschke et al., 2016), the general expectation is that drift correction is less important for prediction systems employing anomaly initializations like the baseline0 and baseline1 ensembles analysed here (Marotzke et al., 2016). Nevertheless, bias correction and calibration are an important topic in the second phase of MiKlip.
Due to the high computational costs of dynamical downscaling, only five initializations (one per decade) are available for the regional MiKlip ensemble (see Sect. 2). This is a shortcoming regarding the statistical significance of the results and some of the statements presented in this study. However, we could show that the qualitative findings are only partly influenced by the limited number of available hindcasts and that the main conclusions can be regarded as robust. The statistical significance will be easier to quantify when the regional simulations for the newest MiKlip ensemble generation are available with annual starting dates over more than 50 years. On the other hand, regional decadal forecasts may have advantages beyond the examples discussed in this paper. For example, RCMs enable the integration of improved components of the hydrological cycle or climate-system components with memory on multi-year timescales like soil moisture (Khodayar et al., 2014; Sein et al., 2015). Kothe et al. (2016) has shown that extracting the initial state of the deep soil in the RCMs from regional data assimilation schemes may improve decadal predictions. Further, Akhtar et al. (2017) demonstrated that the regional feedback between large water bodies and the atmosphere play a major in the regional climate system. This feedback can only be captured in regionalized climate predictions by a dynamic RCM–ocean coupling. Most of the approaches mentioned above are ongoing within the second phase of MiKlip and are expected to enhance the decadal prediction skill over Europe. We thus conclude that a decadal prediction system would clearly benefit from a regional forecast ensemble.
The regional decadal prediction system generated by the MiKlip consortium comprises
altogether 1000 years (two hindcast generations, each of them comprising 10 hindcast
members for five starting years) of simulations with 0.22
The MiKlip data will be made available via the CERA database
(
MR, HF, SM, and MU developed the concept of the paper; MR, HF, and JGP wrote the first manuscript draft. MR, HF, SM, MU, NL, and JM contributed with data analysis and analysis tools. HF, SM, MR, BA, and BF contributed with RCM simulations. MK and WM contributed with the global MPI-ESM-LR simulations and prepared boundary conditions for RCM simulations. CK leads the MiKlip-C consortium, with project leaders BA, BF, JGP, and GS. All authors contributed with ideas, interpretation of the results, and manuscript revisions.
The authors declare that they have no conflict of interest.
MiKlip is funded by the German Federal Ministry for Education and Research (BMBF,
contracts: 01LP1518 A-D and 01LP1519) All simulations were carried out at the German
Climate Computing Centre (DKRZ), which also provided all major data services. We
acknowledge the E-OBS dataset from the EU-FP6 project ENSEMBLES
(
This paper was edited by Andrey Gritsun and reviewed by three anonymous referees.