The objective of this paper is to present a new dataset of bias-corrected CMIP5 global climate model (GCM) daily data over Africa. This dataset was obtained using the cumulative distribution function transform (CDF-t) method, which has been applied to several regions and contexts but never to Africa. Here CDF-t has been applied over the period 1950–2099, combining historical runs and climate change scenarios, for six variables that are critical for agricultural purposes: precipitation, mean near-surface air temperature, near-surface maximum air temperature, near-surface minimum air temperature, surface downwelling shortwave radiation, and wind speed. WFDEI has been used as the reference dataset to correct the GCMs. The results have been evaluated over West Africa on a list of priority user-based metrics that were discussed and selected with stakeholders. The evaluation includes yields simulated with a crop model of maize growth. These bias-corrected GCM data have been compared with another available dataset of bias-corrected GCMs that uses WATCH Forcing Data (WFD) as the reference dataset. The impact of the WFD, WFDEI, and EWEMBI reference datasets has also been examined in detail. It is shown that CDF-t is very effective at removing the biases and reducing the high inter-GCM scattering. Differences with other bias-corrected GCM data are mainly due to the differences among the reference datasets. This is particularly true for surface downwelling shortwave radiation, which has a significant impact in terms of simulated maize yields. Projections of future yields over West Africa are quite different, depending on the bias-correction method used. However, all these projections show a similar relative decreasing trend over the 21st century.

Global and regional climate models (GCMs and RCMs) are used to produce
projections of future climates driven by various types of greenhouse gas
emission scenarios. The last Coupled Model Intercomparison Project

Scientific communities working on evaluation and modelling of climate change
impacts (in terms of crop yields, water resources, health, etc.) are
increasingly using these simulation outputs either to compute related impact
metrics or to run impact models. However robust biases are still present in
climate models due to ill-defined processes and associated parametrizations,
leading to biased statistical distributions of simulated physical and
dynamical variables

GCM and RCM output data have to be adjusted to statistical distributions of
observation-based reference data. However, the use of different
bias-correction methods in combination with different reference datasets
contributes to the total uncertainty in climate projections and can
contribute in some contexts more than the use of different GCMs or RCMs

The objectives of this paper are to present and evaluate bias-corrected GCM
data obtained by performing the cumulative distribution function transform (CDF-t)
method over Africa to quantify the sensitivity of the bias-corrected
data to different reference datasets and to illustrate this in terms of
simulated crop yields. It is a contribution to the
AMMA-2050

Bias correction has been applied to daily data of six variables critical for
these types of impact: precipitation (pr), mean near-surface air temperature
(tas), near-surface maximum air temperature (tasmax), near-surface minimum
air temperature (tasmin), surface downwelling shortwave radiation (rsds),
and wind speed (wind). The bias correction has been performed using the
CDF-t method

Section 2 presents the reference data. A first intercomparison of WFD, WFDEI, and EWEMBI is presented in terms of mean seasonal fields over West Africa. In Sect. 3 the CDF-t bias-correction method is briefly presented. Tests are then carried out over 1979–2013 to evaluate the sensitivity of the corrections to the calibration period. In Sect. 4, the evaluation of the CDF-t bias correction is detailed over West Africa, first on mean seasonal fields, then on daily metrics. CDF-t bias-corrected GCM data are also compared with ISIMIP/WFD bias-corrected data for the five GCMs used in ISIMIP. The significant impact induced by some improvements introduced in WFDEI data is shown. CDF-t outputs are also compared to products from EWEMBI. To take this evaluation further, a crop model has been used to test the impact on simulated crop yields (specifically for a local maize cultivar) of the bias-corrected data from one GCM and of the three reference datasets. A sensitivity analysis to individual forcing variables (temperature, pr, and rsds) is also presented. Finally, the bias-correction impact on crop simulations in the context of RCP8.5 climate change projections is shown. Conclusions are given in Sect. 5.

The AMMA-2050 dataset comprises bias-corrected daily data for the variables
pr, tas, maximum air temperature and
minimum air temperature, rsds, and wind
speed. It covers the domain 20

List of available CMIP5 models used for historical and RCP simulations. The five GCMs also used in ISIMIP are in italics. The number in each column is the number of ensemble members used in this work. Zero indicates that no run was used. The last line shows the total number of runs used for each simulation.

We use daily data extracted from the CMIP5 archive, covering the period from
1 January 1950 to 31 December 2099. Based on availability of daily data, it
comprises 29 GCMs for the 1950–2005 historical period and RCP8.5
2006–2099 projection, 27 GCMs for the RCP4.5 projection, and 20 GCMs for the RCP2.6 projection
(see Table

The observation-based reference dataset is critical for the correction of GCM biases, especially when corrections are applied to daily data. The reference dataset must also have global coverage on a regular grid, which may induce large uncertainties in areas lacking in situ data, as in Africa. We therefore used the available WFD, WFDEI, and EWEMBI reference datasets, comparing them with each other and comparing ISIMIP data (bias-corrected with WFD) with AMMA-2050 data (bias-corrected with WFDEI).

The WFD dataset

WFDEI, an improved version of WFD, has been produced based on ERA-Interim
reanalysis, over the period from 1 January 1979 to 31 December 2013 on a
0.5

More recently, the EWEMBI dataset has been produced within ISIMIP

Summer climatology from different observation datasets (WFD, WFDEI,
and EWEMBI):

In the following, to reduce the number of figures, the results are presented only for the summer season, July–September (JAS), which is the main rainy season over the Sahel. Similar computations have been performed over the other seasons, especially over spring, which is the main rainy season over the Guinean coast, and some of the results will be commented on.

Figure

In this work, we use the CDF-t method developed by

Once

This CDF-t approach has been applied to five out of the six variables (tas,
tasmax, tasmin, rsds, and wind) over the period 1950–2099 (historical and
RCP2.6, RCP4.5, and RCP8.5 runs). For pr, an updated CDF-t
approach has been used, referred to as “singularity stochastic removal” (SSR),
addressing rainfall occurrence and intensity issues

CDF-t has been applied month by month to take into account the strong
seasonality over Africa. It has been applied using a moving window to smooth
discontinuities

Examples of CDF-t bias correction applied to mean West Africa daily pr data for the five GCMs used in ISIMIP are shown in Fig. S1 in the Supplement, in terms of cumulative distribution functions. The distributions of raw GCM data are clearly different from the WFDEI data. Some GCMs show more low pr values than WFDEI while others show fewer. The CDF-t bias correction appears very effective, as the WFDEI and bias-corrected GCM data distributions are closely superimposed.
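To make the CDF comparison concrete, the following is a minimal sketch of the general CDF-t idea for a temperature-like variable, not the implementation used to produce the dataset. It estimates the CDF of the reference in the target period as F_ref,cal(F_gcm,cal^-1(F_gcm,tgt(x))) and then applies a quantile mapping between the model CDF and this estimate. The function names, the evaluation grid, and the initial shift of the model data by the calibration-period mean difference are assumptions of this sketch; the shift would not be appropriate for pr, which is handled by the SSR variant.

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF of `sample` evaluated at the points `x`."""
    return np.searchsorted(np.sort(sample), x, side="right") / sample.size

def cdft_sketch(ref_cal, gcm_cal, gcm_tgt, n_grid=1000):
    """Illustrative CDF-t correction of `gcm_tgt` (hypothetical helper).

    ref_cal: reference data over the calibration period
    gcm_cal: model data over the calibration period
    gcm_tgt: model data over the period to be corrected
    """
    # Assumption of this sketch: align the model with the reference mean so
    # that both distributions share a common support before composing CDFs.
    shift = ref_cal.mean() - gcm_cal.mean()
    g_cal = gcm_cal + shift
    g_tgt = gcm_tgt + shift

    # Evaluate the CDFs on a grid spanning the shifted target-period values.
    grid = np.linspace(g_tgt.min(), g_tgt.max(), n_grid)
    F_gt = ecdf(g_tgt, grid)

    # Estimated reference CDF for the target period:
    # F_st(x) = F_ref,cal( F_gcm,cal^-1( F_gcm,tgt(x) ) )
    F_st = ecdf(ref_cal, np.quantile(g_cal, F_gt))

    # Quantile mapping: corrected value = F_st^-1( F_gcm,tgt(value) ),
    # using the generalized inverse on the grid. Probabilities outside the
    # range reached by F_st are clipped to the grid ends.
    p = ecdf(g_tgt, g_tgt)
    idx = np.clip(np.searchsorted(F_st, p, side="left"), 0, n_grid - 1)
    return grid[idx]
```

Because the mapping is monotonic, the rank order of the model values is preserved; only their distribution is changed. The extreme tails are clipped in this simplified version.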

Before applying the CDF-t correction through the moving window process over 1950–2099, the bias-correction method has to be calibrated individually for every GCM over a reference period. In order to have a calibration dataset as representative as possible of the variability in the various variables, especially pr, the time period 1979–2013 has been used for the final calibration of the bias-correction method. However, the sensitivity to the calibration period has been explored over West Africa by testing two sub-periods, 1979–1996 and 1996–2013, to prevent any overestimation of the bias-correction performance. This has been performed on the five GCMs used in ISIMIP, and it is shown more specifically for the IPSL-CM5A-LR model in summer for tas, pr, and rsds (Supplement).

Three calibration periods have been tested: 1979–1996, 1996–2013, and 1979–2013 (see Fig. S2). First, the bias correction clearly removes the cold bias of the raw data. Second, the positive trend present in the raw data over the period 1979–2013, as in WFDEI but with a weaker range, is preserved after the bias correction. The difference in trend is probably due to the dry bias of pr over the Sahel in the raw data, which induces a higher sensitivity to the impact of anthropogenic global warming over the period than in observations. Third, the effect of the calibration period is clear. With the calibration period 1979–1996, the remaining bias of the corrected data is near zero over that period and weakly positive over 1997–2013, while with the calibration period 1996–2013, it is near zero over that period and weakly negative over 1979–1995. With the calibration period 1979–2013, the remaining bias is overall very weak and on average near zero. Similar tests have been carried out for the variables pr and rsds, and for the other seasons, with similar conclusions. Thus, although one might expect that using the whole observational period to calibrate the bias correction leads to an overestimation of the fit between observations and bias-corrected data, it in fact provides a more robust correction. Therefore we chose the longest period, 1979–2013, to perform the calibration.
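The split-sample logic of this test can be sketched as follows. This is an illustration only: it uses a plain empirical quantile mapping as a stand-in for CDF-t, and the function names and the layout of the inputs (one value per day with a matching array of years) are assumptions of the sketch.

```python
import numpy as np

def quantile_map(ref_cal, gcm_cal, x):
    """Map values `x` through the calibration-period CDFs (stand-in for CDF-t)."""
    ranks = np.searchsorted(np.sort(gcm_cal), x, side="right") / gcm_cal.size
    return np.quantile(ref_cal, ranks)

def residual_bias(ref, gcm, years, cal, evals):
    """Mean bias of the corrected series over each evaluation period.

    ref, gcm: daily reference and model values on the same dates
    years:    year of each daily value
    cal:      (first_year, last_year) of the calibration period, inclusive
    evals:    dict mapping a label to an inclusive (first_year, last_year)
    """
    in_cal = (years >= cal[0]) & (years <= cal[1])
    corrected = quantile_map(ref[in_cal], gcm[in_cal], gcm)
    out = {}
    for name, (y0, y1) in evals.items():
        sel = (years >= y0) & (years <= y1)
        out[name] = float(np.mean(corrected[sel] - ref[sel]))
    return out
```

Calling `residual_bias` once per calibration period (for example `(1979, 1996)`, `(1996, 2013)`, and `(1979, 2013)`) with the same evaluation periods gives the remaining biases inside and outside each calibration window.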

Mean near-surface air temperature (

A list of priority metrics has been established between scientists and
stakeholders involved in AMMA-2050. We are presenting results based on some
of these metrics related to the three variables, pr,
near-surface air temperature (tas), and surface downwelling shortwave
radiation (rsds). These metrics are

the seasonal mean for pr, tas, and rsds;

the mean time–latitude annual cycle over (15

the 95th percentile of daily values for tas;

the number of days with tas

the 95th percentile of daily values for pr;

the number of wet days (pr

the number of days with pr

the number of dry days (pr

the 95th percentile of the duration of consecutive dry days sequences.
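As an illustration of how the pr-based metrics in this list could be computed from a daily series, here is a minimal sketch. The 10 mm per day threshold follows the figure caption on days with heavy precipitation; the 1 mm per day wet-day threshold, the use of "greater than or equal to" for wet days, the computation of the 95th percentile over all days, and the function names are assumptions of this sketch.

```python
import numpy as np

WET_DAY_MM = 1.0      # assumed wet-day threshold (mm per day)
HEAVY_RAIN_MM = 10.0  # heavy-precipitation threshold (mm per day)

def dry_spell_lengths(pr, wet=WET_DAY_MM):
    """Durations (in days) of all sequences of consecutive dry days."""
    dry = np.asarray(pr, dtype=float) < wet
    # Pad with wet days so that every dry spell has a start and an end.
    padded = np.concatenate(([False], dry, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return ends - starts

def precipitation_metrics(pr, wet=WET_DAY_MM, heavy=HEAVY_RAIN_MM):
    """Daily pr metrics for one season at one grid point (illustrative)."""
    pr = np.asarray(pr, dtype=float)
    spells = dry_spell_lengths(pr, wet)
    return {
        "p95": float(np.percentile(pr, 95)),
        "wet_days": int(np.sum(pr >= wet)),
        "heavy_days": int(np.sum(pr >= heavy)),
        "dry_days": int(np.sum(pr < wet)),
        "dry_spell_p95": float(np.percentile(spells, 95)) if spells.size else 0.0,
    }
```

The same percentile call applies to tas; the threshold for the count of hot days would be passed in the same way as `heavy` here.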

Taylor diagrams relative to the mean of near-surface air temperature
over 1979–2001 from 29 individual models

Spatial correlation, standard deviation (SD), and root-mean-square
error (RMSE) computed for different observation datasets over the Sahel
(18

In the following, the Taylor diagram

Regarding the seasonal mean metrics, WFDEI and EWEMBI statistics are similar except for rsds, for which they are quite different over the Guinean coast. WFD is also very close to WFDEI but all statistics are a bit different, with again more differences for the Guinea coast.
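The three statistics summarized in a Taylor diagram can be computed from two gridded fields as in the sketch below. It assumes unweighted grid points (no area weighting), population standard deviations, and a centred RMSE, in which the mean of each field is removed first; the function name is hypothetical.

```python
import numpy as np

def taylor_stats(model, ref):
    """Spatial correlation, SDs, and centred RMSE between two fields."""
    m = np.ravel(model).astype(float)
    r = np.ravel(ref).astype(float)
    ok = np.isfinite(m) & np.isfinite(r)  # ignore missing grid points
    m, r = m[ok], r[ok]
    ma, ra = m - m.mean(), r - r.mean()   # anomalies about the spatial mean
    sd_m, sd_r = ma.std(), ra.std()
    return {
        "corr": float(np.mean(ma * ra) / (sd_m * sd_r)),
        "sd_model": float(sd_m),
        "sd_ref": float(sd_r),
        "crmse": float(np.sqrt(np.mean((ma - ra) ** 2))),
    }
```

With these definitions the statistics satisfy crmse^2 = sd_model^2 + sd_ref^2 - 2 sd_model sd_ref corr, which is the relation that allows all three to be read from a single point on the diagram.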

Figure

Figure

Figures

Hovmöller diagrams of daily temperature (

The 95th percentile of daily values for temperature from various
observation datasets in JAS: WFD

Same as Fig.

The 95th percentile of daily precipitation rate (mm day

Same as Fig.

Seasonal mean of number of days with precipitation greater than or equal to
10 mm day

Same as Fig.

Figures

Figures

Figure

Figure

Time series of crop maize yield over the Sahel (18

Temporal mean of maize yield (t ha

Time series of RCP8.5 projections of maize yields over the Sahel
(18

In the following, similar diagnostics are presented to evaluate the selected
daily metrics. To reduce the number of figures in the core of the
paper, some of them are presented in the Supplement (three metrics in
the core of the paper, three others in the Supplement). A more
complete metrics report is available at

Figures

Figures

Finally, Figs.

The sensitivity of simulated crop yields over West Africa to raw and bias-corrected forcing data is now evaluated. A crop model forced by atmospheric variables integrates the biases and variability in these forcing data in a non-linear way. This integration may reduce or amplify the variability induced by these forcing data.

This has been tested by using the crop model SARRA-O (System of
Agroclimatological Regional Risk Analysis; version O). The model simulates
yield attainable under water-limited conditions by simulating the soil water
balance, potential and actual evapotranspiration, phenology, potential and
water-limited carbon assimilation, and biomass partitioning

Figure

Sensitivity experiment means and biases (kg ha

Figure

To go further, a sensitivity analysis to individual variables has been
conducted by comparing the SARRA-O simulation forced with WFDEI data with
simulations where one of these WFDEI variables is replaced by the
corresponding raw IPSL-CM5A-LR data. These variables are pr, rsds, tasmin, and
tasmax, and also rsds from ISIMIP bias-corrected IPSL-CM5A-LR (using WFD as
reference). Table

SARRA-O has also been run over the period 1950–2099 using the RCP8.5
projection, forced by IPSL-CM5A-LR in terms of raw, CDF-t bias-corrected, and
ISIMIP bias-corrected data. Figure

The objectives of this paper are (i) to introduce a new bias-corrected dataset for which the CDF-t correction method has been applied to CMIP5 GCM daily data for the first time over Africa, (ii) to quantify the effect of using different reference datasets on the corrected data, and (iii) to illustrate this effect on crop simulations over West Africa. This bias correction has been applied over the period 1950–2099, combining historical runs and RCP scenarios with 29/27/20 GCMs for RCP8.5/4.5/2.6 respectively. It has been applied to six variables critical for agricultural impacts: daily accumulated pr; daily mean, minimum, and maximum near-surface air temperature; daily mean surface downwelling shortwave radiation; and daily mean wind speed.

The use of different bias-correction methods also based on different
reference datasets contributes to the total uncertainty in climate
projections and can contribute in some contexts more than the use of
different GCMs or RCMs

The whole observational period, 1979–2013, has been chosen to calibrate the bias-correction process. It has been shown that using various calibration sub-periods has a weak impact, in particular on the time evolution over the 21st century.

The evaluation of CDF-t bias correction applied to the 29 GCMs, both to mean
seasonal data and to daily metrics, has shown that CDF-t is very
effective in removing the biases with respect to the reference WFDEI data and
in reducing the high inter-GCM scattering. It has also shown some distance,
depending on variables and metrics, from bias-corrected ISIMIP GCM data,
mainly due to the differences between WFDEI and WFD reference data. WFDEI
(and associated CDF-t bias-corrected GCMs) appears closer to EWEMBI than WFD
(and associated ISIMIP bias-corrected GCMs). Metrics based on temperature are
very close for the three reference datasets, and some differences exist in
pr-based metrics. In contrast, significant differences have been
highlighted in terms of rsds. This has
consequences in terms of crop (maize) yields over West Africa. Sensitivity
simulations performed with one GCM have shown that bias corrections improve
the yields simulated by the raw GCM. However, the ISIMIP bias-corrected GCM still
underestimates them, as the CDF-t bias-corrected GCM does, but with yield estimates
closer to observed ones. EWEMBI provides the closest yields to observed
estimates. This is mainly due to rsds
whose values are underestimated in WFDEI south of 10

The main perspective of this work is to go on exploring the uncertainty
linked to bias-correction methods and their associated reference data in RCP
climate scenarios by producing a second version of this bias-corrected
29-GCM ensemble over Africa using more recent reference data like EWEMBI or
others like those used in AgMIP based on other reanalyses (AgMERRA or AgCFSR;

The CDF-t bias correction has been applied independently for each of the six
variables. However, this may be a problem since existing spatial coherency and
dependence among variables may be destroyed by the application of univariate
calibrations. Recently, to address this issue, improved calibrations have
been developed in terms of multivariate correction and spatial and/or temporal
dependences

This work constitutes a first step in producing bias-corrected datasets over Africa within AMMA-2050. An atlas is in preparation that will provide extensive results over Africa to the FCFA stakeholders and end-user communities. FCFA climate scientists will support these communities so that they understand how to use these data and what their limitations are.

The ISIMIP Fast Track data are available at

The supplement related to this article is available online at:

The authors declare that they have no conflict of interest.

The research leading to these results has received partial funding from the NERC/DFID Future Climate For Africa programme under the AMMA-2050 project, grant number NE/M019934/1. The lead author has also been supported by IRD. Mathieu Vrac has been partly funded by the ANR StaRMIP project. We acknowledge the World Climate Research Programme's Working Group on Coupled Modelling, which is responsible for CMIP, and we thank the climate modelling groups (listed in Table 1 of this paper) for producing and making their model output available. For CMIP the US Department of Energy's Program for Climate Model Diagnosis and Intercomparison provides coordinating support and led development of software infrastructure in partnership with the Global Organization for Earth System Science Portals. The authors also thank the EU Watch project and its members for data availability and ISIMIP data.

Edited by: Somnath Baidya Roy

Reviewed by: Toshichika Iizumi and one anonymous referee