Bias correction of surface downwelling longwave and shortwave radiation for the EWEMBI dataset

Many meteorological forcing datasets include bias-corrected surface downwelling longwave and shortwave radiation (rlds and rsds). Methods used for such bias corrections range from multi-year monthly mean value scaling to quantile mapping at the daily time scale. An additional downscaling is necessary if the data to be corrected have a higher spatial resolution than the observational data used to determine the biases. This was the case when EartH2Observe (E2OBS; Calton et al., 2016) rlds and rsds were bias-corrected using more coarsely resolved Surface Radiation Budget (SRB; Stackhouse Jr. et al., 2011) data 5 for the production of the meteorological forcing dataset EWEMBI (Lange, 2016). This article systematically compares various parametric quantile mapping methods designed specifically for this purpose. The methods vary in the time scale at which they operate, in their way of accounting for physical upper radiation limits, and in their approach to bridging the spatial resolution gap between E2OBS and SRB. It is shown how temporal and spatial variability deflation related to bilinear interpolation and other deterministic downscaling approaches can be overcome by downscaling the target statistics of quantile mapping from 10 the SRB to the E2OBS grid such that sub-SRB-grid scale spatial variability present in the original E2OBS data is retained. Cross-validations at the daily and monthly time scale reveal that it is worthwhile to take empirical estimates of physical upper limits into account when adjusting either radiation component and that, overall, bias correction at the daily time scale is more effective than bias correction at the monthly time scale if sampling errors are taken into account. A validation against independent ground observations from the Baseline Surface Radiation Network (BSRN; König-Langlo et al., 2013) suggests 15 that the bias correction of E2OBS surface downwelling radiation using SRB data that was done for the production of EWEMBI had a positive and neutral overall effect on rlds and rsds, respectively. Using any of the other methods tested here would have given similar results as the biases relative to BSRN remaining after bias correction are dominated by the corresponding SRB data biases. Copyright statement. The author agrees to the licence and copyright terms of Copernicus Publications as of 6 June 2017. 20


Introduction
High-quality observational datasets of surface downwelling radiation are of interest in many fields of climate science, including energy budget estimation (Kiehl and Trenberth, 1997;Trenberth et al., 2009;Wild et al., 2013) and climate model evaluation (Garratt, 1994;Ma et al., 2014;Wild et al., 2015).As part of so-called climate or meteorological forcing datasets such as those generated within the Global Soil Wetness Project (GSWP; Zhao and Dirmeyer, 2003), at Princeton University (Sheffield et al., 2006), and within the Integrated Project Water and Global Change (WATCH; Weedon et al., 2011), the longwave and shortwave components of surface downwelling radiation (abbreviated as rlds and rsds or just longwave and shortwave radiation in the following) are used to correct model biases in climate model output (Hempel et al., 2013;Iizumi et al., 2017;Cannon, 2017) and drive simulations of climate impacts, for example (Müller Schmied et al., 2016;Veldkamp et al., 2017;Chang et al., 2017;Krysanova and Hattermann, 2017;Ito et al., 2017).
These meteorological forcing datasets are global, longterm meteorological reanalysis datasets such as those produced by the National Centers for Environmental Prediction -National Center for Atmospheric Research (NCEP-NCAR; Kalnay et al., 1996;Kistler et al., 2001) and the European Centre for Medium-Range Weather Forecasts (ECMWF; Uppala et al., 2005;Dee et al., 2011), refined by bias correction using global, gridded observational data.For the components of surface downwelling radiation, such a bias correction is often necessary because observations of these variables are not assimilated in the reanalyses, which makes Published by Copernicus Publications on behalf of the European Geosciences Union.S. Lange: Bias correction of surface downwelling radiation for the EWEMBI dataset them subject to modelling biases of land-atmosphere interactions and cloud processes, for example (Kalnay et al., 1996;Ruane et al., 2015).
Different approaches are adopted in order to carry out these bias corrections.Weedon et al. (2011Weedon et al. ( , 2014) ) apply indirect corrections at the monthly timescale using near-surface air temperature observations for rlds and observations of atmospheric aerosol loadings and cloudiness for rsds.Sheffield et al. (2006) directly rescale rlds and rsds to match observed multi-year monthly mean values.Ruane et al. (2015) directly adjust distributions of daily mean rsds.The observational dataset commonly used for such direct adjustments of rlds and rsds is the Surface Radiation Budget (SRB) dataset assembled by the National Aeronautics and Space Administration (NASA) and the Global Energy and Water Exchanges project (GEWEX ;Stackhouse Jr. et al., 2011).
Another meteorological forcing dataset, the EartH2Observe, WFDEI and ERA-Interim data Merged and Bias-corrected for ISIMIP (EWEMBI; Lange, 2016), was recently assembled to be used as the reference dataset for bias correction of global climate model output within the Inter-Sectoral Impact Model Intercomparison Project phase 2b (ISIMIP2b; Frieler et al., 2017).The surface downwelling longwave and shortwave radiation data included in EWEMBI are based on daily rlds and rsds from the climate forcing dataset compiled for the EartH2Observe project (E2OBS; Calton et al., 2016).In order to reduce deviations of E2OBS rlds and rsds statistics from the corresponding SRB estimates in particular over tropical land (Dutra, 2015), for EWEMBI, the former were bias-adjusted to the latter at the daily timescale using two newly developed parametric quantile mapping methods.
These methods are conceptually similar to the Ruane et al. (2015) method, which fits beta distributions to reanalysed and observed daily mean rsds for every calendar month, thereby accounting for upper and lower physical limits of rsds using the multi-year monthly maximum value as the upper and zero as the lower limit of the distribution, and then uses quantile mapping to adjust the distributions.In contrast to Ruane et al. (2015), the methods developed to adjust E2OBS rlds and rsds for EWEMBI applies moving windows to estimate beta distribution parameters for every day of the year.This precludes discontinuities at the turn of the month (Rust et al., 2015;Gennaretti et al., 2015) and promises a better bias correction where the seasonality of radiation is very pronounced such as for rsds at high latitudes.Also, the new methods estimate the physical upper limits of rlds and rsds differently, acknowledging that these limits are necessarily greater than or equal to the greatest value observed during any fixed period.Lastly, while Ruane et al. (2015) linearly interpolate SRB rsds from its natural horizontal resolution of 1.0 • to the 0.5 • reanalysis grid prior to bias correction, the new methods aggregate the E2OBS data from their original 0.5 • grid to the 1.0 • SRB grid, where the bias correction is then carried out, and disaggregates these aggregated and bias-corrected data back to the E2OBS grid.Depending on the disaggregation method, this approach promises to generate bias-corrected data with more realistic temporal as well as spatial variability.
The new methods are comprehensively described and cross validated in this article.Moreover, several modifications of the new methods are tested here that differ in how they handle the spatial resolution gap between the E2OBS and SRB grids, and how they account for the physical upper limits of rlds and rsds.Also included are bias correction methods that operate at the monthly timescale in order to test if bias correction of daily or monthly mean values yields better overall cross-validation results.The lessons learned from these analyses shall benefit bias corrections of surface downwelling radiation to be carried out in future generations of climate forcing datasets.

SRB
The observational data used for the bias correction of E2OBS daily mean rlds and rsds for EWEMBI were the NASA-GEWEX Surface Radiation Budget (SRB; Stackhouse Jr. et al., 2011) primary-algorithm estimates of daily mean rlds and rsds from the latest SRB releases available at the time, which were release 3.1 for rlds and release 3.0 for rsds.These data cover the whole globe on a regular 1.0

Methods
For the reader who is is not familiar with the concepts of quantile mapping and/or statistical downscaling, a short introduction including definitions of relevant terms is given in Appendix A. The parametric quantile mapping methods introduced in the following are named according to the scheme BCvtpx, where v, t, p are used to distinguish between methods for longwave and shortwave radiation (v = l, s) operating at the daily and monthly timescales (t = d, m) using basic and advanced distribution types or parameter estimation techniques (p = b, a).Index x = 0, 1, 2 is used for variants of these methods that differ in how they handle the spatial resolution gap between the SRB and E2OBS grids.For the BCvtp0 methods, the SRB data are spatially bilinearly interpolated to the E2OBS grid and the E2OBS data are then bias-corrected using these interpolated SRB data; this is to mimic the Ruane et al. (2015) approach.For bias correction with the BCvtp1 methods, E2OBS data are spatially aggregated to the SRB grid, and the aggregated data are then bias-corrected and the resulting data disaggregated back to the E2OBS grid; this approach was used to produce the EWEMBI radiation data.Lastly, the BCvtp2 methods adjust mean values and variances at the E2OBS grid such that mean values and variances of spatial aggregates to the SRB grid match the corresponding SRB estimates while the sub-SRB-grid-scale spatial structure of mean values and variances present in the original E2OBS data is retained; this is to overcome the variability deflation induced by the other two approaches.Since the BCvtp0 and BCvtp2 methods are based on the BCvtp1 methods, the latter are introduced first.Readers who are merely interested in how the EWEMBI radiation data were produced are informed that methods BClda1 and BCsda1 were used for that purpose.

Bias correction at the SRB grid scale
For the BCvtp1 methods, daily mean E2OBS rlds and rsds are first aggregated to the SRB grid using a first-order conservative remapping scheme (Jones, 1999).The conservative remapping ensures that each aggregated value is the grid-cell area-weighted mean of the underlying four E2OBS values.The methods of bias correction of these aggregated values are described in the following.The method used for the sub-sequent disaggregation to the E2OBS grid is described in Sect.3.1.3.
The BCvtp1 methods use parametric transfer functions of the form F SRB vtp −1 (F E2OBS vtp (•)), where F E2OBS vtp and F SRB vtp are climatological cumulative distribution functions (CDFs) of aggregated E2OBS and SRB data, respectively.The CDFs are estimated individually for every SRB grid cell and day of the year (Fig. 1).In order to quantify the extent to which bias correction results benefit from explicitly accounting for physical radiation limits, the basic and advanced methods BCltb1 and BClta1 for longwave radiation use normal and beta distributions, respectively.For shortwave radiation, the relevance of physical limits is less questionable, given that the lower limit of zero matters at least during polar night, and that the solar radiation incident upon land and ocean surfaces is limited by the solar radiation incident upon the top of the atmosphere (see Fig. 1).Therefore, all BCstp1 methods use beta distributions and the basic and advanced methods only differ in how they estimate the beta distribution parameters (see Fig. 1, Table 1).

Bias correction at the daily timescale
The parameters of the climatological CDFs F E2OBS vdp and F SRB vdp are estimated based on empirical multi-year mean values, variances, and maximum values of daily mean radiation from the December 1983-November 2007 time period.Data from the whole period were used for the production of EWEMBI rlds and rsds.Data from some half of the period (see Sect. 4.1) are used for cross validation in this study.
For shortwave radiation, the basic daily bias correction method is designed to resemble the method outlined by Ruane et al. (2015, Sect. 3.4).BCsdb1 estimates mean values and variances of climatological beta distributions by 25day running mean values of multi-year daily mean values and variances, respectively, and their upper bounds by 25day running mean values of 25-day running maximum values of multi-year maximum values of daily mean rsds (solid red line in Fig. 1c).The idea behind this upper-bound estimate is that 25-day running maximum values of multi-year maximum values of daily mean rsds resemble the multi-year monthly maximum values of daily mean rsds used by Ruane et al. (2015).Please note that using the same window length for the running maximum calculation and the additional smoothing ensures that the resulting upper bounds are always greater than or equal to the multi-year maximum values of daily mean rsds.
The BCsda1 method employs the climatology of daily mean shortwave insolation at the top of the atmosphere (rsdt; see Appendix B for how rsdt is calculated in this study) for the upper-bound estimation.This is motivated by rsds being limited by rsdt in most locations and seasons, which suggests that the annual cycle of the upper bound of daily mean rsds has a similar shape as the climatology of daily mean rsdt.  1) include mean values and standard deviations (dotted red, dashed blue), and upper bounds (solid red, solid blue) where beta distributions are used.Note that the basic and advanced estimates of mean values and standard deviations only differ in panel (c) near the beginning and end of polar night (see Table 1).The green line in panel (a) represents 25-day running mean values of 25-day running maximum values of 24-year maximum values of daily mean rlds, which are used to estimate the upper bounds of the climatological beta distributions used by the BClda1 method (solid blue line in panel a).The lower bounds of all climatological beta distributions are set to zero.
Table 1.Distribution types and parameter estimation methods of bias correction methods BCvtp1 for day d of the year (see Fig. 1).Please note that the lower bounds of all climatological beta distributions are set to zero and that 24-year statistics are replaced by 12-year statistics for cross validation.

Method
Distribution type Mean value x ij is the daily mean rlds (for BCltp1) or rsds (for BCstp1) on day j of year i.Brackets Therefore, method BCsda1 uses a rescaled daily mean rsdt climatology as the upper-bound climatology of daily mean rsds (solid blue line in Fig. 1c).The rescaling is performed with the smallest possible factor that guarantees that the re-sulting upper bounds are greater than or equal to the multiyear maximum values of daily mean rsds on all days of the year with rsdt ≥ 50 W m −2 .An extension of this guarantee to days of the year with lower rsdt would inflate the rescaling Earth Syst.Dynam., 9, 627-645, 2018 www.earth-syst-dynam.net/9/627/2018/factor because during dusk and dawn of polar night, rsds can exceed rsdt due to diffuse radiation coming in from lower latitudes.Therefore, on days of the year with rsdt < 50 W m −2 , the maximum of the rescaled rsdt and the empirical multiyear maximum daily mean rsds is used as the upper rsds bound.Mean values and variances of the climatological beta distributions of the BCsda1 method are estimated by running mean values of multi-year daily mean values and variances, respectively.The window length used for these running mean calculations is 25 days by default.On days that are fewer than 13 days away from the beginning or end of polar night (as defined by daily mean rsdt going to zero), the window length is shortened to 2n + 1, where n is the number of days between the day in question and the beginning or end of polar night.
For longwave radiation, both the basic and the advanced daily bias correction methods use 25-day running mean values of multi-year daily mean values and variances to estimate climatological mean values and variances, respectively.The upper bounds used by BClda1 are not estimated by the often rather un-smooth 25-day running mean values of 25day running maximum values of 24-year maximum values of daily mean rlds (solid green line in Fig. 1a) but by a suitably shifted and rescaled mean value climatology (solid blue line in Fig. 1a; formulas in Table 1).
Since the choice of the window length used for all the running mean and maximum value calculations mentioned above is somewhat arbitrary, the window length dependence of the overall performances of the BCvda1 methods is investigated in Appendix D. Sensitivities are found to be very low for window lengths between 10 and 40 days.

Bias correction at the monthly timescale
In order to mimic a bias correction at the monthly timescale as was performed by Sheffield et al. (2006, Sect. 3.d.3) 1) because 31-day running mean values of multi-year maximum values of daily mean radiation are mathematically always greater than or equal to multi-year maximum values of 31-day running mean radiation.The resulting upper bounds are typically much larger than observed 24-year maximum monthly mean radiation (see Fig. 1d) because 31 consecutive days of daily mean radiation at the re-spective physical upper limit are very unlikely to occur in reality.

Disaggregation to the E2OBS grid
In principle, the disaggregation of aggregated and biascorrected E2OBS data from the SRB to the E2OBS grid can be carried out in various ways.The simplest approach would arguably be a mere interpolation, which is disadvantageous since it ignores the sub-SRB-grid-scale spatial variability present in the original E2OBS data.Probabilistic disaggregation methods that are designed to retain that variability (see Sheffield et al., 2006, Sect. 3.b.1), are impractical if, as in the present case, the purpose of the disaggregation is the production and publication of a dataset because all variants of the dataset that can potentially be generated by a probabilistic algorithm are, as long as all conceivable constraints have been incorporated in the algorithm, equally plausible candidates for the one dataset to be published.Therefore, not a probabilistic but the following deterministic disaggregation approach was used for the production of EWEMBI rlds and rsds and is adopted here for all BCvtp1 methods.
First, E2OBS-grid-scale upper bounds of daily mean radiation are estimated by bilinearly interpolated maximum values of the climatological upper bounds of SRB all-sky and clear-sky radiation, which in turn are estimated using the BClda1 method for rlds and the BCsda1 methods for rsds (see Table 1 and blue lines in Fig. 1a, c).The clear-sky radiation data are included in order to prevent the E2OBSgrid-scale upper bounds from being much lower than the real physical limits of daily mean radiation at that spatial scale, given that due to sub-SRB-grid-scale spatial variability, upper radiation bounds at the E2OBS grid scale may exceed those at the SRB grid scale.
The original daily E2OBS data are then clamped between zero and these upper bounds, and the resulting values (or their distances to their upper bounds) are rescaled day by day and SRB grid cell by SRB grid cell such that their SRBgrid-scale aggregates match the bias-corrected values.More precisely, for a fixed but arbitrary SRB grid cell and a fixed but arbitrary day, let Y denote the bias-corrected value at the SRB grid scale, w k with4 k=1 w k = 1 the area weights of the four E2OBS grid cells k = 1, 2, 3, 4 contained in the SRB grid cell, X k the clamped original E2OBS data values with upper bounds b k , and Y k the bias-corrected values at the E2OBS grid scale to be computed.
).This rescaling procedure ensures that 0 ≤ Y k ≤ b k and S. Lange: Bias correction of surface downwelling radiation for the EWEMBI dataset 3.2 Bias correction at the E2OBS grid scale

The BCvtp2 methods
The disaggregation method introduced above corrects the original E2OBS values from the four E2OBS grid cells contained in one SRB grid cell as if they must all be too low (high) if their area-weighted average is too low (high).This implicit assumption is questionable since it rules out the possibility that the area-weighted average is too low because one of the four values is much too low while the others are slightly too high, to give just one example.A statistical manifestation of this problem is illustrated and discussed in Sect.4.2.
The assumption does not need to be made if the bias correction is carried out directly at the E2OBS grid.With target distributions fixed at the SRB grid, target distributions at the E2OBS grid can be defined such that the bias-corrected data have the SRB-grid-scale target distributions and the sub-SRB-grid-scale structure of the original E2OBS data.For parametric bias correction methods such as those introduced above, this can be achieved via suitable definitions of the parameters of the E2OBS-grid-scale target distributions.Here, for every BCvtp1 method, a corresponding BCvtp2 method is defined to operate at the same temporal scale and to use the same source (at the E2OBS grid) and target (at the SRB grid) distribution type and parameter estimation technique (see Table 1).E2OBS-grid-scale target climatologies of mean values, variances, and (where necessary) upper bounds are defined as follows.
The mean value estimates of the original E2OBS data are shifted by a common offset per SRB grid cell and day of the year to obtain the E2OBS-grid-scale target mean values.The offsets are chosen such that the E2OBS-grid-scale target mean values aggregated to the SRB grid match the corresponding SRB mean value estimates.E2OBS data biascorrected using these E2OBS-grid-scale target mean values have SRB-grid-scale aggregates that match the SRB-gridscale target mean values because (i) the aggregation is a linear operation and (ii) the mean value of a linear combination of random variables is equal to the same linear combination of the mean values of these random variables.
To obtain the E2OBS-grid-scale target variances, the variance estimates of the original E2OBS data are rescaled by a common (to all four E2OBS grid cells contained in one SRB grid cell) factor f ij per day i of the year and SRB grid cell j .For the derivation of the formula for f ij let Y ij k (and X ij k ) denote random variables representing bias-corrected (and original) E2OBS data from day i of the year and E2OBS grid cells k = 1, 2, 3, 4 contained in SRB grid cell j .Then the estimated variance of the SRB-grid-scale aggregate of Y ij k can be expanded to where w j k is the area weight of E2OBS grid cell j k with 4 A bias correction would be deemed successful if the left-hand side of Eq. ( 1) was equal to the estimated variance of Z ij , the SRB data from day i of the year and grid cell j .On the right-hand side of Eq. ( 1), f ij Var(X ij k ) can be substituted for Var(Y ij k ) by definition of the scaling factors, and Cor(Y ij k , Y ij l ) can be approximated by Cor(X ij k , X ij l ) since quantile mapping preserves ranks and therefore rank correlations and therefore approximately Pearson correlations.The variance scaling factors f ij for method BCvtp2 are therefore calculated based on where the variances are estimated using the respective BCvtp1 approach (see Table 1), and the Pearson correlations are estimated by inversely Fisher-transformed 25-day running mean values of Fisher-transformed 24-year daily Pearson correlations of daily (for BCvdp2) or 31-day running mean (for BCvmp2) radiation data.The Fisher transformations are invoked here in order to approximately account for correlation value-dependent sampling error intervals (Fisher, 1915(Fisher, , 1921)).
The E2OBS-grid-scale target upper bounds are calculated in the same way as the E2OBS-grid-scale target mean values.This way, the latter rarely exceed the former.Where they do, the latter are reduced to 99 % of the former.For longwave (shortwave) radiation, such reductions are necessary in four (11 % of all) E2OBS grid cells on an average of 15 % (5 %) of all days of the year.
Furthermore, in order to obtain realistic E2OBS-grid-scale target beta distributions, the E2OBS-grid-scale target variances calculated using Eq. ( 2) are limited to 40 % of µ(b−µ), where µ and b are the E2OBS-grid-scale target mean values and upper bounds, respectively.This limit is imposed because (i) the variance σ 2 of a random variable taking values from within the interval [a, b] can generally not be greater than (µ − a)(b − µ) if µ is the random variable's mean value; (ii) if that random variable is beta distributed and σ 2 > (µ − a)(b − µ)/2 then the probability density function is U shaped (Wilks, 1995), which is considered unrealistic for climatological distributions of rlds and rsds; and (iii) σ 2 /(µ(b − µ)) has an empirical upper limit of about 40 % in the original Earth Syst.Dynam., 9, 627-645, 2018 www.earth-syst-dynam.net/9/627/2018/E2OBS radiation data.The 40 % condition is never met for longwave radiation whereas for shortwave radiation it is met in 14 % of all E2OBS grid cells on an average of 2 % of all days of the year.

The BCvtp0 methods
For the BCvtp0 methods, daily SRB data are first bilinearly interpolated to the E2OBS grid.The E2OBS data are then bias-corrected directly at the E2OBS grid using the interpolated SRB data and transfer functions defined exactly as for the respective BCvtp1 method.

Results
In the following, the bias correction methods introduced above are cross validated at the SRB grid scale (Sect.4.1), and their disaggregation performance is assessed by comparing sub-SRB-grid-scale spatial variability before and after bias correction (Sect.4.2).

Cross validation at the SRB grid scale
For the cross validation against SRB data, 24 years worth of overlapping E2OBS and SRB data are divided into two 12year samples of which the first one is used to calibrate and the second one to validate the method.Common practice would be to use data from the first and second half of the 24-year period to define these samples.However, due to climate change this definition may yield calibration and validation data samples that differ statistically.These differences in turn, which are essentially climate change signals, may differ in extent between the E2OBS and SRB data.Switanek et al. (2017) have shown that such differences in climate change signals may then dominate cross-validation metrics and thereby distort the comparative validation of bias correction methods.
In order to minimise this climate change impact on crossvalidation results, here, calibration and validation data samples are composed of data from all odd years and all even years or vice versa, respectively.The samples are accordingly labelled every1st and every2nd.Please note that results for BCvtp2 are not shown or discussed in this section because BCvtp1 and BCvtp2 produce virtually identical data at the SRB grid scale.

BCvtp0 vs. BCvtp1
The first question addressed here is how the bilinear spatial interpolation of SRB data to the E2OBS grid before bias correction with the BCvtp0 methods impacts the distribution of bias-corrected rlds and rsds values at the SRB grid scale.To quantify these impacts, biases in multi-year daily mean values, standard deviations, and maximum values remaining after bias correction with methods BCvda0 and BCvda1 are compared in the left and middle columns of Figs. 2 and 3.
Since linear interpolation always yields values that are intermediate to the values at the interpolation knots it is expected that daily SRB data bilinearly interpolated to the E2OBS grid and then aggregated back up to the SRB grid will be more smooth overall both in space and time than the original SRB data.Manifestations of the increased smoothness in time are the more negative biases of standard deviations (Fig. 2) and maximum values (Fig. 3) remaining after bias correction with BCvda0 than with BCvda1.Standard deviations after bias correction with BCvda0 in particular are negatively biased by more than 4 % (median over calendar months × validation data samples) in most regions.In mountainous and therefore spatially heterogeneous regions, multiyear monthly mean radiation is also changed significantly by the interpolation, with median biases over calendar months × validation data samples remaining after bias correction with BCvda0 exceeding 2 W m −2 in many such places (Fig. 2).

BCvtax vs. BCvtbx
Next is an assessment of how the treatment of the upper bound of the distributions estimated by the BCvdp1 methods impacts the distribution of bias-corrected rlds and rsds values at the SRB grid scale.To quantify these impacts, biases in multi-year daily mean values, standard deviations, and maximum values remaining after bias correction with methods BCvda1 and BCvdb1 are compared in the middle and right columns of Figs. 2 and 3.
For longwave radiation, the basic method BCldb1 assumes normally distributed values and therefore does not account for any upper physical limit of rlds whereas the advanced method BClda1 assumes the existence of such a limit and estimates it empirically.Figure 3 shows that the advanced method generally yields a better correction of 12-year maximum values.In contrast, standard deviations are slightly better corrected by the basic method and mean values are equally well corrected by both methods (Fig. 2).
For shortwave radiation, both the basic and the advanced method empirically estimate upper physical limits of rsds and take these into account in the form of upper bounds of beta distributions.The limit estimates are based on downwelling shortwave radiation at the surface and at the top of the atmosphere for BCsda1, and on rsds only for BCsdb1.Figure 3 shows that the basic method generally yields a better correction of 12-year maximum values.Standard deviations and mean values are also slightly better corrected by BCsda1 than by BCsdb1 (Fig. 2).

BCvdpx vs. BCvmpx
Next is a comparative cross validation of methods BCvdpx and BCvmpx operating at the daily and monthly timescales, respectively.The cross validation itself is also performed at the daily and monthly timescales based on statistics of daily and monthly mean radiation, respectively.A joint assessment www.earth-syst-dynam.net/9/627/2018/Earth Syst.Dynam., 9, 627-645, 2018 of these cross validations shall reveal whether bias correction at the daily or monthly timescale is better overall.By design, the BCvdpx and BCvmpx methods are equally good at correcting multi-year mean values of daily mean radiation.However, both day-to-day and year-to-year variability are expected to be differently well corrected by the methods operating at different timescales.Since day-to-day variability is (not) explicitly adjusted by the methods operating at the daily (monthly) timescale, the BCvdpx methods are expected to perform better at the daily timescale than the BCvmpx methods.The year-to-year variability, however, is explicitly corrected by the BCvmpx methods and it is not by the BCvdpx methods because daily data from different years are pooled before quantile mapping is carried out at the daily timescale.Consequently, biases in interannual standard deviations of monthly mean radiation are much larger after bias correction with BCvda1 than with BCvma1 (Fig. 4), and the BCvmpx methods are generally expected to perform better at the monthly timescale than the BCvdpx methods.
In order to assess whether bias correction at the daily or monthly timescale is more effective overall, a performance measure is needed that is comparable across timescales.Common performance measures of distribution adjustments at individual timescales are the two-sample Kolmogorov-Smirnov (KS) and Kuiper's two-sample test statistics.While Kuiper's test is equally sensitive to CDF differences at all quantiles, the KS test is more sensitive at the median than in the tails.A straightforward comparison of these test statistics across timescales is not very meaningful because sample sizes at the daily and monthly timescales differ by a factor of 30, which implies that the same value of a test statistic has different statistical significance at the daily and monthly timescales.A better comparability can be achieved by comparing the test statistic's p value, which represents  www.earth-syst-dynam.net/9/627/2018/Earth Syst.Dynam., 9, 627-645, 2018 the statistical significance of CDF differences.In the present cross validation, the CDFs compared are based on biascorrected E2OBS and the corresponding SRB data, and a higher p value indicates more similar CDFs and therefore a better bias correction.For details of the calculation of p values of the two-sample KS and Kuiper's two-sample test statistics see Appendix C. Global distributions of p values of two-sample test statistics for seasonal distributions of daily and monthly mean rlds and rsds are shown in Fig. 5 for the KS test and Fig. 6 for Kuiper's test.In accordance with expectations, both tests indicate that CDFs are generally better adjusted by BCvdpx than by BCvmpx at the daily timescale and vice versa at the monthly timescale.However, performance differences between BCvdpx and BCvmpx are clearly more significant at the daily than at the monthly timescale.This suggests that bias-correcting at the daily instead of at the monthly timescale yields bias decrements at the daily timescale that exceed bias increments at the monthly timescale.Therefore, bias correction at the daily timescale is deemed more effective overall than bias correction at the monthly timescale.
To elaborate on this further, the p = 10 % significance level is marked by a grey horizontal line in all panels of Figs. 5 and 6 and is to be compared with the 10th percentiles of the global distributions of p values of the two-sample test statistics.Any coincidence of such a 10th percentile with the 10 % significance level suggests that the corresponding pvalue distribution is in agreement with the null hypothesis of the respective test.Since the null hypothesis of both tests is that the samples compared are from the same underlying distribution, such a coincidence suggests that the bias correction which produced one of the samples compared worked perfectly within the limits of sampling uncertainty.Similarly, 10th percentiles of p-value distributions above (below) the 10 % significance level suggest overcorrections (undercorrections) in terms of sampling uncertainty.In that sense, the BCvtpx methods generally overcorrect at the monthly timescale and undercorrect at the daily timescale.
The KS and Kuiper's test statistics also confirm the finding of Sect.4.1.2that at the daily timescale, the BCvda1 methods outperform the BCvdb1 methods for longwave radiation and vice versa for shortwave radiation.This holds true for all seasons and irrespective of CDF differences being generally Earth Syst.Dynam., 9, 627-645, 2018 www.earth-syst-dynam.net/9/627/2018/greater in summer and winter (DJF and JJA) than in the transition seasons (MAM and SON) both before and after bias correction.Moreover, the test statistics find both BCvda1 and BCvdb1 to outperform BCvda0 at the daily timescale, which is in line with the finding of Sect.4.1.1 that the BCvda0 methods deflate day-to-day variability.
The fact that all BCvdp1 methods undercorrect at the daily timescale demonstrates the imperfections of these parametric quantile mapping methods.The remaining CDF differences must be linked to imperfect bias corrections of moments of higher-than-second order since multi-year mean values and standard deviations are well adjusted by design.To illustrate this, relative skewness biases remaining after bias correction with BCvdp1 are shown to exceed 50 % (median over calendar months × validation data samples) in many regions (Fig. 3).Another manifestation of the imperfections are remaining biases in the tails of the distribution of daily mean rlds and rsds.These must be larger than the remaining median biases because p values of Kuiper's test statistics for these distributions are generally larger than those of the corresponding KS test statistics.

Spatial disaggregation and sub-SRB-grid-scale spatial variability
As outlined in Sect.3.2.1, the BCvtp1 approach to the disaggregation of bias-corrected daily mean rlds and rsds values from the SRB to the E2OBS grid scale is based on the implicit assumption that the original E2OBS values of daily mean radiation onto the four E2OBS grid cells contained in one SRB grid cell must all be too low (high) if their areaweighted average is too low (high).The four original values are then all increased (decreased) by the BCvtp1 method.
In order to account for their upper (lower) physical bounds, the increases (decreases) are performed by a common scaling factor applied to the distances to these bounds.This leads to a reduction of the differences among the four values (this is necessarily true if the four bounds are equal; it is true in most cases if they are similar), i.e., to a deflation of sub-SRB-gridscale spatial variability.
In order to illustrate and quantify the extent of this variability deflation and compare the BCvtp0, BCvtp1, and BCvtp2 methods in terms of their impact on sub-SRBgrid-scale spatial variability, the root-mean-square deviation (RMSD) of the four E2OBS-grid-scale values of daily mean radiation per SRB grid cell from their area-weighted average is calculated over all days of a given calendar month both before and after bias correction with either method.Median relative bias-correction-induced changes of these RMSDs are depicted in Fig. 7 and demonstrate that BCvda1 indeed generally deflates them, in some regions by more than 20 % (median over calendar months) for both longwave and shortwave radiation.In contrast, BCvtp0 and BCvtp2 deflate or inflate them depending on variable and region.
In an analogous manner, such RMSDs can be computed based on data from the four E2OBS grid cells contained in one staggered SRB grid cell, where the staggered SRB  grid cells of the SRB grid (a-c, g-i) and the staggered SRB grid (d-f, j-l; see text).For every 1 • grid cell and calendar month, the RMSDs are calculated using original or bias-corrected E2OBS data from the four 0.5 • grid cells contained in the 1 • grid cell, pooling data from the entire December 1983-November 2007 time period and omitting shortwave radiation data from months with average rsdt less than 1 W m −2 .Depicted are median and agreement in the direction of monthly RMSD changes by bias correction (same colouring scheme as in Fig. 2).Very similar results are obtained for the corresponding basic bias correction methods.
grid is a regular 1.0 • × 1.0 • latitude-longitude grid shifted by 0.5 • latitude and 0.5 • longitude relative to the SRB grid, i.e., every staggered SRB grid cell contains E2OBS grid cells contained in four different SRB grid cells.Median relative bias-correction-induced changes of these RMSDs are also depicted in Fig. 7. Ideally, bias-correction-induced changes of RMSDs from SRB and staggered SRB grid cell mean values would be equal.It would then be impossible to tell from their comparison whether the bias correction's target distributions were defined on the SRB or on the staggered SRB grid.
The BCvdp1 methods do not fulfil this criterion as they deflate RMSDs from SRB grid cell mean values everywhere while inflating RMSDs from staggered SRB grid cell mean values in many regions, in particular over the tropical oceans.The criterion is much better fulfilled by the BCvdp2 and BCvdp0 methods.The RMSDs are generally greater after bias correction with BCvdp2 than with BCvdp0, i.e., BCvdp2 produces data with greater sub-SRB-grid-scale spa-tial variability than BCvdp0.This difference is most visible for longwave radiation, for which BCvdp0 produces a stark land-sea contrast of RMSD changes with strong RMSD reductions over land whereas BCvdp0 does so to a much lesser extent.This strong deflation of sub-SRB-grid-scale spatial variability by BCvtp0 is believed to be another artefact caused by the bilinear interpolation of SRB data to the E2OBS grid.

Summary and conclusions
This article introduces various parametric quantile mapping methods for the bias correction of E2OBS daily mean surface downwelling longwave and shortwave radiation using the corresponding SRB data.The quantile mapping methods differ in (i) the timescale at which they operate, (ii) if and how they take physical upper radiation bounds into account, and (iii) how they handle the spatial resolution gap between E2OBS and SRB.
A cross validation at the SRB grid scale demonstrates that statistics of daily mean radiation are better corrected by methods operating at the daily timescale than by methods operating at the monthly timescale, and vice versa for statistics of monthly mean radiation.Since these performance differences are statistically more significant at the daily than at the monthly timescale, overall, bias correction at the daily timescale is deemed more effective then bias correction at the monthly timescale.
The cross validation further suggests that it is generally worthwhile to explicitly take physical upper radiation bounds into account during quantile mapping.For shortwave radiation, different approaches to their estimation are tested.A simple approach using running maximum values is found to outperform a more complicated one based on daily mean insolation at the top of the atmosphere (rsdt).This must be due to other factors besides rsdt that influence the upper physical bounds of rsds.Atmospheric humidity is an example for such a factor: The highest rsds values usually occur under clear-sky conditions and they are higher the drier the atmosphere.Atmospheric humidity in turn is limited by the water vapour holding capacity of the atmosphere, which is controlled by atmospheric temperature.The climatology of atmospheric temperature lags behind that of rsdt.Hence, the climatology of the upper physical bounds of rsds can be expected to deviate from the rsdt climatology.
The cross validation also reveals to what extent the bilinear spatial interpolation of SRB data to the E2OBS grid prior to bias correction with the BCvtp0 methods deflates day-to-day variability.This variability deflation has a greater effect on bias correction performance than a change of if and how physical upper radiation bounds are taken into account during quantile mapping, but a much smaller effect than a change of the timescale at which the quantile mapping is carried out.
Lastly, the cross validation at the daily timescale shows that none of the quantile mapping methods tested here are perfect, concerning in particular the adjustment of distribution tails and moments of higher-than-second order.This indicates that the true distribution of rlds and rsds is not always exactly normal or beta, as assumed by the parametric quantile mapping methods tested here.Potentially, nonparametric quantile mapping methods (that do not rely on such assumptions) could yield better cross-validation results as long as overfitting is avoided (e.g., Gudmundsson et al., 2012).However, an introduction of and comparison to such methods is beyond the scope of this article.
To bridge the spatial resolution gap between E2OBS and SRB, the methods used for the production of EWEMBI rlds and rsds deterministically disaggregate the E2OBS data previously aggregated to and bias-corrected at the SRB grid.It is shown that the method used for that disaggregation introduces artefacts in the sub-SRB-grid-scale spatial variability, which can be overcome by applying quantile mapping directly at the E2OBS grid using either bilinearly interpolated SRB data or target distribution parameters that are based on the more coarsely resolved SRB data as well as on sub-SRBgrid-scale spatial variability present in the original E2OBS data.This latter approach yields both good cross-validation results at the SRB grid scale and suitable adjustments of the sub-SRB-grid-scale spatial variability.
The best methods identified here are therefore BClda2 for rlds and BCsdb2 for rsds.In comparison to BClda1 and BCsda1 used for the production of EWEMBI rlds and rsds, bias correction with these methods yields more natural sub-SRB-grid-scale spatial variability and, in the case of rsds, slightly better cross-validation results at the SRB grid scale.Edited by: Valerio Lucarini Reviewed by: two anonymous referees

Figure 1 .
Figure 1.Estimation of parameters of quantile mapping methods used for the bias correction of longwave (a, b) and shortwave (c, d) radiation at the daily (a, c) and monthly (b, d) timescales.This example is based on SRB daily mean rlds and rsds data from 79.5 • N, 12.5 • E and the December 1983-November 2007 time period.Climatological distribution parameters are estimated based on empirical 24year mean values (dark grey), standard deviations (light grey range around mean values), and minimum and maximum values (black) of daily mean (a, c) and 31-day running mean (b, d) radiation computed for every day of the year.The distribution parameters estimated for the basic (red) and advanced (blue) bias correction methods (see Table1) include mean values and standard deviations (dotted red, dashed blue), and upper bounds (solid red, solid blue) where beta distributions are used.Note that the basic and advanced estimates of mean values and standard deviations only differ in panel (c) near the beginning and end of polar night (see Table1).The green line in panel (a) represents 25-day running mean values of 25-day running maximum values of 24-year maximum values of daily mean rlds, which are used to estimate the upper bounds of the climatological beta distributions used by the BClda1 method (solid blue line in panel a).The lower bounds of all climatological beta distributions are set to zero.

Figure 2 .
Figure 2. Biases relative to SRB in mean values (a-f) and standard deviations (g-l) of spatially aggregated (to the SRB grid) daily mean longwave (a-c, g-i) and shortwave (d-f, j-l) radiation after bias correction with methods BCvda0 (left), BCvda1 (middle), and BCvdb1 (right).The biases are calculated individually for each calendar month (January to December) and calibration data sample (every1st, every2nd) pooling SRB and corrected E2OBS data from all years of the corresponding validation data sample (every2nd, every1st, respectively) and omitting shortwave radiation data from months with monthly mean rsdt less than 1 W m −2 (see Appendix B and Fig.D1c).Depicted are median and agreement in direction (sign of bias) of these individual biases, represented by hue and saturation of a grid cell's colour, respectively.Categories of agreement in bias direction are defined based on one-sided p values obtained from modelling underestimations and overestimations for individual calendar months and validation data samples as outcomes of independent 50/50 Bernoulli trials.More saturated colours indicate higher statistical significance of biases remaining after bias correction.

Figure 3 .
Figure 3. Same as Fig. 2 but for biases in skewness (a-f) and 12-year maximum values (g-l).

Figure 4 .
Figure 4. Same as Fig. 2 but for relative biases in interannual standard deviations of monthly mean radiation remaining after bias correction with methods BCvda1 (left) and BCvma1 (right).

Figure 5 .
Figure 5. Overall performance of bias correction methods BCvda1, BCvda0, BCvdb1, and BCvma1 for longwave (top) and shortwave (bottom) radiation at the daily (left) and monthly (right) timescales as quantified by p values of two-sample Kolmogorov-Smirnov test statistics of the respective E2OBS and SRB data before (black) and after (colours) bias correction (see Appendix C; greater p values indicate stronger agreement of E2OBS and SRB distributions).The p values are determined individually for each grid cell, season, and calibration data sample, with all corresponding values pooled into one distribution and omitting shortwave radiation data from months with average rsdt less than 1 W m −2 .The horizontal lines of each box-whisker plot represent the 90th, 75th, 50th, 25th, and 10th (from top to bottom) grid-cell area-weighted percentiles of the natural logarithms of these p values over calibration data sample (1sthalf, 2ndhalf), latitude, and longitude.The grey horizontal line marks the p = 10 % significance level.

Figure 6 .
Figure 6.Same as Fig. 5 but based on p values of Kuiper's two-sample test statistic.

Figure 7 .
Figure 7. Relative change by bias correction with methods BCvda0 (left), BCvda1 (middle), and BCvda2 (right) of the root-mean-square deviation (RMSD) of daily mean E2OBS-grid-scale longwave (a-f) and shortwave (g-l) radiation from the aggregated SRB-grid-scale values based on 1• grid cells of the SRB grid (a-c, g-i) and the staggered SRB grid (d-f, j-l; see text).For every 1 • grid cell and calendar month, the RMSDs are calculated using original or bias-corrected E2OBS data from the four 0.5 • grid cells contained in the 1 • grid cell, pooling data from the entire December 1983-November 2007 time period and omitting shortwave radiation data from months with average rsdt less than 1 W m −2 .Depicted are median and agreement in the direction of monthly RMSD changes by bias correction (same colouring scheme as in Fig.2).Very similar results are obtained for the corresponding basic bias correction methods.

Figure D1 .
Figure D1.Optimal window length for running mean and maximum calculations that precede the estimation of parameters of the climatological distributions of longwave (v = l; top) and shortwave (v = s; bottom) radiation that are used for bias correction with BCvda1 (see Table1).Window lengths are varied between 10 and 40 days.Optimal window lengths maximise the p value of the two-sample KS statistic of bias-corrected E2OBS data cross validated against SRB data (see Sect. 4 and Appendix C) and are determined individually for every grid cell, calendar month (with all corresponding values pooled into one distribution), and calibration data sample (every1st, every2nd).Zonal medians of optimal window lengths for each month and calibration data sample are shown in panels (a) and (c).Results are masked in (c) where and when the monthly mean rsdt (Eqs.B1-B4) is less than 1 W m −2 .Panels (b) and (d) show medians of optimal window lengths over months and calibration data samples.

Figure D2 .
Figure D2.Dependence of two-sample KS statistic p values on window length for different radiation types and calibration data samples (see text and Fig. D1).Plotted are the grid-cell area-weighted 50th (a) and second (b) percentiles of the natural logarithms of the p values over months, latitudes, and longitudes.
• × 1.0 • latitude-longitude grid and span the July 1983-December 2007 time period.For bias correction and cross validation, a 24-year subsample of these data that spans the December 1983-November 2007 time period was used and is used here.Additional data from the adjacent months November 1983 and December 2007 are employed for computations of running mean values.The SRB estimates of rlds and rsds are based on satellite-derived cloud parameters and ozone fields, reanalysis meteorology, and a few other ancillary datasets.Due to a lack of satellite coverage during most of the July 1983-June 1998 time period over an area centred at 70 • E, SRB data artefacts are present over the Indian Ocean (https://gewex-srb.larc.nasa.gov/common/php/SRB_known_issues.php, last access: 18 May 2018; see Figs. 2-4, 7).Every SRB grid cell contains exactly four E2OBS grid cells.