08 Apr 2021
Is time a variable like the others in multivariate statistical downscaling and bias correction?
 ^{1}Météo-France, 42 avenue Gaspard-Coriolis, 31057, Toulouse, France
 ^{2}Laboratoire des Sciences du Climat et de l’Environnement (LSCE-IPSL), CEA/CNRS/UVSQ, Université Paris-Saclay, Centre d’Etudes de Saclay, Orme des Merisiers, 91191 Gif-sur-Yvette, France
 These authors contributed equally to this work.
Abstract. Bias correction and statistical downscaling are now regularly applied to climate simulations to make them more usable for impact models and studies. Over the last few years, various methods have been developed to account for multivariate (inter-site or inter-variable) properties in addition to the more usual univariate ones. In such methods, temporal properties are either neglected or accounted for specifically, i.e., differently from the other properties. In this study, we propose a new multivariate approach called “Time Shifted Multivariate Bias Correction” (TSMBC), which aims to correct the temporal dependency in addition to the other marginal and multivariate aspects. TSMBC relies on considering the initial variables at various times (i.e., lags) as additional variables to correct. Hence, the temporal dependencies to correct (e.g., autocorrelations) are viewed as inter-variable dependencies to be adjusted, and an existing multivariate bias correction (MBC) method can then be used to meet this need. This approach is first applied and evaluated on synthetic data from a Vector Auto Regressive (VAR) process. In a second evaluation, we work in a “perfect model” context where a Regional Climate Model (RCM) plays the role of the (pseudo) observations, and where its forcing Global Climate Model (GCM) is the model to be downscaled/bias corrected. For both evaluations, the results show a large reduction of the biases in the temporal properties, while inter-variable and spatial dependence structures are still correctly adjusted. However, increasing the number of lags too much does not necessarily improve the temporal properties, and too large an increase in the number of dimensions of the dataset to correct can even cause some instability in the adjusted/downscaled results, calling for a reasoned use of this approach for large datasets.
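The idea of treating lagged copies of the variables as extra variables can be illustrated with a minimal NumPy sketch. The function name `time_shift_embed` and the row convention (row i holds time steps i to i+s) are assumptions made for illustration, not the authors' implementation; the output shape matches the $(N_X - s)\times d(s+1)$ size mentioned in the referee comments.

```python
import numpy as np

def time_shift_embed(X, s):
    """Stack X with its s time-shifted copies (hypothetical helper).

    X has shape (N, d). Row i of the result is the concatenation
    [X[i], X[i+1], ..., X[i+s]], so the result has shape
    (N - s, d * (s + 1)). The row convention is an assumption.
    """
    X = np.asarray(X)
    N, d = X.shape
    if not 0 <= s < N:
        raise ValueError("lag s must satisfy 0 <= s < N")
    # Block k holds the series shifted by k time steps.
    return np.hstack([X[k:N - s + k] for k in range(s + 1)])

# Toy series: N = 6 time steps, d = 2 variables.
X = np.arange(12, dtype=float).reshape(6, 2)
Z = time_shift_embed(X, s=2)
print(Z.shape)
```

Any multivariate bias correction method that accepts a matrix of samples could then, in principle, be applied to `Z` instead of `X`, so that autocorrelations appear to it as ordinary inter-variable dependencies.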
Yoann Robin and Mathieu Vrac
Status: closed

RC1: 'Comment on esd202112', Anonymous Referee #1, 18 May 2021
General Comments:
In this manuscript, a new method of incorporating the temporal variable into a multivariate bias correction is introduced with sufficient motivation and with a thorough and clear description. This new method is versatile in that it can work with any existing MBC method, and this is demonstrated by applying it to dOTC and to a more naive method the authors call Random Bias Correction. The method is first tested on a synthetic dataset for an exploratory tuning of the parameters, then on a real dataset. A few points from the analyses of the real-data experiments are unconvincing (these are addressed in the specific comments), but most results are well supported. A new generalizable metric is introduced for measuring bias reduction relative to some ground-truth dataset, but its benefits and shortcomings could be discussed further.
Specific Comments:
 In section 2.2, the concept of reconstruction by rows is introduced. Reconstruction by rows certainly seems to perform better than reconstruction by columns. It is asserted here that many reconstructions are possible and that these are determined by the "starting row". Starting at the $n^{\text{th}}$ row for $1 < n < l$ for some lag $l$ omits the first $n-1$ values, which are clearly needed in the final reconstruction. It is possible that those $n-1$ values are repeated more than once in the lagged matrix, and a more specific description of how to include these values is needed.
 Section 3.1 asserts that the starting row has little impact on the overall bias correction performance, and this is attributed to the high correlations of the results of the TSMBC method with the biased data matrix X, as well as the high correlations between the results of the TSMBC method with varying starting rows (as shown in Figure 3). Figure 3 also shows that all TSMBC results have very low correlations with Y, the reference matrix. Shouldn't the results of TSMBC be "corrected" and therefore exhibit higher correlations with Y than with X?
 The major aspect of TSMBC is that, by adding lagged versions of the original time series data, the data is augmented to include the temporal variable as just another variable. This initial mapping from a matrix of size $N_X\times d$ to one of size $(N_X - s)\times d(s+1)$ is injective, but the inverse mapping is not. The authors chose to use a simple reconstruction that relies on only one extra parameter, the starting row, as a way to choose this inverse mapping, and assert in section 3.1 that the choice of the starting row does not have a big impact. Given that the analysis of Figure 3 is unconvincing, it may be important to consider more carefully how to design the inverse mapping. For example, what is the variance of the repeated values? For TSMBC with lag $s$, some time indices are repeated $s+1$ times in total in the reconstruction. Are those $s+1$ values all very close to each other? If not, should some averaging scheme be used? If not, what does the variability in the reconstruction at some time index indicate about whether it should be trusted?
 Regarding the analysis of Figure 8 (p. 13, lines 372-389): the statement in lines 374-375, "Generally speaking, for a specific configuration of the method (i.e., L1V, L2V, S1V or S2V), TSMBC (5 or 10) is better than dOTC that does not account for temporal properties.", is not well supported by Figure 8. Apart from the plots for tas/tas (first column in Figure 8), it is difficult to see that the TSMBC cells show darker (higher BR_w) values than the naive comparison dOTC. In addition, shouldn't the 3 methods (dOTC, TSMBC5, TSMBC10) all show the same value/color for lag 0 for each of L1V, L2V, S1V, and S2V? What are some reasons why they do not?
 One justification for why TSMBC10 performs worse than TSMBC5 is that the inflated data size $(N_X - 10)\times d(10+1)$ results in a higher-complexity method. In lines 412-413, it is stated: "The increase in the complexity (i.e., the number of dimensions) of the method is made at the expense of the quality of the results." This is a vague statement and could be made stronger with more specific ideas. For example, the increased number of dimensions could potentially lead to linear dependence, which could then interfere with the underlying MBC method being used. There could be other ways in which the increased complexity has negative effects, and they should be discussed in more detail. Given the size of the problem, numerical instability should probably be ruled out.
 Regarding the BR_{\Kappa} metric: one downside of this metric is explained well in the conclusions, in lines 458-461: "However, biases in the intensities of the (inter-variable, inter-site or temporal) correlations might remain. This is typically related to very small differences between two Wasserstein distances very close to zero: if the raw simulations already have a DCP set close to the reference, its Wasserstein distance will be near zero. Therefore, the relative reduction of bias BR can be strongly negative, even though the absolute difference is potentially very small."
Maybe this point should be suggested when the metric is first introduced in section 4.1.
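The question raised above about repeated values in the inverse mapping can be probed with a small diagnostic. The sketch below is not the authors' reconstruction; the helper names and the assumed layout (block k of row i holds time index i + k) are illustrative. It gathers every copy of each time index from a lagged matrix and reports the mean and the spread across copies.

```python
import numpy as np

def copies_per_time_index(Z, d, s):
    """Collect all copies of each time index stored in a lagged matrix Z.

    Z has shape (N - s, d * (s + 1)); block k of row i is assumed to
    hold time index i + k (hypothetical layout). Returns a list of N
    arrays, each of shape (n_copies, d).
    """
    n_rows = Z.shape[0]
    N = n_rows + s
    copies = [[] for _ in range(N)]
    for i in range(n_rows):
        for k in range(s + 1):
            copies[i + k].append(Z[i, k * d:(k + 1) * d])
    return [np.array(c) for c in copies]

def reconstruct_by_averaging(Z, d, s):
    """Return the mean and standard deviation over copies per time index."""
    copies = copies_per_time_index(Z, d, s)
    mean = np.array([c.mean(axis=0) for c in copies])
    spread = np.array([c.std(axis=0) for c in copies])
    return mean, spread

# Toy check on an uncorrected lagged matrix: all copies agree exactly.
N, d, s = 6, 2, 2
X = np.arange(N * d, dtype=float).reshape(N, d)
Z = np.hstack([X[k:N - s + k] for k in range(s + 1)])
counts = [len(c) for c in copies_per_time_index(Z, d, s)]
mean, spread = reconstruct_by_averaging(Z, d, s)
print(counts)
```

On an uncorrected matrix the spread is zero by construction. After a correction that treats rows as independent samples, a non-zero spread at a given time index would quantify how much the competing reconstructions disagree there.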
Technical Corrections
 Should "corrected" in line 227 be "correlated" instead?
 Line 289 should have $[-\infty, 1]$ instead of $]-\infty, 1]$

AC1: 'Reply on RC1', Yoann Robin, 30 Jun 2021
The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd202112/esd202112AC1supplement.pdf

RC2: 'Comment on esd202112', Anonymous Referee #2, 19 May 2021
This manuscript deals with a new approach (TSMBC) for incorporating time as an additional variable into a multivariate bias correction. The approach can be used with existing multivariate BC methods such as MBCn, R2D2, or MRrec. Here, the dOTC approach is followed and the results are compared to a “naive” method, the “Random Bias Correction” (RBC).
The method is first tested on a synthetic dataset following a VAR process, before being applied to “real” climate data based on a pseudo-reality approach, i.e. treating the RCM results as observations.
The approach could potentially be interesting and innovative. It seems that this is the first time that time is treated as a separate variable in the bias correction. However, I have some doubts that the results are reliable for the application to the real-data case (see detailed comments below). Moreover, I think that the evaluation of TSMBC using synthetic data based on the VAR process is of limited value. It did not convince me technically or scientifically, nor did it help me to better understand the proposed procedure.
On the other hand, more information is required to understand the potential value of TSMBC. The authors did not convincingly present the methodological background. Critical questions remain unanswered, e.g. what is a VAR process? How is the sampling from the VAR process done? How does dOTC work?
The Wasserstein metric is also not well introduced in the method section.
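For readers unfamiliar with the term, a first-order vector autoregressive process can be written as X_t = A X_{t-1} + eps_t with Gaussian noise eps_t. The following is a generic sketch of how such a process might be sampled; the coefficient matrix, noise covariance, burn-in length and function name are arbitrary illustrative choices and do not reproduce the configuration used in the manuscript.

```python
import numpy as np

def sample_var1(A, cov, n_steps, rng, burn_in=100):
    """Draw n_steps samples from X_t = A @ X_{t-1} + eps_t, eps_t ~ N(0, cov).

    Generic VAR(1) sampler (illustrative, not the manuscript's setup).
    """
    A = np.asarray(A, dtype=float)
    d = A.shape[0]
    # Stationarity requires all eigenvalues of A inside the unit circle.
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("A must have spectral radius < 1")
    noise = rng.multivariate_normal(np.zeros(d), cov, size=n_steps + burn_in)
    x = np.zeros(d)
    out = np.empty((n_steps + burn_in, d))
    for t in range(n_steps + burn_in):
        x = A @ x + noise[t]
        out[t] = x
    # Discard the burn-in so the start at zero does not affect the sample.
    return out[burn_in:]

rng = np.random.default_rng(0)
A = np.array([[0.6, 0.2], [0.0, 0.5]])   # arbitrary stable coefficients
cov = np.eye(2)                           # arbitrary noise covariance
X = sample_var1(A, cov, n_steps=5000, rng=rng)
# Empirical lag-1 autocorrelation of the second variable.
r1 = np.corrcoef(X[:-1, 1], X[1:, 1])[0, 1]
```

Because the second variable here depends only on its own past, its lag-1 autocorrelation should be close to the corresponding diagonal coefficient of `A`, which gives a quick sanity check on the sampler.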
Major issues:
 It remains unclear how and why increasing the number of dimensions (whether time lags or other “variables”) affects the stability of the approach. It is just mentioned that the dimensionality should not exceed 10.
 I have some concerns about applying a BC using climate simulations (based on GCMs and not on reanalysis data) if the temporal sequence of variables is addressed. However, in this case I think it would be acceptable, since the reference is not observational data but downscaled results of the same forcing GCM.
 My main concern stems from Figure 1 (right, top line). It seems that the mean precipitation and temperature fields do not correspond to the coastline, as I would strongly expect them to. Due to the coarse resolution, some distortions in the overlay are to be expected, but this looks really erroneous. It seems that the projection of the GCM and RCM is wrong; it could be reversed left to right.
 Unfortunately, this would have tremendous impacts on the subsequent results and interpretations (e.g. the spatial dependencies given in Figure 6). For instance, please explain the statement in lines 300-302. Why is the evolution of the GCM variables so different from that of the RCM? Indeed, the RCM includes more spatially detailed “processes”, but it is still driven by the GCM. Since the domain of the RCM is rather small, the impact of the forcing is expected to dominate the RCM simulations.
 Moreover, I cannot understand the different performances in the calibration and the projection periods (Figures 4 & 5). I would expect very similar performances. What is leading to the big discrepancies between the two periods?
 The evaluation results of TSMBC using synthetic data based on the VAR process are not convincing (whole section 3) and, at least for me, not fully understandable. For the revisions, I would suggest leaving out this synthetic exercise. Rather, I would focus on better explaining the applied methods, i.e. the bias correction approach applied here (dOTC), the Wasserstein-based metric, and how the naïve RBC (reference approach) works. I am also wondering if this naïve approach is really suitable for a fair comparison.
 The introduction should be improved, e.g. the statement given in line 28 (“… (ii) from inherent biases in the model simulations.”) is not very helpful. Potential reasons for the biases should be mentioned. More, and more recent, references are required, e.g. for the strong statements given in lines 39 & 40.

AC2: 'Reply on RC2', Yoann Robin, 30 Jun 2021
The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd202112/esd202112AC2supplement.pdf