The manuscript presents the results of the computation of the Spatial Permutation Entropy (SPE) on two SST datasets (ERA5 and NOAA OI v2) over two regions (El Nino and Gulf Stream). The SPE is computed from “spatial patterns”, which encode how neighbouring pixel values compare against each other, and the SPE is defined as an entropy over the distribution of these patterns. Spatial Mutual Information (SMI) on the distribution of patterns is also considered. I thank the authors for the clarifications added with respect to the previous versions of the manuscript, which made some explanations clearer. I think in particular that the videos showing the evolution of the histogram of the patterns are useful to interpret the variations of the SPE.
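To fix ideas, here is how I understand the construction, written as a minimal sketch. It is my own illustration and not the authors' code: the function names, the handling of ties (broken by position) and the normalisation by log(L!) are assumptions on my side, while L = 4 and the spacing delta follow the notation of the manuscript.

```python
import math
from collections import Counter


def ordinal_pattern(values):
    """Rank pattern of a sequence, e.g. strictly increasing values -> (0, 1, 2, 3)."""
    # sorted() is stable, so tied values are ranked by position (an assumption).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return tuple(ranks)


def pattern_counts(field, L=4, delta=1, direction="WE"):
    """Count patterns of L pixels spaced by delta along rows (WE) or columns (NS)."""
    if direction == "NS":
        field = [list(col) for col in zip(*field)]  # transpose: columns become rows
    span = (L - 1) * delta
    counts = Counter()
    for row in field:
        for start in range(len(row) - span):
            word = [row[start + j * delta] for j in range(L)]
            counts[ordinal_pattern(word)] += 1
    return counts


def normalized_entropy(counts, L=4):
    """Shannon entropy of the pattern histogram, divided by log(L!) (assumed normalisation)."""
    total = sum(counts.values())
    h = sum((c / total) * math.log(total / c) for c in counts.values())
    return h / math.log(math.factorial(L))


# Hypothetical field with a uniform west-east gradient: only one pattern occurs.
gradient_field = [[float(j) for j in range(10)] for _ in range(5)]
we_counts = pattern_counts(gradient_field, L=4, delta=1, direction="WE")
we_entropy = normalized_entropy(we_counts)
```

With this toy field every word is increasing, so the histogram collapses onto the pattern 0123 and the entropy vanishes; a noisy field would instead populate all 24 patterns.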
There are three ways in which the SPE can be used:
1. It can be used to detect transitions in a dataset, either through change points in H_{NS} and H_{WE} (Sec. 4.1), or through SMI_{NS} and SMI_{WE} (Sec. 4.2, two datasets required)
2. Variations of the SPE can be interpreted in terms of changes in the relative importance of spatial patterns, which the authors interpret as an increase or decrease of the gradient patterns (Sec. 4.1)
3. It can be used to characterize the similarity of two datasets (how similar the distributions of patterns are for the two datasets, Sec. 4.2)
This underlines the potential versatility of the tool, as emphasized by the authors in the manuscript. However, I think that concrete and pragmatic results (quantitative and qualitative) are lacking for the SPE to be used by other researchers in their work on different datasets.
Main comments:
About use 1. of the SPE: it is shown in Sec. 4.2 that the transitions detected by PELT on SPE time series are not all detected by PELT on SMI_{hist}, and none are detected in the time series of Pearson’s spatial cross-correlation coefficient (r) and of the Average Absolute Difference (AAD). I think that this is the main piece of evidence in the manuscript in favour of the SPE for detecting transitions. The comparison of SMI_{hist}, the AAD and r with the SPE is otherwise quite limited, so that it is not clear whether these tools could be complementary to the SPE (see below). Moreover, the comparison of the SPE with the AAD and r is done using the two datasets to detect transitions, and no equivalent comparison is done in the much more common case where only one dataset is available for the variables of interest.
I also wonder whether the AAD and r are the best choices to provide a point of comparison. It seems from lines 338-339 that techniques to detect transitions already exist in the geophysical literature. Why did the authors not use the methods of these papers to provide points of comparison?
According to the remarks added by the authors in this new version of the manuscript, there is no guarantee that the SPE detects all change points (there are actually some change points which were detected in some configurations of the SPE but not in others), so the authors propose to use this tool in conjunction with other data analysis techniques. Unfortunately, the authors do not suggest which other tools could be used to complement the SPE. It is left to the user to further understand the weak points of the SPE in order to identify which tools should be used in addition to it. A way of addressing this issue would be to provide a quantification of the transitions detected. Such tests can be done by generating synthetic datasets with known transitions. I think that yet another possibility would be to count the fraction of transitions detected by the SPE in the ERA5 and in the NOAA datasets, after having listed all the changes in the methodology used to produce these datasets (it seems reasonable to me that an almost exhaustive listing is doable, since I expect datasets like ERA5 to be well-documented). With synthetic datasets, one could also investigate which types of transitions are detected by the method and which are not.
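The kind of quantification I have in mind can be sketched as follows. This is only an illustration under my own assumptions: the function name, the matching tolerance and the example indices are hypothetical, and the detector itself (for instance PELT applied to the SPE time series) is not included.

```python
def detection_rate(true_cps, detected_cps, tolerance=2):
    """Score detected change points against known ones.

    A known transition counts as detected if an unused detection lies within
    +/- tolerance time steps of it. Returns (fraction detected, false alarms).
    """
    matched = set()
    hits = 0
    for t in true_cps:
        candidates = [d for d in detected_cps
                      if abs(d - t) <= tolerance and d not in matched]
        if candidates:
            # Use the closest unused detection for this known transition.
            matched.add(min(candidates, key=lambda d: abs(d - t)))
            hits += 1
    fraction = hits / len(true_cps) if true_cps else float("nan")
    false_alarms = len(set(detected_cps) - matched)
    return fraction, false_alarms


# Hypothetical example: two transitions inserted in a synthetic dataset at time
# steps 120 and 300, and two change points returned by some detector.
fraction, false_alarms = detection_rate([120, 300], [121, 250], tolerance=2)
```

In this made-up case one of the two inserted transitions is recovered and one detection is spurious. Repeating the exercise over many synthetic realisations and transition types would give the detection fraction and false alarm rate asked for above.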
About use 2. of the SPE: the authors interpret variations of the SPE as increases or decreases of gradients in the SST. I think that this is an interesting potential use, but I think that the caveats for such a use should be refined. Indeed, as explained at lines 221-230, some low values of the SPE with delta = 8 in the NS direction of the El Nino region are not explained by gradients of the SST. It can be checked explicitly on the videos showing the histograms of patterns for ERA5 on this region that there are a lot of histograms which are not dominated by the patterns 0123 and 3210 (for example on 1998-06, 2002-11, 2010-03, 2010-05, 2015-06, 2015-10, 2025-01). I think that this is a nice example showing that a low SPE does not automatically mean that the 0123 and 3210 patterns dominate, and that using the SPE in such a way would require more investigation to be able to confidently draw conclusions on datasets for which we do not know whether there are gradients or not.
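The point can be illustrated with two made-up histograms over the 24 patterns of length 4. The numbers below are hypothetical and not taken from the manuscript, and the normalisation by log(24) is my assumption. Both histograms have the same entropy, but only one of them is dominated by the gradient patterns.

```python
import math
from itertools import permutations

PATTERNS = list(permutations(range(4)))      # the 24 ordinal patterns for L = 4
GRADIENT = {(0, 1, 2, 3), (3, 2, 1, 0)}      # monotonic ("gradient") patterns


def entropy_of_histogram(probs):
    """Shannon entropy of a pattern histogram, normalised by log(24)."""
    h = sum(p * math.log(1.0 / p) for p in probs.values() if p > 0)
    return h / math.log(len(PATTERNS))


def gradient_fraction(probs):
    """Total probability carried by the patterns 0123 and 3210."""
    return sum(probs.get(p, 0.0) for p in GRADIENT)


def peaked_histogram(dominant, mass=0.9):
    """Histogram putting `mass` evenly on `dominant`, the remainder evenly elsewhere."""
    rest = [p for p in PATTERNS if p not in dominant]
    probs = {p: mass / len(dominant) for p in dominant}
    probs.update({p: (1.0 - mass) / len(rest) for p in rest})
    return probs


hist_gradient = peaked_histogram([(0, 1, 2, 3), (3, 2, 1, 0)])
hist_other = peaked_histogram([(0, 2, 1, 3), (3, 1, 2, 0)])

same_entropy = math.isclose(entropy_of_histogram(hist_gradient),
                            entropy_of_histogram(hist_other))
```

Here `same_entropy` is true while `gradient_fraction` differs strongly between the two histograms, so reporting the fraction of 0123 and 3210 patterns alongside the SPE would make the gradient interpretation checkable.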
About use 3.: the authors finally use the SPE to compare the two datasets. Again, I find the conclusions somewhat vague. There are two conclusions: a) the datasets become more similar with time, b) the datasets are more similar at large scales. I think that conclusion a) is to be expected, given the “significant advances in Earth observation systems due to the introduction of new satellite observations and new data processing methodologies” (lines 293-294). Stated in this way, conclusion b) should also be expected. It would be interesting to be able to define a scale (possibly depending on the user’s needs) where the datasets agree sufficiently (this scale would probably depend on time, since the datasets become more similar with time). I think that Fig. C10 provides an interesting starting point for this analysis, and I think that developing this in the main text would support this way of using the SPE. Characterizing qualitatively the differences between the datasets could also be interesting, so that users could choose one or the other depending on the processes they study.
Minor comments:
I do not understand the explanation about the sudden drops of H_{WE} for delta = 8 in lines 217-221: why does uneven cooling/warming explain sudden drops? What exactly does “at the smaller scale the variations of SST are more uniform” (lines 220-221) mean? Is there a reference for that? Or does it just mean that the SST values are very noisy at these scales?
What can we conclude from the fact that there are correlations between the SST anomaly and the SPE for the El Nino region (lines 207-216), especially given that no correlation is found in the Gulf Stream region (lines 231-233)?
What can we conclude from the fact that a transition in 2016 is detected in SMI_{NS} with delta = 1 in the El Nino region (lines 251-252)? Is this related to some change in the datasets? Why was it not detected from Fig. 3?
Are the transitions reported in lines 259-264 already found in Sec. 4.1? If not, does that mean that it is better to have two datasets to compare to detect transitions in one of them, rather than computing the SPE on one dataset?
I find Appendix A confusing:
1. If I understand correctly, the CPD algorithm used for the SMI is described in lines 352-361, while the one used for the SPE, the AAD and r is described at lines 349-350 and 362-376. Is that correct?
2. Are surrogates created for the SMI? I would think so from lines 341-347, but not from lines 352-361.
3. I do not really understand the point of the steps described in lines 368-376: from what I understand, the previous step makes it possible to identify a value of P for which no false change points are detected, so why add another step?
4. Why make a difference between the points mentioned in the main text and those reported in Table A1? If change points are considered robust (and therefore reported in Table A1), why not consider them in the text?
5. The first two quartiles of P are the same as those of R (Eq. (A1)), so lines 368-376 seem to simply describe that half of the change points are considered robust, is that correct? Since change points are supposed to correspond to something real, it seems quite arbitrary to consider only half of them robust.
Are there other methods to choose a suitable penalty parameter than the one described in Appendix A? The original paper about PELT seems to present some of them, and Rocha and de Souza Filho (2020) (cited at line 339) seem to discuss methods to choose penalty functions. Why did the authors not use these functions? To have a better idea of the performance of the SPE in detecting transitions, I think that it would be better not to use new methods to choose P.
Technical comments:
Appendix B seems to have redundant explanations with Appendix A, please merge the two.
I find the notation to report coefficients and p-values in the caption of Fig. 4 not very clear.
Fig. C10: is there a reason why the results displayed were computed with L = 3 instead of L = 4 as in the rest of the paper?
Line 100: k is not an integer, so k \in [1, …, L!] is not really correct
Line 154: “we have also performed the analysis…” (“the” is missing)
Line 324: “Both the size…” (no comma after “both”)
Lines 416-417: the sentence “The first one corresponding to a change point with linear trend before and after, which survives detrending, and the second one to a trend/no-trend transition” lacks a main verb.
Lines 189-198: The p-value of the coefficient from the fit in panel (a) of Fig. 4 implies that the coefficient can be considered to be 0. But this paragraph is a little misleading about that (it seems to say that all entropies follow the same kind of trend).
Line 385: “panel a” → “panel (a)”, same for “panel d”
Lines 221-230: it would maybe be easier to understand if it were said explicitly that the patterns with delta = 8 span more than half of the length of the selected region in the NS direction.
Lines 254-255: “at long scales, warming signals are consistently identified in both, ERA5 and NOAA”: how exactly are the warming signals identified in Fig. 6? Is this related to what is discussed about Fig. 4?
Lines 264-265: it should be stated that the transition here is in addition to the 2007 one reported just above.
Line 305: what is meant by “that are consistent with the two datasets”?
Line 350: “ADD” → “AAD”
Line 355: “non” → “none”
Line 362: “ADD” → “AAD”
Find comments in the attached PDF.