Classification of synoptic circulation patterns with a two-stage clustering algorithm using the structural similarity index metric (SSIM)

Winderlich, Kristina; Dalelane, Clementine; Walter, Andreas

doi:https://doi.org/10.5194/esd-2022-29

29 Jul 2022

Status: this preprint has been withdrawn by the authors.

Abstract. We develop a new classification method for synoptic circulation patterns with the aim of extending the evaluation routine for climate simulations. This classification is applicable to any region of the globe, of any size, given the reference data. Its novelty is the use of the structural similarity index metric (SSIM) instead of traditional distance metrics for cluster building. The classification method combines two classical clustering algorithms used iteratively, hierarchical agglomerative clustering (HAC) and k-medoids, with only one pre-set parameter: the threshold on the similarity between two synoptic patterns, expressed as the SSIM. This threshold is set by the user to imitate the human perception of the similarity between two images (similar structure, luminance and contrast), and the number of final classes is determined automatically.
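For orientation, the SSIM in its commonly cited single-window form (Wang et al., 2004) is given below. This is the textbook definition, not a formula taken from the preprint; the manuscript may use a windowed or otherwise modified variant, and the symbols here are the standard ones rather than the authors' notation.

```latex
\mathrm{SSIM}(x, y) =
  \frac{(2\mu_x \mu_y + C_1)\,(2\sigma_{xy} + C_2)}
       {(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}
```

Here $\mu_x$ and $\mu_y$ are the means of the two fields (luminance), $\sigma_x^2$ and $\sigma_y^2$ their variances (contrast), $\sigma_{xy}$ their covariance (structure), and $C_1$, $C_2$ small stabilising constants. The value is 1 for identical fields and decreases as the fields diverge.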

We apply the SSIM-based classification method to the geopotential height at the 500 hPa pressure level from the ERA-Interim reanalysis for 1979–2018 and demonstrate that the resulting classes are 1) consistent with changes in the input parameter, 2) well separated, 3) spatially and temporally stable, and 4) physically meaningful.

We use the synoptic circulation classes obtained with the new classification method for evaluating CMIP6 historical climate simulations and an alternative reanalysis (for comparison purposes). The output fields of CMIP6 models (and of the alternative reanalysis) are assigned to the classes and the quality index is computed. We rank the CMIP6 simulations according to this quality index.

Received: 05 Jul 2022 – Discussion started: 29 Jul 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Interactive discussion

Status: closed

RC1:
'Comment on esd-2022-29', Anonymous Referee #1, 04 Aug 2022

Please see attached review.

Citation: https://doi.org/10.5194/esd-2022-29-RC1
- AC2: 'Reply on RC1', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC2
AC1:
'Comment on esd-2022-29', Kristina Winderlich, 17 Aug 2022

The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC1-supplement.pdf

Citation: https://doi.org/10.5194/esd-2022-29-AC1
- EC1: 'Reply on AC1', Gabriele Messori, 09 Sep 2022
  
  Dear Authors,
  >> We would like to ask the Editor to make the judgement, which options the manuscript may still have.
  In reply to the above query, I would suggest that you provide replies to all reviewer comments. Based on the comments and your replies I will then inform you of my decision regarding the manuscript.
  Best Regards,
  Gabriele Messori
  
  Citation: https://doi.org/10.5194/esd-2022-29-EC1
RC2:
'Comment on esd-2022-29', Anonymous Referee #2, 23 Aug 2022
Review of the manuscript entitled: “Classification of synoptic circulation patterns with a two-stage clustering algorithm using the structural similarity index metric (SSIM)” by Kristina Winderlich, Clementine Dalelane and Andreas Walter

Summary

The authors develop a new classification method for synoptic circulation patterns with the aim to extend the evaluation routine for climate simulations. Its unique novelty is the use of the structural similarity index metric (SSIM) instead of traditional distance metrics for cluster building. This classification method combines two classical clustering algorithms used iteratively, hierarchical agglomerative clustering (HAC) and k-medoids. The authors apply the classification method to ERA-Interim and NCEP1 reanalysis, and CMIP6 models. The authors wish to demonstrate that the built classes are consistent, well separated, spatially and temporally stable, and physically meaningful. Finally, the authors rank the CMIP6 models according to their ability to represent the weather types using different quality indices.

Dear authors,

Using synoptic circulation patterns to evaluate climate models is a welcome aim, but this is not the first time it has been done, as the text may suggest. Indeed, the ability of models to capture the characteristics of synoptic patterns is an important aspect of improving climate model simulations. The SSIM is an interesting and seemingly promising approach for the classification of weather regimes. The article is generally well written; however, it should be extended to serve as a high-quality research article in ESD.

My comments and suggestions to improve the manuscript are as follows:

General comments

Many classification algorithms attempt to categorize weather types/regimes over the Atlantic-European-Mediterranean region. If the authors suggest a new procedure, they should at least demonstrate why their classification is better than other classification procedures. Indeed, the authors try to explain their choices, but do not demonstrate how their procedure is superior in comparison to other classifications. Perhaps the authors can randomly select days and subjectively see for how many of them the classification does a decent job? Comparing to the original classification you mention in the text would then provide a semi-quantitative way of demonstrating the improvement from one classification to the other.

Forty-three classes seems a rather large number of weather types, which could probably be reduced significantly by some sort of EOF analysis. If not, the authors should at least explain why they do not use this very common approach. Furthermore, I would like to see some further explanation of how these synoptic types relate to the four canonical weather regimes.

The CMIP6 model evaluation section in its current form is rather short and does not provide very useful information for model developers. This section should probably be extended. It would be nice to have some discussion as to why you think some models are better or worse. Additional analysis is of course welcomed, but should probably be balanced with the length of the article.

Specific comments

Abstract

What do you mean by 'physically meaningful'? 'Physical' can have different meanings, and you should probably clarify this in the text.

Line 10: This sentence should be at the very end of the abstract.

Do you think your classification would be useful for extended-range weather forecasts? If so, mention this in the abstract and discuss it in the conclusions.

Introduction

Line 43 – 47: From the introduction, it sounds as if you are the first and only group evaluating models based on weather regimes. However, there is an increasing body of knowledge working in this direction. To name a few articles:

References

Dorrington, J., Strommen, K., and Fabiano, F.: Quantifying climate model representation of the wintertime Euro-Atlantic circulation using geopotential-jet regimes, Weather Clim. Dynam., 3, 505–533, https://doi.org/10.5194/wcd-3-505-2022, 2022.

Fabiano, F., Christensen, H.M., Strommen, K. et al. Euro-Atlantic weather Regimes in the PRIMAVERA coupled climate simulations: impact of resolution and mean state biases on model performance. Clim Dyn 54, 5031–5048 (2020). https://doi.org/10.1007/s00382-020-05271-w

Hochman A, Alpert P, Harpaz T, Saaroni H, Messori G. 2019. A new dynamical systems perspective on atmospheric predictability: eastern Mediterranean weather regimes as a case study. Science Advances, 5: eaau0936. https://doi.org/10.1126/sciadv.aau0936

Line 58: Please discuss the number of regimes some more. There are a few articles focusing on this aspect in the literature. Some use two regimes (Wallace and Gutzler, 1981), others use four (Vautard 1990), six (Falkena et al., 2020) or seven (Grams et al., 2017) regimes. This is important as you use an outstanding number of 43.

References

Falkena, S. K., de Wiljes, J., Weisheimer, A., & Shepherd, T. G. (2020). Revisiting the identification of wintertime atmospheric circulation regimes in the Euro-Atlantic sector. Quarterly Journal of the Royal Meteorological Society, 146, 2801–2814. https://doi.org/10.1002/qj.3818

Grams, C. M., Beerli, R., Pfenninger, S., Staffell, I., & Wernli, H. (2017). Balancing Europe’s wind-power output through spatial deployment informed by weather regimes. Nature Climate Change, 7, 557–562. https://doi.org/10.1038/nclimate3338

Vautard, R. (1990). Multiple weather regimes over the North Atlantic: Analysis of precursors and successors. Monthly Weather Review, 118, 2056–2081. https://doi.org/10.1175/1520-0493(1990)118<2056:MWROTN>2.0.CO;2

Wallace, J. M., & Gutzler, D. S. (1981). Teleconnections in the geopotential height field during the Northern Hemisphere winter. Monthly Weather Review, 109, 784–812. https://doi.org/10.1175/1520-0493(1981)109<0784:TITGHF>2.0.CO;2

Line 64-66: This is a very strong critique of all prior classifications, and it should be further explained why none of them fits your purpose. These classification procedures were all used extensively in the literature. If you state this, you should at least demonstrate how your classification is superior.

Data and methods

Line 80: If you use ERA-interim and not ERA5 reanalysis, you should at least say why, and mention some of the studies comparing the two data sets. I do not expect much difference for large-scale weather regimes, but this should be at least discussed.

Line 82: Please justify why you use 12:00 UTC and not daily or all 6-hourly data.

Line 82: How did you coarse-grain the data, and why to 2×3 degrees?

You often use ‘synoptic scale’, but I think it is more accurate to consider these regimes as large-scale features. I would try being more accurate on this. Perhaps change throughout the text.

Line 95: Why 151 days of smoothing? Please justify this choice.

Results

Lines 436-440: I do not completely understand how you obtained high resolution relative to coarse resolution in figure 9.

Line 454-456: Your motivation was not to use centroids in the introduction and methods section, but then you test your medoids and say that they are very similar to the centroids. Is this not a circular argument?

Section 4.6: Perhaps provide some illustrations of the different classes in the CMIP6 models, in addition to the quality indices in the table.

Table 3: I believe that there is not much difference between the models in the ‘transit’ and ‘persist’ values because there are so many classes. In addition, for the other indices the standard deviation is rather low, which is a bit surprising for more than 30 models. They all do pretty much the same job, which is again a bit surprising.

Are the model evaluation criteria significantly different from one another? I think you should test this.

Conclusions

This section is rather short and should include more discussion with respect to other articles evaluating models using a classification procedure. The article would also benefit from explaining what is better or similar in the new classification with respect to other methodologies used in the literature. The potential use of this methodology in climate projections or extended-range weather forecasts should probably also be discussed.

Technical comments:

Line 82-84: Please rephrase, something is missing here.

Line 307: This should be ‘Results’ and not ‘Method’ section.

Line 318: Change ‘gives us an evidence that’ to ‘provides evidence that’.

Line 357: Change ‘gives an evidence that’ to ‘provides evidence that’.

Figures:

Figure 4: It is very hard to see anything with so many panels.

Figure 10: I think you mixed up left and right in the caption. In addition, are there significant differences in the right panels?

Table 3: It should probably be DJF for winter in the upper row and not ‘JDF’.
Citation: https://doi.org/10.5194/esd-2022-29-RC2
- AC3: 'Reply on RC2', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC3-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC3
RC3:
'Comment on esd-2022-29', Anonymous Referee #3, 09 Sep 2022

This paper describes a novel method of clustering circulation fields, and then applies this method to assess the ability of CMIP6 models to simulate realistic circulation patterns. The paper is generally clearly written and straightforward to understand, but I feel that the authors have not sufficiently justified the use of their method over something simpler like k-means. The analysis of circulation in the CMIP6 models is also rather brief. I therefore recommend major revisions.

Major comments

1. The bulk of the paper describes a new two-step classification method, arguing that previously used methods are 'suboptimal'. However, I don't think that the authors have sufficiently motivated their choice of method - my suspicion is that standard k-means clustering would give similar results.

The authors argue that k-means clustering has a number of drawbacks:

i) the number of clusters has to be pre-specified.

(But the authors' similarity threshold parameter seems to play a similar role, as it is subjectively chosen and also influences the number of clusters.)

ii) k-means centroids could be misleading and unrepresentative of the fields in the cluster.

(But does this not also apply to medoids, as a single field chosen to represent a set of fields? Surely any daily field will contain its own set of small scale features that don't resemble those of other fields. The authors appear to find that the cluster centroids and medoids are pretty similar anyway.)

iii) k-means clusters could be sensitive to outliers. (But does this actually happen in the case of the geopotential height fields?)

The authors quote image processing references to justify the similarity metric used here over (say) mean square error. It would be more convincing if the authors could show actual examples of deficiencies in k-means clusters constructed from their circulation data, and/or that clusters produced using their method were superior to those produced using k-means (for example, using the criteria set out in section 3.3).
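The centroid-versus-medoid distinction discussed in points (i) to (iii) can be made concrete with a minimal sketch on synthetic fields. Everything here is an assumption for illustration: the fields are random, the helper `medoid_index` is hypothetical, and the similarity is a plain pattern correlation used as a stand-in, not the SSIM or the authors' implementation.

```python
import numpy as np

def medoid_index(similarity):
    """Index of the member with the largest summed similarity to all members."""
    return int(np.argmax(similarity.sum(axis=1)))

# Synthetic "cluster": one smooth base pattern plus independent noise per member.
rng = np.random.default_rng(42)
base = np.outer(np.sin(np.linspace(0.0, np.pi, 8)),
                np.cos(np.linspace(0.0, 2.0 * np.pi, 12)))
members = np.stack([base + 0.3 * rng.normal(size=base.shape) for _ in range(6)])

# Stand-in similarity: pairwise pattern correlation of the flattened fields.
flat = members.reshape(len(members), -1)
similarity = np.corrcoef(flat)

medoid = members[medoid_index(similarity)]  # an actual member, noise included
centroid = members.mean(axis=0)             # an average, not an observed field

print("medoid is member no.", medoid_index(similarity))
print("medoid/centroid pattern correlation:",
      np.corrcoef(medoid.ravel(), centroid.ravel())[0, 1])
```

The medoid retains the small-scale noise of the single field it was chosen from, while the centroid averages it out but is not an observed state; which of the two is more representative is exactly the question raised above.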

2. The analysis of the CMIP6 models is rather limited - there's a ranking of the models according to various metrics, but not much more. Why did the authors choose these particular metrics over the wide variety of other possibilities? Do the HIST statistics correspond to biases in the mean state of the models? Can the authors suggest any reasons why some models are better than others - eg resolution?

Also, the transition statistics are likely to be very noisy with 43 different circulation types. How can we be confident that the transition results from ERA-Interim are a meaningful benchmark - is there enough reanalysis data to do this?
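A back-of-the-envelope count illustrates the sampling concern. The sketch below assumes one classified field per day over the full calendar years 1979–2018 and treats all 43 × 43 class-to-class transitions (including persistence) as possible; both are simplifying assumptions, and any seasonal stratification would reduce the counts further.

```python
from datetime import date

n_classes = 43

# Assumed sample: one field per day, 1 Jan 1979 to 31 Dec 2018 inclusive.
n_days = (date(2018, 12, 31) - date(1979, 1, 1)).days + 1
n_transitions = n_days - 1        # day-to-day pairs
n_cells = n_classes ** 2          # entries in the transition matrix

mean_per_cell = n_transitions / n_cells
print(f"{n_days} days, {n_cells} matrix cells, "
      f"{mean_per_cell:.1f} transitions per cell on average")
```

Under these assumptions there are 14610 days and 1849 matrix cells, i.e. fewer than eight observed transitions per cell on average, and many cells will hold far fewer because persistence and a few preferred transitions dominate.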

Again, it would be interesting to know if the results of the model evaluation analysis are significantly different if k-means derived clusters are used instead.

Minor comments

Line 49 - "Hochman et al proved" - I think 'proved' is only an appropriate word when discussing mathematical proofs. I suggest something like 'argued' or 'demonstrated'. Also, people arguing that clusters represent genuine low-frequency weather regimes tend to find relatively few of them (four in winter seems a popular choice). Presumably the authors are not arguing that the 43 types they analyse here each represent a physical weather regime in this sense?

Line 58 - 'the moving atmosphere' - I'm not sure what this means.

Line 90 onwards - standardising the height fields means that information about the amplitude of the circulation anomalies is lost. But different amplitude anomaly patterns could produce quite different responses in eg surface air temperature and precipitation, so I'm not sure the standardisation step is beneficial.
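The loss of amplitude information can be shown with a small sketch. It assumes that standardisation means a per-field z-score over the spatial grid; the preprint's actual procedure is not reproduced here and may differ, and the helper `standardise` and the synthetic anomaly values are hypothetical.

```python
import numpy as np

def standardise(field):
    """Per-field z-score over all grid points (assumed form of the standardisation)."""
    field = np.asarray(field, dtype=float)
    return (field - field.mean()) / field.std()

# Two synthetic height-anomaly fields with the same pattern but different amplitude.
rng = np.random.default_rng(0)
pattern = rng.normal(size=(10, 15))
weak = 20.0 * pattern      # e.g. anomalies of a few tens of metres
strong = 60.0 * pattern    # same pattern, three times the amplitude

same_after = np.allclose(standardise(weak), standardise(strong))
print("identical after standardisation:", same_after)
```

Under this assumption the two fields become indistinguishable, so any downstream clustering cannot separate weak from strong occurrences of the same pattern.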

line 111 - "The k-means clustering assigns every data element to the cluster center that is closest to it, if only by a small margin." Isn't this true of any method that assigns each field to one of a set of classes?

line 112 - "This makes the method sensitive to noise in the data and may lead to an assignment of a data element to a structurally dissimilar cluster center." - what does "structurally dissimilar" mean here? How can we distinguish the noise from the structure in any given field? Can the authors show examples of fields that are far apart under the Euclidean distance metric but close together under the similarity metric, or vice versa?
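The kind of example requested here can be sketched on synthetic data. The code below uses a single-window SSIM in the standard Wang et al. (2004) form with assumed constants (`data_range=1.0`, `k1=0.01`, `k2=0.03`); it is not the authors' implementation, and the fields are idealised patterns rather than geopotential height data.

```python
import numpy as np

def mse(x, y):
    """Mean squared error between two fields."""
    return float(np.mean((x - y) ** 2))

def ssim_global(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM over the whole field (constants are assumptions)."""
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    cov = np.mean((x - mx) * (y - my))
    return float(((2 * mx * my + c1) * (2 * cov + c2))
                 / ((mx ** 2 + my ** 2 + c1) * (x.var() + y.var() + c2)))

lat = np.linspace(-1.0, 1.0, 20)
lon = np.linspace(-1.0, 1.0, 30)
pattern = np.outer(np.sin(np.pi * lat), np.cos(np.pi * lon))

# Pair A: identical structure, different amplitude.
mse_a = mse(pattern, 1.6 * pattern)
ssim_a = ssim_global(pattern, 1.6 * pattern)

# Pair B: weak anomalies of opposite sign.
weak = 0.2 * pattern
mse_b = mse(weak, -weak)
ssim_b = ssim_global(weak, -weak)

print(f"pair A: MSE={mse_a:.3f}, SSIM={ssim_a:.2f}")
print(f"pair B: MSE={mse_b:.3f}, SSIM={ssim_b:.2f}")
```

In this construction the MSE ranks pair B as the closer pair, whereas the SSIM ranks pair A as similar and pair B as strongly dissimilar. Whether such cases occur often enough in real height fields to change the clustering is the open question.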

line 116 - Doesn't using medoids also risk inflating the significance of small-scale noise in the daily field chosen as the medoid?

line 137 - "Wang and Bovik (2009) demonstrated that the MSE has serious disadvantages when applied on data with temporal and spatial dependencies" - dependencies on what? Does this mean temporal and spatial correlations?

line 194 - is the similarity between two clusters measured using their medoid fields?

line 267 - Is the algorithm stable if applied to slightly different initial subsets of the data? The number of patterns may be stable, but do the same patterns emerge from the clustering?

Figure 3 - it would make more sense to have the transition between the blues and reds in the colour bar at zero, not +0.25.

Line 245 - should there be a reference to figure 6 here?

Line 282 - "However, it is necessary to demand that a cluster medoid represents all cluster elements and their whole entity as a group." Does comparing the medoid and centroid really guarantee this?

Line 307 - is section 4 meant to be labelled 'Method', the same as section 3?

Figure 4 - Can the colour bar be included in the figure? There's room in the bottom row of panels.

Line 320 - "This correspondence gives us an evidence that, albeit not tuned to and not required to mimic semi-manual classifications, the new classification method determines not just arbitrary synoptic patterns but those described by experts in semi-manual classifications."

I'm not convinced - given that there are 43 different types, it seems quite likely that some of them could resemble Grosswetterlagen patterns by chance.

Figure 7 - the text in the figure labels could be much larger for legibility.

line 447 - again, I don't think one can infer that this is an inherent advantage of the SSIM method without making a comparison with other cluster methods.

Citation: https://doi.org/10.5194/esd-2022-29-RC3
- AC4: 'Reply on RC3', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC4-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC4
AC5: 'Comment on esd-2022-29', Kristina Winderlich, 17 Jan 2023

The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC5-supplement.pdf

Citation: https://doi.org/10.5194/esd-2022-29-AC5

Interactive discussion

Status: closed

RC1:
'Comment on esd-2022-29', Anonymous Referee #1, 04 Aug 2022

Please see attached review.

Citation: https://doi.org/10.5194/esd-2022-29-RC1
- AC2: 'Reply on RC1', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC2
AC1:
'Comment on esd-2022-29', Kristina Winderlich, 17 Aug 2022

The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC1-supplement.pdf

Citation: https://doi.org/10.5194/esd-2022-29-AC1
- EC1: 'Reply on AC1', Gabriele Messori, 09 Sep 2022
  
  Dear Authors,
  >> We would like to ask the Editor to make the judgement, which options the manuscript may
  
  >> still have.
  In reply to the above query, I would suggest that you provide replies to all reviewer comments. Based on the comments and your replies I will then inform you of my decision regarding the manuscript.
  Best Regards,
  Gabriele Messori
  
  Citation: https://doi.org/10.5194/esd-2022-29-EC1
RC2:
'Comment on esd-2022-29', Anonymous Referee #2, 23 Aug 2022
Review of the manuscript entitled: “Classification of synoptic circulation patterns with a two-stage clustering algorithm using the structural similarity index metric (SSIM)” by Kristina Winderlich, Clementine Dalelane and Andreas Walter

Summary

The authors develop a new classification method for synoptic circulation patterns with the aim to extend the evaluation routine for climate simulations. Its unique novelty is the use of the structural similarity index metric (SSIM) instead of traditional distance metrics for

cluster building. This classification method combines two classical clustering algorithms used iteratively, hierarchical agglomerative clustering (HAC) and k-medoids. The authors apply the classification method to ERA-interim and NCEP1 reanalysis, and CMIP6 models. The authors wish to demonstrate that the built classes are consistent, well separated, spatially and temporally stable, and physically meaningful. Finally, the authors rank the CMIP6 models according to their ability to represent the weather types using different quality indices.

Dear authors,

The purpose of using synoptic circulation patterns to evaluate climate models is a welcomed aim, but is not the first time this is done, as it may seem from the text. Indeed, the ability of models to capture the characteristics of synoptic patterns is an important aspect of improving climate model simulations. The SSIM is generally an interesting and seems to be promising approach for the classification of weather regimes. The article is generally well written, however it should be extended to serve as a high quality research article in ESD.

My comments and suggestions to improve the manuscript are as follows:

General comments

Many classification algorithms attempt to categorize weather types/regimes over the Atlantic-European-Mediterranean region. If the authors suggest a new procedure, they should at least demonstrate why their classification is better than other classification procedures. Indeed, the authors try to explain their choices, but do not demonstrate how their procedure is superior in comparison to other classifications. Perhaps the authors can randomly select days and subjectively see for how many of them the classification does a decent job? Comparing to the original classification you mention in the text would then provide a semi-quantitative way of demonstrating the improvement from one classification to the other.

Forty-three classes seems a rather large number of weather types and can probably be significantly reduced by some sort of EOF analysis. If not, it should at least be explained why the authors do not use this approach as it is very common. Furthermore, I would like to see some further explanation on how do these synoptic types relate to the four canonical weather regimes.

The CMIP6 model evaluation section in its current form is rather short and does not provide very useful information for model developers. This section should probably be extended. It would be nice to have some discussion as to why you think some models are better or worse. Additional analysis is of course welcomed, but should probably be balanced with the length of the article.

Specific comments

Abstract

What do you mean with physically meaningful? There may be different meanings to physical, and you should probably clarify this in the text.

Line 10: This sentence should be at the very end of the abstract.

Do you think your classification would be useful for extended-range weather forecasts? If so, mention this and in the abstract and discuss in the conclusions.

Introduction

Line 43 – 47: From the introduction, it sounds as if you are the first and only group evaluating models based on weather regimes. However, there is an increasing body of knowledge working in this direction. To name a few articles:

References

Dorrington, J., Strommen, K., and Fabiano, F.: Quantifying climate model representation of the wintertime Euro-Atlantic circulation using geopotential-jet regimes, Weather Clim. Dynam., 3, 505–533, https://doi.org/10.5194/wcd-3-505-2022, 2022.

Fabiano, F., Christensen, H.M., Strommen, K. et al. Euro-Atlantic weather Regimes in the PRIMAVERA coupled climate simulations: impact of resolution and mean state biases on model performance. Clim Dyn 54, 5031–5048 (2020). https://doi.org/10.1007/s00382-020-05271-w

Hochman A, Alpert P, Harpaz T, Saaroni H, Messori G. 2019. A new dynamical systems perspective on atmospheric predictability: eastern Mediterranean weather regimes as a case study. 5: eaau0936. https://doi.org/10.1126/sciadv.aau0936

Line 58: Please discuss the number of regimes some more. There are a few articles focusing on this aspect in the literature. Some use two regimes (Wallace and Gutzler, 1981), others use four (Vautard 1990), six (Falkena et al., 2020) or seven (Grams et al., 2017) regimes. This is important as you use an outstanding number of 43.

References

Falkena, S. K., de Wiljes, J., Weisheimer, A., & Shepherd, T. G. (2020). Revisiting the identification of wintertime atmospheric circula-tion regimes in the Euro-Atlantic sector. Quarterly Journal of the Royal Meteorological Society, 146, 2801–2814. https://doi.org/10.1002/qj.3818

Grams, C. M., Beerli, R., Pfenninger, S., Staffell, I., & Wernli, H. (2017). Balancing Europe’s wind-power output through spatial deployment informed by weather regimes. Nature Climate Change, 7, 557–562. https://doi.org/10.1038/nclimate3338

Vautard, R. (1990). Multiple weather regimes over the North Atlantic: Analysis of precursors and successors. Monthly Weather Review, 118,2056–2081. https://doi.org/10.1175/1520-0493(1990)118<2056:MWROTN>2.0.CO;2

Wallace, J. M., & Gutzler, D. S. (1981). Teleconnections in the geopotential height field during the Northern Hemisphere winter. MonthlyWeather Review, 109, 784–812. https://doi.org/10.1175/1520-0493(1981)109<0784:TITGHF>2.0.CO;2

Line 64-66: This is a very strong critic on all prior classifications and should be further explained why none fit your purpose. These classification procedures were all used extensively in the literature. If you state this, you should at least demonstrate how your classification is superior.

Data and methods

Line 80: If you use ERA-interim and not ERA5 reanalysis, you should at least say why, and mention some of the studies comparing the two data sets. I do not expect much difference for large-scale weather regimes, but this should be at least discussed.

Line 82: Please justify why you use 12:00UTC and not daily or all 6-hourly data.

Line 82: How did you coarse grain the data and why to 2×3 degrees?

You often use ‘synoptic scale’, but I think it is more accurate to consider these regimes as large-scale features. I would try being more accurate on this. Perhaps change throughout the text.

Line 95: Why 151 days of smoothing? Please justify this choice.

Results

Lines 436-440: I do not completely understand how you obtained high resolution relative to coarse resolution in figure 9.

Line 454-456: Your motivation was not to use centroids in the introduction and methods section, but then you test your medoids and say that they are very similar to the centroids. Is this not a circular argument?

Section 4.6: Perhaps provide some illustrations of the different classes in the CMIP6 models, in addition the quality indices in the table.

Table 3: I believe that there is not much difference between the models in the ‘transit’ and ‘persist’ values because there are so many classes. In addition, for the other indices the standard deviation is rather low, which is a bit surprising for more than 30 models. They all do pretty much the same job, which is again a bit surprising.

Are the models evaluation criteria significantly different from one another? I think you should test this.

Conclusions

This section is rather very short and should have a bit more discussion with respect to other articles evaluating models using a classification procedure. The article would also benefit from explaining what is better or similar in the new classification with respect to other methodologies used in the literature. The potential use of this methodology in climate projections or extended-range weather forecasts should probably also be discussed.

Technical comments:

Line 82-84: Please rephrase, something is missing here.

Line 307: This should be ‘Results’ and not ‘Method’ section.

Line 318: Change ‘gives us an evidence that’ to ‘provides evidence that’.

Line 357: Change ‘gives an evidence that’ to ‘provides evidence that’.

Figures:

Figure 4: It is very hard to see anything with so many panels.

Figure 10: I think you mixed up between left and right in the caption. In addition, are there significant difference in the right panels?

Table 3: It should probably be DJF for winter in the upper row and not ‘JDF’.
Citation: https://doi.org/10.5194/esd-2022-29-RC2
- AC3: 'Reply on RC2', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC3-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC3
RC3:
'Comment on esd-2022-29', Anonymous Referee #3, 09 Sep 2022

This paper describes a novel method of clustering circulation fields, and then applies this method to assess the ability of CMIP6 models to simulate realistic circulation patterns. The paper is generally clearly written and straightforward to understand, but I feel that the authors have not sufficiently justified the use of their method over something simpler like k-means. The analysis of circulation in the CMIP6 models is also rather brief. I therefore recommend major revisions.

Major comments

The bulk of the paper describes a new two-step classification method, arguing that previously used methods are 'suboptimal'. However, I don't think that the authors have sufficiently motivated their choice of method - my suspicion is that standard k-means clustering would give similar results.

The authors argue that k-means clustering has a number of drawbacks:

i) the number of clusters has to be pre-specified.

(But the authors' similarity threshold parameter seems to play a similar role, as it is subjectively chosen and also influences the number of clusters.)

ii) k-means centroids could be misleading and unrepresentative of the fields in the cluster.

(But does this not also apply to medoids, as a single field chosen to represent a set of fields? Surely any daily field will contain its own set of small scale features that don't resemble those of other fields. The authors appear to find that the cluster centroids and medoids are pretty similar anyway.)

iii) k-means clusters could be sensitive to outliers. (But does this actually happen in the case of the geopotential height fields?)

The authors quote image processing references to justify the similarity metric used here over (say) mean square error. It would be more convincing if the authors could show actual examples of deficiencies in k-means clusters constructed from their circulation data, and/or that clusters produced using their method were superior to those produced using k-means (for example, using the criteria set out in section 3.3).
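To make points ii) and iii) concrete, the following is a minimal toy sketch of how a centroid and a medoid respond to a single outlier. It uses five made-up scalar values and the absolute difference as the distance, both of which are assumptions for illustration only; it does not use the paper's geopotential height data or its SSIM-based similarity.

```python
import numpy as np

# Toy 1-D "fields": four similar values and one outlier (illustrative only).
points = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Centroid: the arithmetic mean, which need not coincide with any member.
centroid = points.mean()

# Medoid: the member with the smallest summed distance to all other members.
# The absolute difference is an assumed stand-in for the paper's SSIM measure.
pairwise = np.abs(points[:, None] - points[None, :])
medoid = points[pairwise.sum(axis=1).argmin()]

print(f"centroid = {centroid}, medoid = {medoid}")
```

The centroid is pulled towards the outlier, whereas the medoid remains one of the typical members. Whether outliers of this kind actually occur in the height fields is the open question raised above.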

2. The analysis of the CMIP6 models is rather limited - there's a ranking of the models according to various metrics, but not much more. Why did the authors choose these particular metrics over the wide variety of other possibilities? Do the HIST statistics correspond to biases in the mean state of the models? Can the authors suggest any reasons why some models are better than others - e.g. resolution?

Also, the transition statistics are likely to be very noisy with 43 different circulation types. How can we be confident that the transition results from ERA-Interim are a meaningful benchmark - is there enough reanalysis data to do this?
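A rough back-of-the-envelope count illustrates the sampling concern. The sketch below assumes one field per day over the full ERA-Interim period 1979-2018 named in the abstract, no seasonal split, and transitions spread uniformly over all type pairs; all three are simplifying assumptions, not properties stated in the paper.

```python
from datetime import date

n_types = 43  # number of circulation types quoted in the comment above

# Assumption: one field per calendar day, 1 Jan 1979 to 31 Dec 2018 inclusive.
n_days = (date(2018, 12, 31) - date(1979, 1, 1)).days + 1
n_transitions = n_days - 1  # day-to-day transitions, self-transitions included

# Number of cells in the full type-to-type transition matrix.
n_cells = n_types ** 2

# Average count per cell if transitions were spread uniformly (they are not).
mean_per_cell = n_transitions / n_cells

print(n_days, n_cells, round(mean_per_cell, 1))
```

Under these assumptions there are fewer than ten transitions per matrix cell on average. Because persistence concentrates counts on the diagonal, many off-diagonal cells would be sampled even more sparsely.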

Again, it would be interesting to know if the results of the model evaluation analysis are significantly different if k-means derived clusters are used instead.

Minor comments

Line 49 - "Hochman et al proved" - I think 'proved' is only an appropriate word when discussing mathematical proofs. I suggest something like 'argued' or 'demonstrated'. Also, people arguing that clusters represent genuine low-frequency weather regimes tend to find relatively few of them (four in winter seems a popular choice). Presumably the authors are not arguing that the 43 types they analyse here each represent a physical weather regime in this sense?

Line 58 - 'the moving atmosphere' - I'm not sure what this means.

Line 90 onwards - standardising the height fields means that information about the amplitude of the circulation anomalies is lost. But different amplitude anomaly patterns could produce quite different responses in e.g. surface air temperature and precipitation, so I'm not sure the standardisation step is beneficial.

line 111 - "The k-means clustering assigns every data element to the cluster center that is closest to it, if only by a small margin." Isn't this true of any method that assigns each field to one of a set of classes?

line 112 - "This makes the method sensitive to noise in the data and may lead to an assignment of a data element to a structurally dissimilar cluster center." - what does "structurally dissimilar" mean here? How can we distinguish the noise from the structure in any given field? Can the authors show examples of fields that are far apart under the Euclidean distance metric but close together under the similarity metric, or vice versa?
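One way such an example could be constructed is sketched below with synthetic data. It builds two perturbations of the same field that have equal mean square error but very different similarity scores. The field, the perturbations, the single-window (global) form of SSIM and the constants `c1` and `c2` are all assumptions made for illustration; the sketch is not the authors' implementation and does not use standardised geopotential height data.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two fields."""
    return float(np.mean((a - b) ** 2))

def global_ssim(a, b, c1=1e-4, c2=9e-4):
    """SSIM evaluated once over the whole field (no sliding window).

    c1 and c2 are small stabilising constants; their values are assumed here.
    """
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = np.mean((a - mu_a) * (b - mu_b))
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return float(num / den)

rng = np.random.default_rng(0)
lon, lat = np.meshgrid(np.linspace(0, 2 * np.pi, 32),
                       np.linspace(0, 2 * np.pi, 32))
field = 5.0 + np.sin(lon) * np.cos(lat)  # synthetic wave pattern, mean near 5

# Perturbation 1: uniform offset, spatial structure unchanged.
shifted = field + 0.5

# Perturbation 2: zero-mean noise rescaled to the same RMS amplitude (0.5).
noise = rng.standard_normal(field.shape)
noise -= noise.mean()
noise *= 0.5 / np.sqrt(np.mean(noise ** 2))
noisy = field + noise

mse_shift, mse_noise = mse(field, shifted), mse(field, noisy)
ssim_shift, ssim_noise = global_ssim(field, shifted), global_ssim(field, noisy)
print(mse_shift, mse_noise, ssim_shift, ssim_noise)
```

Both perturbed fields are equally far from the original in the mean square sense, yet the offset field scores much higher on SSIM than the noisy one. The offset is penalised so lightly only because the field mean is large compared with the shift, which would not hold for standardised anomalies with near-zero mean.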

line 116 - Doesn't using medoids also risk inflating the significance of small-scale noise in the daily field chosen as the medoid?

line 137 - "Wang and Bovik (2009) demonstrated that the MSE has serious disadvantages when applied on data with temporal and spatial dependencies" - dependencies on what? Does this mean temporal and spatial correlations?

line 194 - is the similarity between two clusters measured using their medoid fields?

line 267 - Is the algorithm stable if applied to slightly different initial subsets of the data? The number of patterns may be stable, but do the same patterns emerge from the clustering?

Figure 3 - it would make more sense to have the transition between the blues and reds in the colour bar at zero, not +0.25.

Line 245 - should there be a reference to figure 6 here?

Line 282 - "However, it is necessary to demand that a cluster medoid represents all cluster elements and their whole entity as a group." Does comparing the medoid and centroid really guarantee this?

Line 307 - is section 4 meant to be labelled 'Method', the same as section 3?

Figure 4 - Can the colour bar be included in the figure? There's room in the bottom row of panels.

Line 320 - "This correspondence gives us an evidence that, albeit not tuned to and not required to mimic semi-manual classifications, the new classification method determines not just arbitrary synoptic patterns but those described by experts in semi-manual classifications."

I'm not convinced - given that there are 43 different types, it seems quite likely that some of them could resemble Grosswetterlagen patterns by chance.

Figure 7 - the text in the figure labels could be much larger for legibility.

line 447 - again, I don't think one can infer that this is an inherent advantage of the SSIM method without making a comparison with other cluster methods.

Citation: https://doi.org/10.5194/esd-2022-29-RC3
- AC4: 'Reply on RC3', Kristina Winderlich, 07 Oct 2022
  
  The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC4-supplement.pdf
  
  Citation: https://doi.org/10.5194/esd-2022-29-AC4
AC5: 'Comment on esd-2022-29', Kristina Winderlich, 17 Jan 2023

The comment was uploaded in the form of a supplement: https://esd.copernicus.org/preprints/esd-2022-29/esd-2022-29-AC5-supplement.pdf

Citation: https://doi.org/10.5194/esd-2022-29-AC5

Viewed

Total article views: 1,288 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
841	368	79	1,288	73	93

Views and downloads (calculated since 29 Jul 2022)

Month	HTML	PDF	XML	Total
Jul 2022	57	10	4	71
Aug 2022	164	31	6	201
Sep 2022	67	21	4	92
Oct 2022	55	26	6	87
Nov 2022	33	13	1	47
Dec 2022	22	11	0	33
Jan 2023	33	6	2	41
Feb 2023	26	14	0	40
Mar 2023	20	9	0	29
Apr 2023	11	8	0	19
May 2023	9	7	2	18
Jun 2023	13	3	1	17
Jul 2023	8	22	0	30
Aug 2023	9	12	0	21
Sep 2023	22	16	1	39
Oct 2023	11	10	1	22
Nov 2023	6	3	0	9
Dec 2023	10	10	0	20
Jan 2024	8	5	0	13
Feb 2024	13	11	2	26
Mar 2024	17	6	1	24
Apr 2024	22	6	6	34
May 2024	22	5	6	33
Jun 2024	22	3	2	27
Jul 2024	10	6	10	26
Aug 2024	17	5	8	30
Sep 2024	22	4	3	29
Oct 2024	5	1	0	6
Nov 2024	10	4	3	17
Dec 2024	6	8	0	14
Jan 2025	12	7	1	20
Feb 2025	9	4	3	16
Mar 2025	12	5	0	17
Apr 2025	15	6	1	22
May 2025	20	15	2	37
Jun 2025	18	23	1	42
Jul 2025	5	12	2	19

Viewed (geographical distribution)

Total article views: 1,287 (including HTML, PDF, and XML) Thereof 1,287 with geography defined and 0 with unknown origin.

Cited

Latest update: 18 Jul 2025

This preprint has been withdrawn.

Short summary

This paper presents a new classification method for synoptic circulation patterns and its application to ERA-Interim reanalysis data. The output fields of the CMIP6 models are assigned to the reanalysis-derived classes and a new quality index, built on the statistics between each model and the reference, is introduced to quantify the “quality” of the respective model. CMIP6 models are ranked according to the new quality score.



Cited

1 citation as recorded by Crossref.