Space-time dependence of compound hot-dry events in the United States: assessment using a multi-site multi-variable weather generator

Compound hot and dry events can lead to severe impacts whose severity may depend on their time scale and spatial extent. Although these joint events are potentially important and interest in climate change impacts on compound events is growing, their climatological characteristics have received little attention. Here, we ask how event time scale relates to (1) spatial patterns of compound hot-dry events in the United States, (2) the spatial extent of compound hot-dry events, and (3) the importance of temperature and precipitation as drivers of compound event occurrence. To study such rare spatial and multivariate events, we introduce a multi-site multi-variable weather generator (PRSim.weather), which enables generation of a large number of spatial multivariate hot-dry events. We show that the stochastic model realistically simulates distributional and temporal autocorrelation characteristics of temperature and precipitation at single sites, dependencies between the two variables, spatial correlation patterns, and spatial heat and meteorological drought indicators and their co-occurrence probabilities. The results of our compound event analysis demonstrate that (1) the Northwestern and Southeastern United States are most susceptible to compound hot-dry events independent of time scale, and susceptibility decreases with increasing time scale; (2) the spatial extent and time scale of compound events are strongly related, with sub-seasonal events (1–3 months) showing the largest spatial extents; and (3) the importance of temperature and precipitation as drivers of compound events varies with time scale, with temperature being most important at short time scales and precipitation at seasonal time scales. We conclude that time scale is an important factor to be considered in compound event assessments and suggest that climate change impact assessments should consider several time scales instead of a single one when looking at future changes in compound event characteristics.
The largest future changes may be expected for short compound events because of their strong relation to temperature.

(2020) and Wu et al. (2020) have shown that the area affected by concurrent hot-dry extremes has increased significantly over the past few decades in the US and globally for long, i.e. seasonal, time scales. However, it remains to be investigated how the time scale of compound events influences their characteristics and spatial extent.
This study aims to deepen our understanding of how the time scale of compound hot-dry events in the US relates to (1) spatial patterns of compound event affectedness, (2) spatial extents of compound events, and (3) the role of temperature and precipitation as drivers of compound events by focusing on multivariate and spatially compounding extreme events (Zscheischler et al., 2020). To answer the question of how time scale shapes compound event characteristics, we determine the probability, extent, and drivers of spatial compound heatwaves and meteorological drought over the conterminous US (CONUS) for different time scales ranging from weekly to annual events.
Studying such spatial compound events is challenging because they are rare in observational records (Zscheischler et al., 2018). This challenge can be tackled by developing stochastic simulation approaches to generate large data sets with statistical properties similar to those of the observations (Vogel and Stedinger, 1988). A stochastic approach to simulate spatial compound hot-dry events at different time scales needs to (1) represent spatial dependencies between sites to capture the spatial aspect, (2) represent dependencies between variables to capture dependencies between precipitation and temperature, and (3) be continuous to enable studying time scales from weeks to years. However, existing models often fulfill only one or two of these three requirements. On the one hand, existing spatial models for simulating spatial extreme events, such as the conditional exceedance model by Heffernan and Tawn (2004), are event-based (Keef et al., 2013; Diederen et al., 2019) and often applied to one variable, e.g. flood peaks. On the other hand, continuous stochastic approaches such as autoregressive moving average type models (Stedinger and Taylor, 1982) or bootstrap approaches (Rajagopalan et al., 2010) do not represent spatial dependencies well. Therefore, Brunner and Gilleland (2020) recently proposed a novel stochastic approach for simulating continuous streamflow [...] et al., 2019). It is able to simulate continuous, spatially consistent time series but has so far only been applied to one variable (streamflow). Here, we extend PRSim.wave to multiple variables by proposing a multi-site multi-variable stochastic weather generator (PRSim.weather) that simulates long, spatially consistent time series of temperature (T) and precipitation (P).
This multi-site multi-variable stochastic model reproduces local variable distributions using flexible distributions for T and P and introduces spatio-temporal and variable dependence using the wavelet transform (Torrence and Compo, 1998) and phase randomization (Schreiber and Schmitz, 2000; Lancaster et al., 2018). Using this multi-site multi-variable generator to simulate a large set of spatial compound hot-dry events will help to shed light on the question of how time scale shapes compound event characteristics including spatial extent. Thus, this analysis will provide crucial information to increase preparedness and develop adaptation measures to potentially impactful spatial compound events.

Methods and Materials
We develop a multi-variable multi-site weather generator that stochastically simulates spatially consistent T and P time series for a large number of locations. We apply this model to a gridded T and P data set in the CONUS to generate a large sample of spatial compound hot-dry events. We subsequently use this sample to determine which regions in the US are susceptible to compound events and large spatial compound event extents at different time scales. Last, we look at how the importance of T and P for compound event development varies with time scale.

The analysis is performed using a gridded data set of T and P time series for 894 equally-spaced grid cells in the CONUS. T and P data were obtained from the ERA5-Land reanalysis for the period 1981-2018 (ECMWF, 2019). ERA5-Land relies on atmospheric forcing from the ERA5 reanalysis (Hersbach et al., 2020) and provides variables at a spatial resolution of 9 km for the period 1981 to present. We chose a subset of regularly-spaced grid cells by sampling 1500 grid cells over the extent of the CONUS which resulted in 894 grid cells over land that are used for this analysis.

Stochastic multi-site multi-variable modeling
To study spatial compound hot-dry events, we develop a multi-site multi-variable weather generator PRSim.weather that enables simulation of large sets of spatially consistent compound hot-dry events. PRSim.weather combines an empirical spatiotemporal model based on the wavelet transform and phase randomization with two flexible distributions for T and P. It builds on the spatial stochastic model PRSim.wave (Phase Randomization Simulation using wavelets) proposed by Brunner and Gilleland (2020), which simulates continuous streamflow time series at multiple sites. We expand the functionality of PRSim.wave to simulate multiple variables, i.e. T and P, at multiple sites. The weather generation procedure implemented in PRSim.weather consists of five main steps (Figure 1):
1. feed in observed daily T and P time series for multiple sites (here grid cells);
2. fit monthly distributions to T and P time series at each site to capture seasonal variations in distribution parameters. For T, we use the flexible skewed exponential power (SEP) distribution with 4 parameters (Fernández and Steel, 1998), which generalizes the Gaussian distribution, can reproduce different skewness and kurtosis, and has been previously applied for multi-site temperature simulation (Evin et al., 2019). The parameters of the SEP distribution are estimated using L-moments (R-package lmomco; Asquith, 2020). For P, we use an extended Generalized Pareto distribution (E-GPD; Papastathopoulos and Tawn, 2013) with three parameters to model positive precipitation values. The E-GPD jointly models non-extreme and extreme values of P while bypassing the threshold selection problem as it enables smooth transitioning between a gamma-like distribution and a heavy-tailed Generalized Pareto distribution (GPD; Naveau et al., 2016). The E-GPD has been demonstrated to be valuable in multi-site precipitation modeling thanks to its flexibility (Evin et al., 2018). We combine the E-GPD with as many zero-values as in the observations to obtain the full P distribution;
3. transform the T and P time series from the time to the frequency domain by decomposing the series into an amplitude and phase signal using a continuous wavelet transform with the Morlet wavelet (Torrence and Compo, 1998);
4. generate a random time series using bootstrap resampling on the time series of one randomly sampled site by sampling years with replacement. Use the wavelet transform to also decompose this bootstrapped series in order to obtain a random phase signal;
5. generate stochastic time series for T and P by applying the inverse wavelet transform to the observed amplitude signals and the randomly generated phases. Rank-transform the newly generated time series to the desired distribution for each month using the monthly distribution parameters derived in Step 2 (SEP parameters for T and E-GPD parameters for P).
The spatial and variable dependencies are introduced in Step 5 by using the same random phases in the wavelet transform at all sites and for both variables.
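The role of the shared random phases in Steps 4 and 5 can be illustrated with a minimal Python sketch. It is not the PRSim.weather implementation: as simplifying assumptions, it uses a Fourier transform in place of the continuous Morlet wavelet transform, treats one variable only, and rank-transforms onto each site's empirical distribution instead of fitted monthly SEP or E-GPD distributions. The function name `simulate_shared_phase` and its arguments are hypothetical.

```python
import numpy as np

def simulate_shared_phase(obs, days_per_year, rng):
    """Simulate one realization for all sites using a single random phase signal.

    obs: array of shape (n_sites, n_days); n_days must be a multiple of days_per_year.
    Simplification: Fourier transform instead of the wavelet transform.
    """
    obs = np.asarray(obs, dtype=float)
    n_sites, n_days = obs.shape
    n_years = n_days // days_per_year
    assert n_years * days_per_year == n_days, "series must contain whole years"

    # Step 4 (sketch): bootstrap years with replacement from one random site
    site = rng.integers(n_sites)
    years = obs[site].reshape(n_years, days_per_year)
    boot = years[rng.integers(n_years, size=n_years)].ravel()
    random_phase = np.angle(np.fft.rfft(boot))

    sim = np.empty_like(obs)
    for s in range(n_sites):
        # Step 5 (sketch): observed amplitudes combined with the shared random phases
        amplitude = np.abs(np.fft.rfft(obs[s]))
        raw = np.fft.irfft(amplitude * np.exp(1j * random_phase), n=n_days)
        # Rank-transform onto the observed (empirical) distribution of this site
        ranks = np.argsort(np.argsort(raw))
        sim[s] = np.sort(obs[s])[ranks]
    return sim
```

Because every site receives the same phase signal, the simulated series rise and fall together, which is the mechanism that carries the spatial dependence in this sketch; the rank transform guarantees that each site keeps its observed value distribution.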
The stochastic multi-site multi-variable model is evaluated with respect to the following characteristics: (1) T and P distributions (cdfs) at individual sites, (2) temporal autocorrelation of T and P (acfs) at individual sites, (3) spatial dependencies across sites for T and P (variograms), (4) T-P variable dependencies (scatter plots), and (5) simulated spatial patterns of the standardized temperature index (STI), the standardized precipitation index (SPI), and the probability of concurrent high STI and low SPI anomalies at a 1-month aggregation level for moderate, severe and extreme events according to the empirical copula (see Section 2.2.2).
PRSim.weather is finally run n = 100 times for the 894 grid cells in the US in order to substantially increase the sample size available for the assessment of spatial compound hot-dry events (38 years * 100 = 3800 years, given the 1981-2018 record).

Figure 1. Weather generation procedure implemented in PRSim.weather: (1) feed in observed daily T and P time series for multiple sites; (2) fit monthly SEP distribution to T and E-GPD to P time series of all sites; (3) decompose T and P time series of all sites into an amplitude (A) and phase (P) signal using the wavelet transform; (4) generate one random time series using bootstrap resampling and decompose that random series into an amplitude and phase signal too; and (5) generate random T and P time series by combining the observed amplitude signal of each site and variable with the randomly generated phase signal and by backtransforming the signals to the time domain using the inverse wavelet transform. Rank-transform the newly generated signal to the desired distribution using the parameter estimates from Step 2.

Compound event analysis
While the focus is on the simulated series, compound events and their corresponding T and P characteristics are identified at different time scales in both the observed and stochastically simulated time series to assess the reliability of the stochastic model. To look at different time scales, we first convert the T and P series to weekly/monthly series using mean values for T and sums for P. We work with aggregation levels of 1 week to represent 'flash' compound events and of 1, 3, 6, and 12 months to represent sub-seasonal, seasonal, and annual time scales. In a second step, we transform the aggregated T and P series to series of standardized indices, which we will use to study relationships between the marginal behavior of compound events because they guarantee variable and site comparability. Standardized precipitation index (SPI) series (McKee et al., 1993) for each location are computed by transforming the P values to a standardized normal distribution (mean = 0 and sd = 1) using a site-specific Gamma distribution (a Kolmogorov-Smirnov test did not reject the Gamma distribution in over 80% of the grid cells). Similarly, we compute standardized temperature index series (STI; Zscheischler et al., 2014) using the SEP distribution for transformation. Last, compound hot-dry events are identified for each time scale and grid cell using a bivariate empirical copula (Deheuvels, 1979; Genest and Favre, 2007), which describes the bivariate distribution of T (STI) and P (-SPI). We change the sign of the SPI values to convert negative to positive anomalies as we are interested in events where both STI and -SPI are jointly exceeded. The empirical copula is described as:

$$C_n(u, v) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\left(\frac{R_i}{n+1} \le u, \frac{S_i}{n+1} \le v\right)$$
where $R_i$ and $S_i$ represent pairs of ranks, $n$ the sample size, $\mathbf{1}(\cdot)$ the indicator function, and $C_n(u, v)$ the rank-based estimator of the copula $C(u, v)$. An example of how the empirical copula (purple) is related to the margins STI (yellow) and -SPI (blue) is provided in Figure 2. Using the time series of empirical bivariate distribution values, we identify moderate, severe, and extreme droughts using three thresholds at 0.8, 0.9, and 0.95, respectively (see Figure 2 for an example with a threshold of 0.9).
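As an illustration of the event identification, the following Python sketch evaluates a rank-based empirical copula at every observed (STI, -SPI) pair and flags time steps above a threshold. It assumes the standard estimator with scaled ranks R_i/(n+1) and S_i/(n+1), no ties in the data, and a strict exceedance of the threshold; the function names are hypothetical and not taken from the authors' code.

```python
import numpy as np

def ranks(x):
    """Ranks 1..n of a 1-D array (assumes no ties, as for continuous data)."""
    return np.argsort(np.argsort(x)) + 1

def empirical_copula(sti, neg_spi):
    """Empirical copula value C_n(u_i, v_i) at each observed pair i.

    Since u_i = R_i/(n+1) and v_i = S_i/(n+1), the condition
    R_j/(n+1) <= u_i reduces to R_j <= R_i (same for S).
    """
    r = ranks(np.asarray(sti, dtype=float))
    s = ranks(np.asarray(neg_spi, dtype=float))
    # jointly_below[i, j] is True if pair j lies below pair i in both margins
    jointly_below = (r[None, :] <= r[:, None]) & (s[None, :] <= s[:, None])
    return jointly_below.mean(axis=1)

def compound_events(sti, spi, threshold=0.9):
    """Boolean series marking compound hot-dry time steps (assumed rule: C_n > threshold)."""
    return empirical_copula(sti, -np.asarray(spi, dtype=float)) > threshold
```

With thresholds of 0.8, 0.9, and 0.95, the same function would return the moderate, severe, and extreme event series for one grid cell and one time scale.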
To assess the spatial extent of compound events at different time scales, we determine the percentage of grid cells affected by each of the compound events identified at individual grid cells. Then, for each grid cell, we determine the median spatial extent of those events it is affected by.
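The two-stage extent calculation can be sketched in Python as follows. The sketch assumes that events are stored as a boolean matrix with time steps in rows and grid cells in columns, and that the extent of an event at a grid cell is the percentage of cells flagged in the same time step; both the data layout and the function name `spatial_extent` are illustrative assumptions.

```python
import numpy as np

def spatial_extent(events):
    """Return (extent per time step in %, median extent per grid cell in %).

    events: boolean array of shape (n_time, n_cells), True where a compound
    event was identified. Cells that are never affected get NaN.
    """
    events = np.asarray(events, dtype=bool)
    # Percentage of grid cells affected in each time step
    extent = 100.0 * events.mean(axis=1)
    median_extent = np.full(events.shape[1], np.nan)
    for cell in range(events.shape[1]):
        affected = events[:, cell]
        if affected.any():
            # Median extent over the events this cell takes part in
            median_extent[cell] = np.median(extent[affected])
    return extent, median_extent
```

Mapping `median_extent` back onto the grid would give a map of the kind described for the median spatial extent per grid cell.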
To explain the role of the individual variables T and P in compound event occurrence, we compute Kendall's correlation between the median bivariate distribution (empirical copula) and the median standardized indices STI and SPI at different time scales. This correlation analysis is performed for nine hydro-climatic regions in the United States (Bukovsky, 2011) to quantify the regional spread in the role of STI and SPI for compound event development. We look at correlations for different time scales and event extremeness levels to assess to which degree these two factors influence STI and SPI importance.
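For reference, Kendall's correlation can be computed from concordant and discordant pairs as in the short Python sketch below. It implements the tau-a variant, which assumes no ties; how the medians are paired across grid cells or regions is not specified here, so the inputs `x` and `y` are placeholders (for example, median copula values and median STI values of the grid cells in one region).

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs (assumes no ties)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    i, j = np.triu_indices(n, k=1)  # all unordered pairs (i < j)
    # +1 for concordant pairs, -1 for discordant pairs
    concordance = np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return concordance.sum() / (n * (n - 1) / 2)
```

A value close to 1 would indicate that the compound event measure increases almost monotonically with the index, while a value near 0 would indicate that the index contributes little to the ordering of compound events.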

Evaluating the weather generator
The multi-site multi-variable stochastic simulation approach PRSim.weather reproduces the observed local T and P distributions well, as indicated by the good match of simulated with observed densities (Figure 3a,b). The temporal autocorrelation in both variables is realistically reproduced, as shown by the good agreement of simulated with observed autocorrelation functions, thanks to the observed frequency spectrum information used in the inverse wavelet transform (Figure 3c,d). The simulated time series mimic the main temporal characteristics of the observed time series well, including seasonality and temporal event distribution/clustering, as illustrated by three years of observed and simulated T and P data (Figure 3e,f). The T-P variable dependence is also generally well captured thanks to the use of the same random phases for both variables when applying the inverse wavelet transform (Figure 3g,h). However, the number of high T-low P events at a daily scale is slightly underestimated. In addition to these local characteristics, spatial correlations are captured, as illustrated by the similarity of observed and simulated variograms (Figure 4). However, the spatial correlation of T is slightly overestimated by the simulations. Achieving a 'perfect' joint representation of the three forms of dependence (temporal, spatial, and variable) is very challenging. The model is considered suitable for the analysis of compound hot-dry events because it has an acceptable performance with respect to all three aspects.

PRSim.weather enables simulation of a large sample of extreme events in terms of standardized temperature (STI) and precipitation indices (SPI), which reflect the spatial STI and SPI patterns detected in the observations for different levels of extremeness (Figure 5). While the simulated spatial STI and SPI patterns look similar to the observed ones, they are more pronounced because of the larger sample size available. The spatial pattern for STI is rather weak, with STI values being relatively homogeneously distributed except for the Pacific Northwest and along the West Coast, where STI values are slightly higher than in the rest of the country. In contrast, the spatial pattern of median SPIs is pronounced, with substantially higher negative anomalies in the western than the eastern US and particularly strong negative anomalies in the Southwest. The spatial STI and SPI patterns are reflected in the spatial distribution of the probability of concurrent hot-dry events, which is also realistically represented by PRSim.weather (Figure 6). The highest probabilities of concurrent hot-dry events at a monthly time scale are found in the Pacific Northwest, along the West Coast, in the Rocky Mountains, and the Southeast, in particular in Texas. In contrast, concurrent hot-dry events are relatively rare in the Great Plains, the Midwest, and Florida. For the remainder of our analysis, we focus on the stochastic simulations because of their large sample size, which allows us to study rare spatial compound hot-dry events.

Concurrent hot-dry events
The stochastically simulated compound hot-dry events reveal that the probability of co-occurring hot and dry periods is highest in the northwestern and southeastern US independently of the time scale considered (Figure 7). However, the probability of concurrent events decreases with increasing time scale, as can be expected due to the increasing aggregation of multiple weather  events (Figure 9). While ∼20% of the CONUS may be jointly affected by moderate and short compound events, spatial extents of compound events become small to non-existent for extreme and long-lasting (i.e. annual) compound events.
Figure 8. Spatial patterns of median spatial compound event extent per grid cell for different time scales and extremeness levels over nine hydro-climatic regions. The darker the color, the higher the median spatial extent of compound events a grid cell is affected by.

The importance of STI and SPI as drivers of compound event occurrence varies by time scale and level of extremeness (Figure 10). T is a particularly important driver at short time scales (Figure 10a). The importance of P as a driver of compound event occurrence increases with time scale up to event durations of 6 months but decreases with level of extremeness (Figure 10b). In summary, the longer the time scale, the more important P becomes as a driver compared to T (up to a seasonal time scale [...]

The multi-site multi-variable stochastic model PRSim.weather proposed for the joint simulation of T and P at multiple sites has been shown to be suitable for the simulation of spatial compound hot-dry events. It reproduces the distributional and temporal autocorrelation characteristics of T and P at single sites, the dependence between the two variables, the spatial correlation of [...] of spatial concurrent pluvial, river, and coastal flooding by jointly modeling precipitation, discharge, and water levels or the joint simulation of wildfire drivers such as wind speed, temperature, and humidity. Note that while the model will be able to retain the statistical dependencies between variables to some degree, individual simulated events may not necessarily be physically consistent if many variables are jointly simulated. If physical consistency is a requirement for a specific application, stochastic approaches may be combined with physical approaches as, e.g., in the weather generator AWE-GEN-2 by Peleg et al. (2017).
The finding that the western and southeastern US are most likely to be affected by compound hot-dry events suggests that the likelihood of compound event occurrence is related to precipitation seasonality, with regions receiving most of their precipitation in winter or spring and comparably less in summer and fall (Finkelstein and Truppi, 1991) being most likely to be affected by compound events. In 'normal' years, both the western and southeastern US receive a large part of their precipitation through recurrent patterns such as atmospheric rivers (Rutz et al., 2015) or tropical cyclones (Kunkel et al., 2012), respectively.
Anomalies can arise because of temporal shifts or a weakening of these patterns in specific seasons/years. In addition, the regions most likely to experience compound events are the regions found to be most susceptible to heatwaves in the US (Smith et al., 2013).
Our finding that spatial extents of compound events are largest for moderate events at subseasonal time scales implies that while these moderate events may have less severe impacts at a local scale, they may still be highly relevant at a regional scale. Compound events with large spatial extents represent a particular management challenge because they may preclude the transfer of resources and emergency supplies from one region to another. Consequently, the societal impacts of large-scale compound events can be amplified, since many coping strategies are predicated on some degree of resource transfers from less severely affected adjacent regions.

The finding that temperature is a comparably more important driver for short compound events only while precipitation is comparably more important at seasonal time scales corroborates the findings of previous studies about the importance of different hydro-meteorological drivers at different time scales. Zhang et al. (2020) have shown that temperature is the most important hydro-meteorological driver of short term concurrent hot-dry extremes, which aligns with our findings. In addition, which is in line with our finding that temperature is an important driver of extreme compound hot-dry events at seasonal to annual time scales.
Future changes in the frequency and severity of compound hot-dry events are expected because of both changes in temperature and precipitation and their interdependence. The importance of temperature as a driver of short and extreme compound hot-dry events suggests that the increasing temperatures associated with climate change may induce future changes in the frequency and magnitude of short and extreme compound events. Such future increases have been projected globally and regionally, e.g. for China (Zhou and Liu, 2018). In addition, previous studies have shown that the number and intensity of compound hot-dry events may increase because temperature and precipitation may become increasingly coupled/correlated in summer (De Luca et al., 2020; Zscheischler and Seneviratne, 2017), possibly as a consequence of an intensification of land-atmosphere feedbacks (Seneviratne et al., 2010). As the number of compound events increases locally, the area exposed to compound hot-dry events is projected to increase with global warming (Vogel et al., 2019), continuing a trend that has already been observed during the past few decades (Alizadeh et al., 2020). How exactly future changes in compound event extents relate to changes in drought spatial extent (Brunner et al., 2020) and in heatwave spatial extent remains to be investigated.

We introduce the multi-variable multi-site stochastic model PRSim.weather to simulate continuous and spatially consistent multivariate time series. The model is shown to realistically simulate distributional and temporal autocorrelation characteristics of temperature and precipitation at single sites, dependencies between the two variables up to moderate extremes, spatial correlation patterns, and spatial heat and drought indicators and their co-occurrence probabilities for a gridded large-sample data set in the United States. However, future work is needed to improve the representation of very extreme hot-dry events. We apply the stochastic model to generate a large set of spatial and multivariate hot-dry events and use these simulated compound events to assess how event time scale and extremeness influence the spatial affectedness by compound hot-dry events over the United States, the spatial extent of compound events, and their main drivers temperature and precipitation. Our results show that (1) the Northwest and Southeast are most likely to be affected by compound hot-dry events independent of time scale; (2) the spatial extent of compound hot-dry events decreases with increasing event extremeness and time scale, i.e., the events with the largest spatial extents are typically short and only moderately extreme; and (3) temperature is an important driver of short compound events while precipitation is an important driver at seasonal time scales, particularly for the moderately extreme events. These findings highlight that occurrences of compound events are strongly influenced by the time scales at which they are defined. Research to quantify current compound event risk and to project it into the future will need to take time scale into consideration, especially as it also influences the sensitivity to different climate drivers and their potential future changes.

Considering space-time scales in compound event assessments will allow us to make nuanced statements about which types of compound events may be changing because of increasing temperatures in a warming world. For example, short compound