Exploring the relationship between temperature forecast errors and Earth system variables

,

represent more details of physical processes, and they benefit from assimilating comprehensive Earth observation data as well as increasing computing power. However, with ever-growing model complexity, it becomes increasingly difficult to pinpoint 5 weaknesses in the forecast models' process representations which is key to improving forecast accuracy. In this study, we use a comprehensive set of observation-based ecological, hydrological and meteorological variables to study their potential for explaining temperature forecast errors at the weekly time scale. For this purpose, we compute Spearman correlations between each considered variable and the forecast error obtained from the ECMWF subseasonal-to-seasonal (S2S) reforecasts at lead times of 1-6 weeks. This is done across the globe for the time period 2001-2017. The results show that temperature forecast 10 errors globally are most strongly related with climate-related variables such as surface solar radiation and precipitation, which highlights the model's difficulties in accurately capturing the evolution of the climate-related variables during the forecasting period. At the same time, we find particular regions in which other variables are more strongly related to forecast errors. For instance, in central Europe, eastern North America and southeastern Asia, vegetation greenness and soil moisture are relevant, while in western South America and central North America, circulation-related variables such as surface pressure relate more 15 strongly with forecast errors. Overall, the identified relationships between forecast errors and independent Earth observations reveal promising variables on which future forecasting system development could focus by specifically considering related process representations and data assimilation.

Introduction
Forecasts at the subseasonal-to-seasonal time scale (S2S) have received growing attention in recent years (Kirtman et al.,20 2014; Vitart et al., 2017;White et al., 2017). Accurate predictions at the S2S are helpful, for instance, to anticipate extreme 1 events like droughts and floods up to two weeks ahead (Bauer et al., 2015;Mariotti et al., 2018;Vitart and Robertson, 2018;Pegion et al., 2019;Pendergrass et al., 2020) and to optimize resource management (Robertson et al., 2015;White et al., 2017). The S2S bridges the gap between weather forecasts for the coming two weeks and seasonal climate predictions (Vitart, 2014;Robertson et al., 2015). However, it is particularly challenging to predict because the lead time is too long for the initial 25 atmospheric conditions to provide useful information, and too short for some slowly varying Earth system components, such as the ocean, to effectively inform the forecasts (Vitart, 2014;Mariotti et al., 2018).
The chaotic nonlinear interactions between Earth system components and the uncertainties in the initial boundary conditions play a role in forecast errors (Newman et al., 2003). Forecasting systems need to capture all possible Earth system sources of subseasonal predictability to minimize these errors (Vitart and Robertson, 2018;de Andrade et al., 2019). The assimila-30 tion of climate variables, such as radiation and precipitation, and hence the representation of energy and water supply at the Earth's surface, is relevant to inform the weather forecast model to make accurate predictions of the evolution of temperature (Miller et al., 2021). This requires a global and dense observational network of respective measurements and/or regular satellite observations as a basis for an adequate data assimilation.
Related with climate variables, recent studies have identified potential circulation sources of S2S predictability such as 35 Madden Julian Oscillation (MJO), El Niño Southern Oscillation (ENSO) and North Atlantic Oscillation (NAO). Atmospheric circulation patterns influence synoptic weather regimes and high/low pressure systems, which in turn modulate large-scale weather several weeks ahead (Büeler et al., 2021;Falkena et al., 2022).
Also, the land surface initial conditions (e.g. soil moisture, vegetation states, snow cover) contribute to subseasonal predictability through the memory, i.e. persistence in time, of related quantities, particularly in the case of soil moisture (Koster 40 et al., 2010b, a;Saha et al., 2014;Mariotti et al., 2020;Kim et al., 2021;Meehl et al., 2021). Soil moisture can affect the partitioning of incoming radiation energy to sensible and latent heat fluxes, and therefore the near-surface temperature and humidity (Seneviratne et al., 2010). Soil moisture anomalies that are present and known at forecast initialization can persist for several weeks (Orth and Seneviratne, 2012;Shin et al., 2020), therefore an accurate representation of the soil moisture effect on surface heat fluxes supports weather forecast skill, particularly in water-limited regions (Miralles et al., 2012;Dirmeyer and 45 Halder, 2017).
In contrast to soil moisture, the potential of vegetation phenology in improving weather forecast skills is not yet fully exploited, and in most forecasting systems, only the seasonal cycle is prescribed (Boussetta et al., 2013;Balsamo et al., 2018).
Vegetation phenology (in terms of e.g. greenness and photosynthesis) affects surface heat fluxes through evapotranspiration, and exhibits memory. Some studies have outlined the potential of taking into account initial vegetation anomalies in the fore-50 casting systems (Boussetta et al., 2015;Koster and Walker, 2015;Nogueira et al., 2020). They reveal the potential of the vegetation state in predicting hydrometeorological variables with high accuracy, especially during extreme events such as droughts and heat waves, permitting mitigation of their impacts (Meng et al., 2014;Albergel et al., 2019;Miralles et al., 2019).
Overall, these potential sources of forecast skill at the subseasonal time scale can vary in strength across space and time, and they are utilized to different degrees in current forecasting systems. Therefore, it is not obvious which is the most promising 55 source of forecast skill to further exploit in forecasting system development. To provide guidance in this context, we study the temporal co-variability of errors of near-surface temperature forecasts with a comprehensive dataset of observation-based Earth system variables representing potentially under-exploited sources of forecast skill at the subseasonal time scale. With this setup, our goal is to reveal where (across regions) and when (across seasons) climate, circulation and land surface variables can explain forecast error variability.

Data and methods
Our methodology is summarized in Fig. 1, and described in more detail in the following subsections. Figure 1. Schematic summary of the workflow. Within each grid cell, we calculate weekly averages of temperature forecast errors (lead times 1-6 weeks) and then compute their Spearman correlations with Earth system variables averaged across the week prior to the forecast start.

Data
Our analysis focuses on S2S reforecasts of 2m air temperatures from the European Centre for Medium-range Weather Forecasts (ECMWF) (Vitart et al., 2017), hereafter referred to as forecast. We use the average of 51 global ensemble members that 65 differ with respect to i) initial conditions (reflecting uncertainties in the data assimilation), and ii) model parameterizations (reflecting modeling uncertainties). We consider the global land area and the period 2001-2017. Each forecast is run until a lead time of 46 days. We use forecasts generated in 2020 with two model versions: CY46R1 (all forecasts from 4 February to 29 June of each year) and CY47R1 (all forecasts from 30 June to 3 February of each year) (accessed on 3 February 2021 from https://apps.ecmwf.int/datasets/data/s2s/). There are no differences between the two model versions which have the potential 70 to affect our results. The forecasts are forced with atmospheric data from the ERA5 reanalysis (Hersbach et al., 2020) and therefore benefit from a consistent model initialization.
In order to estimate errors in the temperature forecasts we compare them with respective data from the Climate Prediction Center (CPC) (accessed on 21 December 2021 from https://psl.noaa.gov/data/gridded/data.cpc.globaltemp.html). These data are model-independent and derived through interpolation of station observations. We calculate daily mean surface air temperatures as the average of each day's maximum and minimum temperatures, as recommended on the CPC website.

80
To assess the relationships of temperature forecast errors with multiple Earth system variables, we use the comprehensive set of data described in Table 1. A total of 21 variables are selected based on (i) literature about sources of predictability in seasonal forecasts and (ii) physical processes and quantities which can affect surface air temperature. According to the Earth system component they are associated with, we classify these variables into three groups: climate, circulation, and land surface. All datasets cover the entire globe and are linearly aggregated to a spatial resolution of 0.5°x 0.5°in case this is not their native resolution, following a 2-dimensional Piecewise linear interpolation method (Barber et al., 1996). In addition to considering absolute values of all variables listed in Table 1, we also consider their anomalies in our analysis of relationships with temperature forecast errors, except for the circulation indices ENSO, MJO, NAO, AAO, AO and PNA which already come as anomalies. To compute anomalies we (i) subtract the long-term trend from each variable's time series, which we infer 90 through a Lowess smoothing filter fit, and (ii) we remove the mean seasonal cycle, calculated at weekly time-steps. Therefore, our final set of Earth system variables contains 36 variables (15 variables with both absolute values and anomalies, plus 6 circulation indices that are already anomalies).
We furthermore filter the vegetation-related Earth system variables in the land surface group (enhanced vegetation index (EVI), albedo, evaporative fraction) to include only sufficiently active vegetation (and hence evapotranspiration) that may 95 physically affect temperatures (Seneviratne et al., 2010;Miralles et al., 2012). For this purpose, we do not consider grid cells nor weeks with EVI lower than 0.3 or temperatures below 10°C.

Forecast error
We compute the weekly temperature forecast errors using an unbiased difference metric (Yu et al., 2006;McDonnell et al., 2018) between ECMWF-S2S temperature forecasts and CPC observational reference values, as shown in Eq. (1). Thereby, at 100 each grid cell, (i) we compute the seasonal cycle as the climatological temperature of each week of the year, based on the 2001-2017 period, for the frorecast and reference, respectively; (ii) we subtract the respective seasonal cycles from the temperature forecasts and the reference temperatures; and finally (iii) we compute the difference between the de-seasonalized forecast and reference temperatures. The contributing stations of the reference temperature dataset are not uniformly distributed across the globe (Fig. A1). In our analysis, we omit grid cells without nearby stations as they can potentially exhibit larger interpolation errors. This way, we 110 only consider grid cells with at least one temperature station located within the grid cell or in one of its eight neighboring grid cells. We assess and analyze the temperature forecast errors separately for each season, December-January-February (DJF), March-April-May (MAM), June-July-August (JJA) and September-October-November (SON), such that for each grid cell and season we consider weekly data across 3 months per season and 17 years, resulting in approximately 220 weeks per season.

115
Our hypothesis is that a high temporal correlation between temperature forecast errors and any of the Earth system variables ( Table 1) would indicate that the Earth system process represented by the respective variable is not yet sufficiently exploited by the forecasting system. We use the Spearman rank correlation coefficient (Wilks, 2011) to measure the strength of the relationship between forecast errors and each considered Earth system variable. This metric is chosen to account for potentially non-linear relationships. It is calculated with Earth system variables averaged over the week prior to the forecast initialization 120 and temperature forecast errors at lead times between 1 to 6 weeks, respectively. This way, we calculate correlation values for each of the 36 considered variables and for each weekly lead time, grid cell and season. Finally we determine the highest correlation among them (in absolute terms), which indicates the most relevant Earth system variable and consequently the most promising information source to further improve temperature forecasts. Our interpretation focuses on the relevance of variables at the level of the aforementioned groups (climate, circulation, land surface).

125
6 We only consider correlations which are significant at the 5% level. To infer significance in the light of the multiple testing at global scale (36 correlation values for the 36 Earth system variables across the globe), we perform the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995) to ensure control of the false discovery rate (Farcomeni, 2008;Cortés et al., 2020) by applying a correspondingly lower significance level. Additionaly, we randomly permute the time series of each Earth system variable and compute their individual Spearman correlation values with forecast errors to compare their magnitude with our 130 results.
Even though we analyze all weekly lead times (1 to 6), we focus on the lead time week 3 throughout most of the results section because this lead time represents well the subseasonal time scale (Brunet et al., 2010;Vitart et al., 2017;Pegion et al., 2019), which is less understood than the weather and climate scales (White et al., 2017). We find that the results after the lead time week 3 do not vary substantially. 135 We note a potential caveat in our approach due to existing relationships among our selected set of Earth system variables.
Accordingly, we compute individual Spearman correlations among the 36 variables to account for the magnitude of these associations and to identify the most affected variables.

Case study: forecast errors in focus regions
We select six focus regions across all continents to further analyze the relationship between the temperature forecast errors and 140 the respective most relevant Earth system variable. The regions are named according to the most relevant variables: specific humidity (sq), evaporative fraction (ef), total precipitation (tp), surface soil moisture (sm surf), meridional surface pressure differences (meridional sp) and ENSO (enso). We choose the regions to reflect clusters of grid cells with the same determined most important variable, and to represent different continents and Earth system variable groups. Table 2   Finally, we compute a measure of the potential to improve temperature forecasts in week 3 based on percentiles of (i) forecast errors and (ii) Spearman correlations of these errors with the most relevant Earth system variable identified in Sect. 3.2.

150
Regions that exceed the 90th percentile for both quantities (forecast error and correlation values) have relatively large errors which at the same time can be attributed to an Earth system variable such that we determine this as the highest potential for improvement. For medium and low potential the grid cell values need to exceed the 70th and 50th percentiles of all global grid cells, respectively.
3 Results and discussion 155

Forecast error variability
At first we analyze the temperature forecast errors as reflected by the unbiased differences between the forecasted temperature and the reference data (Fig. 2). The largest mean seasonal errors occur during DJF with negative or positive biases depending on the region. We suggest that the biases in predicting temperatures are a consequence of imperfect model physics, initial conditions and boundary conditions (Durai and Bhradwaj, 2014). 160 We find large errors in Asia around the Himalayas and in western North America. Dutra et al. (2021) found similar patterns in temperature forecast error patterns for the Northern hemisphere, which they linked with the imperfect model representations of snow and soil freezing. In the tropics, the biases are smaller than in the extratropics, probably because the predictability in tropical areas is provided mainly by slowly-varying forcings such as ENSO, while extratropic predictability is mostly derived from synoptic-scale atmospheric dynamics (Jiang et al., 2017;de Andrade et al., 2019;Judt, 2020). Moreover, forecast errors 165 in the extratropics might be larger than in the tropics due to a higher temperature variability (Tamarin-Brodsky et al., 2020).
The largest biases occur mainly around grid cells with low observational station density (as seen in Balsamo et al. (2018) and mountainous regions, especially in DJF and MAM (Fig. A2). We attribute this to three possible reasons: (i) difficulties in representing hydrometeorological processes such as freezing and thawing in those regions (Hagedorn et al., 2008), (ii) the misrepresentation of mountain topography in the land surface model (Bento et al., 2017), and (iii) low observational station 170 density implies greater uncertainty in the interpolation of the reference temperature dataset.
9 3.2 Relationships between temperature forecast errors and Earth system variables Focusing on the Earth system variables with the strongest correlation with temperature forecast errors within a grid cell, we find that climate variables are overall most relevant at the global scale (Fig. 3). Variables in the climate group are also the most studied variables in terms of improving temperature forecast skill. For example, an inaccurate SST representation in climate 175 models is associated with ENSO biases, which in turn affect temperature and precipitation on land (Kim et al., 2017;Ehsan et al., 2021;Liu et al., 2021). The most relevant individual variables for this group are precipitation and surface solar radiation (Table 3), which control the land-atmosphere interactions and specifically surface temperature through their influence on the water and energy balances, respectively. Since the forecasts are initialized with data from the model-based ERA5 reanalysis, errors in forcing data (e.g. precipitation) can propagate to temperature forecasts with time.  The second most relevant group is the land surface, dominated by soil moisture as the most important variable within this group. We find that the soil moisture, especially the moisture in the uppermost layer (Table 3), is important in the Northern hemisphere during MAM and JJA (Fig. 3). Soil moisture plays a key role in the exchange of heat and water between both land and atmosphere and is one of the most studied land surface variables as a source of subseasonal predictability (Seneviratne 185 et al., 2010;Guo et al., 2011;Orth and Seneviratne, 2014;Koster et al., 2017Koster et al., , 2020. Furthermore, soil moisture affects temperature forecasts particularly through its profound memory characteristics, which effectively project dry or wet anomalies into the future (Koster and Suarez, 2001;Orth and Seneviratne, 2012;Galarneau and Zeng, 2020). As a result, they can affect temperature forecast quality in regions with strong land-atmosphere coupling (Koster et al., 2004;Orth, 2021). Many recent model developments are targeted at an improved representation of soil moisture dynamics and soil moisture data assimilation, 190 and given our results, this is a promising avenue towards reducing biases in temperature estimates (Koster et al., 2011;Albergel et al., 2013;Dirmeyer and Halder, 2017;Dirmeyer et al., 2018). Koster et al. (2004) highlighted Africa, North America and India as global "hot-spots" of land-atmosphere coupling in JJA, therefore it is expected that soil moisture-related variables would be particularly relevant for forecast errors there. While soil moisture does not appear to be the most relevant variable (Fig. A3), the land surface variables are still largely related to We find that variables from the circulation group are the least relevant for explaining temperature forecast errors globally, their importance is confined to small regions scattered across the world. The most relevant variables within this group are the large scale circulation indices NAO and MJO and the zonal and meridional surface pressure differences (Table 3), which determine large-scale advection of air masses and hence the spatial temperature fields. A case study on this matter was presented 210 by Grams et al. (2018). They found a connection between a strong temperature bias over Europe, predicted with the ECMWF's Integrated Forecasting System, and the misrepresentation in the onset of a blocking regime. This misrepresentation generated a chain reaction in simulated latent heat release that amplified and propagated the forecast error to a larger region. Such a finding reveals the potential of the assimilation of circulation information to reduce forecast errors and uncertainties (Lei and Anderson, 2014;Smith et al., 2020).

215
The overall picture of the importance of each group does not substantially change with lead time (Fig. A4). Around 18% of the grid cells show (absolute) correlations higher than 0.3 (Fig. 4). The strongest correlations (>0.6) and hence most promising regions for improvement are found in eastern South America, eastern Europe, southeast Asia and Southern Africa. It should, however, be noted that despite being significant after correction for multiple testing, and also being substantially different from correlations based on non-sense data (Fig. A5), these correlations from the present exploratory analysis do not directly indicate 220 the extent to which the forecast error could potentially be improved based on these variables.
We acknowledge that some of the selected Earth system variables, especially those that belong to the climate group, are highly correlated (Fig. A6). The most cross-correlated variables are surface terrestrial radiation downwards, sea surface temperatures, specific humidity and snow cover fraction. Nevertheless, with our approach of individual correlations between each variable and the temperature forecast error we avoid the risk of co-linearities among the set of variables.

225
Next, we focus separately on the relevance of each group of variables for temperature forecast errors, as this may be hidden behind the most relevant group/variable but still exhibit significant correlations. The climate group is not only the most important group from a global perspective (Fig. 3), but it also shows significant correlations in many areas where it is not topranked (Fig. 5). Land surface variables are globally relevant for forecast errors in similar areas as the climate group. The partly 13 Figure 5. Ranking of groups of Earth system variables (climate, circulation, and land surface) according to the Spearman correlation coefficient of the most relevant variable within each group. Gray areas indicate that no variable from the group exhibits a significant correlation with temperature forecast errors.
coinciding relevance of the climate and land surface groups is related to the strong coupling between them, as for example 230 solar radiation or precipitation are related to soil moisture, vegetation state and evapotranspiration. The circulation group is still found to be of minor relevance at the global scale. We attribute this to the short time scale variability of the circulation variables such as the winds and the surface pressure differences.

Possible physical mechanisms behind forecast errors in focus regions
Within the six focus regions highlighted in Fig. 3, the relationships between forecast errors and its most relevant predictors show 235 strong seasonal variations (Fig. 6). In the sq focus region (first row in Fig. 6), the most relevant variable is specific humidity during the onset of the regional warm season (SON). We mainly find errors close to zero around zero specific humidity anomalies and increasing errors magnitude with increasing specific humidity anomalies. Incorrect initial humidity status can lead to forecast errors in temperature through multiple ways: (i) the potential for cloud formation and related reduced incoming solar radiation depends on the amount of water vapor in the air, (ii) evapotranspiration is largely determined by convergence 240 of water vapor, and (iii) water vapor content in the air affects plant's stomatal resistance and therefore evaporative cooling. We attribute these temperature forecast errors to an inadequate representation of these three processes in the forecasting system.
The most relevant variable in the ef focus region (second row in Fig. 6) is evaporative fraction anomaly. For negative (positive) values of this variable, we see overestimation (underestimation) of temperature. One possible reason is the outdated representation of the low vegetation cover in HTESSEL around this region, as found in Boussetta et al. (2021). An inaccu-245 rate prescription of vegetation type and cover fraction can lead to a misrepresented evapotranspiration response resulting in temperature errors.
In the tp focus region (third row in Fig. 6), the most relevant variable is precipitation. The region is semi-arid, therefore it is expected that precipitation events affect air temperature. We mainly see the influence of precipitation on temperature forecast errors after a threshold around 10 mm/d, and generally the related errors are positive. One pathway in which precipitation 250 influences temperature after three weeks lead time is through the infiltration in the soil and the subsequent evaporative cooling (Miralles et al., 2012;Orth and Seneviratne, 2014). We attribute this association between large precipitation values and positive temperature forecast errors with the misrepresentation of the evaporative cooling resulting from elevated (surface) soil moisture in this region.
In the sm surf focus region (fourth row in Fig. 6), the surface soil moisture is the most relevant variable. During JJA, there 255 is temperature overestimation (underestimation) in the presence of low (high) soil moisture content. This region is located in a transitional climate regime (between water and energy-controlled conditions) where the land-atmosphere coupling is typically strong (Seneviratne et al., 2010;Orth, 2021). The strong variability between the smoothing lines in this region in Fig. 6 reflects the heterogeneous land surface with variable soil and vegetation characteristics. This is particularly difficult to represent in the forecasting model and similar to sq focus region, the sm surf focus region's vegetation representation in HTESSEL is outdated, 260 for both low and high vegetation cover (Boussetta et al., 2021), therefore we attribute the high correlation between temperature forecast errors and soil moisture to the misrepresentation of the land-atmosphere coupling.
For the meridional sp focus regions (fifth row in Fig. 6), the most relevant variable is related to processes that drive wind's magnitude and direction and also moisture and heat flux transport. The representation of winds in these regions might be affected also by the high biases in the vegetation cover in HTESSEL which include potentially erroneous assumptions on 265 surface roughness. Even though these processes mainly occur over short time scales, there are cases in which a blocking regime can last for several days and affect temperature forecasts (Grams et al., 2018).
The ENSO focus region (sixth row in Fig. 6), is located in eastern South America. ENSO is the main source of interannual hydroclimatic variability in this region (Poveda et al., 2006;Arias et al., 2021). We see overestimation (underestimation) of temperature during the positive (negative) phase of ENSO. The positive phase is associated with reduction in precipitation and 270 increase in surface temperature over the region. By contrast, its negative phase is associated with an increase in precipitation 15 Figure 6. Relationships between the most relevant Earth system variable and forecast error in the focus regions introduced in Table 2 and Inset numbers represent the percentage of grid cells in each region with significant correlation between forecast error and the Earth system variable. and a reduction in surface temperature (Wang, 2004;Salas et al., 2020). These changes also modify the strength of the regional winds (both easterlies and westerlies) that can additionally explain the temperature forecasting errors found along with increased magnitude of ENSO index values.
3.4 What is the potential to improve temperature forecasts at the S2S time scale? 275 We identify in Fig. 7 regions where (i) forecast errors are relatively high and (ii) rather well explained with any of the considered Earth system variables, which indicates some potential for improvement. We use the quantile criteria described in Sect. 2.5.
Regions that exceed the 90th percentile in both tested conditions have the highest potential for improvement. We find the most promising regions around the northern hemisphere extratropics, especially in western North America, Eastern Europe and in the eastern Himalayas. We also see medium potential over some regions in central Africa, western Australia and central and 280 eastern South America. These regions display a climate variable as the most relevant Earth system variable. This could be due to (i) deficiencies in the assimilation of climate variables prior to forecast initialization and (ii) imperfect process representations in the forecasting model which limits its ability to accurately propagate the initial climate state into the subsequent weeks.
In the case of the climate variables, the detected potential for improving temperature forecasts is probably related to sparse precipitation observations available for assimilation, as well as an imperfect representation of the coupling of precipitation 285 and radiation with temperature in the model. The latter might be more relevant in Europe and North America with transitional conditions between energy and water-controlled evapotranspiration, while additional precipitation information could probably improve forecasts around the Himalayas and in some regions in Africa.
In contrast, we find relatively low potential mainly across eastern North America, Australia and India. This low potential is probably related to low forecast errors (Fig. 2) and not necessarily low correlation values (Fig. 4).

4 Conclusions
We analyze temperature forecast errors in subseasonal forecasts of the ECMWF and their relationship with a set of Earth system variables. Thereby we compute correlations between forecast errors and Earth system variables as a measure of their potential to inform even more accurate temperature forecasts. While previous studies have assessed the role of individual (sets of) variables, we move beyond the state-of-the-art by analyzing a comprehensive set of Earth system variables covering 295 multiple components of this system including several potential sources of sub-seasonal forecast skill.
The results show that climate-related variables, particularly precipitation and surface solar radiation, are globally the most relevant variables from our considered set of Earth system variables. There is still room for improvement in terms of climate data assimilation in data-sparse regions, and in terms of the modeled evolution of the climate over the forecasting period through, for instance, continuously improving descriptions of e.g. atmospheric dynamics, convection and condensation 300 schemes. Next to this, the land surface-related variables, particularly surface soil moisture, are also important in many regions. This is specially relevant for subseasonal forecasts given the profound memory characteristics of land surface variables for which anomalies present at the forecast start can persist for weeks or months. Although to a smaller extent than for the climate predictors, we find regions and seasons in which the land surface information is strongly correlated with the temperature forecast 305 errors. This also emphasizes the importance of accounting for the land surface variability in the weather forecasting system, thereby taking advantage of the recent developments in respective in-situ observations and satellite based data (Balsamo et al., 2018;Eyre et al., 2022). For instance, regions with water-limited conditions (typically semi-arid regions) can experience impacts of soil moisture variability on temperature (Ford et al., 2018;Jach et al., 2022). Even though surface soil moisture has less memory than deep soil moisture, we find that the first one is typically more relevant than the second one for forecast errors.

310
This is mainly related to its more direct impact on surface temperature. Surface soil moisture affects evaporation from soils, the transpiration from short vegetation without deep-reaching roots, and the transpiration of vegetation with both shallow and deep roots as it might extract water from surface layers as long as possible. Through these pathways, it can have a significant impact on the surface energy balance and hence temperature. Furthermore, surface soil moisture typically exhibits a larger variability compared with deeper soil moisture which can also lead to stronger impacts on temperature. In dense forest regions 315 we expect that the deep soil moisture has a more substantial effect on temperature forecast errors than the surface soil moisture because their rooting systems are more suitable for extracting water from the deepest layers.
Moreover, the vegetation (in terms of biomass, greenness and photosynthetic activity) can amplify the land surface effect on temperature through evaporative cooling via the LAI and stomatal resistance. This is particularly relevant at the S2S scale after extreme events like droughts and heatwaves (Bastos et al., 2020;Byrne, 2021) in regions with strong land-atmosphere 320 coupling where vegetation functioning can be affected even after the event is over because of legacy effects such as hydraulic damage or depleted carbon reserves.
The circulation-related variables, even though they are globally the least relevant variables, also highlight regions where temperature responds to large scale circulation patterns and surface pressure regimes. These results align with recent literature on subseasonal forecasts which contains a relatively strong focus on weather phenomena, such as NAO, MJO and ENSO 325 (Scaife et al., 2016;Vitart and Robertson, 2018;Mariotti et al., 2020;Falkena et al., 2022;Smith et al., 2020;Kim et al., 2021;Liu et al., 2021;Meehl et al., 2021). These phenomena are mainly driven by surface pressure differences and sea surface temperatures (Ehsan et al., 2021). An improved representation of these variables therefore offers the potential to enhance forecast skill in some regions of the world.
A main limitation in our analysis approach is that the considered Earth system variables are somewhat correlated with each 330 other. As our methodology is based on individual correlations it is not directly affected by relationships among the considered variables which would be the case for, e.g., multivariate approaches. Yet, we note that our results can only indicate first-order effects by pinpointing the most relevant variables and future work should focus on disentangling the joint effects of multiple variables to advance the understanding of temperature forecast errors even further. We note that the reported correlations, and the rankings between them, are not solely related to physical linkages and respective deficiencies in the forecast model and 335 data assimilation system, but can also be affected by different levels of precision of the observational data sources. This means that noisy observation-based records might not correlate as strongly with temperature forecast errors as they would if they were measured more accurately. Therefore, our analysis reflects the current potential of each Earth system variable to inform more accurate temperature forecasts, rather than the full potential which might be exploited by even more accurate future Earth observations. Finally, our results are limited also by our considered set of Earth system variables; even though we use a 340 comprehensive set of variables, there might be variables or states which are not yet observed at large spatial scales but relevant for improving the forecast model and its forecasts.
Despite the limitations of this study, our findings do provide information on processes that can be improved to increase the temperature forecast skill of the ECMWF-S2S forecasting system. Even though our results are based on just one forecasting system we suggest that other forecasting models can exhibit similar patterns in sources of predictability when using the same 345 drivers that we explore here. Machine-learning-based studies in temperature forecasts from different models at the subseasonal scale have highlighted some of the same drivers found here as important sources of predictability (Herman and Schumacher, 2018;Rasp et al., 2020;Ardilouze et al., 2021). Accurate predictions at this time scale are particularly important for issuing early warnings of extreme events, for adequately managing resources, and, more importantly, for minimizing human risks (White et al., 2017;Vitart and Robertson, 2018).   23 Figure A4. Fraction of grid cells (%) where each group of Earth system variables is the most relevant across lead times (Y axis) for each season (X axis). Figure A5. Strongest absolute Spearman correlation coefficient between the Earth system variables in Table 1 randomly permuted and the temperature forecast error in Fig. 2. The inset barplots represent the histograms of these global Spearman correlation values. Figure A6. Spearman cross-correlation matrix among the 36 considered Earth system variables. A PDF version of this plot is provided in Supplementary material.
Code and data availability. The S2S forecast temperature data are available at https://apps.ecmwf.int/datasets/data/s2s/. The CPC reference temperature data are available at https://psl.noaa.gov/data/gridded/data.