A method to estimate freezing rain climatology from ERA-Interim reanalysis over Europe

A method for estimating the occurrence of freezing rain (FZRA) in gridded atmospheric data sets was evaluated, calibrated against SYNOP weather station observations, and applied to the ERA-Interim reanalysis for climatological studies of the phenomenon. The algorithm, originally developed at the Finnish Meteorological Institute for detecting the precipitation type in numerical weather prediction, uses vertical profiles of relative humidity and temperature as input. Reanalysis data in 6 h time resolution were analysed over Europe for the period 1979–2014. Mean annual and monthly numbers of FZRA events, as well as probabilities of duration and spatial extent of events, were then derived. The algorithm was able to accurately reproduce the observed, spatially averaged interannual variability of FZRA (correlation 0.90) during the 36-year period, but at station level rather low validation and cross-validation statistics were achieved (mean correlation 0.38). Coarse-grid resolution of the reanalysis and misclassifications to other freezing phenomena in SYNOP observations, such as ice pellets and freezing drizzle, contribute to the low validation results at station level. Although the derived gridded climatology is preliminary, it may be useful, for example, in safety assessments of critical infrastructure.


Introduction
Freezing rain (FZRA) is liquid, supercooled precipitation which freezes when coming into contact with solid objects, forming a coating of ice (World Meteorological Organisation, 2010, 2011).It is a relatively rare but high-impact wintertime weather phenomenon, and in Europe it affects mainly the central, eastern and northern parts of the continent.Although major events resulting in heavy ice accretion are not as common as lighter cases, the direct damages they cause to critical infrastructure (transportation, communication, and energy), and forestry are substantial.For example, the ice coating formed on trees and power lines causes them to fail, leading to severe power outages, transportation disruption, delays in emergency responses and severe economic losses (Call, 2010;Lambert and Hansen, 2011).Lighter freezing rain events are also harmful because of their indirect effects, the most important being the reduced friction on road surfaces that results in increased rates of accidents, injuries, and difficulties in transportation (Degelia et al., 2015).
During recent years some major events have been experienced across Europe.Between 31 January and 3 February 2014 a prolonged, heavy FZRA and blizzard event hit the Alpine region, Hungary, and the Balkan Peninsula.In Slovenia, over these 4 days, 40-300 mm precipitation fell and resulted in 10 cm accumulation of ice (Markosek, 2015).Severe damage was caused to critical infrastructure, e.g. 30 km of power lines were completely destroyed and 174 km were inoperative, while railways and road traffic were heavily disrupted or closed for several days.Over 0.5 million hectares of forest were damaged, and the total cost resulted in around EUR 400 million (Vajda, 2015).In Croatia, over 80 % of the population were left without electricity (Editorial, 2014).Other severe cases include the Moscow FZRA event during 25-26 December 2010, when flights were cancelled at the Domodedovo Airport and power supplies to trains, trams, and buses were destroyed; and the 13 December 2012 case, when the British Isles and France suffered from a mix of freezing rain and snow.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Despite the severe impacts of FZRA events, only a few publications concerning the European climatology of this phenomenon currently exist.Carrière et al. (2000) provided a climatology of freezing precipitation (including FZRA, freezing drizzle, and ice pellets) based on SYNOP weather station reports from three winters.Bezrukova et al. (2006) and Groisman et al. (2016) presented climatological information of FZRA over Russia and eastern Europe.Most of the other studies focus on the occurrence of in-cloud or nearsurface icing (e.g.Bernstein et al., 2009;Bernstein and Le Bot, 2009;Le Bot, 2004; Le Bot and Lassegues, 2004).More comprehensive studies have been undertaken on ice storm and FZRA climatologies over North America, including studies for various regions (Cortinas, 2000;Changnon, 2003;Cortinas et al., 2004), impacts of ice storms on different sectors (Proulx and Greene, 2001), and changes in FZRA climatology (Cheng et al., 2007(Cheng et al., , 2011;;Lambert and Hansen, 2011;Klima and Morgan, 2015) as well as on various wintertime precipitation types (Stewart et al., 2015) and impacts and details of FZRA storms and related synoptics (Hosek et al., 2011;Call, 2010;Roberts et al., 2008).
The two formation mechanisms of FZRA are rather well known.In the majority of cases a near-surface freezing layer with an accompanying melting layer above leads the hydrometeors -formed above these layers -to be in a liquid, supercooled phase when they hit the ground and freeze on contact with objects at the surface (World Meteorological Organisation, 2010).Other FZRA cases occur without the cold layer -melting layer structure, as a result of the warm rain process (Bocchieri, 1980;Huffman and Norman, 1988;Rauber et al., 2000) where collision and coalescence of the small droplets ensure the liquid form.The latter mechanism is usually associated with drizzle or freezing drizzle, but in some cases it may lead to formation of FZRA.
Several approaches have been developed for identifying wintertime precipitation types (e.g.snow, ice pellets, freezing rain) in numerical weather prediction (NWP) models, most of them for North America and many of them reviewed by Cortinas et al. (2002).With varying complexity, all of them are based on the vertical temperature profile which is used to predict the state of the hydrometeors in the atmosphere and on the surface level in particular.Usually the vertical humidity profile is used as well.For example, Ramer (1993) developed an empirical method which explicitly resolves the melting and freezing of the descending hydrometeors.This method has been widely used in NWP and in related studies (e.g.Reeves et al., 2014), showing good skill among the other precipitation typing algorithms.A slightly more indirect, and perhaps simpler approach was presented by Bourgouin (2000), who estimated the phase of precipitation based on areas between the temperature profile and the 0 • C isotherm on a tephigram.
Three-dimensional gridded meteorological data sets at daily or subdaily temporal resolution, such as output from numerical climate models or reanalysis models, are com-monly used in climate studies to account for gaps in time series of weather station data and to fill the sparsely covered areas, like seas and the above-surface atmosphere.However, a variety of issues complicate their use for estimating precipitation types.One important uncertainty arises from the fact that it is not straightforward to compare point-like weather station observations (representing local climate) and grid cells (representing climate of a larger area).As shown, e.g. by Stewart et al. (2015), Reeves et al. (2014), andRyzhkov et al. (2014), even small details in the vertical distribution of temperature can affect the surface precipitation type.However, because a gridded data set typically has rather coarse horizontal, vertical, and temporal resolutions, its vertical temperature structure may differ locally from the reality.Similarly, very minor modelling uncertainty or natural uncertainty related to subgrid-scale processes, might cause the predicted temperatures of the freezing or melting layers to be slightly off from the values that would lead to FZRA.
In this study we introduce a freezing rain detection algorithm, originally developed and operationally used in the numerical weather prediction at the Finnish Meteorological Institute (denoted by FMI NWP ), and implemented here for climatological applications.The algorithm was applied to the ERA-Interim reanalysis data (Dee et al., 2011) to test the applicability of a coarse-resolution method in derivation of proxies of FZRA and to provide a climatology of FZRA in Europe for the period 1979-2014.First, the SYNOP weather station and reanalysis data sets used in calibration of the algorithm are introduced, along with an optimization-based calibration procedure; second, the calibration results are validated using multiple approaches; and lastly the climatology is produced and analysed shortly.In the analysis the following statistics are focused on the total number of 6 h FZRA cases at each station or grid cell during the 36-year study period and the average frequency of these cases per location in the whole study domain and in three subregions as a function of time.In addition, duration and spatial extent of the FZRA events are studied.

Materials and methods
The FMI NWP algorithm uses threshold values of the air temperature and humidity in the near-surface freezing layer and in the above melting layer to distinguish the FZRA events from non-events.In order to estimate the FZRA climatology in Europe based on reanalysis data, these threshold values needed to be reconsidered for two reasons.First, some of the original threshold values were subjectively selected by meteorologists in Finland, which involves uncertainty related to subjective decisions.Second, the values are likely to be somewhat sensitive to potential biases in temperature and humidity, and these biases may be different in NWP data and in reanalysis data.Consequently, a calibration procedure was developed that employed SYNOP weather station observa-tions to redefine the threshold values of the parameters, as discussed below.The calibrated version of the FMI NWP algorithm, used in the climatological analysis of FZRA, is denoted here by FMI CLIM .

SYNOP weather station data
For validation and algorithm-calibration purposes, the observed occurrence of FZRA events was derived from 3 h SYNOP weather station recordings.Data from 4600 manually operated stations were collected from the Meteorological Archival and Retrieval System (MARS) of the European Centre for Medium-Range Weather Forecasts (ECMWF).Automated stations were not accepted due to their reduced ability to distinguish different types of precipitation, especially freezing rain and freezing drizzle (Marijn, 2007).The present weather part of a SYNOP observation (World Meteorological Organisation, 2011) consists of 100 codes describing the most important weather at the time of observation and 1 h before it.In this study the WMO codes directly referring to FZRA were selected to represent the phenomenon: 24 (freezing rain within the previous hour but not at observation time), 66 (light freezing rain), and 67 (moderate to heavy freezing rain).
To be included in this study, the stations were required to contain a valid present weather code in > 80 % of the 3 h time steps during the period 1979-2014.If stations with shorter or less regular records had not been excluded, the observed number of FZRA events per station might have been distorted.Besides, regularly working and maintained stations with high-frequency observations are assumed to be more reliable.Altogether 525 stations out of 4600, presented in Fig. 1, passed these first conditions.During the cold season months September-May the proportion of missing data was 10 % on average.
For calibration and validation purposes, and for further analyses of FZRA in station locations, the data were filtered further: -Stations located above 2000 m above sea level (8 stations) were excluded as the algorithm does not have enough pressure levels for high elevations.
-Stations having less than 10 FZRA observations (224 stations) were excluded because reliable observations might be difficult for observers with limited experience of the phenomenon due to its rarity.
-To exclude grossly erroneous recordings, the FZRA observations with surface temperatures below −15 • C or above +5 • C were rejected.
Applying the above-mentioned restrictions excluded 224 stations (circles in Fig. 1) and thus reduced the total number of stations to 293.These remaining stations were used in the calibration and validation process of this study.Finally, the time steps 00:00, 06:00, 12:00, and 18:00 Z were picked from the 3 h SYNOP observations to allow direct comparisons of the present weather codes with predictions of the FZRA events that were derived with the FMI CLIM algorithm using the 6 h ERA-Interim reanalysis data.The comparisons were conducted both for the individual stations and for spatially clustered stations.
In order to divide the stations into clusters, variables which best explain the spatial differences in the total number of observed FZRA cases in 1979-2014 were sought.Therefore, different linear models were fitted.Strictly speaking, the number of cases are count data (non-negative integer values from counting) and should be modelled using the non-linear Poisson regression, but this did not change our conclusions and linear models are somewhat easier to understand.Finally, the best variables were used to classify stations into subgroups and further analyses were performed separately for these groups.The variables studied were the distance to the nearest coastline (NASA, 2009), station elevation, and longitude.When modelling the number of cases with only one variable, the distance to the coastline and elevation were the best ones in explaining the variance (e.g. the adjusted R 2 for distance was 0.06, p 0.001).Because elevation and dis-tance are rather strongly correlated, using both or either interaction term did not improve the results substantially; that is, it did not change the adjusted R 2 .For this reason, only the distance to the nearest coastline was used for the classification.The stations were grouped as "coastal" (0-140 km), "semi-coastal" (140-330 km), and "continental" (> 330 km).Boundaries of classes, shown in Fig. 1, were selected so that each group contained an equal number of validation stations.

ERA-Interim reanalysis data
Relative humidity and temperature from 925, 850, 700 hPa and 2 m levels, surface pressure, and precipitation of the ERA-Interim reanalysis data set (Dee et al., 2011) were used as predictor data.The variables were derived from the 6 h analysis part of ERA-Interim except that for precipitation the 6 h forecasted part was used and the 12 h precipitation sums were transformed into 6 h sums.For calibrating and validating the FZRA detection algorithm, the data were bilinearly interpolated to station locations.The climatology of FZRA was derived using the gridded data in the original 0.7 • × 0.7 • resolution.
When the predicted and observed FZRA events were compared at station level, time steps without SYNOP observations of present weather codes were also excluded from the interpolated reanalysis data, which ensured the comparability.Even though excluding those time steps leads to an underestimation of the total number of FZRA cases, the effect is minor, because the proportion of missing data was only 10 % on average during winter months.

FMI NWP algorithm
The FZRA identification part of the precipitation typing algorithm (FMI NWP ) used at the weather service of the Finnish Meteorological Institute for numerical weather predictions, was adopted in this study.The algorithm uses temperature and relative humidity from four pressure levels (surface, 925, 850, and 700 hPa), and surface air pressure.Surface pressure is used to avoid analysing below-surface data in mountainous regions.The FMI NWP algorithm is originally used to predict locations where FZRA is conceivable, and as such it does not take the modelled precipitation intensity (Pr) into account.In our analyses the precipitation intensity was included to identify the actual FZRA cases.For FMI NWP the Pr value 0.15 mm 6 h −1 was selected so that the identified total number of FZRA cases, calculated as a sum over the whole study domain and all the years, corresponds to the observed total of 11 000 cases in the 293 stations.
A pseudo-code representation of the algorithm is shown in Fig. 2. First, the preconditions for FZRA are checked: (1) the near-surface air temperature T2m = T cold has to be lower than its predefined threshold value T thr cold , (2) the maximum temperature of the above-surface layers T max , and (3) the surface precipitation rate Pr need to be higher than their thresh-old values (T thr melt and Pr thr , respectively).In the next step the upper level of the near-surface cold layer p cold is defined by selecting the pressure level closest to the ground surface, while additionally taking into account the minimum acceptable cold layer depth h thr cold (in hPa) which ensures that the falling raindrops are properly supercooled.Finally the existence of a moist and warm melting layer is checked by investigating the layers above the cold layer.If temperature and humidity RH in at least one of those layers are above their predefined thresholds T thr melt and RH thr melt , FZRA is predicted.Compared to other precipitation detection algorithms, such as those presented by Ramer (1993) and Bourgouin (2000), the FMI NWP is presumably faster to implement and run, which makes it ideal for analysing large climatological data sets.Of these two algorithms, FMI NWP resembles Bourgouin (2000) more, mainly because the depth and temperature of the near-surface cold layer (h cold = p surf − p cold and T cold , respectively) together describe the energy required to supercool the raindrops.FMI NWP assumes that the melting layer and the layer where precipitation is generated are the same, even though in reality they can be separated so that precipitation is formed above the melting layer.

Calibration
In calibration, each of the threshold parameters (h thr cold , T thr cold , T thr melt , RH thr melt and Pr thr ) was discretized to cover the practical, realistic range, and the calibration was then performed in a multiple loop, where each combination of the parameters was tested using the algorithm and a suitable reward function.
The comparison between the algorithm results and FZRA observations can be presented in a 2 × 2 contingency table (Table A1 in Appendix A), so standard verification measures (e.g.Jolliffe and Stephenson, 2012) can be used.Four candidates were tested to find the appropriate reward function for optimization, namely the proportion correct (PC), the critical success index (CSI), the Heidke skill score (HSS), and the symmetric extremal dependence index (SEDI; see Appendix A for the definition of these and other verification measures used in this study).All candidates are positively oriented so that the higher the value, the better the agreement between the predictions and observations.
When only one measure is used, the CSI is the best one, because strongly biased solutions automatically get worse CSI values.However, all tested reward functions tend to favour biased solutions, either by overestimating (PC, HSS, SEDI) or underestimating (CSI) the total number of FZRA events over all stations and years.This is not desirable as the main interest of the study is in the occurrence climatology of FZRA.Therefore an additional, bias-dependent term was added, and the final form of the reward function was  where B is the bias (Eq.A7).The logarithm scales B from −∞ to +∞, the best, unbiased value being zero.Therefore biased solutions result a non-zero bias term that is then subtracted from CSI. Adjusted values used in the algorithm are presented in Table 1.

Climatological analysis
Climatological analysis was performed separately for SYNOP observations, for ERA-Interim in station locations, and for ERA-Interim in the original grid.The following statistics were calculated.
-Total numbers, mean annual numbers, and mean monthly numbers of 6 h FZRA events per station or grid cell were calculated in 1979-2014.In this analysis one FZRA event occurs when one station or grid cell encounters freezing rain in one time step.
-Spatially averaged annual mean numbers of FZRA events in 1979-2014 were calculated.Spatial averaging was performed over stations in subgroups, and over all stations.Definition of an event is the same as above.
-Duration of FZRA events were calculated separately for station data and for gridded data.In this analysis one FZRA event occurs when one station or grid cell encounters freezing rain in one or in successive time steps.Durations were calculated from a 1-dimensional vector containing the time series of stations or grid cells in a row.Station data with 293 stations and 53 000 6 h time steps comprise a total of 15 million data points.
The gridded reanalysis data consist of 4000 grid cells and 53 000 time steps with 200 million data points altogether.
-Spatial extent of 6 h FZRA events were calculated separately for station data and for gridded data.In this analysis one FZRA event occurs when one or multiple stations or grid cells encounter freezing rain simultaneously.The spatial extent was calculated based on the number of impacted stations or, for gridded data, based on the spatial coverage of impacted grid cells over the domain, using an approximative 6000 km 2 grid cell size.
After deriving durations and spatial extents of events, empirical probability distributions (containing events and nonevents) were formed.

Results
In the following, we first have a look at the refined threshold values in the FMI CLIM algorithm.The performance of the algorithm in predicting FZRA cases is then assessed using the observed weather station data, and finally, the findings concerning climatological features of FZRA are presented.The cross-validation results are based on calibrations in subperiods, while other validation results are based on the final calibration.

Cross-validation of calibration
To study the sensitivity of the threshold values to the selection of the calibration period, a cross-validation framework was applied, where the total of 36 years of data was divided into five non-overlapping subperiods.The calibration was then performed for each of them separately, using Eq.(1).Each subperiod contained a 29-year calibration part and a 7year validation part so that the validation years were different in all subperiods.The means calculated over the results from different calibration periods were used for the climatological analysis of FZRA: they are close to the values that were achieved using the whole 1979-2014 period for calibration.
The calibration changed most of the threshold parameter values only slightly (Table 1), which confirms that (1) the FMI NWP algorithm performs as designed, and (2) no strong biases in mean values of different variables exist in the ERA-Interim data.For example, if the mean temperatures in some of the layers studied were far from the reality, the optimal value would have drifted away from the physically motivated 0 • C limit in the calibration.As an exception, the minimum depth of the near-surface cold layer h thr cold was notably altered by the calibration, as the optimal value appeared to be 69 hPa, which is over 300 % larger than the original value, 15 hPa.In the lower troposphere these correspond roughly to the depths of 600 and 130 m respectively.The observed range of the minimum cold layer depth varies considerably between sta- tions, as shown by Bernstein (2000), being between 100 and 900 m; 600 m is thus likely to be more representative than 130 m over all European stations.The bias-dependent part of the reward function (Eq. 1) was useful in stabilizing the calibration and excluding the less credible combinations of threshold values.Without it, calibration introduced new biases (not shown) either in the annual number of the FZRA cases in subgroups (Fig. 3), or in the total number of cases per station (Fig. 1).
As Table 2 shows, the calibration enhanced all validation metrics except bias and F .On average, the hit number a was slightly improved, and the false alarm b and miss c numbers were both reduced.While H slightly improved, the variability of b was rather large, and the change of F is not statistically significant.CSI was improved by 8 %, HSS by 7 % and SEDI by 2 % in calibration, and these changes were statistically significant.The absolute values of CSI and HSS are rather modest, but it is well known (Jolliffe and Stephenson, 2012) that these measures tend to zero when the base rate is very low, as is the case of FZRA.SEDI is designed for the evaluation of rare events and gives much higher values.Still, only one event in five is correctly detected (H ≈ 0.20).

SYNOP weather code classification
The algorithm-based (FMI CLIM ) classifications of weather situations to FZRA events were compared with SYNOP observations of present weather at the validation stations.In Fig. 6a, the distribution of observed SYNOP codes when an event was classified as FZRA is presented.The distribution shows that only 22 % of classified events coincide with SYNOP codes for FZRA (codes 24, 66, and 67).About 32 % of classifications coincide with codes when no rain of any kind was observed at the SYNOP stations (codes 2 and 10), although reanalysis data did imply rain of at least 0.4 mm 6 h −1 (Table 1).Codes associated with light precipitation of other types (drizzle, rain and snow, codes 56, 61, and 71) coincide with FZRA classification, light rain about 6 % and other two about 7 % each.However, most of these codes occur much more often than codes associated with FZRA.
In order to illustrate how the relative numbers of false alarms deviate between different SYNOP present weather codes, the proportion of classified FZRA events in each code is shown in Fig. 6b.For SYNOP codes associated with FZRA, this proportion can be interpreted as the hit rate H (Eq. A5).For SYNOP codes not associated with FZRA, this proportion can be interpreted as the false alarm rate F (Eq. A6

Number of freezing rain events at weather stations
The total number of FZRA events at each station according to the observations was compared with the number of FZRA events according to FMI CLIM (Fig. 5).The comparisons were performed for all the 293 stations in the calibration set and separately for the three subgroups of the stations.
In each case the root mean square (rms) error and the mean error (ME) were calculated, and the FMI CLIM was modelled as the function of observations using the local polynomial regression method, loess (Venables and Ripley, 2002).The mean number of events calculated with FMI CLIM is almost the same (ME = 0.0) with those observed for "all stations" (Fig. 5a).However, the distributions are rather different, the distribution of the FMI CLIM result is somewhat symmetric around the mean, but in the case of the observations the number of stations with a small number of FZRA events is higher, and the tail of stations with a high frequency of events is much longer.For "all stations" and small numbers of events, FMI CLIM models the average number of events well with some overestimation, and the loess curve is very near the diagonal line.However, for larger values, the loess curve is nearly horizontal, implying that FMI CLIM cannot properly model the stations where the large values occur.In the continental group (Fig. 5d), the curve is horizontal, or even negatively correlated, for all values.In the coastal group the rms error is the smallest but there is an underestimation (ME < 0), while for the continental group the rms error is the largest and there is an overestimation (ME > 0).Smaller rms error in coastal areas can be partly explained by the lack of large values that would contribute to rms.The spatial averages were computed over all the 293 stations in the calibration set and separately for the stations in the coastal, semi-coastal, and continental subgroups.Because of the uncertainties in the SYNOP weather station observations or in the ERA-Interim reanalysis, calibration had only a minor effect on the spatially averaged annual numbers of FZRA events inside the subgroups or over the whole station network (Fig. 3 and Table 3).Compared to the observations, the calibrated and uncalibrated algorithms both slightly underestimate the coastal mean number and overestimate the continental mean number of events, the semi-coastal group being better modelled.Both versions of the algorithm over- estimate the standard deviation in all groups and at individual stations.The annual numbers of FZRA events are reproduced better as spatial averages across the weather stations than at individual sites.The rms error of spatial averages is smaller than the standard deviation of observations, except for the continental group, implying that results in general are better using FMI CLIM or FMI NWP than just using the climatology.For individual sites, however, the rms error is slightly worse than the standard deviation of observations.An area of underestimation can be seen, e.g. in Germany, Czech Republic, and in most stations of Poland, while overestimation mostly happens in the eastern validation stations (Figs. 1 and 4).These biases are spatially homogeneous, independent of algorithm versions (not shown), and do not change with time.The lack of freezing rain, mostly seen in coastal stations, is modelled well, assuming that observations are correct.

Validation of near-surface predictor variables
In addition to the distance to the coastline, orography is likely to affect the occurrence of FZRA (Sect.2.1).This suggests that highly variable terrain might cause strong local maxima in the observed occurrence, as the fringe areas of large plains surrounded by mountains are more prone to FZRA.For example, the slopes and valleys surrounding the Great Hungarian Plain might experience cold air damming, a phenomenon which is associated with ice storms in North America (Forbes et al., 1987).This phenomenon can also happen on smaller scales, but presumably it cannot be resolved in our work due to the coarse spatial resolution of the reanalysis data.To explore the possible effects of resolution on the accuracy of reanalysis variables, the correlation coefficients between SYNOP observations and ERA-Interim reanalysis surface variables were calculated at the validation stations.Indeed, correlations for near surface temperature and humidity of low-altitude stations (0.81 and 0.65 for temperature and humidity) are higher than at high altitudes (0.52 for temperature, 0.44 for humidity).Presumably, the highest stations are usually located in highly variable terrain, and rather large grid cells of the reanalysis do not represent the observed variability of these stations well.No strong differences in observed and ERA-Interim mean values of the variables were found, but the differences are again larger in the highest stations, so that ERA-Interim overestimates, for example, the temperatures by 1.6 • C at higher and only 0.3 • C at lower altitudes.

Vertical profiles of temperature and humidity
Figure 7 shows vertical temperature and humidity profiles of ERA-Interim in the validation stations during the observed and predicted FZRA events.As intended, the FMI CLIM algorithm picks the cases for which well-defined melting and freezing layers exist (Fig. 7c).However, as the wider variability ranges in Fig. 7a show, FZRA was observed in much more diverse temperature profiles than predicted.The same feature can be seen in the humidity profiles (Fig. 7b, d).The FMI CLIM algorithm selects cases where the melting layer humidity is high, even though in reality the precipitation can be formed above the melting layer and thus humidity in the melting layer can be low during the FZRA events.The observed and predicted number of FZRA cases (Fig. 7a-e) is 11 000, but only 2300 events happened simultaneously in the observations and prediction (Fig. 7e, f). Figure 5.The total number of FZRA events according to the observations compared with the number of FZRA events according to FMI CLIM for stations in the calibration set, using (a) all 293 stations and stations in (b) coastal, (c) semi-coastal, and (d) continental groups.The curves superimposed in the scatter plots show FMI CLIM as the function of observations using the loess method (Venables and Ripley, 2002) together with the 95 % confidence interval.Note that plotting symbols for groups in (b), (c), and (d) are used also in (a).The bar diagrams present relative frequency distributions.

Duration and spatial extent of freezing rain events
The probability of the most common FZRA events, those detected during a single 6 h time step only, is predicted similarly as in observations by both FMI algorithms (Fig. 8a).
For longer-lasting events the algorithms produce overestimates that increase towards the extreme tail of the durations; at the 10 −6 probability level the overestimation is about 6 h.The spatial extent of FZRA is also overestimated in the extreme tail (Fig. 8b) so that the algorithms overestimate the number of simultaneously impacted stations by about 30 % at the 10 −4 probability level.Additionally the most frequent events, i.e. one impacted station, are slightly underestimated by both algorithms.Durations are modelled better at continental stations than at coastal and semi-coastal stations, and number of impacted stations is modelled almost correctly at coastal stations but poorly at continental stations, with overestimation of almost 100 % by FMI CLIM at the lowest 10 −5 probability level (not shown).

Climatology of freezing rain in Europe
The spatially averaged observed annual mean number of FZRA events in different subgroups varies from 0.8 in coastal stations to 1.2 in continental stations as seen in Table 3. Figure 3 shows that the interannual variability of FZRA events is substantial.Even more importantly, the coefficient of variation -standard deviation divided by mean -is large especially in the continental subgroup: there are years with less than 0.5 FZRA events per station, and years with more than two events on average.Large coefficient of variation may hamper the anticipation and allocation of resources in road maintenance, for example.Weak positive lag 1-year autocorrelations were found in the annual numbers, ranging from 0.20 in semi-coastal stations to 0.32 at coastal stations, which indicates weak but non-zero predictability of the annual FZRA number based on the number of the previous year.
The spatial distribution of the annual mean number of events (Fig. 9a) shows that FZRA is most frequent in eastern Europe.Large areas in Belarus, Ukraine and Russia encounter 2-3 6 h FZRA events per year.The maximum annual number of FZRA cases is situated near the Carpathian mountains, where locally over five events were found on average.The spatial distribution of maximum durations of the events (Fig. 9b) follow qualitatively the mean occurrence distribution in general, but some areas where FZRA is relatively rare, for example the Benelux countries and the Oslo Fjord, have encountered at least one prolonged event.Almost regardless of the latitude, the coastal and marine areas do not experience FZRA as often as the other regions, apparently because warm water bodies effectively prevent the occurrence of near-surface cold layers.An exception is the northern Baltic Sea, having a long ice cover season and relatively frequent FZRA cases.
The FZRA season begins in northern parts of the Fennoscandia as early as September, which can be seen in the monthly climatology maps (Fig. 10).In that area the phenomenon is experienced most frequently in November.After that, in December-February, the temperatures drop so low that the melting layer seldom forms.It is probably for this reason that the total number of FZRA events in northern Europe is rather small, even though the season is the longest, lasting until May.In central Europe and especially in eastern Europe the season is shorter but much more intense, so Top row: profiles when FZRA was reported in SYNOP messages (11 000 events in total).Middle row: FZRA profiles according to the calibrated FMI CLIM algorithm (11 000 events).Bottom row: profiles where both the FMI CLIM algorithm and observations indicated FZRA (2300 events).
that the annual number of FZRA events is larger than in the northern parts of the continent.
The most widespread (Fig. 8b, d) FZRA events at the 10 −4 probability level covered over 600 000 km 2 and impacted over 10 % of the weather stations simultaneously.The longest-lasting events below the 10 −7 probability level lasted over 30 h (Fig. 8a, c).It is worth noting that the longest-lasting cases are not necessarily the same as the most widespread events.The proportion of simultaneously impacted stations in the subgroups varies from 13 % (coastal) through 16 % (semi-coastal) to 30 % (continental stations) at the 10 −4 probability level (not shown).

Discussion
In this paper a freezing rain detection algorithm has been introduced with a method to calibrate it.After validation the algorithm was applied to a reanalysis in order to construct the European occurrence climatology of freezing rain.So far, no complete gridded climatologies of freezing rain have been presented for Europe in the literature.A physically justified, statistically adjusted algorithm, which is mainly based on the vertical temperature profile of the atmosphere, was used in the study to ensure the credibility of the result.Subdaily, quality-controlled Europe-wide SYNOP weather station data were used in the statistical adjustments.
In validation, the gridded meteorological data set is compared with the point-like surface observations.Each grid cell represents spatial means in the 0.7 • resolution, while weather stations represent more local variability of the atmosphere.It is possible that in some cases FZRA has not been observed at a station even though it has occurred nearby.Although hypothetical, this suggests that our estimates, derived from ERA-Interim, might at least occasionally represent the occurrence of FZRA inside the 0.7 • grid cells better than the stations do.

Possible sources of uncertainty
Potential sources of uncertainty in the gridded climatology of freezing rain in Europe, besides the detecting algorithm itself, include human errors in observing FZRA, deficiencies in the ERA-Interim reanalysis data, and effects of subgridscale orography.These issues are discussed in more detail in the following.
Observing FZRA correctly remains a challenge for observers, as the phenomenon can be easily confused with ordinary, non-freezing rain.Particularly minor cases are difficult to detect for two reasons: firstly, they do not necessarily cause significant ice accretion on structures, which would help the identification of FZRA, and secondly, confusion of FZRA with freezing drizzle or ice pellets might happen, especially by inexperienced observers.Additionally, short-term events, which are more common than longer ones (Ressler et al., 2012;Cortinas, 2000) observations.Short-term events are difficult to predict using spatially and temporally smoothed 6 h reanalysis data.
A large number of SYNOP stations was used in the study, which is believed to average out random errors in calculation of spatially or temporally aggregated results, such as mean annual numbers of events in subgroups (Fig. 3).Additionally, the stations having the most complete time series and regular, high-frequency manual observations were included.Still, the effect of human errors can not be totally removed by applying selection techniques to the existing observations.The strongest difference between observations and algorithm results were identified in eastern Europe (algorithm overestimation and/or observational underestimation, as discussed in Sect.2.1 and as seen in Fig. 4) and in continental central Europe (algorithm underestimation and/or observational overestimation).Whether the reason for these differences is in the reanalysis data or in the SYNOP observations remains unclear.
The low-level wintertime temperature inversions in the ERA-Interim reanalysis data are known to be lacking at Arctic latitudes as shown by Serreze et al. (2012).Arguably their result might be valid at least to some extent outside the Arctic.This uncertainty was considered by calibrating the original FMI NWP algorithm instead of using it as such, but apparently the impact of the bias can not be fully compensated by simple adjustments of the threshold values in FMI CLIM .In our analysis the mean temperature profile (red line in Fig. 7a) of the ERA-Interim reanalysis do not show a clear near-surface freezing layer below the melting layer, which either indicates problems in the above-surface temperatures of the reanalysis, or highlights the importance of the warm rain formation mechanism of FZRA compared to the melting layer -cold layer mechanism; FMI CLIM is not able to see the former cases.
Locally, near the mountainous regions, a potentially major source of uncertainty is caused by the orography, which is strongly smoothed in the 0.7 • resolution of the ERA-Interim.The results may be especially biased in subgrid scale valleys, where prolonged FZRA events might be caused by trapped cold air mass.It is possible that the optimal value of h thr cold , found here for FMI CLIM , differs from the corresponding uncalibrated value because high-elevation or mountainous stations were included in the calibration, and the original, uncalibrated version is mostly used to predict FZRA over the mostly flat terrain of Finland, where smaller h thr cold might work well or well enough.
In the algorithm, two thresholds were used to detect situations favouring subcooling of raindrops, namely, minimum required depth and the maximum allowed temperature of the near-surface cold layer.However, fulfilling both criteria mentioned above does not totally guarantee the liquid phase: a too cold or a too deep cold layer refreezes the hydrometeors, which is not taken into account in the current version of the algorithm.This could explain, at least partly, the occasional misclassifications to ice pellets.

Future work
Further exploration of existing data, i.e. observations and the reanalysis, is needed to deepen the knowledge of the phenomenon, including synoptic analysis of the most extreme cases, calculating the precipitation amounts, and studying other freezing phenomena, i.e. freezing drizzle and ice pellets.In addition to these, the following are the key issues for improving the credibility of the current occurrence results.The list is ordered so that the most important tasks are presented first.
1.If the bias structure between the eastern and central European SYNOP stations (Fig. 4) is caused by observational uncertainties and not by reanalysis or algorithmdependent uncertainties, identification and rejection of low-quality stations could enhance the calibration and validation processes.Slightly better validation scores between the observations and prediction were achieved when eastern stations were excluded from the calibration (not shown).
2. Increasing the vertical resolution of the FMI NWP algorithm would be helpful, as small differences in vertical layers easily affect the result (Stewart et al., 2015).
3. Validation of the vertical temperature profile of the reanalysis data against observational soundings would allow the division of the uncertainty into methoddependent and data-dependent components.
4. The description of the near-surface cold layer in the algorithm could be enhanced by defining a closed range in which the parameter h thr cold (or T thr cold ) should hit, instead of considering lower (or upper) limits alone.Additionally, separating the moist and melting layers to be independent from each other should be tested, as well as using wet bulb temperatures instead of air temperatures as predictors in the algorithm.
5. In addition to the FMI algorithm, a new identification methodology could be developed or adopted and tested, including statistical classification methods and more complex but well-performing physical methods, such as ones which more explicitly simulate the melting and freezing of descending hydrometeors (e.g.Ramer, 1993).
6. Small uncertainties in the location, in space and time, of the moving precipitation patterns in the reanalysis increase the uncertainty of algorithm-based FZRA detection because of the typically short duration of the FZRA events.To some extent, this uncertainty could be studied by taking into account the preceding and following time steps in the observational records, as the original SYNOP data are 3 h.Additionally, preliminary analyses (not shown) indicate that 1-or 5-day averaging could also enhance the correlation of the results with similarly averaged SYNOP observations.
New predictor data sets need to be tested when available, for example the ERA-5 of the ECMWF, which is designed to be the successor of ERA-Interim.The most important criteria when selecting new predictor data would be the accuracy of the vertical temperature structure and high temporal and spatial resolutions.Additional observational in situ data sets, such as METAR aviation weather reports and atmospheric soundings, could be used in further development of the FMI algorithm.
As discussed above, neither the observations of FZRA nor methods to predict it are perfect; that is, the ground truth is missing.The methods of then estimating the real base rate of a phenomenon and the verification results of detection are much discussed in social and medical sciences (see, e.g.Lewis and Torgerson, 2012), but are little-known in atmospheric sciences (see Hyvärinen et al., 2015 for the first steps).Ideally, these methods require more than two independent sources of data, for example, different observations and method results.For FZRA this requirement can be difficult to fulfil, as there are not many different sources of observations available and methods are usually developed using all available observations.However, exploring these methods would contribute to the better estimation of the occurrence of FZRA.

Conclusions
A method for detecting FZRA in gridded meteorological data sets is presented, followed by a climatological Europe-wide mapping for the occurrence of FZRA.The objective of this paper was to develop an algorithm that is simple enough to be applicable on spatially, vertically, and temporally coarse public gridded climate data sets such as output from climate models, and on the other hand is physically sensible enough to model the complicated conditions leading to FZRA.The low validation results at station locations indicated that uncertainties related to the observations, to the identification method, and to the temporal and spatial resolution of the reanalysis, deteriorate the algorithm-based identification of FZRA events.However, it is not clear which uncertainties are the most important, and it is likely that their relative importance can vary in space and even in time.
The freezing rain detection algorithm selected for this study was originally developed in numerical weather prediction.The physically motivated internal thresholds of the algorithm were calibrated using the ERA-Interim reanalysis and SYNOP weather station observations.Values of the thresholds did not change considerably in the calibration process, and the simple calibration did not reveal strong biases in the reanalysis, showing that the original thresholds are already adequate for climatological analysis of freezing rain in ERA-Interim.
According to the algorithm-based analysis of the gridded reanalysis data, freezing rain is more common in central and eastern Europe than in the northern parts and over the coastal regions.The FZRA season begins in September and lasts until May in northern Europe.In central and eastern Europe the season is shorter, beginning in October and lasting until April, but much more intense, leading to more yearly events (typically 1-2 events yr −1 ) than in northern countries (typically 0.5-1.5 events yr −1 ).In 1979-2014, the longest-lasting FZRA events lasted over 30 h and the most widespread events covered over 600 000 km 2 in Europe.
Spatially and temporally coherent information about occurrence of FZRA in Europe has been lacking thus far.The gridded output of this study is a preliminary approach to answer this demand, and as such, the current work can be used as a basis for risk analyses if the underlying uncertainties are carefully kept in mind.For example, questions such as what year contained the most freezing rain events in continental regions of Europe can be answered reliably using the current data and method.Analysing spatially aggregated FZRA re-

Figure 1 .
Figure 1.Total number of FZRA cases 1979-2014 in Europe according to 6 h SYNOP observations (orange) and the FMI CLIM algorithm (cyan), applied to the 6 h ERA-Interim reanalysis.The size of the markings indicates the frequency classes.The distance to the nearest coastline (in km) is shown with black isolines, which divide the stations into coastal, semi-coastal, and continental groups.Stations that were excluded from the calibration, validation, and further analyses of the FMI CLIM algorithm are indicated with circles.

Figure 2 .
Figure 2. A pseudo-code representation of the FMI algorithm.See text for definitions of symbols and for a description of the logic.

Figure 3 .
Figure 3. Annual, spatially averaged mean number of FZRA cases per station in all 293 stations and in groups based on distance to the nearest coastline according to SYNOP observations (black), the FMI NWP (red), and the FMI CLIM (blue) algorithm.Definitions of groups can be seen in Fig. 1.Statistics calculated from the numbers presented here are shown in Table3.Note the different y axis scales.

Figure 4 .
Figure 4. Logarithm of bias B of the FMI CLIM results in different 12-year periods.Blue (red) shows the overestimation (underestimation) of predicted total number FZRA events compared to the observations.Sizes of markings indicate the magnitude of bias.
Figure 6.(a) The distribution of observed SYNOP present weather codes when an event was classified as FZRA by the FMI CLIM algorithm.The distribution sums up to 100 %.(b) The proportion of cases classified as FZRA in each observed SYNOP present weather code.Each code can have a proportion from 0 to 100 %.

Figure 8 .
Figure 8. Probability of duration (left column) and spatial extent (right column) of FZRA events at station locations (top row) and in all grid cells (bottom row) according to the detection algorithms (blue, red) and observations (black).

Figure 9 .
Figure 9. (a) Mean annual number of FZRA events and (b) maximum duration of events in the 1979-2014 study period.FMI CLIM algorithm is applied to the ERA-Interim reanalysis data.The highest elevations were excluded (grey) because of larger uncertainties in FZRA detection.

Table 1 .
Uncalibrated (upper row) and calibrated (bottom), optimal values of threshold parameters in the 29-year calibration periods.Mean values of the optimal values are shown, computed for calibration periods using the sample variance of period values.Mean values are used in the final analysis of the gridded data set.h thr cold = minimum cold layer depth; RH thr melt and T thr melt = minimum humidity and minimum temperature in the melting layer; T thr cold = maximum cold layer temperature; and Pr thr = minimum surface precipitation rate.

Table 2 .
The cross-validation measures and scores in 7-year validation periods when the predicted 6 h FZRA result is compared with observed 6 h events.Mean values and standard errors, computed for validation periods using the sample variance of period values, are shown.See text and Appendix A for definitions of measures and scores.

Table 3 .
Statistics calculated from numbers presented in Fig.3.Correlation coefficient of algorithm results compared to the observations (r), mean value (x), standard deviation (s), and rms error of annual mean numbers of FZRA cases per station averaged over all stations, averaged over groups based on distance to the nearest coastline, and in individual stations are shown.
, might not be recorded in the 6 h www.nat-hazards-earth-syst-sci.net/17/243/2017/Nat.Hazards Earth Syst.Sci., 17, 243-259, 2017 sults from climate model output with the presented methodology should be feasible as well.Station level analyses, however, require further studies to be carried out.Members of the ECMWF can access the MARS archive for the SYNOP weather station data used in this study.ERA-Interim reanalysis data can be obtained from the public server of the ECMWF.Processed data files and Python code are available on request from the corresponding author.