Interactive comment on “Three variables are better than one: detection of European winter windstorms causing important damages”

logical variables, independent of exposure, identifies the most damaging storms. My conclusions from the study are the obvious finding that, when you consider overlapping storms in multiple variables, you reduce the number of storms detected. The ultimate success of the study would be a positive answer to the question “Would I adopt this methodology to identify the most damaging storms in my dataset?”. Based on what is presented here, the answer would be “no”.

It is of course the near-surface wind that ultimately causes the damage, which is why studies on damage potential mostly rely on wind-based loss or severity indices. It is not clear, however, that local maximum wind gusts are well represented in reanalysis datasets, let alone in general circulation models. The surface winds depend on boundary-layer parameterizations, and the low spatial resolution of some datasets may not capture the smaller-scale dynamical (fronts) or topographical features that lead to extreme winds. Variables such as SLP or vorticity could thus also have good and independent predictive value. Instead of the maximum wind, we also tried a classical storm footprint measure obtained by integrating the cube of the surface wind speed over space and time. The results were similar, in particular for the percentile of the reference storms, and vorticity and SLP still gave a lot of added value. Exposure is not used directly in the method (we are looking for potential damage in terms of storm intensity only), but the geographical window chosen focuses on areas with high exposure (see point B).
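As an illustration only, the footprint measure mentioned above (integration of the cubed surface wind speed over space and time) could be computed along the following lines. The function name `storm_severity`, the array layout (time, lat, lon) and the optional `cell_area` weights are assumptions made for this sketch; they are not taken from the paper.

```python
import numpy as np

def storm_severity(wind, cell_area=None):
    """Sum the cube of the surface wind speed over space and time.

    wind      : array of shape (time, lat, lon), 10 m wind speed in m/s
                (layout assumed for this sketch).
    cell_area : optional (lat, lon) array of spatial weights, e.g. grid-cell
                areas or a land mask (hypothetical parameter).
    """
    wind = np.asarray(wind, dtype=float)
    if cell_area is None:
        # Without weights, every grid cell counts equally.
        cell_area = np.ones(wind.shape[1:])
    cubed = wind ** 3
    # Weighted sum over the two spatial axes, then sum over the time steps.
    per_step = (cubed * cell_area).sum(axis=(1, 2))
    return float(per_step.sum())
```

A land mask passed as `cell_area` would restrict the measure to land grid-points, in the same spirit as the land-based wind variable used in the method.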

R1.GC: The reason for the choice of the specific parameters is also not clear. There are other parameters which could be as relevant (vertical stability and gustiness).
The three variables are chosen from the existing literature on detection and tracking of ETCs and on storm severity measures. The variables are also easily derived from the initially available data (including GCM archives such as CMIP5) and do not require large amounts of data storage. Upper-level variables such as the intensity and location of the jet stream at 200 hPa have been tested but did not give satisfactory results as detection variables. Variables such as vertical stability or moisture content are probably useful predictors of cyclone development, but perhaps less so of the current intensity at a given time. The gustiness would of course be a good damage predictor, but it is not a usual output from models, and its computation from coarse-resolution data might not give better results than the 10-m wind speed. It is of course possible that different variables could improve the method; the aim of the paper is not to claim that this particular choice is optimal, but to point out the value of using multiple variables.
To better explain the choice of variables in the paper, we reformulate the first paragraph of the subsection P5L25 on the variables into: "In order to characterize extra-tropical cyclones with high damage-potential over Europe, several variables at different levels of the troposphere have been analysed. We favour variables that are standard outputs from models or that require as little computation from the initial data as possible. Among the variables that have been considered, we choose three (near-) surface variables: the relative vorticity at 850 hPa, the mean sea-level pressure and the 10 m wind speed. These variables are commonly used either to detect and track ETCs (Ulbrich et al., 2009; Neu et al., 2012) or to assess potential impacts of ETCs (Leckebusch et al., 2008; Pinto et al., 2012). We briefly illustrate in Fig. 1 the spatial patterns of these three variables in the case of the major storm Lothar (December 1999)."

Main point B: Choice of spatial windows
Several comments from the reviewers concern the choice of the geographical window for the detection, and how it affects the storms detected. To test for this, we applied our method to a much larger window encompassing most of Scandinavia; and we compared the storms detected with both windows with the XWS database, as asked by Rev. 1. Results are described below. A discussion subsection on the choice of the window and its consequences, summarizing the main points presented below, will be added in the paper.

B1: Offset of pressure and wind centres
R1.GC: Thus, the question of the size of the area (which obviously must be discussed). The area must include both the grid point with maximum normalized windspeed over land AND the pressure minimum, plus the vorticity maximum (there could be more than one relative vorticity maximum, and even more than one pressure minimum associated with the cyclone, with only part of them located within the frame considered). Considering an area which does not, for example, include the location of the pressure minima and wind maxima simultaneously will inevitably lead to failure of the approach.
R1.C2 b): Also, there is an offset between the low pressure cores and the area of maximum wind speeds. This would imply a shift of windows used for the different parameters in order to catch events properly. With the current configuration, some major relevant events may be missed. I guess this would become apparent when going beyond the 10 events listed. Is this the case? How well are relevant events apart from the ensemble of 10 ETCs caught? For example, use the Extreme Windstorms Catalogue XWS (in total 50 events).
There is indeed an offset between the low-pressure systems and the associated maximum wind speed (or vorticity), with the low-pressure centre typically located farther north. The initial window was chosen to minimise that effect, i.e. it should extend far enough northward to capture the low-pressure centres associated with extreme wind over the areas of Western Europe with high exposure (to simplify, Great Britain, France and Germany). To check whether this was enough, we compare the final detected events from ERA Interim and NCEP2 to the XWS dataset as suggested. The extreme windstorms (XWS) catalogue gathers 50 events from 1981 to 2012 over Europe. There are 16 insurance storms, which are associated with a loss, and 34 non-insurance storms, which are not. The events are selected on a wind-based index computed with ERA Interim at 0.25°. The 44 events with the highest value of the index are automatically selected, and 6 other known events (including one insurance event) are manually added because they fall outside the first 50 events according to the index.
Here we consider the XWS events over the period 1987-2010 that is used in our paper. Over this period there are 38 storms, of which 14 are insurance storms: the ten events from the Munich Re (MR) ranking (our reference storms) and four others, Herta (1990), Wiebke (1990), Gero (2005) and Emma (2008). These 14 insurance storms are the events that we consider relevant for our methodology, since they actually generated known economic damage. Table 1 shows the results presented in our paper, separating insurance and non-insurance storms as done within the XWS database. With our method applied to ERA Interim, Herta (1990), Gero (2005) and Emma (2008) are not detected. Gero and Emma are examples of events that are not detected because of the offset between a low-pressure core and the associated wind speeds (note that Emma had to be manually added in XWS). For Herta, the value of the mean sea level pressure anomaly is lower than the chosen threshold. Using NCEP2, Gero and Emma are again not detected because of the offset, while Herta (1990), Wiebke (1990) and Lothar (1999) reach values of mean sea level pressure or relative vorticity lower than the thresholds.
It is difficult to assess the relevance of the non-insurance storms. The ratio of insurance storms to non-insurance storms would perhaps be a better criterion for judging the quality of the final set of events in terms of selectivity. With the configuration presented in the paper, better results are obtained for ERA Interim than for NCEP2, but this seems to be more an issue of intensity than of localisation.
The offset of low-pressure cores thus seems to have a minor but real impact: it prevents the detection of 2 XWS insurance storms but none of the 10 reference ones. It should probably be taken into account, and this will be added in the paper, as well as the comparison with XWS events (P11 and P15).

R1.C2 a): Some ETCs like Lothar are essentially small scale disturbances related to a larger System (Klaus [Martin?], in this case). It seems to be not straightforward to find a strong low pressure core in the narrow spatial window used. Lothar considered at low resolution may be an example for this problem.
Lothar had a low-pressure core inside the spatial window we are considering. As described above, Lothar is missed in NCEP because of its low relative vorticity maximum, not because of the anomaly of mean sea level pressure.

B2: Size of the window
R2.C1: The geographical extent has masked out large areas where potentially damaging storms occur but with little exposure: Sweden, Norway, northern Scotland and Ireland. I suspect if these regions were added, many of the most extreme storms detected would not be the most damaging economically and the reference events would be demoted in their ranking.
First, we would like to point out that the most damaging storms are already not the most extreme storms in terms of meteorological signature in the window considered in the paper. Damage occurs because of the coincidence of extreme winds and exposure. While we do not use exposure directly in the paper, we restricted the window to the general geographical area with the highest insurance exposure, in order to compare our detected storms with the insurers' databases.
To check how the method holds for a much larger window, we apply the methodology to a wider window that includes Spain, Portugal, Ireland, Sweden and Norway (Figure 1). This is also a way to answer the problem of the low-pressure centre offset (point B1 above).
The strong activity occurring in the north-western part of the larger window leads to the detection of maxima of both relative vorticity and mean sea level pressure anomaly that reach higher values than within the smaller window. The 10-m wind speed ratio is less affected by widening the geographical window, since its computation is based on land grid-points. The higher maxima lead to higher values of the detection and selection thresholds defined as the 95th and 98th percentiles of the maxima distributions, and some reference storms are then missed. However, if we follow the approach described in the paper, the detection and selection thresholds used for the methodology are derived from the maxima distributions and the minimum values reached by the reference storms. Using maxima over the larger window, the detection and selection thresholds must be lowered to the 90th and 95th percentiles of the maxima distributions (which are close to the 95th and 98th percentiles for the small window). Results are shown in Table 2. Compared with Table 1, more events are detected, which is logical with a wider window. The ratio of insurance storms is lower (although their number increases), because the initial window focused on the region with high exposure. But this does not mean that the new events are less intense.
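A minimal sketch of how such percentile thresholds, and the percentile reached by the weakest reference storm, might be computed from a distribution of maxima is given below. The function names, the default percentiles and the use of NumPy's default linear interpolation are choices made for this illustration, not a description of the code used for the paper.

```python
import numpy as np

def thresholds(maxima, detect_pct=95.0, select_pct=98.0):
    """Detection and selection thresholds as percentiles of the maxima."""
    maxima = np.asarray(maxima, dtype=float)
    detect, select = np.percentile(maxima, [detect_pct, select_pct])
    return float(detect), float(select)

def lowest_reference_percentile(maxima, reference_values):
    """Percentile rank (in %) of the weakest reference storm.

    reference_values holds the value reached by each reference storm
    (hypothetical input for this sketch).
    """
    ordered = np.sort(np.asarray(maxima, dtype=float))
    weakest = min(reference_values)
    # Number of maxima strictly below the weakest reference value.
    below = np.searchsorted(ordered, weakest, side="left")
    return 100.0 * below / ordered.size
```

If the weakest reference storm falls below the detection percentile for a given window, the percentile has to be lowered for that window, which is the adjustment described above for the larger domain.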
Detailed results: With ERA Interim we get 331 events with the relative vorticity, 250 with the anomaly of mean sea level pressure and 220 events with the 10-m wind speed. The final set gathers 62 events, including the 24 events detected within the small window. We now detect Gero (2005), which was missed before because of the offset between the low-pressure core and the associated wind speed. We still miss Herta (1990) and Emma (2008) because of the value of the thresholds. However, it should be noted that if the selection process of the XWS database had actually stopped at the 50 most extreme events according to the index, Emma would not have been selected.
With NCEP2 we get 312 events with the relative vorticity, 260 events with the anomaly of mean sea level pressure and 315 with the 10-m wind speed. In the end, we have 75 events, including 26 of the 33 events detected within the small window. We now detect Gero and Emma, which were not detected before because of the offset. However, Herta, Wiebke and Lothar are still missed because of the thresholds.
To conclude, the answer to the comment is that with a larger window that includes more of the storm track but with less-exposed areas, the most damaging events are not the most extreme ones (in terms of meteorology) and have a lower ranking. However they are still detected once the thresholds are adapted to the new window.
Widening the spatial window reduces the risk of offset described in B1, although the ratio between the number of insurance storms and non-insurance storms is not as good as the one obtained within the small window. A great number of the non-insurance events detected with our methodology are localised over Scandinavia where, for now, there is little exposure. These conclusions will be added in the discussion section of the paper.

R1.C1: It is not clear why the Mediterranean is excluded.
We have not looked for windstorms associated with ETCs in the Mediterranean region. What we mean by "The Mediterranean region is excluded because of the high regional cyclonic activity occurring there" is that the few grid points that are over the Genoa Gulf are not considered when detecting the maximum of each variable over the window.
In order to clarify this point, the sentence will be rephrased in the paper P5L12-15: "The geographical window used for the detection of events is restricted to Western Europe where most of the exposure to the peril is localised. The grid-points over the Genoa Gulf are masked in order to avoid the detection of events occurring in this part of the spatial window and that are not in the scope of this study (Fig. 1)."

R1.C3: Describing the "final" methodology before providing reasoning for the chosen procedure makes this section difficult to appreciate. I thus suggest a re-ordering. Also, it should be clear throughout the chapter that the individual detection methods will be considered and compared, not just the combination.
If this improves the understanding of the approach, a reordering is possible, explaining for each step its objective and added value.

R1.C4: On page 4266, a gathering of timesteps into one event is described which does not just look for a maximum of consecutive time steps but involves a "simple tracking". Why?
The "simple tracking" formulation is probably not adequate. In fact, we only apply a condition of eastward movement between consecutive timesteps, in order to separate events occurring in close succession, such as Lothar and Martin (1999) or Vivian and Wiebke (1990). In both cases, a vorticity maximum can enter the domain from the west as the previous one is leaving in the east or dying. This condition is otherwise easily satisfied by storms, and no major event is missed because of it.
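The eastward-movement condition, together with the 900 km distance limit quoted in the rephrased sentence for P10L2-5, can be sketched as follows. This is only an illustration: the tuple layout `(step, lat, lon)`, the integer time-step index, the acceptance of a zero longitude shift and the haversine distance on a spherical Earth are all assumptions of this sketch, and longitudes are assumed not to cross the date line.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def group_maxima(maxima, max_dist_km=900.0):
    """Gather maxima above the detection threshold into events.

    maxima: list of (step, lat, lon) tuples sorted by time step, where
    step is an integer index of the analysis time (assumed layout).
    A maximum joins the current event only if it is at the next time
    step, lies less than max_dist_km away and has not moved westward.
    """
    events = []
    for step, lat, lon in maxima:
        if events:
            prev_step, prev_lat, prev_lon = events[-1][-1]
            consecutive = step - prev_step == 1
            eastward = lon >= prev_lon
            close = great_circle_km(prev_lat, prev_lon, lat, lon) < max_dist_km
            if consecutive and eastward and close:
                events[-1].append((step, lat, lon))
                continue
        # Otherwise the maximum starts a new event.
        events.append([(step, lat, lon)])
    return events
```

With this rule, a new maximum entering from the west while the previous one is leaving to the east starts a separate event, which is the behaviour needed to split storms arriving in quick succession.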
To explain this, the sentence P10L2-5 will be rephrased into: "Second, in order to be gathered into possible events, consecutive maxima above the 95th percentile must fulfil two conditions: the distance between two consecutive maxima should be lower than 900 km, with an eastward shift. Indeed, ETCs being driven by the westerly jet stream, they follow an eastward trajectory and their travelling speed rarely exceeds 150 km per hour. These conditions make it possible to separate events such as Vivian and Wiebke (1990) or Lothar and Martin (1999) that occurred in quick succession."

R1.C5: When combining the conditions assigned to the parameters, must they be fulfilled simultaneously? Could they as well be fulfilled at different time steps (e.g., 6, 12 or even 24 hours apart)?
Yes, the conditions must be fulfilled simultaneously for the three variables at at least one time step. Hence, if the first time step of a pressure event is detected 6 hours after the last time step of a vorticity event, the two will not be associated.

R1.C6: Section 4.3 describing results shown in Fig. 7 does not point at the result that the ordering of events according to damage seems to be met by the ordering in every single parameter considered when using ERA Interim. This result is not reproduced in the reduced resolution version of ERA Interim and NCEP. You should mention the reasons for this disagreement: Are, for example, the differences between the parameters small so that small differences due to smoothing produce large differences in the ordering?
Figure 7 is probably misleading since the labels of the X-axis are not the same for the three subplots. We chose to take ERA Interim as the reference and ranked the events according to their ranking in ERA Interim. This is why the reference events seem to be ranked according to both losses and ERA Interim, whereas this is not the case.
In order to make clearer the fact that no variable reproduces the ranking according to the loss, we plot the same figure, but this time the ten storms are ranked according to their loss, Lothar being the costliest. The storm rankings are represented by blue squares for ERA Interim at 0.75°, turquoise diamonds for ERA Interim at 2.5° and black triangles for NCEP2. This figure shows that ranking is not a robust parameter to use when dealing with different models at different resolutions.

R1.C7: The agreement of the method's success between different re-analysis datasets does not automatically warrant a successful application on GCM output, as GCMs may produce a different agreement of the parameters as re-analysis.
Yes, but GCMs are probably better able to reproduce the statistics of large-scale features, such as low-pressure centres and their evolution, than the details of extreme surface winds. So the kind of method we propose may be better suited to GCM output than to a refined regional model.

R2.C2: The study shows that Lothar has little signature in NCEP in RV850 and MSLP, and therefore was rejected. However the conclusion overlooks this fact and reads as if all major events were detected.
The underestimation of Lothar was shown in another paper using wind-based indices (Pinto et al., 2012, see paper for the complete reference). Lothar was a small-scale system that may not be reproduced in a model at low resolution. Missing Lothar in a dataset such as NCEP2 is evidence of the limitation, with regard to spatial resolution, of any study on the damage potential of windstorms associated with ETCs, rather than evidence of the inability of our methodology to detect this event. This point will be added to the conclusion.

R2.GC: My conclusions from the study are the obvious finding that, when you consider overlapping storms in multiple variables, you reduce the number of storms detected.
Taking the intersection of different catalogues of events will indeed reduce their number. What was not so obvious is that, in the process, the most damaging events (reference storms) are retained. Taking higher thresholds in a single variable such as the surface wind also reduces the number of events, but a number of the reference events are then lost.
The main conclusion of the paper is therefore not that one gets a reduced number of events, but that using a multi-variable approach leads to more selective results with our datasets than using wind-based indices. We thus hope it will provide a new perspective on detection studies.
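For illustration, the intersection of the three single-variable catalogues, with the simultaneity rule given in the answer to R1.C5 (at least one common time step for the three variables), could be sketched as below. The dictionary layout mapping an event label to a set of time-step indices, and the function name, are hypothetical and only serve this example.

```python
def combine_events(vorticity, pressure, wind):
    """Return combinations of events sharing at least one time step.

    Each argument maps an event label to the set of time-step indices
    at which that event exceeds its threshold (assumed layout).
    """
    combined = []
    for v_name, v_steps in vorticity.items():
        for p_name, p_steps in pressure.items():
            for w_name, w_steps in wind.items():
                # Time steps at which all three conditions hold together.
                common = v_steps & p_steps & w_steps
                if common:
                    combined.append((v_name, p_name, w_name, sorted(common)))
    return combined
```

In this sketch, a pressure event that starts one time step after a vorticity event has ended shares no time step with it, so the two are not associated.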