Improvement of RAMS precipitation forecast at the short-range through lightning data assimilation

This study shows the application of a total lightning data assimilation technique to the RAMS (Regional Atmospheric Modeling System) forecast. The method, which can be used at high horizontal resolution, helps to initiate convection whenever flashes are observed by adding water vapour to the model grid column. The water vapour is added as a function of the flash rate, local temperature, and graupel mixing ratio. The methodology is set up to improve the short-term (3 h) precipitation forecast and can be used in real-time forecasting applications. However, results are also presented for the daily precipitation for comparison with other studies. The methodology is applied to 20 cases that occurred in fall 2012, which were characterized by widespread convection and lightning activity. For these cases a detailed dataset of hourly precipitation from thousands of rain gauges over Italy, which is the target area of this study, is available through the HyMeX (HYdrological cycle in the Mediterranean Experiment) initiative. This dataset gives the unique opportunity to verify the precipitation forecast at the short range (3 h) and over a wide area (Italy). Results for the 27 October case study show how the methodology works and its positive impact on the 3 h precipitation forecast. In particular, the model represents convection over the sea better when lightning data are assimilated and, when convection is advected over the land, the precipitation forecast improves over the land. It is also shown that the precise location of convection by lightning data assimilation improves the precipitation forecast at fine scales (meso-β). The application of the methodology to 20 cases gives a statistically robust evaluation of the impact of the total lightning data assimilation on the model performance. Results show an improvement of all statistical scores, with the exception of the bias. The probability of detection (POD) increases by 3–5 % for the 3 h forecast and by more than 5 % for daily precipitation, depending on the precipitation threshold considered. Score differences between simulations with or without data assimilation are significant at the 95 % level for most scores and thresholds considered, showing the positive and statistically robust impact of the lightning data assimilation on the precipitation forecast.


Introduction
The inclusion of the effects of deep convection in the initial conditions of numerical weather prediction (NWP) models is one of the most important applications for reducing the spin-up time and improving initial conditions (Stensrud and Fritsch, 1994; Alexander et al., 1999). In recent years, several studies have shown the positive impact that lightning assimilation has on the weather forecast and especially on the precipitation forecast (Alexander et al., 1999; Chang et al., 2001; Papadopulos et al., 2005; Mansell et al., 2007; Fierro et al., 2013; Giannaros et al., 2016).
Lightning data are a proxy for identifying the occurrence of deep convection, which relates to convective precipitation (Goodman et al., 1988). In addition to their ability to locate deep convection and heavy precipitation precisely, lightning data have several advantages: availability in real time with few gaps (reliability), compactness (little bandwidth is required to transfer the data), and long-range detection of storms over the oceans and beyond radar coverage (Mansell et al., 2007).
Because of these properties, several techniques have been developed in recent years to assimilate lightning data in NWP. In the first studies (Alexander et al., 1999; Chang et al., 2001), lightning was used in conjunction with rainfall estimates from microwave data of polar orbiting satellites to derive a relation between cloud-to-ground flashes and rainfall. Then the rainfall estimated from lightning was converted to latent heat nudging, which was assimilated in NWP (Jones and Macperson, 1997). These experiments showed a positive impact of the lightning data assimilation on the 12-24 h weather forecast. Papadopulos et al. (2005) nudged relative humidity profiles associated with deep convection, and the adjustment was proportional to the flash rate observed by the ZEUS network (Lagouvardos et al., 2009). A modification of the Kain-Fritsch (Kain and Fritsch, 1993) convective parameterization in COAMPS (Coupled Ocean-Atmosphere Mesoscale Prediction System; Hodur, 1997) was introduced by Mansell et al. (2007). They enabled lightning to control the activation of the cumulus parameterization scheme. Recently, Giannaros et al. (2016) implemented a similar approach in the Weather Research and Forecasting (WRF) model, showing the positive and statistically robust impact of the lightning data assimilation on the 24 h rainfall forecast for eight convective events over Greece. Fierro et al. (2013) and Qie et al. (2014) introduced two lightning data assimilation schemes for the WRF model that act on the mixing ratios of the hydrometeors (water vapour in the case of Fierro et al., 2013, and ice crystals, graupel and snow in Qie et al., 2014). Both studies, which are performed at cloud-resolving scales, show that lightning assimilation improves the precipitation forecast.
Most of the studies cited above are based on a case study approach. However, Giannaros et al. (2016) applied the methodology to eight convective cases that occurred in Greece from 2010 to 2013. Considering a larger number of cases allowed them to statistically test the improvement of the precipitation forecast through lightning data assimilation. Moreover, their methodology is designed to be realistic and usable in the operational forecast.
In a recent study, Federico et al. (2014) introduced a scheme to simulate lightning in RAMS (Regional Atmospheric Modeling System). Because the lightning distribution is well correlated to areas of deep convection, they concluded that lightning simulation can be a useful tool to evaluate the reliability of the NWP forecast in real time. In their study, however, lightning observations were used as a diagnostic tool.
In this paper, a total lightning data assimilation algorithm is used in the RAMS model. The assimilation scheme is similar to that of Fierro et al. (2013), with a few modifications to account for the different spatial and temporal resolutions of the two studies and for the different model suites. In addition, the methodology presented in this paper is designed to be used in real-time NWP. This paper considers the short-term forecast (3 h), even though the results for daily precipitation, accumulated from the 3 h precipitation forecast, are also shown for completeness and for comparison with other studies.
To evaluate statistically the impact of the lightning data assimilation on the precipitation forecast, we consider 20 convective cases that occurred in fall 2012 over Italy, which is the target area of this study. Most of these events occurred during the HyMeX SOP1 (HYdrological cycle in the Mediterranean Experiment, first special observing period), which was held from 5 September 2012 to 6 November 2012.
HyMeX (Drobinski et al., 2014; Ducroq et al., 2014) is an international experimental program that aims to advance scientific knowledge of water cycle variability in the Mediterranean basin. This goal is pursued through monitoring, analysis, and modelling of the regional hydrological cycle in a seamless approach. In HyMeX special emphasis is given to the occurrence of heavy precipitation and floods and to their societal impacts, which were the subjects of the SOP1. One of the products of the HyMeX SOP1 is a database of hourly precipitation available for 2944 rain gauges over Italy belonging to the Italian DPC (Department of Civil Protection; Davolio et al., 2015). This database extends beyond the period of the HyMeX SOP1 and contains all the events considered in this paper.
The paper is organized as follows: Sect. 2 shows the RAMS configuration, the methodology used to assimilate total lightning data, and the strategy used in the simulations. Section 3 gives the results: first a case study of deep convection that occurred over Italy during HyMeX SOP1 is considered to show how the lightning data assimilation works (Sect. 3.1); then the scores for the 20 cases are shown in Sect. 3.2, which also shows the statistical robustness of the difference between the precipitation forecasts of the simulations with or without total lightning data assimilation. The discussion and conclusions are given in Sect. 4.

The RAMS model configuration
The RAMS model is used in this study. This section is a brief description of the model set-up, while details on the model are given in Cotton et al. (2003).
We use two one-way nested domains at 10 km (R10) and 4 km (R4) horizontal resolutions, respectively (Fig. 1). The model is configured with 36 terrain-following vertical levels for both domains. The model top is at 22 400 m (about 40 hPa). The distance between the levels is gradually increased from 50 to 1200 m. Below 1000 m the spacing between levels is less than 200 m; above 5000 m the distance between levels is 1200 m.
The Land Ecosystem-Atmosphere Feedback model (LEAF) is used to calculate the exchange between soil, vegetation, and atmosphere (Walko et al., 2000). LEAF uses a patch representation of surface features (vegetation, soil, lakes and oceans, and snow cover) and includes several terms describing their interactions as well as their exchanges with the atmosphere.
Explicitly resolved precipitation is computed by the WRF single-moment six-class microphysics scheme (WSM6; Hong and Lim, 2006). This scheme was recently implemented in RAMS (Federico, 2016) and showed the best performance among the microphysics schemes available in the model for a forecast period spanning 50 days of the HyMeX SOP1 at 4 km horizontal resolution. The WSM6 scheme accounts for the following water variables: vapour, cloud water, cloud ice, rain, snow, and graupel. The best configuration of Federico (2016) is used in this paper and is hereafter referred to as control (CNTRL).
The subgrid-scale effect of clouds is parameterized following Molinari and Corsetti (1985). They proposed a form of the Kuo scheme (Kuo, 1974) accounting for updrafts and downdrafts. The convective scheme is applied to the 10 km grid only.
The unresolved transport is parameterized by the K theory following Smagorinsky (1963), which relates the mixing coefficients to the fluid strain rate and includes corrections for the influence of the Brunt-Vaisala frequency and the Richardson number (Pielke, 2002).
The Chen and Cotton (1983) scheme is used to compute short- and long-wave radiation. The scheme accounts for condensate in the atmosphere but not for the specific hydrometeor type.
The initial and dynamic boundary conditions (BC) are introduced in Sect. 2.3.
Before concluding this section, it is important to note that the 4 km horizontal resolution of the finer grid corresponds to the grey area for convection and is slightly coarser than current standards (2-3 km). This resolution was motivated by operational purposes: the methodology of this paper is implemented in a real-time weather forecasting system at ISAC-CNR and we study the performance of this specific system. Preliminary results of the impact of the horizontal resolution on the lightning assimilation are discussed in Sect. 4.

Lightning data and assimilation procedure
Lightning data used in this paper are those observed by LINET (LIghtning detection NETwork; Betz et al., 2009), which is a European lightning location network for high-precision detection of total lightning, i.e. cloud-to-ground and intra-cloud lightning, using VLF/LF techniques (in the range between 1 and 200 kHz).
The network has more than 550 sensors in several countries worldwide, with very good coverage over central Europe and the central and western Mediterranean (from 10° W to 35° E in longitude and from 30 to 65° N in latitude). The three-dimensional location of lightning is detected using the time-of-arrival difference triangulation technique (Betz et al., 2009). The lightning strokes are detected with high precision (150 m for an average distance between sensors of 200 km) in both horizontal and vertical directions. The LINET "strokes" are grouped into "flashes" before assimilation in the model. In particular, all events recorded by LINET that occur within 1 s and in an area with a radius of 10 km are binned into a single flash (Federico et al., 2014).
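The stroke-to-flash grouping can be illustrated with a minimal Python sketch. The text only gives the 1 s and 10 km criteria; the choice made here, that each stroke is compared with the first stroke of an existing flash, as well as the function names and the input layout, are assumptions for illustration and not necessarily the procedure of Federico et al. (2014).

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius (km)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def group_strokes(strokes, max_dt_s=1.0, max_dist_km=10.0):
    """Group strokes, given as (time_s, lat, lon) tuples, into flashes.

    Assumption: a stroke joins a flash when it lies within max_dt_s and
    max_dist_km of the FIRST stroke of that flash; otherwise it opens a new flash.
    """
    flashes = []
    for stroke in sorted(strokes):  # chronological order
        t, lat, lon = stroke
        for flash in flashes:
            t0, lat0, lon0 = flash[0]
            if t - t0 <= max_dt_s and haversine_km(lat0, lon0, lat, lon) <= max_dist_km:
                flash.append(stroke)
                break
        else:
            flashes.append([stroke])
    return flashes
```

For example, two strokes 0.4 s and about 5.6 km apart end up in the same flash, while a stroke 2 s later at the same location opens a new one.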
Observed flashes are mapped onto the RAMS grid for assimilation in space and time. In particular, the assimilation procedure computes the number of flashes occurring in each RAMS grid cell in the past 5 min (X). Then the water vapour mixing ratio is computed as

$$ q_v = A q_s + B q_s \tanh(C X)\left[1 - \tanh\left(D Q_g^{\alpha}\right)\right], \qquad (1) $$

where A = 0.86, B = 0.15, C = 0.30, D = 0.25, α = 2.2, q_s is the saturation mixing ratio at the model atmospheric temperature, and Q_g is the graupel mixing ratio (g kg⁻¹). The water vapour mixing ratio derived from Eq. (1) is similar to Fierro et al. (2013).
The water vapour derived from Eq. (1) replaces the simulated value at a grid point where electric activity is observed and the relative humidity is below 86 %. By this choice we only add water vapour to the simulated field, leaving it unchanged if the simulated water vapour is larger than that of Eq. (1). Moreover, the water vapour is replaced only in the charging zone (from 0 to −25 °C), which is the mixed-phase graupel-rich zone associated with electrification and lightning activity (MacGorman and Rust, 1998). The increase of q_v given by Eq. (1) becomes smaller as the simulated graupel mixing ratio grows. When Q_g is 3 g kg⁻¹ the second term of the right-hand side of Eq. (1) is ineffective (see Fig. 7 of Fierro et al., 2013, for the dependency of Eq. 1 on the graupel mixing ratio). For a given value of Q_g between 0 and 3 g kg⁻¹, the water vapour of Eq. (1) increases as a function of the gridded flash rate X.
Note that we change the water vapour in the charging zone between 0 and −25 °C, without a relaxing zone. The water vapour, however, is redistributed by the model advection, diffusion, and diabatic processes and is considerably changed outside the charging zone (see the discussion of this paper; Federico et al., 2016).
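The substitution rule can be sketched for a single grid point as follows. This is only an illustration under stated assumptions: Eq. (1) is taken to have the functional form of Fierro et al. (2013), q_v = A q_s + B q_s tanh(CX)[1 − tanh(D Q_g^α)], with the coefficients listed in the text; relative humidity is approximated as q_v/q_s; and the function names are hypothetical, not part of RAMS.

```python
import math

# Coefficients quoted in the text for Eq. (1)
A, B, C, D, ALPHA = 0.86, 0.15, 0.30, 0.25, 2.2

def lightning_qv(q_s, flashes, q_g):
    """Water vapour mixing ratio from the assumed form of Eq. (1).

    q_s: saturation mixing ratio; the result has the same units as q_s.
    flashes: gridded flash count X in the past 5 min.
    q_g: graupel mixing ratio in g/kg.
    """
    return A * q_s + B * q_s * math.tanh(C * flashes) * (1.0 - math.tanh(D * q_g ** ALPHA))

def assimilate_qv(q_v, q_s, temp_c, flashes, q_g):
    """Return the (possibly increased) water vapour at one grid point."""
    if flashes <= 0:
        return q_v                      # no observed electric activity
    if not (-25.0 <= temp_c <= 0.0):
        return q_v                      # outside the charging zone
    if q_v / q_s >= 0.86:
        return q_v                      # assumption: RH approximated as q_v / q_s
    return max(q_v, lightning_qv(q_s, flashes, q_g))  # water vapour is only added
```

With Q_g = 3 g/kg the bracketed factor is about 0.007, so the flash-dependent term adds almost nothing, in line with the statement that the second term is ineffective at that graupel content.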

Simulation strategy and verification
Twenty case studies that occurred in fall 2012 were selected. The events are reported in Table 1 and were all characterized by widespread convection, lightning activity, and moderate-heavy precipitation over Italy. The events of Table 1 comprise eight of the nine IOPs (intense observing periods) declared in Italy during HyMeX SOP1 (see Table 5 of Ferretti et al., 2014, for the complete list of the IOPs) and a few other cases of November 2012.
A 36 h forecast at 10 km horizontal resolution is performed for each case (R10). The initial and BC for this run are given by the 12:00 UTC assimilation-forecast cycle of the ECMWF (European Centre for Medium-Range Weather Forecasts). Initial and BC are available at 0.25° horizontal resolution. The R10 forecast starts at 12:00 UTC of the day before the day of interest (actual day, Table 1) and the first 12 h, which also account for the spin-up time, are discarded from the evaluation. The R10 forecast is made to give the initial and BC to the 4 km horizontal resolution forecast (R4), avoiding the abrupt change of resolution from the ECMWF initial conditions and BC (0.25°) to the R4 horizontal resolution.
Starting from R10 as initial and BC, three kinds of simulations, all using the R4 configuration, are performed for each event.

(a) Simulation CNTRL is performed by nesting R4 in R10 using a one-way nest and without lightning data assimilation. Each CNTRL simulation starts at 18:00 UTC of the day before the actual day and the first 6 h, which account for the spin-up time, are discarded from the evaluation.

(b) Simulations F3HA6 consist of eight runs of 9 h duration. During the first 6 h, lightning data are assimilated following the procedure described in the previous section. Then, a short-term 3 h forecast is made. Eight F3HA6 simulations are needed to span the forecast of a whole day (Fig. 2). The first simulation starts at 18:00 UTC of the day before the actual day, using the R10 forecast as initial and BC, and gives the forecast for the hours 00:00-03:00 UTC of the actual day. The second F3HA6 simulation starts at 21:00 UTC of the day before the actual day, using as initial conditions the previous R4 forecast, belonging to the F3HA6 set of simulations, and as BC the R10 forecast. Lightning is assimilated from 21:00 UTC of the day before to 03:00 UTC of the actual day, while the forecast is valid for 03:00-06:00 UTC of the actual day. The third to eighth F3HA6 forecasts proceed like the second, each shifted 3 h ahead of the previous one. Please note the switch of the initial conditions between the first and second F3HA6 simulations from R10 to R4. This is done to maximize the impact of lightning data assimilation on the F3HA6 run, since the initial conditions provided by R4 are produced by a simulation using lightning data, while in R10 lightning data are not used.

(c) Simulation ASSIM is performed by nesting R4 in R10 using a one-way nest and assimilating lightning data for the whole run. Each ASSIM simulation starts at 18:00 UTC of the day before the actual day and the first 6 h of forecast are considered as spin-up time and are discarded from the evaluation. The ASSIM simulation continuously assimilates lightning data and, because it represents convection during the events better than CNTRL and F3HA6, has the best performance (Sect. 3.2). The ASSIM configuration can be useful when analysing the events but cannot be used for the forecast because it needs real-time lightning data as the integration time advances.
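The timing of the eight F3HA6 runs described above can be written down as a short schedule generator. The sketch below is only a bookkeeping aid: the function name and the dictionary keys are hypothetical, and the times follow the description in the text (6 h of assimilation followed by a 3 h forecast, with each run starting 3 h after the previous one).

```python
from datetime import datetime, timedelta

def f3ha6_schedule(actual_day):
    """Return the eight F3HA6 runs covering `actual_day` (datetime at 00:00 UTC)."""
    first_start = actual_day - timedelta(hours=6)  # 18:00 UTC of the day before
    runs = []
    for i in range(8):
        start = first_start + timedelta(hours=3 * i)
        assim_end = start + timedelta(hours=6)      # end of lightning assimilation
        runs.append({
            "start": start,
            "assim_end": assim_end,                  # also the start of the 3 h forecast
            "forecast_end": assim_end + timedelta(hours=3),
            # The first run starts from R10; later runs start from the previous F3HA6 run.
            "initial_conditions": "R10" if i == 0 else "previous F3HA6 run",
            "boundary_conditions": "R10",
        })
    return runs

if __name__ == "__main__":
    for run in f3ha6_schedule(datetime(2012, 10, 27)):
        print(run["start"], "->", run["assim_end"], "->", run["forecast_end"])
```

The eight forecast windows tile the actual day from 00:00 to 24:00 UTC, which is what allows the daily precipitation to be obtained by adding the 3 h forecasts.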
Note that the configuration F3HA6 was chosen because it can be applied in the operational context. The simulation R10 takes less than 1 h to complete the 36 h forecast on a 64-core state-of-the-art cluster. Each F3HA6 simulation takes 20-25 min on the same 64-core cluster, which makes the forecast available for operational purposes. The continuous growth of computing power will make it possible to apply the methodology at finer horizontal resolutions for extended areas, such as that considered in this paper, as well as to reach the kilometric scale for limited areas.
Even though the main focus of this paper is on the short-term (3 h) forecast, the daily precipitation accumulated from the 3 h forecasts is also considered for comparison with other studies available in the literature. For F3HA6 the daily precipitation is given by adding the eight 3 h forecasts available for the actual day (Fig. 2).
One of the products of the HyMeX initiative is a database of hourly precipitation from the network of the DPC of Italy, which consists of 2944 rain gauges all over Italy. The dataset is available at http://mistrals.sedoo.fr/?editDatsId=1282&datsId=1282&project_name=MISTR&q=DPC and it is used to derive 3 h and daily rainfall, which is then used to verify the model.
For the verification of the quantitative precipitation forecast (QPF), the model output at the grid point closest to a rain gauge is considered. When two or more rain gauges fall in the same model grid cell, the average precipitation recorded by these rain gauges is considered.
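The pairing of rain gauges with model grid points can be sketched as follows. For simplicity the sketch assumes a regular latitude-longitude grid defined by an origin and a spacing; the actual RAMS grid projection is different, so the indexing below is an assumption made for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

def gauges_to_grid(gauges, lat0, lon0, dlat, dlon):
    """Average gauge rainfall per nearest grid point.

    gauges: iterable of (lat, lon, rain_mm).
    lat0, lon0: coordinates of grid point (0, 0); dlat, dlon: grid spacing in degrees.
    Returns a dict mapping (j, i) grid indices to the mean observed rainfall.
    """
    cells = defaultdict(list)
    for lat, lon, rain in gauges:
        j = round((lat - lat0) / dlat)  # nearest grid row
        i = round((lon - lon0) / dlon)  # nearest grid column
        cells[(j, i)].append(rain)
    # Gauges sharing the same nearest grid point are averaged.
    return {cell: sum(values) / len(values) for cell, values in cells.items()}
```

The model precipitation at each returned index can then be compared with the averaged observation when filling the contingency tables.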
The probability of detection (POD) gives the fraction of the observed rain events that were correctly forecast. The false alarm ratio (FAR) gives the fraction of rain forecast events that did not occur. The bias is the ratio of the number of forecast rain events to the number of observed rain events. The equitable threat score (ETS) measures the fraction of observed and/or forecast rain events that were correctly predicted, adjusted for hits associated with a random forecast, where the forecast occurrence/non-occurrence is independent of observation/non-observation.
To measure the difference between the CNTRL and F3HA6 forecasts, we use a hypothesis test that checks whether the score difference between the two competing models is significant at a predefined significance level (90 %, α = 0.1; or 95 %, α = 0.05). The test, originally proposed by Hamill (1999), is based on resampling and is discussed in Appendix A.
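The four scores follow from a 2×2 contingency table of hits (a), false alarms (b), misses (c), and correct negatives (d). The sketch below uses the standard textbook definitions, which are assumed here to match those of the paper; the function name is illustrative.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """Return POD, FAR, bias, and ETS from a 2x2 contingency table."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    a_random = (a + b) * (a + c) / n  # hits expected from a random forecast
    return {
        "POD": a / (a + c),           # observed events correctly forecast
        "FAR": b / (a + b),           # forecast events that did not occur
        "bias": (a + b) / (a + c),    # forecast events relative to observed events
        "ETS": (a - a_random) / (a + b + c - a_random),
    }
```

For a table with a = 50, b = 25, c = 50, d = 875, this gives POD = 0.50, FAR ≈ 0.33, bias = 0.75, and ETS ≈ 0.36.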
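A resampling test of this kind can be sketched as below. Appendix A is not reproduced here, so this is a generic paired-swap resampling in the spirit of Hamill (1999) and not necessarily the exact procedure of the paper: for each case the contingency tables of the two models are randomly swapped, the score difference of the summed tables is recomputed, and the observed difference is compared with the resulting null distribution. All names are illustrative.

```python
import random

def pod(table):
    """POD from a (hits, false_alarms, misses, correct_negatives) tuple."""
    a, _, c, _ = table
    return a / (a + c)

def resampling_test(tables_1, tables_2, score, n_resamples=2000, seed=0):
    """Two-sided resampling test on the score difference of two models.

    tables_1, tables_2: per-case contingency tables of the two models.
    Returns (observed_difference, p_value).
    """
    rng = random.Random(seed)

    def total(tables):
        # Sum the per-case tables element by element.
        return tuple(sum(column) for column in zip(*tables))

    observed = score(total(tables_1)) - score(total(tables_2))
    extreme = 0
    for _ in range(n_resamples):
        sample_1, sample_2 = [], []
        for t1, t2 in zip(tables_1, tables_2):
            if rng.random() < 0.5:  # randomly swap the model labels for this case
                t1, t2 = t2, t1
            sample_1.append(t1)
            sample_2.append(t2)
        diff = score(total(sample_1)) - score(total(sample_2))
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_resamples
```

A difference would be called significant at the 95 % level when the returned p-value is below 0.05.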

The 27 October 2012 case study
The event studied in this section is taken from the HyMeX SOP1 campaign, which was focused on heavy precipitation and its societal impact (Ducroq et al., 2014; Ferretti et al., 2014). Nine of the 20 IOPs considered in SOP1 occurred in Italy.
During SOP1, several upper level troughs extended from northern and central Europe toward the Mediterranean Basin or entered the basin as deep troughs. A few of them developed a cut-off low at 500 hPa; the interaction between the upper level troughs and the orography of the Alps generated a low pressure pattern at the surface in Northern Italy, and usually the whole system moved along the Italian peninsula. The 27 October 2012 case study, also referred to as IOP16a, belongs to this class of events, and it eventually evolved into a cut-off at 500 hPa on 28-29 October (IOP16c). This event, characterized by widespread convection and intense lightning activity, caused heavy precipitation all along the peninsula and also peak values of water level in the Venice Lagoon, where the sea level twice exceeded the warning level of 120 cm (Casaioli et al., 2013; Mariani et al., 2015).
Figure 3 shows the synoptic situation at 12:00 UTC on 27 October 2012. At 500 hPa (Fig. 3a), a trough extends from NE Europe toward the western Mediterranean. The interaction between the trough and the Alps generated a mesolow over Northern Italy, as shown by the 990 hPa contour in Fig. 3b, which caused a cyclonic circulation over most of the peninsula.
In these synoptic conditions, winds over the Tyrrhenian Sea are from W and SW and bring humid and unstable air over the mainland of Italy.The interaction between the unstable air and the orography of Italy reinforced convection, which was already occurring over the sea as shown by the intense electric activity over the Tyrrhenian Sea (see below).
Figure 4a shows the lightning distribution observed by LINET on 27 October 2012. From Fig. 4a, convection is apparent over the Tyrrhenian Sea and it is enhanced over land because of the interaction between the humid and unstable air masses from the sea and the orography of Italy.
The daily precipitation (Fig. 4b) shows the widespread convection over the Apennines, with several stations reporting more than 90 mm day⁻¹. More than 200 mm of rain is reported at two stations in Southern Italy (15.84° E, 40.31° N; 207 mm) and (15.98° E, 40.16° N; 220 mm), while the largest precipitation recorded in NE Italy is 141 mm (13.54° E, 45.85° N). Note also the abundant precipitation over Sardinia and over the north-east of Italy. It is important to note that the rainfall of Fig. 4b is computed by summing the 1 h precipitation registered by the rain gauges. If one of the 1 h observations is unavailable, the rain gauge does not appear in Fig. 4b. So, when verifying the precipitation for shorter timescales, different rain gauges could appear compared to those of Fig. 4b.
Figure 5a and b show the daily precipitation forecast of the CNTRL run and the daily accumulated precipitation of the F3HA6 run. Both panels show a high precipitation amount over the Apennines (> 90 mm day⁻¹) all along the peninsula, in agreement with observations. However, the precipitation is overestimated by both CNTRL and F3HA6, especially above 30 mm day⁻¹. This is apparent by comparing the area of the 90 mm day⁻¹ threshold in Fig. 5a and b with the comparatively few rain gauges reporting this precipitation amount. As will be shown in the next section, this is a general behaviour of the RAMS model with the set-up used in this paper. Other features shown by Fig. 5a and b are a very heavy precipitation spell in NE Italy, whose area is overestimated by CNTRL and F3HA6; a high precipitation spell over the Liguria-Tuscany area, which is only partially revealed by observations due to the lack of data; and moderate precipitation over Sardinia, which is underestimated by the CNTRL forecast both for the precipitation area and amount.
Even if CNTRL and F3HA6 share several precipitation features, there are important differences between Fig. 5a and b. Convection over the sea is underestimated by CNTRL. Even if we cannot prove it by the precipitation amount, the intense electrical activity over the central Mediterranean Sea, and especially over the Tyrrhenian Sea, shows that the convective activity over the sea is underestimated by CNTRL.
Convection over the sea is simulated by F3HA6 thanks to the lightning data assimilation. When convection is advected over the land it increases the precipitation. This is clearly shown by the precipitation over Sardinia, which increases both in areal coverage and rainfall amount for F3HA6 compared to CNTRL.
Other differences between the precipitation fields of CNTRL and F3HA6 can be discussed more easily through the difference of the precipitation fields. Figure 5c shows the precipitation difference between CNTRL and F3HA6 in this order, so that positive values show larger precipitation for CNTRL, while negative values show larger precipitation for F3HA6.
From Fig. 5c it is apparent that the precipitation of F3HA6 increases over large areas of the domain, especially over the Tyrrhenian Sea. The rainfall over Sardinia increases by up to 40 mm day⁻¹, showing the important impact of the lightning assimilation on the forecast. However, the largest differences are found along the Apennines with values up to 80 mm day⁻¹.
In general, the lightning assimilation increases the precipitation, but Fig. 5c also shows areas where the precipitation of F3HA6 decreases compared to CNTRL because of the different evolution of the storm in the two simulations. This is especially evident over the Adriatic coast of the Balkans, where positive-negative patterns alternate every few tens of kilometres. We will discuss this point further later in this section.
Up to now, we considered the impact of the lightning assimilation on the daily precipitation, i.e. when the rainfall amounts of the eight F3HA6 forecasts in a day are added, but the main focus of this paper is on the short-term precipitation forecast. To consider this point, Fig. 6a shows the observed precipitation accumulated between 06:00 and 09:00 UTC, and Fig. 6b and c show the corresponding precipitation for CNTRL and F3HA6, respectively.
Figure 6a shows considerable precipitation spells (about 40 mm/3 h) over NE Italy, in some spots over the Apennines all along Italy, and, somewhat smaller, over Sardinia.
Comparing Fig. 6b with Fig. 6a it is apparent that the CNTRL forecast is able to catch several features of the precipitation field, such as the local spots of heavy rain over the Apennines or the rain spell over NE Italy, the main error being the scarce precipitation simulated over Sardinia. This issue is in part solved by the F3HA6 forecast (Fig. 6c), which shows larger precipitation compared to CNTRL over Sardinia.
To better focus on the improvement given by the lightning data assimilation on the short-term QPF, we consider the precipitation hits, i.e. the correct forecasts, of the contingency tables. Figure 7a shows the difference between the hits of F3HA6 and CNTRL (in this order) for the 1 mm/3 h (8 mm day⁻¹) threshold. In Fig. 7a, the +1 (red asterisk) shows a station where the CNTRL forecast did not predict a precipitation equal to or larger than the threshold, while F3HA6 correctly predicted a rainfall equal to or larger than the threshold at the rain gauge. The −1 value (blue asterisk) shows the opposite behaviour. In Fig. 7a there are 52 new correctly predicted events for F3HA6. They are located in the Apennines and, mostly, over Sardinia, where CNTRL missed the forecast (Fig. 5a and b). There are also two stations where the lightning assimilation worsens the forecast, because of the different evolutions of the storms in CNTRL and F3HA6, but the benefits of the lightning data assimilation on the short-term QPF are nevertheless apparent for the 1 mm/3 h threshold.
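The +1/−1 values plotted in Fig. 7 can be reproduced from station data with a few lines of Python. The sketch assumes three equally ordered lists of 3 h precipitation (observed, F3HA6, CNTRL) at the same stations; the function name and argument layout are illustrative.

```python
def hit_difference(observed, f3ha6, cntrl, threshold):
    """Per-station difference between the hits of F3HA6 and CNTRL.

    Returns +1 where only F3HA6 scores a hit, -1 where only CNTRL does,
    and 0 where both or neither do (such stations are not plotted).
    """
    result = []
    for obs, fc_assim, fc_ctrl in zip(observed, f3ha6, cntrl):
        event = obs >= threshold                  # rain observed at or above the threshold
        hit_assim = event and fc_assim >= threshold
        hit_ctrl = event and fc_ctrl >= threshold
        result.append(int(hit_assim) - int(hit_ctrl))
    return result
```

Counting the +1 and −1 entries gives the numbers of gained and lost hits discussed in the text for each threshold.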
Figure 7b shows the difference between the hits of F3HA6 and CNTRL for the 10 mm/3 h (80 mm day⁻¹) threshold, which is more interesting when considering moderate-high rainfall amounts. For this threshold, the lightning data assimilation improves the forecast because 12 new events are correctly predicted by F3HA6 along the Apennines and over Sardinia.
It is important to note the precision of the correction to the precipitation field given by the lightning data assimilation. The positive-negative pattern of the difference between the precipitation fields of CNTRL and F3HA6 (shown for the daily precipitation in Fig. 5c, with amplitudes of tens of kilometres in the Central Apennines) is also found, with lower amplitude, for the 3 h forecast (not shown). The F3HA6 forecast gave the correct prediction at several new stations for both the 1 mm/3 h (52 rain gauges) and 10 mm/3 h (12 rain gauges) thresholds, while losing only two stations correctly predicted by CNTRL for the 1 mm/3 h threshold. This shows not only that the precipitation is added where necessary but also that it is subtracted where it did not occur; i.e. only two correct forecasts are lost by the lightning data assimilation. For example, between 03:00 and 06:00 UTC there are 110 stations where the precipitation is reduced by more than 1 mm/3 h, 20 stations where it is reduced by more than 5 mm/3 h, and 7 stations for which the precipitation is reduced by more than 10 mm/3 h.
It is worth noting that the stations correctly forecast by both CNTRL and F3HA6 for a given precipitation threshold do not appear in Fig. 7a and b. This occurs, for example, for the rain gauges in NE Italy.
This section showed how the data assimilation technique of this study works and how it is able to add new correct forecasts (hits) to CNTRL for a case study. In the following section, scores based on contingency tables are presented for a total of 20 case studies in order to quantify, in a statistically robust way, the benefits of the total lightning data assimilation on the short-term QPF.

Statistical scores
In this section we discuss the statistical scores of the F3HA6 forecast in comparison to CNTRL.The results of the ASSIM run are also presented as the benchmark for lightning data assimilation.First we discuss the results for the daily precipitation accumulated starting from 3 h rainfall forecasts.Figure 8a shows that the bias increases with the threshold from 0.8-1.0(1 mm day −1 threshold, depending on the type of simulation) to 2.3-2.6 (60 mm day −1 threshold), showing a considerable overestimation of the forecast area for the larger thresholds (> 40 mm day −1 ).The lightning data assimilation improves the bias up to 10 mm day −1 (both F3HA6 and ASSIM), while performance is worsened by data assimilation for larger thresholds.As expected, the ASSIM shows the largest bias, followed by F3HA6 and CNTRL.This is caused by the addition of water vapour by the data assimilation, which is larger for ASSIM (assimilation performed continuously) compared to F3HA6 (assimilation is not performed in the forecast phase).The statistical test to assess the bias difference between CNTRL and F3HA6 shows that the two scores are different at 95 % significance level for all thresholds, showing the significant impact of the lightning data assimilation on the precipitation forecast.
The overestimation of the precipitation area for higher thresholds is evident, as discussed in the previous section, in Fig. 5a and b over the Apennines for the 90 mm day −1 threshold (the ASSIM simulation, not shown, does not differ substantially from F3HA6).Comparing the result of the bias with the same result of Federico (2016), where the same configuration of the RAMS model of CNTRL was used, we note a considerable increase of the bias in this work.This difference is caused by the fact that Federico (2016) considered 50 consecutive days of the HyMeX SOP1, i.e. with heavy, moderate, and small precipitation, while this study considers only cases with deep and widespread convection.The RAMS with WSM6 scheme shows the tendency to overestimate the bias for increasing precipitation (Federico, 2016; see also Liu et al., 2011, for a general comparison of the WSM6 microphysical scheme and other microphysical schemes available in the WRF model), and this tendency is amplified for the heavy precipitation events considered in this work.
Figure 8b shows the ETS score. For CNTRL it decreases from 0.35 (1 mm day⁻¹) to 0.17 (60 mm day⁻¹). The ETS increases for F3HA6, especially for thresholds lower than 40 mm day⁻¹, showing the positive impact of the lightning assimilation on the precipitation forecast. The difference of the ETS for F3HA6 and CNTRL is statistically significant at the 95 % level for thresholds up to 20 mm day⁻¹ and not significant for larger precipitation. The ASSIM simulations show a further increase of the ETS compared to F3HA6 because of their ability to better represent convection during the simulation through lightning data assimilation.
The POD (Fig. 8c) for CNTRL decreases from 0.70 (1 mm day⁻¹) to 0.52 (60 mm day⁻¹); i.e. half of the potentially dangerous events are correctly predicted. The POD is also rather stable (0.6) between the 10 and 40 mm day⁻¹ thresholds. The POD increases for F3HA6. The lowest increment is attained for 60 mm day⁻¹ (0.04, i.e. 4 % more potentially dangerous events are correctly forecast compared to CNTRL), the largest for 1 mm day⁻¹ (6.5 %). Differences between the POD of CNTRL and F3HA6 are significant at the 95 % level for all thresholds, showing the robust improvement of the performance for this score using lightning data assimilation. Notably, the ASSIM run increases the POD by 8-10 %, depending on the threshold.
The FAR for CNTRL (Fig. 8d) increases from less than 0.2 (1 mm day⁻¹ threshold; i.e. less than 20 % of the forecasts are false alarms) to 0.8 (60 mm day⁻¹ threshold; i.e. 80 % of the forecasts are false alarms). The lightning assimilation improves the FAR, but the differences are statistically significant only for the 1 mm day⁻¹ (90 % level) and the 5 and 10 mm day⁻¹ (95 % level) thresholds. Inspection of the contingency tables shows that the improvement of the FAR for those thresholds is attained through a larger number of hits, although the number of false alarms also increases. In general, the lightning assimilation increases the precipitation, which is already overestimated by CNTRL for the larger thresholds. So, lightning data assimilation increases the POD and the hit rate but also the false alarms, which were already numerous in CNTRL, especially for the larger thresholds (> 30 mm day⁻¹). In any case, we believe that the result is overall helpful for operational purposes.
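The four scores discussed here can be computed from a 2x2 contingency table, as in the minimal Python sketch below. It uses the standard definitions of bias, POD, FAR, and ETS. The class and function names are illustrative assumptions, and the counts are made-up numbers chosen only to show how the FAR can decrease while the absolute number of false alarms increases, provided the hits grow proportionally more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    hits: int               # event forecast and observed
    false_alarms: int       # event forecast but not observed
    misses: int             # event observed but not forecast
    correct_negatives: int  # event neither forecast nor observed

    def __add__(self, other):
        # Pool two tables by summing their entries (e.g. across case studies).
        return Contingency(
            self.hits + other.hits,
            self.false_alarms + other.false_alarms,
            self.misses + other.misses,
            self.correct_negatives + other.correct_negatives,
        )


def scores(t):
    """Return bias, POD, FAR and ETS for one contingency table."""
    n = t.hits + t.false_alarms + t.misses + t.correct_negatives
    forecast_yes = t.hits + t.false_alarms
    observed_yes = t.hits + t.misses
    hits_random = forecast_yes * observed_yes / n  # hits expected by chance
    return {
        "bias": forecast_yes / observed_yes,
        "pod": t.hits / observed_yes,
        "far": t.false_alarms / forecast_yes,
        "ets": (t.hits - hits_random)
        / (t.hits + t.false_alarms + t.misses - hits_random),
    }


# Made-up counts for one high threshold (NOT data from this paper).
without_assimilation = Contingency(60, 140, 40, 760)
with_assimilation = Contingency(66, 150, 34, 750)

for name, table in [("without", without_assimilation), ("with", with_assimilation)]:
    print(name, {key: round(value, 3) for key, value in scores(table).items()})
```

In this hypothetical example the false alarms rise from 140 to 150, yet the FAR falls from 0.700 to about 0.694 because the hits rise from 60 to 66; the bias worsens from 2.00 to 2.16 at the same time.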
Figure 9a shows the bias for the 3 h precipitation forecast. The bias for CNTRL increases from about 1 (0.2 mm/3 h threshold) to 2.5 (20 mm/3 h threshold). The bias differences between CNTRL and F3HA6 are significant at the 95 % level for all thresholds.
The ETS score (Fig. 9b) for CNTRL shows a decrease from 0.33 (0.2 mm/3 h threshold) to 0.13 (20 mm/3 h threshold). The ETS is larger for F3HA6 compared to CNTRL and the differences of the scores are significant at the 95 % level for all thresholds. It is also noted that, while the ETS is positive for all thresholds, the ETS value is rather low for the 20 mm/3 h threshold, limiting the usefulness of the forecast.
Figure 9d shows the FAR for the 3 h forecast. The FAR increases from 0.3 to 0.83 for the CNTRL forecast. The FAR for F3HA6 decreases (by 1-3 % depending on the threshold); the improvement results from the increase of the hits, although it is also associated with an increase of the false alarms.
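The significance statements in this section compare two forecasts verified on the same cases. This section does not specify the test used, so the Python sketch below is only one plausible approach, namely a paired resampling (permutation) test: for each case the labels of the two forecasts are randomly swapped, the pooled score is recomputed, and the observed difference is compared with the resampled distribution. The function names, the choice of the POD as the score, and the counts are illustrative assumptions.

```python
import random


def pooled_pod(tables):
    """POD from per-case (hits, misses) pairs pooled by summation."""
    hits = sum(h for h, _ in tables)
    misses = sum(m for _, m in tables)
    return hits / (hits + misses)


def resampling_p_value(tables_a, tables_b, n_resamples=2000, seed=0):
    """Two-sided p-value for the difference of pooled POD between two
    forecasts, obtained by randomly swapping the forecasts case by case."""
    rng = random.Random(seed)
    observed = pooled_pod(tables_a) - pooled_pod(tables_b)
    at_least_as_extreme = 0
    for _ in range(n_resamples):
        sample_a, sample_b = [], []
        for table_a, table_b in zip(tables_a, tables_b):
            if rng.random() < 0.5:  # swap the labels for this case
                table_a, table_b = table_b, table_a
            sample_a.append(table_a)
            sample_b.append(table_b)
        difference = pooled_pod(sample_a) - pooled_pod(sample_b)
        if abs(difference) >= abs(observed):
            at_least_as_extreme += 1
    return at_least_as_extreme / n_resamples


# Made-up (hits, misses) per case for 20 cases (NOT data from this paper).
assimilation_cases = [(66, 34)] * 20
control_cases = [(60, 40)] * 20
p_value = resampling_p_value(assimilation_cases, control_cases)
print("difference significant at the 95 % level:", p_value < 0.05)
```

A difference that appears in every case, as in this artificial example, is very unlikely to be reproduced by random swapping, so the p-value is small; two identical forecasts would give a p-value of 1.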

Discussion and conclusions
This study shows the application of a total lightning data assimilation technique, developed by Fierro et al. (2013), to the RAMS model with the WSM6 microphysics scheme (Federico, 2016). The technique adds water vapour to grid columns where flashes are observed, and the water vapour added at constant temperature depends on the flash rate and on the graupel mixing ratio. Water vapour is added to the model when suitable, while it is left unchanged when the model already predicts a value larger than that of the data assimilation algorithm. This paper shows a realistic implementation of the assimilation-forecast procedure that can be adopted in operational weather forecasting.
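The nudging rule summarised above can be sketched in Python as follows. The sketch assumes a hyperbolic-tangent dependence on flash rate and graupel mixing ratio, scaled by the saturation mixing ratio at the local temperature. All coefficient values are placeholders chosen for illustration, not the values used in this study, and the function names, the units, and the Bolton-type saturation formula are likewise assumptions. The only behaviour taken directly from the text is that vapour is added only where flashes are observed and that the model value is kept when it already exceeds the assimilation value.

```python
import math

# PLACEHOLDER coefficients for illustration only (not the paper's values).
A, B, C, D, ALPHA = 0.85, 0.15, 0.30, 0.25, 2.2


def saturation_mixing_ratio(temp_k, pres_pa):
    """Saturation mixing ratio over liquid water (kg/kg), using a
    Bolton-type approximation for the saturation vapour pressure."""
    temp_c = temp_k - 273.15
    es_pa = 611.2 * math.exp(17.67 * temp_c / (temp_c + 243.5))
    return 0.622 * es_pa / (pres_pa - es_pa)


def nudged_vapour(qv_model, qs, flash_rate, q_graupel):
    """Water vapour mixing ratio after assimilation at one grid point.

    qv_model   : model water vapour mixing ratio (kg/kg)
    qs         : saturation mixing ratio at the local temperature (kg/kg)
    flash_rate : flashes in the grid column during the assimilation window
    q_graupel  : graupel mixing ratio (assumed here to be in g/kg)
    """
    if flash_rate <= 0:
        return qv_model  # no observed flashes: nothing is changed
    target = A * qs + B * qs * math.tanh(C * flash_rate) * (
        1.0 - math.tanh(D * q_graupel ** ALPHA)
    )
    # Vapour is only ever added: keep the model value if it is already larger.
    return max(qv_model, target)


qs = saturation_mixing_ratio(263.15, 50000.0)
print(nudged_vapour(0.001, qs, flash_rate=5.0, q_graupel=0.0))
```

With this form the target humidity grows with the flash rate and shrinks where the model already contains graupel, so that grid points where the model has already developed convection are perturbed less.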
The results of this paper show that the methodology is effective at improving the short-term (3 h) precipitation forecast. In more detail, the analysis of 27 October shows that the total lightning data assimilation is able to trigger convection over the sea and, when convection is advected over the land, it improves the short-term precipitation forecast. This effect is apparent over Sardinia for the case study. The humid marine air masses interact with the local orography, causing or reinforcing convection. Also, the lightning data assimilation improves the rainfall forecast by adding precipitation where it is observed and increasing the hits of the short-term forecast.
The advection of convection from the sea to the land was important in most case studies considered in this paper, and we can conclude that it plays a fundamental role. There are cases, however, when it is less important, as for the severe and localized storm that occurred in NE Italy on 12 September 2012 (Manzato et al., 2014). For this case, the storm developed and evolved over land, and the difference between the precipitation field of the CNTRL and F3HA6 is confined inland, over NE Italy, and it is larger than 40 mm (see the discussion of this paper for the map of the precipitation difference between CNTRL and F3HA6; Federico et al., 2016).
The analysis of the scores for the 3 h precipitation forecast, computed for 20 cases characterized by intense lightning activity and widespread convection, confirms the improvement of the precipitation forecast using lightning data assimilation. The ETS and POD increase for all thresholds considered for F3HA6 compared to CNTRL and the difference between the scores of the competing forecasts is significant at the 95 % level for all thresholds. The FAR is also improved and the difference between the scores of F3HA6 and CNTRL is statistically significant for all thresholds with the exception of 15 mm/3 h. The FAR improvement of F3HA6 is caused by the increase of the hits, but it is also associated with a larger number of false alarms.
The bias is the only score that worsens with lightning data assimilation. The bias of the RAMS model with the WSM6 microphysics scheme is larger than one for most thresholds for the case studies of this paper. Because the lightning data assimilation adds water vapour to the model, it worsens the tendency to overestimate the precipitation area, especially for the larger thresholds.
In addition to the 3 h forecast, the scores and precipitation field are analysed for the daily precipitation for completeness and for comparison with other studies. Recently, Giannaros et al. (2016) presented the WRF-LTNGDA, a lightning data assimilation technique implemented in WRF. They presented the results for eight cases in Greece. Their assimilation strategy focuses on the daily rainfall prediction (daily precipitation of the following day). Their analysis (see their Fig. 3; note also that the maximum precipitation threshold is 20 mm day⁻¹ in their study) shows that the POD increases with lightning data assimilation compared to CNTRL, by up to 5 %. Moreover, for some thresholds, the lightning assimilation lowers the POD because of the different patterns followed by the storms in the simulations with or without lightning data assimilation.
Our results show that the POD improves for all precipitation thresholds when lightning data assimilation is used and the percentage of improvement is slightly better than that reported in Giannaros et al. (2016) for the lower thresholds (below 10 mm day⁻¹). Even if we cannot give a definitive answer to this issue, because of the many important differences between this study and that of Giannaros et al. (2016), the lightning data assimilation technique plays a role. In our case, lightning data are assimilated also for the actual day (6 h assimilation before the forecast start time followed by a 3 h forecast, Fig. 2), while in Giannaros et al. (2016) the assimilation is done only for the day before the actual day (6 h assimilation followed by a 24 h forecast). So, our technique should improve the correct location of convection during the actual day compared to their approach, as shown by the improvement, i.e. the difference between the POD of the simulations with or without lightning data assimilation.
However, other differences play a role: first, the two studies refer to different regions and to different events. In our case the extension of the region, the number of the events, and the number of verifying stations are larger. Moreover, two different model suites are used (WRF and RAMS). These differences are clearly seen in the score values. The POD of Giannaros et al. (2016) is larger than that of this study, especially for thresholds lower than 20 mm day⁻¹. Another important difference arises from the different convective nature of the storms considered in the two works. The performance of the precipitation forecast is clearly dependent on the type of event, i.e. widespread or localized convection (Giannaros et al., 2016), and, because the events considered in the two studies are different, the comparison can be only qualitative. Nevertheless, both studies show that the lightning data assimilation improves the precipitation forecast robustly and can be used in the operational context.
While the results of this study are encouraging, there are a number of issues that need further investigation. The water vapour is added to the grid column where the lightning is observed. However, the lightning is often the result of a process involving larger scales than the horizontal grid spacing considered in this paper (4 km). A spatial extension of the influence of the lightning perturbation on the water vapour field should be explored. For this approach, methods involving the model error matrix are foreseeable and will be investigated in future studies. The problem of the spatial extension of the water vapour perturbation caused by lightning was considered in Fierro et al. (2013) by remapping the flashes onto a coarser horizontal resolution grid (9 km), while no similar approach is adopted in this study.
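The effect of the remapping resolution can be illustrated with a short Python sketch that counts flashes per grid cell. It assumes that flash positions are already given as projected distances in kilometres from the south-west corner of the domain; the function name, the domain sizes, and the flash positions are made up for illustration.

```python
from collections import Counter


def grid_flashes(flashes_xy_km, dx_km, nx, ny):
    """Count flashes in each cell of a regular grid with spacing dx_km.

    flashes_xy_km : iterable of (x, y) positions in km from the SW corner
    Returns a Counter mapping (i, j) cell indices to flash counts;
    flashes falling outside the nx-by-ny grid are discarded.
    """
    counts = Counter()
    for x_km, y_km in flashes_xy_km:
        i = int(x_km // dx_km)
        j = int(y_km // dx_km)
        if 0 <= i < nx and 0 <= j < ny:
            counts[(i, j)] += 1
    return counts


# Made-up flash positions (NOT observed data).
flashes = [(1.0, 1.0), (5.0, 1.0), (7.9, 3.0), (10.0, 10.0)]

fine = grid_flashes(flashes, dx_km=4.0, nx=3, ny=3)    # 4 km grid
coarse = grid_flashes(flashes, dx_km=9.0, nx=2, ny=2)  # 9 km grid
print("4 km:", dict(fine))
print("9 km:", dict(coarse))
```

On the 4 km grid the four flashes fall into three different cells, while on the 9 km grid three of them share a single cell, so a water vapour perturbation tied to the coarser cell would cover a wider area.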
A problem arising with the RAMS model using the WSM6 microphysics scheme is the overestimation of the precipitation area for large rainfall thresholds. This tendency was already noted in Federico (2016), and it is amplified for the cases of widespread convection considered in this study. The high number of false alarms decreases the ETS score for high precipitation, reducing the applicability of the method for the largest thresholds (> 100 mm day⁻¹). The application of different microphysical schemes could mitigate this issue.
Finally, horizontal resolutions higher than those of this paper are needed to better resolve the orography and its interaction with air masses. As a preliminary quantification of this point, we increased the horizontal resolution of the second domain from 4 to 2.5 km for the 15 October and 27 October case studies. Results for the two cases show that the impact of the resolution is notable because the precipitation patterns, especially for larger thresholds (> 50 mm day⁻¹), are less spread in the 2.5 km horizontal resolution experiment compared to the 4 km forecast (see the discussion of this paper for the daily precipitation maps for the two cases; Federico et al., 2016). This impact could be beneficial for the scores of the F3HA6 forecast because it has the tendency to overestimate the precipitation area at high thresholds, as shown in this paper. However, these results are preliminary, and future studies are needed to quantify the impact of the horizontal resolution on the lightning data assimilation forecast.

Data availability
The datasets of daily and 3-hourly precipitation are not publicly available but can be requested from the first author by e-mail. For the dataset of hourly precipitation of this paper see the Assets tab.

Figure 1. The two domains (D1, D2). D1 has 301 grid points in both the WE and SN directions; D2 has 401 grid points in both the WE and SN directions.

Figure 2. Synopsis of the simulations F3HA6 (below the timeline). The blue line is the assimilation stage, while the red line is the forecast stage; d, d + 1, and d − 1 are the actual day, the day after, and the day before the actual day, respectively. In the upper part of the figure the CNTRL and ASSIM simulations are shown.

Figure 3. Synoptic situation at 12:00 UTC on 27 October 2012. (a) 500 hPa: temperature (black contours from 236 to 263 K every 3 K), geopotential height (filled contours, values shown by the colour bar at the bottom), and wind vectors (maximum wind value 41 m s⁻¹). (b) Surface: sea level pressure (contours from 975 to 1020 hPa every 5 hPa, the thick line is the 990 hPa contour), equivalent potential temperature (filled contours, values shown by the colour bar at the bottom), and winds (maximum wind vector is 17 m s⁻¹) simulated at 25 m above the underlying surface in the terrain-following coordinates of RAMS. This figure is derived from the RAMS run at 10 km horizontal resolution. The bottom and left axes show the grid point number, while the top and right axes show the geographical coordinates.

Figure 4. (a) Lightning density on 27 October 2012 (number of flashes/16 km²). The lightning number is obtained by remapping the lightning observed by LINET onto the RAMS grid at 4 km horizontal resolution. Note that the lightning is cut on all sides (this is especially evident on the eastern bound) because of the data availability. The bottom and left axes show the grid point number, while the top and right axes show the geographical coordinates. (b) Daily precipitation (mm) recorded by available rain gauges on 27 October 2012.

Figure 5. (a) Daily precipitation (mm) forecast of CNTRL (maximum value 300 mm in Southern Italy; over NE Italy the maximum value is 135 mm); (b) daily precipitation (mm) forecast obtained by summing the eight 3 h forecasts of F3HA6 (the maximum value is 320 mm in Southern Italy; over NE Italy the maximum simulated value is 132 mm); (c) difference of daily precipitation (mm) between CNTRL and F3HA6.

Figure 6. (a) Precipitation (mm) recorded by rain gauges between 06:00 and 09:00 UTC; (b) as in (a) for the CNTRL forecast; (c) as in (a) for the F3HA6 forecast.

Figure 7. (a) Difference between the hits of the contingency tables of F3HA6 and CNTRL for the 1 mm/3 h (8 mm day⁻¹) forecast; (b) as in (a) for the 10 mm/3 h (80 mm day⁻¹) threshold.

Figure 8. Scores for the daily precipitation computed by summing the contingency tables of all 20 case studies: (a) bias (the line of the perfect score 1.0 is shown in black); (b) equitable threat score; (c) probability of detection; (d) false alarm ratio. F3HA6 is in green, ASSIM is in red, and CNTRL in blue. The asterisks above the x axis show the results of the hypothesis testing (95 % blue, 90 % red) of the difference between F3HA6 and CNTRL scores.

Figure 9. Scores for the 3 h precipitation computed by summing the 160 contingency tables of the 20 case studies: (a) bias (the line of the perfect score 1.0 is shown in black); (b) equitable threat score; (c) probability of detection; (d) false alarm ratio. F3HA6 is in green, ASSIM is in red, and CNTRL in blue. The asterisks above the x axis show the results of the hypothesis testing (95 % blue, 90 % red) of the difference between F3HA6 and CNTRL scores.

Table 1. The 20 case studies.