Uncertainty in flood damage estimates and its potential effect on investment decisions

This paper addresses the large differences that are found between damage estimates of different flood damage models. It explains how implicit assumptions in flood damage functions and maximum damages can have large effects on flood damage estimates. This explanation is then used to quantify the uncertainty in the damage estimates with a Monte Carlo analysis. The Monte Carlo analysis uses a damage function library with 272 functions from seven different flood damage models. The paper shows that the resulting uncertainties in estimated damages are on the order of a factor of 2 to 5. The uncertainty is typically larger for flood events with small water depths and for smaller flood events. The implications of the uncertainty in damage estimates for flood risk management are illustrated by a case study in which the economically optimal investment strategy for a dike segment in the Netherlands is determined. The case study shows that the uncertainty in flood damage estimates can lead to significant over- or under-investments.


Introduction
Flood damage assessment is an essential aspect of flood risk management. It is used for supporting policy analysis and flood insurance. In the Netherlands flood damage estimates are used, for example, to determine economically optimal protection standards for flood defenses (van der Most et al., 2014), to prioritize investments (Jongejan and Maaskant, 2013) or to compare the impact of different flood risk management strategies (Kind et al., 2014).
The most commonly used method for flood damage assessment is the unit loss method (De Bruijn, 2005). This method assesses the damage for each unit separately. This assessment is based on a maximum damage per object and a damage function. A damage function describes the relationship between a flood characteristic (most often water depth) and the fraction of the economic loss that occurs to the object that is damaged.
There are many different flood damage models, all based on the unit loss method, e.g., HIS-SSM for the Netherlands (Kok et al., 2005), the Multi-Coloured Manual in the UK (Penning-Rowsell et al., 2005), HAZUS in the USA (Scawthorn et al., 2006) and FLEMO in Germany (Thieken et al., 2008; Kreibich et al., 2010). These models differ for good reasons. Each model is derived for a specific country, region and/or flood type and tailored to characteristics of the flooding and objects in the considered region (Cammerer et al., 2013).
When these different models are applied to one and the same event, they will yield significantly different results (De Moel and Aerts, 2011; Jongman et al., 2012; Chatterton et al., 2014). Jongman et al. (2012) compared the damage outcomes of seven different flood damage models with the recorded flood damages from events in the UK and Germany. The difference between the smallest and largest estimate/recording was a factor of 5 for the German event and a factor of 10 for the event in the UK. Chatterton et al. (2014) compared two different damage assessments for a region in the UK. The damage estimates differed by about a factor of 5 to 6 for both residential and commercial damages. These large differences in outcomes, for events for which the different models all should be applicable, indicate that flood damage estimation is prone to large uncertainties and thus that potentially large errors can occur when flood damage models are applied.
The uncertainties and potential errors in damage estimates affect decision-making based on those damage estimates. A quantification of the uncertainty in the damage estimates can help to get an insight in the potential error that can occur in a decision based on the flood damage estimate and may improve the decision-making process. USACE (1992) and Peterman and Anderson (1999) both showed that taking ranges of uncertainty into account can lead to different decisions than using single value estimates.
Furthermore, uncertainty quantification is useful to expose key focus points for the improvement of flood damage estimation methods. To reduce uncertainties, additional effort may be needed in researching the flood damages or in collecting data on damaged objects during floods.
Previous uncertainty analyses did not show a common understanding of the size or cause of uncertainties, which also indicates that further research is needed. Generally, uncertainty in flood damage assessment is quantified with forward uncertainty propagation methods which use Monte Carlo simulations (Merz et al., 2004; Egorova et al., 2008; Apel et al., 2008; De Moel et al., 2012). The results of Egorova et al. (2008) indicate moderate uncertainties, which is in contrast with the large differences between flood damage models that were found by De Moel and Aerts (2011), Jongman et al. (2012) and Chatterton et al. (2014).
This paper provides a method to get a robust estimate of the uncertainty in damage estimates based on an analysis of the cause of the large differences between the various existing damage models. The method is illustrated with clear hypothetical examples and then applied to a case to show its use in decision-making for protection standards. The method makes use of a damage function library with 272 damage functions from seven different flood damage models.
The paper focuses on direct material damage. Indirect damages including damages due to business interruption are not considered here, since their analysis requires different methods. The paper starts with a qualitative analysis of the uncertainty found in flood damage models. This qualitative analysis is the basis for the assumptions made in a Monte Carlo analysis which is used to quantify uncertainty. Next, the Monte Carlo analysis is described and discussed in detail. Finally, the Monte Carlo analysis is applied to a case study in the Netherlands and the resulting uncertainties in damage estimates and the effects on flood risk management decisions are discussed.

Qualitative uncertainty analysis
This section first provides a detailed description of unit loss flood damage models and then places all elements of such models into a framework for uncertainty classification in order to generate a detailed qualitative understanding of the uncertainty sources and effects and their correlation. The understanding is applied in Sect. 3 to enable a quantitative uncertainty analysis.

The unit loss method for flood damage assessment
The unit loss method uses relationships between flood characteristics and damages to a unit. The unit loss method consists of four elements: the maximum damage s_i for each category i, the flood characteristics (such as water depth d_j) at all locations j, the damage functions f_i(d) for all categories, which determine the damage fraction, and the number of objects affected n_{i,j}. The damage D of an area is assessed as the sum over all damage categories i and all grid cells j by the following formula (Egorova et al., 2008):

$$D = \sum_{i} \sum_{j} f_i(d_j) \, s_i \, n_{i,j}$$
Potentially relevant flood characteristics are the maximum water depth, flood duration, flow velocity, pollution, warning time and other possible aspects of the flood. Often, only the water depth is used in flood damage modeling, occasionally supplemented by one or two other parameters. The uncertainties in the flood characteristics which are used as input for the damage estimation are not part of this paper. Damage is usually calculated for categories such as houses, industries, commercial companies, roads and agriculture. These object categories differ in maximum damage and flood damage functions. The object and flood characteristics are linked by damage functions which give the fraction of the maximum damage which occurs as a function of the flood intensity. The damage fraction is then multiplied by the maximum damage to get the damage. (Some methods such as the one in the Multi-coloured Manual (Penning-Rowsell et al., 2005) use absolute damage functions which relate the flood intensity directly to the damage and not to a fraction of the maximum damage.) The maximum damage can be defined in different ways. In this analysis we define the maximum damage as the expected damage corresponding with an extreme water depth. This means that the damage function will reach the value of one/unity for the most extreme water depths and it means that the maximum damage already holds information about what part of the total value of the object or unit is susceptible to flood damage. The maximum damage does not include value which is not, or unlikely to be, susceptible to floods, such as the value of the land surface, the costs of building foundations or the value present on high floors in buildings that are unlikely to collapse. Not all damage methods use this definition. Some include more items in the maximum damage and apply damage functions which never reach the value of one if part of that value is on average not susceptible to flooding.
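The unit loss calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of any of the cited models: the depth-damage points, the maximum damage of EUR 100 000, the grid cell values and the names `damage_fraction` and `unit_loss_damage` are all hypothetical.

```python
import numpy as np

# Hypothetical depth-damage function for one category:
# water depth (m) -> fraction of the maximum damage.
# The fraction reaches 1 at the most extreme depth, in line with the
# maximum damage definition used in this paper.
DEPTHS_M = np.array([0.0, 0.5, 1.0, 2.0, 5.0])
FRACTIONS = np.array([0.0, 0.2, 0.4, 0.7, 1.0])


def damage_fraction(depth_m):
    """Linearly interpolate the damage fraction; values are clamped outside the table."""
    return np.interp(depth_m, DEPTHS_M, FRACTIONS)


def unit_loss_damage(depths_m, n_objects, max_damage):
    """Sum over locations of fraction(depth) * maximum damage * number of objects."""
    depths_m = np.asarray(depths_m, dtype=float)
    n_objects = np.asarray(n_objects, dtype=float)
    return float(np.sum(damage_fraction(depths_m) * max_damage * n_objects))


# Three grid cells with assumed water depths (m) and house counts.
total = unit_loss_damage([0.25, 1.0, 3.0], [10, 5, 2], max_damage=100_000.0)
```

A model with several damage categories would repeat this calculation per category, each with its own function and maximum damage, and add the results.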

When comparing different models, the definitions of maximum damage and damage functions first need to be aligned to make a fair comparison.
In this paper we discuss flood damage models, which we define as a set of maximum damages, damage functions, object data and their relationships with which a damage estimate for a flooding in a certain area can be made.

Types of uncertainty
In the uncertainty analysis in flood damage assessment two types of uncertainty are distinguished: aleatory and epistemic uncertainty (Merz and Thieken, 2009).
Aleatory uncertainty is related to the variability or heterogeneity within a population, which can be expressed by statistical parameters such as the mean, variance and skewness. This uncertainty is introduced by using average data: we use the maximum damage value of an average residence, although we know that some houses will suffer more, and others will suffer less damage. In small flood events which only affect a few houses, these few houses may differ significantly from the "average house" and therefore the damage estimate for these houses is uncertain. In large flood events which affect many houses, it is likely that deviations from the mean damage cancel out. This means that for large floods this type of uncertainty is of lesser importance.
Aleatory uncertainty by using averages can sometimes be reduced by applying more differentiation, e.g., the uncertainty within the maximum damage of a residence is reduced by using more differentiation in house types. The variation in maximum damage per house type would then be less than if all houses together were considered as one category "houses".
Epistemic uncertainty is the lack of understanding of a system and can in theory be reduced by further study or by collecting more or better data. In other words, the average damage itself is also not certain. For flood damage assessments, data are only available for a small number of events and those events often differ significantly from each other. This variation between events is still poorly understood and is therefore related to epistemic uncertainty. The epistemic uncertainty as stated above is not reduced when many objects are flooded. Therefore, it is the dominant uncertainty type for large flood events.
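The different behavior of the two uncertainty types can be illustrated with a small simulation. The sketch below is only an illustration of the reasoning: the uniform distributions, the mean damage of EUR 50 000 and the function name `cv_of_total` are assumptions, not values from this study. An error shared by all houses stands in for epistemic uncertainty, and an independent error per house stands in for aleatory uncertainty.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
N_RUNS = 2000           # number of simulated events (assumed)
MEAN_DAMAGE = 50_000.0  # hypothetical average damage per house (EUR)


def cv_of_total(n_houses):
    """Coefficient of variation of the total damage of n_houses flooded houses."""
    # Epistemic stand-in: one multiplier per event, shared by every house,
    # because the average damage itself is uncertain.
    shared = rng.uniform(0.5, 1.5, size=(N_RUNS, 1))
    # Aleatory stand-in: an independent multiplier for every house,
    # because individual houses deviate from the average house.
    individual = rng.uniform(0.0, 2.0, size=(N_RUNS, n_houses))
    totals = (MEAN_DAMAGE * shared * individual).sum(axis=1)
    return totals.std() / totals.mean()


cv_small = cv_of_total(5)     # few houses: both contributions are visible
cv_large = cv_of_total(2000)  # many houses: only the shared error remains
```

With many houses the independent deviations average out, so the spread of the total approaches the spread of the shared multiplier alone.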
This type of uncertainty is especially relevant when a damage module developed for one area is applied to another area. In such a case, for example, the maximum damage values within the model related to houses may not be valid for the types of houses in the area under consideration. It would therefore be good not to mix up maximum damages and damage functions from different areas. However, given the scarcity of data and flood damage models, this leaves many modelers with the difficult choice between damage functions and maximum damages based on recorded data from another area or a local estimate.

Uncertainty in the unit loss method
Uncertainty in the unit loss method consists of uncertainty in object data, in maximum damage figures and in the damage functions. Table 1 shows an overview of the uncertainties present in these aspects of the unit loss method.

Uncertainty in object/land use data
Uncertainties are found in the quantity of objects and their precise location. The precise location of objects is important, since the flood hazard characteristics (e.g., depth, or flow speed) may differ substantially from location to location. Each object should be linked to the hazard value present at the location of the object. The effect of the uncertainty in the precise location of objects on damage estimates is smaller for more homogenous hazards. For example, in a deep, flat polder, the exact location of an object is not important because the water depth is approximately the same everywhere. When a hazard becomes more heterogeneous the uncertainty in the exact location becomes more relevant.
Geographical location data uncertainties are especially significant in areas which flood frequently but have small water depths because, in this type of area, a small error in the location or height of an object can make the difference between an object getting wet frequently or very rarely. Furthermore, valuable objects susceptible to flood damage are unlikely to be placed in a location that floods frequently, so it is much more likely to count too many wet objects than to count too few. Therefore, damage estimates may be very wrong if the approach used was too coarse to discover local elevations or the exact locations of objects. For example, the Dutch standard damage model HIS-SSM estimated EUR 100 million damage for an event in an unprotected area that in reality had only caused about EUR 30 000 in damages (price level 2012) (Slager et al., 2013).
Such errors are, however, unlikely when objects are not elevated on purpose, placed on safe locations or protected in other ways. Without such local protection, the damage will be overestimated for some objects and underestimated for others. If many objects are affected, these errors compensate for each other, which reduces the uncertainty in the total damage. Use of high-resolution elevation information is also very useful to reduce this uncertainty (Koivumäki et al., 2010).
Uncertainty in the quantity of objects can be caused by errors in data, or by using data sources that are inappropriate for the intended application. This uncertainty depends on the quality of the data set that is used. De Moel and Aerts (2011) illustrated that this type of uncertainty may be small as they showed that different types of land use maps for the same area only have a small impact on the resulting damage estimate. In the uncertainty quantification for this paper, the uncertainty in the geographical location data is neglected. In addition, objects can be represented in different ways and each way can cause different uncertainties. A company office, for example, can be represented by either the floor space in the office, the footprint area of the office building, an area larger than the office building on a rough land use map or the number of jobs within the office. All these object representations correlate in some way with value present in the building but the indicator will not precisely correspond with the value of the building. The uncertainty this causes is aleatory, because if the indicator overestimates the value at one point it will underestimate it somewhere else, assuming that the maximum damage is a good average.

Uncertainty in the maximum damage figure
The uncertainty in the maximum damage figure can be divided into two parts: the uncertainty in the value of the object and in the part of that value that is susceptible to flood damage.
There are generally two ways to obtain the maximum damage for a flood damage model: deriving this from economic data or by looking at synthetic (hypothetical) buildings.
Economic data typically provide a total value per sector of all physical assets in the economy. To obtain a maximum damage figure per unit, this total value can be divided by the number of units within that sector. Next, the part of this total value which is susceptible to flooding must be identified. The strength of this method is that the mean object value will be accurate. However, uncertainty is still present in the part of the object that is susceptible to flood damage. A similar method is to use average construction costs and to correct this for the fraction that is actually susceptible to flooding.
Alternatively, the maximum damage of a category can be obtained by defining a hypothetical average company or object, and assessing the damage of all parts/aspects within that hypothetical company. The strength of this method is that the damage function and the maximum damage are well connected. Furthermore, the part of the value that is susceptible to flood damage is determined in a systematic way. The disadvantage of this approach is that epistemic uncertainty is introduced in the value of the object as the assumptions about this may be wrong.
Because a lot of good data are generally available on the value of objects, the uncertainty in the value of objects is considered to be aleatory. The uncertainty in the part of the value which is susceptible to flooding is epistemic, because only a small amount of data or knowledge about it is available.

Uncertainty in damage functions
Damage functions can be obtained in two ways: by analyzing data on observed damage to objects and flood characteristics in past flood events, or by defining hypothetical average objects and assessing their damage corresponding to different flood intensities. A combination of both approaches may also be used.
Flood damage data are rarely collected in a systematic way (Thieken et al., 2005) and are not always available for research. When available, data are often limited to a single or a few events. These events are often not representative of other types of floods or other countries or areas. Cultural or geographical differences can cause the use of different building or interior materials between regions and events, making one data set not applicable to other areas. Another problem is that data are often limited to certain ranges of a flood parameter. For example, data may be only available for low water depths or the flood that was the source of the data may have coincided with a storm. In such cases the data cannot be used for events with larger water depths or no storm.
In general, transferring data from one event to another is error-prone. This makes it very difficult to apply knowledge derived from one event to another. Even if the data are applied to the same area as the data were taken from, problems may arise. Different flood events in the same area may lead to very different damage due to different human responses. For example, the same area in the Netherlands flooded in 1993 and 1995 with approximately the same water levels. The second time, the damage to housing content was about 80 % less (Wind et al., 1999). Also, the damage due to the Rhine floods of 1995 was less than half of the damage that occurred in 1993, as a result of precaution measures taken by households (Bubeck et al., 2012). This shows the sensitivity of flood damage to factors other than water depth. These other factors (in this case, flood experience) are often neglected in the recordings. This example shows that a data set based on a small number of events does not capture all possible variable values.
Synthetic damage functions solve many of the problems of having too few empirical data on actual damages. In this method, a hypothetical building is defined and flood damage is assessed for each building part. The hypothetical building should be representative of an average building in the area. If it is not, or if the damage estimates for the different building parts are not right, the damage function is inaccurate.
Damage data can also be combined with expert knowledge. Probably the most common method to create a damage model is by picking and choosing damage functions from other models based on an analysis of which existing damage function best represents the area considered. Or, the average between different functions could be used as a damage function. The challenge with this combined method is to understand the background assumptions between the models that are brought together or compared. For example, a common challenge may be that the maximum damage definitions do not match.
The ideal case is to combine the best of the two methods. The damage data available should be used to calibrate a synthetic model. This limits the possibility that large errors are made in the interpretation of the damage data (e.g., wrong definition of the maximum damage), by forcing the modeler to think about the processes. Furthermore, it gives the modeler the freedom to diverge from the observed data in situations that do not match any of the recorded events.
A common problem in constructing damage functions is that it is difficult to include the large number of parameters that may influence the flood damage. The parameters that are not used are implicitly considered. Each flood damage model based on a limited number of parameters is therefore making assumptions about the effect of the parameters that are not explicitly considered. Those unconsidered parameters have been very significant in a subset of flood events. For example, in the 2002 Elbe floods, contamination was critical (Thieken et al., 2005), in the Meuse floods, flood experience was critical (Wind et al., 1999) and in the 1945 floods in the Wieringermeer polder in the Netherlands, the waves in the flood water were critical (Duiser, 1982). This last example is complicated by a study of Roos (2003) who showed that the findings of the 1945 Wieringermeer polder flood are not valid for modern buildings; so the construction year/type of building can in some cases also be a critical parameter. Other possibly significant parameters are, for example, building style, flow velocity, flood duration, warning time and preparation.
Parameters that are not used can have a correlation with parameters that are used. For example, the water depth is correlated with the flood duration for floods in the Netherlands (Duiser, 1982; Wagenaar, 2012). Because of this correlation, the uncertainty caused by not knowing the flood duration is limited in the Netherlands. This relationship between two parameters may, however, be completely different for other types of floods (e.g., flash floods). A generally applicable flood damage model therefore still needs all parameters.
Table 1 splits the uncertainties involved in flood damage functions into two groups: those related to parameter representation (using fewer parameters than theoretically necessary to describe the damage processes) and those related to a lack of knowledge about the damage processes. The uncertainties related to a lack of knowledge are epistemic, while the group of uncertainties related to the use of fewer parameters than necessary is aleatory, because this group of uncertainties would remain even with perfect knowledge. In Sect. 3 the analysis of the epistemic and aleatory uncertainty components is used to assess the uncertainty in the outcome of a single damage calculation. The epistemic uncertainty will be estimated by using the difference between damage functions from entirely different flood damage models (Sect. 3.1.2). The aleatory uncertainty will be considered by looking at the variation within one flood damage model (e.g., the difference between the low and the high estimate of the same flood damage model).

Overview of the method
This section proposes a method for quantitative uncertainty analyses using a Monte Carlo analysis. The qualitative uncertainty analysis discussed in the previous section is used to estimate the uncertainty in the inputs and the correlations between the different input parameters. The general assumption behind this uncertainty analysis is that no good local damage functions are available and that the modeler therefore does not know which damage functions to choose. For areas for which good local damage functions are available, the approach discussed in this section may overestimate the uncertainty.
The damage analysis in this paper is limited to two damage categories: houses and companies, as they are represented in the different flood damage models on which this research is based. These were selected since all damage models contain functions to assess them. This is not the case for other damage types. In flood damage models that do consider other damage categories, houses and companies usually make up the majority of the direct damage. Both damage categories are divided into damage to buildings and damage to content. Many individual flood damage models provide several more detailed subcategories for these basic categories. Our approach may therefore lead to slightly larger uncertainties than are present in such models that are more detailed.
A crucial aspect of the Monte Carlo analysis of uncertainties is the correlation among the uncertainties of the different input parameters, such as the maximum damage of houses. If, for example, the maximum damage of house X is overestimated, the maximum damage of house Y may also be overestimated. Parameters that are homogeneous within one event but vary between events have strongly correlated uncertainty values. For example, if damage depends on warning time and in one particular event the warning time is unusually short, this would more or less be the case for all houses affected in that event. Such aspects are therefore sampled for the entire area at once. Other parameters, such as the building type, vary between neighborhoods or from place to place. These need to be sampled at a smaller level than the entire area. Sampling is therefore done at two different levels: for the entire event and at a more detailed sub-event level. Figure 1 gives an overview of the calculation process, which is repeated 10 000 times. This results in 10 000 different damage estimates which together make up the distribution of possible damages.

Flood damage library
A damage function library was constructed containing 272 different damage functions from seven different flood damage models. Damage functions from flood damage models were included if they were made for developed countries and available to the author at the time of the study (2013-2014). These functions were the basis for the damage fraction and the measure of susceptibility to flooding. The damage fraction was sampled by picking damage functions from a flood damage model. Each function was individually scaled to 1 to ensure that the same maximum damage definition is applied everywhere. Table 2 gives an overview of the models included in the damage function library. The Tebodin model only has damage functions for companies and the Billah (2007) model only has damage functions for houses. Since both models were made for the Netherlands and the damage functions were constructed using similar techniques, these two models have been merged into one flood damage model. Figure 2 shows the average damage functions for the different flood damage models. In this figure the damage functions are not scaled up to 1.
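The scaling step can be sketched as follows. The text only states that every function is scaled to 1; dividing a tabulated function by its largest value is one plausible way to do this and is an assumption here, as are the example values and the name `scale_to_unity`.

```python
import numpy as np


def scale_to_unity(fractions):
    """Rescale a tabulated damage function so that its largest value equals 1."""
    fractions = np.asarray(fractions, dtype=float)
    peak = fractions.max()
    if peak <= 0:
        raise ValueError("damage function has no positive damage fraction")
    return fractions / peak


# Hypothetical function that levels off at 0.6, as happens when a model
# includes non-susceptible value in its maximum damage.
raw = [0.0, 0.15, 0.3, 0.45, 0.6]
scaled = scale_to_unity(raw)
```

After scaling, the information about which part of the object value is susceptible to flooding is no longer carried by the function and has to be carried by the maximum damage instead.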
In this library the definition of the zero height point is the ground level (elevation in the digital elevation model) rather than the ground floor level. Some flood damage models use the floor level as the zero point and combine this with vertical elevation data of the ground floor. For this paper no damage functions with significant damage below the zero point were used. In the few functions used with damage below the zero point, the zero point was shifted to the point where the first damage occurred to make them comparable with other functions.

Table 2. Flood damage models included in the damage function library.
Model: Description
HIS-SSM: The HIS-SSM is a standard Dutch flood damage model (Kok et al., 2005). It is based on several earlier Dutch flood damage studies (Duiser, 1982; Briene et al., 2002).
FLEMO: [...] and FLEMOcs for companies. The functions include a low and a high estimate.
Rhine Atlas: This second German model is based on expert judgement, taking data from an earlier German damage database (HOWAS) into account. More information about these functions is available in Jongman et al. (2012).
Tebodin: This is a Dutch study, based on a detailed, systematic and well-documented expert judgement approach. This study only provides damage functions for industry. It is detailed; it provides functions for many different industrial types and it has separate functions for areas protected by flood defences and for unprotected areas (Snuverink et al., 1998; Sluijs et al., 2000).
Billah (2007): This is a research project in which the systematic expert judgement approach as used in MCM was applied to Dutch houses.

Land use maps
A flood damage model needs input about the number of houses and jobs affected. For the case study, the number of houses was taken from the geographical database BAG. This database was made by the Dutch Cadaster, Land Registry and Mapping Agency. For the number of jobs, the background data of HIS-SSM were used (Kok et al., 2005).

Step 1: event-level sampling (epistemic uncertainty)
The sampling on the event level is done by sampling a flood damage model (e.g., HAZUS or MCM) and using that model throughout the damage calculation. The sampled model is applied to all categories and is used as the source for the damage functions and for the measure of susceptibility to flooding of the maximum damages. The advantage is that a realistic combination of inputs is sampled. This procedure prevents functions with different implicit assumptions from being merged. It also prevents a higher damage from being sampled, on average, for small water depths than for large water depths.

Group size and dependency
For the sub-event-level sampling, uncertainty values are sampled for small groups of houses or for a company. Houses and jobs are grouped because in reality, too, similar houses are often built near each other, and a company is expected to be relatively homogeneous in damage per job. It is therefore not realistic to sample all houses and all individual jobs in an area completely independently of each other. By sampling in small groups of houses/jobs, total dependency within a group and total independence between groups is assumed.
The way in which the area is grouped determines the dependency for this aleatory uncertainty. This buildup of groups is therefore sampled again for each Monte Carlo simulation, which should be seen as a sampling of the dependencies between the damage of different objects. For houses the area is split into groups of 1, 10 or 100 houses for every simulation. Houses are only grouped together when they have a similar water depth. This is done to keep the calculation simple but also because similar water depths typically occur in locations that are geographically close. Furthermore, the group sizes are so small that for medium- or larger-sized events, this assumption has no influence on the results. For companies, the jobs are grouped per company.
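One possible reading of the grouping of houses is sketched below. Two choices in this sketch are assumptions and not taken from the text: a single group size is drawn for the whole simulation, and "similar water depth" is implemented by sorting the houses by depth and cutting the sorted list into consecutive groups. The name `group_houses` and the example depths are hypothetical.

```python
import numpy as np

GROUP_SIZES = (1, 10, 100)  # group sizes named in the text


def group_houses(depths_m, rng):
    """Return a group index per house so that houses of similar depth share a group."""
    depths_m = np.asarray(depths_m, dtype=float)
    size = int(rng.choice(GROUP_SIZES))          # assumed: one size per simulation
    order = np.argsort(depths_m, kind="stable")  # similar depths become neighbours
    group_of_house = np.empty(len(depths_m), dtype=int)
    # Consecutive houses in depth order receive the same group index.
    group_of_house[order] = np.arange(len(depths_m)) // size
    return group_of_house, size


rng = np.random.default_rng(seed=1)
depths = rng.uniform(0.0, 3.0, size=250)  # hypothetical water depths (m)
groups, size = group_houses(depths, rng)
```

Calling `group_houses` again in every Monte Carlo run redraws the group size and with it the assumed dependency between houses.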

Damage functions
Within each group, every house/job receives the same damage fraction and maximum object value. Sampling for the damage fraction is done based on the set of damage functions within the flood damage model sampled in step 1. For example, if the flood damage model sampled has three damage functions for houses, for each group, one of the three damage functions is randomly used.
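The two sampling levels can be combined in a short sketch. It is a simplified illustration, not the code used for this study: the two-model library, the depth-damage values, the fixed maximum damage of EUR 100 000 (the sampling of the maximum damage is left out here) and the name `simulate_once` are all hypothetical.

```python
import numpy as np

# Hypothetical library: model name -> damage functions for houses, each given as
# damage fractions at the depths in DEPTHS_M and already scaled to reach 1.
DEPTHS_M = np.array([0.0, 1.0, 2.0, 5.0])
LIBRARY = {
    "model_A": [np.array([0.0, 0.3, 0.6, 1.0]), np.array([0.0, 0.4, 0.7, 1.0])],
    "model_B": [np.array([0.0, 0.1, 0.3, 1.0])],
}
MODEL_NAMES = list(LIBRARY)


def simulate_once(group_depths_m, group_sizes, rng, max_damage=100_000.0):
    """One Monte Carlo run: event-level model choice, then a function per group."""
    # Step 1 (event level): one flood damage model for the whole event.
    functions = LIBRARY[MODEL_NAMES[rng.integers(len(MODEL_NAMES))]]
    total = 0.0
    for depth, n_houses in zip(group_depths_m, group_sizes):
        # Sub-event level: every house in a group shares one randomly picked function.
        fractions = functions[rng.integers(len(functions))]
        total += np.interp(depth, DEPTHS_M, fractions) * max_damage * n_houses
    return total


rng = np.random.default_rng(seed=42)
depths = [0.5, 1.5, 2.5]  # assumed water depth per group (m)
sizes = [10, 10, 100]     # houses per group
damages = np.array([simulate_once(depths, sizes, rng) for _ in range(10_000)])
```

The array `damages` then plays the role of the distribution of possible damages; functions from different models are never mixed within one run.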

Maximum damage
The  of EUR 70 000, for which here also a triangular distribution is assumed, at ± EUR 50 000. All these maximum damages are at price level 2011. These assumptions lead to a symmetric probability distribution, whereas the real distribution is probably positively skewed. This skewness is neglected in this study because it is difficult to estimate and its impact on the uncertainty is expected to be very small.
The maximum damage for companies is assessed in this study as the maximum damage per job times the number of jobs per company. Gauderis (2012) estimated the material value per job for 62 different categories of companies. These estimates are combined to produce a distribution of the physical value of a company per job. Because not all company categories are equally common, the values were weighted in the distribution by their frequency in the Netherlands. This more complex method was used in order to incorporate the skewness correctly and because the data from Gauderis (2012) make this possible, which was not the case for houses. The results are shown in Fig. 3. These values include both the structure and the content. The assumptions of Gauderis (2012) on which part of the maximum damage belongs to the structure and which part to the content were adopted.
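As a rough sketch of the weighted sampling for companies, the snippet below draws a value per job from a small weighted set and multiplies it by the number of jobs. The three values and weights are placeholders, not data from Gauderis (2012), which distinguishes 62 categories.

```python
import random

# Hypothetical inputs: material value per job (EUR) for three company
# categories and how often each category occurs. Not values from Gauderis (2012).
VALUE_PER_JOB = [20_000, 60_000, 250_000]
CATEGORY_WEIGHT = [5_000, 3_000, 200]

def sample_company_max_damage(n_jobs, rng):
    """Maximum damage of one company: sampled value per job times its jobs."""
    value = rng.choices(VALUE_PER_JOB, weights=CATEGORY_WEIGHT, k=1)[0]
    return value * n_jobs

rng = random.Random(3)
samples = [sample_company_max_damage(12, rng) for _ in range(10_000)]
```

Because rare categories with a high value per job remain in the set, the sampled distribution is positively skewed, which is the property the weighting is meant to preserve.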

Monte Carlo analysis behavior: trial of the method on hypothetical flood maps
To gain understanding of the Monte Carlo analysis behavior, the analysis was tested on hypothetical flood depth maps, one with small water depths (< 0.5 m), one with medium water depths (0.5-2 m) and one with large water depths (2-3 m). These had average water depths of 0.35, 1.25 and 2.5 m. These maps were used for calculations with 150 and 15 000 houses and jobs; thus in total, six different trials were carried out and the resulting uncertainty values were compared. The uncertainties in the damage estimates are expressed with the coefficient of variation. This is the standard deviation of the damage divided by the mean of the damage. It has no unit and is therefore independent of the size of the flood event. This makes it a good measure to compare the uncertainties in different areas. Figure 4 shows the results of this hypothetical analysis. It stands out that both a smaller water depth and a smaller area increase the uncertainty significantly. The effect of water depth arises because at small water depths, the different flood damage models differ significantly more from each other than at large water depths. This indicates that the uncertainty in damage estimates for events such as small regional levee failures is much larger than the uncertainty in damage estimates for large-scale floods with large water depths.
Another observation is that the distribution of the damage for small events looks very different from the distribution of the damage for large events. The main reason is that for large events, the law of large numbers significantly reduces the aleatory uncertainty in the flood damage, but not the epistemic uncertainty.
Epistemic uncertainty is therefore the dominant uncertainty for larger events, and the frequency distributions then show clearly separate peaks related to the damage functions of the individual flood damage models.
It is difficult to determine at what event size the variation between the flood damage models (epistemic uncertainty) becomes more important than the variation within the flood damage models (aleatory uncertainty). For the uncertainty model created in this paper, this point lies somewhere between 100 and 3000 houses plus jobs. This critical size depends on the dependencies between individual objects, which determine how fast the law of large numbers reduces the aleatory uncertainty. In this paper, these dependencies were approximated by sampling in groups instead of sampling individual objects. The size of these groups therefore determines when the epistemic uncertainty becomes dominant. The group sizes were based on a rough estimate, and further research is needed to refine them.
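The coefficient of variation defined above can be computed as in the short sketch below. The damage values are hypothetical, and the choice of the population (rather than sample) standard deviation is an assumption; the text does not specify which was used.

```python
import statistics

def coefficient_of_variation(damages):
    """Population standard deviation divided by the mean (dimensionless)."""
    return statistics.pstdev(damages) / statistics.fmean(damages)

small_event = [8.0, 12.0, 16.0, 20.0]          # hypothetical damages, EUR million
large_event = [d * 1000 for d in small_event]  # same relative spread, larger event
cv_small = coefficient_of_variation(small_event)
cv_large = coefficient_of_variation(large_event)
```

Scaling all damages by the same factor leaves the result unchanged, which is why the measure allows events of different sizes to be compared.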
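The interplay between the two uncertainty types can be illustrated with a toy two-level simulation. Everything numeric here is an assumption made for illustration: three hypothetical model factors stand in for the epistemic spread between damage models, and a gamma-distributed factor with mean 1 stands in for the aleatory spread between groups.

```python
import random
import statistics

MODEL_FACTORS = [0.5, 1.0, 2.0]  # hypothetical relative damage levels of three models

def simulate_total(n_objects, group_size, rng):
    """Total damage (arbitrary units) for one Monte Carlo simulation."""
    model = rng.choice(MODEL_FACTORS)  # epistemic: one draw per simulation
    n_groups = max(1, n_objects // group_size)
    total = 0.0
    for _ in range(n_groups):
        # Aleatory: one draw per group; gamma(0.25, 4) has mean 1 and CV 2.
        local = rng.gammavariate(0.25, 4.0)
        total += model * local * group_size
    return total

def total_cv(n_objects, group_size=10, n_sim=400, seed=1):
    rng = random.Random(seed)
    totals = [simulate_total(n_objects, group_size, rng) for _ in range(n_sim)]
    return statistics.pstdev(totals) / statistics.fmean(totals)

cv_small = total_cv(150)
cv_large = total_cv(15_000)
```

In this sketch the coefficient of variation for 15 000 objects settles near the spread of the model factors alone, while the value for 150 objects stays clearly higher. Increasing `group_size` slows the averaging-out of the aleatory part, which mirrors the statement that the group size determines when the epistemic uncertainty becomes dominant.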

Case study
A case study is done in the Betuwe area, Tieler-en Culemborgerwaarden (dike ring 43) in the Netherlands, to show the effect of uncertainty in the flood damage estimation on investment decisions for flood risk management. Dike ring 43 is located between Rhine branches in the Netherlands. In the west, the area is closed with a high dike (border to next dike ring area). The area slopes down to the west. The difference in height between the eastern and western part is about 10 m.
The Monte Carlo analysis is applied to a water depth map resulting from a simulated dike breach (VNK, 2014) along the Rhine River, near Bemmel in the Netherlands (see Fig. 5 for its location). This dike section is about 26 km long. Bemmel is situated in the eastern upstream part of the Betuwe area. When the dike breaches, water flows through the Betuwe area to the west, where it is stopped after about 70 km by the western embankment. The maximum water depths due to this dike breach vary from less than 50 cm in the east to over 5 m in the west. In this dike breach scenario a total area of 626 km² is inundated. This area contains several small towns and villages, with a total population of around 300 000 people. The large flood extent, the large number of affected residences and companies and the large variation in water depths are expected to have a reducing effect on the aleatory uncertainty in the total damage of the dike ring area.
The damage assessed for this flood scenario was EUR 16 billion (price level 2011) with a standard deviation of EUR 5.6 billion, based on 10 000 simulations. The resulting damage outcomes are shown in Fig. 6. The peaks in Fig. 6 are related to the damage models and illustrate the large differences between them (two damage models overlap).
Table 3. Optimal investment strategy given different damage estimates. The flood protection standard is a return period based on the direct method described in Kind (2011); note that the actual return period of the investment strategy differs per year. Price level 2011 is used.
The results in Fig. 6 are used to find the economically optimal flood protection standard and investment strategy for the dike segment. These are calculated using a simplified version of the approach of Kind (2013). Kind (2013) assesses which investment strategy (set of dike improvements at different moments) has the smallest total future cost and hence is the economic optimum. This total cost consists of the discounted expected annual damage (EAD), taking future changes into account, and the discounted future investments in the dike. The expected annual damage depends on the flood probability and the flood damage given a dike breach (as calculated in this case study). Long-term economic growth forecasts are used to increase the damage for every year in the future. Furthermore, a correction factor was used to take indirect damage into account. The flood probability depends on the quality of the dike, which in turn depends on the investments carried out. This flood probability is calculated for each future year based on the current status of the dike, the investments up to that point, consolidation of the dike and climate change predictions.
Investments in the dike are simplified as height increases and the cost of these investments is based on fixed and variable cost for the dike segment considered. A height increase is converted into a flood probability reduction based on a parameter that describes the height increase necessary to decrease the flood probability by a factor of 10. This parameter and all other parameters used for determining the optimal investment strategy (except for the flood damage) were taken from the WV21 project (Kind, 2011).
In this paper we assess the effects of uncertainty in damage estimates on the economically optimal flood protection standard and the total investment costs. We do so by determining the investment strategy for five different damage estimates. The first four estimates correspond to the first four peaks in Fig. 6. For the highest damage estimate, the 98th percentile of the damage outcomes was used.
The analysis in this paper focuses on the first investment made. In all five alternatives this investment is made in 2015. In all alternatives, the second investment is planned about 75 years later, and a third investment is suggested about 50 years after the second one (around the end of the time span considered). The total investment costs are mainly determined by the first investment, because the net present value gives future costs and benefits a much lower weight than current costs and benefits. The calculations assumed a discount rate of 5.5 % (based on WV21; Kind, 2011).
The results in Table 3 show that the optimal investment strategy is at first glance not very sensitive to the precise damage estimate. The difference between the five alternatives in required dike heightening is only 18 cm (88 − 70 cm). This small difference is partly explained by the strong sensitivity of the flood probability to the precise height of the dike. The dike segment in this case study becomes 10 times safer by raising it only 34 cm. If the flood probability were less sensitive to height changes, the differences in dike height between a low and a high damage estimate could be much larger. If a height increase of 1 m were needed to reduce the flood probability by a factor of 10, the difference between the highest and the lowest damage estimate would be 47 cm.
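The small weight of later investments follows directly from discounting. The sketch below uses the standard present-value factor with the 5.5 % rate and the approximate timing given in the text; treating the third investment as occurring 125 years after the first is an approximation derived from the stated intervals.

```python
def discount_factor(rate, years):
    """Present-value weight of a cost incurred `years` from now."""
    return 1.0 / (1.0 + rate) ** years

RATE = 0.055                                 # discount rate used in the case study
weight_second = discount_factor(RATE, 75)    # second investment, about 75 years later
weight_third = discount_factor(RATE, 125)    # third investment, about 50 years after that
```

With these inputs, a cost incurred 75 years ahead carries a weight of roughly 2 % of the same cost today, and the third investment weighs far less still.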
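The sensitivity argument can be made concrete with a small sketch. It assumes a log-linear relation in which every 34 cm of heightening divides the flood probability by 10; this functional form is inferred from the description of the factor-of-10 parameter, and the current flood probability of 1/100 per year is a hypothetical placeholder.

```python
H10_M = 0.34  # heightening (m) that makes this dike segment 10 times safer

def flood_probability_after_raise(p_old, raise_m, h10_m=H10_M):
    """Flood probability after heightening, assuming a log-linear relation."""
    return p_old * 10 ** (-raise_m / h10_m)

P_CURRENT = 1 / 100  # hypothetical current annual flood probability
p_low = flood_probability_after_raise(P_CURRENT, 0.70)   # smallest heightening
p_high = flood_probability_after_raise(P_CURRENT, 0.88)  # largest heightening
ratio = p_low / p_high
```

Under this assumption, the 18 cm difference between the alternatives corresponds to a difference in flood probability of a factor of about 3.4, which shows how a modest change in height can absorb a large change in the damage estimate.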
If the flood damage applied in the cost benefit analysis differs from the flood damage that would actually occur, a suboptimal investment strategy is applied. Table 4 shows the costs of using a wrong damage estimate. It gives the unnecessary cost incurred by assuming a certain damage for different real damage values. This cost varies in this case study between EUR 0 and 12 million and is on average about EUR 2 million, which is 1.4 % of the total cost and for this case study about EUR 75 000 km⁻¹. The maximum error is 9 % of the total cost and for this case study EUR 500 000 km⁻¹ (price level 2011).
This case study illustrates how the Monte Carlo analysis may be used to assess the uncertainty in damage assessments, and how the effect of this uncertainty on investment costs may be determined. In the case study here, the effect is small. However, if we take into account the fact that in the Netherlands we have about 3000 km of embankments and that EUR 12 million might be unnecessarily spent per 26 km, the total amount of money spent unnecessarily may then be large. It is also likely that in cases with lower flood probability standards, or with smaller flood events, the effects of this uncertainty are much larger.
A striking observation in the results of Table 4 is that the costs of overestimating the damage are significantly lower than the costs of underestimating the damage. The difference in costs is on average a factor of 2 (see Table 4). This can be explained by the nonlinear relationship between the flood probability reduction and the investment costs. The flood probability can be reduced a lot with a small extra investment; thus when too little is invested, the EAD increases faster than the investment cost decreases. This implies that under uncertainty, it would be economically efficient to add a safety factor to avoid investing too little.
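The asymmetry can be reproduced qualitatively with a strongly simplified model of a single investment. This is not the approach of Kind (2013): the current flood probability, the linear investment cost and the perpetuity factor (which ignores economic growth, climate change and later investments) are all hypothetical simplifications; only the 34 cm parameter, the 5.5 % discount rate and the EUR 16 billion damage come from the case study.

```python
P0 = 1 / 100          # hypothetical current annual flood probability
H10 = 0.34            # m of heightening per factor-10 reduction (case study)
COST_PER_M = 100e6    # hypothetical variable investment cost, EUR per m
RATE = 0.055          # discount rate from the case study
PV_FACTOR = 1 / RATE  # perpetuity factor; ignores growth and climate change

def total_cost(raise_m, damage):
    """Investment cost plus discounted expected annual damage."""
    risk = P0 * 10 ** (-raise_m / H10) * damage * PV_FACTOR
    return COST_PER_M * raise_m + risk

def optimal_raise(damage):
    """Grid search (1 mm steps, 0 to 3 m) for the cheapest heightening."""
    grid = [i / 1000 for i in range(3001)]
    return min(grid, key=lambda h: total_cost(h, damage))

def regret(assumed, real):
    """Extra cost of designing for `assumed` when the damage is `real`."""
    return total_cost(optimal_raise(assumed), real) - total_cost(optimal_raise(real), real)

real_damage = 16e9
regret_under = regret(real_damage / 2, real_damage)  # damage underestimated by half
regret_over = regret(real_damage * 2, real_damage)   # damage overestimated twofold
```

In this sketch, underestimating the damage by a factor of 2 costs more than overestimating it by the same factor, because the risk term grows exponentially as the dike gets lower while the investment saving only grows linearly.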

Discussion
This paper presents a new method for the quantification of uncertainties and applies this method in a case study. The case study is a good illustration of the method and its use, but the calculated uncertainty, the damage frequency distribution and the effect of uncertainty on investment decisions may not be representative of all situations. First of all, several damage-determining aspects were neglected in the case study. The damage is assumed to consist only of damage to buildings and companies. Other damage categories, such as affected persons and fatalities, may also be relevant and can be taken into account in a cost benefit analysis (CBA). Another simplification is that the entire cost benefit analysis in the case study is based on only one flood scenario, at one breach location and at one water level (the design water level of the dike). A more precise approach would have included multiple breach locations and water levels. These effects are, however, assumed to be negligible for the conclusions of this paper because, in most cases, the uncertainties in the different damage estimates will be similar and highly correlated (since they concern the same area).
Secondly, the results may not be representative of all situations because the exact location and number of peaks in the damage frequency distribution depend on the damage models used as input to the uncertainty analysis. The set of seven damage models used does not cover all possible damage models. If an extra flood damage model were added to the damage function library, an entirely new peak could appear. The frequency distributions of the outcomes must therefore be considered as an example of what a frequency distribution could look like and of approximately how far apart the peaks are. It is impossible to make a real frequency distribution because the major uncertainties are epistemic uncertainties. Epistemic uncertainties are by definition not understood and can therefore not be represented by a frequency distribution (Helton and Oberkampf, 2004). However, there are alternative concepts to describe uncertainty, such as imprecise probabilities or Bayesian statistics, that deal with this problem (Reichert and Omlin, 1997; Zadeh, 2005).
Thirdly, the cost of a wrong estimate, which was estimated for the case at about 1 % and at a maximum of 10 % of the total cost, may also be different for other cases. It depends, among other things, on the cost required to reduce the failure probability by a factor of 10, on the damage itself and on the uncertainty in the damage interaction (which will be larger for small areas and areas with small water depths). Finally, the uncertainty in the damage estimate was, in this case study, directly linked to an error in the investment strategy. However, in the determination of the optimal investment strategy, not only uncertainties in the damage estimate but also uncertainties in other components play an important role. Uncertainty in the costs of dike strengthening, in the discount rate, in the future economic growth, in the flood pattern and so on, all add to the uncertainty in the optimal investment strategy. These uncertainties may partly compensate each other, but can also amplify each other. Their relative importance differs per case, depending on local characteristics (De Moel et al., 2014).
We tried to combine information from different damage models to get a better quantification of uncertainties in damage outcomes. This can only be done when all the damage models are applicable to the flood scenario being modeled. Whether damage models are equally applicable is sometimes difficult to establish. Metadata on the source of the damage models are not always available, and sometimes information on the event on which a model is based is also lacking. This makes it difficult to compare damage models and to understand why they give different estimates for the same flood patterns. Relevant metadata are needed on parameters that may be obvious for a certain event but vary from event to event. Examples of such parameters are the flood experience of the population, building style, flood duration and contamination of the flood water.
Metadata for flood damage functions should state clearly for which types of events the damage functions are applicable and for which they are not. This could lead to a classification of different flood types with their own damage functions. This would first lead to a better transferability of damage functions and maximum damages and could eventually lead to generally applicable flood damage models.
Vogel et al. (2012) and Schröter et al. (2014) also carried out detailed uncertainty analyses for flood damage assessment. The most obvious difference with these studies is that this paper uses a Monte Carlo approach, while they use fundamentally different methods. However, the more interesting difference is that this paper assumes a situation where no good local data are available and little is known about the expected conditions during the potential flood (apart from the maximum water depth). Therefore, this paper used relatively simple data from many different countries and flood types as input for the uncertainty analysis, while these other papers used relatively complex data from Germany only. The strength of this approach therefore is that it covers a wider part of the spectrum of possible flood damage. The disadvantage of this approach is that it is not applicable when a good local flood damage model, based on a large amount of data, is available.

Conclusion
Uncertainties in flood damage estimates can be large. This study showed uncertainties in the order of a factor of 2 to 5. This uncertainty is mainly caused by a lack of knowledge. Most flood damage models are based on data resulting from a small number of events. Because flooding can occur in many different ways (water depths, contamination, flow velocities, flood durations, etc.) and in many different types of areas (building types, flood experience of the local population), any model will miss considerable parts of the spectrum of possible options. Data from one event are therefore often not transferable to other areas or events. Since only data representative of the event under consideration can be used, few data are available and hence large uncertainties are introduced in flood damage modeling.
This study introduced a method to quantify these uncertainties using a set of damage models which have all been applied in the past to river floods (not flash floods) or storm surges in developed countries. To quantify the uncertainty, a distinction was made between epistemic and aleatory uncertainties. Epistemic uncertainties are introduced by a lack of knowledge about the spectrum of possible flood events and areas in which they could occur. The size of this spectrum was estimated for this study by using the difference between flood damage models. Aleatory uncertainties are introduced by local variations between objects and circumstances. These uncertainties were estimated for this study with the variations within different flood damage models.
These aleatory uncertainties are large for small flood events and much smaller for large flood events that affect many objects. Epistemic uncertainties are not smaller for large areas, since they are not related to deviations of single objects from the average object for which the damage functions were derived. Epistemic uncertainties can only be reduced with new knowledge. The resulting Monte Carlo analysis therefore shows larger uncertainties for small areas. However, at a certain event size, the epistemic uncertainties become dominant.
These uncertainties in flood damage modeling can potentially have a significant effect on investment decisions. In this study a case study was carried out to calculate the economically optimal investment strategy for a dike segment. This case study showed that uncertainties in damage estimates can lead to suboptimal investment decisions. In the worst case scenario (maximum error in damage estimate), the difference in total cost (remaining risks and investment cost) between the optimal and the suboptimal investment strategy may be as high as EUR 500 000 km⁻¹ of dike (price level 2011). The expected difference was, however, significantly lower (EUR 75 000 km⁻¹ of dike). These findings need to be verified with further research in other areas.
The paper provides a good first approach for uncertainty quantification in damage estimates and shows how this approach can be used to improve investment decisions. Further research including other areas and more flood events is recommended to develop the approach further.