Multi-variable flood damage modelling with limited data using supervised learning approaches

Wagenaar, Dennis; de Jong, Jurjen; Bouwer, Laurens M.

doi:https://doi.org/10.5194/nhess-17-1683-2017

Articles | Volume 17, issue 9

© Author(s) 2017. This work is distributed under
the Creative Commons Attribution 3.0 License.

Special issue: Damage of natural hazards: assessment and mitigation

Research article | 29 Sep 2017

Dennis Wagenaar, Jurjen de Jong, and Laurens M. Bouwer

Abstract. Flood damage assessment is usually done with damage curves only dependent on the water depth. Several recent studies have shown that supervised learning techniques applied to a multi-variable data set can produce significantly better flood damage estimates. However, creating and applying a multi-variable flood damage model requires an extensive data set, which is rarely available, and this is currently holding back the widespread application of these techniques. In this paper we enrich a data set of residential building and contents damage from the Meuse flood of 1993 in the Netherlands, to make it suitable for multi-variable flood damage assessment. Results from 2-D flood simulations are used to add information on flow velocity, flood duration and the return period to the data set, and cadastre data are used to add information on building characteristics. Next, several statistical approaches are used to create multi-variable flood damage models, including regression trees, bagging regression trees, random forest, and a Bayesian network. Validation on data points from a test set shows that the enriched data set in combination with the supervised learning techniques delivers a 20 % reduction in the mean absolute error, compared to a simple model only based on the water depth, despite several limitations of the enriched data set. We find that with our data set, the tree-based methods perform better than the Bayesian network.
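The comparison described in the abstract (a depth-only model against a tree-based multi-variable model, scored by mean absolute error on a test set) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic rather than the Meuse 1993 data set, the damage formula and the functions `make_synthetic`, `fit_tree` and `fit_bagged` are hypothetical, and the depth-only baseline is approximated by a regression tree restricted to water depth rather than by an established damage curve. The ensemble shown is plain bagging of regression trees; a random forest would additionally sample a random subset of features at each split.

```python
import random
from statistics import mean

def make_synthetic(n, rng):
    """Hypothetical data: relative damage driven by depth (m) and velocity (m/s)."""
    rows = []
    for _ in range(n):
        depth = rng.uniform(0.0, 3.0)
        velocity = rng.uniform(0.0, 2.0)
        damage = 0.15 * depth + 0.10 * velocity + rng.gauss(0.0, 0.03)
        rows.append(((depth, velocity), min(max(damage, 0.0), 1.0)))
    return rows

def sse(ys):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = mean(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(rows, features, depth=0, max_depth=4, min_leaf=5):
    """Fit a small regression tree; a leaf is a float, a node is a tuple."""
    ys = [y for _, y in rows]
    if depth >= max_depth or len(rows) < 2 * min_leaf:
        return mean(ys)
    best = None
    for f in features:
        values = sorted({x[f] for x, _ in rows})
        step = max(1, len(values) // 20)  # limit the number of candidate thresholds
        for t in values[step::step]:
            left = [r for r in rows if r[0][f] < t]
            right = [r for r in rows if r[0][f] >= t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return mean(ys)
    _, f, t, left, right = best
    return (f, t,
            fit_tree(left, features, depth + 1, max_depth, min_leaf),
            fit_tree(right, features, depth + 1, max_depth, min_leaf))

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's mean damage."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] < t else right
    return tree

def fit_bagged(rows, features, n_trees, rng):
    """Bagging: fit each tree on a bootstrap resample of the training rows."""
    return [fit_tree([rng.choice(rows) for _ in rows], features)
            for _ in range(n_trees)]

def predict_bagged(trees, x):
    """Average the predictions of all bagged trees."""
    return mean(predict(t, x) for t in trees)

def mae(model_predict, rows):
    """Mean absolute error of a prediction function on labelled rows."""
    return mean(abs(model_predict(x) - y) for x, y in rows)

rng = random.Random(0)
data = make_synthetic(400, rng)
train, test = data[:300], data[300:]

depth_only = fit_tree(train, features=[0])          # water depth only
bagged = fit_bagged(train, features=[0, 1], n_trees=10, rng=rng)

mae_depth = mae(lambda x: predict(depth_only, x), test)
mae_multi = mae(lambda x: predict_bagged(bagged, x), test)
print(f"depth-only MAE: {mae_depth:.3f}, bagged multi-variable MAE: {mae_multi:.3f}")
```

Because the synthetic damage depends on velocity as well as depth, the multi-variable ensemble has information the depth-only model lacks; the size of any improvement here says nothing about the 20 % reduction reported for the real data set.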

Received: 04 Jan 2017 – Discussion started: 12 Jan 2017 – Revised: 25 Jul 2017 – Accepted: 30 Jul 2017 – Published: 29 Sep 2017