Natural hazards affect many types of tangible assets, the most valuable of which are often residential assets, comprising buildings and household contents. Yet, information necessary to derive exposure in terms of monetary value at the level of individual houses is often not available. This includes building type, size, quality, or age. In this study, we provide a universal method for estimating exposure of residential assets using only publicly available or open data. Using building footprints (polygons) from OpenStreetMap as a starting point, we utilized high-resolution elevation models of 30 European capitals and pan-European raster datasets to construct a Bayesian-network-based model that is able to predict building height. The model was then validated with a dataset of (1) buildings in Poland endangered by sea level rise, for which the number of floors is known, and (2) a sample of Dutch and German houses affected in the past by fluvial and pluvial floods, for which usable floor space area is known. Floor space of buildings is an important basis for approximating their economic value, including household contents. Here, we provide average national-level gross replacement costs of the stock of residential assets in 30 European countries, in nominal and real prices, covering the years 2000–2017. We either relied on existing estimates of the total stock of assets or made new calculations using the perpetual inventory method, which were then translated into exposure per square metre of floor space using data on countries' dwelling stocks. The study shows that the resulting standardized residential exposure values provide much better coverage and consistency compared to previous studies.
Residential assets are typically the most valuable components of national wealth
Modelling damages to residential buildings requires quantifying their exposure in terms of monetary value. This is particularly important as exposure was found to be the primary driver of long-term changes in damages due to natural hazards in Europe and other continents
Information on building characteristics, including floor space area, is not uniformly available. Many studies rely on national or local administrative spatial databases such as cadastres which record multiple characteristics of buildings such as occupancy, usable floor space or number of floors
Values of residential buildings are typically compiled per particular case study. A typical source of this information is local insurance industry practices
In this paper we develop a universal method of estimating exposure of residential assets at the level of individual buildings. It covers both building structure and household contents for application, at the very least, to the European Union (EU) member states. We focus on the approach that considers the total value of buildings and contents as a product of the usable floor space area of a building and the average gross replacement cost of buildings and contents per square metre in a given territory. Additionally, we use only publicly available datasets to achieve the task. The methodology is applicable to any location within the 30 countries covered by this study. The building size estimation routine is validated on a set of natural-hazards-related case studies. Our estimates of the current gross replacement costs of buildings and household contents are provided at a national level from 2000 to 2017 to facilitate their use in assessments of past natural disasters.
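The valuation principle stated above (value as the product of usable floor space and a national average replacement cost per square metre) can be illustrated with a minimal sketch. The function name and all numbers are hypothetical placeholders chosen for illustration; they are not values from this study.

```python
def residential_exposure(floor_space_m2, building_cost_per_m2, contents_cost_per_m2):
    """Return building, contents, and total exposure for one residential building.

    All inputs are assumptions supplied by the user: usable floor space in
    square metres and national average gross replacement costs per square metre.
    """
    if min(floor_space_m2, building_cost_per_m2, contents_cost_per_m2) < 0:
        raise ValueError("inputs must be non-negative")
    building_value = floor_space_m2 * building_cost_per_m2
    contents_value = floor_space_m2 * contents_cost_per_m2
    return {
        "building": building_value,
        "contents": contents_value,
        "total": building_value + contents_value,
    }


# Hypothetical house: 120 m2 of usable floor space, placeholder unit costs in EUR per m2.
example = residential_exposure(120.0, 1500.0, 400.0)
```

Keeping the building and contents components separate mirrors the distinction made throughout the paper, since damage models often treat the two differently.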
The workflow of the paper is presented in Fig.
Workflow of the study. Boxes are coloured according to categories explained in the legend. In the top left corners of the boxes are references to relevant sections of this paper. In the top right corners of the boxes are references to figures, tables, supplementary tables (S.Tab.) in Supplementary Information 1, equations, and Supplementary Information 2 (S.Inf. 2).
Applying a building-level damage model requires information on the analysed objects such as size and value. Before those quantities can be calculated, residential buildings have to be identified in the area of interest. A variety of cartographic sources can be used depending on local availability, from governmental databases to topographic maps and remote sensing. The problem of accurately identifying buildings and occupancy, especially with open data, is outside the scope of this paper as this issue is still subject to intense research
We obtained the OSM building and land use layers to develop the building size estimation method. The download was carried out during 22–25 January 2019 through Overpass API, a system that allows us to obtain custom selections of OSM data
Once residential buildings, i.e. their footprints, are obtained, their size in terms of usable floor space area needs to be derived. The usable (also called useful) floor area of a dwelling is the total area of the rooms, kitchen, foyers, bathrooms, and all other spaces within the dwelling's outer walls. Cellars, uninhabitable attics, and, in multiple-occupancy houses, common areas are excluded
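As a rough illustration of the chain from footprint to usable floor space, the sketch below converts a building height into a number of floors under a constant storey height and then scales the footprint. The storey height of 3 m, the usable share of 0.8, the rounding rule, and the function name are all assumptions made for this example; the study's own equation and parameter values are not reproduced here.

```python
import math


def usable_floor_space(footprint_m2, height_m, storey_height_m=3.0, usable_share=0.8):
    """Estimate (number of floors, usable floor space in m2) for one building.

    storey_height_m and usable_share are hypothetical placeholders, not
    parameters taken from the study.
    """
    if footprint_m2 <= 0 or height_m <= 0:
        raise ValueError("footprint and height must be positive")
    # Round half up to the nearest whole floor; every building has at least one floor.
    floors = max(1, math.floor(height_m / storey_height_m + 0.5))
    return floors, footprint_m2 * floors * usable_share


# A 100 m2 footprint and a 9 m high building give three floors under these assumptions.
floors, area = usable_floor_space(100.0, 9.0)
```

Because a single storey height is assumed for all buildings, any real variation in floor heights translates directly into error in the floor count.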
Variables considered for the building height prediction model. Abbreviations are shown for variables included in the final model (Fig.
A Bayesian network is a graphical, probabilistic model which allows multivariate dependency analysis and provides uncertainty distributions of the predictions made with it. BNs are directed acyclic graphs consisting of nodes (representing random variables) and arcs indicating the dependency structure
Building height was derived from a high-resolution digital surface model “Building Height 2012” by the
The final model is presented in Fig.
A Bayesian network for predicting residential building height. Values on the arcs represent the (conditional) rank correlation; values under the histograms showing the probability density function are the mean and standard deviation of the marginal distributions, with density on the
The dependencies defined in the model can be explained theoretically as follows. Firstly, high population density was highly correlated with height, as one might expect the presence of tall residential buildings (high-rises, tower blocks) in densely populated cities. High buildings also typically have a large footprint compared to single-family houses. Finally, the height of buildings is correlated with soil sealing, as urban districts with apartment blocks are largely covered by artificial surfaces providing supporting services to the buildings, such as roads, sidewalks, parking lots, etc. Such surfaces reduce the perviousness of the soil. On the other hand, small single-family houses are more typically found in less densely built-up and populated suburban zones.
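To illustrate how a Gaussian copula combines empirical marginal distributions with a correlation-based dependence, the sketch below conditions building height on a single variable, the footprint area. It is not the model of this study: the bivariate setup, the function names, and the toy data are assumptions made for illustration, and the copula correlation is estimated here from normal scores rather than specified through (conditional) rank correlations on the arcs of a network.

```python
from statistics import NormalDist, mean

_STD = NormalDist()


def normal_scores(values):
    """Map values to standard-normal scores through their empirical ranks."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = _STD.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def predict_height(footprints, heights, new_footprint, n_quantiles=99):
    """Return (mean, draws) of height conditional on a footprint area."""
    rho = pearson(normal_scores(footprints), normal_scores(heights))
    # Empirical CDF of the new footprint, kept strictly inside (0, 1).
    n = len(footprints)
    count = sum(1 for v in footprints if v <= new_footprint)
    z_x = _STD.inv_cdf(max(count, 1) / (n + 1))
    # Conditional distribution of the height score under a bivariate normal copula.
    sigma = max((1.0 - rho ** 2) ** 0.5, 1e-9)
    conditional = NormalDist(mu=rho * z_x, sigma=sigma)
    sorted_heights = sorted(heights)
    draws = []
    for k in range(1, n_quantiles + 1):
        z_h = conditional.inv_cdf(k / (n_quantiles + 1))
        # Map the score back to the empirical marginal distribution of height.
        idx = min(int(_STD.cdf(z_h) * len(sorted_heights)), len(sorted_heights) - 1)
        draws.append(sorted_heights[idx])
    return mean(draws), draws


# Toy sample (hypothetical): footprint areas in m2 and building heights in m.
toy_footprints = [50, 80, 100, 120, 200, 400, 600, 900]
toy_heights = [5, 6, 4.5, 7, 9, 15, 12, 25]
mean_small, _ = predict_height(toy_footprints, toy_heights, 60)
mean_large, _ = predict_height(toy_footprints, toy_heights, 800)
```

The list of draws approximates the uncertainty distribution of the prediction, while its mean corresponds to the kind of expected value used for validation in this paper.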
The accuracy of the model is analysed in Sect.
The described routine can be applied to any location in Europe for which at least the building footprint area is known. The Bayesian network model can be used when data for any variable are missing, though the building footprint is required for Eq. (
Predictions of building height, number of floors, and floor space area are compared with observations using several error metrics, including Pearson's coefficient of determination (R²). Mean absolute error (MAE) was used to measure the average absolute difference between predicted and observed values, with higher MAE indicating higher error. Mean bias error (MBE) was used to measure the average difference between predicted and observed values, with positive MBE indicating overprediction and negative MBE indicating underprediction. Symmetric mean absolute percentage error (SMAPE) normalizes MAE by considering the absolute values of predictions and observations, with a value close to 0 indicating a small error compared to the variability of the phenomena in question. Root-mean-square error (RMSE) was used to measure the difference between predicted and observed values, with a higher RMSE indicating higher error.
Equations for the listed measures are shown in Table S2. For validation purposes, we use the predictions as mean (expected) values of the uncertainty distribution of the variables of interest per data point (building). We also analyse the uncertainty of the height prediction model and perform an out-of-sample validation.
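The error measures described above can be computed as in the following sketch. The exact definitions used in the study are those of Table S2; the formulas below are common textbook forms, and in particular the SMAPE variant (sum of absolute errors divided by the sum of absolute predictions and observations) is an assumption, as several variants exist.

```python
from math import sqrt


def validation_metrics(predicted, observed):
    """Return R2, MAE, MBE, SMAPE, and RMSE for paired predictions and observations.

    Assumes both sequences vary (non-zero variance) and are not all zero.
    """
    n = len(observed)
    errors = [p - o for p, o in zip(predicted, observed)]
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return {
        # Squared Pearson correlation coefficient.
        "R2": cov ** 2 / (var_p * var_o),
        "MAE": sum(abs(e) for e in errors) / n,
        # Positive values indicate overprediction, negative values underprediction.
        "MBE": sum(errors) / n,
        # Assumed SMAPE variant; check Table S2 for the definition used in the study.
        "SMAPE": sum(abs(e) for e in errors)
        / sum(abs(p) + abs(o) for p, o in zip(predicted, observed)),
        "RMSE": sqrt(sum(e * e for e in errors) / n),
    }


# Hypothetical predicted and observed numbers of floors for three buildings.
metrics = validation_metrics([2, 4, 6], [1, 4, 8])
```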
An out-of-sample validation of building heights was done individually for each of the 30 capital cities contained in the sample quantifying the BN. Validation for all cities collectively was performed using 10-fold cross-validation. Predictions of building heights transformed into the number of floors were validated using a large (
Validation of floor space area predictions was carried out using results of post-disaster household surveys covering six river floods and three flash floods that affected Germany between 2002 and 2014 and a river flood along the river Meuse in the Netherlands in 1993
When the floor space of a building is known, it is multiplied by the average replacement cost of dwellings and household contents per square metre. The total floor space of dwellings in a country is available for European countries due to recording of this information in population and housing censuses, sometimes also in household surveys
Statistical institutes in most European countries are recording the stock of fixed assets, including dwellings, for purposes of national accounting
The remaining EU countries and three other western European nations (Iceland, Norway, Switzerland) required more data collection efforts. According to the European System of Accounts (ESA) 2010 manual
Three quantities are needed to obtain the stock of dwellings
Parameters
For a further four countries, where data on investment are limited, but the balances of the number of buildings and their floor space are available, a modified PIM was applied. In those cases, we computed an initial estimate of the stock of dwellings (Bulgaria in 1999, Latvia in 2000, Poland and Romania in 1995) based on national construction costs in the base year, and then we used annual data on investments in, and retirement of, dwellings in the country to arrive at a time series of the gross stock. In this case Eq. (
Data availability for the stock of household contents is much lower than for dwellings. This item is termed in national accounting “consumer durables” and assumed to be consumed within the accounting period, rather than accumulated, as those durables are not relevant from the perspective of economic production processes. As such, they are considered memorandum items in ESA 2010
In order to estimate the stock of household contents, the PIM is applied again. However, the contents consist of various durables of different service lives; therefore Eq. (
Final consumption expenditure data were collected from Eurostat, OECD, and national statistical institutes. Due to the very long estimated service life of durables in the “personal effects” (COICOP code 12.3.1) category (45 years), the spending on those items had to be extrapolated using data on total private consumption expenditure, or GDP. This should have, however, limited influence on the results for recent years given the rather small share of spending on durable personal effects. For France, which has detailed expenditure going back to 1959, truncating the data to 1995 (the minimum availability for the countries considered except Malta) and extrapolating them with total private consumption resulted in a 2 %–5 % lower estimate of the stock of household contents, depending on the year. The uncertainty increases when moving back in time. Detailed sources of data are shown in Table S8. The calculation in Eq. (
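The idea of accumulating expenditure on durables with category-specific service lives can be sketched as follows. This is a simplified perpetual inventory calculation under stated assumptions, not the study's equation: expenditure is taken to be in constant prices, all items are assumed to be retired exactly at the end of their service life (simultaneous exit), and the category names, the furniture service life, and the expenditure figures are hypothetical. Only the 45-year service life for personal effects comes from the text.

```python
def gross_stock(expenditure_by_year, service_life, year):
    """Sum expenditure from the last `service_life` years up to and including `year`."""
    first_year = year - service_life + 1
    return sum(v for y, v in expenditure_by_year.items() if first_year <= y <= year)


def contents_stock(expenditure_by_category, service_lives, year):
    """Gross stock of household contents as the sum over categories of durables."""
    return sum(
        gross_stock(series, service_lives[category], year)
        for category, series in expenditure_by_category.items()
    )


# Hypothetical constant-price expenditure series (arbitrary units).
expenditure = {
    "furniture": {2015: 10.0, 2016: 12.0, 2017: 11.0},
    "personal_effects": {1970: 5.0, 1980: 3.0, 2017: 1.0},
}
# The furniture service life is a placeholder; 45 years for personal effects is from the text.
lives = {"furniture": 2, "personal_effects": 45}
stock_2017 = contents_stock(expenditure, lives, 2017)
```

The long service life of personal effects shows why long expenditure series are needed: in this toy example, spending from 1980 still counts towards the 2017 stock, whereas spending from 1970 has already been retired.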
Estimates of building and content value cannot be directly validated due to the lack of information on this subject at the level of individual objects. We can only compare our results with other published results, which is done in Sect.
The exposure estimation procedure was first validated by comparing observed and modelled residential building height. This analysis was done through a 10-fold cross-validation using a 10 % sample of residential buildings in 30 European capitals (Sect.
Binned scatter plot for observed and modelled heights of residential buildings for 30 European capitals, out-of-sample validation. The black line is the 1 : 1 line, and the red line is the linear regression line.
Validation statistics for the building height prediction model (mean value of the uncertainty distribution) for different cities. For all cities, the results are an average of results for a 10-fold cross-validation. For individual cities, the results are an out-of-sample validation (i.e. the model's sample excluded the city that was validated).
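The two validation schemes referred to here, 10-fold cross-validation over all buildings and out-of-sample validation with one city held out, can be expressed as index splits. The sketch below is a generic illustration with hypothetical function names and toy labels; it does not reproduce the sampling used in the study.

```python
import random


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train_indices, test_indices) for a shuffled k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices), one split per group."""
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


# Toy city labels for four buildings (hypothetical data).
city_of_building = ["Berlin", "Berlin", "Warsaw", "Paris"]
city_splits = list(leave_one_group_out(city_of_building))
fold_splits = list(k_fold_splits(25, k=10))
```

In the leave-one-city-out scheme the model never sees buildings from the validated city, which is a stricter test of transferability than random folds drawn from all cities.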
An out-of-sample validation was also carried out for each city in the dataset, where the validated capital was left out from the data quantifying the dependency structure of the BN model (Table
Validation statistics for the building height prediction model (mean value of the uncertainty distribution) for various sets of residential buildings.
Hit rate of predictions of the number of floors for Polish residential buildings at risk of sea level rise and coastal floods. Bold font indicates the percentage of the correctly predicted number of floors.
The second step in obtaining floor space – the number of floors – was tested against a large number of Polish residential buildings located in the coastal zone, obtained from the national database BDOT. Results in Table
Finally, predictions of the floor space area were tested against Dutch and German households (Table
As described in Sect. 2.3, statistical data on buildings and household expenditure were collected for a study area of 30 countries (Iceland, Norway, Switzerland and the European Union except for Croatia). The dataset reveals a considerable stock of residential assets in place. Based on those statistical data alone, we estimate that there were 259 million dwellings in the study area at the end of 2017, some 12 % of which are vacant or occupied seasonally. Those dwellings had a collective useful floor space area of almost 24 billion m
Household contents in Europe are a diversified collection of durable items, which we estimated were worth EUR 6.6 trillion at the end of 2017. Furniture, furnishings, and floor coverings constituted 39 % of the gross stock of household contents, followed by jewellery, clocks, and watches (25 %); audiovisual, photographic, and information processing equipment (11 %); major household appliances (10 %); and various other tools, equipment, and appliances (16 %). Variation between countries is higher than for dwellings (Fig.
To illustrate an application of the two components of the study – building-level height predictions and country-level valuations of residential assets – we downloaded current (as of 18 July 2019) OSM building data for Szczecin, Poland. This city of slightly more than 400 000 people is endangered in its low-lying parts by floods and sea level rise
Estimated residential asset values in a low-lying part of the city of Szczecin, Poland. Flood hazard zone from
Combining our exposure estimates with flood maps for extreme sea levels
Predictions of floor space area involve several uncertainties along the chain of computations. Firstly, the Bayesian network (BN) for predicting building height was quantified based on a set of capital cities. Those cities vary enormously in size, cover 30 countries, and include at least to some extent the surrounding metropolitan area, but they do not include areas of a more rural character. Incorporation of those areas could improve predictions for single-family houses. At the moment, the
Bias in predictions for high-rise buildings is observed, which can largely be a consequence of a relatively small number of those, even within large cities. Some errors originate in the source elevation model, which has a resolution of 10 m; therefore the height of buildings with small footprint areas could be less accurately assigned to OpenStreetMap polygons. Also, the validation information provided by
The OSM dataset is also not homogeneous. Sometimes, individual buildings are not distinguished within a city block, creating an artificially large building and leading to overestimation of height in the BN model. The quality of building and land use data is also uneven within the cities themselves, resulting in relatively few useful data points, e.g. for Nicosia, Rome, or Madrid. In the second step of obtaining floor space of buildings, i.e. calculating the number of floors, a constant height of each floor was assumed, though floor heights tend to vary to some degree
The method used for data analysis, a non-parametric BN, is a model configured primarily using expert knowledge. The dependency structure modelled with a Gaussian copula is the main assumption in the model that could affect the results. For comparative purposes of the height model's predictions, we also tested an ensemble learning method known as random forests (RF). It utilizes ensembles of regression trees, which split continuous variables into subsets in order to approximate nonlinear regression structures
Improving building height predictions for the purpose of exposure estimation would involve incorporating new sources of information. For building heights, lidar scanning results from smaller cities and rural areas should be incorporated to increase the diversity of the sample for a Bayesian network model. The model itself could also be built separately based on data of different typology (urban, suburban, rural) or for different parts of Europe. More diversified resources are needed as well to analyse the relationship between building height and the number of floors and the usable floor space of the building, which can differ between countries and building types. As a more immediate step, the code used in this study is expected to become publicly available to facilitate its application and further testing.
Estimates of residential building replacement cost per square metre from two external sources, by the JRC
Comparison of residential building values per square metre of floor space estimated in this study with
Household contents were not directly estimated by JRC in the study by
Comparison of household content values per square metre of floor space estimated for the year 2010 in this study with two estimates by the Joint Research Centre
Some other estimates from the literature can be compared with our results. Studies based on German post-disaster surveys computed exposure based on an insurance sector guideline for residential building values deflated to a particular year with the construction price index
Uncertainties related to economic valuations are largely methodological or related to limitations in the availability of some data for certain countries. Most of the gross stocks of dwellings are taken directly from national estimates, which are computed with a variety of assumptions related to service life and retirement patterns as well as investment data availability, coverage, and detail. As noted in Table S4, analysis of methods identified time series for two countries that are not comparable with the others, but more datasets could be affected by local methodological specifics. The stock of household contents was computed with a uniform approach, but service life assumptions based on a German study might not be suitable for other countries. Also, the availability of historical data on consumption expenditure varies between countries, and the most detailed COICOP four-digit data are not accessible on a per-annum basis, necessitating assumptions about the share of durable spending in more aggregated data. Quality of the expenditure data could also be questioned given the very large differences between deflators for individual durable items between countries. This is most strongly visible in the data for Ireland, where prices of all items have dropped significantly since the year 2000 according to national statistics, which is not in line with the experience of other European economies. Consequently, the estimate of the stock of household contents for Ireland is likely too low and the strong upward trend is likely overestimated. Further, availability of dwelling and household numbers and especially the floor space statistics is not uniform. For some countries, data on temporal changes in average floor space per dwelling or the total area are not published. Yet, housing statistics are typically better for central European countries than western European states, quite the opposite to economic data availability.
This is likely because poorer living conditions in the new EU member states led them to prioritize gathering information on housing more than western European countries did, while their less-developed statistical systems usually produce less detailed and shorter time series of economic statistics.
The study presented only valuations of dwellings and household contents as gross stock, i.e. replacement cost without allowing for depreciation of assets.
Consumer durables except for personal vehicles are used here for household contents on the basis of what items are actually insured and compensated after natural hazard events. Overall damages to households could be higher still. In the aftermath of the 2010 Xynthia storm, 8 % of flood-related insurance claims were related to cars on top of the 5 % of windstorm-related claims
Time series of building and content value provided in this study (Supplementary Information 2) have several applications. The main use is providing monetary valuation of assets for natural hazard exposure and risk assessments carried out at the level of individual buildings (large-scale mapping). The time series could be used to correct past recorded damages from natural disasters for changes in asset reconstruction costs (separately for dwellings and contents) over time but also for changes in average quality of residential buildings and incomes of households that translate into more expensive consumer durables kept at home. Finally, the data could be used to rescale absolute damage functions, which generate damage estimates based on intensity of the hazardous event not as a percentage of assets lost but as an absolute value for a given country in a specific year. In the field of flood risk, almost half of damage functions provide absolute values of damages instead of relative values
Further research on countries with good economic data would involve expanding the coverage in multiple aspects. Thematically, the net (depreciated) value of residential assets could be added to the dataset, as most of the necessary data have already been collected here. Net stock of dwellings is directly available for four more countries than the gross stock (Norway, Spain, Sweden, and Switzerland), while for others the PIM method would be used
Furthermore, we provide valuations at the national level, which neglects possible differences between urban and rural areas as well as between regions of countries. This is illustrated by the Portuguese asset valuations for urban, intermediate, and rural areas mentioned in Sect.
Still, most countries of the world do not disseminate such detailed housing, asset, investment, or expenditure data as were used in this study. Simplified methods to indirectly estimate exposure will therefore be needed. GDP per capita was incorporated by the NERA study as such a measure, but as the comparison in Sect.
In this study we have explored aspects related to estimating exposure of residential assets in Europe. Firstly, we proposed a methodology to estimate useful floor space area of buildings in a situation when the only accessible quantitative measure about a house is its footprint area. This basic measure can be derived from various sources, from analogue topographic maps to crowdsourced databases like OpenStreetMap (OSM). Building height or the number of floors is only occasionally accessible; hence it has to be estimated based on other information. In our work, we have shown that a Bayesian network quantified with a set of publicly available pan-European raster datasets and building footprints from OSM has the ability to differentiate between urban high-rises and suburban or rural single-family dwellings. Further, it can be applied to approximate building dimensions that can be the basis for assigning economic value to assets in question.
In the second part of the analysis, we harnessed publicly disseminated statistical data on housing stock and national economies to calculate time series of average value of residential assets – building structure and household contents – for 30 European countries. These values can be applied whenever local exposure data are missing or no detailed characteristics of buildings are accessible. Additionally, they can improve analyses of past natural disasters by estimating exposure of assets in a particular year and country, as well as enable transferability of damage models that provide absolute rather than relative damages. More work is expected on expanding the thematic, spatial, and temporal coverage and resolution of the dataset. The dataset will also be applied as an important basis for constructing and validating a new generation of vulnerability models in natural hazards.
This study relied entirely on publicly available datasets, with the exception of validation datasets from Poland, Germany, and the Netherlands in Sect.
The supplement related to this article is available online at:
DP conceived and designed the study, collected and analysed the data, and wrote the first draft of the manuscript. HK and KS helped guide the research through technical discussions. OMN provided code for data analysis and was involved in technical discussions. PT provided, and supported processing of, some of the spatial datasets. All authors reviewed the draft manuscript and contributed to the final version.
Authors Heidi Kreibich and Kai Schröter are members of the editorial board of
This article is part of the special issue “Global- and continental-scale risk assessment for natural hazards: methods and practice”. It is a result of the European Geosciences Union General Assembly 2018, Vienna, Austria, 8–13 April 2018.
The authors would like to thank Dennis Wagenaar (Deltares) for kindly sharing the data from the 1993 Meuse flood, the office of the Polish surveyor general for providing topographical data from the national cartographic repository, and colleagues at the GFZ German Research Centre for Geosciences for their help with extracting the flood damage data contained in the HOWAS21 database (
This research has been supported by the Climate-KIC (grant no. TC2018B_4.7.3-SAFERPL_P430-1A KAVA2 4.7.3) and Horizon 2020 (H2020_Insurance (grant no. 730381)). The article processing charges for this open-access publication were covered by a Research Centre of the Helmholtz Association.
This paper was edited by Sven Fuchs and reviewed by two anonymous referees.