Direct local building inundation depth determination in 3D point clouds generated from user-generated flood images

In recent years, the number of people affected by flooding caused by extreme weather events has increased considerably. In order to provide support in disaster recovery or to develop mitigation plans, accurate flood information is necessary. Particularly pluvial urban floods, characterized by high temporal and spatial variations, are not well documented. This study proposes a new, low-cost approach to determining local flood elevation and inundation depth of buildings based on user-generated flood images. It first applies close-range digital photogrammetry to generate a geo-referenced 3D point cloud. Second, based on estimated camera orientation parameters, the flood level captured in a single flood image is mapped to the previously derived point cloud. The local flood elevation and the building inundation depth can then be derived automatically from the point cloud. The proposed method is carried out once for each of 66 different flood images showing the same building façade. An overall accuracy of 0.05 m with an uncertainty of ±0.13 m for the derived flood elevation within the area of interest and an accuracy of 0.13 m ± 0.10 m for the determined building inundation depth is achieved. Our results demonstrate that the proposed method can provide reliable flood information on a local scale using user-generated flood images as input. The approach can thus allow inundation depth maps to be derived even in complex urban environments with relatively high accuracies.


Introduction
Worldwide, the number of extreme weather events has increased in recent years (CRED - Centre for Research on the Epidemiology of Disasters, 2016). The reasons for this accumulation of flood events are numerous: on the one hand, climate change might be responsible for variations in weather events. On the other, land-use changes such as increased ground surface sealing are leading to uncontrolled overland runoff and rainwater drainage, especially in urban areas (Douglas et al., 2010; Mason et al., 2014). Due to spreading urbanization, more and more of the areas at high risk of flooding have become populated, for example regions close to rivers or at the foot of hills. Since this leads to larger numbers of people being affected in terms of physical or monetary damages, or even human costs, there is a major need for urban flood-risk management (Zevenbergen et al., 2008; Hammond et al., 2013; Iervolino et al., 2015). Information about previous floods, such as flood elevation and local inundation depths, is of high relevance for mitigation and resilience planning to assess and minimize the impact of disastrous events.
Urban flood events can be differentiated according to their major causes and categorized into the following groups: fluvial flooding (e.g., flash floods, river-based urban floods), groundwater flooding, coastal flooding, and pluvial urban flooding. Thus far, the main traditional data sources for monitoring and documenting floods are gauge-system measurements; forecasted and measured precipitation rates; and information derived from remote-sensing techniques, such as satellite imagery or light detection and ranging (lidar) (Lo et al., 2015).
Published by Copernicus Publications on behalf of the European Geosciences Union.
L. Griesbaum et al.: Building inundation depth determination in 3-D point clouds
Pluvial floods are often triggered by blocked or overburdened sewage systems in combination with heavy rainfalls (Maksimović et al., 2009; Hammond et al., 2013). They are highly dynamic phenomena with high spatial and temporal variation (Blanc et al., 2012). Most of the above-mentioned traditional techniques are, thus, not suitable because of their relatively coarse spatial and temporal resolution. Gauge systems usually do not cover a city's whole street network, and precipitation rates are generally not sufficient for the simulation of local pluvial floods. Furthermore, detailed remote-sensing data are typically not available at short notice. Thus, many studies utilize remote-sensing data in the aftermath of an event for post-event flood simulation in order to retrieve the deluge extent or flood-water depth of previous events (Bates et al., 2003; Schumann et al., 2008, 2011; Abdullah et al., 2009; Chen et al., 2009; Merkuryeva et al., 2015). Only a few studies focus on flood-level and depth determination from flood data acquired during the event itself (Matgen et al., 2007; Mason et al., 2010, 2014; Iervolino et al., 2015). Matgen et al. (2007) report a root mean square error (RMSE) of 0.41 m for flood elevation along a 1 km river section by combining high-precision digital elevation models (DEMs) with flood extent maps from synthetic aperture radar (SAR) data. Iervolino et al. (2015) derive local building inundation from SAR data with accuracies of 0.24-0.81 m. However, the Flood Loss Estimation Model for the Private Sector (FLEMOps) requires building inundation depth accuracies around 0.10 m for flood damage assessment, since it suggests damage classification according to the inundation depth of a building in steps of 0.20 to 0.50 m (Thieken, 2008).
Thus, in order to improve flood disaster management in response to urban flooding, more detailed information on pluvial urban floods in terms of spatial and temporal resolution is necessary (Hammond et al., 2013; Merkuryeva et al., 2015). Emergency response or recovery actions in urban areas (such as damage assessment, flood simulation, flood map generation, or flood-risk analysis) require particularly detailed flood information at building level. A 2-D flood line often does not suffice to depict the full spatial impact of flooding due to the highly complex structure and topology of cities and towns. Thus, 3-D flood information can enhance flood-risk management. The combination of the high temporal dynamics of the phenomenon and the need for high-spatial-resolution, on-demand, in situ data makes it particularly difficult to measure urban flooding. It is therefore necessary to develop new methods to generate water-level information about pluvial urban floods using available high-resolution data (Price and Vojinovic, 2008). One attempt to achieve higher resolution and easier availability of pluvial flood data in urban areas is to apply close-range photogrammetry (CRP), i.e., a sequence of digital image processing methods based on computer vision algorithms (e.g., structure from motion: SfM), and photogrammetric approaches (e.g., dense matching: DM) to derive 3-D point clouds or high-resolution digital terrain models (DTMs) (Meesuk et al., 2015; Shaad et al., 2016). Smith et al. (2014) demonstrate the potential of using photogrammetric point clouds for the reconstruction of high-water marks of a flash flood event at a river channel. However, in urban areas, such high-water marks (i.e., clearly visible flood relics like mud lines) are typically removed very quickly after the flood event. Thus, typically only very few single images document the actual flood elevation in urban settings.
To complement traditional documentation systems and to tackle their temporal, spatial, or cost-related limitations, a possible approach can be the use of user-generated content (UGC), such as ambient geographic information (AGI) (Stefanidis et al., 2013) or volunteered geographic information (VGI) (Goodchild, 2007). The increasing distribution of mobile devices, in conjunction with the ever-expanding use of Web 2.0, has led to more virtual participation in flood mitigation activities as well as in flood event documentation (Fazeli et al., 2015; Klonner et al., 2016). Subsequently, many urban flood events are now indirectly documented by means of user-generated, partially geo-tagged flood images, posted on social media platforms. Various studies investigate the feasibility and benefits of using these new data sources for flood management (Fazeli et al., 2015). McDougall and Temple-Watts (2012) and Fohringer et al. (2015) successfully demonstrate the potential of VGI data for flood reconstruction by manual in-field measurements of the flood elevation given in flood images. Other studies propose semi-automatic approaches to derive the flood extent or level shown in flood images: Triglav-Čekada and Radovan (2013) map the extent of flooded areas based on geo-located flood images by applying a method where the absolute orientation of an image is found by fitting that image to the superimposed 3-D points of a DTM. Narayana et al. (2014) propose a technique to determine building inundation depth by matching a manually traced flood line from a given flood image to a respective non-flood image with the help of corresponding image features. However, this methodology has not been tested in a real-world setup and requires a priori knowledge about the buildings' height in order to determine the inundation depth. Furthermore, the approach uses information derived from 2-D imagery, which has inherent restrictions in terms of perspective.
From these studies it emerges that there is still a lack of automatic approaches that allow singular flood-event-based information to be extracted from unstructured user-generated images in order to reconstruct flood parameters at a local building scale in 3-D. Such an approach can effectively support the work of local authorities and disaster managers by complementing their manual flood measurements. These are usually captured only on the basis of visual flood markers, such as mud lines at façades, in order to facilitate damage assessment and flood-risk analysis. The aim of our study is to develop a low-cost method to extract local flood elevation as well as building inundation depth in urban settings on the basis of ordinary user-generated photographs. This semi-automatic workflow includes automatic flood-level detection from flood images, as well as a new way to integrate singular, i.e., flood-event-based, information provided by a single flood image into a photogrammetric point cloud by extending existing methods in order to reconstruct the flood elevation. In contrast to previous photogrammetric approaches where two or more perspectives are necessary to reconstruct a given object in 3-D, our method is a two-stage approach, where (1) the 3-D scene is reconstructed independent of flood images (before or after the flood event) and (2) the single flood image information is integrated into the 3-D scene in order to reconstruct the flood level in 3-D.

Study area
The area chosen for study is located at the Karl Theodor Bridge in Heidelberg, Germany, on the river Neckar at 49.41334° N, 8.70996° E. The Neckar flows northwards between the Swabian Jura and the Black Forest into the Rhine River at Mannheim, and it drains major parts of the German federal state of Baden-Württemberg. The closest gauge station is located about 4 km upstream of the study area. The chosen area (Fig. 1) is characterized by a downward-sloping road section parallel to the river that leads below the bridge. It is continually at risk of river-based urban floods and regularly inundated.
The flood event examined in this study occurred on 30 May 2016. Several days of heavy rainfall led to gauge measurements reaching almost 430 cm (200 cm is the normal gauge reading). According to the discharge curve, the peak water elevation was noted between 16:00 and 23:00 LT (local time), after which the gauge reading started to decline (LUBW - Landesanstalt für Umwelt, Messungen und Naturschutz Baden-Württemberg, 2016). The area experienced an overflow of the riverbed, which caused the inundation of the nearby roads Neckarstaden and Am Hackteufel (Fig. 1). The flooding reached the facing side of the adjacent houses, which comprises the central object of interest in this research.
Data sets

Flood images
Of primary importance for this study are flood images showing the inundated object of interest, photographed during the flood event on 30 May 2016. Image acquisition took place using two mobile devices, of a kind typically used for imagery contributed to social media, from different randomly chosen and accessible positions around the object of interest: (1) at around 16:00 and 19:30 LT with a Samsung Galaxy A3 mobile phone camera with image resolutions of 3264 × 2448 pixels and 3264 × 1836 pixels and (2) at around 18:30 LT on the same day with a Samsung Galaxy S2 mobile phone camera with a resolution of 2560 × 1920 pixels. In total, 66 flood images showing the object of interest were captured using automatically set camera parameters.

Non-flood images
Terrestrial non-flood images showing the object of interest were captured in June 2016, after the flood event had subsided, with a Sony Alpha 57 16-megapixel single-lens translucent (SLT) camera, which provides images with a resolution of 4912 × 3264 pixels. The camera settings were automatically determined by the device. Image acquisition took place with the camera's perspective converging towards the study area and at an approximate distance of 3.5 m between the individual camera positions (Fig. 1). At almost all of the 25 camera positions, two or more images were required to capture the whole building's façade. In total, 63 non-flood images were captured.

Reference data
A high-end, state-of-the-art terrestrial laser scanning (TLS) measurement system, the Riegl VZ-400, is used to provide reference data for the analyses. The TLS data were captured in May 2016, before the severe flooding had occurred. The scanning system operates with a wavelength of 1550 nm and a beam divergence of 0.35 mrad. Range precision (repeatability) and accuracy (conformity of measurements to actual geometry) are 3 and 5 mm at 100 m, respectively, as given by the manufacturer's data sheet (Riegl, 2016). The scene was captured from four different scanning positions. When combined into a single data set, i.e., when co-registered, the overlapping scans result in a point cloud with a total of 29 million measurements within the area of interest. The registration accuracy is determined via 64 point pairs manually picked in the individual scans. The average point pair distances for x, y, and z are 0.001 m, 0.001 m, and 0.000 m, with a standard deviation of 0.020 m, 0.010 m, and 0.010 m, respectively. The reference absolute flood elevation at the given time of the flood event is determined on the basis of a sequence of images of a nearby staff gauge (ca. 20 m from the object of interest). The median water-level elevation derived from this image sequence is Z_w,TLS = 153.83 m a.s.l., with an amplitude of ±0.10 m reflecting water undulation and waves.
Additionally, in the aftermath of the flood event, the building inundation depth is also measured in the field at seven distinct positions along the building's façade (Fig. 2). An example flood image captured at 16:00 LT serves as a reference for the manual measurements. Table 1 gives an overview of the captured inundation depth values. Since the mean distance between the minimally and maximally measured inundation depths for all seven positions is 0.10 m, the theoretical uncertainty of the measurements is ±0.05 m.
Further complementary reference data for the inundation depth are provided by independent expert measurements within the TLS point cloud. Eight experts in 3-D point cloud processing from the GIScience Research Group measured inundation depth values at the seven reference positions (Fig. 2).

Methods
The aim of the proposed method is to derive from a single user-generated flood image a set of 3-D points representing (1) the absolute flood elevation (Z_w) within the area of interest and (2) information about the local building inundation depth (h). The approach is based on free and open-source software solutions and is designed to work based on crowdsourced images of local-scale urban flood events. It succeeds where other remote-sensing techniques fail due to the unsuitability of their spatial and temporal resolutions. The workflow given in Fig. 3 depicts all major steps of the methodology.

Data pre-processing
For each of the 66 flood images, a dense 3-D CRP point cloud is derived from a combination of (1) the 63 non-flood input images and (2) the single flood image. A CRP approach generally comprises multiple steps. After detecting and matching similar features in overlapping images, the relative positions of the images and the exterior orientation of the cameras used are estimated. At the same time, bundle block adjustment serves to optimize these camera parameters before 3-D coordinates of the matched features are derived via ray intersection, resulting in a sparse point cloud. As a final step, dense matching is performed, whereby 3-D coordinates of all visible pixels are derived (Eltner et al., 2016). The applied CRP methods provide camera positions and orientations for all of the employed images. In order to provide a highly detailed evaluation of our results, we geo-reference the CRP point cloud based on seven distinct and equally spread-out ground control points (GCPs) derived from the highly accurate TLS point cloud. Fine registration is then performed with the iterative closest point (ICP) algorithm (Besl and McKay, 1992; Chen and Medioni, 1992) [...] within the area of interest (Rosnell and Honkavaara, 2012). The point density is defined as the median of the point count of all 0.20 m × 0.20 m cells (Kraus et al., 2006).
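The point-density definition above can be illustrated with a minimal Python sketch. It rests on assumptions that are not stated in the text: the grid is laid out in the x-y plane, and only cells containing at least one point are counted. The function name `point_density` is hypothetical.

```python
import math
from collections import Counter
from statistics import median


def point_density(points, cell=0.20):
    """Median point count over all occupied cell x cell grid cells (x-y plane)."""
    # Assumption: cells are indexed in the horizontal plane; empty cells are ignored.
    counts = Counter(
        (math.floor(x / cell), math.floor(y / cell)) for x, y, _ in points
    )
    return median(counts.values())
```

For a façade, the same counting could be done in a grid aligned with the façade plane instead of the x-y plane.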

2-D waterline detection
The 2-D waterline can be described as the demarcation line between the water and those parts of the image where objects remain above water. In order to trace this demarcation line, the image pixels are categorized into two relevant classes, designated as water and background. To this end, two different techniques are proposed: (1) a semi-automatic approach using a supervised machine-learning algorithm for image segmentation and (2) manual image classification. The manual classification results serve as ground truth data for the evaluation of the automated segmentation. Similar to the work of Bruinink et al. (2015), the semi-automated segmentation approach is based on a trained random forest (RF) classifier. As conducted by Marx et al. (2016), 10 % of the available flood images are randomly chosen as training data before the whole data set of 66 flood images is segmented by the algorithm. The resulting probability maps are further processed by applying a probability threshold of 60 % to assign each pixel to a class, thus generating binary images for the classes of interest (Fig. 4). Residual salt-and-pepper effects as well as small data gaps are removed via a succession of binary opening and closing. The water body is then identified as the largest connected component of pixels classified as water. After semi-automatic as well as manual image segmentation and extraction of water areas, the demarcation line between water and background, i.e., the 2-D waterline, is identified as a sequence of image pixel coordinates. For each pixel column, the image coordinates of the uppermost pixel belonging to the water class are assigned to the 2-D waterline (Bruinink et al., 2015).
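The post-processing chain described above (thresholding, largest connected component, uppermost water pixel per column) can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the study: the RF classification and the binary opening and closing are omitted, 4-connectivity and an inclusive threshold (p ≥ 0.60) are assumptions, and both function names are hypothetical.

```python
from collections import deque


def largest_water_component(prob, threshold=0.60):
    """Threshold a water-probability map and keep the largest 4-connected region."""
    rows, cols = len(prob), len(prob[0])
    water = [[p >= threshold for p in row] for row in prob]
    seen = [[False] * cols for _ in range(rows)]
    best = set()
    for r in range(rows):
        for c in range(cols):
            if not water[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected component.
            comp, queue = set(), deque([(r, c)])
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                comp.add((i, j))
                for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                    if (0 <= ni < rows and 0 <= nj < cols
                            and water[ni][nj] and not seen[ni][nj]):
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            if len(comp) > len(best):
                best = comp
    return best


def waterline(component):
    """Map each pixel column to the row of its uppermost water pixel (row 0 = top)."""
    line = {}
    for r, c in component:
        if c not in line or r < line[c]:
            line[c] = r
    return line
```

Calling `waterline(largest_water_component(prob))` on a probability map given as a list of rows returns one waterline pixel per column that the water body touches.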

2-D-3-D mapping
In order to derive the absolute flood elevation (Z_w) within the area of interest as well as the inundation depths along the building's façade (h) with full 3-D information, the derived 2-D waterline image pixel coordinates are mapped to the respective 3-D point cloud. This 2-D-3-D mapping of the 2-D waterline pixels is based on photogrammetric principles to reconstruct 3-D scenes and thus dependent on the individual flood image's camera position and orientation. Knowing these, the relationship between a 3-D point coordinate (x, y, z) in the dense CRP point cloud and the 2-D coordinates of its projection onto an image (u, v) can be formulated as shown in Eq. (1) (Furukawa and Ponce, 2010).
$$ d \begin{pmatrix} u \\ v \\ 1 \end{pmatrix} = P \begin{pmatrix} x \\ y \\ z \\ 1 \end{pmatrix} \qquad (1) $$
Here P is a 3 × 4 projection matrix, and d denotes the depth of a point in relation to the image's camera position C. P and C are readily available for each image because the CRP approach was applied to prepare the initial 3-D point clouds. In order to reconstruct singular image features such as the 2-D waterline, which is only given in one flood image, additional (external) geometrical information is applied to derive the depth d and, thus, uniquely reconstruct that feature in 3-D. In this study, the additional geometrical information is already known because the targeted 3-D water-level points necessarily lie within the building's façade. Consequently, the final 3-D water-level points can be located at the intersection point between the calculated line of sight resulting from Eq. (1) and the plane of the façade.
Therefore, the building's façade needs to be identified within the point cloud first. To this end, the 3-D point cloud is partitioned into segments representing single planes. Façades can be identified by means of feature constraints such as size (they are usually the biggest, highest, or longest segments), direction (based on the vertical orientation of walls), and topology (the façade plane typically intersects with the terrain plane) (Pu and Vosselman, 2006; Xiao et al., 2008; Serna et al., 2016). In accordance with these criteria for façade identification, in this study the façade is defined as being the largest connected vertical plane segment within the area of interest.
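The intersection of a pixel's line of sight with the façade plane can be sketched as follows. The sketch assumes the standard pinhole relation d·(u, v, 1)ᵀ = P·(x, y, z, 1)ᵀ with P = [M | p4], so that the camera centre is C = -M⁻¹p4 and every point on the line of sight is C + d·M⁻¹(u, v, 1)ᵀ. It further assumes that the façade plane is supplied as a point and a normal vector; how the plane is fitted is outside the scope of the sketch. All function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, b):
    """Solve the 3x3 linear system m x = b with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("singular matrix")
    return [
        det3([[b[i] if j == k else m[i][j] for j in range(3)] for i in range(3)]) / d
        for k in range(3)
    ]


def dot(a, b):
    """Dot product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def pixel_to_facade(P, u, v, plane_point, plane_normal):
    """Intersect the line of sight of pixel (u, v) with the facade plane."""
    M = [row[:3] for row in P]           # left 3x3 block of P
    p4 = [row[3] for row in P]           # last column of P
    C = solve3(M, [-t for t in p4])      # camera centre: M C = -p4
    ray = solve3(M, [u, v, 1.0])         # direction: X(d) = C + d * ray
    denom = dot(plane_normal, ray)
    if abs(denom) < 1e-12:
        raise ValueError("line of sight is parallel to the plane")
    d = dot(plane_normal, [q - c for q, c in zip(plane_point, C)]) / denom
    return [c + d * r for c, r in zip(C, ray)]
```

Applying `pixel_to_facade` to every waterline pixel yields one 3-D point per pixel, independent of the point density of the photogrammetric cloud.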

Flood elevation determination
Since the water's surface can be understood as a continuous rather than a discrete phenomenon, spatially isolated points are eliminated by fitting a linear least-squares model with a random sample consensus (RANSAC) algorithm to the preliminarily derived 3-D water-level points. The measured water undulation of 0.10 m serves as the threshold. The final set of 3-D water-level points is then used for approximation of flood elevation within the area of interest. Due to the extent of the study area (ca. 30 m × 60 m) and the flood's characteristics, the water surface is considered to be perfectly horizontal along the building's façade in the case of a calm water surface. Accordingly, the statistical distribution of z values of all 3-D water-level points is assumed to reflect the actual flood elevation. The quality of the derived flood elevation Z_w is then compared to and evaluated against the reference flood elevation as derived from the nearby staff gauge (Z_w,TLS = 153.83 m a.s.l. ± 0.10 m).
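The outlier removal and level estimate can be illustrated with a simplified consensus sketch. It deviates from the description above in one labelled respect: because the water surface is treated as horizontal, the model is reduced to a single constant level instead of a general linear least-squares fit. The iteration count, the fixed seed, and the function name `flood_elevation` are assumptions made for illustration.

```python
import random
from statistics import median


def flood_elevation(z_values, threshold=0.10, n_iter=200, seed=0):
    """Consensus estimate of a horizontal water level from z values (m a.s.l.)."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best = []
    for _ in range(n_iter):
        candidate = rng.choice(z_values)  # minimal sample: one z value
        inliers = [z for z in z_values if abs(z - candidate) <= threshold]
        if len(inliers) > len(best):
            best = inliers
    # The median of the consensus set serves as the flood elevation estimate.
    return median(best), best
```

The function returns both the estimated level and the retained inliers, so that the spread of the inliers can be inspected afterwards.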

Building inundation depth determination
The inundation depth is determined by calculating the distance between the water's surface, i.e., the water-level points derived in the previous step, and the corresponding terrain elevation at the seven reference positions (Fig. 2). In this study, use of a raster DTM as terrain reference representing the ground surface is demonstrated; however, the method can also be applied to other terrain model types, such as points, planes, or meshes.
In order to account for data gaps, terrain points are identified as a preparatory step for the DTM generation. To this end, a minimum raster at a much coarser scale (namely with a cell size of 5 m × 5 m, exceeding the size of the data gaps) serves as the initial terrain model. All points within a vertical buffer zone of this coarse DTM are assigned as terrain points. In this case, a threshold of 0.5 m is found to produce the most stable result. The remaining terrain points then serve as input for the generation of a raster with a cell size of 0.2 m × 0.2 m, in which each cell takes the minimum z value of the terrain points it contains.
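The two-stage DTM generation can be sketched in a few lines of Python. The sketch assumes that the vertical buffer extends upwards from the coarse minimum raster and that rasters are stored as dictionaries keyed by integer cell indices; `min_raster` and `build_dtm` are hypothetical names.

```python
import math


def min_raster(points, cell):
    """Minimum z per grid cell; returns {(col, row): z_min}."""
    grid = {}
    for x, y, z in points:
        key = (math.floor(x / cell), math.floor(y / cell))
        if key not in grid or z < grid[key]:
            grid[key] = z
    return grid


def build_dtm(points, coarse=5.0, fine=0.2, buffer=0.5):
    """Coarse minimum raster -> terrain filter -> fine minimum raster."""
    coarse_dtm = min_raster(points, coarse)
    # Keep only points at most `buffer` metres above the coarse terrain model.
    terrain = [
        (x, y, z) for x, y, z in points
        if z - coarse_dtm[(math.floor(x / coarse), math.floor(y / coarse))] <= buffer
    ]
    return min_raster(terrain, fine)
```

Points on cars or on the façade lie well above the coarse minimum and are therefore discarded before the fine raster is built.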
The building inundation depth can be calculated as the vertical offset between the terrain reference and each of the derived 3-D water-level points. In order to compare and evaluate these derived depth measurements against the manual field measurements (h_field,R1-R7), captured at seven reference positions, only measurements within a horizontal range of ±20 cm of each reference position are considered.
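A minimal sketch of the depth calculation at one reference position follows. It assumes that the DTM is a dictionary keyed by integer cell indices with a cell size of 0.2 m and that the ±20 cm range is measured as horizontal Euclidean distance; the function name `inundation_depths` is hypothetical.

```python
import math


def inundation_depths(water_points, dtm, ref_xy, cell=0.2, radius=0.20):
    """Depths (water z minus terrain z) for water-level points near a reference position."""
    depths = []
    for x, y, z in water_points:
        # Skip points outside the horizontal range around the reference position.
        if math.hypot(x - ref_xy[0], y - ref_xy[1]) > radius:
            continue
        key = (math.floor(x / cell), math.floor(y / cell))
        if key in dtm:  # cells without terrain data are skipped
            depths.append(z - dtm[key])
    return depths
```

Water-level points above cells with no terrain data contribute nothing, which is one way data gaps in the terrain can reduce the number of usable measurements.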

Pre-processing
After geo-referencing of the 66 point clouds, each CRP point cloud is compared to the TLS reference data. This step is taken to ensure detailed validation of the final results. It reveals an average cloud-to-cloud (C2C) median distance of 0.02 m and an average completeness of 37.6 % at the façade and 87.6 % at the terrain (Table 2). These completeness rates mainly reflect data gaps due to occlusion effects caused by parked cars. In summary, the photogrammetric point cloud lacks completeness at a few regions of interest, i.e., terrain and façade, yet shows satisfactory performance concerning geo-referencing, which is important only for validation of the results. Therefore, these findings are to be considered when discussing the overall inundation results.
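For illustration, a median C2C distance can be computed as the median of nearest-neighbour distances from each CRP point to the reference cloud. The brute-force sketch below is an assumption about how such a metric may be evaluated, not the tool used in the study; it is only practical for small clouds, and `c2c_median` is a hypothetical name.

```python
import math
from statistics import median


def c2c_median(source, reference):
    """Median nearest-neighbour distance from each source point to the reference cloud."""
    # Brute force, O(n * m); real point clouds would need a spatial index.
    return median(min(math.dist(p, q) for q in reference) for p in source)
```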

2-D waterline detection
The image segmentation resulted in binary images representing the extracted water body (Fig. 4). The average classification precision of all images for the water class is 98.5 %, and the average recall is 83.7 %. This means that the detected water pixels are classified with a high degree of precision, though not all actual water areas are identified as such. Three major aspects are responsible for producing values below 80.0 %: (1) the water body shown in the image is not represented by one connected component but rather split into parts by artifacts like street lamps because of the photographer's perspective. Thus, some parts of the water body are left out of consideration (Fig. 4g). (2) Images captured between 18:30 and 19:40 LT show shadowing effects on the water's surface due to the zenith angle of the sun, which negatively affected the classification. (3) Shallow water or wet surfaces, such as on the wall of a building or bridge pillars caused by waves, are less likely to be classified correctly (Fig. 4h). Generally, images of higher resolution and taken from a frontal perspective in relation to the object of interest, as well as with higher contrast and brightness, tend to yield a better outcome during segmentation.
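Precision and recall for the water class follow from a pixel-wise comparison of the segmented mask with the manually classified ground truth. The sketch below assumes that both masks are given as equally sized lists of rows with truthy values for water; the function name `precision_recall` is hypothetical.

```python
def precision_recall(predicted, truth):
    """Pixel-wise precision and recall of the water class for two binary masks."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # water correctly detected
            elif p and not t:
                fp += 1      # background labelled as water
            elif t and not p:
                fn += 1      # water that was missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A mask that misses part of the water body but contains no false detections gives a precision of 1.0 and a recall below 1.0, which mirrors the pattern reported above.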
The overall segmentation results of the applied image classification workflow are similar to those reported by Bruinink et al. (2015), who obtained an average precision of 99.2 % and a recall of 91.0 %. They studied nine images captured by experts in order to derive staff gauge measurements from these images. The aim of the approach presented here, however, was to make use of user-generated flood images in order to extract the water level at urban structures and, thus, to handle a much broader range of input images. Crowdsourced image pre-processing, such as pre-selecting those flood images which are most suitable, could be beneficial to the outcome, since it can help to eliminate unsuitable images of low contrast or brightness as well as blurry ones (Lo et al., 2015).

2-D-3-D mapping
The 2-D-3-D mapping process results in a reconstructed set of 3-D point coordinates indicating the flood level that is represented by the 2-D waterline shown in the flood image. The proposed method allows reconstruction of a 3-D point for each pixel and, thus, a dense 3-D representation of the 2-D waterline independent of the given point density of the photogrammetric 3-D point cloud.
The performance of the proposed 2-D-3-D mapping approach is influenced by the estimation of camera parameters done in the course of the CRP process. A low accuracy for the derived camera location and orientation will consequently result in a low accuracy for the waterline reconstruction. Also, objects in front of the considered façade (e.g., cars or street lamps) can lead to misplacement of 3-D water-level points because the 2-D waterline along these artifacts does not lie within the same 3-D plane as the relevant façade. They will be erroneously projected onto the façade plane during the 2-D-3-D mapping process.
Hence, the individual image characteristics matter insofar as they affect (1) the CRP process and thus the estimation of camera location and orientation and (2) the clarity of the 2-D waterline pixels. Some image characteristics (e.g., artifacts in the foreground of the image) are more troublesome than others when seeking to obtain reliable and accurate results. Such images could be filtered via user interaction, by making use of participatory sensing or through crowdsourced approaches (Albuquerque et al., 2016).

Flood elevation determination
The complete set of 3-D water-level points comprises 10 347 individual measurements representing the water surface along the building's façade. These points are based on ground truth data from all input images (n = 66). The median flood elevation Z_w,GT is 153.78 m a.s.l. with a mean deviation (MD) from the median of ±0.08 m (Fig. 5a).
In comparison to the TLS reference flood elevation value and under consideration of error propagation, the overall accuracy of the derived flood elevation measurements is given as 0.05 m ± 0.13 m. Figure 5b shows that more than 80 % of the images (54 of 66) result in a median water-level elevation within the range of the TLS reference (153.83 m a.s.l. ± 0.10 m). Only four images result in flood elevation values outside the wave movements of ±0.10 m from the derived median flood elevation (Z_w). The offset of 0.05 m can partly be explained by the geo-referencing accuracy of the CRP point cloud, with a median C2C of 0.02 m. With regard to the natural undulation and waves of the water surface (±0.10 m), the accuracy of ≤ 0.10 m obtained from the derived flood elevation values is considered a satisfying result. In comparison, Smith et al. (2014) derived a mean absolute difference of 0.29 m between high-water marks indicated in a photogrammetric point cloud and differential global navigation satellite system (dGNSS) measurements, whereby the water marks were derived from two or more images at a time. The proposed method in our study, however, only requires one flood image at a time and, thus, allows more flexibility in terms of flood image collection.
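As a plausibility check, the reported figures can be reproduced under an assumption that is not spelled out in the text: the offset is taken as the absolute difference between the reference elevation and the derived median, and the uncertainty combines the reference amplitude (±0.10 m) and the MD of the derived points (±0.08 m) in quadrature, treating the two as independent.

```latex
\begin{align}
  \Delta Z_w &= \left| Z_{w,\mathrm{TLS}} - Z_{w,\mathrm{GT}} \right|
              = \left| 153.83\,\mathrm{m} - 153.78\,\mathrm{m} \right|
              = 0.05\,\mathrm{m} \\
  \sigma_{\Delta} &= \sqrt{(0.10\,\mathrm{m})^2 + (0.08\,\mathrm{m})^2}
                   = \sqrt{0.0164}\,\mathrm{m}
                   \approx 0.13\,\mathrm{m}
\end{align}
```

Both values agree with the stated overall accuracy of 0.05 m ± 0.13 m.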

Building inundation depth determination
The final inundation depth results are achieved after calculating the elevation difference between the water-level points and the DTM. The derived depths depend equally on the accuracy of (1) the water-level points and (2) the topography of the respective terrain. Since the topography might change considerably along the façade (e.g., a declining road) but not the water level, descriptive statistics (such as minimum, maximum, or mean inundation) need to be treated with caution. The determined depths give rather selective measurements at certain positions along the building's façade.
The inundation depth findings are compared to the manual field measurements. This results in an accuracy of 0.13 m ± 0.10 m for 533 points lying within the ranges of the seven reference positions. Additionally, expert measurements taken in the TLS point cloud are evaluated in the same way. The overall accuracy of the inundation depth derived by the experts is 0.07 m ± 0.09 m for 56 points in total (Fig. 6). Generally, the expert measurements show slightly more accurate depth results than those derived from the automatic method, especially for positions 1, 6, and 7, where only a few terrain points can be found and outlier values are thus more influential. Experts, however, can account for micro-topography or other irregularities in order to avoid miscalculations due to artifacts or data gaps. Furthermore, it has to be considered that both the in-field measurements and the expert measurements are based on the same flood image, whereas the results of the automatic approach rely upon a series of 66 different flood images indicating slightly different flood elevations due to waves. All in all, it can be concluded that experts can be better at incorporating irregularities due to their experience. However, computer-based measurements benefit from a more systematic, objective, and reproducible approach that is not subject to human error and interpretation. Furthermore, the automatic approach achieves an accuracy similar to that of the experts, with the additional advantage of being much more time-efficient. A combination of both approaches, for example a standardized automatic approach with interactive user input for quality assessment, could be a beneficial enhancement of the methodology. In comparison to building inundation depth estimations based on high-resolution SAR data with accuracy values between 0.24 and 0.81 m (Iervolino et al., 2015), the results given in this study suggest that user-generated flood photographs can serve as an alternative or complementary data source for local building inundation depth determination at a smaller scale.
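The comparison above reports each result as an accuracy value followed by a ± precision value. A minimal sketch of how such a pair could be computed is given below. It assumes that accuracy is the mean absolute deviation from the reference and that precision is the sample standard deviation of the signed deviations; the text does not state these definitions, so both are assumptions. The function name `accuracy_and_precision` and the sample values are hypothetical and are not data from this study.

```python
from statistics import mean, stdev


def accuracy_and_precision(derived, reference):
    """Return (accuracy, precision) of derived depths against reference depths.

    Assumed definitions (not confirmed by the text):
      accuracy  = mean absolute deviation from the reference, in metres
      precision = sample standard deviation of the signed deviations, in metres
    """
    if len(derived) != len(reference):
        raise ValueError("derived and reference must have the same length")
    if len(derived) < 2:
        raise ValueError("at least two value pairs are required")
    # Signed deviation of each derived depth from its reference depth.
    deviations = [d - r for d, r in zip(derived, reference)]
    accuracy = mean(abs(e) for e in deviations)
    precision = stdev(deviations)
    return accuracy, precision


if __name__ == "__main__":
    # Hypothetical inundation depths in metres, for illustration only.
    derived_depths = [0.52, 0.47, 0.61, 0.40]
    reference_depths = [0.50, 0.50, 0.50, 0.50]
    acc, prec = accuracy_and_precision(derived_depths, reference_depths)
    print(f"accuracy = {acc:.2f} m, precision = ±{prec:.2f} m")
```

Applied separately to the automatically derived points and to the expert measurements, a function of this kind would give two pairs that can be compared directly, provided both use the same reference values.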

Conclusions
In this study we showed the applicability and benefits of user-generated flood images for the purpose of documenting local building inundation. To this end, we developed a method to derive the local inundation depth within a 3-D point cloud based on user-generated flood images. The aim was to determine the accuracy of this proposed workflow concerning the derived flood elevation as well as the resulting inundation depths at a local building scale.
The results of this study have shown that the developed methodology is able to obtain measurements of local flood elevation and building inundation depth with an accuracy of < 0.20 m. The overall accuracy is 0.05 m ± 0.13 m for flood elevation and 0.13 m ± 0.10 m for the local building inundation depth. It is also shown that the method is applicable to crowdsourced images captured in an unorganized manner. Moreover, the measurements taken by experts revealed that the proposed method produces results almost as accurate as those provided by human experts. The main advantage of the semi-automatic segmentation process is its time efficiency and, thus, the possibility of processing multiple flood images to obtain more robust inundation results.
The key findings of this study can be summarized in the following points. (1) A satisfactory accuracy of local flood elevation and building inundation depth determination can be achieved using the proposed workflow. Considering the natural fluctuation of the water surface (here ±0.10 m), the overall precision of the method (±0.13 m) is only slightly worse than the inherent uncertainty of the phenomenon itself. (2) The extraction of the 2-D waterline from the provided flood image has a major influence on the accuracy of the final results. It is, thus, recommended that the image segmentation process be stabilized by pre-selecting the available flood images according to their individual image characteristics. Images with low contrast, especially in wet areas along the façade, tended to result in a less accurate 2-D waterline.
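The recommendation to pre-select flood images by their image characteristics could, for instance, be approached with a simple contrast check. The sketch below uses the RMS contrast (the standard deviation of the pixel intensities) as one possible criterion. The choice of this measure, the threshold `min_contrast`, and the function names are assumptions made for illustration; the study does not prescribe a particular selection rule.

```python
from math import sqrt


def rms_contrast(gray):
    """RMS contrast of a grayscale image given as rows of intensities in [0, 1]."""
    pixels = [value for row in gray for value in row]
    if not pixels:
        raise ValueError("image contains no pixels")
    mean_intensity = sum(pixels) / len(pixels)
    # Population standard deviation of the intensities.
    return sqrt(sum((v - mean_intensity) ** 2 for v in pixels) / len(pixels))


def preselect_images(images, min_contrast=0.1):
    """Return the names of images whose RMS contrast reaches the threshold.

    `images` maps an image name to its grayscale pixel rows.
    `min_contrast` is a hypothetical threshold and would need tuning.
    """
    return [name for name, gray in images.items()
            if rms_contrast(gray) >= min_contrast]


if __name__ == "__main__":
    # Two tiny synthetic images, for illustration only.
    candidates = {
        "high_contrast": [[0.0, 1.0], [1.0, 0.0]],
        "low_contrast": [[0.50, 0.50], [0.50, 0.52]],
    }
    print(preselect_images(candidates))
```

In practice the check might be restricted to the image region along the façade, since the text points to low contrast in wet façade areas as the main difficulty.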
In comparison to other studies using, for example, high-resolution SAR data for inundation depth determination, it has been shown that user-generated images can serve as an alternative or complementary data source to examine the effects of flooding on a very local scale. Our approach is, thus, considered beneficial for applications such as flood damage assessment or resilience planning, and more generally for all research dealing with urban floods. It delivers a low-cost approach for automatically detecting the flood elevation and inundation depth indicated in a flood image in 3-D, and it does not rely on in situ estimations. Thus, the implementation of the provided concept in the form of a web service or mobile app can be beneficial for local authorities, disaster managers, engineers, and insurance assessors in order to facilitate flood disaster management. Future work should investigate the integration of already-existing data sources, namely non-flood images. Depending on computational resources, the service could then allow near-real-time application as soon as flood images are available. Moreover, the analysis of a much broader data set is manageable while not necessarily requiring fieldwork. Furthermore, the methodology can be adapted to various other use cases where only single-image information is given.

Figure 1. Overview map of the study area, including the measurement setup for acquisition of the terrestrial laser scanning (TLS) data and the camera positions of the non-flood images.
to improve the alignment result. The parameters used to assess the overall quality of the photogrammetric point clouds in comparison to the TLS reference data are explained below.
1. The alignment quality of the photogrammetric point cloud to the TLS reference point cloud is based on the nearest neighboring point between the two point clouds.
2. Completeness and point density are individually determined for the façade plane as well as the terrain plane of the 3-D point cloud. The completeness is calculated as a ratio of the number of 0.20 m × 0.20 m cells with a minimum of one point in relation to the full count of cells
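The completeness ratio described in item 2 can be sketched as follows: the points of one plane are binned into 0.20 m × 0.20 m cells, and the number of occupied cells is divided by the full cell count. The sketch assumes that the points are already expressed in 2-D in-plane coordinates and that the full cell count is taken over a rectangular extent supplied by the caller; neither detail is given in the text, and the function name `completeness` is hypothetical.

```python
from math import ceil, floor


def completeness(points, extent, cell=0.20):
    """Share of grid cells within `extent` that hold at least one point.

    points: iterable of (u, v) in-plane coordinates in metres
    extent: (u_min, v_min, u_max, v_max) of the evaluated plane (assumed rectangular)
    cell:   edge length of a square cell in metres
    """
    u_min, v_min, u_max, v_max = extent
    n_cols = ceil((u_max - u_min) / cell)
    n_rows = ceil((v_max - v_min) / cell)
    if n_cols <= 0 or n_rows <= 0:
        raise ValueError("extent must have positive width and height")

    occupied = set()
    for u, v in points:
        col = floor((u - u_min) / cell)
        row = floor((v - v_min) / cell)
        # Points outside the extent (or exactly on its upper edge) are ignored.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            occupied.add((col, row))
    return len(occupied) / (n_cols * n_rows)


if __name__ == "__main__":
    # Hypothetical facade points in metres, for illustration only.
    facade_points = [(0.10, 0.10), (0.15, 0.05), (0.50, 0.30), (0.90, 0.10)]
    ratio = completeness(facade_points, extent=(0.0, 0.0, 1.0, 0.4))
    print(f"completeness = {ratio:.0%}")
```

Point density per cell could be obtained from the same binning by counting the points in each cell instead of only recording whether a cell is occupied.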

Figure 2. Example flood image with reference positions for manual in-field inundation depth measurements at the study site indicated with red arrows.

Figure 3. Workflow of this study, depicting the individual operations of the proposed methodology. Blue indicates operations executed on point cloud level (3-D); yellow indicates operations executed on image level (2-D).

Figure 4. Segmentation results of two different flood images (top row: precision = 99.6 %, recall = 95.6 %; bottom row: precision = 98.5 %, recall = 54.3 %). (a, e) depict the original images. (b, f) show the probability maps after RF classification. (c, g) are the binary images of the finally classified water body after largest-component analysis. (d, h) show subsets of the extracted 2-D waterlines.

Figure 5. Distribution of flood elevation values in comparison to the TLS reference flood elevation value Z w,TLS from all input flood images (a) based on all single points and (b) aggregated per image.

Figure 6. Derived inundation depth accuracy and precision of the proposed method (blue) and expert measurements (magenta) given in comparison to the water movement (cyan).

Table 1. Manual in-field inundation depth measurements at the seven reference positions at the study site, as given in Fig. 2.

Table 2. Quality indicators of the photogrammetric point cloud registration in comparison to the TLS point cloud.