“Rapid post-earthquake modelling of coseismic landslide magnitude and distribution for emergency response decision support” by Tom R.
Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2017-83, 2017. Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Discussion started: 2 March 2017. © Author(s) 2017. CC-BY 3.0 License.



Introduction
Coseismic landslides are one of the most widespread and destructive hazards to result from earthquakes in mountainous environments. Fatalities in earthquakes with landslides have been shown to be up to ten times higher than in comparable earthquakes without landslides (Budimir et al., 2014). Landslides are also a key inhibitor of relief and reconstruction via the blocking of critical infrastructure, and present a chronic hazard, with post-earthquake landslide rates remaining elevated compared to pre-earthquake rates for at least several years (Marc et al., 2015). Rapidly identifying the distribution of landslides following an earthquake is therefore crucial for understanding the total earthquake impacts (Robinson and Davies, 2013), aiding immediate emergency response efforts, including search and rescue (SAR), and assessing the longer-term post-earthquake risks. If an assessment of landsliding is to be useful, it needs to be rapid, i.e. generating outputs within the same timeframe (hours to days after the mainshock) as a response is being coordinated, and it needs to be understandable and communicated to appropriate stakeholders. However, post-earthquake landslide mapping is a difficult and time-consuming task, hindered by issues relating to the collection and processing of appropriate satellite or aerial images, cloud cover, and the slow speeds associated with manually identifying and mapping large numbers of landslides. Following the 2015 Mw 7.8 Gorkha earthquake in Nepal, efforts to rapidly identify and map coseismic landslides using satellite imagery were undertaken by various international groups. However, such issues resulted in most initial mapping not being completed until more than one month after the earthquake occurred (e.g., Kargel et al., 2016), and despite most suitable imagery being collected during this time, complete inventories containing polygon data took considerably longer (e.g. Martha et al., 2016; Roback et al., 2017).
By comparison, the post-earthquake emergency response began to transition to a recovery phase within days to weeks after the event (Nepal Army, 2016). Consequently, the results of efforts to map coseismic landslides can come too late to effectively inform the emergency response to the earthquake.
This issue can potentially be mitigated through the use of empirically-based models of coseismic landsliding. Several previous attempts have been made to produce landslide models that can be applied rapidly after an earthquake (e.g. Jibson et al., 2000; Godt et al., 2008; Nowicki et al., 2014; Kritikos et al., 2015; Gallen et al., 2016), but with somewhat limited success. These models have attempted to predict the locations where landslides are most likely to have occurred, based either on statistical analysis of the locations of identified landslides from one or more prior earthquakes, or on a simplified Newmark analysis.
Approaches using statistical analysis of landslides from previous earthquakes have attempted to develop global relationships between predisposing factors and landslide occurrence (e.g. Kritikos et al., 2015), allowing the model to be rapidly applied to the specific area of interest following an earthquake. However, these models suffer from issues related to the quality and accuracy of input data, the availability of training inventories from a sufficient number of different representative environments, and an inability to effectively extrapolate relationships beyond the input data limits. Approaches using a simplified Newmark analysis adapt the well-established Newmark sliding block model (Newmark, 1965) to predict slope stability using peak ground acceleration (PGA), local slope angle, and estimates of material shear strength properties (Jibson et al., 2000; Gallen et al., 2016). They can therefore be applied following an earthquake using only a digital elevation model (DEM) and established ground motion prediction equations. This method is limited, however, because estimates of shear strength properties are difficult to obtain and are generally unknown at scales relevant to landsliding (Gallen et al., 2015). Gallen et al. (2016) applied a rapid Newmark analysis model immediately following the 2015 Gorkha earthquake, which estimated 2987 landslides above the main slip patch compared to 2214 landslides that they were able to observe. However, their model showed significant discrepancies from the observed spatial pattern of landslides, and subsequent mapping (Martha et al., 2016; Roback et al., 2017) has suggested that up to ten times more landslides occurred than Gallen et al. (2016) modelled and observed. This failure is crucial, as the locations where landslides occur dictate the required response as much as, or perhaps more than, the number or volume of landslides. Developing a method that can accurately model both the spatial distribution and magnitude of landsliding within hours to days of a large earthquake is therefore vital.
This study presents a new approach for modelling landslide magnitude and distribution following an earthquake using empirical methods. Instead of attempting to identify global relationships for predisposing factors, this study uses fuzzy logic in Geographic Information Systems (GIS) to rapidly assess relationships for the affected area using small samples of landslides identified by initial mapping efforts. The aim is to investigate whether these initial mapping efforts can be used to forecast the locations of as-yet unidentified landslides, and what the necessary requirements of these initial efforts are in terms of spatial coverage and number of landslides identified. We apply the model to the 2015 Gorkha earthquake and discuss the potential speed with which this model could be applied following future earthquakes. The purpose of the study is to demonstrate that meaningful assessments of the magnitude and total distribution of coseismic landslides can be undertaken within a short time after the earthquake, allowing the outputs to effectively inform emergency response.

Gorkha earthquake application
The Mw 7.8 Gorkha earthquake occurred on 25 April 2015 with an epicentre ~80 km north-west of Kathmandu (Fig. 1), rupturing a ~150 km section of the Main Himalayan Thrust (MHT) (Avouac et al., 2015; Hubbard et al., 2016). It resulted in intense ground shaking throughout central Nepal, leading to large-scale damage to built infrastructure and > 8000 fatalities.
The earthquake predominantly affected the Lesser Himalayan region north of Kathmandu, which is characterised by steep slopes and elevations ranging from ~1000-3000 m. Martha et al. (2016) and Roback et al. (2017) documented between 15,000 and 25,000 coseismic landslides with a total area of ~90 km² and affecting an area > 14,000 km² (Fig. 1). The earthquake was followed by a series of powerful aftershocks, including a Mw 7.3 event on 12 May 2015, but most of the damage is believed to have resulted from the mainshock, with just 213 landslides attributed to the 12 May aftershock (Martha et al., 2016). Landsliding from the mainshock led to widespread losses in remote communities, as well as impacts on roads and trails that provide the only access to those communities.
Efforts to rapidly identify and map the resulting landslides led to one of the largest-ever NASA-led satellite image acquisition responses to an earthquake disaster (Kargel et al., 2016). However, persistent cloud cover in the affected region, combined with the time required to download and georeference large amounts of imagery, hindered these efforts. By 4 May 2015 (9 days after the mainshock), a joint group from Durham University (DU) and the British Geological Survey (BGS) had mapped the locations of just 279 landslides, which increased to 2006 by 7 May, 12 days after the mainshock (Table 1; http://ewf.nerc.ac.uk/2015/05/08/nepal-earthquake-update-on-landslide-hazard-2/). A parallel mapping effort, undertaken by a joint group led by the International Centre for Integrated Mountain Development (ICIMOD), NASA, and the University of Arizona (UA), identified 4312 landslides (Fig. 1) in the period to 2 June 2015 (38 days after the mainshock) (Kargel et al., 2016). Kargel et al. (2016) argued that this was one of the fastest and broadest emergency remote sensing efforts undertaken by NASA. Importantly, however, the on-the-ground emergency response following the earthquake was scaled back from the end of May, 36 days after the mainshock, and officially ended on 15 June 2015, 51 days after the mainshock (Nepal Army, 2016). Consequently, many of the results of these mapping efforts came too late to inform any humanitarian activities during the emergency response. It is therefore clear that the current approach to mapping landslides post-earthquake is not sufficiently fast to effectively inform or support decision making during an emergency response.
In this study, we build a model that can rapidly assess the potential total distribution of landslides based on an initial, small sample of identified landslides collected soon after the earthquake. To do this, we model the spatial distribution of landsliding based only on the locations of the 2006 landslides identified by the DU-BGS group up to 7 May 2015. The model is tested against the 4312 landslides independently mapped by the ICIMOD-NASA-UA group up to 2 June 2015. While this larger dataset is thought to contain some of the 2006 landslides from the DU-BGS group, differences in mapping approaches (e.g. landslide top points versus centroids) between the groups result in minor location differences between the two datasets. Subsequently, we test the effect of reducing the number of training landslides in order to examine the minimum number required to successfully model the final mapped spatial distribution and magnitude of landsliding.

Data
Landslides were manually mapped by the authors in GIS in the immediate aftermath of the earthquake using imagery from a range of sources, including web-hosted high-resolution optical data in Google Crisis Response (e.g. UK-DMC2 © DMCii, WorldView © DigitalGlobe Inc., SPOT © CNES), imagery accessed via the Disaster Charter, imagery available from USGS HDDS Explorer, and imagery specifically tasked over regions of interest (e.g. Pleiades © CNES). All data, apart from those hosted on web-based GIS platforms, variously required georeferencing, pansharpening, and orthorectification. A database of cloud cover was maintained to monitor where the ground had been visible and mapped, and where it remained obscured. All identified landslides were mapped as a single polyline along the long axis (crest to toe), allowing data on landslide location, extent, and intersection with infrastructure to be quickly extracted. Point landslide positions at the crest end of the mapped polyline are extracted for this analysis, representing the location of initial failure.
Mapping was undertaken at a scale of approximately 1:10,000 and was incrementally posted to the Humanitarian Data Exchange portal (https://data.humdata.org/group/nepal-earthquake), and shared more widely with stakeholders in Nepal in the days and weeks after the earthquake. This mapping effort, and the lessons learned, is described in detail in Williams et al. (forthcoming).
Topographic factor analysis undertaken herein has used the open-source ASTER GDEM V2 (https://asterweb.jpl.nasa.gov/gdem.asp), which has a cell resolution of 30 m, while seismic data were downloaded at the time

Method and data analysis
Approaches to modelling regional-scale landslide hazard have been summarised in several comparative overviews (e.g., Aleotti and Chowdhury, 1999; Guzzetti et al., 1999; Wang et al., 2005). The present study uses fuzzy logic, as the method is computationally simple, can consider highly uncertain data inputs, and has been shown to match or out-perform other approaches (Pradhan, 2010; Bui et al., 2012; Pourghasemi et al., 2012).

Fuzzy logic
Fuzzy logic derives from fuzzy set theory, in which an event is assigned a degree of membership, varying between 0 and 1, to a given set, where 1 indicates that the event is entirely related to the set (full membership) and 0 indicates that the event is entirely unrelated to the set (no membership). When utilised for landslide hazard models, this membership value describes how different values of a predisposing factor influence landslide occurrence, with higher values representing greater influence. For instance, steeper slope angles are known to produce higher rates of landsliding compared to shallower slope angles, meaning the influence of slope angle on landsliding increases with increasing slope angle. Influence (I) is measured for unique values of each predisposing factor using a kernel density estimate (KDE) by comparing the prevalence of a given unique value at landslide locations to its prevalence throughout the whole study area:

I_x = L_x^KDE / S_x^KDE,  for x = 1, 2, …, n

where I_x is the influence of factor value x, L_x^KDE is the KDE of value x for landslide locations, S_x^KDE is the KDE of value x throughout the study area, and n is the total number of unique factor values considered. Consequently, if the prevalence of a value is greater at landslide locations than throughout the whole study area, that value has a high influence on landslide occurrence. I is then normalised to a 0-1 scale by

I_norm,x = I_x / max(I_1, …, I_n)

Factors that show a clear and definable relationship between influence and factor values are then carried forward, while factors that show no relationship, or a relationship that cannot be mechanically explained, are removed from the analysis.
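The influence calculation can be sketched as follows: a minimal NumPy/SciPy example using hypothetical slope-angle samples (the factor, sample sizes, and distributions are illustrative, not the study's data).

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical slope-angle samples (degrees): factor values at mapped
# landslide initiation points versus values across the whole study area.
rng = np.random.default_rng(42)
study_area_slopes = rng.uniform(0, 70, 5000)                 # all study-area cells
landslide_slopes = np.clip(rng.normal(45, 10, 300), 0, 70)   # landslide cells

# Kernel density estimates of the two factor-value distributions.
kde_landslides = gaussian_kde(landslide_slopes)
kde_study_area = gaussian_kde(study_area_slopes)

# Influence I_x = L_x^KDE / S_x^KDE: values over-represented at landslide
# locations relative to the study area receive high influence.
x = np.linspace(0, 70, 71)
influence = kde_landslides(x) / kde_study_area(x)

# Normalise to a 0-1 scale so the most influential value has I_norm = 1.
i_norm = influence / influence.max()
```

With these synthetic samples the normalised influence peaks around the slope angles most over-represented at landslide locations, which is the behaviour the membership curves in Fig. 3 summarise.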
Factors carried forward undergo a semi data-driven linear regression to find the most appropriate function to describe the distribution of I_norm, using the coefficient of determination (R²) to assess goodness of fit. A semi data-driven approach is preferred as this allows the user to manually alter the function if necessary (e.g., by using multiple functional forms) to achieve better goodness of fit for specific influence values at the expense of overall goodness of fit. For instance, in a conservative approach, it is desirable to achieve a greater fit for high influence values at the expense of low influence values, as this will produce conservative estimates of landslide hazard. Typical functional forms considered include linear functions. Once the membership functions are aggregated, values < 0.2 suggest landsliding is very weakly anticipated, while values > 0.8 suggest landsliding is very strongly anticipated. Landslides are therefore more likely to occur at larger aggregate influence values, as landsliding is more strongly anticipated at these locations.
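The curve-fitting step can be sketched as below; the binned influence values are illustrative, not those derived for Gorkha, and the weighted fit is one possible way to realise the conservative, semi data-driven preference for high-influence values.

```python
import numpy as np

# Hypothetical normalised influence values for slope-angle bins (degrees);
# in the actual workflow these come from the KDE-based influence analysis.
slope = np.arange(15, 66, 5, dtype=float)
i_norm = np.array([0.02, 0.08, 0.15, 0.27, 0.40, 0.52,
                   0.66, 0.78, 0.88, 0.95, 1.00])

# Ordinary least-squares linear fit and its coefficient of determination.
coeffs = np.polyfit(slope, i_norm, deg=1)
predicted = np.polyval(coeffs, slope)
ss_res = np.sum((i_norm - predicted) ** 2)
ss_tot = np.sum((i_norm - np.mean(i_norm)) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Conservative, semi data-driven variant: weight the regression toward
# high-influence values at the expense of overall goodness of fit.
conservative = np.polyfit(slope, i_norm, deg=1, w=i_norm + 0.1)
```

The unweighted R² measures overall fit; inspecting residuals at the high-influence end then guides whether a weighted or piecewise functional form is warranted.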
Various approaches to aggregation exist; however, the fuzzy GAMMA approach has been shown to be the most effective for landslide modelling (Bui et al., 2012; Kritikos and Davies, 2015). Fuzzy GAMMA establishes the combined effect of multiple membership functions for each pixel such that

H_LS = (1 − Π_{F=1}^{j} (1 − I_norm,F))^γ × (Π_{F=1}^{j} I_norm,F)^(1−γ)

where H_LS is landslide hazard; I_norm,F is the membership function for factor F, where F = 1, 2, …, j; j is the number of factors to be aggregated; and γ is a user-defined parameter between 0 and 1. The first term is the fuzzy algebraic sum and the second the fuzzy algebraic product. The effect of changes in γ has previously been tested, with the optimal value for landslide modelling shown to be ~0.9 (Kritikos and Davies, 2015; Kritikos et al., 2015). This approach can therefore identify locations in the study area where the combination of predisposing factors results in landsliding being most strongly anticipated, and hence where as-yet unidentified landslides are more likely to have occurred.
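A minimal NumPy sketch of the fuzzy GAMMA operator, combining the fuzzy algebraic sum and fuzzy algebraic product per pixel; the 2x2 membership grids are hypothetical.

```python
import numpy as np

def fuzzy_gamma(memberships, gamma=0.9):
    """Aggregate per-factor membership grids with the fuzzy GAMMA operator:
    H_LS = (fuzzy algebraic sum)^gamma * (fuzzy algebraic product)^(1 - gamma).
    """
    m = np.asarray(memberships, dtype=float)   # shape: (factors, rows, cols)
    fuzzy_sum = 1.0 - np.prod(1.0 - m, axis=0)
    fuzzy_product = np.prod(m, axis=0)
    return fuzzy_sum ** gamma * fuzzy_product ** (1.0 - gamma)

# Three hypothetical factor membership grids for a 2x2 pixel window.
factors = [
    np.array([[0.9, 0.1], [0.5, 0.0]]),
    np.array([[0.8, 0.2], [0.5, 0.9]]),
    np.array([[0.7, 0.3], [0.5, 0.4]]),
]
h_ls = fuzzy_gamma(factors, gamma=0.9)
```

Note a useful property of this operator: any factor with zero membership drives the fuzzy product, and hence H_LS, to zero at that pixel, which is exactly the behaviour exploited when PGA is treated as a threshold factor.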

Data analysis
In the present study, landslides mapped by the DU-BGS group prior to 7 May 2015 are used as the training dataset (n = 2006) and landslides mapped by the ICIMOD-NASA-UA group prior to 2 June 2015 are used as the test dataset (n = 4312). The training dataset is used to derive the KDEs and membership functions, and the test dataset is used to measure the accuracy and sensitivity of the model via the area under receiver operating characteristic (ROC) curves. The ROC curve is a fundamental tool for evaluating model accuracy and plots the true positive rate of the model against the false positive rate using various cut-off values (Metz, 1978; Zweig and Campbell, 1993). Landslides occurring in cells with H_LS values above the cut-off are considered true positives, while cells above the cut-off without landslides are considered false positives. To be considered successful, models must typically achieve area under the curve (AUC) values > 0.7 (Kritikos et al., 2015).
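The AUC calculation can be sketched using the rank-based (Mann-Whitney) formulation, which is equivalent to integrating the ROC curve over all cut-offs; the hazard scores and landslide mask below are synthetic, not the study's values.

```python
import numpy as np

def roc_auc(h_ls, landslide_mask):
    """Area under the ROC curve for per-cell hazard scores against a boolean
    mask of observed landslide cells. AUC equals the probability that a random
    landslide cell outranks a random non-landslide cell (Mann-Whitney U)."""
    scores = np.asarray(h_ls, dtype=float).ravel()
    labels = np.asarray(landslide_mask, dtype=bool).ravel()
    pos = scores[labels]       # cells containing mapped landslides
    neg = scores[~labels]      # all other cells
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

# Synthetic scores: landslide cells tend to have higher H_LS than other cells.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.uniform(0.4, 1.0, 50),    # landslide cells
                         rng.uniform(0.0, 0.8, 200)])  # non-landslide cells
mask = np.concatenate([np.ones(50, bool), np.zeros(200, bool)])
auc = roc_auc(scores, mask)   # > 0.7 would count as a successful model here
```

The pairwise formulation is O(n²) and is only a sketch; for full-resolution grids a sorted cumulative-rank implementation would be used instead.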
In total, we assess the influence of 12 predisposing factors, each of which has previously been attributed to coseismic landslide occurrence in earthquakes elsewhere (Keefer, 2000; Khazai and Sitar, 2004; Lee et al., 2008; Meunier et al., 2008; Kritikos et al., 2015; Parker et al., 2015) and may therefore influence landsliding in Nepal. With the exception of earthquake-specific parameters (fault plane distance, PGA, Az_Epi and Az_SV), each of these factors can be directly derived from a global DEM, such as the ASTER GDEM V2 used herein. This reduces the initial data requirements, helping to facilitate rapid modelling. This is especially important for locations such as Nepal, where data for other predisposing factors, such as lithology, structural geology or rainfall, are either absent or difficult (and thus time-consuming) to obtain. PGA, fault plane solutions and CMT data are available from the USGS and the Global CMT catalog within a few hours of the earthquake occurring. Consequently, no pre-event data collection is necessary to implement this model, as all necessary data can be gathered post-earthquake from openly available sources.

Peak ground acceleration
PGA is given special consideration in the model, separate from the other predisposing factors. PGA models following an earthquake, such as those produced by the USGS ShakeMap, often change frequently as more acceleration data become available (Fig. 4). Rapid assessment of the influence of PGA on landslide occurrence may therefore give misleading results, as the best PGA model available in the immediate earthquake aftermath may not yet accurately reflect the true pattern. For instance, initial PGA models following the Gorkha earthquake showed that PGA was highest in the region south of Kathmandu. This suggested that the majority of landslides, which occurred north of Kathmandu, occurred in locations with low-to-moderate PGA (Fig. 4). The time required to generate a final (or near-final) PGA model can be on the order of several days or weeks, and the shaking model may therefore be completed too late to be effectively incorporated into the rapid assessments considered herein. Furthermore, there is commonly a high degree of scatter in models that attempt to predict the number, size, and location of coseismic landslides from PGA (e.g. Keefer, 1984; Meunier et al., 2007). To account for these difficulties in modelling landslides from PGA, we therefore model PGA as a threshold factor. This involves setting I_norm = 0.0 for all values below a given threshold, while values above the threshold are assigned I_norm = 0.5. This ensures that locations that do not exceed the threshold cannot sustain landslides, while landslide influence in locations above the threshold is dictated by the combination of other factors. Based on Fig. 4, Martha et al. (2016), and Keefer (1984), this threshold is suggested to be ~0.08 g.

Importantly, while the location and magnitude of PGA values varied between initial USGS ShakeMap models over the first few days, the area contained within the 0.08 g contour was nearly invariant (Fig. 4). Given that ShakeMap interpolates PGA between seismic stations (Wald et al., 2005), it is easier to rapidly define the footprint of the area experiencing shaking (i.e. PGA > 0.08 g) than to define the highest values within this zone, as the number of recording stations increases with distance from the epicentre, even in regions with low recording density.
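The threshold treatment of PGA described above can be sketched in a few lines; the PGA grid is hypothetical, and the 0.08 g value is the threshold suggested in the text.

```python
import numpy as np

PGA_THRESHOLD_G = 0.08  # threshold suggested by Fig. 4, Martha et al. (2016), Keefer (1984)

def pga_threshold_membership(pga_grid, threshold=PGA_THRESHOLD_G):
    """Model PGA as a threshold factor: cells below the threshold cannot
    sustain landslides (I_norm = 0.0), cells at or above it receive
    I_norm = 0.5 so that the other factors dictate the hazard there."""
    pga = np.asarray(pga_grid, dtype=float)
    return np.where(pga >= threshold, 0.5, 0.0)

# Hypothetical ShakeMap-style PGA grid (in g) for a 2x2 window.
pga = np.array([[0.02, 0.07],
                [0.10, 0.30]])
i_norm_pga = pga_threshold_membership(pga)
```

When this membership grid enters the fuzzy GAMMA aggregation, the zero values below the threshold drive the fuzzy algebraic product, and hence the aggregate hazard, to zero, while the uniform 0.5 above the threshold leaves the spatial pattern to the remaining factors.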

Factor influence modelling
Of the 12 predisposing factors considered, seven show a clear and definable relationship between influence and factor value, and these are carried forward in the modelling process. The remaining factors either showed no clear relationship, or the relationship demonstrated could not be mechanically explained, and are therefore removed from the analysis. For instance, NRSD appears to show a relationship in which landslides are more strongly influenced at locations close to rivers than at ridgelines (Fig. 3i). This is in contrast to the relationship observed for planform curvature (Fig. 3e), as well as observations in the literature that suggest that topographic amplification increases landslide occurrence at ridgelines (Densmore and Hovius, 2000; Meunier et al., 2008; Kritikos et al., 2015). This may be a consequence of ridgelines herein for NRSD being defined using watershed boundaries; landslides occurring on slope 'shoulders' within a watershed are thus not accounted for by this factor, but are by planform curvature. Similarly, there is no clear reason why landslides should be strongly influenced at elevations of 2000-3000 m asl and only weakly influenced at higher elevations (Fig. 3h). More likely, this reflects the fact that higher elevations were predominantly outside of the 0.08 g contour. For these reasons, NRSD and elevation are among the five factors that are not carried forward.
For the factors carried forward, increases in slope angle correlate with increased influence up to slope angles of ~65° (Fig. 3a), at which point the relationship ends due to the small area covered by slope angles > 70° (Fig. 2a). Slopes < ~15° show no influence, suggesting these slope angles are insufficient to sustain landsliding (i.e. I_norm = 0). Increased distances from the fault plane, rivers, and river confluences generally result in decreased influence (Fig. 3b-d), despite a small and unexplained increase at distances of ~3000 m from rivers. Increases in relief at both hillslope and sub-hillslope scales are associated with increased influence up to ~200 m and ~90 m respectively (Fig. 3f-g). Similar to slope angle, the variation above these values likely reflects the small area covered by such values (Fig. 2f-g). Larger planform curvature values also correlate with increased influence, with notably higher influence on laterally convex (divergent) noses compared to laterally concave (convergent) hollows (Fig. 3e).
These findings are similar to those of other studies (Khazai and Sitar, 2004;Parker et al., 2015;Kritikos et al., 2015) in that they show coseismic landslides are most strongly encouraged on steep, laterally convex hillslopes close to rivers, and most weakly encouraged on shallow, linear slopes situated far from rivers.

Model success
Using the membership curves derived in Fig. 3, which show the variation in I_norm for each factor, we aggregate different combinations of the seven predisposing factors carried forward to find the best-performing combination of variables to model the spatial distribution and magnitude of the 4312 test landslides. Using all seven predisposing factors results in AUC = 0.838, but the best performance (AUC = 0.870) is achieved by excluding sub-hillslope relief and planform curvature (Fig. 5). The worst-performing model (AUC = 0.761) excludes hillslope relief and planform curvature (Fig. 5). Importantly, all models achieve AUC values > 0.7, suggesting that any combination of these factors is able to model spatial landslide distribution with reasonable accuracy.
The output of the most successful model is shown in Fig. 6, with the locations where landslides are strongly-to-very strongly anticipated (H_LS > 0.6) in red and weakly-to-very weakly anticipated (H_LS < 0.4) in blue. The maximum modelled H_LS = 0.93 suggests landsliding from this earthquake was very strongly anticipated in at least some locations. The total area where landslide occurrence is mildly-to-very strongly anticipated (H_LS > 0.4) covers ~18,000 km², approximately the same size as the total affected area (~14,000 km²) mapped by Martha et al. (2016) and Roback et al. (2017), while the area where landsliding is very strongly anticipated (H_LS > 0.8) covers ~85 km², similar to the 90 km² total landslide area mapped by Martha et al. (2016). This area is concentrated in the major river valleys in Gorkha, Dhading, Rasuwa, and Sindhupalchok districts (Fig. 6), corresponding well with observed landsliding (Fig. 1) and confirming the high AUC value. With a combination of a high maximum H_LS value and a large area where landsliding is strongly anticipated, the model suggests that widespread landsliding from this earthquake was highly likely.

Landslide magnitude and spatial distribution
In order to be useful for informing emergency responders, it is essential that the final output model explicitly demonstrates both the spatial extent and the magnitude, either in terms of number, area or volume, of landsliding. To do this, we estimate landslide magnitude in terms of number density as a function of the kernel density of pixels with H_LS values exceeding some user-defined threshold (e.g. all cells with H_LS ≥ 0.9). The total spatial extent of landsliding is then represented by the area where kernel density is > 0. This assumes that the magnitude of landsliding in any given area is directly linked to the frequency of cells above the corresponding threshold within that area. An area dominated by cells with high H_LS is expected to experience a higher magnitude of landsliding (i.e. a higher number density) than a comparable area dominated by cells with low H_LS.
Taking the most successful hazard model (Fig. 6), we set threshold limits at the 99th, 95th, and 90th percentiles of H_LS values, where the area with values above the threshold accounts for 1%, 5%, and 10% of the total area (H_LS > 0) respectively, corresponding to H_LS = 0.74, H_LS = 0.65 and H_LS = 0.60 (Fig. 6 inset). The corresponding cells are extracted and the kernel density for each threshold is calculated using a 1 km² moving window (Fig. 7). To test how well the training inventory is able to model total landslide magnitude and distribution, the results are compared against the kernel density of all 4312 test landslides, also defined using a 1 km² moving window. Because this study considers individual pixels, the total number of cells exceeding each threshold is large (1% = 388,694; 5% = 1,927,614; 10% = 3,827,036) and consequently the corresponding kernel densities (100-1000 per km²) are not directly comparable to real landslide number densities (1-100 per km²; http://ewf.nerc.ac.uk/2015/06/30/updated-30-june-landslide-inventory-following-25-april-and-12-may-nepalearthquakes/). This occurs because groups of adjacent pixels exceeding the threshold may in reality represent only one or two landslides, but each pixel is considered individually in the kernel density analysis. To account for this, we scale all kernel density models onto a 0-1 scale in which the maximum kernel density = 1 and all other values are scaled proportionally. The results therefore consider relative landslide magnitude (i.e. how locations compare relative to each other) rather than absolute magnitude.
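The thresholding, density, and rescaling steps can be sketched as follows. A uniform moving window stands in for the 1 km² kernel (33 cells at 30 m resolution is an assumed approximation), and the hazard grid is synthetic with an artificial hotspot.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def relative_magnitude(h_ls, percentile=99, window=33):
    """Relative landslide magnitude: moving-window density of cells exceeding
    a percentile threshold of H_LS, rescaled so the maximum equals 1."""
    h = np.asarray(h_ls, dtype=float)
    threshold = np.percentile(h[h > 0], percentile)
    exceed = (h >= threshold).astype(float)
    density = uniform_filter(exceed, size=window)
    return density / density.max() if density.max() > 0 else density

# Synthetic hazard grid with a hotspot of high H_LS values.
rng = np.random.default_rng(1)
h_ls = rng.random((200, 200))
h_ls[50:80, 50:80] += 0.5
h_ls = np.clip(h_ls, 0.0, 1.0)
rel = relative_magnitude(h_ls)
```

Because the output is rescaled to 0-1, it expresses how locations compare relative to each other, mirroring the relative (not absolute) magnitude maps in Fig. 7.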
By setting a threshold at the 99th percentile (H_LS > 0.74), the model is able to accurately predict both the spatial distribution and the relative magnitude of landsliding (Fig. 7). Generally, the model performs well, although it over-predicts in Gorkha district near the mainshock epicentre and under-predicts in Sindhupalchok district near the Mw 7.3 aftershock. In particular, it is notable that the model successfully highlights the high relative magnitude of landsliding near the mainshock epicentre, despite no training landslides coming from this region (Fig. 1), as well as closely matching the observed spatial pattern of landsliding. However, it fails to identify landslides in the hills south of Kathmandu, suggesting that landslides here were not influenced in the same way, or by the same factors, as landslides further north; none of the training landslides are located in this region (Fig. 1). Comparatively, setting thresholds at the 95th (H_LS > 0.65) and 90th (H_LS > 0.60) percentiles substantially over-estimates both the relative magnitude, particularly close to the Nepal-China border, and the total landslide pattern, whilst still underestimating landslide magnitude near the Mw 7.3 aftershock (Fig. 7).

Minimum landslide numbers
The current method is trained on 2006 landslides mapped within the first 12 days following the mainshock. We test the impact of reducing the number of landslides used to train the model in order to estimate how soon after an earthquake this method could be implemented, given that the number of landslides mapped immediately after an earthquake may be quite small (Table 1). In reducing the number of training landslides, we also consider the effect of their spatial distribution on the model results by comparing landslide sub-samples randomly distributed across the affected area with sub-samples that are clustered (Fig. 8). These conditions represent mapping efforts undertaken with partial cloud cover (randomly distributed sub-samples) and more extensive cloud cover with intermittent gaps (clustered sub-samples). This allows us to test the effect of both the number of training landslides and their spatial distribution, and to consider the weather conditions to which this approach may be best suited.
For the randomly distributed sub-samples, we iteratively select 500, 250, and 100 landslides from the original 2006 training landslides. We repeat this five times, producing five sub-samples that mimic the incremental mapping of landslides following an earthquake. For the clustered sub-samples, we split the original training landslides into 10 separate clusters based on landslide proximity to each other. To test the effect of each sub-sample on the model, we calculate the distribution of I_norm as before for the seven predisposing factors for which membership functions have been derived. We then evaluate the R² fit of each sub-sample to the corresponding membership curves and compare the results with the R² values achieved for the original 2006 training landslides (Fig. 3).
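The random sub-sampling experiment can be sketched as below. The study-area and landslide slope samples are synthetic, constructed so that steeper slopes are over-represented at landslides, and histogram densities stand in for the KDEs; the R² comparison against the full-sample curve mirrors the procedure described above.

```python
import numpy as np

def influence_curve(landslide_vals, area_vals, bins):
    """Normalised influence per factor-value bin: landslide density divided
    by study-area density, rescaled so the maximum equals 1."""
    l_dens, _ = np.histogram(landslide_vals, bins, density=True)
    s_dens, _ = np.histogram(area_vals, bins, density=True)
    influence = np.where(s_dens > 0, l_dens / np.maximum(s_dens, 1e-12), 0.0)
    return influence / influence.max()

def r_squared(reference, estimate):
    ss_res = np.sum((reference - estimate) ** 2)
    ss_tot = np.sum((reference - np.mean(reference)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(7)
bins = np.linspace(0, 70, 15)   # slope-angle bins (degrees)

# Synthetic study area and full training set: failure rate rises with slope.
area = rng.uniform(0, 70, 50000)
full = rng.choice(area, size=2006, replace=True, p=area / area.sum())
reference = influence_curve(full, area, bins)

# Randomly distributed sub-samples of decreasing size, five iterations each,
# scored by R^2 against the full-sample membership curve.
results = {}
for n in (500, 250, 100):
    results[n] = [
        r_squared(reference,
                  influence_curve(rng.choice(full, n, replace=False), area, bins))
        for _ in range(5)
    ]
```

Smaller sub-samples produce noisier influence curves and hence, on average, lower R² against the full-sample curve, which is the degradation the experiment quantifies.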
For the majority of factors, we achieve R² values within 10% of the original using 500 randomly distributed landslides (Fig. 8). For all factors other than distance from fault plane and distance from confluences, we achieve R² values within 25% of the original when using 250 randomly distributed landslides, while both relief factors and planform curvature achieve the same result for 100 randomly distributed landslides (Fig. 8). However, for the clustered sub-samples there is little consistency between factors with regard to the number of landslides and the R² values achieved. For the sub-hillslope relief, distance from rivers, and distance from confluences factors, the largest clusters have the greatest discrepancies in R² values compared to the original, while smaller clusters are able to achieve R² values within 10% of the original (Fig. 8).
This suggests that almost identical estimates of landslide hazard (Fig. 6) and of magnitude and spatial pattern (Fig. 7) could have been made using just several hundred training landslides, provided that these landslides had a sufficiently wide spatial distribution. Had these landslides been clustered, it is unlikely that similar membership curves, and thus hazard models, would have been derived. We therefore suggest that our model is best suited to conditions of partial cloud cover, where the total visible area of ground, and therefore the number of mapped landslides, is small but covers a wide area, or to cases where large numbers of landslides are visible and the time to complete mapping is limited by slow manual identification methods.

Implications for post-earthquake emergency response decision support
This study has derived a model that can be applied rapidly post-earthquake to aid emergency response. Comparison of modelled relative landslide magnitude and pattern (Fig. 7c) with the location of training landslides (Fig. 1) reveals locations where landslide intensity was expected to be high but landslides had, at the time, not been mapped. In particular, the model anticipates that landsliding in Gorkha and Dhading districts occurred at similar intensities to Sindhupalchok district, despite very few training landslides in the former locations. Likewise, the model correctly identifies that the training inventory under-represents the magnitude of landsliding in Rasuwa district. These results can potentially be used to alert decision makers to the need for response in specific locations before landslides have been confirmed either remotely or on the ground (e.g., in Gorkha and Dhading). In regions badly affected by landslides, ground communications may be severely impacted, meaning the need for response can be overlooked until impacts, which might include landslides, are identified, potentially many days after the event. This study may reduce this risk by enabling decision makers to identify regions likely to have sustained intense landsliding, potentially allowing them to target these areas for response before landslides are conclusively identified. Such an approach may improve response times for badly affected and isolated regions following major earthquakes.

Potential speed of application
The method presented herein has been developed and applied after the fact, and it is therefore necessary to discuss the potential speed with which it could be applied following a future earthquake. Firstly, this study has deliberately focussed on predisposing factors that can be rapidly and directly derived from a DEM, allowing the model to be applied with only minimal additional data requirements. Global DEM datasets at 30 m resolution are freely available from NASA and can be downloaded for the relevant region in a matter of minutes. Likewise, the USGS ShakeMap programme rapidly assesses earthquake parameters within minutes of the earthquake occurring.
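To illustrate how directly such factors follow from a DEM grid, the sketch below derives slope angle and the two windowed-relief factors with NumPy/SciPy. The synthetic DEM, function names, and 30 m cell size are illustrative assumptions, not the authors' implementation.

```python
# Deriving DEM-based predisposing factors: slope angle and relief as the
# standard deviation of elevation in a square moving window.
import numpy as np
from scipy.ndimage import uniform_filter

def slope_deg(dem, cell_size=30.0):
    """Slope angle in degrees from finite-difference elevation gradients."""
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def windowed_relief(dem, radius):
    """Relief as the windowed standard deviation of elevation, using the
    identity var = E[x^2] - E[x]^2 with a uniform (moving-average) filter."""
    size = 2 * radius + 1
    mean = uniform_filter(dem, size)
    mean_sq = uniform_filter(dem ** 2, size)
    return np.sqrt(np.maximum(mean_sq - mean ** 2, 0.0))

# Synthetic 200 x 200 DEM: a ridge-and-valley surface plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 4 * np.pi, 200)
dem = 500 + 200 * np.sin(x)[None, :] * np.cos(x)[:, None] + rng.normal(0, 2, (200, 200))

slope = slope_deg(dem)                            # slope angle factor
hillslope_relief = windowed_relief(dem, 30)       # 30-cell-radius window
sub_hillslope_relief = windowed_relief(dem, 10)   # 10-cell-radius window
```

Each call is a single raster pass, so in principle all of the topographic factors could be recomputed for a newly downloaded DEM tile in seconds to minutes.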
We have also shown that it is possible to apply this method with only several hundred landslides (Fig. 8). While the mapping efforts described herein took up to 38 days or more to identify >4000 landslides (Kargel et al., 2016), the joint DU-BGS group had identified 279 landslides within 9 days of the mainshock (Table 1). This was despite the majority of the affected area suffering from cloud cover, suggesting that in events with less cloud cover this time could potentially be reduced to just a few days. Rapidly available crowdsourced data from unconventional sources such as social media may also be used to inform this model (Earle et al., 2010; Bruns and Liang, 2012). Importantly, however, the results herein suggest that when cloud-free imagery is available for the majority of the affected region, systematically mapping from this imagery in considerable detail, as is currently done, is not required. Instead, mapping could be undertaken in far less detail using a random sample of grid squares covering the entire affected area, with the results then fed into the model described herein. Such an approach may allow landslide mapping and modelling to combine more effectively following an earthquake to inform emergency response operations.
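The grid-square sampling strategy suggested above can be sketched as follows; the grid dimensions and the 15% sampling fraction are illustrative choices, not values from the study.

```python
# Sketch of random grid-square sampling over an affected area: map landslides
# only within a random subset of squares that spans the whole grid.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical affected area divided into a 20 x 30 grid of squares.
n_rows, n_cols = 20, 30
all_squares = [(r, c) for r in range(n_rows) for c in range(n_cols)]

# Map landslides in a random 15% of squares.
sample_size = int(0.15 * len(all_squares))
sampled_idx = rng.choice(len(all_squares), size=sample_size, replace=False)
sampled_squares = [all_squares[i] for i in sampled_idx]

# Because the sample is random, its spatial spread approximates the full grid,
# satisfying the wide-distribution requirement identified above.
rows = np.array([r for r, _ in sampled_squares])
cols = np.array([c for _, c in sampled_squares])
print(f"{sample_size} squares sampled; row span {rows.min()}-{rows.max()}, "
      f"col span {cols.min()}-{cols.max()}")
```

Landslides mapped within the sampled squares would then form the training sub-sample fed into the factor analysis.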
Furthermore, various automated landslide mapping techniques are currently being developed (e.g., Booth et al., 2009; Borghuis et al., 2007; Li et al., 2016), and some methods are even capable of identifying landslides through cloud cover (Kimura and Yamaguchi, 2000). Such techniques can accurately map landslide locations much faster than traditional manual mapping, potentially reducing the time required to map sufficient numbers of landslides. However, whilst automated mapping offers many advantages, at present the range of imagery, the often complex topography, and the poor radiometric distinction between intact slopes and landslides mean that manual mapping is often the most reliable approach, particularly in regions without extensive pre-earthquake research.

Other predisposing factors
There are several other notable factors that have been shown to influence coseismic landsliding elsewhere that have not been considered herein, notably bedrock lithology (Keefer, 2000; Parise and Jibson, 2000; Khazai and Sitar, 2004; Dai et al., 2011), structural geology (Hoek et al., 2005; Selby, 2005; Moore et al., 2009), and rainfall (Iverson, 2000; Dellow and Hancox, 2006). Whilst these factors are certainly important for landslide occurrence, obtaining sufficiently accurate data is often difficult and time-consuming. This is especially true for lower-income countries such as Nepal, where the necessary data either may not exist or may not be readily accessible. Further, we have shown that accurate landslide models can be created without these factors, so long as other important factors, in particular slope angle, are considered (see also Pawluszek and Borkowski (2017) for a comparison of the roles of topographic and non-topographic factors). Of course, if these datasets do exist, then they can easily be incorporated into the factor analysis.

Benefits, limitations, and uncertainties
Several other models exist that allow rapid landslide modelling (Jibson et al., 2000; Godt et al., 2008; Nowicki et al., 2014; Kritikos et al., 2015), including the model of Gallen et al. (2016), which was applied following the Gorkha earthquake. The present study provides a useful complement to these existing models, as well as a more general understanding of factors influencing slope failure. One of the primary benefits of our model over others is its ability to accurately model both the spatial distribution and relative magnitude of landsliding. Gallen et al. (2016) were able to accurately model the total number of landslides in the Gorkha earthquake but found large discrepancies between their predicted landslide pattern and the observed pattern. Incorrectly modelling the locations of landslides may have detrimental effects on emergency response by focussing responders on areas not requiring immediate response.
Our model also benefits from being trained on landslides known to have occurred during the event. Consequently, the model is tailored to the specific location and earthquake, rather than relying on global or regional relationships (e.g., Kritikos et al., 2015). No pre-event knowledge of landsliding in the affected area is necessary, as the model itself establishes the influence of the various factors. The model has been shown to be successful despite potential systematic bias in the initial landslide inventories, such as cloud cover above or below specific elevations, suggesting that the results are robust to such bias. The lack of pre-event data requirements is also beneficial, as it does not necessitate the collection and storage of global datasets. Where data connections are limited or vulnerable, establishing geodatabases of topography and its derivatives, together with a platform on which to run models such as this one, prior to an earthquake would be a prudent preparedness measure.
However, there are important limitations of the model to consider. Firstly, the model is unable to estimate individual landslide area or volume, which becomes of vital importance at the local scale and for assessing post-earthquake impacts such as sediment aggradation and flood hazard (e.g., Huang and Fan, 2013). Additional information on the size of landslides would be beneficial, as larger landslides have greater runout and therefore potentially pose a greater hazard. Furthermore, the model only considers the relative magnitude of landsliding, rather than the absolute magnitude. Whilst this allows determination of the relative impacts between locations, it does not assess absolute impacts, which may be important. Finally, as with all landslide hazard models, the accuracy of this method relies on the accuracy and resolution of the input data. Errors in locating landslides can result in incorrect estimates of I_norm and therefore in the eventual hazard models. Similarly, the model relies on the resolution and accuracy of the input DEM from which all non-seismic influence factors are derived.

Conclusions
This study has addressed the need for more rapid assessments of coseismic landslide magnitude and distribution following a major earthquake. Manual mapping methods can be too slow to effectively inform emergency response operations and are too readily affected by issues with image collection and visibility. The present study has demonstrated an empirical method to model landslide magnitude and distribution using fuzzy logic based on initial inventories with small numbers of landslides. We have shown that landslides mapped in the first few days following an earthquake can be successfully used to assess the influence of various predisposing factors, provided that these landslides have a sufficiently wide spatial distribution.
The influence of these predisposing factors can be used, via a simple threshold analysis, to model the total pattern and relative magnitude of landsliding before complete landslide inventories become available. Applied to the 2015 Gorkha earthquake, our approach identifies slope angle; distance from fault plane, rivers, and river confluences; hillslope and sub-hillslope relief; and planform curvature as the key predisposing factors influencing landsliding during this event. The model output indicates intense landsliding in the major river valleys of Gorkha, Dhading, Rasuwa, and Sindhupalchok districts, agreeing with observed patterns of landsliding. Calculating the kernel density of cells in the 99th percentile of modelled hazard values accurately and simultaneously represents both the spatial pattern and relative magnitude of landsliding during the event. Assessments of the minimum number of landslides required to achieve these results suggest that only several hundred landslides distributed throughout the affected area need be identified for the method to be successfully applied. Consequently, systematic high-fidelity mapping of landslides following an earthquake may not be necessary. Instead, mapping should focus on rapid sampling from grid squares covering the entire affected area, with the results used to inform the model herein. Doing so could have allowed this model to be implemented within hours to days of the Gorkha earthquake, making results available to emergency responders far earlier than traditional mapping techniques allow.
The method herein improves upon other approaches which have been shown to accurately simulate either landslide magnitude or distribution, but not both simultaneously.A key strength of our study is the use of open access data sources, allowing the model to be implemented globally without the need for pre-event collection and storage of data.This may therefore present a useful approach for rapidly assessing landslide hazard following an earthquake to effectively inform decision makers during emergency response operations.
exponential, power-law, and polynomial functions. The final function is known as the membership function, f(x), which describes how I_norm is distributed against all values of a given factor. f(x) acts as a filter to convert factor maps (Fig. 2), which show the spatial distribution of unique values within each factor, into influence maps (Fig. 3), which show the spatial variation of I_norm. The resulting influence maps are aggregated on a pixel-by-pixel basis to produce a landslide hazard map that can take values in [0, 1]. This output represents the relative likelihood of landsliding due to the combined influence of all factors. The factors are: (1) slope angle; (2) distance from fault plane; (3) peak ground acceleration (PGA); (4) elevation above sea level (asl); (5) distance from rivers; (6) distance from river confluences; (7) planform curvature; (8) normalised ridge-stream distance (NRSD); (9) hillslope relief; (10) sub-hillslope relief; (11) slope aspect relative to the epicentre; and (12) slope aspect relative to the slip vector. Distances from fault plane, rivers, and confluences are calculated as the two-dimensional Euclidean (straight-line) distance between pixel centroids. For fault plane distance this does not account for depth, so landslides directly above the plane are assigned a distance of 0.
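The filtering and aggregation steps described above can be sketched as follows. The exponential membership curves, their coefficients, and the use of a pixel-wise mean as the aggregation operator are placeholders for illustration; the paper's fitted curves and fuzzy-logic aggregation may differ.

```python
# Sketch: membership functions f(x) convert factor maps into influence maps,
# which are then aggregated pixel-by-pixel into a hazard map in [0, 1].
import numpy as np

def membership_slope(slope_deg):
    """Placeholder membership curve for slope angle, peaking near 35 degrees."""
    return np.exp(-((slope_deg - 35.0) / 15.0) ** 2)

def membership_fault_distance(dist_km):
    """Placeholder membership curve decaying with fault-plane distance."""
    return np.exp(-dist_km / 40.0)

rng = np.random.default_rng(7)
slope_map = rng.uniform(0, 60, size=(100, 100))        # factor map (degrees)
fault_dist_map = rng.uniform(0, 120, size=(100, 100))  # factor map (km)

# Factor maps -> influence maps via f(x).
influence = [membership_slope(slope_map), membership_fault_distance(fault_dist_map)]

# Pixel-by-pixel aggregation; the mean is one simple choice that keeps the
# hazard map bounded in [0, 1].
hazard = np.mean(influence, axis=0)
```

Each additional factor simply contributes one more influence map to the list before aggregation.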
Rivers are defined using a flow accumulation tool with an upstream contributing area threshold of 5 km², and river confluences are defined as the points where two or more rivers intersect. NRSD is calculated as the distance between a river and the nearest watershed boundary, where a value of 0 represents a pixel in the river and a value of 1 represents a pixel on the watershed boundary. Relief is calculated as the standard deviation of elevation values within a square window of radius (a) 30 cells (hillslope relief) and (b) 10 cells (sub-hillslope relief). Slope aspect relative to the epicentre (Az_Epi) is derived by combining local slope aspect (As) with the Euclidean direction to the epicentre (Dir), such that

Az_Epi = ||As - Dir| - 360|,  if |As - Dir| > 180
Az_Epi = |As - Dir|,          if |As - Dir| ≤ 180     (4)

Az_Epi has values in [0, 180], where 0 indicates a hillslope directly facing the epicentre and 180 indicates a hillslope facing directly away from the epicentre. Slope aspect relative to the slip vector (Az_SV) is calculated in the same way, by replacing the direction to the epicentre (Dir) in Eq. (4) with the slip vector bearing (SV).
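The aspect-difference relation of Eq. (4) folds the angular difference between aspect and target direction into [0, 180] and is straightforward to implement; the function name below is illustrative.

```python
# Slope aspect relative to a target direction (epicentre or slip vector),
# per Eq. (4): angles in degrees clockwise from north.
import numpy as np

def aspect_relative_to(aspect_deg, direction_deg):
    """Angular difference folded into [0, 180]: 0 = facing the target
    direction, 180 = facing directly away from it."""
    diff = np.abs(np.asarray(aspect_deg, dtype=float) - direction_deg)
    return np.where(diff > 180.0, np.abs(diff - 360.0), diff)

# A north-facing hillslope (0°) with the epicentre due south (180°)
# faces directly away:
print(aspect_relative_to(0.0, 180.0))   # 180.0
# Facing 350° with the epicentre at 10° is only 20° off, despite the
# raw difference of 340° wrapping across north:
print(aspect_relative_to(350.0, 10.0))  # 20.0
```

The same function yields Az_SV by passing the slip vector bearing as `direction_deg`.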
Assessments relying on initial PGA models are therefore likely to have such large uncertainties that they are impractical for informing emergency responders.

Figure 1 :
Figure 1: Landslide mapping efforts following the 2015 Gorkha earthquake. (a) Training landslides identified by the joint DU-BGS group using satellite imagery available to 7 May 2015; (b) test landslides identified by the joint ICIMOD-NASA-UA group using satellite imagery available to 2 June 2015. Focal mechanisms are taken from the Global CMT catalog.

Figure 2 :
Figure 2: Kernel density estimates of training landslides (black dots; n=2006) compared to the entire study area for different predisposing factors. (a) Slope angle; (b) Euclidean distance from fault plane; (c) elevation; (d) Euclidean distance from rivers; (e) Euclidean distance from river confluences; (f) planform curvature; (g) normalised ridge-stream distance; (h) hillslope relief; (i) sub-hillslope relief; (j) slope aspect in relation to azimuth to epicentre; and (k) slope aspect in relation to slip vector.

Figure 3 :
Figure 3: Predisposing factor influence on landslide occurrence. Panels a-g show influence maps, normalised kernel density estimates, and membership functions, f(x), for factors with mechanically definable influence: (a) slope angle; (b) distance from fault plane; (c) distance from rivers; (d) distance from river confluences; (e) planform curvature; (f) hillslope relief; and (g) sub-hillslope relief. Panels h-k show factors with no mechanically discernible influence: (h) elevation; (i) normalised ridge-stream distance (NRSD); (j) aspect relative to epicentre; and (k) aspect relative to slip vector.

Figure 5 :
Figure 5: ROC curves for various combinations of predisposing factors calculated from the test landslides (n=4312) mapped by the joint ICIMOD-NASA-UA group up to 2 June 2015. Black line in plot and black text in legend show the model with all factors; green shows the best performing combination; red shows the worst performing combination; grey shows all other combinations. Ex - excluding; Sub-Hill R - sub-hillslope relief; Hill R - hillslope relief; Curv - planform curvature; Confl - distance from confluences.

Figure 6 :
Figure 6: Landslide hazard resulting from the most successful factor combination (Fig. 5) and distribution of hazard values (inset), showing hazard values corresponding to the 99th, 95th, and 90th percentiles of hazard area.

Figure 7 :
Figure 7: Normalised landslide magnitude and spatial pattern, calculated as the kernel density of pixels exceeding various H_LS values, compared to test landslides for cells with H_LS > 0.74 (99th percentile), H_LS > 0.65 (95th percentile), and H_LS > 0.60 (90th percentile).

Figure 8 :
Figure 8: Effect of using smaller sub-sampled landslide inventories, either clustered or distributed across the entire affected area, on the R² fit of membership curves for different predisposing factors (Fig. 3). Dark grey box shows values within 10% of the original R² value; light grey box shows values within 25% of the original R² value.

Table 1: Total number of landslides identified by date by the joint DU-BGS group following the 2015 Gorkha earthquake.