Dealing with deep uncertainties in landslide modelling for disaster risk reduction under climate change. Natural Hazards and Earth System Sciences Discussions

. Landslides have large negative economic and societal impacts, including loss of life and damage to infrastructure. Slope stability assessment is a vital tool for landslide risk management, but high levels of uncertainty often challenge its usefulness. Uncertainties are associated with the numerical model used to assess slope stability and its parameters, with the data characterising the geometric, geotechnic and hydrologic properties of the slope, and with hazard triggers (e.g., rainfall). 10 Uncertainties associated with many of these factors are also likely to be exacerbated further by future climatic and socio-economic changes, such as increased urbanisation and resultant land use change. In this study, we illustrate how numerical models can be used to explore the uncertain factors that influence potential future landslide hazard using a bottom-up strategy. Specifically, we link the Combined Hydrology And Stability Model (CHASM) with sensitivity analysis and Classification And Regression Trees (CART) to identify critical thresholds in slope properties and climatic (rainfall) drivers that lead to slope 15 failure. We apply our approach to a slope in the Caribbean, an area that is naturally susceptible to landslides due to a combination of high rainfall rates, steep slopes, and highly weathered residual soils. For this particular slope, we find that uncertainties regarding some slope properties (namely thickness and effective cohesion of top soil) are as important as the uncertainties related to future rainfall conditions. Furthermore, we show that 89% of the expected behaviour of the studied slope can be characterised based on only two variables – the ratio of top soil thickness to cohesion and the ratio of rainfall 20 intensity to duration. by Anderson and Holcombe (2006). At the time, estimates of geotechnical, hydrological and geometrical parameters for the study site were derived from a combination of: topographic maps, site surveys, interviews with residents to estimate soil strata depths and weathering grades (based on their 25 experiences of excavating the soils to construct house foundations), elicitation of local engineering knowledge of soils, and information from shear box and ring infiltrometer testing of similar soils in Saint Lucia. Soil properties were also benchmarked against extensive triaxial and permeameter test data for similar undisturbed tropical residual soils in Hong Kong (GCO, 1982).

Landslide hazard assessment forms the basis for disaster risk reduction decisions such as the design of physical landslide hazard mitigation measures, planning controls and early warning systems for hazard avoidance, vulnerability reduction (resilience) approaches, or insurance. The spatial scale and the purpose of the hazard assessment, as well as the data available, 5 determine which methods or slope stability models can be applied. Available assessment methods include inventory-based susceptibility mapping or regional forecasting, and statistical, heuristic and physically-based modelling (either spatially distributed or site specific) (Soeters and van Westen, 1996;Dai and Lee, 2001). These methods require data in some or all of the following categories: i) inventories of past landslide locations, types and triggers; ii) preparatory factors determining the inherent susceptibility of a slope to landslides, such as slope geometry (slope angles and heights, convergence/divergence, soil 10 and parent material depths), the geotechnical and hydrological properties of slope material, and land-use; and iii) triggering factors such as rainfall intensity, duration and frequency.
The challenge of acquiring this data in sufficient quantity, quality and resolution is currently hindering the production of "actionable" landslide hazard information for decision-makers (Aitsi-Selmi et al., 2015). With the exception of particular regions such as Hong Kong, there is generally a lack of systematic landslide data collection even in the most landslide-prone 15 countries (Corominas et al., 2014). Especially high frequency, low intensity events may be missed out due to their small spatial scale and impact. Existing databases are therefore often spatially and temporally biased or incomplete. Lack of data on past landslides limits our ability to build inventory-based and statistical models for predicting likely locations, timing or impacts of future landslides. In comparison, the advantage of physically-based slope stability models is their smaller reliance on observations of past events and an ability to mechanistically represent the preparatory and triggering processes driving slope 20 failure. This latter characteristic also allows them to assess the impact of higher intensity rainfall events than a region might have experienced in the past, or of possible urbanisation scenarios. Landslide hazard researchers and civil engineers currently employ physically-based models to diagnose existing stability conditions, design new slopes and determine landslide probabilities.
The mechanistic ability of physically-based models comes at a cost. The detailed representation of slope processes in the model 25 requires detailed information on slope properties such as soils and topography. The more complex, high resolution and comprehensive the representation of slope processes in the model, the more data regarding the physical site characteristics is required. Parsimonious models are therefore often selected that are consistent with data availability and with the required level of process representation. Even then, data is rarely available in sufficient detail, and this introduces uncertainty into the model parameterisation. Sources of uncertainty include those associated with slopes geometries and material strata depths (Lumb, 30 1975;Corominas et al., 2014), soil properties (Cho, 2007;Beven and Germann, 2013), and a limited understanding of how measured variables relate to model parameters (the commensurability issue - Beven, 1989;Wagener and Gupta, 2005;Beven, 2006). The lack in accuracy of forcing boundary conditions, such as the temporal and areal variability of historical rainfall, introduces further uncertainty that needs to be considered (Minder et al., 2009;von Ruette et al., 2014). Studies have assessed Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Published: 12 September 2016 c Author(s) 2016. CC-BY 3.0 License. the impact of uncertainties associated with such models Arnone et al., 2016) for both site-specific and spatially distributed landslide hazard predictionsthough such information is rarely used to support risk reduction decisionmaking.
Additional uncertainty is introduced by the poorly known potential implications of future climate or land use change. Such uncertainties are different from the ones listed above, because they cannot be easily characterised by probability distributions 5 that different experts could agree on, i.e. they are often called deep uncertainties (Bankes, 2002;Lempert et al., 2003;Stein and Stein, 2013). The standard approach to dealing with deep uncertainties is through scenario-led strategies, also called topdown approaches. In these approaches, climate change projections of a general circulation model (GCM) are downscaled to derive local rainfall intensities and frequencies, as well as other climatic variables. These scenarios are then used to drive a slope stability model in a top-down manner (Collison et al., 2000;Melchiorre and Frattini, 2012). Studies that have attempted 10 to quantify the uncertainty in the estimation of climate variables derived in this manner have found them to be prohibitively large in many cases (e.g., Collins et al., 2012;Ning et al., 2012). As a result, one usually finds that the uncertainty in the final predicted impacts is also quite large and that a wide range of possible outcomes is feasible, which is of little practical use for decision-making or for identifying an "optimal" management solution (Bankes, 2002;Wilby and Dessai, 2010;Hallegatte et al., 2012;Herman et al., 2014). To reduce the range of possible outcomes, a smaller subset of the many possible potential 15 future scenarios can be selected. However, this approach is problematic as arbitrary selection of scenarios or of the downscaled simulations will undermine the credibility of the results (e.g., Kim et al., 2015).
Given that such deep uncertainties are unavoidable, a shift from "top-down" to "bottom-up" approaches has been suggested to derive actionable information for decision-makers (Groves and Lempert, 2007;Wilby and Dessai, 2010;Singh et al., 2014;Ray and Brown, 2015). While "top-down" approaches simulate system behaviour under potential future conditions in a 20 predictive manner (e.g., to estimate probability of slope failure given one or more climate change scenarios), "bottom-up" approaches focus on exploring the vulnerabilities of the system, i.e. on finding those combinations of factors values that would produce unwanted outcomes (e.g., slope failure). Bottom-up approaches are therefore stakeholder driven since they start with the stakeholder who has to define what threshold separates acceptable from unwanted outcomes. A wide range of possible values of the uncertain factors can then be considered (i.e. propagated through the model) and mapped onto the regions of 25 vulnerability of the output space. We can use statistical data-mining algorithms to quantify the link between inputs and outputs, i.e. the mapping stage. Commonly used algorithms to implement the mapping required for bottom-up approaches include Friedman and Fisher's (1999) Patient Rule Induction Method (PRIM) and Classification and Regression Trees (CART) developed by Breiman et al. (1984). The bottom-up strategy is very similar to the problem of mapping in Global Sensitivity Analysis, where one tries to understand which parts of the input factor space produce a particular model output, e.g. output 30 values above a certain threshold (e.g., Saltelli et al., 2008;Pianosi et al., 2016).
In this study, we apply a bottom-up approach to a landslide hazard assessment model in order to address the following three questions: Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Published: 12 September 2016 c Author(s) 2016. CC-BY 3.0 License.
1. Can we identify the dominant preparatory and triggering factors driving slope instability (i.e. slope geometry, geotechnical and hydrological parameters or rainfall drivers) in data scarce locations? 2. Can deep uncertainty in future landslide triggers, such as climate change, exceed other uncertainties?
3. What are the implications of uncertainty for data acquisition and assessment of future hazard? 2 Model and study site 5 We select a physically-based model and study site to represent the landslide hazard assessment and risk reduction challenges typical of data-scarce and resource limited locations (Fig. 1). The model is representative of those used by civil engineers and slope stability modellers. It can be parameterised using study site data and has a track-record of successful application in datascarce locations. The methodology we develop around this case study is transferable to other locations and other types of physically-based slope stability assessment models. 10

An urban study site in the humid tropics
The selected study site is situated on the lower slope of a ravine on a hillside in the city of Castries, Saint Lucia (Eastern Caribbean), in which informal construction of houses has led to increased landslide risk. The slope section is approximately 50 m in height and has an overall angle of about 30°. The material strata comprise up to 6 m of residual soils overlying a similar depth of decomposed rock (weathering grades III-IV; after GEO, 1988) overlying volcanic bedrock. As is typical for 15 the humid tropics, the dominant landslide trigger in the Caribbean is rainfall (Lumb, 1975;De Graff et al., 1989), and shallow rotational slides are the most common type of landslide in the deep weathered tropical residual soils on the mid to lower slopes of steep hillsides (Migon, 2010). Such locations tend to be the only land available to the most socio-economically vulnerable families, so even small landslide events can have a high societal impact (UNISDR, 2015). These "everyday disasters" are also increasingly seen as indicators of risk accumulation (low disaster resilience) and represent a potential hindrance to national 20 economic development (Bull-Kamanga et al., 2003).
In 2011 a landslide hazard reduction project was implemented at the study site location using a community-based approach -Management of Slope Stability in Communities (Mossaic)developed by Anderson and Holcombe (2006). At the time, estimates of geotechnical, hydrological and geometrical parameters for the study site were derived from a combination of: topographic maps, site surveys, interviews with residents to estimate soil strata depths and weathering grades (based on their 25 experiences of excavating the soils to construct house foundations), elicitation of local engineering knowledge of soils, and information from shear box and ring infiltrometer testing of similar soils in Saint Lucia. Soil properties were also benchmarked against extensive triaxial and permeameter test data for similar undisturbed tropical residual soils in Hong Kong (GCO, 1982).

A physically based model for rainfall-triggered landslides
Deterministic physically-based modelling of slope stability has previously been carried out at the study site using the Combined Hydrology And slope Stability Model (CHASM) to diagnose landslide drivers and estimate the benefit:cost ratio of landslide mitigation (Holcombe et al., 2012). In a validation exercise in Hong Kong CHASM was shown to be numerically robust and capable of correctly classifying 77% of failed slopes and 68% of stable slopes for a specified rainfall event (Anderson, 1990). 5 CHASM has been extensively used by slope stability researchers and practitioners in Malaysia, Indonesia, the Eastern Caribbean, United Kingdom and New Zealand (Anderson et al., 1997;Lloyd et al., 2001;Wilkinson et al., 2002a;Wilkinson et al., 2002b) A brief overview of CHASM is given herefull descriptions of the numerical scheme and principle equations can be found in Anderson and Lloyd (1991) and Wilkinson et al. (2002b). CHASM represents the slope cross-section as a regular two-10 dimensional mesh of columns and cells with geotechnical and hydrological parameters specified for each soil type (Fig. 2).
Initial hydrological conditions are the matric suctions in the top cells of each column and the water table position. Subsequent dynamic forcing conditions are rainfall events of specified intensities and durations imposed on the top cells. For each hydrological time-step (usually 10-60 seconds) a forward-explicit finite difference scheme is used to solve Richards' equation (Richards, 1931) and Darcy's Law (Darcy, 1856) for rainfall infiltration, unsaturated and saturated groundwater flows. Cell 15 moisture conditions, pressure heads and unsaturated hydraulic conductivities are updated at each time-step using soil moisture characteristic curves and the Millington-Quirk procedure (Millington and Quirk, 1959). At the end of each simulation hour the pressure head fields are used to calculate pore water pressures (positive and negative) for input to a two-dimensional Limit Equilibrium Method (LEM) calculation of slope stability. In LEM analysis the slope Factor of Safety (F) is calculated as the ratio of destabilising forces to resisting forces for a potential landslide slip surface location, such that F<1 indicates failure. In 20 CHASM either Bishop's simplified circular method of slices (Bishop, 1955) or Janbu's non-circular method (Janbu, 1954) is implemented using an automated search algorithm to identify the location of the slip surface with the minimum value of F for that hour.

Methods
Our study aims to advance understanding of the critical uncertainties driving rainfall-triggered landslides. To this end, the 25 simulation model, CHASM, is run with 10000 different combinations of values for the uncertain input factors. Simulations are performed using the BlueCrystal Phase 3 high-performance cluster at the University of Bristol, which contains 16 x 2.6 GHz SandyBridge cores. Each simulation is classified as corresponding to stable or unstable slope based on the resulting slope Factor of Safety, F, (model output) being above or below 1 at any stage during the simulated time period. We first perform a preliminary visual analysis of the simulations to identify influential factors that lead to slope failure. We then apply which the model predicts slope failure in a bottom-up strategy. The Matlab SAFE toolbox (Pianosi et al., 2015) and the CART functions in the Matlab Statistics and Machine Learning Toolbox (Mathworks, 2015) are used to perform our analysis.

Characterisation of uncertain input factors
The study site slope cross-section, as represented in CHASM, is illustrated in Fig. 2. Slope input factors (slope geometry, geotechnical and hydrological properties) are assumed to be random variables characterised by different statistical 5 distributions. These distributions and their respective statistical parameters have been obtained from different sources as described in Sect. 2.1, and summarised in Table 1. For each set of CHASM input factors generated, checks are undertaken to ensure particular combinations of the input factor values selected are physically realistic (for example, that saturated unit weight is greater than unsaturated unit weight). Model discretisation parameterssuch as the cell size (1m x 1m), hydrological time-step (60 seconds), and slip search grid location and dimensionsand physical and mathematical constants are not varied. 10 The dynamic hydrological scheme of CHASM requires the specification of hourly rainfall intensities to drive the dynamic hydrological component of CHASM for the selected hydrological time-step (60 seconds). While the uncertainties in slope properties are characterised by probability distributions based on past experience with the model and the study area, uncertainties related to potential future rainfall are deeply uncertain. We therefore represent our lack of knowledge by varying rainfall intensity-duration combinations widely to ensure that any feasible future design storm in a changing climate is captured 15 in our sampling. The ranges of rainfall intensity and duration used in this analysis are based on Intensity-Duration-Frequency (IDF) relationships derived for the design of the Roseau Dam in Saint Lucia (Fig. 3). Engineering consultants, Klohn-Crippen (1995), applied Gumbel analysis of 40-years of daily rainfall data from weather stations around the island to estimate the intensities and durations of 1:5 to 1:500-year return periods events. From these IDF relationships we define ranges of possible rainfall intensities of 0 to 200 mm.h -1 and durations of 0 to 72 h, which we sample independently and uniformly. 20 For this experiment the first 168 hours of rainfall forcing are set to an intensity of 0 mm.h -1 so that water table, cell moisture conditions and slope stability (indicated by F) can reach states of steady seepage and equilibrium. Rainfall of the selected intensity and duration is then imposed on the slope to determine the stability of the slope for that storm event. In the absence of information on typical hourly rainfall rates for events longer than 1 hour, rainfall intensity is assumed to be uniform across the rainfall duration sampled. The storm is followed by a further 168 hours of zero-rainfall simulation time to consider the 25 continued effects of the groundwater response on slope stability.

Classification And Regression Trees (CART)
While different algorithms have been used to implement the mapping step in bottom-up approaches in past studies, we have chosen to use Classification And Regression Trees (CART). Comparison between the most popular algorithms, CART and PRIM, did not show either algorithm to be superior (Lempert et al., 2008), while CART hast he advantages of simplicity and 30 an ability to work with minimal input from the user. CART is a machine-learning method for constructing prediction models from data (Breiman et al., 1984). In our application, such model takes the form of a binary tree, where a categorical dependent Nat. Hazards Earth Syst. Sci. Discuss., doi: 10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Published: 12 September 2016 c Author(s) 2016. CC-BY 3.0 License. outcome (i.e. slope failure versus stability) is predicted from a set of continuous independent variables (i.e. slope and design storm properties). The tree is composed of nodes and branches. At each node, an if-then condition is applied to one of the independent variables (e.g. "slope angle above or below 30 degrees") to generate two different branches. Several strata of nodes and branches compose the tree up to a stratum of terminal nodes (leaf nodes) where a prediction of the categorical dependent outcome is made (e.g., "slope fails"). 5 A CART is constructed through a recursive algorithm applied to a sample of independent variables (inputs) and associated categorical outcomes (outputs). In this study, the input/output sample is generated by the Monte Carlo simulation of CHASM.
The algorithm automatically selects which input variable and threshold value to use at every node. The selection is based on maximising the homogeneity of the output samples in the subsequent nodes. The level of homogeneity can be measured by different criteria. Here, we use the Gini impurity measure, which is defined in Eq. (1) as: 10 where, is the number of categories for the output (2 in our case), and ( ) is the fraction of output samples in the node belonging to category . A Gini impurity index equal to 0 corresponds to a pure node, i.e. a node where all output samples belong to the same category. In general, pure nodes cannot be obtained, but the algorithm seeks to minimise the Gini impurity

index. 15
A key question in constructing a classification tree is that of the optimal sizing (i.e. number of nodes and branches). Increasing the size of the tree generally increases its predictive accuracy over the sample used for its construction, but it might reduce its ability to generalise to new samples (overfitting). Moreover, a tree with simpler structure might be easier to interpret and communicate. Once a CART has been constructed, a machine learning technique called "pruning" can be applied to reduce the size of the tree by removing sections that provide little classification power. More details about the pruning technique can 20 be found, for example, in Hastie et al. (2009).

Results
In this section, we analyse the outputs from the10000 CHASM simulations to determine which factors control slope stability and whether critical thresholds exist beyond which slope failure will occur.
We perform an initial evaluation of the factors controlling slope stability by comparing the marginal distributions of the input 25 factors that cause slope failure with those that do not. The approach is generally referred to as Regional Sensitivity Analysis (first proposed by Spear and Hornberger, 1980; for a general introduction see Pianosi et al., 2016). We split the 10000 model simulations into two sub-sets: those that produce slope failure (F<1) and those that simulate a stable slope (F>1). For each input factor, the (marginal) empirical cumulative distribution functions of the two sub-sets are computed.  Fig. 4), the two lines essentially overlap, implying that the factors have no direct influence on determining failure (although they still may have influence through interactions, see Saltelli et al., 2008). Besides identifying the influential factors, Fig. 4 also provides information on the ranges of influential factors associated with slope failure. For example, the top left panel shows that the 5 model is unlikely to predict failure when thickness of top soil is smaller than 3 m, as the black solid line is nearly flat towards the smallest values of the original range of this factor. This initial analysis suggests that relatively few input factors have a significant impact on slope stability. Furthermore, it also provides initial guidance about which values of those factors are more likely to produce failure.
CART analysis provides a systematic way to quantify the thresholds separating slope failure and stability. Figure 5 shows the 10 classification tree obtained from the same set of simulations (see Appendix A for details on pruning). CART results are consistent with the initial analysis of input distributions, as evidenced by the fact that the main factors emerging in the classification tree of Fig. 5 are the same that were shown as influential in Fig. 4. From the 28 analysed factors (26 model parameters plus the two design storm properties), five factors alone (thickness of top soil, effective cohesion of top soil, rainfall intensity, rainfall duration and initial depth of water table) are sufficient to correctly classify 89% of the simulations. Figure 5  15 also illuminates the critical thresholds in slope properties and rainfall drivers that separate slope failure and stability. For example, the leftmost branch shows that if the thickness of the top soil is less than 2.9 m and effective cohesion of this stratum is below 2.1 kPa, the landslide model tends to predict slope failure. This happens regardless of what values rainfall intensity and duration take. The black/grey-shaded bar at the end of each node visualises the fraction of input factor combinations that produce failure/stability respectively. It therefore provides a visual indicator of the predictive performance of the tree at each 20 node and shows the high predictive performance achieved.
Effective cohesion and thickness of top soil appear multiple times in the same branch in Fig. 5 (for example, at the leftmost and the rightmost branches), which may indicate that these two factors interact with each other. This suspicion is confirmed if we look at the scatter plot of the effective cohesion of top soil samples versus the thickness of top soil samples for simulations that lead to slope failure (black) and stability (grey) (Fig. 6a). The triangular pattern in Fig. 6a clearly indicates that these two 25 factors interact to produce slope failure, i.e. a slope with more cohesive soil can be thicker without experiencing failure.
Visualising the thresholds identified from the CART analysis (Fig. 5) in this scatter plot (red dashed line in Fig. 6b) shows the inability of CART to characterise this interaction, i.e. CART attempts to reproduce the interaction through a sequence of vertical and horizontal separations. One can approach this problem either by rotating the axes of this graph or by creating a new auxiliary variable combining the interacting factors (Dalal et al., 2013). We create a new auxiliary variable, the ratio 30 between effective cohesion and thickness of the top soil, because we believe that it is a physically meaningful variable. We then generate a new tree based on the original factors plus the new auxiliary variable. The resulting tree, pruned for a similar error as the original tree, is shown in Fig. 7. It still classifies 89% of the simulations correctly but using a much simpler structure than the tree in Fig. 5. Figure 7 shows that when the ratio of cohesion to thickness of the top soil is above 2.0 kPa.m -1 , CHASM predicts a stable slope most of the time (4611 simulations correspond to a stable slope and only 385 simulations produce slope failure). When the top soil cohesion to thickness ratio is below 2.0 kPa.m -1 then rainfall characteristics need to be taken into account to predict slope stability. Fig. 6c also shows that the new separation line (red dashed) between failure and stability is now more consistent with the underlying scatter plot.
We further expect rainfall intensity and duration to interact in the context of slope stability. Indeed, numerous field observations 5 in the literature, and empirical relationships used in regional landslide early warning systems worldwide, show that both highintensity/short-duration rainfall combinations and low-intensity/long-duration combinations can result in slope failure (Lumb, 1975;Crosta, 1998;Martelloni et al., 2012). Empirical rainfall intensity-duration (I-D) thresholds are typically obtained by plotting observed landslide events according to their triggering rainfall intensity and duration, and are generally linear, with a negative gradient, when logarithmic scales are used. This log-log relationship can also be seen in Fig. 8, which shows a 10 separation between the rainfall I-D combinations that can trigger landslides (black dots) and those that do not (grey crosses).
We therefore create a second auxiliary variablethe negative ratio between the logarithm of rainfall intensity and the logarithm of rainfall duration (-log(I)/log(D))and repeat the CART analysis to see whether it is possible to further simplify the tree.
The resulting tree (Fig. 9) is pruned based on a similar error to previous trees (Figs. 5 and 7), but with a significantly simplified structure. We consider this final classification tree as the most effective output of the CART analysis, since its simplicity 15 facilitates communication with stakeholders, while preserving the same information content.

Discussion
We have shown that the application of a mechanistic model such as CHASM in a combined Monte Carlo and CART analysis framework can help identify the dominant preparatory and triggering factors driving slope instability in a data scarce location.
Our method goes beyond previous studies by accounting for both site-specific preparatory factors (geometrical, geotechnical 20 and hydrological conditions) and future uncertain rainfall triggers. Results for our study site indicate that targeted geotechnical data acquisition could help to constrain uncertainties in cohesion and soil depths; and the effects of different rainfall intensities and durations should be represented to capture both current and potential future hazard scenarios.

Preparatory and triggering factors driving slope instability
For our study site the ratio of the top soil strata's effective cohesion to strata thickness is shown to be a dominant factor in the 25 stability of the slope. Inspection of selected model outputs indicates that the critical slip surface generally tends to be located within this top strata (weathering grade V-VI) with the deepest part of the slip circle at the interface with the weathered material strata (grade III-IV), as illustrated in Fig. 2. The slip surface location within the top soil strata is explained by the site-specific parameters (Table 1) in which: i) the higher strength weathered material strata constrains the failure surface to within the weaker soil strata above; and ii) as rainfall infiltrates, the lower hydraulic conductivity of the weathered material is likely to 30 cause loss of soil matric suction and thus a reduced apparent strength at the soil/weathered material interface.
Of the two top soil strata strength parameters, it is effective cohesion rather than angle of friction, that shows up in the CART analysis because the study site slope is likely to be cohesion controlled rather than friction controlledi.e. the slope angle is typically greater than the friction angle. Figure 6 shows that when the top soil strata is thin it can remain stable even for very low values of effective cohesion, whereas, thicker soils tend to require a much higher effective cohesion for stability. This result is in keeping with other physically-based modelling studies of the relationships between the geometry of slopes, strata 5 and shallow landslides in cohesion controlled slopes (Frattini and Crosta, 2013;Milledge et al., 2014). Ignoring the effects of water table location and pore water pressures, the greater self-weight of soil at the base of a thicker soil stratum generates higher shear stresses and requires greater shear resistance for stability than in a shallower soil.
The importance of cohesion for the stability of tropical residual soil slopes, such as our study site, and its inclusion in stability analysis is the subject of an ongoing debate amongst geotechnical engineering, researchers and practitioners. Laboratory 10 analysis of the shear strength of remoulded clays show that the value of the effective cohesion parameter is affected by measurement uncertainties (Parry, 2004) and that peak cohesion is lost with seasonal cycles of dilatancy (Take and Bolton, 2011). The known sensitivity of slope stability to cohesion and the uncertainties associated with its measurement thus lead some engineers to adopt a highly conservative approach to the value of effective cohesion used in slope designoften assigning it a value of zero (Schofield, 2006). Yet, landslide hazard assessment scientists and engineering practitioners in the humid 15 tropics argue for its inclusion as an observable strength parameter in the analysis of existing slopes comprised of undisturbed tropical residual soils which exhibit relict structures from the weathered parent rock (Burland et al., 2008). For the study site, our modelling approach supports the inclusion of non-zero values of effective cohesion in the analysis to account for its observable stability in the field. Our results also indicate that any data acquisition strategy for this site should target both soil thickness and effective cohesion values (both undisturbed and remoulded) to improve landslide hazard predictions. 20 The second most important factor in the stability of our study site is the nature of rainfall events in terms of their intensity and duration. As noted in Sect. 4, when the predicted failed and stable slopes are plotted on log-log axes of associated rainfall intensities and durations (Fig. 8) a negative linear threshold is found above which landslides are more likely to occur. This relationship is observed in landslide inventories which are widely used to generate empirical regional rainfall intensity-duration (I-D) thresholds of the form: I = a1 D -a 2 , where a1 and a2 are parameters specific to a site or region (e.g., Larsen and Simon, 25 1993;Guzzetti et al., 2007). In the absence of empirical data, physically-based models may be used to generate synthetic thresholds using Monte Carlo methods (e.g., Peres and Cancelliere, 2014). Our method thus demonstrates both the importance of representing the dynamic hydrological processes involved in triggering landslides, while also providing a starting point for generating site-specific rainfall I-D thresholds in data scarce locations.
We have to reiterate that our results are valid within the context of the assumptions made in our study. For example, changing 30 a particular input factor distribution may influence the importance of other factors (Sect. 3.1 and Table 1). This is of course not a limitation that is specific to our study, but simply the constraint of any model-based analysis, which can in any case be reduced by additional analysis. The approach presented here provides a means for identifying dominant landslide preparatory and triggering factors, guiding data acquisition, updating the analysis and refining the hazard assessment.

Can deep uncertainty in future rainfall exceed other uncertainties?
In this study, we have evaluated how uncertainty about slope characteristics and future rainfall change may influence risk of slope failure. Our findings have demonstrated that, for our study site, physical slope properties, namely effective cohesion and thickness of the top soil, are significant drivers of slope stability. For this slope, these physical properties have a more significant impact on landslide hazard risk than variability in future rainfall intensity and duration. 5 Our findings have a number of important implications for landslide hazard research. With few exceptions (e.g., Melchiorre and Frattini, 2012), previous studies typically have analysed the impacts of uncertainty related to slope characteristics and future climate independently. For example, Dehn and Buma (1999), Collison et al. (2000) and Ciabatta et al. (2016) consider the impacts of climate change on slope stability, but ignore uncertainty around soil properties; while Rubio et al. (2004) account for uncertainty introduced by slope hydrology and geotechnical properties, but ignore uncertainty relating to design storms. 10 However, our results suggest that the failure to consider both sources of uncertainty simultaneously may lead to a significant underestimation of slope vulnerability under potential climate change. Furthermore, we have demonstrated how physicallybased models, like CHASM, can be utilised to rapidly assess the impacts of multiple interacting and uncertain drivers of landslide occurrence in ways that would not be possible using simpler statistical models (e.g., Dixon and Brook, 2007).
To date, only Melchiorre and Frattini (2012) have attempted to analyse slope failure considering both uncertainty arising from 15 slope properties and future climate. Their research used a top-down approach to quantify the impacts of multiple uncertainties.
Specifically, they used Monte Carlo simulations and sensitivity analysis to assess the impacts of uncertainty in soil properties on predictions of slope stability for a pre-defined set of precipitation scenarios. We conclude that deep uncertainty due to potential future climate characteristics is not yet fully considered in landslide hazard assessment, and, as a result, policy recommendations may lead to undesirable outcomes given the difficulty in predicting impacts of climate change on future 20 rainfall. In contrast, bottom-up approaches, such as that proposed in this study, consider a much wider range of possible system drivers (rainfall) and other uncertainties, without introducing assumptions regarding the probability of future precipitation. This insight is particularly relevant for small islands like the Caribbean, where the variety of different processes that contribute to rainfall change, some of which are poorly resolved by GCMs, make it very difficult to provide projections of future rainfall (Seneviratne et al., 2012). 25

Implications for data acquisition and assessment of future hazard
Bottom-up approaches to natural hazard risk assessment provide valuable knowledge to inform management decisions and to target data acquisition, especially in situations where resources may be limited. This study has shown how dominant physical slope properties driving landslide occurrence for a particular class of slope can be identified. As a result, decision-makers may seek to provide funding for targeted data acquisition to reduce uncertainty about the values of these parameters (e.g., thickness 30 and effective cohesion of top soil) in order to improve understanding of how likely a slope is to fail. Decision-makers could also use the knowledge gained from CART to target management practices to improve slope stability (e.g., improving slope Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Published: 12 September 2016 c Author(s) 2016. CC-BY 3.0 License. drainage or land cover) or to develop options to mitigate consequences of slope failure (e.g., restricting development in highrisk areas). In this regard, the methodology developed in this study could be used to quantify the impact of different management options on landslide occurrence thresholds, providing an objective measure of the potential value of each strategy that can support debate amongst disaster risk reduction practitioners.
CART also has the distinct advantage that trees can provide useful knowledge for decision-makers even when uncertainties 5 about future changes in climate are large, as is the case with rainfall projections in the Caribbean islands that are predominant areas where landslides occur globally. Any available climate information can be used a posteriori to assess the plausibility of threshold tipping points being crossed, and support the discussion whether improved management is required to improve slope stability in the future. Ultimately, the decision to implement any given action depends on how risk averse stakeholders are and how much they are willing to spend on an adaptation strategy. However, CART provides the tools and information to enable 10 managers to make informed choices that are robust under a wide range of plausible future conditions, reducing the risk of wasted investment and/or unanticipated negative outcomes.

Conclusions
In this study, we used a combination of physical-based modelling and empirical CART analysis to quantify the importance of different sources of uncertainty when predicting landslide hazards for an example case study in the Caribbean. Contrary to 15 common assumptions, our findings have showed that prediction of landslide occurrence may be more strongly influenced by uncertainty related to physical slope properties (e.g., cohesion and thickness of the top soil) than by (deep) uncertainty associated with future changes in rainfall patterns due to climate change. We suggest that failure to account for uncertainty related to both slope properties and climate change therefore will lead to a significant underestimation of landslide risks and associated impacts on human populations. 20 The methodology developed in this paper has demonstrated that bottom-up approaches, implemented here using CART, can provide valuable information for assessment of landslide hazards even in data sparse environments. Our bottom-up approach illuminates dominant drivers of slope instability, enabling stakeholders and decision-makers to target data acquisition to reduce model prediction uncertainty. Moreover, CART analysis provides estimates of critical rainfall thresholds at which slope failure is predicted to occur. Using this knowledge, decision-makers can assess whether it is likely these threshold tipping points being 25 crossed in the future given available climate change information, and they can determine if improved management may be required to ensure long-term slope stability in the face of climate change.
The approach presented is valid within the context of the model used, including its assumptions, and any other assumptions made. The results are transferrable to other regions where the assumptions made here hold true. Future work will focus on expanding these assumptions, e.g. by varying other slope characteristics such as slope angle and height to reflect a wider 30 region. The general approach to bottom-up analysis presented here is transferrable to other hazard models.
Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Published: 12 September 2016 c Author(s) 2016. CC-BY 3.0 License.  GEO (1988) b Water table height is defined as a percentage of slope height measured to the toe of the slope c n is always greater than 1 d is always greater than 0 e = + 2, where is the saturated unit weight f Effective cohesion is always greater than 0 U -Uniform distribution; N -Normal distribution; ln Nlognormal distribution Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci.      Nat. Hazards Earth Syst. Sci. Discuss., doi:10.5194/nhess-2016-278, 2016 Manuscript under review for journal Nat. Hazards Earth Syst. Sci.