Combination of statistical and physically-based methods to assess shallow slides susceptibility at the basin scale

Approaches used to assess shallow slides susceptibility at the basin scale are conceptually different depending on the use of statistical or physically-based methods. The former are based on the assumption that the same causes are more likely to produce the same effects, whereas the latter are based on the comparison between forces which tend to promote movement along the slope and the counteracting forces that are resistant to motion. Within this general framework, this work 10 tests two hypotheses: (i) although conceptually and methodological distinct, the statistic and deterministic methods generate similar shallow slides susceptibility results regarding the model’s predictive capacity and spatial agreement; and (ii) the combination of shallow slides susceptibility maps obtained with statistical and physically-based methods, for the same study area, generate a more reliable susceptibility model for shallow slides occurrence. These hypotheses were tested in a small test site (13.9 km) located north of Lisbon (Portugal) using a statistical method (the Information Value method) and a 15 physically-based method (the Infinite Slope method). The landslide susceptibility maps produced with the statistic and deterministic methods were combined into a new landslide susceptibility map. The latter was based on a set of integration rules defined by the cross-tabulation of the susceptibility classes of both maps and analysis of the corresponding contingency tables. The results demonstrate a higher predictive capacity of the new shallow slides susceptibility map, which combines the independent results obtained with statistical and physically-based models. Moreover, the combination of the two models 20 allowed the identification of areas where the results of the Information Value and the Infinite Slope methods are contradictory. Thus, these areas were classified as uncertain and deserve additional investigation at a more detailed scale.

Abstract.Approaches used to assess shallow slide susceptibility at the basin scale are conceptually different depending on the use of statistical or physically based methods.The former are based on the assumption that the same causes are more likely to produce the same effects, whereas the latter are based on the comparison between forces which tend to promote movement along the slope and the counteracting forces that are resistant to motion.Within this general framework, this work tests two hypotheses: (i) although conceptually and methodologically distinct, the statistical and deterministic methods generate similar shallow slide susceptibility results regarding the model's predictive capacity and spatial agreement; and (ii) the combination of shallow slide susceptibility maps obtained with statistical and physically based methods, for the same study area, generate a more reliable susceptibility model for shallow slide occurrence.These hypotheses were tested at a small test site (13.9 km 2 ) located north of Lisbon (Portugal), using a statistical method (the information value method, IV) and a physically based method (the infinite slope method, IS).The landslide susceptibility maps produced with the statistical and deterministic methods were combined into a new landslide susceptibility map.The latter was based on a set of integration rules defined by the cross tabulation of the susceptibility classes of both maps and analysis of the corresponding contingency tables.The results demonstrate a higher predictive capacity of the new shallow slide susceptibility map, which combines the independent results obtained with statistical and physically based models.Moreover, the combination of the two models allowed the identification of areas where the results of the information value and the infinite slope methods are contradictory.Thus, these areas were classified as uncertain and deserve additional investigation at a more detailed scale.

Introduction
The evaluation of landslide susceptibility has been carried out worldwide based on three fundamental principles (Varnes and International Association of Engineering Geology, Commission on Landslides and Other Mass Movements on Slopes, 1984;Carrara et al., 1991;Hutchinson, 1995;Guzzetti, 2005): (i) the landslides can be recognized, classified, and mapped; (ii) the conditions that cause instability (predisposing factors) can be identified, registered, and used to build predictive models; and (iii) the occurrence of landslides can be spatially inferred.Within this conceptual scheme, it is assumed that future landslides are more likely to occur in areas where geologic and geomorphologic conditions are similar to those that originated the slope instability in the past (Guzzetti et al., 1999).This conceptual scheme has been extended to different methods of landslide susceptibility assessment regardless of their nature (Varnes and International Association of Engineering Geology, Commission on Landslides and Other Mass Movements on Slopes, 1984;Hutchinson, 1995;Aleotti and Chowdhury, 1999;Carrara et al., 1999;Fell et al., 2008b).This is nonetheless surprising since the conceptual model is perfectly applied to any statistical method used to assess landslide susceptibility, but the same is not true for the physically based methods.Indeed, the latter methods are based on physical laws and soil mechanics principles where the slope is considered as a system where shear stress and shear strength are continually in opposition.Unlike landslide susceptibility models based on statistical methods, landslide inventories are not used to assess landslide susceptibility with deterministic methods.However, landslide inventories still remain essential to validate the obtained landslide susceptibility maps.In addition, landslide inventory information is frequently used for calibrating stability models through back-calculation (e.g.Delmonaco et al., 2003;Teixeira et al., 2015).
The comparison between different methods to assess landslide susceptibility is not a new research topic when performed exclusively with different statistical methods (Gorsevski et al., 2003;Süzen and Doyuran, 2004;Brenning, 2005;Davis et al., 2006;Lee et al., 2007;Felicísimo et al., 2013;Bui et al., 2016) or with different physically based methods (Zizioli et al., 2013;Formetta et al., 2014;Pradham and Kim, 2015;Teixeira et al., 2015).There are a few studies that compare the predictive capacity between statistical and physically based methods (Crosta et al., 2006;Carrara et al., 2008;Frattini et al., 2008;Yilmaz and Keskin, 2009;Cervi et al., 2010;Goetz et al., 2011;de Lima Neves Seefelder et al., 2016), and out of those only a limited number have combined the results obtained with statistical and physically based approaches (Chang and Chiang, 2009;Günther and Thiel, 2009;Goetz et al., 2011).According to Zizioli et al. (2013) the different methods used to assess shallow slide susceptibility are not mutually exclusive.The latter authors pointed out that the use of different strategies to assess landslide susceptibility and the comparison of their predictive capacity can help to (i) enhance the quality and reliability of each method, (ii) highlight and identify the most important factors affecting the slope instability system, (iii) neglect less influential aspects to simplify the models, and (iv) select the most appropriate methodology to achieve a specified goal.
In the present study, the basin scale refers to the river basin limits (e.g.Guzzetti et al., 2005;Remondo et al., 2005).The relevance for using this study area limits when assessing rainfall-triggered landslide susceptibility is related with the maintenance of the hydrologic processes continuity, mainly runoff and potential infiltration.In addition, the basin scale is adjusted to the susceptibility zonation recommendations proposed for modelling scales between 1 : 25 000 and 1 : 5 000 (Cascini, 2008;Fell et al., 2008b) and for study areas ranging in size from 10 to 1000 km 2 (Fell et al., 2008b).
In this study we test two hypotheses: (i) although conceptually and methodologically distinct, the statistical and deterministic methods generate similar results for shallow landslide susceptibility regarding the model's predictive capacity and spatial agreement and (ii) the combination of the shallow landslide susceptibility maps obtained with statistical and physically based methods, for the same study area, generate a more reliable susceptibility map for shallow slide occurrence.

Study area
The study area comprises the two small catchments of Monfalim and Louriceira (13.9 km 2 ), which are located 25 km NNW of Lisbon, Portugal (Fig. 1).The elevation ranges from 442 m at the west to 134 m in the northeast sector of the study area, near the confluence of both the Monfalim and Louriceira rivers with the Grande da Pipa River (GPR), which is an affluent of the Tagus River.
The lithological units are mainly sedimentary rocks dated from the Kimmeridgian to the lower Tithonian (Upper Jurassic).There are also alluvium deposits of the Holocene age and a complex of dikes and volcanic masses that cover only 1.1 % of the study area.The detailed lithological unit map of the study area (Fig. 1) was constructed based on official geological maps (Zbyszewski and Assunção, 1965;IN-ETI, 2005) and on the interpretation of aerial photographs and validation of lithological unit limits through field work.Therefore, it was possible to identify the following eight lithological units (from L1 to L8): (L1) alluvium, (L2) limestones and marls, (L3) sandstones and limestones, (L4) mudstones and marly limestones, (L5) limestones, (L6) marls, (L7) mudstones and marls, and (L8) dykes and volcanic masses (basalt, teschenite, and dolerite).
The study area has undergone a general tectonic uplift since the Miocene (Zbyszewski and Assunção, 1965), and  the layers dip typically to the SE or SW.This structural setting, together with the alternation of soft rocks such as marls, clays and mudstones with more resistant rocks as the limestones, has allowed the development of cuesta-like landforms resulting from differential erosional processes (Ferreira, 1984;Ferreira et al., 1987;Zêzere, 1991).The slopes within the study area are typically moderate: 78.1 % of the total area has slopes in the range of 5-20 • .The gentle slopes (0-5 • ) represent only 12.9 % and the steepest slopes (> 20 • ) occur only in 9 % of the study area.
Landslides in the study area have been triggered by rainfall (Zêzere et al., 1999(Zêzere et al., , 2005(Zêzere et al., , 2015;;Zêzere and Rodrigues, 2002;Oliveira, 2012).The climate is Mediterranean and the Mean Annual Precipitation (MAP) is 730 mm (at São Julião do Tojal gauge located 20 km south from the study area) (Zêzere et al., 2015).Shallow slides have been triggered mainly by intense short-duration rainfall episodes, of typically 1 to 15 days maximum (Zêzere and Trigo, 2011;Zêzere et al., 2015).These rainfall events generate increments of pore water pressures and the reduction of the soil shear strength, including the loss of cohesion on fine sediments, which promote the failure along the superficial soil formations or along the contact between the soil and the impermeable bedrock (Trigo et al., 2005).

Methods and data
The methodological procedures for assessing shallow slide susceptibility based on the application and combination of statistical and physically based approaches are summarized in Fig. 2. Two commonly used methods were chosen: the bivariate statistical information value method (IV; Yin and Yan, 1988) and the infinite slope method (IS; Sharma, 2002) based on the calculation of the factor of safety (FS).Both methods are in line with the experts panel recommendations to assess landslide susceptibility (Cascini, 2008;Fell et al., 2008a, b;Corominas et al., 2014) and have been applied successfully in similar geological and geomorphological context in the region north of Lisbon (Zêzere, 2002;Pimenta, 2011;Guillard and Zêzere, 2012;Oliveira et al., 2015).In order to model the shallow slide susceptibility, the dependent variables (shallow slide training and validation groups), the independent dataset of variables used as predisposing factors, and the maps representing geotechnical and hydraulic parameters were rasterized using a pixel of 5 m × 5 m.

Landslide inventory
The landslide inventory was used twice in this study: (i) to establish the statistical relationships between shallow slides and the dataset of environmental factors assumed as shallow slide predisposing factors in the statistical approach and (ii) to validate the shallow slide susceptibility models obtained with both statistical and physically based models.The landslide inventory of the study area (Fig. 1) includes 111 shallow slides (translational and rotational slides with high curvature angle of the slip surface) that were classified following the Cruden and Varnes (1996) proposal.The depth of the slip surface is typically less than 1.5 m, and, as a rule, the shear planes are located in the interface between the  1).The shallow slides inventory was extracted from (Oliveira, 2012) and was based on the interpretation of aerial photographs (1983,1989) and orthophotomaps (2003,2004,2007) as well as on extensive field work carried out during the 2006-2010 period.
The inventory of shallow slides was further subjected to a partition based on a temporal criterion (Fig. 1, Table 1).Table 1 summarizes the mean shallow slide characteristics considering the total inventory and two subsets (training group and validation group) for the entire study area.The landslide training group includes the shallow slides that occurred until the end of 1983 (51 cases, 0.027 km 2 , and 0.19 % of the study area).The landslide validation group includes the shal-low slides that occurred between 1984 and the end of 2010 (60 cases, 0.03 km 2 , and 0.22 % of the study area).The training group was used to weight classes of shallow slide predisposing factors in the statistical model using the IV method, and it was also used to calibrate the shear strength parameters (cohesion and friction angle) of the lithological units in the IS model.The validation group was used for the independent validation of both statistical and physically based shallow slide susceptibility models.The shallow slide density, area, and volume that occurred in each lithological unit are summarized in Table 1.The larger shallow slides are observed within lithological units L2 and L4, where the landslides area and volume are above the mean.Smaller shallow slides are observed within lithological units L3, L5, L7, and L8, where the area and volume of landslides are below the mean (Table 1).

The information value method (IV)
The information value (Yin and Yan, 1988) was used to compute the susceptibility score for each class of each variable considered as a landslide predisposing factor based on the log normalization of the ratio between the conditional probability to find a shallow slide in a certain class of a predisposing factor and the a priori probability to find a shallow slide in the study area, following Eq.( 1).
where I i is the information value of class X i belonging to an independent variable (landslide predisposing factor), S i is the number of pixels with shallow slides belonging to the training group and the presence of the variable class X i , N i is the number of pixels with variable class X i , S is the total number pixels with shallow slides belonging to the training group, and N is the total number of pixels of the study area.Due to the logarithmic normalization I i cannot be calculated when no shallow slides are registered in a certain predisposing factor class (S i = 0).In these cases the I i was forced to be equal to the millesimal value lower than the lowest I i within the predisposing factor.The final IV scores I j for each terrain unit j were obtained using Eq. ( 2).
where m is the total number of variable classes, and X ij is either 0 if the variable class is not present in the pixel j or 1 if the variable class is present.

Landslide predisposing factors
We selected the following seven landslide predisposing factors as independent variables (Figs. 1, 3, and Table 2 for the description of classes) that have successfully been used in previous studies in the region north of Lisbon (e.g.Oliveira et al., 2015): lithology, slope angle, slope aspect, slope curvature, topographic position index (TPI), slope over area ratio (SOAR), and land use.
The lithological map includes eight classes described above (cf.Sect.2).The land use map was obtained from the official map representing the land use in 1990.Although it does not match the current land use in the study area, it is the one that best fits the time span of shallow landslides included in the present landslide inventory and the temporal land use frame closer to the age of the landslides in the training group.The remaining variables (slope, aspect, curvature, topographic position index, and slope over area ratio) were derived from a digital elevation model (DEM) based on elevation data interpolated from a topographic contours map (equidistance 10 m).Regarding the curvature map, a DEM generalization based on a 50 m pixel size grid was considered to calculate the profile of the slopes, as it provides the best fit to the morphology of slopes in the study area (Oliveira et al., 2015).The topographic position index was calculated based on the Land Facet Corridor Designer tool for ArcGIS and compares the elevation value of each cell in the DEM with the mean elevation value of the neighbouring cells, at a given maximum distance (Jenness et al., 2011).This index is heavily dependent on the scale (Piacentini et al., 2015), and the neighbourhood radius of 25 m proved to be the most appropriate for the index calculation at the work reference scale.The slope over area ratio was used to express the importance of the topography in hydrological processes through the relationship between the slope and the contribution area (Sørensen et al., 2006), which allows us to infer the areas prone to surface saturation (Fonseca, 2005).The calculation of the SOAR was made using the TauDEM 5.2 (Terrain Analysis Using Digital Elevation Models) tool and the algorithm D8 (O'Callagham and Mark, 1984) to minimize the dispersion of accumulation flow.

The infinite slope method (IS)
The most popular formulations of the infinite slope method consider a subsurface flow/water table level parallel to the topographic surface, whose maximum depth is equivalent to the maximum thickness of the saturated soil.In this context, the development of a steady-state hydraulic model in static conditions can be related to the ratio between the thickness of saturated soil and the thickness of the potentially unstable soil, as provided in the formulation of the SHALSTAB model (Dietrich and Montegomery, 1998).The FS for each terrain unit (pixel) was thus calculated based on the infinite slope method, incorporating a soil thickness model and a hydraulic  2.
model for the study area, following Eq.(3) (Sharma, 2002): where c is the effective cohesion (kN m −2 ); h is the potentially unstable soil depth; β is the slope of the terrain unit; m is the equation component of the hydraulic model, considered as the ratio between the saturated soil depth and the potentially unstable soil depth; φ is the internal friction angle ( • ); γ m is the specific soil weight (kN m −3 ); γ sat is the saturated soil weight (kN m −3 ); and γ sub is the submerged soil weight (kN m −3 ).The FS values can be interpreted in two ways.In the more restrict sense it is assumed that all terrain units with FS values ≤ 1 are unstable.In a broader interpretation the FS results are compared with results obtained using the statistical approach; in other words each terrain unit within a study area can be ranked according to its FS value, where the lowest FS value indicates the highest landslide susceptibility.
The development of the IS model was supported by the following parameters: (i) topographical variables (slope and catchment area), (ii) soil thickness, (iii) hydrological parameters (hydraulic conductivity, soil transmissivity, and daily rainfall threshold), (iv) geotechnical parameters (natural, saturated, and submerged specific soil weights; cohesion; and internal friction angle).Most geotechnical parameters were deduced from references with regional validity that were summarized by Pimenta (2011).

Soil thickness model
The depth of the potentially unstable soil is a critical parameter that strongly influences the stability of slopes.The soil depth model for the study area was obtained using Eq. ( 4), as proposed by Catani et al. (2010): where h is the soil thickness, K c is a constant calibration parameter, C is an index based on the slope profile curvature, η  is the relative soil depth dependent on the topographic position, and ψ −1 is the critical slope angle associated with landslide occurrence.The three parameters C, η, and ψ −1 were expressed in a scale ranging between 0 and 1.For each parameter, the value 1 was assigned to the maximum observed value, 0 to the minimum observed value, and the remaining observed values were assigned numbers between 0 and 1 by linear normalization.The constant K c was estimated independently for each lithological unit based on trial and error estimation to obtain the best possible fit of the soil thickness values obtained by Eq. ( 4) to the soil thickness values measured in 110 sampling field points.These sampling field measurements, subject to the existence of slope cuts where the soil depth was measured, were spatially distributed in order to guarantee a reasonable number of soil thickness measurements in each lithological unit but also along different geomorphological units (interfluve areas, slopes, valley floors).
The calibration of the K c constant for any lithological unit requires that the differences between the maximum estimated soil thickness and the maximum soil thickness measured in the field does not exceed 1 m.Table 3 summarizes the K c constant calibration values obtained for each lithological unit (from L1 to L8) in the study area.Soil profiles were not found in lithological units L1, L3, and L8 during the field work.In the case of lithological unit L3, we adopted a K c value equal to the one estimated for the other lithological unit of the same age (L4, K c = 3.6).In the case of alluvium (L1) and the complex of dikes and masses (L9), we adopted a K c = 2.9, which is the arithmetic mean of all K c values obtained for lithological units where it was possible to measure soil thickness during field sampling.Figure 4a shows the final soil thickness map of the study area, and Fig. 4b shows the correlation between the soil thicknesses measured in the field and the soil thickness values extracted from the final soil thickness model (Fig. 4a).The soil thickness model was forced to a maximum error of 1 m between the estimated and field-measured soil thickness, and a coefficient of determination of 0.62 was obtained (Fig. 4b).Nevertheless, for eight sampling field points (7.3 % of total) the observed error was higher than 1 m, ranging from 1.1 to 2.1 m.

Hydraulic model
The adopted hydraulic model was developed using SHAL-STAB (Dietrich and Montegomery, 1998), which follows a model developed by O'Loughlin (1986).According to Sharma (2002), the hydraulic model is the ratio between the thickness of saturated soil and the thickness of the potentially  unstable soil given by Eq. ( 5).
where h/z is the ratio between the thickness of the saturated soil above the impermeable layer and the thickness of the potentially unstable soil, Q is the effective precipitation (m day −1 ), T is the transmissivity of the soil (m 2 day −1 ), a is the upstream contribution area (m 2 ), b is the cell length (m), and β is the slope gradient ( • ).The increase of the hydrologic ratio (Q/T ) indicates that soil saturation will be faster and more extensive.The topographic ratio (a/(b • sin β)) describes the topographical effect on runoff (Dietrich and Montegomery, 1998; Montgomery et al., 1998).The transmissivity of the soil was estimated using Eq. ( 6) (Lencastre and Franco, 2006): where T is the soil transmissivity (m day −1 ), k is the saturated hydraulic conductivity (m day −1 ), and z is the soil thickness (m).
As the hydraulic conductivity based on field measurements was not available for the study area, this parameter was estimated for the identified soil types based on the work developed by Rawls et al. (1982), which summarized the typical hydraulic conductivities for different soil types starting from the respective textural properties.The national digital soil map at 1 : 25 000 scale (DGADR, 1999) was used to extract the clay, silt and sand, and coarse sand fractions for the different soils types in the study area.The soil taxonomy of the US Department of Agriculture was used to distinguish between soil types, through the soil texture triangle bulk density.Rocky outcrops and urban areas were assigned the −1 value, thus corresponding to 0 (absence of water) in the hydraulic model.The Kastanozem soils were also assigned the −1 value because the typical pedological stage of Kastanozem soils within the study area is a stony soil phase.Finally, 55 types of soils were identified, in addition to social areas and rocky outcrops.
The effective precipitation was estimated based on the Eq. ( 7) proposed by Trigo et al. (2005) that defines the rainfall threshold for triggering translational and rotational landslides in the region north of Lisbon that includes the study area.
C r = 7.4D + 107, where C r is the rainfall threshold that is associated with landslide occurrence (mm) and D is the number of consecutive rainfall days.
As most landslide events occur in the study area during the winter season we believe that the effect of evapotranspiration can be neglected; therefore, the effective precipitation can be assumed to equal the total precipitation, namely for short rainfall periods.Using Eq. ( 7) we obtained a critical daily rainfall for failure of 114.4 mm.The rainfall concentrated in a single day is a feasible scenario for triggering shallow landslide events, such as the ones that occurred in the Lisbon region in 1967 and 1983 (Zêzere et al., 2005(Zêzere et al., , 2015)).
The hydraulic conductivity was estimated based on the critical precipitation for failure and the soil texture.In the study area k ranges from 5.05 m day −1 in the Luvisols with dominantly sandy texture to 0.0144 m day −1 in Vertisols with dominantly clayey texture.The computed transmissivity ranges between 0 and 13.45 m 2 day −1 (Fig. 5a).The final hydraulic model is shown in Fig. 5b.

Geotechnical parameters of superficial soils
All geotechnical parameters mentioned in this section, related to soil weight (ϒ m , ϒ sat , ϒ sub ) cohesion (c ) and friction angle (φ ), were based on literature and were defined specifically for the superficial soils above the bedrock within each lithological unit.
Superficial soils in the study area are mainly regoliths and colluvium deposits that had suffered little mobilization along the slopes, which is explained by the low energy of the existing landforms.As a consequence, soils above the bedrock are typically shallow and can be assumed to be very close to the parent lithology regarding composition.Therefore, in this case study, we consider appropriate to use the lithological mapping units for the regionalization of the shear strength parameters of the superficial soils above the bedrock.The specific (ϒ m ), saturated (ϒ sat ), and submerged (ϒ sub ) soil weights values were provided by Pimenta (2011) for the superficial soils above the bedrock within each lithological unit and are summarized in Table 4.
The strength parameters of the lithological units obtained in laboratory with direct shear tests Pimenta (2011) proved to be too high to explain the observed slope instability.Therefore, the optimal combinations of cohesion and effective internal friction angle values for each lithological unit were defined iteratively through back-analysis.Different combinations of cohesion and effective internal friction angles were tested with the infinite slope method and validated with the landslide training group (landslide area), using as reference the maximum and minimum friction angles suggested by Geotechdata (2013).Critical pairs of cohesion and internal friction angle were selected for each lithological unit by combining two criteria: (i) the susceptibility class with FS ≤ 1 must include at least 50 % of landslide area of the landslide training group located on the lithological unit and (ii) the susceptibility class with FS ≤ 1 must have the highest effective ratio, which is expressed by the ratio between the percentage of landslide area predicted in the class and the percentage of the class area in the study area (Chung and Fabbri, 2003).In the cases of lithological units L2 and L5, it was not possible to comply with criterion (i), but the corresponding critical pair cohesion-internal friction angle were selected with respect to criterion (ii).In addition, strength parameters of lithological units L1 and L8 could not be estimated with this method due to the absence of landslides in these lithological units.In these cases, the cohesion and effective internal friction angle were derived directly from Pimenta (2011), who gathered information from technical reports, geotechnical laboratory tests, and standard values reported in the literature (Baptista, 2004;Cernica, 1995;Fernandes, 1994;Jeremias, 2000;Vallejo et al., 2002).Table 4 summarizes the geotechnical parameters of the lithological units used to implement the physically based model.

Validation, comparison, and combination of shallow slide susceptibility models
The validation of susceptibility maps produced by statistical and physically based models was done independently using the landslide validation group.ROC (receiver operating characteristic) curves were computed and the corresponding area under the ROC curve (AUROC) was calculated.Additionally, the landslide susceptibility maps were classified and the effective ratio of each class was estimated.Both statistical and physically based susceptibility maps were classified considering the same fraction of study area in each equivalent landslide susceptibility class.First, the IS map was ranked into five classes based on the factor of safety values (≤ 1, 1-1.25, 1.25-1.5,1.5-2, and > 2), which correspond, respectively, to the following descriptive classifications of susceptibility: very high, high, moderate, low, and very low.Next, the IV map was organized into five classes (very high, high, moderate, low, and very low) ensuring that equivalent susceptibility classes cover the same fraction of the study area in both maps.The evaluation of the spatial agreement between landslide susceptibility maps based on statistical and physically based approaches was made using the Rank Differences tool included in ArcSDM (Sawatzky et al., 2008).Lastly, statistical and physically based susceptibility maps were combined into a final shallow slide susceptibility map based on the intersection of the susceptibility classes in a contingency table, using the Map Comparison Kit (Visser and Nijs, 2006) on a cell by cell comparison and kappa statistics.2, and the corresponding shallow slide susceptibility map is shown in Fig. 6.The spatial distribution of susceptibility shows a clear contrast between the northern and northeastern sectors of the study area in which the susceptibility is predominantly classified as low to very low, whereas in the central-southern part of the study area the susceptibility to shallow slides is typically higher.This contrast is mainly justified by the lithological differentiation.In fact the lithological units L7 (marls and clays) and L5 (limestones) are found in the northern part of the study area, and they apparently have a low predisposition to shallow slide occurrence (Table 2).In contrast, lithological units more prone to slope instability (L2, limestones and marls; and L3, sandstones and limestones) occur as outcrops in the central and southern part of the study area.In addition, the slope angle tends to be higher in the latter part of the study area, thus contributing to the higher landslide susceptibility.
The ROC curve of the landslide susceptibility model is shown in Fig. 7.The IV model predictive capacity is reasonable/good, as expressed by the AUROC of 0.75.

Physically based landslide susceptibility assessment
The shallow slide susceptibility map computed with the IS method is shown in Fig. 8a.The susceptibility class with FS ≤ 1 (very high susceptibility) covers 17.9 % of the total study area and validates 53.4 % of the shallow slides belonging to the landslide validation group, which explains the higher effective ratio (2.98) of this susceptibility class (Table 5).By comparison with the IV susceptibility map, the increment of area classified with very high or high susceptibility is clear in the northern sector of the study area where lithological unit L7 outcrops, whereas the spatial expression of the two highest landslide susceptibility classes decreases in the southwestern-southern sector where the lithological unit L2 outcrops.The ROC curve of the model based on the landslide validation group is shown in Fig. 7.The ROC curve is closer to the upper-left corner of the ROC curve graphic, which confirms the best predictive capacity of the IS susceptibility map when compared with the IV susceptibility map.
The AUROC of 0.81 also supports the better predictive capacity of the IS model.As mentioned above, shallow landslides have been triggered by rainfall in the study area, typically during intense short-duration rainfall events (Zêzere et al., 2005(Zêzere et al., , 2015;;Zêzere and Trigo, 2011).Additionally, extensive field work in the study area (Oliveira, 2012) has shown a total absence of instability signs during the summer, which is consistent with the dryness that characterizes this season.Therefore, a typical situation of superficial absence of water in the soil during summer, i.e. m = 0, is implicit; accordingly, an additional physically based shallow slide susceptibility map was prepared considering no water in the soil (m = 0).Figure 8b shows the model results.Given the assumed boundary conditions, it was expected that the model would not generate  FS ≤ 1.However, Fig. 8b shows a small fraction of the study area classified with very high susceptibility (FS ≤ 1, 2.25 % of study area) under conditions of absence of water into the soil, which is interpreted as an error of the IS model.It is worth mentioning that most of the model errors occur over the lithological unit L2, indicating that the corresponding resistance parameters (cohesion, internal friction angle) may be underestimated.The cohesion and internal friction angle values that guarantee FS > 1 for any lithological unit in the absence of water into the soil (m = 0) are summarized in Table 4 (in brackets).These geotechnical parameters were tested in a new model considering the existence of water into the soil (susceptibility map not shown in the present work), and the obtained result is not reliable: the area classified as unstable (with FS ≤ 1) corresponds to only 1.3 % of the total study area and validates only 8.1 % of the landslides belonging to the training group.Therefore, we conclude that the geotechnical param-eters that guarantee the absence of cells with FS ≤ 1 when m = 0 (features in brackets in Table 4) are too high to correctly express the landslide susceptibility in the study area.

Comparison of landslide susceptibility models
The comparison of the susceptibility maps produced with IV and IS methods demonstrates that spatially the susceptibility ranking differs substantially depending on the method used.Indeed, the kappa coefficient is only 0.23, which means that spatial correlation is moderate, although the reasonable/good predictive capacity of both models was attested by the AU-ROC (Fig. 7).
Table 5 summarizes the spatial extension, the percentage of shallow slides of the landslide validation group, and the effective ratio of each susceptibility class for the two predictive models (IV and IS).The two highest classes in the IV landslide susceptibility map spread over 34.1 % of the to-tal study area, and the corresponding percentage of predicted shallow slides approaches 69.4 %.The performance of the predictive model is weaker for the intermediate susceptibility classes (moderate and low), in particular for the low susceptibility class that includes a relevant portion (15.7 %) of shallow slides belonging to the landslide validation group.The IS landslide susceptibility model reveals a better predictive capacity confirmed by the fact that 83.1 % of the landslide validation group falls into the two highest susceptibility classes.
The effective ratios calculated for the landslide susceptibility classes of both models are summarized in Table 5.The effective ratios for the IS model are higher for the very high and high susceptibility classes and lower for the low and very low susceptibility classes than the effective ratios of the IV model for the same classes, which indicate a better predictive capacity of the IS model.
The spatial comparison of the two susceptibility maps is shown in Fig. 9.The value 0 means spatial agreement between landslide susceptibility classes, whereas values other than 0 mean disagreement.Negative values indicate that landslide susceptibility obtained with IV is lower when compared with the map obtained with IS, with the difference increasing from −1 to −4.For example, a grid cell with a score −4 means this terrain unit was classified as very high susceptibility in the IS susceptibility map and as very low susceptibility in the IV susceptibility map.Positive values indicate the opposite relationship between map classes.The perfect spatial agreement between susceptibility classes in both maps occurs in 39.9 % of the study area (Table 6).However, adding the minimum mismatch classification (−1 and +1 in Fig. 9), the previous feature rises to 73 % of the total study area.The major discrepancy between the two susceptibility maps (−4, −3, 3, and 4 in Fig. 9) occurs along 10.5 % of the study area, namely where the lithological units L7 and L2 outcrop.In the northern part of the study area where the lithological unit L7 is present, the landslide susceptibility obtained with the IV method is lower than the one obtained with the IS method, whereas the opposite occurs in the central and southern part of the study area where the lithological unit L2 is present.
These results can be explained by the particular specifications associated with the physically based and statistical methods.The resistance parameters estimated for the superficial soil over lithological unit L7 (c = 2 kPa, ϕ = 19 • ) are higher than those estimated for lithological unit L2 (c = 0.5 kPa, ϕ = 17 • ).However, the landslide susceptibility computed using the IS tends to be higher over lithological unit L7, which is related to the soil water content and eventually to the presence of thicker soils, particularly along the lower part of slopes where topographic conditions are more prone to soil saturation.On the other hand, the statistical approach generated IV scores of 0.494 and −0.857, respectively, for lithological units L2 and L7.The positive IV score for lithological unit L2 clearly indicates a higher likelihood of shallow slide occurrence.We acknowledge that the shallow slides inventory may be incomplete in the area corresponding to lithological unit L7, which could justify the negative IV score.The slope and land use clusters observed within lithological units L2 and L7 are shown in Fig. 10a  and b.The lithological unit L7 (clays and marls) are mainly associated with gentle and moderate slopes (slope angles between 5 and 15 • ) and are characterized by intense agricultural use that extends over 78.1 % of the L7 surface; thus, the footprint of small shallow slides is easily erased on the landscape, as the "original" slope profile is recovered for agricultural activities.On the contrary, the lithological unit L2 is constituted by sequences of marl and limestone layers, which induce larger topographic irregularities and less productive soils on steep to moderate slopes.The existing geological and geomorphological conditions favoured the prevalence of forest and annual crop cultures instead of the intensive agricultural activity.In this context, the landslide footprint over slopes tends to last longer, as proven by the larger number of shallow slides mapped over lithological unit L2 (Table 1: 45 shallow slides, 8.3 shallow slides km −2 ) when compared with lithological unit L7 (Table 1: 17 shallow slides, 5.2 landslides km −2 ).Therefore, the inventory of shallow slides is assumed to be more complete over lithological unit L2, which explains the higher IV score.

Combination of landslide susceptibility models
The results of the cross tabulation between landslide susceptibility classes of both susceptibility maps (statistical and physically based) are summarized in a contingency table (Table 6).The distribution of shallow slides belonging to the validation group on the same contingency table is summarized in Table 7. Table 6 shows the combinations considered within the contingency table to classify the final landslide susceptibility map resulting from the integration of statistical and physically based predictive models; the number labels defined in the footnote (1-6) represent the final susceptibility classes (very high, high, moderate, low, very low, and uncertain, respectively).The corresponding final shallow slide susceptibility map is shown in Fig. 11, and information about final landslide susceptibility classes is detailed in Table 8.The very high susceptibility class covers 16.4 % of the study area and includes 55.6 % of the shallow slide validation group, and the high susceptibility class covers 14.3 % of the study area and includes 18.6 % of the shallow slides.In contrast, the very low and low susceptibility classes cover 33.4 and 10.6 % of the study area, respectively, and include only a small fraction of the landslide validation group (1.4 % each class).
Terrain units classified as very high or high susceptibility by one method and simultaneously as very low or low susceptibility by the other method were considered as uncertain regarding susceptibility to shallow slide occurrence in the final map.The class designated as uncertain is potentially highly or very highly susceptible to landslides, covers 16.3 % of the study area, and includes 16.0 % of the shallow slides belonging to the validation group.However, the distribution of the landslide validation group in the uncertain susceptibility class is different in the upper-right corner and in the lower-left corner of the contingency table (see Tables 6  and 7).Terrain units classified as very high or high susceptibility by the IS susceptibility map and as very low or low susceptibility by the VI method (upper-right corner in Tables 6  and 7) include 14.7 % of shallow slides belonging to the validation group, whereas terrain units with inverse classification (lower-left corner in Tables 6 and 7) only contain 1.2 % of the shallow slide validation group.These values, once more, reflect the higher quality of the physically based susceptibility model in comparison with the statistical model.
The predictive quality of susceptibility classes that make up the final landslide susceptibility map is confirmed by the estimated effective ratios (Table 8).The effective ratio of the very high susceptibility class (3.39) is higher than that obtained for the equivalent susceptibility class with the statistical and physically based methods (cf.Table 5).In addition, effective ratios corresponding to the very low and low susceptibility classes (0.04 and 0.12, respectively) are lower than those obtained with statistical and physically based methods (cf.Table 5), which indicates a better predictive performance of the combination of the two landslide susceptibility models.Moreover, the effective ratio is higher for the uncertain class than for the moderate class (Table 8), which is consistent with the potential for high or very high susceptibility considered for the uncertain class.

Conclusion
Statistical and physically based methods used to assess landslide susceptibility at the basin scale are conceptually distinct as the former are based on weighting environment predisposing factors, whereas the latter are supported by the computation of shearing and resistance forces along potential slip surfaces.The existence of a landslide inventory is crucial to weight predictive variables within statistical meth-ods, which is not the case for physically based methods that can be computed independently on the landslide inventory.Both types of methods have advantages and drawbacks.The major constraints associated with statistical approaches have been summarized in previous works (Corominas et al., 2014;Fell et al., 2008a) and result from (i) the difficulty of establishing causal (cause-effect) relationships between variables; (ii) problems arising from self-correlation between variables; (iii) the typically not normal statistical distribution of predictor variables; (iv) the limitations related to the quality of data, in particular the completion of the landslide inventory; and (v) the difficulty in transferring the results from the study area to other areas, even with similar characteristics.In the case of physically based methods, the major constraints were listed as follow (Corominas et al., 2014;Fell et al., 2008a): (i) the high level of generalization and/or simplification regarding the spatial distribution of geotechnical or hydrological parameters, (ii) the feasibility of model application is limited to areas with relatively homogeneous ground conditions (e.g.geology and geomorphology), (iii) the uncertainties about the depth of the soil and of the slip surface, and (iv) the difficulties in predicting groundwater pore pressures and their relationship with rainfall.
In this work we tested two hypotheses: (i) although conceptually distinct, statistical and physically based methods generate similar results concerning susceptibility to shallow slide occurrence and (ii) a reliable landslide susceptibility map can be obtained for a single study area by combining two landslide susceptibility models (statistical vs. physically based).
To achieve the proposed objectives the information value method and the infinite slope method were chosen to build two landslide susceptibility maps.A shallow slides inventory was separated into two independent landslide groups adopting a temporal criterion.Although some differences exist regarding the morphometric characteristics and the spatial distribution over lithological units of the two shallow slide subsets (training group and validation group), they were used independently in order to allow model comparison and validation.The training group was used 2-fold to define the statistical relationships between landslides and the dataset of variables assumed as landslide predisposing factors by the IV method and to calibrate the resistance parameters (cohesion and internal friction angle) within the IS method.The landslide validation group was used to validate both susceptibility maps independently.Some sources of bias were identified in the present work.Firstly, although the infinite slope stability model remains physically based, the used geotechnical parameters lose, to some extent, their direct physical meaning since critical cohesion and internal friction angle combination were determined statistically assuming the highest effective ratio.Another potential source of bias is the use of the lithological map instead of the soil map to generalize the geotechnical properties of superficial soils.We acknowledge that the use of the soil map could be appropriate for that purpose, taking into account the shallow characteristics of landslides.However, the national soil map at the 1 : 25 000 scale provides 55 different soil-type classes for the study area, which is a very large number to be balanced with the 51 shallow slides of the landslide training group used to evaluate the critical cohesion and internal friction angle parameters by back-analysis.If landslides were regularly distributed over soil types, we would get only 0.9 shallow slides in each soil class, which is insufficient to estimate the resistance parameters.Moreover the occurrence of shallow slides was only verified on 18 of the 55 soil-type classes, which would strongly increase the uncertainty of the regionalization of shear strength parameters based on the soil map.
When analysed separately, both methods generated good predictive results, although the physically based model was revealed to be more effective in the spatial prediction of shallow landslides, which is attested by the AUROC and the effective ratio of landslide susceptibility classes.In addition, the application of the kappa statistics showed that the overall spatial agreement between susceptibility classes of both maps is only moderate (κ = 0.23), so the first hypothesis is only partially confirmed.The major differences were registered over two lithological units (L2 and L7) and may result from the probable incompleteness of the shallow slides inventory over the lithological unit L7, as a consequence of human interventions related to agricultural activities.
Although similar approaches in the past mathematically merged the results obtained from two conceptually different susceptibility models (Günther and Thiel, 2009), in our work the final shallow slide susceptibility map was produced by combining the results obtained with the statistical and physically based methods through a contingency table.The final result proved to be reliable, as shown by the effective ratio of the extreme susceptibility classes (very high, low, and very low).Thus, the second hypothesis is confirmed.Although it was possible to identify uncertain areas with one single model by varying some input assumptions and parameter combinations, our work demonstrates that the combination of both methods using a contingency table allowed the identification of areas classified as uncertain regarding landslide susceptibility but with the potential to be highly or very highly susceptible to shallow slide occurrences, which is not possible when using a single landslide susceptibility model.This is particularly relevant in areas where the completeness degree of the landslide inventory is not equivalent over different lithological units, as is the case in the present study.
For future works an important contribution for augmenting the reliability of statistical models would be the update of the landslide inventory immediately after any landslide event, particularly in areas subject to intensive agricultural practices.The physically based models will certainly improve, with a more reliable soil thickness model resulting from the densification of field soil sampling measurements covering all lithological and soil classes.The comprehensive aggregation of soil classes should sustain the more reliable regionalization of hydrological properties and shear strength parameters.In addition, the improvement of the resolution of the DEM of the study area using lidar technology should increase the reliability of both landside susceptibility maps based on statistical and physically based methods.

Figure 2 .
Figure 2. Methodological framework to compare and to combine statistical and physically based landslide susceptibility models.

Figure 3 .
Figure 3. Dataset of shallow slide predisposing factors.(a) Slope, (b) aspect, (c) profile slope curvature, (d) topographic position index, (e) slope over area ratio, and (f) land use.Lithology is shown in Fig. 1, and the classes of landslide predisposing factors are described in Table2.

Figure 4 .
Figure 4. (a) Soil thickness map; (B) correlation between the field-measured soil thickness at sampling points and the estimated soil thickness.The eight sampling field points with more than 1 m error are not plotted.

Figure 5 .
Figure 5. Transmissivity (a) and ratio h/z (b) for the hydraulic model of the study area.

Figure 7 .
Figure 7. ROC curves based on independent validation of IV and IS shallow slide susceptibility models.

Figure 9 .
Figure 9. Spatial agreement between IV and IS shallow slide susceptibility maps.A value of 0 means full agreement; 4 and −4 mean maximum disagreement.

Figure 10 .
Figure 10.Frequency of slope (a) and land use clusters (b) within lithological units L2 and L7.Flat areas were not considered.

Figure 11 .
Figure 11.Final shallow slide susceptibility map resulting from the combination of IV and IS susceptibility maps.

Table 1 .
Shallow slides inventory characteristics and distribution per lithological unit.Landslides affecting more than one lithological unit were considered within the dominant lithological unit.

Table 2 .
Description of landslide predisposing factor classes and respective information value scores (I i ).

Table 3 .
K c constant calibration parameter for each lithological unit.

Table 4 .
Geotechnical parameters assigned to each lithological unit.In brackets, cohesion and internal friction angle for each lithological unit to guarantee FS > 1 in the absence of water into the soil (m = 0).

Table 5 .
Effective ratio of classes defined for the IV and IS shallow slide susceptibility maps.

Table 6 .
Contingency table extracted from the overlay of IV and IS shallow slide susceptibility maps in percent (%) of the study area.Numbered footnotes represent the susceptibility classes of the final map.

Table 7 .
Distribution (%) of shallow slides of the validation group in classes obtained by overlay IV and IS shallow slide susceptibility maps.

Table 8 .
Susceptibility classes and corresponding effective ratios of the final shallow slide susceptibility map.