HIRESSS: a physically based slope stability simulator for HPC applications

HIRESSS ( HI gh REsolution Slope Stability Simulator) is a physically based distributed slope stability simulator for analyzing shallow landslide triggering conditions in real time and on large areas using parallel computational techniques. The physical model proposed is composed of two parts: hydrological and geotechnical. The hydrological model receives the rainfall data as dynamical input and provides the pressure head as perturbation to the geotechnical stability model that computes the factor of safety (FS) in probabilistic terms. The hydrological model is based on an analytical solution of an approximated form of the Richards equation under the wet condition hypothesis and it is introduced as a modeled form of hydraulic diffusivity to improve the hydrological response. The geotechnical stability model is based on an infinite slope model that takes into account the unsaturated soil condition. During the slope stability analysis the proposed model takes into account the increase in strength and cohesion due to matric suction in unsaturated soil, where the pressure head is negative. Moreover, the soil mass variation on partially saturated soil caused by water infiltration is modeled. The model is then inserted into a Monte Carlo simulation, to manage the typical uncertainty in the values of the input geotechnical and hydrological parameters, which is a common weak point of deterministic models. The Monte Carlo simulation manages a probability distribution of input parameters providing results in terms of slope failure probability. The developed software uses the computational power offered by multicore and multiprocessor hardware, from modern workstations to supercomputing facilities (HPC), to achieve the simulation in reasonable runtimes, compatible with civil protection real time monitoring. A first test of HIRESSS in three different areas is presented to evaluate the reliability of the results and the runtime performance on large areas.


Introduction
Soil slips and debris flows are among the most dangerous landslides (Jakob and Hungr, 2005): the threat they pose to human activities and life is mainly due to the high velocity that they can reach during the run out and to the nearly total absence of premonitory signals. These movements are usually triggered by heavy rainfall and therefore they have the same extemporaneous character. Moreover, small and apparently harmless debris flows, triggered by small zones of unstable slopes, can group from different sources in channels greatly increasing mass displacement and destructive powers reaching velocities up to 20 m s −1 . There are several examples that show the destructive power and the extemporaneous character of shallow landslides. Some Italian regions are under continuous threat and every year are hit by these phenomena that usually cause damage to infrastructures and occasionally even human casualties (Tofani et al., 2006).
Despite the large number of studies, publications and applications available nowadays, the prediction of shallow landslides over large areas in real or near real time remains a very complex task . This is mainly due to: the necessary simplifications introduced in hydrological and geotechnical models (Crosta and Frattini, 2003;, the errors introduced by rainfall predictions (Jakob et al., 2012), the consequences of the uncertainties in the knowledge of morphometric, mechanical and hydrological parameters of soils (Segoni et al., 2012) and the extremely high computational effort required to operate on a basin scale .
Distributed slope stability models apply algorithms and equations to every cell of an extended area. Usually the analyzed area is divided into pixels, and sometimes it is necessary to apply the model equations at different depths to each of them. As a consequence, the computation can be extremely time consuming depending on the thickness of the soil, the extension of the studied area, the spatial and temporal resolution and the complexity of the equation. Many softwares have been developed to handle this large amount of computations to apply stability models to large areas and to visualize the results. It is usually possible to find two different software approaches: plug-in oriented and stand-alone. Plug-in oriented codes are routines or add-ons that work on an existent software that provides a platform; this approach usually discharges all the file management and logical operations on the platform software and in some cases even part or all calculations are entrusted to the host software computational engine. Stand-alone software has a file management system and a dedicated and optimized computing routine which is developed in a universal programming language (C++, Fortran, Basic, etc.).
SHALSTAB, SHAllow Landslide STABility model, is a popular distributed slope stability analysis software (Dietrich et al., 1998). It has a physical core based on a distributed steady state description of the hydrological fluxes coupled with an infinite slope analysis. The basic tool is a grid-based model, a combination of C++ programs and ARC/INFO AML scripts intended to be used within an ESRI-ArcGIS software environment. This model has been classified as spatially predictive because it is not suited to forecast the timing of landslide triggering (Simoni et al., 2008). SINMAP, Stability INdex MAPping, and SINMAP 2 are other add-on tools for the ESRI-ArcGIS software. These have their theoretical basis in the infinite slope stability model with groundwater pore pressures obtained from a topographically based steady state model of hydrology (Pack et al., 1998(Pack et al., , 2001. The input information (slope and specific catchment area) is obtained from the analysis of digital elevation models (DEM). These parameters can be adjusted and calibrated with an interactive visual procedure that adjusts them based upon observed landslides. SINMAP allows an uncertainty of the variables through the specification of lower and upper bounds that define uniform probability distributions. Between these boundaries the parameters are assumed to vary at random with respect to the probability distribution.
Other softwares have a more complex approach to the hydrological modeling of the groundwater flow and require longer computational time. For example, SEEP/W (Geo-Slope, 2003a) is a stand-alone finite element software that resolves the Richards equations to account for transient groundwater flow within a slope. This software analyzes groundwater seepage and excess pore-water pressure dissipation within porous materials and can model both saturated and unsaturated flow (Krahn, 2004). SEEP/W is very efficient in resolving saturated-unsaturated and timedependent problems and in combination with the software SLOPE/W (Geo-Slope, 2003b) it performs the slope stability analysis adopting the limit equilibrium method. This software works very well for single slope stability analysis (Tofani et al., 2006) but is not suited to be applied to a distributed analysis.
TRIGRS, Transient Rainfall Infiltration and Grid based Regional Slope stability model (Baum et al., 2002, is a software developed in Fortran language, for computing the transient pore pressure distribution due to rainfall infiltration using the method proposed by Iverson (2000). The results are stored in a distributed map of the factor of safety. TRI-GRS, freely distributed both as source code and executable files, is widely used by many authors for regional landslide hazard assessment (Baum et al., 2005;Salciarini et al., 2006;Chien-Yuan et al., 2005) under the approximation of nearly saturated soil, presence of flow field and isotropic and homogeneous hydrologic properties (Baum et al., 2002). TRIGRS is very sensitive to initial conditions, therefore, if the initial water table depth is poorly constrained, it may produce questionable results (Zonghu et al., 2011).
GEOtop-FS is one of the most advanced models for distributed slope stability and was recently proposed by Simoni (2008). This model uses the hydrological distributed model GEOtop (Rigon et al., 2006) to compute pore pressure distribution by an approximate solution of the Richards equation and an infinite slope stability analysis to compute the distributed factor of safety. The approximate solution of Richards equation used by the software works in saturated soil conditions. The factor of safety of GEOtop-FS is computed in a probabilistic approach assigning statistical distributions to soil parameters instead of a single deterministic value and analyzing the error propagation. Apip et al. (2010) proposed a model that combines a satellite real time estimate of rainfall intensity, a one-dimensional physically based distributed hydrological model based on grid-cell kinematic wave rainfall-runoff model (Kojima and Takara, 2003) and a geotechnical stability model based on a infinite slope and on Mohr-Coulomb law. The hydrological model simulates three lateral flow mechanisms: subsurface flow through capillary pores; subsurface flow through noncapillary pores; surface flow on the soil layer.
All these softwares use different models, approximations and programming languages but they have one common characteristic: all are suitable only for research purposes. In all these cases, computational speed is not the main objective. Even using modern computational hardware, workstations or personal computers, the computational time can take days for a relatively small area at high spatial and temporal resolution. It is impossible to use these softwares, even if they are state of art, in real time and for warning system purposes. To achieve this objective a physical model was developed with these main characteristics: (a) the capability of computing the factor of safety at each time step and not only at the end of the rainfall event; (b) the variable-depth computation of slope stability; (c) the introduction of the contribution of soil suction in unsaturated conditions; (d) the probabilistic treatment of the uncertainties in the main hydrological and mechanical parameters and, thus, of the factor of safety.
A model with the aforementioned capabilities cannot be applied continuously over a large area without resorting to supercomputers and parallel processing. For this reason, the entire model programming code of HIRESSS was developed to run over multiprocessor systems and was tested for performances with an increasing number of processing units to design an optimal cost/benefit approach covering the entire prediction chain, from rainfall data acquisition to the factor of safety computation.

The model
The modeling of physical processes involved in shallow landslide triggering usually requires some simplifications: we developed our model trying to reach a compromise between computational speed and the reliability of the results.
The physical model proposed is composed of two parts: hydrological and geotechnical. The hydrological model receives the rainfall data as dynamical input, calculates the pressure head and supplies it to the geotechnical stability model that provides results in terms of factor of safety (FS). FS is a dimensionless parameter that implies the beginning of instability when it assumes the value 1, because the destabilizing forces are equal to those stabilizing (FS > 1 means instability).

The hydrological model
The hydrological model is based on an approximate solution of the Richards equations that represents the unsteady Darcian fluid flow in a porous media, in any saturation condition. The general form of the Richards equation (Richards, 1931;Bear, 1972) is a non-linear partial differential equation that does not have an analytical solution. There are two techniques that allow us to get a solution for this general form without approximation hypothesis: finite difference (FDM) and finite elements (FEM). The final solution is obtained evaluating the equation with consecutive small differences, or steps. The step must be quite small to have a convergent solution that means small spatial and temporal steps and many computations compared to an analytical solution. The advantage of this approach is undoubtedly that it is not approximation dependent. However, in favor of short computational time, it was decided to use an analytical approximate solution of the general equation.
The first approximation regards the timescale connected to shallow landslides. The physical process that leads to landslide triggering can operate on different timescales (Iverson, 2000): for times greater than A/D 0 , the groundwater pressure head gets to a steady background distribution in response to a rainfall, where D 0 [L 2 T −1 ] is the maximum hydraulic diffusivity and A[L 2 ] is the catchment area that might influence the pressure head distribution in a point of the analyzed area. Therefore, A/D 0 represents the minimum time needed for lateral pore pressure transmission from the area A. The triggering of landslides is instead the result of a rainfall over a shorter timescale of Z 2 /D 0 (Z is the depth from the slope surface) which is associated with transient pore pressure transmission during and immediately after a rainstorm (Iverson, 2000).
If we consider wet initial conditions and short time scale, according to Iverson (2000), the gravity flux can be neglected and the pressure head equation becomes where h is the pressure head, D 0 is the maximum diffusivity (D 0 = k sat /C 0 where C 0 is the change in volumetric water content per unit change in pressure at saturated condition, k sat is the saturated soil conductivity) and α the slope angle. Equation (1) is a linear partial differential equation and allows the superposition principle: the net response at a given place and time caused by two or more stimuli is the sum of the responses which would have been caused by each stimulus individually. Mathematically that means: where f is a function of a generic variable x i . Therefore, it is possible to analyze a complex rainfall path, which means different intensity and duration, managing them as a sum of different stimuli response.
The solution of the partial differential Eq. (1) is well known in thermodynamics (Carslaw et al., 1959); it is an analytical solution with these boundary conditions: where d Z is the steady water table depth, T is the rainfall duration, I z the rainfall intensity and β is defined as where k z is the soil conductivity in Z direction. The first boundary condition from Eq. (3) assumes a steady state pressure head distribution, the second assumes that at great depths below the water table the transient vertical groundwater flux becomes negligible but the steady state pressure head distribution persists. The last two conditions state that Darcy's law governs the water entry at the ground surface and that the pressure head distribution is defined by β when it is not raining (t > T ) and by β plus a short time infiltration rate during rainfall (t ≤ T ).
With these boundary conditions the solution of the Eq. (1) is where the response function R is defined as and erfc is the complementary error function. There are two other conditions to this solution: the maximum infiltration rate and the maximum pressure head sustainable at the surface. The maximum infiltration rate is defined as the rate I Z /k sat =1, which means if the I Z overreaches the saturated conductivity, the exceeding rainfall runs off as Horton over-land flow, which is not considered in this model. The maximum pressure head sustainable is (Zβ), over this value we have the unrealistic physical condition of a water column that leaves the soil without flowing away. This restriction is rather ad hoc but necessary when using a linear model and constant flux boundary to approximate the nonlinear effects of rainfall infiltration (Iverson, 2000). The hydrological model has the same sensitivity at the initial water table height of the Iverson (2000) model. We overcome the issues due to water table sensitivity using a Monte Carlo simulation (Sect. 2.3) after the computation of the pressure head.
There is one parameter that is very important in the timing response of rainfall intensity of this hydrological model: the hydraulic diffusivity. Unfortunately, this parameter is difficult to measure, especially on a large scale measurement campaign needed for a model applied to a large scale area. Therefore, the operating philosophy on a large regional areas suggests modeling physical properties and to relate the parameters that are difficult to measure with others that are easier to collect.
The hydraulic diffusivity D(h) is defined as Where the derivative part is the change in volumetric water content per unit change in pressure head and k(h) is the conductivity (θ is the soil water content). In the proposed model, the Brooks and Corey soil water retention mathematical relationship was used (Brooks et al., 1962(Brooks et al., , 1964: where θ s is the soil water content at the saturation, θ r is the residual water content, h b the bubbling pressure and λ the pore size index distribution. The diffusivity can be expressed, deriving Eq. (9) and using the Brooks and Corey expression of k(h): Combining the Eqs. (8), (9) and (10) the hydraulic diffusivity form becomes As stated, the proposed hydraulic model is under the wet condition hypothesis that is compatible with the Brooks and Corey theory, which describes the soil water retention curve for matric potentials lower than the bubbling pressure and not suitable for dry conditions. In wet conditions, tending to saturation, as used to obtain the solution at Eq. (2), the relation Eq. (11) can be simplified as where θ s is rewritten in function of the saturation degree (S) and the porosity n (θ = S × n).

The geotechnical model
The stability analysis is based on the limit equilibrium method for an infinite slope. It is observed that shallow landslides are usually characterized by an elongated shape and the influence of the toe and head portion is usually negligible, therefore it can be represented as a single slice with the slide surface approximately parallel to the ground surface. If the landslide has a low depth compared to length and width, as is common for shallow landslides, it is possible to assume a simplified geometry of the slide characterized by a planar slip surface on an infinitely extended planar slope, both laterally and distally. It assumes that the failure is the result of translational sliding, that the failure plane and the water table are parallel to the ground surface and that the failure occurs along a single layer of infinite length. The forces acting at a point along the potential failure plane are those illustrated in Fig. 1. The hydrological model computes the pressure head in relation to the depth, therefore, it is possible to evaluate the stability at different y values. In relation to the pressure head, an evaluated point in the soil can be saturated or not. If the soil is unsaturated, it is possible to write the limit equilibrium equations for each axis, x and y, of the reference system in Fig. 1 as where m(y) is the mass of the columns of y depth soils, F N the normal force, F A the friction force and F C the effective cohesion forces. The soil suction in an unsaturated soil can be considered as a pressure that raises the friction force. Therefore, the F A is the static friction force plus a contribution, F S , deriving from the soil suction where µ is the dimensionless static friction coefficient that in geotechnical science is better known as the tangent of friction angle. The suction force according to Fredlund and Rahardjo (1993) can be expressed as the product of the suction pressure for a surface A where φ b is an additional friction angle needed to account for the contribution of the matric suction to shear strength, u a and u w are respectively the air pressure in the soil and the water pressure. Solving the two equation systems, Eq. (13), and expressing all the forces, it is possible to write the limit equilibrium equation as where c is the effective cohesion. In this model, we consider the soil homogeneous and isotropic and we consider a two state model of soil density: wet or dry. If the soil is unsaturated, the soil density is assumed as equal to the density of dry soil. This is a first approximation in the HIRESSS model. The model response, in this way, can be too sharp but is smoothed thanks to the Monte Carlo simulation, as we will see in the next paragraph. Considering this hypothesis, the mass at an unsaturated depth y can be written as m(y) = ρAy where ρ is the density of dry soil. Dividing Eq. (16) by the left term, considering the soil mass hypothesis and the relationship of dry soil unit weight (γ d = ρg) it is possible to write the condition of stability as The right term of Eq. (17) is known as the "factor of safety" (FS) because it is the rate between resisting forces and driving forces.
In Eq. (17) u a u w , then the air pressure usually can be neglected, and the water pressure expressed in function of the pressure head h and the water unit weight γ W (u w = γ w h). Similar to the approach proposed by Lu et al. (2010), the additional friction angle ϕ b can be related to the soil water retention curve and to the friction angle ϕ (Vanapalli et al., 1996) tan Using the Van Genuchten (1980) soil water mathematical expression and putting together the Eqs. (17), (18) and (19), we obtain the following relationship for the factor of safety of unsaturated soil: If the pressure head is positive, at soil depth y, the soil is saturated: in this case the soil suction disappears because all pores are saturated and the capillary force is null. Therefore, the contribution due to suction in friction force Eq. (14) disappears. When the soil becomes saturated, another force must be considered: the force that comes from hydrostatic pressure, the pressure exerted by a fluid at equilibrium due to the force of gravity. The static equilibrium Eq. (13) become where the F hyd is the hydrostatic force obtained from the product of a surface A with the well-known hydrostatic pressure relationship: where ρ w is the water density and h is the height of the saturated soil in the point y that is equivalent at the pressure head value in that point. Operating like with the unsaturated soil condition, the factor of safety for a saturated depth point is where γ sat is the saturated soil unit weight. When the hydrological model gives a negative pressure head, unsaturated soil, Eq. (20) is used to compute the factor of safety, when instead the pressure head is positive FS is evaluated by the Eq. (23).

The Monte Carlo simulation
One of the main drawbacks of the deterministic approach is the uncertainty of the input data: the reliability of the results are strongly dependent on the quality of the input parameters needed by the physical model. Geotechnical and hydrological parameters are extremely variable at all spatial scales, from few meters to many kilometers; this is an intrinsic characteristic of an extremely mixed and chaotic natural material composition. Moreover, if the parameters are evaluated starting from thematic maps, scale and cartographic errors are also introduced. The limits of lithological units may not be as sharp and clear as if they were traced in the maps, especially if we are dealing with a shallow soil part that lays on the bedrock. Evaluating the pressure head and the factor of safety from exact input parameters even in a controlled laboratory test can lead to disappointing results. Some models and slope stability simulators try to solve these problems by characterizing the parameters uncertainty with a range or a probability curve, and evaluating the error propagation function of the model itself. Even this approach is risky because there is no guarantee that in an evaluated point the result is produced from the central value of each parameters error distribution. The result of crossed evaluations of the most probable value with a probability distribution tail value of two parameters can lead to very different results and this is not considered in a simple error propagation approach. Usually the approach used to limit this problem is to improve the knowledge of the input model data with huge amounts of measurements, even continuous ones produced using real time monitoring instruments. The aim of these constant measurements is to decrease the range of uncertainty. This approach is possible on a slope scale, in some cases on a small basin scale, but, it is not applicable on a large scale level of hundreds or thousands of square kilometers. Even restricting the analyzed area and helping the physical model with a large amount of measurements and real time control point, the knowledge will be incomplete because some measurements can only be taken with destructive analysis. On large areas, only a limited amount of measurements are possible and it is necessary to manage very approximative data inputs.
We propose in HIRESSS the use of a technique that helps overcome this problem: the Monte Carlo simulation (Metropolis and Ulam, 1949). This is a statistical nonparametric method that is useful in solving problems linked to a deterministic exact computation. Monte Carlo methods are a class of computational algorithms that use repeated random sampling to compute their results. They are often used in simulating physical and mathematical systems and they are common studying systems with a large number of coupled degrees of freedom such as fluids, strongly coupled solids, disordered materials, and cellular structures. More broadly, Monte Carlo methods are useful for modeling phenomena with significant uncertainties in inputs and allow for determining how random variation, lack of knowledge, or error affects the sensitivity, performance, or reliability of the system that is being modeled.
The values of the input parameters are randomly generated from probability distributions that most closely match data we already have or best represents our current state of knowledge in order to simulate the process of sampling from an actual population. HIRESSS support several types of probability distribution: uniform, Poisson, binomial, hypergeometric and various non-central hypergeometric distributions.
We use, at this step of the HIRESSS developmement and testing, an uniform distribution in all the test. The relative error used to define the uniform probabilty distribution for each parameter is reported in Table 4.
The data generated from the simulation can be represented as probability distributions or converted into error bars, reliability predictions, tolerance zones, and confidence intervals.
The Monte Carlo methods approach has a defined procedural scheme: -Define a domain of possible data inputs and a probability distribution curve or an equiprobable uniform range of parameters.
-Generate data inputs randomly from the domain using a chosen probability distribution.
-Perform a deterministic computation using the random inputs.
-Repeat the first three points n times.
-Aggregate the results of the individual computations into the final result.
The accuracy of the final results is proportional to the square root of sampling n: in order to improve the accuracy of the aggregate result by a factor 10, the sampling values must be increased 100 times. This method allows for the evaluation of the behavior of the physical model at the crossing of the input parameters that have different probability values. As stated previously, this is more reliable than other error propagation analyses that are strongly bounded to the knowledge and the accuracy of input data. Moreover, the Monte Carlo technique allows overcoming the sensitivity of the hydrological model to the initial water table condition because the results do not come from an exact single value. The hydrological model estimates the pressure head and then we use a uniform probability distribution to assign the range of probable values. The result is a distribution of probability that takes into account not one single value of pressure head but a range of values.
This huge advantage pays off in computational time: the computational time can increase proportionally to the n samplings needed for a good quality simulation. This computation technique is very time consuming and, consequently, hardware demanding: it is necessary to use the most advanced programming techniques that allow the use of all the computational power of modern computers.

The HIRESSS code
HIRESSS, HIgh REsolution Slope Stability Simulator, evaluates the physical model described. It is crucial to find a compromise between physical complexity and time computation because the final objective is a software which is suitable for a real time shallow landslide forecasting system over a large area: in order to be useful in case of hazardous events, the slope stability evaluation has to be compatible with civil protection activation timing.
The main time demanding sources are -The physical model: particularly the Monte Carlo simulation management.
-The high spatial resolution: the target resolution of this work is from 20 to 5 m 2 pixels. These values are usually the most common resolution available for the digital elevation models (DEM) on large areas (more than 10 3 km 2 area).
-The high temporal resolution: in this work the target time step of the simulations varies from 30 m to 1 h. Finer temporal resolutions are technically possible, but we used those values because they are more commonly available for weather forecasts or automated pluviometers measurement.
-The large scale of analyzed area: HIRESSS's objective is to extend the analyzed area over thousands of square kilometers. At the spatial resolution of 5 m the physical model must be evaluated in 800 pixels every km 2 .
The solution to contain or reduce the running time of an optimized code is to distribute the computation. The idea is to divide the problem into smaller parts that can be evaluated at the same time by different processing units, the CPUs (Central Processing Units). This approach allows for the use of high performance computing (HPC) hardware that has thousands of CPUs that can operate simultaneously. These systems are called supercomputers and they can speed up huge computational problems depending on the parallelizing grade of the problem to solve.
HIRESSS is conceived with the use of a supercomputer in mind, because the analysis domain and resolution proposed is not affordable by just any workstation or computational hardware. It is necessary to have the computational power of thousands of modern CPUs to handle in a useful amount of time for civil protection purposes the huge computation amount of a high resolution slope stability simulator. However, the parallelization of calculation improves the computational timing for small areas or for non-time-constrained scientific purposes also in the modern computer. The actual desktop, workstation and often also laptop computer, have more than one processing unit that can work in parallel. The parallel programming presents some difficulties with the synchronization of the computation and it is a new challenge for the code writer.
HIRESSS is developed with these main technical characteristics: speed and portability. The code running time is crucial in a forecasting system for civil protection purposes and the code has to be usable by different machines and on different operating systems.
The code is written in C++ in conjunction with Open-MPI paradigm for the parallelization management. This programming approach guarantees the portability of the code, because it is possible to make an executable file for different operating systems or machines just by using an appropriate compiler. This is a characteristic that differentiates HIRESSS from tools dependent on an existent platform (Ar-cGIS, Matlab,...etc). Moreover, the speed of the HIRESSS is free from the heavy programming mechanism of a platform that must be multipurpose. The languages C++ and the paradigm Open-Mpi are also supported by the majority of supercomputing centers, making the software porting easier. Usually in scientific computation fields the most widely used language is Fortran, the friendliest for managing many mathematical operations and usually considered faster than other programming languages, but it was preferred to develop HIRESSS in C++ for a better integration with a possible civil protection warning system. Usually C++ is the main language for multi purpose software, not only computational oriented, and it is the main reason for this choice. Moreover, at this moment the speed gap from Fortran is not so sensible. The supercomputing diffusion and better compilers have equaled the theoretical speed of two languages, even if there is still some unfriendly management of usual mathematical operations. HIRESSS code is an approximately 5000 line code, 2500 of those are the computational core.
The developing and testing of HIRESSS involved a supercomputing hardware hosted at the supercomputing center CINECA. The supercomputer hardware is a IBM pSeries 575, codename SP6, with 5376 computing cores, 21 TB of RAM and it reaches 101 TFlop s −1 at peak performance.

Valle Armea
The Armea basin (38 km 2 ) is located in the western part of the Liguria region, in the province of Imperia, not far from the border between Italy and France (Fig. 2). The Armea stream begins in the Maritime Alps and flows into the Ligurian Sea with a total length of 16 km.
The area is characterized by very steep slopes and by deeply incised valleys; the hydraulic network has a marked erosive behavior and did not develop a proper floodplain. From a geological point of view, the bedrock of the basin is constituted by Cretaceous and Eocenic flysches (sandstones, marlstones and limestones), organized in a quite complex structural setting which favored the fracturing and the weakening of the rocks, especially in proximity of thrusts, faults, tectonic windows and recumbent folds. While hilltops are characterized by generally shallow soils, at the midslopes and footslopes wide and thick detritic deposits are usually found. Soil cover frequently includes stone blocks and its matrix composition usually reflects the bedrock lithology, ranging from silt (associated to limestones) to sand (in correspondence of sandstones).
The closeness of high mountains (up to 1298 m a.s.l.) to the sea contribute to make the Armea basin a rainy area. The mean annual precipitation is about 1250 mm and it is mostly concentrated in the autumn. The peculiar meteorological conditions, coupled with the geological and geomorphological setting, make the area highly subject to active erosive geomorphic processes. Among these, rainfall induced shallow rapid landslides (including also soil slips and debris flows) are amongst the main and most recurrent phenomena.
In this test site, HIRESSS was used to simulate an event which occurred the 8 December 2006: that day several shallow landslides were triggered by intense rainfall and damaged assets, buildings, infrastructure and injured one person. An inventory map of the landslides that occurred during the event is available, together with radar recordings of the rainfall (Segoni et al., 2009).
A 5 m resolution DEM was used to apply the model and to derive slope gradient values. In the modeling the values of the hydrological and geotechnical parameters were differentiated on a lithological basis using literature data (Segoni et al., 2009(Segoni et al., , 2012 and are reported in Table 1. Soil thickness was taken into account as a spatial variable: a 5 m resolution soil thickness map was obtained using the GIST model (Catani et al., 2010), in which the correlation between soil thickness and three morphometric attributes (curvature, relative position along the hillslope profile and slope gradient) is differentiated on the basis of geomorphological and lithological features of the area. The application of GIST model required to carry out a specific geomorphological field survey and GIS elaborations.

Ischia
The volcanic island of Ischia is located in the southern part of the Tyrrhenian sea (Fig. 2), between 40 • 44 N latitude and 13 • 56 E longitude, 33 km from Naples. This island is 7 km wide from north to south and 10 km from east to west. The coastline is 39 km long and the total surface is 46 km 2 .
For the Island of Ischia, the situation is more complex than the Armea basin as less information is available. Prior to 2006, very little information regarding shallow landslides is available. These events certainly have occurred in the past as both the geologic and geomorphologic settings of the territory are very similar to other areas in Campania (where landslides have occurred frequently and with catastrophic effects), with highly permeable volcanic soils overlying steep massifs (mainly tuff in the case of Ischia). Moreover, many debris flow deposits have been found within the volcanic succession the same as geomorphologic evidences of deep landslides especially on the western side of the island and  (Segoni et al., 2009 , 2006). However, as no precise documentation exists regarding earlier events it is not possible to create a comprehensive landslide inventory. Accurate information exists regarding only the 26 April 2006 rainfall event that triggered four landslides and caused the deaths of four people. The time and location of the landslides is known with good detail, therefore this event was used for validation. HIRESSS was applied using a 5 m resolution DEM and radar rainfall data acquired from civil protection authorities. Lithotechnical units and related main parameters values used in the test have been taken from literature (Tofani et al., 2008) and are reported in Table 2. As in the previous test site, GIST model (Catani et al., 2010) was used to feed HIRESSS with a spatially variable soil thickness map.

Lucca, Pistoia and Prato province areas
This test area is located in north-central Italy, in Tuscany, and includes part of the northern Apennines and has an extent of 3103 km 2 (Fig. 2). This area has been chosen for a test on a large area, which could be representative of a typical sub-regional alert area. The study area, encompassing the provinces of Pistoia, Prato and Lucca, is mainly mountainous and shows two different geological settings.
In the western sector, mountain tops are mainly made up of carbonaceous rocks and have very steep flanks: sub-vertical slopes are quite common and slope gradients higher than 60 • are frequent. Bare rock or discontinuous vegetation can usually be found in these sectors, which are characterized by shallow or absent soil cover. The carbonaceous mountain summits are typically connected to the lower parts of the slopes, composed of metamorphic sandstone and phylliticschist, by talus and scree deposits. This setting favors the presence of gentle midslopes (typical slope gradient values around 25-30 • ) while downslope, at bottom valley, slope gradient increases again (even over 40 • ) as a consequence of river erosive processes resulting from the Olocenic-Pleistocenic uplift of the Apuan metamorphic core. The midand low sectors of the mountainsides are largely characterized by thicker soils (up to 3 m) and dense forest (mainly chestnut).
The eastern sector shows a more uniform geological condition with the prevalence of flysch formation rock type (Macigno) which is composed of quartz and feldspar sandstone alternated with layers of siltstone. The slope gradients vary from 0 • in the alluvial plains to 55 • . In the mid and upper sections of the valleys, where most landslides usually occur, the stratigraphy consists of a 1.5 to 5 m thick layer of colluvial soil overlying the bedrock.
The resolution used to investigate this area is 10 m square pixels, for a total of over 50 million pixels.
An event which occurred in December 2009 was simulated using raingauge pluviometric measurements and feeding the model with a spatially variable soil thickness map (obtained by the application of the GIST model of Catani et al., 2010) and with a set of geotechnical parameters values which are differentiated on a lithological basis (Table 3). To retrieve the geotechnical parameters a devoted campaign has been carried out analyzing 40 survey points covering all the lithologies outcropping. Soil geotechnical parameters were collated from a series of in situ and laboratory tests, including grain size analysis, measurement of Atterberg limits, phase relationship analysis. In situ tests carried out included the Borehole Shear Test (Lutenegger and Halberg, 1981;Dapporto et al., 2000;Casagli et al., 2005), matric suction measurements with a tensiometer, and a constant head permeameter  (Tofani et al., 2008). test using an Amoozemeter (Amoozegar, 1989). The parameters used as inputs to the HIRESS model are reported in Table 3.

HIRESSS testing and preliminary results
HIRESSS performance was evaluated in the three test areas simulating historical events to estimate the reliability and the speed of the code. The objective of the developed simulator is the spatial and temporal localization of rainfall triggered shallow landslides, with a computational time compatible with a real time warning system for civil protection purposes. Clearly, good results are a compromise of the three factors of evaluation: an extremely precise spatial and temporal location is useful only if it is accomplished with a sufficient lead time. Moreover, since the model is applied at the basin or regional scale, the simulator has to obtain good results from input data that are affected by large variability and uncertainty. The validation of the results is not simple because it is very difficult to find an area where the landslides are both spatially and temporally correctly located. Landslides can be accurately detected and mapped by remote sensing techniques or field surveys, but these activities can be very time consuming and sometimes they represent the main objective of a work (Tofani et al., 2013). The time of occurrence, in turn, is very rarely known with hourly precision, and usually landslides are just reported to be related to a rainstorm, without any more precise timing.
For this reason the HIRESSS performance is tested in three areas depending on available data. A test to evaluate its spatial predictive capability was performed in the Armea basin, where an accurate inventory of landslides triggered by a single rainstorm was available. A preliminary test to evaluate the correct temporal forecasting of an unstable event was performed in the test area of the island of Ischia, where the triggering time of four shallow landslides was known with precision. The third area, the largest, was mainly used to test the code runtime performance and scalability over a regional area.

Valle Armea test results
We simulated the event which occurred on 6 December 2006 in the Armea basin area. The rainfall was measured by the Monte Settepani radar: radar measurements are very high quality data compared to the satellite or pluviometric data. In Tables 1 and 5 the data used for the simulation are summarized.
The test is a 24 h forecasting simulation, with spatial resolution of 5 m and temporal resolution of 1 h (Fig. 3). The results were analyzed using an aggregation by basin method.
The aggregation by basin is a reasonable method that consists of aggregating the results and then comparing basin to basin. The basins are defined by regrouping the cells that belong to the upslope contributing area of a common flow node, with a stream order of 1 (first level of aggregation) or 2 (second level of aggregation), according to Horton (1945). We consider the nodes that have a contribution area of more than 3000 pixel (first level of aggregation): the basins individuated allow a useful spatial information of slope stability because the average area dimension of the basins is around 0.9 km 2 . The method is reasonable because it provides a stability evaluation of a small area that will be quite homogeneous; moreover, the physical model does not take into account the parameters that can be determinant to trigger a landslide in a particular point and not in another only few meters away that has the same apparent characteristics. Moreover, shallow landslides are very dangerous when they group and flow in the channels, so it can be useful to understand the stability situation of an entire micro-basin. The test was performed also with a higher aggregation area, defined considering the nodes that regroup more than a 6000 pixel contribution (second level of aggregation). A calibration would be needed to find the threshold levels of warning but in this work this type of study is not performed because our target is only limited to a preliminary test of the model and to verify the speed of the code. However, reasonable thresholds were defined with an expert judgement to better evaluate and understand the results.
The aggregate validation, as said, is conceptually reasonable: an area is considered unstable if more than 1 % of the surface is over the 80 % of instability probability. The results are summarized in Table 6.
In the contingency table four classes are presented: -True positive: the unstable area correctly localized by the simulation (hit).
-False negative: the unstable area not localized by the simulation (miss).
-True negative: the area correctly defined stable by the simulation (correct rejection).
-False positive: the stable area that is defined unstable by the simulation (false alarm).
It is possible to observe the absence of false positive. In the Fig. 4 it is possible to see the graphical representations of the contingency Table 6.
In Table 6 the validation results of the second level of aggregation basin are summarized. The results are graphically represented in Fig. 5. The true positive percentage increases without affecting the correct stable areas detection and keeping an acceptable grade of spatial information.

Monte Vezzi test results
The temporal validation test is performed by simulating a deadly event on 30 April 2006 on the island of Ischia. Four debris flows (Fig. 6d) occurred along the northern flank of Monte Vezzi between 06:00 and 08:30 LT, in the locality of Piano Liguori, triggered by heavy rainfall. These landslides involved two buildings, a quarry and a garbage compactor and four persons were killed in their home. This meteorological event did not exceed the alert rainfall thresholds of the area, because rainfalls only marginally interested the monitoring rain gauge (Fig. 5a). The simulation was performed over two days, 29 and 30 April, at the spatial resolution of 5 m and at the temporal resolution of 30 min. The validation is focused in the area where landslides occurred since elsewhere no landslide inventory is available. The average of the failure probability is calculated over the Monte Vezzi failed slope from the single 30 min full resolution results map (Fig. 5b).
In Fig. 5c, the temporal evolution of the average failure probability resulting from HIRESSS simulation is compared with the rainfall path and the timing of landslide triggering. The simulated instability reaches its peak when the landslides actually occurred. Observing the rainfall intensity measured by the Grazzanise radar (Fig. 5c), we can observe that there is a preparatory phase that starts to undermine the slope stability before the final intense precipitation. The simulated instability is in full accordance with this behavior, showing that even a complex pluviometric path can be correctly taken into account by HIRESSS.

Runtime performance
The tests over the Armea basin and Ischia Island showed a promising reliability but the code running time is also very important: for civil protection purposes we have to obtain the results in a reasonable amount of time before an event occurs. When very large areas (e.g. thousands of square kilometres) are taken into account at high spatial and temporal resolution, the speed of the code is of paramount importance.
The development and testing of HIRESSS involved a desktop workstation and a supercomputer.
In Fig. 7, a general comparison between serial and parallel execution of HIRESSS code is presented. Disabling the Monte Carlo simulation, HIRESSS shows a slight increase in performance from a parallelized execution. This is an index of a good optimization and a rationalized structure of the code if analyzed together with the graph on the right in   The running times of the Armea area simulation, performed with the SP6 supercomputer, are plotted in the graph in Fig. 8 in relation to CPUs used. The simulation involves 24 h at spatial and temporal resolutions, respectively, of 5 m and 1 h. They are the same settings used in the spatial validation. The time needed to complete the simulation varies from 12 h using 1 CPU to 13 min using 512 CPUs. It is possible to observe that even using only 8 CPUs the time is compatible with civil protection purposes because in only one hour we have the "forecasting" of the slope stability situation for the next 23 h. Raising the number of CPUs used, the runtime decreases with a high efficiency (Figs. 8 and 9) up until 32 processors: the runtime is almost halved every time the number of CPUs is doubled. Between 32 and 64 processors the efficiency decreases but stays at good level, afterwards the improvements are not worth the resources employed.
The runtime results show that resources over 32/64 CPUs can be used to improve the quality of simulation. For example, it is possible to compute the results at more layers of the soil depth or increase the number of shoots of the Monte Carlo simulation if we have very uncertain input data. Clearly the analyzed area can be simply extended which is the objective of this work. For a regional scale test, we ran HIRESSS over the entire provinces of Lucca, Pistoia and Prato. The area is approximately 3000 km 2 wide, investigated with the spatial resolution of 10 m and a hourly temporal step. In this area, the available data regarding an event at the end of December 2009 that triggered a huge amount of landslides. Unfortunately, the spatial distribution of the slope failure probability is only partially in accordance with the timing and location of triggered landslides. The main reason is that landslides were triggered by a sudden snowmelt, which is not modeled in HIRESSS. However, the test was useful to verify the speed of the code when applied over large areas.
The runtime speed test shows that HIRESSS on a supercomputer is fast enough to be useful for civil protection purposes, managing a highly extended area with very high spatial and temporal resolution. Moreover, HIRESSS can be employed in relatively smaller areas using more affordable workstation computers. In Table 7 the computational time results for the main simulations performed are shown.

Conclusions
The objective of this work concerned the development of a physically based distributed slope stability simulator to anlyze shallow landslide triggering conditions in real time and on large areas, called HIRESSS (High REsolution Slope Stability Simulator).
HIRESSS has these main characteristics: -High temporal and spatial resolutions physically based analysis.
-Operating on a large area and at large scale.
-Fast computational time. . Speedup and efficiency are the two common parameters to evaluate a code parallelization. In parallel computing, speedup refers to how much a parallel algorithm is faster than a corresponding sequential algorithm.
The physical model proposed is composed of two parts: hydrological and geotechnical. The hydrological model receives the rainfall data as dynamical input and provides the pressure head as perturbation to the geotechnical stability model that provides results in factor of safety (FS) terms. The physical model is inserted into a Monte Carlo simulation, to overcome the exact computation problems. This technique is introduced to manage the typical geotechnical parameters' uncertainty, which is the common weak point of the deterministic models and large area management. The main characteristics of the physical model implemented in HIRESSS are -the Richards equation based hydrological model; -the modeling of hydrological diffusivity; -the modeling of soil suction effects in the stability model; -soil mass depending on saturation conditions; and -the Monte Carlo simulation to overcome the input parameters' uncertainty problems.
The high resolution analysis over a large area and the Monte Carlo simulation implementation required the use of advanced techniques of parallel programming and the computational power of high performance computing machines.
The software was written in C++ language and the parallelization used the Open-MPI paradigm. HIRESSS runtime was reduced using a high parallelization of the computation and a use of parallel data management architecture, typical of supercomputers. The HIRESSS software code main characteristics are -portability over different operating systems and hardware architectures; -parallelized code; -developed to run in HPC (High Performance Computing) hardware and supercomputers; and -can be compiled and used in modern workstations.
HIRESSS was tested on a multiprocessor workstation and on the SP6 supercomputer hosted in the Cineca HPC facilities and preliminary tests in three areas were performed to evaluate the spatial and temporal results' reliability and the computational speed. The preliminary test results confirmed a good reliability while managing very uncertain input data. The model parameters were deliberately extracted from literature measurements, geological and lithological charts combined with a small amount of geotechnical measurements to prove the ability of HIRESSS to work with data affordably collectible even on a large area.
The level of reliability of the model does not imply huge computational time, indeed HIRESSS proved fast enough to be the core of a real time landslide forecasting system for civil protection purposes.
Acknowledgements. The authors would like to thank CINECA for use of their the supercomputer facilities and for their support in the developing time. Many thanks to Jennifer Milardo for the English text revision.
The work described in this paper was in part supported by the project SafeLand "Living with landslide risk in Europe: Assessment, effects of global change, and risk management strategies" under Grant Agreement No. 226479 in the 7th Framework Programme of the European Commission. This support is gratefully acknowledged.
Edited by: M. Jaboyedoff Reviewed by: two anonymous referees