Construction of an integrated social vulnerability index in urban areas prone to flash flooding

Among the natural hazards, flash flooding is the leading cause of weather-related deaths. Flood risk management (FRM) in this context requires a comprehensive assessment of the social risk component. In this regard, integrated social vulnerability (ISV) can incorporate spatial distribution and contribution and the combined effect of exposure, sensitivity and resilience to total vulnerability, although these components are often disregarded. ISV is defined by the demographic and socio-economic characteristics that condition a population’s capacity to cope with, resist and recover from risk and can be expressed as the integrated social vulnerability index (ISVI). This study describes a methodological approach towards constructing the ISVI in urban areas prone to flash flooding in Castilla y León (Castile and León, northern central Spain, 94 223 km2, 2 478 376 inhabitants). A hierarchical segmentation analysis (HSA) was performed prior to the principal components analysis (PCA), which helped to overcome the sample size limitation inherent in PCA. ISVI was obtained from weighting vulnerability factors based on the tolerance statistic. In addition, latent class cluster analysis (LCCA) was carried out to identify spatial patterns of vulnerability within the study area. Our results show that the ISVI has high spatial variability. Moreover, the source of vulnerability in each urban area cluster can be identified from LCCA. These findings make it possible to design tailor-made strategies for FRM, thereby increasing the efficiency of plans and policies and helping to reduce the cost of mitigation measures.


Introduction
Flash floods are highly spatio-temporally localized flood events that usually occur in small steep basins. They are caused by a sudden increase in the stream flow, generally due to spatially concentrated heavy rainfall, and characteristically reach a high peak flow in a short period of time (i.e. usually within a few hours of the onset of rainfall; Borga, 2013; Wilhelmi and Morss, 2013; Bodoque et al., 2015; Terti et al., 2015). Because of their short duration, which limits or even eliminates warning time, flash floods are considered to be a destructive hazard with one of the greatest capacities for generating risk, in terms of both the socio-economic impact and the number of casualties on a global scale (Marchi et al., 2010; Borga, 2013; Terti et al., 2015). According to Barredo (2007), 40 % of flood-related casualties in Europe between 1950 and 2006 were caused by flash floods.
The growth of exposed population, the allocation of economic activities to flood-prone areas and the rise in extraordinary event frequency over the last few decades (Huntington, 2006; Frigerio et al., 2016) have led to an increase in flash-flood-related casualties and economic losses (Llasat et al., 2008; Marchi et al., 2010). This highlights the need to make people aware of risk and prepare them for living with it (Birkmann, 2013). In this regard, the United Nations has put a great deal of effort into promoting awareness of the importance of disaster reduction. This initiative started in 1990 with the International Decade for Natural Disaster Reduction. The experience gained during this decade laid the foundations for the International Strategy for Disaster Reduction. This gave rise to different frameworks, of which the Sendai Framework for Disaster Risk Reduction 2015-2030 is the most topical (UNISDR, 2015). This new approach has enabled flood risk management (FRM) to shift from an engineering-based perspective, which has proved ineffective (Birkmann, 2013; Cutter et al., 2013; Koks et al., 2015), to a disaster-resilient perspective, highlighting the need to build resilient communities through the integration of vulnerability reduction approaches into risk reduction policies (Cutter et al., 2008; Birkmann et al., 2013; UNISDR, 2015).
E. Aroca-Jimenez et al.: Construction of an integrated social vulnerability index
Much effort has been put into flood hazard analysis in the past, but vulnerability assessment is still one of the biggest constraints in flood risk assessment to date (Mechler et al., 2014; Koks et al., 2015). Vulnerability assessment can be defined as the analysis of the characteristics of a person or group and their situation that influence their capacity to anticipate, cope with, resist and recover from the impact of a natural hazard (Birkmann, 2013). A hybrid approach is currently the one most frequently used for analysing vulnerability. This comprises risk-hazard approaches, which consider that vulnerability depends on biophysical risk factors and the potential loss of a particular exposed population (e.g. the hazards-of-place model of vulnerability; Cutter, 1996), and political economic-ecological approaches, which emphasize the political, cultural and socio-economic factors explaining differential exposure, impacts and capacities to recover from an event (e.g. the pressure and release model; Blaikie et al., 1994). Taking into account the key parameters for vulnerability research outlined in the approaches mentioned above, vulnerability depends on the exposure and sensitivity to stress of the social system (i.e. any characteristics that increase vulnerability) as well as its capacity to absorb or cope with the effects of these stressors (i.e. resilience; Adger, 2006; Eakin and Luers, 2006; Birkmann, 2013; Thieken et al., 2014). Numerous papers have analysed the physical vulnerability component (Koks et al., 2014; Ocio et al., 2016). However, the social aspects of vulnerability have often been neglected (Cutter et al., 2003; Hummell et al., 2016), mainly due to the difficulty of quantifying variables that are inherently qualitative (Frazier et al., 2014).
Social vulnerability tries to explain how a certain natural hazard produces an unequal impact on exposed populations (Cutter et al., 2003; Nelson et al., 2015). It can be characterized by any socio-economic and demographic variables that influence a society's preparedness, response and recovery (Birkmann, 2013; Cutter et al., 2013; Terti et al., 2015). Social vulnerability has been assessed in different contexts (e.g. those of global environmental change and natural hazards). However, not many studies have focused exclusively on the context of flood risk (Tapsell et al., 2002; Burton and Cutter, 2008; Fekete, 2010; Mollah, 2016), and only a very few of these relate to flash floods (Balteanu et al., 2015; Karagiorgos et al., 2016). Overall, social vulnerability analyses have assessed vulnerability (Tapsell et al., 2002; Cutter et al., 2003; Nelson et al., 2015) and resilience (Cutter et al., 2008, 2010; Siebeneck et al., 2015) separately. The approach has usually been based on calculating composite indices from socio-demographic and economic characteristics. Reductionist statistical techniques, i.e. factor analysis and principal components analysis (PCA), have generally been used for this purpose (Clark et al., 1998; Cutter et al., 2003; Dwyer et al., 2004; Fuessel, 2007; Grosso et al., 2015; Hummell et al., 2016; Rogelis et al., 2016).
Less attention has been paid to the integrated analysis of vulnerability components (i.e. using the hybrid approach mentioned above). This considers the differential influence of exposure (people and assets susceptible to harm), sensitivity (the extent to which people and assets can be damaged) and resilience (the ability to absorb, cope with and recover from the effects of a disaster) on total vulnerability (Frazier et al., 2014; Pandey and Bardsley, 2015; Weber et al., 2015). The above approach helps to identify which characteristics contribute to increases or decreases in vulnerability and where they are spatially represented. Accordingly, this approach provides a much more holistic assessment of vulnerability, since it models the combined effects of its components (Fuessel, 2007; Frazier et al., 2014). Although exposure and vulnerability are assumed to be two different concepts in the traditional risk framework, the inclusion of exposure as a component for consideration is currently common practice when assessing social vulnerability as it provides a more holistic characterization (Turner et al., 2003; Adger et al., 2004). If exposure is absent it is not possible to talk about the potential for loss, in other words vulnerability (Frazier et al., 2014).
Moreover, it is worth mentioning that flash floods occur in mountainous areas where data availability may be limited. Very few studies take this constraint into account. In some cases, it is addressed by aggregating the variables under consideration in order to obtain the vulnerability index (Balteanu et al., 2015). Determination of the vulnerability index requires vulnerability factor weighting. To this end, equal weights have usually been assigned to all factors (Cutter et al., 2003, 2010), which may not give a realistic result (Frazier et al., 2014). Differential weights according to expert judgments have sometimes been assumed (Zelenakova et al., 2015). This could in itself be a limitation as experts' judgments on the same issue may differ (Asadzadeh et al., 2015). It is also worth mentioning that statistical methods, such as a correlation-based PCA (Mollah, 2016), are increasingly being used (Asadzadeh et al., 2015).
This paper aims to calculate an integrated social vulnerability index (ISVI) for flash floods that considers the vulnerability components (i.e. exposure, sensitivity and resilience) separately, analysing how each of them is involved in the total vulnerability. To address this, a set of variables was statistically analysed, firstly by means of a hierarchical segmentation analysis (HSA) and secondly by performing a PCA. This approach constitutes an alternative methodology to what is typically used in social vulnerability assessment, making it possible to overcome the insufficient availability of data that frequently occurs in mountainous areas. Tolerance statistics were used as a variable weighting method. Latent class cluster analysis (LCCA) was also carried out in order to identify social vulnerability profiles within the study region.

Study area
The methodology proposed here was applied in the region of Castilla y León (Castile and León), which occupies almost all of central northern Spain (Iberian Peninsula; Fig. 1). Castilla y León is the largest region not only in Spain but also in the European Union, with its surface area of 94 230 km² exceeding that of seventeen of the 28 member states, including Portugal, Austria and Belgium. Its relief is mainly composed of a large plateau between 700 and 1100 m above sea level, surrounded by large mountain systems with peaks up to 2600 m high. The climate is a continental variation of the Mediterranean type, with hot dry summers and cold, relatively dry winters. Average annual rainfall ranges from 300 to 600 mm and falls primarily in spring and autumn, although in certain mountainous areas more than 1800 mm is not unusual. Steep slopes, which limit the development of vegetation, and spatially concentrated heavy rainfall in certain mountainous areas favour the triggering of flash floods. With regard to demographics, Castilla y León has a population of nearly 2.5 million, 5.5 % of whom are foreign born. Population densities range from 9 to 65 inhabitants per km², giving a mean population density for the region of 26 inhabitants per km², which is far lower than the mean for Spain of 92 inhabitants per km². The region is divided into 2248 urban areas. Of the total number of urban areas, 94 % have fewer than 2000 inhabitants but account for 26 % of the region's population. It is also worth mentioning that the region has an ageing population, with an ageing index close to 2 (i.e. there are two people aged 65 or over for each person below the age of 15). In urban areas with fewer than 2000 inhabitants, however, the ageing index is higher than 5.

Methodological outline
The first step was to distinguish those urban areas of the study region that were prone to flash flooding (see Sect. 2.2.1). Next, a database of socio-demographic and socio-economic variables was constructed for each of the urban areas studied (see Sect. 2.2.2). It was not possible to perform a PCA directly on the full database to define the ISVI, as is usual in the literature (Fekete, 2010; Frazier et al., 2014; Hummell et al., 2016), since the number of variables initially considered (71) outnumbered the urban areas of interest (39; Sarstedt and Mooi, 2014). Instead, the database was first divided into small subsets of variables (see Sect. 2.2.3) by means of an HSA. This allowed a PCA to be performed on each subset, which in turn enabled extraction of the different vulnerability factors (see Sect. 2.2.4) that go on to make up the ISVI (see Sect. 2.2.5). Vulnerability factors were also used to identify social vulnerability patterns within the study area by means of LCCA (see Sect. 2.2.6; Fig. 2).

Identification of urban areas prone to flash flooding
Flash floods occur in very specific areas. It was therefore necessary to specify the urban areas prone to this type of event. This was done by considering a number of simple but robust requirements. The first of these was to identify those river reaches where the longitudinal slope across a given urban area was higher than 0.01 m m⁻¹ (Bodoque et al., 2015).
A digital terrain model with a cell size of 200 m was used to apply this criterion. It was provided by the Spanish National Geographic Institute (IGN; layer generated in 2013) and was used as input data for the Geospatial Hydrologic Modeling Extension (HEC-GeoHMS 10.0; USACE, 2013), from which the longitudinal slopes of the river were calculated. Secondly, we examined urban environments defined by the basin water authorities as Areas with Potential Significant Flood Risk (APSFRs; layer generated in 2015; Caballero et al., 2011) and flood hazard zones with low or exceptional probability (i.e. 500-year flood; layer generated in 2016), taking into account the river reaches selected in the previous step. These were provided by the Spanish Ministry of Agriculture and Fisheries, Food and Environment (MAPAMA). The low probability scenario was chosen because it is the most comprehensive representation of urban areas that could be affected by flash floods on a regional scale.
The 500-year flood zones were obtained from preliminary flood risk assessments by competent water, coastal and civil protection authorities, as stated in Directive 2007/60/CE on flood risk assessment and management. The aforementioned flooded areas were then compared with the river reaches selected according to the first criterion in order to identify the urban areas of interest, which resulted in a total of 39.
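The two screening criteria described above can be sketched as a simple filter. This is a minimal illustration, not the GIS workflow used in the study: the function name, the dictionary of slopes and the set of flood-zone areas are hypothetical, and only the 0.01 m m⁻¹ threshold comes from the text.

```python
def select_prone_areas(reach_slopes, flood_zone_areas, slope_threshold=0.01):
    """Return urban areas that satisfy both screening criteria.

    reach_slopes: dict mapping urban area name -> longitudinal slope (m/m)
    flood_zone_areas: names of urban areas lying in an APSFR or a 500-year flood zone
    """
    # Criterion 1: river reach steeper than the threshold
    steep = {name for name, slope in reach_slopes.items() if slope > slope_threshold}
    # Criterion 2: the urban area is also mapped as potentially flooded
    return sorted(steep & set(flood_zone_areas))


# Hypothetical input values, for illustration only
slopes = {"A": 0.025, "B": 0.004, "C": 0.012, "D": 0.030}
flooded = {"A", "B", "D"}
prone = select_prone_areas(slopes, flooded)
print(prone)
```

Area B fails the slope criterion and area C fails the flood-zone criterion, so only A and D remain.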

Database generation
Based on existing literature (Cutter et al., 2003; Frazier et al., 2014; Hummell et al., 2016), a set of 71 variables was initially characterized for each of the 39 urban areas identified above. A total of 42 socio-demographic and socio-economic variables were extracted from state, regional or local databases (e.g. population, education, buildings). However, another 29 variables were requested from certain public organizations and councils. This information was obtained from phone calls, from direct personal contact with a person in charge (e.g. dependency, development and infrastructures) or through personal research in which the variables were estimated from other non-specific sources of information (e.g. collective vulnerability, healthcare services). Collective vulnerability encompasses vulnerability aspects that are related to the community as a potentially sensitive unit. Variables were then normalized to a percentage or per capita (Hummell et al., 2016). Redundant variables were removed after conducting a correlation test (Cutter et al., 2003). Specifically, those variables with a correlation coefficient above 0.9 were not considered. As a result, 16 variables were eliminated from the database and the methodological approach was continued with the other 55 variables (Table 1). These variables were classified into eight thematic information blocks: (i) population (11 variables related to demography), (ii) dependency (4 variables linked to elderly people), (iii) education (2 variables associated with the level of educational attainment), (iv) employment situation (5 variables related to unemployment status), (v) healthcare services (8 variables linked to medical system characteristics), (vi) development and infrastructures (10 variables associated with the economic potential of the region and its facilities), (vii) buildings (13 variables related to construction features) and (viii) collective vulnerability (2 variables linked to the availability of infrastructures for evacuating population).
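The redundancy filter can be illustrated with a short sketch. The text only states that variables with a correlation coefficient above 0.9 were not considered. Two choices here are assumptions made for illustration: the absolute value of Pearson's r is used, and the variable encountered first in each redundant pair is kept. The variable names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def drop_redundant(variables, threshold=0.9):
    """Keep a variable only if |r| with every variable already kept is <= threshold."""
    kept = {}
    for name, values in variables.items():
        if all(abs(pearson(values, other)) <= threshold for other in kept.values()):
            kept[name] = values
    return list(kept)


# Hypothetical normalized variables for five urban areas
data = {
    "pct_over_65": [30.0, 42.0, 25.0, 38.0, 51.0],
    "pct_over_80": [10.1, 14.2, 8.0, 12.9, 17.5],  # nearly proportional to pct_over_65
    "unemployment_rate": [12.0, 9.0, 15.0, 8.0, 11.0],
}
kept = drop_redundant(data)
print(kept)
```

Because the filter is greedy, the result depends on the order of the variables. A study would normally decide which member of a redundant pair to keep on substantive grounds.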

Exploring the dimensions of social vulnerability
The variables under consideration were grouped by means of the HSA, using the SPSS (IBM-SPSS v.19) statistical software. This is a multivariate statistical technique for automatic data classification that divides an initial set of variables into different groups. This division is based on minimizing the distance between variables in the same group and maximizing the distance between variables in different groups (Sarstedt and Mooi, 2014). The greater the distance between variables, the less similar they are. The division of variables into groups followed a hierarchical process in which initially as many groups as variables were considered. Subsequently, successive iterations of the hierarchical algorithm enabled variables to converge in larger groups. Once the variables were standardized (Cutter et al., 2003), the squared Euclidean distance was used as a measure of similarity, i.e. the sum of the squared differences between variable values. In addition, Ward's method was used as a grouping method. This agglomerative hierarchical algorithm seeks the least possible variability within each group (i.e. the minimum variance) and has been shown to be one of the most effective (Pérez, 2004), especially when the sample size is small (Martín et al., 2015). The number of groups was determined by taking into account both the distance at which groups were differentiated in the graphical output of the HSA (i.e. the dendrogram) and the consistency and homogeneity of the numbers of variables contained in them.
Finally, distinguishing variables into groups made it possible to conduct a principal components analysis in each of them.
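The two ingredients of the grouping step, the squared Euclidean distance and Ward's minimum-variance criterion, can be written out as follows. This is a sketch of the standard definitions rather than of the SPSS implementation. Each variable is represented as a vector of standardized values across urban areas, and the function names and numbers are illustrative.

```python
def squared_euclidean(u, v):
    """Sum of squared differences between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def centroid(group):
    """Component-wise mean of the vectors in a group."""
    n = len(group)
    return [sum(col) / n for col in zip(*group)]


def ward_merge_cost(group_a, group_b):
    """Increase in within-group sum of squares caused by merging two groups."""
    na, nb = len(group_a), len(group_b)
    return na * nb / (na + nb) * squared_euclidean(centroid(group_a), centroid(group_b))


# Illustrative groups of variables (each inner list is one variable)
group_a = [[0.0, 1.0], [0.0, 3.0]]
group_b = [[4.0, 2.0]]
cost = ward_merge_cost(group_a, group_b)
print(squared_euclidean(group_a[0], group_b[0]), cost)
```

At each iteration Ward's method merges the pair of groups with the smallest such cost, which is what keeps the within-group variance as low as possible.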

Identification of vulnerability factors
SPSS (IBM-SPSS v.19) was used to implement the principal components analysis in each group differentiated by the HSA. This aimed to reduce the number of variables to a smaller set of latent variables, which are not directly observable; these so-called principal components or factors are linear combinations of the original variables (Sarstedt and Mooi, 2014). The Kaiser-Meyer-Olkin (KMO) statistic and Bartlett's test of sphericity were estimated in order to evaluate the suitability of performing PCA on the variables under consideration. For each group, all the variables were initially examined using the factor extraction process. Variables with low communality (values below 0.5) were subsequently removed, and the factor extraction process was repeated until all variables had communality values above 0.5. Communality indicates how much of the variance of each variable can be reproduced by means of factor extraction. In cases where a group presented more than one factor, these were separated and a PCA was performed on each one of them individually. The correlations between factors and variables were represented by factor loadings, which enabled each factor to be named. Finally, factor scores were obtained using the regression method, which is the most frequently used (Sarstedt and Mooi, 2014). Factor scores embody a linear combination of the primary variables. Thus, each urban area was characterized by as many factor scores as social vulnerability factors identified.

Construction of the integrated social vulnerability index (ISVI)
Factor scores for each vulnerability factor were saved as new attributes in the data file. This allowed them to be used for index construction. Factor scores were standardized and took positive or negative values depending on whether the characteristic described by a certain factor in a given urban area was above or below average (Sarstedt and Mooi, 2014). In the ISVI, factors that express sensitivity or exposure are traditionally considered as positive values and those that express resilience are negative (Cutter et al., 2013; Frazier et al., 2014; Hummell et al., 2016). In order to maintain this criterion, the signs of some factor scores were reversed (i.e. multiplied by −1).
The ISVI for each urban area was calculated according to the following equation (adapted from Frazier et al., 2014):

$$\mathrm{ISVI} = E + S - R, \quad (1)$$

where ISVI is the integrated social vulnerability index, E is exposure, S is sensitivity and R is resilience. Each vulnerability component was estimated using Eq. (2) (adapted from Frazier et al., 2014):

$$VC = \sum_{f=1}^{n} w_f S_f, \quad (2)$$

where VC is the vulnerability component (exposure, sensitivity or resilience), $w_f$ is the weight allocated to the fth factor, $S_f$ represents the factor scores of the fth factor and n is the number of factors contributing to the component. The value of a specific vulnerability component was the sum of the factor scores multiplied by their respective weights. The index construction method used was based on the one developed by Frazier et al. (2014), although the tolerance statistic was used here as a weighting method instead of the amount of explained variance. Tolerance is a statistic used to detect multicollinearity (Sarstedt and Mooi, 2014). It reaches a maximum value of 1 when one factor has no degree of multicollinearity with the other factors and a minimum value of 0 when one factor is a perfect linear combination of the others. Thus, vulnerability factors expressing less redundant information would have more weight in the ISVI.
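The index construction can be sketched numerically as follows. Two points are assumptions rather than statements from the text: tolerance is computed with its usual definition (1 minus the R² obtained when a factor is regressed on the remaining factors), and resilience is taken to enter the index with a negative sign, in line with the convention that resilience lowers vulnerability. All weights and scores are hypothetical.

```python
def tolerance(r_squared):
    """Tolerance of a factor: 1 minus R^2 from regressing it on the other factors."""
    return 1.0 - r_squared


def component_value(weights, scores):
    """Weighted sum of factor scores for one vulnerability component."""
    return sum(w * s for w, s in zip(weights, scores))


def isvi(exposure, sensitivity, resilience):
    # Assumption: resilience reduces vulnerability, so it is subtracted
    return exposure + sensitivity - resilience


# Hypothetical factor scores and R^2 values for one urban area
e = component_value([tolerance(0.2), tolerance(0.5)], [0.5, -0.2])  # two exposure factors
s = component_value([tolerance(0.1)], [1.0])                        # one sensitivity factor
r = component_value([tolerance(0.5), tolerance(0.0)], [0.4, 0.1])   # two resilience factors
index = isvi(e, s, r)
print(round(index, 3))
```

A factor that is fully explained by the others (R² = 1) receives a weight of 0 and drops out of the index, which matches the stated intention of giving more weight to less redundant factors.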

Identification of social vulnerability patterns
LCCA is a model-based clustering approach employed (using Latent Gold® 4.5) for the purpose of identifying social vulnerability patterns within the study area. Urban areas of interest were classified into clusters (Vermunt and Magidson, 2002), and the sources of vulnerability for each cluster were shown by the statistical model.

Results

Integrated social vulnerability index (ISVI)
The dendrogram shows the five groups of variables that were differentiated by the HSA (Fig. 3). Groups were homogeneous in both the number of variables (each comprising between 10 and 13 variables) and the type of variables included. The first group contained variables mainly related to large facilities. The second group of variables was connected to types of construction and the region's economic potential.
The third group was related to demographic characteristics and the employment situation in the region. The fourth group of variables was primarily associated with the elderly population. Finally, the fifth group did not show a clear dominance of any variables over others, although certain variables displayed significant correlation with at least one variable (i.e. p < 0.05). A total of 11 vulnerability factors were extracted from the groups of variables identified by the dendrogram (Table 2): (1) total social exposure, (2) exposure in the urban built-up environment, (3) constructive resilience, (4) constructive exposure, (5) youth social sensitivity, (6) mature social resilience, (7) labour social sensitivity, (8) social sensitivity due to dependency, (9) economic resilience due to investments, (10) social hospital sensitivity and (11) social health sensitivity. In all cases, the KMO scores were higher than 0.5 and the Bartlett's test of sphericity values were significant (i.e. p < 0.05). For vulnerability factors comprising two variables, the value of the correlation coefficient was indicated instead of the KMO score. Correlation coefficients for these vulnerability factors were considered significant (i.e. p < 0.05). In addition, all identified vulnerability factors showed a percentage of explained variance over 70 %. Factor loadings were used to allocate names to the vulnerability factors. This enabled vulnerability factors expressing exposure, sensitivity or resilience to be classified. Consequently, factors number one, two and four were considered to express exposure; factors number five, seven, eight, ten and eleven expressed sensitivity; and factors number three, six and nine expressed resilience.
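For the factors built from only two variables, the reported correlation coefficient and the explained variance are directly linked. The short derivation below assumes that both variables are standardized and that the PCA is run on their correlation matrix; the symbol r (their Pearson correlation) and the eigenvalues λ₁, λ₂ are notation introduced here for illustration.

```latex
\mathbf{R} = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
\lambda_{1} = 1 + |r|, \quad \lambda_{2} = 1 - |r|,
\qquad
\frac{\lambda_{1}}{\lambda_{1} + \lambda_{2}} = \frac{1 + |r|}{2} > 0.70
\iff |r| > 0.40 .
```

Under these assumptions, a two-variable factor explaining more than 70 % of the variance corresponds to an absolute correlation above 0.40 between its two variables.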
Figure 5 illustrates the ISVI value for each urban area using the quintiles classification. In this regard, ISVI has high spatial variability, defining values that range from 0.085 to −0.055. Urban areas with the highest ISVI values are mainly concentrated in the northwest, while urban areas with the lowest values are found in the east and northeast of Castilla y León. Each urban area has an associated bar chart showing the decomposition of each ISVI value into its components (exposure, sensitivity and resilience). The direction of the bar indicates whether the sign of the vulnerability component is negative or positive. The height of the bar shows the value of the vulnerability component (each vulnerability component was calculated by combining any vulnerability factors that contributed to each component, taking factor scores and different weights into consideration). In addition, the numbers located in each bar show categories based on the classification of the quintiles in which each vulnerability component is found. Number 1 is associated with a very low category (i.e. very low exposure, sensitivity and resilience) and number 5 with a very high category (i.e. very high exposure, sensitivity and resilience).
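A quintile classification of this kind can be sketched with a rank-based rule. The exact break rule of the mapping software is not given in the text, so the rank-based assignment below is an assumption, and the input values are illustrative rather than the study's data.

```python
def quintile_classes(values):
    """Assign each value a class from 1 (lowest fifth) to 5 (highest fifth) by rank."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # indices from lowest to highest
    classes = [0] * n
    for rank, idx in enumerate(order):
        classes[idx] = rank * 5 // n + 1
    return classes


# Illustrative index values for ten urban areas
values = [0.085, -0.055, 0.013, -0.017, 0.0, 0.04, -0.03, 0.02, 0.06, -0.01]
classes = quintile_classes(values)
print(classes)
```

With ten values each class holds exactly two urban areas. With 39 urban areas the classes cannot all be equal in size, so the rule for ties and remainders matters.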

Social vulnerability patterns
BIC and CAIC statistics were used to select the most parsimonious number of clusters (i.e. the number of clusters that provides as much information as possible when the number of estimated parameters is taken into account). The minimum values of BIC and CAIC determined that the optimum number of clusters of urban areas was three (Table 3).
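The selection rule can be illustrated with the usual log-likelihood-based definitions of both criteria. The formulas are the standard ones. The log-likelihoods and parameter counts in the example are hypothetical and are not the values of Table 3; only the sample size of 39 urban areas comes from the text.

```python
from math import log


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 LL + k ln(N)."""
    return -2.0 * log_likelihood + n_params * log(n_obs)


def caic(log_likelihood, n_params, n_obs):
    """Consistent AIC: -2 LL + k (ln(N) + 1)."""
    return -2.0 * log_likelihood + n_params * (log(n_obs) + 1.0)


def best_model(candidates, n_obs, criterion=bic):
    """candidates: dict mapping number of clusters -> (log-likelihood, number of parameters)."""
    return min(candidates, key=lambda k: criterion(candidates[k][0], candidates[k][1], n_obs))


# Hypothetical fit statistics for models with one to four clusters
fits = {1: (-620.0, 22), 2: (-560.0, 34), 3: (-510.0, 46), 4: (-500.0, 58)}
chosen_bic = best_model(fits, n_obs=39)
chosen_caic = best_model(fits, n_obs=39, criterion=caic)
print(chosen_bic, chosen_caic)
```

Both criteria penalize each extra parameter, CAIC slightly more than BIC, so a fourth cluster is only preferred if it improves the log-likelihood enough to offset its additional parameters.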
In order to assess vulnerability factor usefulness in terms of identifying these patterns, the parameters for each identified cluster are shown in Table 4. Neither the "Social Hospital Sensitivity" factor nor the "Economic Resilience due to Investments" factor was statistically significant when discriminating among the three clusters of urban areas (p > 0.05, highlighted in bold in Table 4). The percentage given under each cluster title shows the proportion of urban areas making up each cluster.
Finally, Fig. 6 shows the three different clusters of identified urban areas, which help to characterize the profile of each detected pattern. Moreover, each cluster is associated with a bar chart depicting the cluster profile over the most representative urban area in each of them. This is calculated from the number of coincidences among the signs and the minimum distances between factor scores for each urban area and the mean factor scores for each identified cluster. The bar charts include the standard deviation values from the mean for each vulnerability factor (values that are located above each bar). The direction of the bar is related to the sign of these standard deviation values; that is, positive values for factors expressing exposure or sensitivity indicate more exposure or sensitivity than the cluster mean, while positive values for factors expressing resilience indicate more resilience than the cluster mean. Each cluster can be characterized as follows:
- Cluster 1 comprises 51.1 % of urban areas of interest (i.e. a total of 20) and is made up of urban areas with the highest levels of constructive exposure and labour social sensitivity factors.
- Cluster 2 comprises 30.9 % of urban areas (i.e. a total of 12). It contains urban areas with the highest levels of social health sensitivity and social sensitivity due to dependency factors. On the other hand, with regard to youth social sensitivity and labour social sensitivity factors, these urban areas show the lowest levels. Also included here are urban areas with the lowest levels of total social exposure and exposure in the urban built-up environment. Regarding resilience factors, the lowest levels of factors associated with constructive resilience and mature social resilience are in the urban areas found in cluster 2.
- Cluster 3 comprises 18.0 % of urban areas (i.e. a total of 7). It is made up of urban areas with the highest values in factors related to total social exposure, exposure in the urban built-up environment and youth social sensitivity. These urban areas also show the highest values in the constructive resilience factor and the lowest values in the constructive exposure factor. On the other hand, they show the lowest values in factors related to social sensitivity due to dependency and social health sensitivity. The highest values of the mature social resilience factor are also found in these urban areas.
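The choice of the most representative urban area in a cluster combines sign coincidences and distances to the cluster mean. The text does not say how the two are combined, so the sketch below makes an explicit assumption: the area with the most matching signs is chosen, and ties are broken by the smallest squared distance to the cluster mean. All names and scores are hypothetical.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def most_representative(areas, cluster_mean):
    """areas: dict mapping urban area name -> list of factor scores."""
    def key(name):
        scores = areas[name]
        matches = sum(sign(s) == sign(m) for s, m in zip(scores, cluster_mean))
        distance = sum((s - m) ** 2 for s, m in zip(scores, cluster_mean))
        # More sign matches first, then smaller distance
        return (-matches, distance)
    return min(areas, key=key)


# Hypothetical mean factor scores of a cluster and scores of three member areas
cluster_mean = [0.5, -0.3, 0.8]
areas = {
    "X": [0.6, -0.1, 0.2],
    "Y": [0.4, 0.2, 0.9],
    "Z": [0.5, -0.4, 1.5],
}
representative = most_representative(areas, cluster_mean)
print(representative)
```

Areas X and Z both match the sign of every mean score, and X lies closer to the mean, so it is selected under this rule.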
It seems that there is a relationship between the ISVI and the clusters to which urban areas belong (Fig. 6). In this regard, significant differences exist only between the ISVI values of clusters 1 and 2 (i.e. p < 0.05; ANOVA). Moreover, cluster 1 urban areas are more vulnerable than cluster 2 urban areas, with mean ISVI values of 0.013 and −0.017, respectively.

Data sources and methodology
Flash floods usually affect small mountainous urban areas (Marchi et al., 2010; Terti et al., 2015). Generally, the information available in these areas is limited, either because it is not available in public databases (i.e. it has to be requested from different councils) or because it is not generated on this work scale (i.e. it is estimated from a bigger work scale). This imposes limits on any assessment related to flash floods (Ruin et al., 2009). However, this constraint does not usually apply to studies on fluvial floods since, in terms of population, these frequently affect significant urban areas, which generally means greater availability of data and a larger number of event records. It is also worth mentioning that, in this type of analysis related to flash floods, it is very difficult to avoid including data from different years, since different databases are usually consulted and each public agency has its own mechanisms for updating data. This lack of information may condition the selected work scale, which should coincide with the scale of planning for flood risk mitigation (Cash and Moser, 2000). An insufficient work scale could result in homogeneous vulnerability reduction measures being put in place in areas where the spatial variability of vulnerability is high. This would reduce their effectiveness and might not guarantee a uniform reduction in vulnerability (Eakin and Luers, 2006; Frazier et al., 2014). In this study, the selected work scale was the urban area, as this entity tends to be small and homogeneous in the region of Castilla y León. Furthermore, sensitivity and resilience are usually considered as static components (i.e. the results give a snapshot of vulnerability) when in fact they vary over time and space (Cutter et al., 2003; Eakin and Luers, 2006). The identification of spatial patterns here represents a step forward towards improving FRM on a regional scale. Regarding temporal variability, we suggest periodic monitoring of the variables identified as explaining social vulnerability to flash floods. Periodic recalculations would allow urban areas to stay informed about the behaviour of SVI values over time.
As regards calculation of the ISVI, it is crucial that ISVI values should not be considered as absolute. This means that the ISVI can be used qualitatively to determine whether one urban area is more vulnerable than the others and, if so, to what extent (Cutter et al., 2013). In the methodology proposed here, conducting a preliminary HSA helps overcome the limitations of the PCA sample size (Sarstedt and Mooi, 2014). Most published works either do not discuss this aspect or tackle it by adding the variables directly (Balteanu et al., 2015). The HSA enables the vulnerability variables to be divided into groups. However, it did not provide information on the relative significance of variables within each group, making it necessary to subsequently perform a PCA (Cutter et al., 2003, 2013; Fekete, 2009; Nelson et al., 2015; Hummell et al., 2016). Regarding the weighting method used here, although many authors support the idea of assigning the factors equal weight (Chakraborty et al., 2005), it seems reasonable to suppose that not all factors have the same importance in the construction of the ISVI (Brooks et al., 2005; Eakin and Luers, 2006; Liu and Li, 2016), especially when there may be variations in the number of variables forming each factor and their explained variance. It is even possible that there is a spatial variation in each factor's importance. This can be solved by carrying out a geographically weighted principal component analysis (Frazier et al., 2014; Gollini et al., 2015).

Integrated social vulnerability and variables involved
Despite differences among the variables used in the literature to explain social vulnerability, there are some key variables common to all the indicators examined, such as age, gender, race, socio-economic status and living conditions (Cutter et al., 2003; Adger et al., 2004; Penning-Rowsell et al., 2005; Frazier et al., 2014). However, each region has its own particular characteristics and constraints, and these should be taken into consideration during the variable selection procedure (Frazier et al., 2014). Vulnerability factors identified in Castilla y León (see Table 2) reflect the specific characteristics of this region, and their cartographic representation gives us an idea of the spatial distribution of vulnerability and helps us to spatially identify vulnerability hotspots (see Fig. 4). Vulnerability factors making up the exposure component (see Fig. 4a, b and d) are mainly related to public buildings such as schools, kindergartens and health facilities ("Total Social Exposure" factor). They are usually occupied by sensitive people (e.g. small children, the elderly, the sick), who generally require external assistance during an evacuation due to flash floods. Moreover, the single-family dwellings that abound in the study area tend to have basements and ground floor rooms (i.e. living rooms, kitchens and sometimes bedrooms; "Exposure in the Urban Built-up Environment" and "Constructive Exposure" factors), spaces which are both prone to flooding (Bodoque et al., 2016b; Karagiorgos et al., 2016). With regard to the vulnerability factors that make up the sensitivity component (see Fig. 4e, g, h, j and k), urban areas of interest have a mean dependency rate higher than 70 % ("Social Sensitivity due to Dependency" factor). This is due particularly to the presence of elderly people, who may hinder the population evacuation process as they tend to have reduced mobility. Moreover, the elderly usually need economic support during the post-disaster period (Cutter et al., 2003). Unemployment is another vulnerability factor to be considered ("Youth Social Sensitivity" and "Labour Social Sensitivity" factors). It is related to the possible inability of a household to invest economic resources in flood insurance or in flood mitigation measures, all of which contributes to a slower recovery (Cutter et al., 2003; Fekete, 2010). As far as accessibility to health facilities is concerned ("Social Hospital Sensitivity" and "Social Health Sensitivity" factors), the frequent lack of nearby medical services in the urban areas studied may hamper the provision of immediate relief and extend disaster recovery time (Cutter et al., 2003).
Finally, with regard to the resilience component (see Fig. 4c, f and i), the structural capacity of households in good condition to cope with flood impacts was considered to be high ("Constructive Resilience" factor), so direct losses and repair costs would be lower (Cutter et al., 2003). Inhabitants aged 15 to 64 were also deemed to be a resilience factor ("Mature Social Resilience" factor), since they are able to help evacuate people during a flash flood event (Fekete, 2010). Lastly, urban areas with a higher public budget available per capita may implement a larger number of mitigation measures aimed at reducing flood damage ("Economic Resilience due to Investments" factor). Fixed investments per capita are related to the level of economic wealth, and this can determine the ability to absorb losses and enhance resilience (e.g. through the implementation of individual flood risk mitigation measures; Kunreuther et al., 2013; Haer et al., 2016).
The integrated social vulnerability assessment analyses interactions among the different vulnerability components and even between these and the ISVI (see Fig. 5). In addition, there is great heterogeneity in the combination of vulnerability components that generate the different ISVI categories. Despite this, the most vulnerable urban areas have the highest exposure component values. Urban areas in the high ISVI category usually have higher values for the sensitivity component than for exposure, although exposure quintile categories range from 2 to 5. Urban areas included in very low and low ISVI categories have the highest resilience component values, coinciding with the lowest levels of exposure. Thus, the highest ISVI values are mainly controlled by the exposure component.
These variations in ISVI values confirm the idea supported by other authors that vulnerability has a high spatial variability and therefore cannot be treated homogeneously (Cutter et al., 2008; Frazier et al., 2014). Integrated social vulnerability assessments not only help identify which factors should be acted upon to reduce vulnerability, but also which of these factors should be strengthened to increase resilience. In the same way, the identification of vulnerability patterns (see Fig. 6) also helps us discern the sources of vulnerability and resilience within each cluster of urban areas, in particular whether these influences are direct or inverse and how strong they are. This facilitates the development of specific FRM strategies for each cluster. The optimum number of clusters can be established from BIC and CAIC criteria (in this case 3 clusters). From a practical point of view, the above means that an increase in the number of clusters from 3 to 4 or 5 would split a fairly homogeneous cluster of urban areas into several subgroups which would not be very different from each other. Therefore, a greater level of disaggregation would not help improve the implementation of different flood risk mitigation measures for each cluster of urban areas.

Policy implications
The high human and economic losses due to flash floods that continue today (Wilhelmi and Morss, 2013) draw attention to the need for a change in traditional FRM towards an integrated approach requiring comprehensive analysis of the social risk component (Koks et al., 2015). It is therefore essential to carry out a social vulnerability analysis from a holistic point of view. It is important not only to identify which socio-economic and demographic characteristics increase population sensitivity to flash flood damage but also to know which features increase a population's capacity to resist, cope with and recover from its impact (Cutter et al., 2010; Frazier et al., 2014; Zhou et al., 2015), as demonstrated here. This would enable local competent authorities to plan and implement specific strategies to reduce vulnerability and strengthen resilience, in addition to developing specific mitigation measures to reduce flood risks (Frazier et al., 2014; Nelson et al., 2015; Hummell et al., 2016). It is an approach that goes further than the traditional one of seeking to reduce flood hazard by delineating flood-prone areas and designing structural mitigation measures.
The identification of social vulnerability patterns can help to identify the most suitable mitigation measures for each cluster of urban areas identified by LCCA and also prioritize available resources. For instance, mitigation measures for those urban areas included in cluster 1 should be targeted towards improving physical resilience (e.g. raising the first-floor elevation above ground level) and giving the population financial help to put mitigation measures in place (e.g. providing financial aid for dwellings located in flood-prone areas). On the other hand, people living in the urban areas included in cluster 2 are highly dependent on external assistance due to the high rates of ageing population. Therefore, different evacuation routes should be designed and clearly defined by the emergency services and shelters constructed near these urban areas. Finally, mitigation measures for urban areas included in cluster 3 should be aimed at collective facilities (e.g. carrying out flood emergency drills) and should encourage the implementation of individual mitigation strategies (e.g. through a financial incentive system, such as repayment of part of the money spent on municipal taxes).
However, in order to achieve greater effectiveness for FRM plans, it is necessary for all stakeholders, both public authorities and communities, to engage with them (Eakin and Luers, 2006; Koks et al., 2015; Haer et al., 2016). This is especially important in small mountainous areas prone to flash flooding because they are managed by local administrations where available economic resources tend to be limited. This makes individual adaptation measures particularly relevant as they partly depend on risk perception and the level of awareness (Bodoque et al., 2016a). Furthermore, both individual social networks and social contexts are of key importance in decision making related to public preparedness (Haer et al., 2016). Since the social component plays a decisive role, a suitable design is required for flood risk communication strategies to accompany integrated social vulnerability analysis. Traditional top-down communication strategies have proven ineffective, and a change towards people-centred strategies is currently taking place, which seeks to reflect population heterogeneity (Bodoque et al., 2016a; Haer et al., 2016). Therefore, a comprehensive characterization of the social component of flood risk requires not only an integrated social vulnerability assessment, but also that the people affected are aware of their situation and have the appropriate knowledge to reduce possible flood impacts at the individual level (Albano et al., 2015), so that social learning can be translated into disaster risk reduction (Cutter et al., 2008).

Conclusions
A comprehensive characterization of social vulnerability is critical for an integrated FRM. The implementation of an HSA helps to overcome the PCA sample size limitation. This provides an alternative to the methodology usually used to construct an ISVI in areas where available data are limited. The results show the high spatial heterogeneity of social vulnerability within the study region and the high variability in ISVI scores regarding interactions between vulnerability components, which gives integrated analysis greater importance. The identification of vulnerability patterns through LCCA reveals the sources of vulnerability in each urban area. This simplifies the spatial heterogeneity analysis of social vulnerability and indicates which aspects need to be improved to decrease sensitivity and exposure and which aspects need to be reinforced to increase resilience. This allows the ISVI results to be more effectively integrated into FRM plans and policies, which in turn enables specific strategies of vulnerability reduction to be proposed, thereby increasing their efficiency.
Data availability. The data on population (population by age, education, employment situation) and housing (characteristics and types) can be downloaded from http://www.ine.es. Unemployment rates can be downloaded from http://www.sepe.es, and data on long-term unemployed people are made available upon request to ecyl.empleo@jcyl.es. Information related to health centers is available at http://regcess.msssi.es. The data on hospital beds and tourist accommodation can be downloaded from http://www.jcyl.es/sie. Information related to education infrastructures is available at https://www.educacion.gob.es/centros. The data on retirement homes can be downloaded from http://www.dependencia.imserso.es. The data on municipal debts, fixed investments and municipal budgets can be downloaded from http://www.minhafp.gob.es. Information related to per capita income was generated from data requested through http://www.ief.es. The data on buildings can be downloaded from https://www.sedecatastro.gob.es. The entire database is available upon request to Estefania Aroca-Jimenez.
Author contributions. EAJ built the database and wrote the manuscript with contributions from all the co-authors. JMB conceived the research and critically reviewed the manuscript. JAG designed the statistical approach, which was implemented by EAJ and JAG. ADH critically reviewed the manuscript and designed Fig. 2.
Competing interests. The authors declare that they have no conflict of interest.

Figure 2. Methodological outline containing the different steps followed in the construction of the ISVI and the social vulnerability patterns.

Figure 3. Dendrogram resulting from the HSA. Each rectangle corresponds to an identified group, with a total of five groups (G1, G2, G3, G4 and G5).
The classification was made by creating a latent categorical variable. This measures the probability of belonging to a certain cluster according to the characteristics of the vulnerability factors. Factor scores were used as indicators for identifying the different clusters. A z-standardization of factor scores was implemented before they were entered into the statistical software. Five models integrating from one (sample homogeneity) to five clusters (sample heterogeneity with 5 patterns) were examined. Two information criteria based on the model log-likelihood, BIC (Bayesian information criterion) and CAIC (consistent Akaike information criterion), were used as model selection tools in order to choose the optimum model, based on the minimum values of these two criteria (Morin et al., 2011).
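The selection step described above can be sketched as follows, using the standard definitions BIC = −2 ln L + k ln n and CAIC = −2 ln L + k (ln n + 1), where ln L is the model log-likelihood, k the number of estimated parameters and n the sample size. The function names are hypothetical, the use of the sample standard deviation in the z-standardization is an assumption, and the log-likelihoods, parameter counts and sample size in the example are invented placeholders, not the values reported in Table 3.

```python
import math

def z_standardize(values):
    """Centre on the mean and scale by the sample standard deviation (assumed)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

def caic(log_likelihood, n_params, n_obs):
    """Consistent Akaike information criterion."""
    return -2.0 * log_likelihood + n_params * (math.log(n_obs) + 1.0)

def select_model(fits, n_obs):
    """Return the cluster counts minimising BIC and CAIC.

    fits maps a number of clusters to (log_likelihood, n_params).
    """
    scores = {k: (bic(ll, p, n_obs), caic(ll, p, n_obs))
              for k, (ll, p) in fits.items()}
    best_bic = min(scores, key=lambda k: scores[k][0])
    best_caic = min(scores, key=lambda k: scores[k][1])
    return scores, best_bic, best_caic

# Placeholder fits for models with one to five clusters (illustrative only).
example_fits = {
    1: (-1500.0, 22),
    2: (-1400.0, 45),
    3: (-1320.0, 68),
    4: (-1300.0, 91),
    5: (-1290.0, 114),
}
scores, best_bic, best_caic = select_model(example_fits, n_obs=200)
```

Because both criteria add a penalty that grows with the number of parameters, a model with more clusters is preferred only when its gain in log-likelihood outweighs that penalty; CAIC penalises each parameter slightly more than BIC.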

Figure 4. Factor scores for identified vulnerability factors. For exposure and sensitivity factors, very high categories are coloured in red, while, for resilience factors, very high categories are coloured in blue. Source: municipal boundaries (available online: http://www.ign.es; accessed on 5 December 2016).

Figure 6. Characteristics of the urban areas that form the identified clusters. Bars with a meshed plot represent factors which were not statistically significant in the discrimination of clusters of urban areas. Bars are sorted by vulnerability component (exposure, sensitivity and resilience). Source: municipal boundaries (available online: http://www.ign.es; accessed on 5 December 2016).

Table 4. Parameters of urban area clusters associated with vulnerability factors. Vulnerability factors are sorted by vulnerability component (exposure, sensitivity and resilience). Factors highlighted in bold were not statistically significant when discriminating among the three clusters of urban areas (p > 0.05).

Table 1. Set of variables used in the exploratory analysis of social vulnerability dimensions.

Table 2. Vulnerability factors identified with the PCA and additional statistical information (PCA results). The sign of the variable loadings indicates whether the correlation among variables making up a certain vulnerability factor is positive or negative.
a KMO: Kaiser-Meyer-Olkin statistic (vulnerability factors with more than two variables). b Correlation coefficient: bilateral correlation (vulnerability factors with two variables).

Table 3. Model fit summary of the latent class cluster models initially considered.
* Best model according to BIC and CAIC.