Chalk aquifers are an important source of drinking water in the UK. Due to their properties, they are particularly vulnerable to groundwater-related hazards like floods and droughts. Understanding and predicting groundwater levels is therefore important for effective and safe water management. Chalk is known for its high porosity and, due to its solubility, is exposed to karstification and strong subsurface heterogeneity. To cope with the karstic heterogeneity and limited data availability, specialised modelling approaches are required that balance model complexity and data availability. In this study, we present a novel approach to evaluate simulated groundwater level frequencies derived from a semi-distributed karst model that represents subsurface heterogeneity by distribution functions. Simulated groundwater storages are transferred into groundwater levels using evidence from different observation wells. Using a percentile approach we can assess the number of days exceeding or falling below selected groundwater level percentiles. Firstly, we evaluate the performance of the model when simulating groundwater level time series using a split sample test and parameter identifiability analysis. Secondly, we apply a split sample test to the simulated groundwater level percentiles to explore the performance in predicting groundwater level exceedances. We show that the model provides robust simulations of discharge and groundwater levels at three observation wells at a test site in a chalk-dominated catchment in south-western England. The second split sample test also indicates that the percentile approach is able to reliably predict groundwater level exceedances across all considered timescales up to their 75th percentile. However, when looking at the 90th percentile, it only provides acceptable predictions for long time periods and it fails when the 95th percentile of groundwater exceedance levels is considered.
By modifying the historic forcings of our model according to expected future climate changes, we create simple climate scenarios and we show that the projected climate changes may lead to generally lower groundwater levels and a reduction of exceedances of high groundwater level percentiles.
The English Chalk aquifer region extends over large parts of south-western England and is an important water resource aquifer, providing about 55 % of all groundwater-abstracted drinking water in the UK (Lloyd, 1993). As a carbonate rock the English Chalk is exposed to karstification, i.e. chemical weathering (Ford and Williams, 2007), resulting in particular surface and subsurface features such as dolines, river sinks, caves and conduits (Goldscheider and Drew, 2007; Gutiérrez et al., 2014; Stevanovic, 2015). Consequently, karstification also produces strong hydrological subsurface heterogeneity (Bakalowicz, 2005). The interplay between diffuse and concentrated infiltration and recharge processes, as well as fast flow through karstic conduits and diffuse matrix flow, results in complex flow and storage dynamics (Hartmann et al., 2014a). Even though Chalk has a tendency for less intense karstification, for instance compared to limestone, its karstic behaviour has increasingly been recognised (Maurice et al., 2006, 2012; Fitzpatrick, 2011).
Apart from the high water quality and the favourable infiltration and storage dynamics that make chalk aquifers a preferred source of drinking water in the UK, their karstic behaviour also increases the risk of fast drainage of their storages by karstic conduit flow during dry years. It also increases the risk of groundwater flooding as a result of fast responses of groundwater levels to intense rainfall due to fast infiltration and groundwater recharge processes. Groundwater flooding, i.e. when groundwater levels emerge at the ground surface due to intense rainfall (Macdonald et al., 2008), tends to be more severe in areas of permeable outcrop like the English Chalk (Macdonald et al., 2012), as also experienced repeatedly in other karst areas in Europe (Parise, 2003, 2010; Bonacci et al., 2006; Jourde et al., 2007; Gutiérrez, 2010; Naughton et al., 2012; Parise et al., 2015). Groundwater drought indices tend to be more related to recharge conditions in Cretaceous Chalk aquifers than in granular aquifers (Bloomfield and Marchant, 2013). Due to the fast transfer of water from the soil surface to the main groundwater system, chalk aquifers tend to be more sensitive to external changes, as for instance shown by Jackson et al. (2015), who found significant groundwater level declines in 4 out of 7 chalk boreholes in a UK-wide study using historic groundwater level observations.
Climate projections suggest that the UK will experience increasing temperatures, with less rainfall during the summer but warmer and wetter winters (Jenkins et al., 2008). This may stress these groundwater resources and increase the risk of groundwater droughts and potentially winter groundwater flooding. For those reasons, assessment of potential future changes in groundwater dynamics, concerning groundwater droughts, median groundwater levels as well as groundwater flooding is broadly recommended and is the subject of current research around the world (Naughton et al., 2012, 2015; Jackson et al., 2015; von Freyberg et al., 2015; Jimenez-Martinez et al., 2016; Moutahir et al., 2017; Perrone and Jasechko, 2017).
However, present approaches mostly rely on statistical distribution functions to express groundwater dynamics and groundwater level exceedance probabilities (e.g. Bloomfield et al., 2015; Kumar et al., 2016), and it is questionable whether the shapes of these distribution functions remain the same when climate or land use changes. Physics-based hydrological simulation models that incorporate hydrological processes in relatively high detail can be considered to potentially provide the most reliable predictions, especially under a changing environment. However, there are considerable limitations in obtaining the necessary information to estimate the structure and the model parameters, especially for subsurface processes, and this inevitably increases modelling uncertainties (Perrin et al., 2003; Beven, 2006).
The definition of appropriate model structures and parameters from limited information becomes problematic when modelling karst aquifers. In order to achieve acceptable simulation performance they have to include representations of karstic heterogeneity in their structures. Distributed karst modelling approaches are able to simulate groundwater levels on a spatial grid but their data requirements mostly limit them to theoretical studies (e.g. Birk et al., 2006; Reimann et al., 2011) or well explored study sites (e.g. Hill et al., 2010; Jackson et al., 2011; Oehlmann et al., 2015). Lumped karst modelling approaches consider physical processes on the scale of the entire karst system. Although they are strongly simplified, they can include karst peculiarities such as different conduit and matrix systems (Maloszewski et al., 2002; Geyer et al., 2008; Fleury et al., 2009). Since they are easy to implement and do not require spatial information, they are widely used in karst modelling (Jukić and Denić-Jukić, 2009). Simple rainfall–run-off models with more than 5–6 parameters are often assumed to end up in equifinality (Wheater et al., 1986; Jakeman and Hornberger, 1993; Ye et al., 1997); i.e. their parameters lose their identifiability (Wagener et al., 2002; Beven, 2006). For that reason, recent research took advantage of auxiliary data, such as water quality data or tracer experiments (Hartmann et al., 2013b; Oehlmann et al., 2015). These studies showed that adding such information allows the necessary model parameters to be identified, therefore enabling the model to reflect the relevant processes.
Up to now, most lumped karst models have been applied for rainfall–run-off simulations. Groundwater levels were simulated in quite a few studies (Adams et al., 2010; Ladouche et al., 2014; Jimenez-Martinez et al., 2016), but these mostly relied on very simple representations of karst hydrological processes and disregarded the scale discrepancy between the borehole (point scale) and the modelling domain (catchment scale) at which they were applied.
In this study, we present a novel approach to predict and evaluate groundwater level frequencies in chalk-dominated catchments. It uses a previously developed semi-distributed process-based model (VarKarst, Hartmann et al., 2013b) that we further developed to simulate groundwater levels. To assess groundwater level frequencies we formulated a groundwater level percentile-based approach that quantifies the probability of exceeding or falling below selected groundwater levels. We exemplify and evaluate our new approach on a Chalk catchment in south-western England that had to cope with several flooding events in the past. Finally, we apply the approach to simple climate scenarios that we create by modifying our historic model forcings to show how changes in evapotranspiration and precipitation can affect groundwater level frequencies.
Location map with an overview of the Frome catchment.
Located in West Dorset in the south-west of England the river Frome drains
a rural catchment with an area approximately 414
In order to consider karstic process behaviour in our simulations we use the process-based karst model VarKarst introduced by Hartmann et al. (2013b). VarKarst includes the karstic heterogeneity and the complex behaviour of karst processes using distribution functions that represent the variability of soil, epikarst and groundwater and was applied successfully at different karst regions over Europe (Hartmann et al., 2013a, 2014b, 2016). We use a simple linear relationship that takes into account effective porosities and base level of the groundwater wells (see Eq. 1) enabling the model to simulate groundwater levels based on the groundwater storage in VarKarst. Finally, a newly developed evaluation approach is used by transferring simulated groundwater level time series into groundwater level frequency distributions and comparing them to observed behaviour at a number of monitored wells.
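The linear storage-to-level conversion described above can be illustrated with a minimal sketch. It assumes that storage is given as a water column in millimetres, that the effective porosity is a dimensionless fraction and that the base level is given in metres; the function name, argument names and units are illustrative assumptions and may differ from the formulation in Eq. (1).

```python
def storage_to_level(storage_mm, effective_porosity, base_level_m):
    """Convert a groundwater storage (mm of water) into a groundwater level (m).

    Assumed linear form: level = base level + storage / effective porosity.
    All names and units are illustrative, not taken from the VarKarst code.
    """
    if not 0.0 < effective_porosity <= 1.0:
        raise ValueError("effective porosity must be in (0, 1]")
    # A water column of a given height fills (height / porosity) of aquifer.
    saturated_thickness_m = (storage_mm / 1000.0) / effective_porosity
    return base_level_m + saturated_thickness_m


if __name__ == "__main__":
    # Hypothetical values: 500 mm storage, 1 % effective porosity, 60 m base level.
    print(storage_to_level(500.0, 0.01, 60.0))
```

In such a formulation, a smaller effective porosity translates the same change in storage into a larger change in groundwater level, which is why the porosity strongly controls the simulated amplitude at each well.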
The VarKarst model operates on a daily time step. Similarly to other
karst models, it distinguishes between three subroutines representing
the soil system, the epikarst system and the groundwater system but it
also includes their spatial variability, which is expressed by
distribution functions that are applied to a set of
The VarKarst model structure.
All available data used in the study.
The model was driven by two input time series (precipitation and
potential evapotranspiration, PET), and the 13 variable model
parameters (see Table
The related parameters are
Model parameters, descriptions, ranges and optimised values.
The daily discharge data for gauge East Stoke were obtained from the Centre
for Ecology & Hydrology (CEH,
We use the shuffled complex evolution method (SCEM) for our
calibration, which is based on the Metropolis–Hastings algorithm
(Metropolis et al., 1953; Hastings, 1970) and the shuffled complex
evolution algorithm (SCE, Duan et al., 1992). The Metropolis–Hastings
algorithm uses a formal likelihood measure and calculates the ratio of
the posterior probability densities of a “candidate” parameter set
that is drawn from a proposal distribution and a given parameter
set. If this ratio is larger than or equal to a number randomly drawn
from a uniform distribution between 0 and 1, the candidate
parameter set is accepted. This procedure is repeated for a large
number of iterations. If the proposal distribution is properly chosen,
the Markov chain will rapidly explore the parameter space and it will
converge to the target distribution of interest (Vrugt et al.,
2003). In the SCEM algorithm, candidate parameter sets are drawn
from a self-adapting proposal distribution for each of a predefined
number of clusters. Again a random number [0, 1] is used to accept or
discard candidate parameter sets. The SCEM algorithm was applied
in default mode using the Gelman–Rubin convergence criteria (Vrugt
et al., 2003). In our study, we use the Kling–Gupta efficiency (KGE;
Gupta et al., 2009) as the objective function, which can be regarded as an
informal likelihood measure or more generally as a monotonically
increasing performance metric of model skill (Smith et al., 2008). It
was chosen by trial and error, comparing the simulation performances
during calibration and validation obtained with different objective
functions (RMSE and others). We found that we obtain the most robust
results with the KGE. To decide whether to accept or discard
a parameter set, we compare the KGEs of the candidate and the
given parameter sets. This procedure was already applied in various
studies (Engeland et al., 2005; Blasone et al., 2008; McMillan and
Clark, 2009) and is possible if the error functions monotonically
increase with improved performance. We achieved this in the SCEM
algorithm by defining KGE as
The posterior parameter distributions derived from SCEM provide information about the identifiability of the parameters. The more they differ from a uniform posterior distribution the higher the identifiability of a model parameter. We present different calibration distributions to show the use of auxiliary data for parameter identifiability.
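As an illustration of the calibration ingredients described above, the following sketch computes the Kling–Gupta efficiency in its standard form (Gupta et al., 2009) and implements the acceptance rule stated in the text, i.e. a candidate is accepted if the ratio is larger than or equal to a uniform random number. How the ratio is formed from the KGE values is not reproduced here, so the ratio is simply passed in as an argument; the function names are illustrative assumptions.

```python
import math
import random


def kge(sim, obs):
    """Kling-Gupta efficiency: 1 - sqrt((r-1)^2 + (alpha-1)^2 + (beta-1)^2)."""
    n = len(obs)
    if n == 0 or n != len(sim):
        raise ValueError("sim and obs must be non-empty and of equal length")
    mean_s, mean_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    if sd_s == 0.0 or sd_o == 0.0 or mean_o == 0.0:
        raise ValueError("KGE is undefined for constant series or zero observed mean")
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)   # linear correlation
    alpha = sd_s / sd_o       # variability ratio
    beta = mean_s / mean_o    # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def metropolis_accept(ratio, rng=random):
    """Accept the candidate if the ratio is >= a uniform random number in [0, 1)."""
    return ratio >= rng.random()


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]
    print(kge(observed, observed))           # a perfect simulation
    print(metropolis_accept(1.0))            # ratios >= 1 are always accepted
```

A ratio of one or more (candidate at least as good as the current set) is always accepted, while worse candidates are accepted only occasionally, which allows the chain to leave local optima.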
Parameter ranges were chosen following previous experience with the
VarKarst model (Hartmann et al., 2013a, b, 2014b, 2016). Besides the
quantitative measure of efficiency, a split sample test (Klemeš,
1986) was carried out. Our data covered precipitation,
evapotranspiration, discharge and groundwater levels from 2000 to the
end of 2012. We calibrated the model for the period 2008–2012 and used
the period 2003–2007 for validation. We chose this reversed order to
be able to include the information on three boreholes that was only
available for 2008–2012. Three years were used as warm-up for
each of calibration and validation. During calibration, the most
appropriate of the
Schematic description of the percentile approach.
This procedure was repeated for each well and each Monte Carlo run and finally provided the three model compartment numbers that produce the best simulations of groundwater levels at the three observation wells and the best catchment discharge according to our selected weighting scheme. During calibration, we used a weighting scheme that was found by trial and error, as we added borehole data stepwise to our discharge observations. Discharge and the borehole at Ashton Farm were each weighted one-third, as Ashton Farm is located in the lower part of the catchment, while the other two boreholes are located at higher elevation at the catchment's edge and were weighted one-sixth each. In order to explore the contribution of the different observed discharge and groundwater time series during the calibration, we use SCEM to derive the posterior parameter distributions using (1) the final weighting scheme, (2) only discharge, (3) only Ashton Farm, and (4) only the other two boreholes (equally weighted). Posterior parameter distributions are plotted as cumulative distributions. The more parameters that show sensitivity, the more information is contained in the selected calibration scheme.
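A minimal sketch of such a weighting scheme is given below. It assumes that the individual performance values are combined as a weighted sum, which is one plausible reading of the scheme; the dictionary keys and the example scores are hypothetical.

```python
# Weights as described in the text: one-third each for discharge and
# Ashton Farm, one-sixth each for the two remaining boreholes.
# The key names are illustrative labels, not identifiers from the study.
WEIGHTS = {
    "discharge": 1.0 / 3.0,
    "ashton_farm": 1.0 / 3.0,
    "borehole_2": 1.0 / 6.0,
    "borehole_3": 1.0 / 6.0,
}


def weighted_objective(scores, weights=WEIGHTS):
    """Combine per-series performance values (e.g. KGE) into one weighted value."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to one")
    missing = set(weights) - set(scores)
    if missing:
        raise KeyError(f"missing scores for: {sorted(missing)}")
    return sum(weights[name] * scores[name] for name in weights)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {"discharge": 0.7, "ashton_farm": 0.8,
               "borehole_2": 0.6, "borehole_3": 0.9}
    print(weighted_objective(example))
```

The single-source calibrations (2) to (4) correspond to setting the weights of the unused series to zero and rescaling the remaining weights to sum to one.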
Even though the VarKarst model includes spatial variability of system
properties by its distribution functions, its semi-distributed
structure does not allow for an explicit consideration of the
locations of groundwater wells. Its model structure allowed for an
acceptable and stable simulation of groundwater level time series of
the three wells (see Sect. 4.1), but for groundwater management,
frequency distributions of groundwater levels calculated over the
timescale of interest are commonly preferred. For that reason we
introduced a groundwater level percentile-based approach. Other than
Westerberg et al. (2016), who transferred discharge time series into
signatures derived from flow duration curves, we calibrate directly
with the discharge and groundwater time series in order to evaluate
the performance of our approach for selected time periods (see
evaluation below). Similarly to the calculation of standardised
precipitation or groundwater indices (e.g. Lloyd-Hughes and Saunders,
2002; Bloomfield and Marchant, 2013), we create cumulative frequency
distributions of observed groundwater levels and the simulated
groundwater levels from the previously evaluated model. Now, the
exceedance probability or percentile for a selected observed
groundwater level (for instance, the groundwater level above which
groundwater flooding can be expected) can be used to define the
corresponding simulated groundwater level, and the number of days
exceeding or falling below the chosen groundwater level can directly
be extracted from the frequency distributions (Fig.
Modelled discharge (
As the approach is meant to be applied in combination with climate change
scenarios, we perform an evaluation on multiple timescales and flow
percentiles. We assess the 5th, 10th, 25th, 50th, 75th, 90th and 95th
percentiles on temporal resolutions of years, seasons, months, weeks and
days. The deviation between modelled and observed number of exceedance days
of these different percentiles is quantified by the mean absolute
deviation (MAD) between simulated exceedances (SE) and observed exceedances
(OE):
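The percentile approach and the deviation measure can be sketched as follows. The sketch assumes that the same percentile is evaluated separately on the observed and the simulated series to obtain two thresholds, that exceedance days are counted per period (e.g. per year) against the respective threshold, and that MAD is the average of the absolute differences between simulated and observed counts. Function names and the linear-interpolation percentile are illustrative choices.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])


def exceedance_days(levels, threshold):
    """Count the days on which the groundwater level exceeds the threshold."""
    return sum(1 for level in levels if level > threshold)


def exceedances_per_period(levels, threshold, period_length):
    """Count exceedance days in consecutive periods of period_length days."""
    return [exceedance_days(levels[i:i + period_length], threshold)
            for i in range(0, len(levels), period_length)]


def mean_absolute_deviation(simulated_exceedances, observed_exceedances):
    """Mean absolute deviation between simulated (SE) and observed (OE) counts."""
    pairs = list(zip(simulated_exceedances, observed_exceedances))
    return sum(abs(se - oe) for se, oe in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical daily levels (m), for illustration only.
    observed = [61.0, 62.5, 64.0, 63.0, 61.5, 60.5, 62.0, 65.0]
    simulated = [60.8, 62.0, 64.5, 63.5, 62.0, 60.0, 61.0, 64.0]
    obs_threshold = percentile(observed, 75)    # threshold in observed space
    sim_threshold = percentile(simulated, 75)   # corresponding simulated threshold
    oe = exceedances_per_period(observed, obs_threshold, 4)
    se = exceedances_per_period(simulated, sim_threshold, 4)
    print(mean_absolute_deviation(se, oe))
```

Because each series is compared with its own percentile threshold, a systematic offset between simulated and observed levels does not by itself produce a deviation in the number of exceedance days.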
Cumulative parameter distributions (blue) of all model
parameters; strong deviation from the
Given the model performance assessment above, we use our approach
to assess future changes of groundwater level frequencies at our study
site. We derive projections of future precipitation and potential
evapotranspiration by manipulating our observed “baseline” climate
data. We extract distributional samples of percentage changes of
precipitation and evaporation from the UK probabilistic projections of
climate change over land (UKCP09) for (1) a low-emission scenario and
(2) a high-emission scenario for the time period of 2070–2099. This
enables us to capture, in a pragmatic and computationally efficient
approach, for the two emission scenarios the general range of changes
for the most pertinent variables that we think will most impact
changes to monthly seasonal GW responses. We focus on projected median
delta values for change in mean temperature (
Table 2 shows the optimised parameter values as well as the model
performance. The simulation of the discharge shows KGE values of 0.73
and 0.58 in the calibration and validation periods respectively. The
borehole simulations show high KGE values and only slight
deteriorations in the validation period. The parameters are located
well within their pre-defined ranges. Mean soil storage
Figure
When simulated peak values of groundwater levels are compared to the
observations, we find a rather moderate agreement. Using the
percentile approach we find different thresholds that exceed our
selected groundwater level percentiles. This is elaborated for the
90th percentile of simulated and observed groundwater levels of Ashton
Farm (Fig.
Illustration of the percentile approach. Time series of the observed (grey dots) and modelled (green line) groundwater levels at Ashton Farm. The dotted lines represent the respective 90th percentile.
Deviations of simulated to observed exceedances of different percentiles in the validation period (borehole: Ashton Farm). The left value is the mean absolute deviation MAD (d), the right value is the deviation percentage PAD (%).
Table 3 shows the mean observed and modelled exceedances of all selected thresholds (the 5th, 10th, 25th, 50th, 75th, 90th and 95th percentiles) at all temporal resolutions in the validation period. By comparing the number of days of exceedance, we evaluate our model for different percentiles and timescales. The left value is the mean absolute deviation (MAD) and the right value is the percentage of absolute deviation (PAD). We can see that the higher the percentile, the larger the deviation between observed and modelled exceedances. The same is true for the PAD when moving from lower to higher temporal resolutions. The MAD decreases with higher temporal resolution.
Mean model input (
Model output and (non-)exceedances of percentiles in the reference period and the two scenarios (borehole: Ashton Farm, time period 2070–2099).
The results of applying the two climate projections to the model can
be found at Table
A decrease in the simulation performance in the validation period is
normally to be expected because there is always a tendency to
compensate for structural limitations and observational uncertainties
during the calibration. The low decrease in model performance from
11 % (groundwater prediction at Black House,
A look at the parameter values reveals an adequate reflection of
reality. However,
Additionally, the mean epikarst storage coefficient
Based on the idea of the standardised precipitation or groundwater indices (Lloyd-Hughes and Saunders, 2002; Bloomfield and Marchant, 2013) our percentile approach permits the performance of the model to be improved to reflect observed groundwater level exceedances. It yields acceptable performance for years to days up to the 90th percentile. A reduction of precision with the timescale is obvious but in an acceptable order of magnitude when the validation period is considered. Although deviations are considerable both in the calibration and validation periods, they are stable demonstrating certain robustness but also the limitations of our approach. Although the variable model structure of the VarKarst model was shown to provide more realistic results than commonly used lumped models (Hartmann et al., 2013b) it still simplifies a karst system's natural complexity. This can be seen in the simulated time series at Ashton Farm and Black House, which indicate an overestimation of high levels and an underestimation of low levels. The reason for this behaviour might be due to the modelling assumption of a constant vertical porosity, despite the knowledge that there can be a strongly non-linear relation between chalk transmissivity and depth. Several studies acknowledge that hydraulic conductivity in the Chalk follows a non-linear decreasing trend with depth (Allen et al., 1997; Wheater et al., 2007; Butler et al., 2009). This is mainly attributed to the decrease in fractures because of the increasing overburden and absence of water level fluctuations (Williams et al., 2006; Butler et al., 2012). Hydraulic conductivities in the Chalk can span several orders of magnitude (Butler et al., 2009) and are particularly enhanced at the zone of water table fluctuations (Williams et al., 2006). In addition, cross-flows occurring in the aquifer can lead to complicated system responses in the Chalk (Butler et al., 2009). 
For the sake of a parsimonious model structure, these characteristics were omitted in this study but their future consideration could help to improve the simulations if information about the depth profile of permeability is available. A decrease in performance was also found for standardised indices that use probability distributions instead of a simulation model (Vicente-Serrano et al., 2012; Núñez et al., 2014; Van Lanen et al., 2016). To improve the approach's reliability for higher groundwater level percentiles, a model calibration that is more focussed on the high groundwater level percentiles may be a promising direction. A consideration of the time spans above the 90th percentile will allow for a better simulation quality. This could be further evaluated by using different percentile weighting schemes and stepwise increasing the weight on the target percentile.
We prepared two scenarios by manipulating our input data using probabilistic projections of annual changes in precipitation and potential evaporation at 2070–2099 for a low- and a high-emission scenario. This may neglect some of the changes on climate patterns predicted by climate projections but it is based on local and real meteorological values of the reference period, therefore avoiding problems that arise when historic and climate projection data show pronounced mismatches during their overlapping periods. Our results revealed that both scenarios lead to fewer exceedances over higher percentiles and more non-exceedances of lower percentiles, indicating a higher risk of groundwater drought at our study site. However, one problem that arises from our approach is that we do not consider changes in the seasonal patterns of our input variable, for example the increase in winter precipitation. If this increase was considered, the results would probably yield more exceedances of higher percentiles, as for instance found by Jimenez-Martinez et al. (2016). The purpose of the simple climate scenarios was to provide an application example of the new methodology, which is rather hypothetical considering the large uncertainties of current climate projections. We believe that our nine realisations are sufficient to show that different possible future changes have a non-linear impact on groundwater level frequencies. Although quite simplistic, our results are qualitatively in accordance with previous studies indicating increased occurrence of droughts in the UK (Burke et al., 2010; Prudhomme et al., 2014). The risk of drought occurrences might increase depending on the magnitude of change in evapotranspiration. However, more research and the application of more elaborated scenarios are necessary to completely understand the consequences of the change in groundwater frequency patterns in the UK chalk regions.
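The scenario construction described above, i.e. scaling the observed baseline forcings by projected percentage changes, can be sketched as a simple delta-change calculation. The function name and the example change values are hypothetical and are not the UKCP09 values used in the study.

```python
def apply_delta_change(series, percent_change):
    """Scale a baseline forcing series by a percentage change.

    A uniform factor is applied to every time step, so the seasonal
    pattern of the baseline series is left unchanged.
    """
    factor = 1.0 + percent_change / 100.0
    if factor < 0.0:
        raise ValueError("percentage change below -100 % gives negative forcing")
    return [value * factor for value in series]


if __name__ == "__main__":
    # Hypothetical baseline values (mm per day) and hypothetical changes.
    baseline_precipitation = [10.0, 0.0, 4.0]
    baseline_pet = [1.0, 2.0, 3.0]
    print(apply_delta_change(baseline_precipitation, -10.0))  # 10 % drier
    print(apply_delta_change(baseline_pet, 20.0))             # 20 % more PET
```

Applying one factor to the whole series is what keeps the approach tied to locally observed weather sequences, and it is also the reason why shifts in seasonality, such as wetter winters, are not represented.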
As the VarKarst model is a process-based model that includes the relevant characteristics of karst systems over a range of climatic settings (Hartmann et al., 2013b), our approach can to some extent be used to assess future changes of groundwater level distributions and can also be applied in other regions. This may be an advantage over approaches that use transfer functions (Jimenez-Martinez et al., 2016) or regression models (Adams et al., 2010) for estimating groundwater levels, provided that enough data for model calibration and evaluation are available.
As has been noted by Cobby et al. (2009), predicting the likelihood and depth of groundwater inundations is one of the major challenges for future research on groundwater flooding. Since our approach is lumped, it may provide, after Butler et al. (2012), “a good indication of the likelihood of groundwater flooding, but do[es] not indicate where the flooding will take place”. A spatial determination of the groundwater table such as that in Upton and Jackson (2011) would be possible but only in catchments where the borehole network is extensive. In this respect, the possibility of modelling several boreholes with one single calibration, owing to the compartment structure in VarKarst, might also be an advantage. Butler et al. (2012) noted that the parameterisation of the unsaturated zone is a major difficulty in the Chalk region. Since this study also struggles with the porosity, future work should take a closer look at this subject.
We used an existing process-based lumped karst model to simulate groundwater
levels in a chalk catchment in south-western England. Groundwater levels were
simulated by translating the modelled groundwater storage into groundwater
levels with a simple linear relationship. To evaluate our approach we
analysed the agreement of observed and simulated groundwater level
exceedances for different percentiles. Finally, a simple scenario analysis
was undertaken to investigate the potential future changes of groundwater
level frequencies that affect the risk of groundwater flooding as well as the
risk of groundwater droughts. The model performance for discharge and
the groundwater levels was satisfactory and showed the
general adequacy of the model for simulating groundwater levels in the chalk.
It also revealed shortcomings concerning higher groundwater levels. This was
corroborated by the percentile approach that showed a robust performance up
to the 90th percentile. A scenario analysis using UKCP projections on
expected regional climate changes showed that expected changes may lead to an
increased occurrence of low groundwater levels due to increasing actual
evaporation. Overall, our study shows that semi-distributed process-based
modelling can be a valuable tool for simulating and predicting groundwater
frequencies in Chalk regions where information is too limited for the
application of distributed models. Here, a thorough model evaluation is
essential for obtaining reliable and consistent results. In order to obtain
more reliable results we recommend collecting more data about the
hydrogeological properties of our study site to improve the structure of our
model regarding the porosity and the unsaturated zone. In addition, longer
time series and an adapted calibration approach which, in particular,
emphasises the
The sources of all underlying data are described in Table 1. All precipitation, discharge and potential evaporation data are freely available from the CEH website. Groundwater level observations and climate delta values can be accessed via request at EA and UKCP, respectively.
Within the VarKarst model, the parameter
The recharge from the soil to the epikarst
Parameters, descriptions and equations solved in the VarKarst model.
time step
Similarly to the epikarst compartment, variable groundwater storage
coefficients
The authors declare that they have no conflict of interest.
This publication contains Environment Agency information © Environment Agency and database rights. Thanks to Jens Lange and Sophie Bachmair, University of Freiburg, for their valuable advice. Support for Gemma Coxon, Jim Freer and Nicholas J. K. Howden was provided by NERC MaRIUS: Managing the Risks, Impacts and Uncertainties of droughts and water Scarcity, grant number NE/L010399/1. The article processing charge was funded by the German Research Foundation (DFG) and the University of Freiburg in the funding programme Open Access Publishing. Edited by: Mario Parise Reviewed by: Andrew Long and two anonymous referees