Natural Hazards and Earth System Sciences An interactive open-access journal of the European Geosciences Union
Nat. Hazards Earth Syst. Sci., 17, 357-366, 2017
http://www.nat-hazards-earth-syst-sci.net/17/357/2017/
doi:10.5194/nhess-17-357-2017
© Author(s) 2017. This work is distributed
under the Creative Commons Attribution 3.0 License.
Research article
08 Mar 2017
Efficient bootstrap estimates for tail statistics
Øyvind Breivik1,2 and Ole Johan Aarnes1
1Norwegian Meteorological Institute, Allegaten 70, 5007 Bergen, Norway
2Geophysical Institute, University of Bergen, Bergen, Norway
Abstract. Bootstrap resamples can be used to investigate the tail of empirical distributions as well as return value estimates from the extremal behaviour of the sample. Specifically, the confidence intervals on return value estimates or bounds on in-sample tail statistics can be obtained using bootstrap techniques. However, non-parametric bootstrapping from the entire sample is expensive. It is shown here that it suffices to bootstrap from a small subset consisting of the highest entries in the sequence to make estimates that are essentially identical to bootstraps from the entire sample. Similarly, bootstrap estimates of confidence intervals of threshold return estimates are found to be well approximated by using a subset consisting of the highest entries. This has practical consequences in fields such as meteorology, oceanography and hydrology where return values are calculated from very large gridded model integrations spanning decades at high temporal resolution or from large ensembles of independent and identically distributed model fields. In such cases the computational savings are substantial.
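The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the synthetic exponential sample, and the parameters n, k, r and n_boot are all hypothetical choices made for illustration. The statistic used is the r-th largest value. The sketch assumes that the number of entries a full-size resample draws from the k highest values is binomial with parameters n and k/n, so only those draws need to be simulated.

```python
import numpy as np


def full_bootstrap_rth_largest(sample, r, n_boot, rng):
    """Bootstrap the r-th largest value by resampling the entire sample."""
    n = sample.size
    out = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(sample, size=n, replace=True)
        # Sorted position n - r holds the r-th largest element.
        out[b] = np.partition(resample, n - r)[n - r]
    return out


def subset_bootstrap_rth_largest(sample, r, k, n_boot, rng):
    """Bootstrap the r-th largest value using only the k highest entries.

    Assumption of this sketch: a resample of size n hits the top-k subset
    Binomial(n, k/n) times, so we draw that count and resample only from
    the subset. Resamples with fewer than r hits are left as NaN.
    """
    n = sample.size
    top = np.partition(sample, n - k)[n - k:]  # the k largest entries
    hits = rng.binomial(n, k / n, size=n_boot)
    out = np.full(n_boot, np.nan)
    for b in range(n_boot):
        m = hits[b]
        if m >= r:
            draw = rng.choice(top, size=m, replace=True)
            out[b] = np.partition(draw, m - r)[m - r]
    return out


# Synthetic data standing in for a long model time series (hypothetical).
rng = np.random.default_rng(0)
n, k, r, n_boot = 20_000, 200, 10, 300
sample = rng.exponential(size=n)

full = full_bootstrap_rth_largest(sample, r, n_boot, rng)
subset = subset_bootstrap_rth_largest(sample, r, k, n_boot, rng)

# Compare 95 % bootstrap intervals from the two approaches.
print("full sample:", np.percentile(full, [2.5, 97.5]))
print("top-k only :", np.nanpercentile(subset, [2.5, 97.5]))
```

In this sketch each subset resample draws about k values rather than n, which is where the savings come from. The choice of k relative to r matters: if k is too small, some resamples contain fewer than r entries from the subset and the statistic cannot be evaluated for them.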

Citation: Breivik, Ø. and Aarnes, O. J.: Efficient bootstrap estimates for tail statistics, Nat. Hazards Earth Syst. Sci., 17, 357-366, doi:10.5194/nhess-17-357-2017, 2017.
Short summary
Return values can be estimated from large data sets stemming from numerical models. The question explored here is how much of the original data must be kept in order to compute unbiased return estimates. We find that retaining only a small fraction is usually enough. This offers huge storage and computational savings. We provide a set of examples to demonstrate how this can be done.