Investigation of the characteristics of geoelectric field signals prior to earthquakes using adaptive STFT techniques

Abstract. An earthquake is one of the most destructive natural disasters that can occur, often killing many people and causing large material losses. Hence, the ability to predict earthquakes may reduce the catastrophic effects caused by this phenomenon. The geoelectric field is a feature that can be used to predict earthquakes (EQs) because of significant changes in the amplitude of the signal prior to an earthquake. This paper presents a detailed analysis of geoelectric field signals of earthquakes which occurred in 2008 in Greece. In 2008, 12 earthquakes occurred in Greece. Five of them were recorded with magnitudes greater than Ms = 5R (5R), while seven of them were recorded with magnitudes greater than Ms = 6R (6R). In the analysis, the 1st significant changes of the geoelectric field signal are detected. Then, the signal is segmented and windowed. The adaptive short-time Fourier transform (adaptive STFT) technique is then applied to the windowed signal, and the spectral analysis is performed thereafter. The results show that the 1st significant changes of the geoelectric field prior to an earthquake have a significant amplitude frequency spectrum compared to other conditions, i.e. normal days and the day of the earthquake, which can be used as input parameters for earthquake prediction.


Introduction
Earthquakes and their aftermaths are one of the most destructive natural disasters.One of the major earthquakes that occurred in the world after the Alaskan earthquake, with a magnitude of M w = 9.2, in 1964 is the Northern Sumatra earthquake, with a magnitude of M w = 9.1, on 26 December 2004.The Northern Sumatra earthquake killed over 230 000 people (Hariyadi, 2004).The most recent giant earthquake with a magnitude of M w = 9 occurred on 11 March 2011 in Tohoku, Japan (JMA, 2011;USGS, 2011).This earthquake was followed by a tsunami that killed around 18 000 people (Gavinfarreli, 2011).Considering these catastrophic effects, it is highly important to know well ahead when an earthquake will occur in order to reduce the number of victims and material losses.
There have been a lot of concerted efforts in reducing the catastrophic effects of an earthquake recently (Bhargava et al., 2009).One of the most notable efforts is research into the ability to accurately predict an incoming earthquake far ahead of time.Thus, earthquake prediction involves forecasting the occurrence of an earthquake with a specific magnitude, the time, and region of occurrence.In other words, earthquake prediction refers to the knowledge of prognostic parameters including the epicentre of the earthquake, the time of occurrence, and magnitude of the earthquake (Thanassoulas and Klentos, 2010).
Based on the time prediction frame, earthquake prediction can be divided into three different types -namely long term, intermediate, and short-term prediction (Bhatgava et al., 2009).Long-term prediction is not so accurate and also rarely used for public evacuation.Intermediate prediction, which consists in prediction over a period of years to weeks, is also less used.Short-term predictions involve forecasting the likely occurrence of an earthquake within months, weeks, and days from the time of prediction.Furthermore, it is very important not to mobilize for a long period of time, so as to avoid social panic and chaos.However, the short-term prediction must leave enough time for evacuation.
There are two analyses used to predict earthquakes.The first is earthquake prediction based on the past history of fault movement, which is useful for long-term earthquake prediction.The second is earthquake prediction based on geophysical phenomena that can be observed prior to an earthquake, such as seismological, geodetic, geochemical, hydrological phenomena, or electro fields (Ikeya, 2004).One of the methods that has been applied to short-term earthquake prediction based on the earth's electric field is the VAN method (Uyeda, 1995).
This work implements the feature extraction process by using the 1st significant change in geoelectric field signal prior to an earthquake (EQ).The spectrum analysis of the 1st significant change in the signal prior to an EQ is compared with that of the normal condition (i.e. with no EQ) and that on the day of an EQ.The characteristics of the geoelectric field prior to an EQ are also discussed in this work.STFT with an adaptive sliding window is used as the feature extraction technique for the 1st significant change in the signal.The resulting feature can be used as the input to develop the model for EQ prediction.
This paper is organized as follows.Section 2 describes the data collection for the geoelectric field.Section 3 discusses the proposed feature extraction method, which is based on frequency analysis of signal.The experimental results are presented in Sect. 4. Finally, the conclusion is presented in Sect. 5.

Geoelectric field
Before the onset of a strong EQ, there are a great number of geophysical and geochemical phenomena that occur which are caused by geotectonic stress load changes in the lithosphere, specifically in the seismogenic region (Thanassoulas and Tselentis, 1993;Lazarus, 1996;Thanassoulas, 2008).Continued monitoring of this geoelectric potential has been extensively implemented in Greece.This method, VAN, was proposed by Lazarus (1995) and it is named after the researchers initials.This method is carried out by continuously monitoring the earth's electric potential and the east-west (E-W) and north-south (N-S) polarity gradients.The geoelectric field is registered by a number of electrodes which are in contact with the ground surface at a certain distance to the epicentre area.Several short dipoles with different lengths (50-200 m) in both E-W and N-S directions and a few long dipoles (2-20 km) in the appropriate direction are installed (Uyeda, 1996).
Geoelectric field data used in this work were collected from the database of the earth's electric field (Thanassoulas, 2007).The implementation of these data to predict EQs has been widely used in Greece.There are three monitoring site that are installed in various areas of Greece; these are Athens (ATH), Pyrgos (PYR), and HIO (Hios).(Thanassoulas, 2007).
Based on the measurements of the earth's ground surface electric field, it is assumed that the x and y components of the electric field, i.e.E x and E y , are registered by horizontal dipoles.The z component, which is in the vertical direction, contains the same quality of information as the x and y components.However, due to technical difficulties, the measurement of the z component requires a vertical dipole with 150-200 m depth in the ground.Because of this, the z component has been ignored (C.Thanassoulas, personal communication, 2012).The total magnitude of the electric field measurement is given by x + E 2 y . (1)

Data collection
The database consists of valuable data from three different monitoring sites in Greece; it is available at www.earthquakeprediction.gr(Thanassoulas, 2007).Measurements for this database consist of data which include 1440 data samples per day.The following monitoring sites in Greece are shown in Fig. 1: -PYR monitoring site to ATH monitoring site -216.6 km.
-ATH monitoring site to HIO monitoring site -209.5 km.
-PYR monitoring site to HIO monitoring site -425.6 km.

Proposed feature extraction technique
The proposed feature extraction technique in this work is based on the geoelectric field signal.The objective of this work is to analyse the characteristics of the signal which can be used as feature input to the EQ prediction system.Figure 2 shows the flowchart of the proposed technique.As shown in the flowchart, the raw data must be read properly.The differencing technique is then applied to the amplitude of the signal.This step is performed to observe the change in the signal prior to an EQ.Furthermore, the peak detection technique is used to find the 1st significant change in the signal.The short-time Fourier transform (STFT) technique is applied to investigate the characteristics of the spectrum of the 1st significant change in the geoelectric field.The classification of an EQ-prone area, based on the location and the  observation of the threshold value in the area, is made by this technique.

Reading of geoelectric field data
This stage involves the process of reading and interpreting data extracted from the geoelectric field database (Thanassoulas, 2007).The raw data of the geoelectric field signal consists of two pairs of polarities, i.e. the E-W and N-S polarities.These two pairs of polarities need to be combined.The dataset consists of data from 3 different monitoring sites.Each dataset consists of 1440 samples with 5 columns, which can be described as follows: -1st column: time in hh:mm format  1, which provides the E-W polarity.The 4th column registers the data from N-S polarity.The 3rd and 5th columns are the output of an analogue band-pass filter.These data consist of information from electric field measurements which have proven to have little change (C.Thanassoulas, personal communication, 2012).The total amplitude of geoelectric field can be found by using the following expression: An example of the data from two polarities E-W and N-S are plotted in Fig. 3.The 1st row (a) contains data from E-W polarity.The second row (b) contains data from N-S polarity.This sample of data was taken on 9 December 2007 prior to the 6.6R EQ on 6 January 2008.

1st difference in the geoelectric field (GFdiff) signal
The purpose of implementing the 1st difference method for the signal is to observe the change in the signal over a certain period of time.E [n] denotes the value of the amplitude of Gfdiff at discrete time [n], and . The 1st difference of signal, also called geoelectric field signal different, is given by where E is the resulting GFdiff.An example of the GFdiff is shown in Fig. 4.

Peak detection algorithm
The idea behind the peak detection algorithm is to find the 1st maximum amplitude of GFdiff prior to an EQ.This 1st maximum amplitude of GFdiff is called the signal's peak.
The time when the peak of GFdiff occurs is referred to as the peak time.As illustrated in Fig. 4, the 1st significant change in Gfdiff with the amplitude of 493.5 [mV min −1 ] is found at the peak time of 260 min.If there is more than one peak with the same amplitude, the highest peak found would be selected.The algorithm for the peak detection of GFdiff is shown in Fig. 5.The maximum amplitude is detected by segmenting GFdiff by day.The maximum amplitude of the segmented signal is compared with those of others within 1 week of segmented signals.The maximum amplitude among the segmented signals is assigned as the 1st significant change in GFdiff.Once the 1st significant change in the Gfdiff is detected in one of the monitoring sites, the same procedure would be applied to the rest data from other monitoring sites.This procedure begins at the point where the 1st significant change from the previous monitoring sites is detected.

Adaptive short-time Fourier transform (STFT)
The short-time Fourier transform (STFT) is a fast Fourier transform (FFT) technique, which is applied to a small segment of data (Salivahanan et al., 2000).The data are divided into segments with the same length.The segmented data is then multiplied by a window function.amplitude in the segmented data become the centre of the window to ensure that the important information is included in the window.This type of windowing technique is called an adaptive sliding window (shown in Fig. 6).Finally, the FFT technique is then applied to the windowed data.
In order to determine the best length of the segment and the most suitable window type, an experiment is performed on the GFdiff signal recorded at three different monitoring sites.The selection of the segment length and window function is performed to achieve the highest accuracy.In this work, the measurement accuracy refers to the condition where the geoelectric field data segment containing the peak time coincides with the spectral segment containing the maximum spectral amplitude.In the attempt to get the best length of segment that gives the best accuracy, 6 different segment lengths are used and compared.The segment lengths are 180 (which is equal to a sampling period of 3 h and dividing the data into  8 segments), 240 (equal to a sampling period of 4 h and dividing the data into 6 segments), 288 (equal to a sampling period of 4.8 h and dividing the data into 5 segments), 360 (equal to a sampling period of 6 h and dividing the data into 4 segments), 480 (equal to a sampling period of 8 h and dividing the data into 3 segments), and 720 (equal to a sampling period of 12 h and dividing the data into 2 segments).The segmented data is then multiplied by the Hamming window function since it provides a good degree of accuracygreater than 70 % -and less noisy data (Dactron, 2003).In order to obtain a faster but accurate prediction, the length of segment 180 is selected for the three different monitoring sites.Onto the selected data segment, we apply 8 different window functions -namely the Hamming, Hanning, Blackman, Bartlett, Flattop, Tukey, Kaiser and rectangular window functions.The Hamming window is selected for the three monitoring sites since it has high accuracy in detecting the highest amplitude based on the 1st significant difference in GFdiff.

Classification of EQ-prone area and threshold determination.
The 12 EQ datasets registered in Greece between 1 January 2008 and 30 June 2008 have been classified based on the area where the EQs have happened.The frequency spectrums of each are observed.The observation will determine the characteristics of the frequency spectrum for the GFdiff of that area and the threshold for GFdiff in each of the areas, as shown in Fig. 7.

Experimental result and discussion
The detailed analysis is performed on EQ data registered in Greece between 1 January 2008 and 30 June 2008.During this period 12 EQs took place.Of these 12, 5 EQs with magnitudes greater than M s = 5R (5R) and 7 EQs with magnitudes greater than M s = 6R (6R), as shown in Table 1.The locations of these EQs and the corresponding magnitudes are shown in the maps in Figs. 8 and 9.The 1st significant GFdiff prior to the EQs registered at the three monitoring sites has been investigated.For the analysis the GFdiff data is segmented and windowed.STFT is then applied to the windowed GFdiff.The analysis is then performed on the data before, on the day when the significant GFdiff is registered, and on the day which the EQ occurs.Table 2 shows the dates on which the analysis is performed.
As a matter of comparison, the data from the days of the normal condition, i.e. a few days before the 1st significant End Fig. 7. Classification of EQ-prone area and threshold determination.difference in GFdiff, are also included in Table 2. Table 2 also includes the data from the day of the EQ and the 1st significant difference in GFdiff.The amplitudes of the spectrum of each day in the categories is compared.The result shows that the 1st significant differences of GFdiff have the most significant amplitude compared to that of the normal days and the day of EQ.Figures 10,11,and 12 show that the 1st significant difference in GFdiff has the significant amplitude frequency compared to the other two.
Grouping is performed in EQ prone areas from the 12 EQs which happened during the period of 1 January 2008 and 30 June 2008 -shown in Fig. 13.The classification of data based on the location is shown in Table 3. From the results the following can be shown: -The greater the magnitude of the EQ, the longer the time before the 1st significant GFdiff is detected prior to the EQ.-The smaller the magnitude of the EQ, the shorter the time before the 1st significant GFdiff is detected prior to the EQ.

Nat
-The greater the magnitude of the EQ, the smaller the amplitude spectrum of GFdiff detected.
-The smaller the magnitude of the EQ, the greater the amplitude spectrum of GFdiff detected.
Therefore, it is highly probable that these unique characteristic of the 1st significant difference of GFdiff could be used as one of the characteristics in predicting and determining the location and magnitude of EQs.This characteristic can be used as input to the identification system to predict any incoming EQs.

Conclusions
In this work, the characteristics of the geoelectric field prior to an EQ has been investigated.For the analysis, we first generate the data of GFdiff (geoelectric field difference), which represents the change in geoelectric field signal over a certain period of time.In this case, for analysis purposes, the 1st significant GFdiff is detected.Then, the signal is segmented and windowed, and the adaptive STFT technique is applied to the windowed signal.The spectrum analysis of GFdiff is performed on the data of EQs that occurred in 2008 in Greece.In the analysis, geoelectric field data registered at normal conditions, i.e. no significant change, and the data on the day of the EQ are observed and compared with geoelectric field data www.nat-hazards-earth-syst-sci.net/13/1679/2013/Nat.Hazards Earth Syst.Sci., 13, 1679-1686, 2013

Fig. 5 .
Fig. 5. Flow chart for algorithm of the determination of the 1st significant change in geoelectric field prior to an EQ.

Table 1 .
List of EQs registered in Greece in 2008.

Table 2 .
List of EQs registered in Greece in 2008.