Pre-seismic anomalies from optical satellite observations: a review

Detecting various anomalies using optical satellite data prior to strong earthquakes is key to understanding and forecasting earthquake activities because of its recognition of thermal-radiation-related phenomena in seismic preparation phases. Data from satellite observations serve as a powerful tool in monitoring earthquake preparation areas at a global scale and in a nearly real-time manner. Over the past several decades, many new different data sources have been utilized in this field, and progressive anomaly detection approaches have been developed. This paper reviews the progress and development of pre-seismic anomaly detection technology in this decade. First, precursor parameters, including parameters from the top of the atmosphere, in the atmosphere, and on the Earth’s surface, are stated and discussed. Second, different anomaly detection methods, which are used to extract anomalous signals that probably indicate future seismic events, are presented. Finally, certain critical problems with the current research are highlighted, and new developing trends and perspectives for future work are discussed. The development of Earth observation satellites and anomaly detection algorithms can enrich available information sources, provide advanced tools for multilevel earthquake monitoring, and improve shortand medium-term forecasting, which play a large and growing role in pre-seismic anomaly detection research.


Introduction
The earthquake is one of the most devastating and catastrophic natural disasters, which exposes human beings to tremendous risks (Geiß and Taubenbö ck, 2013). Therefore, strengthening the monitoring research of seismic activities using different technical means is necessary. Since the 1970s, remote sensing technology has been applied to seismic monitoring. In the 1980s, 25 thermal infrared (TIR) satellite data were used for the first time to analyse thermal anomalies prior to earthquakes (Gorny et al., 1988;Tronin, 1996 ). Earthquake monitoring based on passive or active remote sensing data is considered a promising research interest by the international seismology and remote sensing communities (Tronin, 2010), having the advantages of interdisciplinary research and being a new exploratory technology and tool.
2 Satellite remote sensing has its own advantages compared to traditional approaches of seismic hazard monitoring. Ground stations dedicated to earthquake monitoring are spatially relatively sparse; thus, their capability to monitor crustal movement and seismic information is limited. Meanwhile, global remote sensing data with high spatiotemporal resolution can be obtained through the advances in satellite technology. Satellite TIR data are a considerably reliable data source that exhibits the continuity, stability and accessibility. These data allow the study of the phenomena of large-scale and fast-changing seismic 5 thermal anomalies. Satellite anomaly recognition can quickly capture the anomalies on the macroscopic scale, which contains thermal information of the Earth's interior caused by the crustal movement. Moreover, satellite data can reveal large-scale linear structures and short-term (i.e., from days to weeks) variations of thermal anomalies over tectonic plate borders and active faults (Filizzola et al., 2004;Tramutoli et al., 2005;Tronin et al., 2004). After more than 30 years of research, cases of anomalies monitored by satellite remote sensing gathered from a large number of earthquakes have been repeatedly tested and 10 verified. Results show that different anomalies are present in seismogenic areas before earthquakes with the medium or strong magnitude.
The sudden dislocation of active faults due to surging tectonic stress causes tectonic earthquakes. In addition to the considerable amount of the strain energy released during the earthquake itself, the elastic energy continuously accumulates during the preparation process of the earthquake. Different theories to explain the physical mechanism of the pre-seismic anomalies 15 derived from optical satellite data have been proposed. The p-hole model (Freund, 2011;Freund et al., 2009) indicates that electronic charge carriers, also known as positive holes, in crustal rocks activated by tectonic stress and flow out of the stressed rock volume and propagate fleetly. They cause air ionization at the land surface-atmosphere interface when accumulated in a thin surface/subsurface layer, and generate non-thermal infrared emission as a result of the recombination of positive holes.
The rock stress adjustment from active faults probably causes anomalies of land surface or air temperatures prior to the 20 earthquake, which could be observed through satellite TIR sensors (Chen et al., 2015;Ren et al., 2017;Wu et al., 2006). The spatiotemporal evolution of the rock temperature field is closely related with its deformation. The rock compression causes the obvious increase of temperature, and rock tension gives rise to temperature decrease. The solid Earth is about 1650 times of thermal capacity of the atmosphere. The change of elastic stress of 1 MPa is likely to bring in the variation of air temperature with the order of 1 K based on the energy balance. The local greenhouse effect due to the emanation of CO2, CH4, etc. has 25 been invoked to explain the anomalous variations of top-of-atmosphere (TOA) brightness temperature or outgoing longwave radiation (OLR) (Ouzounov et al., 2006;Ouzounov et al., 2007;Tronin et al., 2002). Besides, the increased emission of radon from active faults and cracks in seismogenic regions is also considered to bring about the air ionization, which can concentrate water molecules on air ions, further lead to the anomalies of the atmospheric water vapor and temperature, and accelerate the latent heat flux before earthquakes (Pulinets et al., 2006b). It should be noticed that the p-hole model and temperature response 30 to stress-strain variations in rock are based on rigorous laboratory experiments. However, when these theories are extended to the field environment, there are too many factors that influence the results. In particular, it is difficult to verify these theories, which hinders forming a complete logical chain. Therefore, much further work need to be carried out to apply this mechanism in the very complex field environment. Other theories also face a similar situation.
Despite lots of promising research results about pre-seismic anomalies, the anomaly detection approach is still met with skepticism and controversies (Geller, 2011;Wyss 1997), because early warning signals from earthquake-prone regions are 5 diverse, subtle, and often transient. Moreover, this approach is generally applied with some serious misjudgments and misses earthquake prediction due to the weak physical basis of pre-seismic anomalies and perturbations from surface and atmospheric conditions and human activities (Tronin, 2010). Furthermore, the transfer process of surface thermal radiation is considerably complex, and numerous factors that affect the TIR radiation are highly varied. The recognition of pre-seismic anomalies can be significantly hindered by topography, vegetation, soil moisture, surface type, and atmospheric conditions. Therefore, 10 various anomalies caused by seismic activities sink into the background with various other factors and are considerably weak relative to the surface or atmospheric thermal radiative variations caused by other natural or anthropogenic changes and thus can hardly be discerned. Seismic precursor identification has not yet formed an objective, stable and verifiable anomaly detection method; thus, it has not been completely accepted by the seismic community.
This paper attentively reviews the advances in pre-seismic anomaly monitoring through satellite technology mainly over the 15 last decade in terms of possible earthquake precursors and anomaly detection approaches. Besides, it is noted that reanalysis datasets can also be used in the anomaly analyses as mentioned in some studies below. These datasets are created via geophysical models and data assimilation scheme with spatial continuity and relatively low spatial resolution. The satellite products and reanalysis data provide abundant data in this domain. Although inherent physical problems are always present, this work provides some assistants for attempts to detect pre-seismic anomalies using satellite datasets. 20 The rest of this paper is organized as follows. Section 2 presents the geophysical variables used for pre-earthquake anomaly detection in a number of earthquake cases studies. Section 3 describes several types of anomaly detection methods, their basic properties, and recent applications. Section 4 discusses the issues around the pre-seismic anomaly identification by satellite data products, which prove to be essential hindrances in earthquake prediction. Finally, Section 5 presents the expected development and perspectives for the next decades based on the review of current research status and trend, and analysis of 25 existing problems in this domain.

Possible seismic precursors
All types of signals from the Earth-atmosphere system before strong earthquakes have been analysed in a large amount of literature. The possible earthquake precursors used for the anomaly detection involve various geophysical variables, such as seismic wave velocity, surface deformations, gravity, geoelectricity, geomagnetism, groundwater chemistry, temperature, 30 moisture, atmospheric composition and ionospheric parameters (Cicerone et al., 2009;Tronin, 2006). Nevertheless, the pre-4 seismic anomaly analysis of precursors at a global scale and of high time-frequency has to rely heavily on satellite remote sensing datasets. This paper focuses on several widely studied anomalies associated with seismic activities from above mentioned datasets. In the 1980s, after the discovery of thermal anomalies prior to earthquakes, seismic experts gradually realized the advantages of TIR remote sensing and conducted relevant studies. For more than 30 years, many geophysical variables, which can reflect 5 thermal radiation information about the Earth's surface and/or atmosphere using satellite observations and products (e.g., TOA brightness temperature, OLR, surface temperature, and latent heat flux) have been used to find a correlation between different anomalies and specifications of time, location and magnitude of imminent earthquakes. The following sections contain detailed explanations about these candidate physical quantities (Table 1)-from TOA, the atmosphere to the Earth's surface-and their applications in the pre-seismic anomaly analyses. 10 A comprehensive table is shown in Table 2 that contains the possible pre-seismic precursors, the satellite sensors that can retrieve theses parameters, the spatial and temporal resolutions of these data and their retrieval precision. We select some sensors that are still in operation, have high quality and abundant products, and widely used in remote sensing community.

TOA brightness temperature and OLR
The TOA brightness temperature (BT) is a measurement of the radiance of the TIR radiation at 3-5 μm or 8-14 μm from the 15 Earth's surface, atmosphere, and clouds to satellite sensors. It is a direct and fundamental physical quantity expressed by the equivalent blackbody temperature of one channel (e.g., 11.77-12.27 μm band) in Kelvin. In other words, BT is not the normal temperature due to ignoring the emissivity effect. Many satellite sensors obtain TIR measurements, such as the Moderate Resolution Imaging Spectroradiometer (MODIS), Visible Infrared Imaging Radiometer Suite (VIIRS) and Advanced Very High Resolution Radiometer (AVHRR) aboard polar-orbiting satellites, and Spinning Enhanced Visible and Infrared Imager 20 (SEVIRI) and Advanced Himawari Imager (AHI) aboard geostationary satellites.
TOA BT data are one of the mostly used remote sensing data in thermal anomaly detection because of their easy accessibility and data processing. Several studies indicate that significant BT anomalies existed in seismically active areas about one month to eight days before earthquakes Lisi et al., 2015;Pergola et al., 2010). The regions with strong tectonic activities also showed intensive BT anomalies during the preparatory phases of low-magnitude earthquakes . 25 After a 10-year retrospective correlation analysis in Greece, TOA BT anomalies are considered to satisfy the qualifications involved in a multi-parametric system for the time-dependent assessment of seismic hazards (Eleftheriou et al., 2016). As shown in Fig. 1, BT anomalies mainly occurred before one earthquake with magnitude greater than 5 and for seismic events with magnitude less than 5, the indicator of anomalies became less significant.
Another frequently used precursor at TOA is OLR, which is a critical component of the Earth's energy budget. OLR is the 30 total longwave radiative flux (W/m²) in the TIR wavelength from 4 μm to 100 μm, which is emitted from the Earth and its atmosphere going to space. More specifically, it is the integrated result of the reflection, absorption, and emission of longwave radiation by the Earth's surface (land and sea), atmosphere, aerosols and clouds, and finally leaves the Earth system. OLR represents the radiation characteristics of the cloud top when the instantaneous field of view is covered by clouds and indicates the longwave radiation emitted by the surface and atmosphere under cloudless conditions. OLR cannot be directly measured but can be derived from measurements of sensors (e.g., AVHRR, Atmospheric Infrared 5 Sounder (AIRS), and Clouds and the Earth's Radiant Energy System (CERES)). Among them, the National Oceanic and Atmospheric Administration (NOAA) OLR is the commonly used data product in various studies and is retrieved from the AVHRR TIR bands (10.5-12.5 μm) on a 2.5°× 2.5° latitude-longitude grid with daily and monthly mean values from 1981 to the present (Gruber and Krueger, 1984). The latest AIRS OLR product, which is retrieved from hyperspectral TIR data, is AIRS Version 6 Level 3 monthly mean product on a 1°× 1° latitude-longitude grid beginning from September 2002 (Aumann 10 et al., 2003). Finally, the CERES OLR dataset is the monthly means from 2000 in the CERES energy balanced and filled edition 4.0 datasets on a 1°× 1° latitude-longitude grid (Rose et al., 2013).
Several studies suggest that OLR can be effectively used as a short-term or impending earthquake precursor. OLR appeared as a steadily increasing trend around the epicentre and thus can be used as an indicator of seismic activity (Ouzounov et al., 2007). Figure 2 presents the time series of daily OLR anomaly before Mw 9.0 Sumatra Andaman Islands, Northern Sumatra 15 earthquake (3.09°N, 94.26°E) occurred on December 26, 2004. After several small peaks beyond the mean field of OLR, an intense rise showed up shortly following this mega earthquake. Anomalous OLR occurred two months to several days prior to the main shock with at least several thousands of square kilometres. OLR anomaly has a direct significance for earthquakes with magnitude above 5.5 (Eleftheriou et al., 2016). Anomalous OLR variations can reach 30-45 W/m 2 around epicentral areas 7-8 days prior to the major earthquake (Rawat et al., 2011). Cumulative elastic energy in the earthquake preparation regions 20 may result in intensive thermal emissions associated with the release of a large amount of energy when an earthquake occurs.
Increase in surface temperature, air temperature, water vapour and other greenhouse gases result in the increase in OLR and TOA BT, especially around the epicentre under clear sky conditions (Tramutoli et al., 2005).
TOA BT and OLR represent longwave radiation variations of the entire surface-atmosphere system, and are significantly affected by atmospheric conditions. The height and coverage of clouds have a dominating influence on the total intensity of 25 longwave radiative fluxes received at the TOA (Turner et al., 2015). Therefore, eliminating cloud effects is necessary in applications, which can be difficult when working with low-resolution data. Meanwhile, OLR anomalies may be the result of other atmospheric effects and not of increased surface temperature (Tronin, 2010). In the cases under clear sky conditions, water vapor is another significant factor that has strong absorption and re-emission capabilities on longwave radiation and rapidly changes in the spatiotemporal pattern. These situations are particularly unfavorable for detecting weak pre-seismic 30 6 anomalies; thus, surface land temperature (LST) and similar data products that are not heavily affected by atmospheric disturbances are preferred (Filizzola et al., 2004;Lisi et al., 2015).

Atmospheric water vapor and temperature
Atmospheric water vapor and temperature are two important physical quantities that describe the status of the atmosphere, especially in the atmospheric boundary layer. They have strongly dependencies and thus can present synchronous responses 5 to the impending earthquakes.
Over 99% of atmospheric moisture is in the form of water vapor, and it is abundant and highly varies in abundance in the atmosphere. Water vapor is the primary greenhouse gas in the Earth's climate system and influences long-term climate changes.
The water vapor associated with atmospheric energy drives the development of weather systems and the transfer of heat from the tropics to the poles. Atmospheric water vapor content can be referred to as the total column water vapor contained in a 10 vertical column of atmosphere between the Earth's surface and the TOA, which can be retrieved from near infrared, TIR or passive microwave data (Chesters et al., 1983;Ferraro et al., 1996;Gao and Kaufman, 2003). For example, the MODIS total precipitable water product (MOD05_L2 and MYD05_L2) has two types of data, one using near-infrared algorithm at the 1km spatial resolution during the day, and the other using infrared algorithm at 5-km resolution during day and night. In general, the accuracy of water vapor products ranges from 5% to 10% (Gao and Kaufman, 2003). 15 Anomalies of atmospheric water vapor content have been observed prior to several strong earthquakes Dey and Singh, 2003). The 2001 Gujarat earthquake triggered anomalous atmospheric moisture increase of about 15 mm, which were detected using passive microwave data of the Special Sensor Microwave Imager (Dey et al., 2004). The anomalous increase of water vapor level occurred around the epicenter in land regions before the earthquake and in ocean regions after the earthquake. Meanwhile, AIRS data led to the discovery of anomalous changes in relative humidity at different pressure 20 levels reaching 500 hPa with even larger variations at the lower levels (Singh et al., 2010a). Dey et al. (2004) related this phenomenon with the increase of SLHF, which is discussed in Section 2.5.
Aside from atmospheric moisture, air temperature (or atmospheric temperature) is the most commonly measured weather parameter. It refers to the screen-level air temperature in meteorological terms, which is normally measured at the attitude of 2 m above the ground surface. Air temperature is a more complex parameter than LST, which is affected by LST, winds, 25 topography, etc. LST can be estimated based on satellite TIR measurements. However, air temperature cannot be directly derived from LST, and no operational remote sensing data product of air temperature is available at present. The atmospheric profiles provide the air temperature and water vapor amounts at different altitudinal layers in the vertical atmosphere, but this air temperature is different from the screen-level temperature. The atmospheric profiles can be retrieved from TIR multispectral sensors (e.g., MODIS) or infrared sounders (e.g., AIRS) (Aumann et al., 2003;Seemann et al., 2003). 7 Significant near-surface air temperature anomalies related to earthquakes have been reported in different studies (Panda et al., 2007;Pulinets and Dunajecka, 2007). An approximate increase of air temperature at 2-4 K in a north-south pattern along the Nosratabad fault zone during 18-20 December was observed prior to the Ms 6.0 Kerman earthquake in Iran at December 20, 2010 (Fig. 3) (Alvan et al., 2013). Anomalous changes of air temperature at different pressure levels from AIRS data were found up to 500 hPa with the variations becoming larger at the lower levels (Singh et al., 2010a). An increase in air temperature 5 along with low air humidity has been observed prior to earthquakes (Pulinets et al., 2006a;Singh et al., 2010a). Ouzounov et al. (2007) and Pulinets and Dunajecka (2007) consider that the anomalies in air temperature and relative humidity prior to strong earthquakes indicated by historical meteorological data are primarily related to the air ionization caused by increased radon emanations from faults in earthquake preparation regions. Anomalous areas are considerably larger than seismic sources, which indicates the difficulty of identifying the epicenter before a main shock. 10

Atmospheric trace gases
Atmospheric trace gases make up less than 1% of the Earth's atmosphere by volume. These gases include CO2, CH4, O3, SO2, and NOx, which are produced from natural geothermal sources, volcanic activity, ecosystem and anthropogenic sources. Trace gas concentrations are crucial in investigating atmospheric processes. Atmospheric trace gas data, such as CH4, CO2 and CO, can be retrieved from satellite measurements (e.g., AIRS hyperspectral TIR data) (Chahine et al., 2006;Clarisse et al., 2011). 15 Seismic activities can change atmospheric composition, especially the increment of greenhouse gases prior to strong earthquakes. A 6% increase in the columnar ozone prior to the 2015 Nepal earthquake was observed (Ganguly, 2016). An increased CO emission at near-surface pressure levels associated with LST over epicenter areas was detected by the Measurements of Pollution in the Troposphere onboard Terra satellite approximately one week before the earthquake (Singh et al., 2010b). Anomalous increases in total column CO and O3 occurred over the epicenter areas one to six months prior to 20 several strong earthquakes (Cui et al., 2013). Figure 4 indicates that anomalous CO appeared along the San Andreas Fault, while the anomalies of O3 spread over wider areas around the epicenter before and after Ms 7.1 Baja California earthquake on 5 April 5, 2010. However, these phenomena are unstable with only parts of studied earthquakes exhibiting anomalous variations and with persistent time being irregular.
Changes in pressure between rocks due to seismic activity are believed to result in greenhouse gas emissions and local 25 greenhouse effect (Choudhury et al., 2006;Singh et al., 2010a;Zoback and Gorelick, 2012). The experimental results show that gases, such as CH4 and CO2, can obtain energy from the transient electric field and cause a temperature increase. The trace gas anomaly before medium to strong earthquakes was preliminarily believed to exhibit 1) a sudden release of gas and 2) a static electricity of field mutation. Earthquakes may lead to the release of underground carbonaceous gas, which provides a scientific basis for the mechanism of pre-seismic thermal anomalies. In addition, CO2-rich fluids in the deep crust are 30 considered as the possible important carbonaceous gas source in the lithosphere-coversphere-atmosphere coupling (LCAC) 8 process (Chiodini et al., 2011). The anomalies in trace gas concentration can be attributed to gas emissions in the lithosphere and photochemical reactions (Cui et al., 2013). The LST anomalies prior to the 2009 Mw 6.3 L'Aquila earthquake illustrate the strong correlation with the topographic height and were clearly distributed over the mountain tops but not the valleys (Piroddi et al., 2014). The anomalous mechanisms of the activation of electronic charge carriers and emission of greenhouse gases or radon gas are considered to demonstrate a consistent manner because of prominent LST anomalies within the 5 limestone regions in this earthquake case. As a result, LST and air temperature may increase, resulting in changes in SLHF and atmospheric water vapor condensation, and propagation to the BT observed by satellite sensors.

Aerosols
Aerosols are the fine solid and liquid particles suspended in the atmosphere, which are produced from various sources, such as desert dust, wildfire smoke, sea salt particles, volcanic ash, and urban haze. Aerosols have different optical and 10 microphysical properties depending on their type and size distribution. They have an important influence on cloud formation and are often expressed by the AOD. AOD is a dimensionless physical quantity used to describe quantitatively the extinction degree of the direct solar radiation reaching the ground by the absorption and scattering of aerosols. AOD is defined as the integrated extinction coefficient within a vertical column of unit cross section along the entire atmospheric path, where the planetary boundary layer accounts for most of them. For now, most of multispectral radiometers in the visible, near-infrared 15 and shortwave-infrared wavelength regions can be used to retrieve AOD. For instance, the MODIS daily Level 2 aerosol product (MOD04), which is produced by the "Dark Target" and "Deep Blue" algorithms at a spatial resolution of 10 km at nadir in the Collection 6 ( Levy et al., 2013), includes many aerosol parameters, such as AOD, aerosol type, angstrom exponent, asymmetry factor, and single scattering albedo.
Aerosols are also used as indicator in many per-seismic anomaly studies. AOD was observed an increase of 40% prior to the 20 2015 Nepal earthquake (Ganguly, 2016). A clear increase in AOD around the epicenter 2 days prior to the 2010 Chile earthquake was found in Fig. 5 using three anomaly detection methods in order to reduce uncertainty of detecting anomaly date (Akhoondzadeh, 2015). An anomalous increase of ground-measured AOD is quasi-synchronous with several abnormal hydrothermal parameters (Wu et al., 2016). Over the sea surface, significant changes in aerosol parameters are considered to be related to the 2001 Gujarat earthquake (Okada et al., 2004). Although aerosols do not directly relate with the TIR radiation, 25 the charged aerosol particles play a role in the water vapor condensation and cloud formation over earthquake fracture areas.
A high aerosol concentration causes stronger electric field intensities and increased infrared emissions (Liperovsky et al., 2005).
Therefore, indirect effects of aerosols on the pre-seismic anomalies cannot be ignored. For another thing, the air ionization is a possible reason that keeps aerosols suspended in the atmosphere for many days before the earthquake, which could be explained by the p-hole theory (Freund et al., 2009). The predominantly, sometimes exclusively positive airborne ions 30 generated at the Earth surface rise through the atmospheric column and attach to aerosol particles, thus the ionized aerosols 9 can suspend in the atmosphere for a longer time. More field measurements, which can validate this theory, still need to be carried out.

SLHF
SLHF (W/m 2 ) is the flux of the heat absorbed or released by the phase transition (i.e., condensation, evaporation, and melting) of water from the Earth's surface to the atmosphere. SLHF is an important component of Earth's surface energy budget and is 5 mainly affected by the atmospheric relative humidity, wind speed, surface temperature, and season. Traditionally, SLHF is measured through the Bowen ratio technique or by the eddy covariance, and can also be obtained from satellite products.
MODIS global evapotranspiration product (MOD16) has a spatial resolution of 1 km at eight-day, monthly, and annual intervals (Mu et al., 2011).
Several studies have reported the abnormal increase of SLHF prior to earthquakes. The SLHF increased by approximately 50 10 W/m 2 peaking at 115 W/m 2 over 3-23 days before earthquakes in the Middle East (Mansouri Daneshvar et al., 2014). SLHF can also be a possible precursor to coastal earthquakes because anomalous SLHF tends to be more intensive over sea surface than over land surface Cervone et al., 2006). The analysis of coastal earthquakes shows that the SLHF increased abnormally a few days before earthquakes, which is the result of the increase in surface temperature in seismically active areas (Pulinets et al., 2006a). SLHF anomaly is significantly weaker on land surface than on sea surface due to obvious 15 differences in air humidity, wind speed, and surface temperature between sea and land surfaces. However, Qin et al. (2014a) found that anomalous SLHF, which is indicated by the dashed circle in Fig. 6, occurred approximately 10 days before the inland Pu'er earthquake (23.8092°N,101.15°E) on 2 June, 2007. Conversely, the study of Zhang et al. (2013) indicates that no significant anomalous variation of SLHF exists prior to ten studied marine or coastal earthquakes, and that data accuracy and parameter settings have an important effect on quantification of the correlation between SLHF anomaly and earthquake's 20 occurrence.
SLHF anomaly prior to earthquakes is considered to relate with the underground fluid movement and the interaction among the underground, surface and atmosphere (Alvan et al., 2013). Mansouri Daneshvar et al. (2014) believed that water vapor responses to seismic preparation phases follow a seismic-triggered chain, which undergoes air ionization, SLHF enhancement, water vapor condensation, and increased rainfall. Several studies have proven that SLHF anomalies before earthquakes are 25 linked with soil moisture Dey and Singh, 2003). Meanwhile, air ionization in the lithosphereatmosphere-ionosphere coupling (LAIC) model may be responsible for soil moisture and SLHF anomalies due to the increase of water condensation on newly formed ions .

Surface temperature
LST, also known as surface skin temperature, represents a remarkably thin surface layer of medium temperature state, which is affected by solar radiation, surface albedo, vegetation cover, soil moisture, etc. LST affects the energy exchange between the surface and boundary layers and determines the air temperature near the surface. Compared to TOA BT and OLR, LST can indicate the surface warming process better. MODIS LST data are currently the most extensively used products in the 1-5 km spatial resolution with an inversion accuracy of approximately 1 K (Wan, 2014).
LST is suitable for the analysis of inter-monthly variations, which is beneficial to short-term earthquake monitoring. A 40-day time period is sufficient for anomaly monitoring (Bhardwaj et al., 2017). The LST prior to earthquakes usually increased by at least 3-5 K (Ouzounov et al., 2006;Ouzounov and Freund, 2004;Tronin, 2006). A 1-10 K rise of LST two weeks before the main shock of 2011 Tohoku-Oki earthquake was detected around the epicenter (Zoran, 2012). Moreover, the study by 10 Bhardwaj et al. (2017) shows that snow cover levels prior to the 2015 Nepal earthquake noticeably declined and that snow surface temperature is a highly sensitive precursor that presented 3-14 K temperature anomaly in 3-17 days prior to this earthquake. Most studies use nighttime LST data for analysis due to the absence of solar heating, leading to measurements that are predominantly gathered from surface and underground thermal sources. However, LST anomalies can also be significant in the daytime (Blackett et al., 2011;Panda et al., 2007;Saraf et al., 2008). 15 The pre-seimic anomalies in epicentral areas before strong earthquakes are related to the increase in air temperature, LST and soil temperature at shallow layers. LST anomaly indicated by MODIS data was consistent with anomalous air temperature variations observed at the meteorological station close to the epicenter (Akhoondzadeh, 2013). The anomalous temperature variations are related to the seismic deformation and follow the fault evolution under seismic stress changes (Ma et al., 2008).
Thus, LST is more useful in detecting anomalies than TOA BT or OLR because it is less affected by the atmosphere, which 20 allows it to produce a higher signal-to-noise ratio (SNR) (Aliano et al., 2008;Lisi et al., 2015). Figure 7 shows the spatial and temporal evolution of the sequence of LST anomalies by different colored dashed circles (Lisi et al., 2015). The green dashed circle indicates the possible local warming effect caused by clouds. The LST anomalies which followed the main tectonic lineaments (green lines) in Central Italy (black circle), the Balkan region (blue circle) and the Padania plain (purple circle) were possibly associated with the medium-strong earthquakes. Meanwhile, LST, as a thermodynamic parameter of dynamic 25 equilibrium, derived by remote sensing data contains a degree of uncertainty due to the influence of surface types, processes of surface energy balance and atmospheric turbulence.
Several cases of marine earthquake monitoring using remote sensing TIR data have been successful. Sea surface temperature (SST) anomaly can be related to seismicity (Ouzounov and Freund, 2004), but SST is strongly influenced by weather conditions and currents. Seawater has a large thermal inertia that allows its temperature to change more slowly; therefore, 30 some mechanisms of LST anomalies are not applicable to SST anomalies. The study of the 2000 Blanco earthquake sequence 11 shows that ocean hydrothermal system can be altered by large earthquakes and significant decline of temperature was observed following the main shock (Dziak et al., 2003). Ouzounov et al. (2006) analysed the decrease of SST before the Ms6.8 2003 Boumerdes earthquake and implied that this phenomenon was attributed to the upwelling of cold water from the deep ocean caused by crustal deformation. Freund et al. (2009) disagreed with his explanation, and offered a physically meaningful mechanism to explain the phenomenon of SST anomalies. The p-holes activated by the tectonic stress transfer from the 5 sea floor to the ground-to-water interface and generate the oxidation from H2O to H2O2, which changes the thermal radiation balance of the sea surface.

Anomaly detection methods
The pre-seismic anomaly can generally be defined as the departure of a current observation value from the long-term average reference (Ouzounov et al., 2007). It is used to relate seismicity with temporal and spatial variations of thermal radiation related 10 signals from satellite observations. Different data processing methods have been proposed in this domain over the past years.
These methods aim to detect pre-seismic anomalies, attempt to point out possible future strong earthquake in terms of time, spatial, and magnitude windows within stated limits, and estimate the probability of this earthquake's occurrence. Figure 8 presents the methods that are classified and discussed below.

Visual interpretation 15
Visual interpretation method is a type of manual analysis method that interprets the relationship between an earthquake and the relevant parameters through a time series of visible and near-infrared or TIR data before and after the target earthquake.
The observed quantity, such as surface temperature, is measured in considerably greater amounts prior to the earthquake around epicenter areas compared to other time period and neighboring regions. The method is simple and effective; thus, it is always used in interpreting preliminary pre-seismic anomalies. Several studies have utilized this method, such as in the aerosol 20 anomaly (Qin et al., 2014b), SHLF anomaly (Dey and Singh, 2003), anomalies of multiple parameters (Pulinets et al., 2006a;Singh et al., 2010a), and BT anomalies of AVHRR and MODIS data (Saraf et al., 2012).
Visual interpretation is mainly based on experience interpretation, which results in strong subjectivity and does not allow current computer technology to deal with massive amounts of satellite data. From Fig. 9, we can distinguish the LST anomalies around the epicenter, but this difference is subtle causing the randomicity during the anomaly recognition. In addition, visually 25 identifying abnormal information can be difficult and calls for further processing, such as the use of pseudo-color synthesis, difference method, and fault profile line method. At the same time, increasingly sophisticated anomaly detection algorithms have been applied to extract signals about pre-seismic anomalies with increases in data source and amount and the quantitative requirement of further research.

Image difference methods
The simple image difference method involves identifying the differences in the same region over a series of images captured at different times for qualitative analysis. If the difference exceeds the pre-defined threshold, then the observed precursor value could be regarded as an anomaly that may be associated with an impending earthquake. This method eliminates the influence of background signals and highlights the spatial distribution of pre-seismic anomalies to a certain extent. For example, TIR 5 images captured before and after earthquakes were used to calculate difference values that highlight anomalous variations (Ma et al., 2008;Tronin et al., 2002). The method that uses BT difference has a certain ability to identify cloud pixels, but cannot eliminate background influence. Therefore, temperature anomaly is introduced through normalizing the area-averaged LST by subtracting and dividing by multiyear mean LST (Zoran, 2012).
The general anomaly method is to find the difference between current data during an earthquake and the long-term average 10 value (also called a reference value) in order to improve result robustness and eliminate potential bias (Mahmood et al., 2017;Qin et al., 2014a;Wu et al., 2016). This method can be expressed as where v( , , ) is the value of a pixel at a spatial location ( , ) and time t of satellite data required, which can be TOA BT,15 OLR or LST;and ( ,) is the average value at the same position in the same or similar time slot of years using multiyear data. In other words, the anomaly is the difference between what is happening and what is expected over a long-term average trend. General anomaly method is an improvement of the image difference method .
Given complex influences and possible uncertainties with respect to seasons, local weather, topography, and surface inhomogeneity, the pre-seismic anomaly is a weak signal in the strong background level. Thus, understanding variations of 20 the background field and the variation trend of each influencing factor over time are the key to the anomaly detection. The establishment of regional reference field based on multiyear data is a basic procedure to identify pre-seismic anomaly. Because reference field can express the trend of physical quantity in the spatiotemporal pattern and relieve random variability because of local meteorological factors. General anomaly method is used to eliminate this tendency change of current observation, thereby highlighting the pre-seismic anomaly. The k times standard deviation (δ) is generally the reference level relative to the 25 long-term normal field. Beyond ±kδ from the mean value of the reference field is regard as the anomaly. A positive anomaly means that the value is higher than the normal trend, whereas a negative anomaly indicates a lower value than the normal value.
The thermal anomaly is a weak signal under strong interference, thus isolating it by simple comparison or threshold is difficult.
This may also be an important reason that not all earthquakes exhibit significant pre-seismic anomalies. The eddy field method is used to calculate and add the difference data between adjacent lattice points of the data to obtain the spatial distribution of 30 the vorticity value (Ouzounov et al., 2007). The formula is as follows: ( , ) = 4 • ( , ) − ( − 1, ) − ( + 1, ) − ( , − 1) − ( , + 1), where EF( , ) is the eddy value at spatial location ( , ), and v is the observation value of a pixel. This method can highlight anomalous changes in the neighborhood and is used for earthquake-related anomaly detection by NOAA OLR data (Ouzounov et al., 2007;Xiong et al., 2010).
Furthermore, normalization by Z-score (ZS) is as an anomaly index, which hereafter is referred to as the ZS method (Ouzounov 5 et al., 2011). The ZS index is defined as follows: where ( , ) is the standard deviation of this reference field. This method is similar to the robust satellite technique (RST) method, which introduces the concept of SNR. The difference between current and mean values of a reference field represents 10 the signal part, and the standard deviation of a reference field is the noise part. A higher SNR indicates a greater possibility of anomalies prior to an earthquake (Tramutoli et al., 2001).

RST method
At present, RST, which is extensively used in the anomaly detection, is a multi-temporal statistical method for analyzing longterm satellite records with similar observation conditions (e.g., same month, same time of day, and same sensor data) (Genzano 15 et al., 2015;Lisi et al., 2010;Pergola et al., 2010;Tramutoli et al., 2005). RST combines the neighborhood difference in the eddy field method and the normalization of the reference field in the ZS method, and it clearly defines the mathematical expression of the pre-seismic anomaly in statistics. This method considers the statistical characteristics of the historical data of seismically active areas to discriminate the possible pre-seismic abnormal behavior of current signals. It is also considered applicable to different atmosphere and surface conditions and different satellite observation geometries, and can reduce the 20 probability of occurrence of false anomalies (Filizzola et al., 2004;Pergola et al., 2010).
The robust estimator of TIR anomalies (RETIRA) index is used to express the intensity of anomalies in RST method, as shown as follows: Δ (x, y, t) = ( , , ) − ̅ ( ), where ̅ (t) is the spatially average value in the homogeneous neighborhood; such that Δv(x, y, t) represents the difference between the value of a pixel in (x, y) and ambient homogeneous pixels; Δ (x, y) and Δ (x, y) are respectively the mean and standard deviation in (x, y) of the reference field calculated from Δv(x, y, t) in the same or similar time slot from many years, which are referred to Eq. (2) and (5). The RETIRA index is further used to indicate the relationship with future seismic activity.
Generally, a value greater than 3 to 4 is used as the threshold for the presence of anomalies that are associated with possible earthquake occurrences (Eleftheriou et al., 2016;Lisi et al., 2010;Lisi et al., 2015).
If continuous and large-scale anomalies appear from a time series of images within study areas, then they can be considered as pre-seismic anomalies (Aliano et al., 2008). Therefore, an index of significant sequences of thermal anomalies (SSTAs), which measures the persistence and continuity of thermal anomalies in temporal and spatial patterns, is defined to remove 5 problematic values as far as possible Lisi et al., 2015). SSTAs standards include the following: 1) relative strength: RETIRA index is greater than or equal to 3; 2) spatial persistence: the area of thermal anomalous pixels in the region of 1°×1° is larger than 150 km 2 ; 3) temporal continuity: the phenomenon of thermal anomalies occurs at least once within 7 days before or after current time.
The purpose of SSTAs is similar to that of a deviation-time-space-thermal (DTS-T) anomaly recognition method proposed by 10 Wu et al. (2012), which is applied to analyses of several large earthquakes. The DTS-T method defines a normalized quantitative index to express the stability of thermal anomalies considering the significance of numerical differences, synchronization of time, and adjacency of geographical location. Both indices are intended to improve the stability of detected thermal anomalies that are related with impending earthquakes as much as possible. In addition, geostationary satellite data are more advantageous than polar-orbiting satellite data in the anomaly detection (Aliano et al., 2008;Lisi et al., 2015). Because 15 it has a higher temporal resolution and a fixed observation geometry, and thereby its data have a higher SNR in detecting seismically related anomalies when using RST-based method. The principle of this algorithm is concise; however, the impact of short-term meteorological warming is difficult to eliminate. Therefore, this method should be combined with other data to identify the interpretation of anomaly detection results.

Methods based on wavelet transform 20
Wavelet analysis is one of the most popular seismic anomaly detection methods in recent years. Wavelet decomposition, wavelet packet, and power spectrum are used in determining the transformation and decomposition of TIR signals to extract useful feature signals for identifying pre-seismic anomalies. Wavelet transform has the feature of multiresolution analysis, which can characterize the local characteristics of the signal in time-frequency domain.
The influence factors of surface thermal radiation are complex and varied; therefore, influences of various field sources are 25 difficult to separate. Furthermore, non-tectonic factors, such as landforms, land covers, and meteorology, always have greater effects on thermal radiation than tectonic activities or thermal radiation induced by earthquakes. The weak signal under strong interference is difficult to separate, resulting in unstable results of anomaly detection methods. Continuous TIR radiation data can be considered to include the long-period components (e.g., the basic radiation field of the Earth, the inter-annual or seasonal radiation field, and terrain induced radiation field), the short-period components (e.g., diurnal radiation field, cloud and cold 30 air flow in several hours to days, and thermal radiation components caused by tectonic movements or earthquakes), and the 15 last residual factors (e.g., high-frequency noise components). Wavelet transform can decompose the time series signal into signals of different frequency bands that correspond to above different components; thus, studying the energy variation in each frequency band is convenient.
The relative power spectrum (RPS) method performs wavelet transform on time series data to realize bandpass filtering, remove long-and short-period components, and retain certain intermediate frequency band information that may be connected 5 with seismic-related signal changes and is expressed in the form of power spectrum. RPS method was used to analyze BT anomalies of the 2008 Wenchuan earthquake (Zhang et al., 2010). Figure 10 presents the obvious BT anomalies from April 25 to June 19, 2008 over the Longmenshan fault zone, and demonstrates the good results of the RPS method. The wavelet transform is initially used to process the multiyear data after cloud filtering and filling in of missing data. For each pixel, the basic temperature field, annual variation temperature field, daily variation temperature field and other factors can be separated, 10 and remaining information may include earthquake-related information. The RPS is then calculated by Fourier transform with a certain time window and sliding window lengths. On the basis of TOA BT, RPS approach is used to analyze pre-seismic anomalies, but non-seismic anomalies always appeared (Xie et al., 2013).
The wavelet transform method is used to eliminate low-frequency annual and seasonal components and high-frequency random components, such as local meteorological conditions and human activities. The local maxima are then extracted from time 15 series of MODIS LST data by a predefined threshold, which is sensitive to peak changes (Saradjian and Akhoondzadeh, 2011).
Another wavelet maxima method proposed by  uses one-dimensional wavelet transform to identify the isolated local maxima. The maxima mostly associated with impending earthquakes are then refined based on the geometrical continuity from spatial and temporal constraints to remove wrong maxima lines due to significant noise in input data. Wavelet maxima method is further utilized to analyze the result of eddy field method (Xiong et al., 2010). 20 Wavelet transform is based on the communication theory, thus its mathematical and physical foundation is complete, making it a highly promising method. The time series physical quantities derived from remote sensing are taken as the input information via a series of filtering and other processing, and finally pre-seismic anomalies are extracted. Wavelet transform can decompose the time-series data into mutually independent frequency components almost without loss of information. Each component has a different dominant factor, thereby making the physical meaning much clearer than other methods. However, 25 the determination of high-and low-frequency signals in this method requires the geography-and geology-related knowledge with a certain arbitrariness. Although this algorithm achieves a high degree of decomposition and analysis of e.g. TIR signals, its mechanism that relates with earthquake preparation must to be further improved. Moreover, wavelet transform requires continuous time series of input data, but missing values in TIR data always appear due to cloud cover. Thus, the interpolation of missing data is required, which bring in uncertainties with regard to real values. 30

Multiparameter comprehensive analysis
Single precursor in pre-seismic monitoring research shows its limitations. Changes in various land, ocean and atmospheric parameters before and after strong earthquakes can be observed from satellite platforms, along with ground measurements.
Comprehensive utilization of multiple precursors provides another promising choice, which can raise the reliability and credibility of the relationship between different anomalies and imminent seismic events (Eleftheriou et al., 2016;Singh et al., 5 2010a). The preliminary step is to identify which precursors present anomalous variations prior to strong earthquakes around the epicenter. OLR, air temperature, and relative humidity were used in the seismic anomaly analysis for 2005 Mw 7.6 Kashmir and 2013 Mw 7.7 Awaran earthquakes (Mahmood et al., 2017). Singh et al. (2007) analysed changes in SLHF, atmospheric water vapor, SST, wind, cloud liquid water, precipitation rate and chlorophyll concentrations prior to and after 2004 Sumatra earthquake. In short, many geophysical parameters show abnormal variations prior to earthquakes to some extent. 10 Anomalous variations of parameters in the surface, atmosphere and ionosphere prior to earthquakes present interrelations with one another and show the existence of strong coupling effects (Singh et al., 2007). Multiparametric analysis of the 2008 Wenchuan earthquake by Jing et al. (2013) indicates different response times for different parameters. Anomalous OLR occurred first, followed by air temperature, relative humidity, and air pressure presented abnormal variations. Finally, SLHF anomaly occurred the day before the main shock. Wu, et al. used multiple parameters from reanalysis datasets and ground 15 measurements to interpret the potential LCAC process (Wu et al., 2016). In Fig. 11, the precursor parameters of soil moisture, soil temperature, total column water vapour and screen-level air temperature present quasi-synchronous anomalies in the time window between 29 March and 1 April 2009 preceding the Mw 6.3 L'Aquila earthquake. Anomalous multiple parameters also indicate that the process of energy transfer is caused by tectonic activities, which is from the deep of crust to the land surface and then to the atmosphere. Multiparametric analysis helps in better understanding the sequence of processes that occurs in 20 the lithosphere, hydrosphere and atmosphere (Mansouri Daneshvar and Freund, 2017).

Other methods
The night thermal gradient (NTG) method calculates the linear trend of the nighttime surface temperature using geostationary satellite LST data with high temporal resolution (Piroddi and Ranieri, 2012;Piroddi et al., 2014). In general, the surface temperature at night gradually decreases and reaches the minimum in the early morning, thus an NTG value greater than 0 is 25 regarded as an abnormal temperature increase. Furthermore, the most intensive anomalies of soil and air temperature are spatially coincident with TIR anomaly detected by NTG method (Wu et al., 2016). This approach is highly suitable for the geostationary satellite data that have continuous time series observations. The interquartile method is used to establish upper and lower boundaries of temperature changes. The values that exceed boundaries are considered anomalous values that are associated with possible near-future earthquakes (Akhoondzadeh, 2013;30 Saradjian and Akhoondzadeh, 2011). This method is only sensitive to the highest intensity anomaly, and its formula is as follows: where M is the median value, k is a threshold value, IQR is the interquartile range between 75th and 25th percentiles, and V is the observed value. This equation can be reformed as 5 The interquartile method is analogous to the ZS method, in which the average value and standard deviation are substituted by the medium value and interquartile range, respectively.
A variety of machine learning methods have been used in the anomaly detection, which show great advantages in various fields in recent years and may have a considerable potential in this field. Autoregressive integrated moving average, artificial 10 neural network (ANN), support vector machine, and adaptive network-based fuzzy inference system (ANFIS) have been compared for detecting thermal and total electron content anomalies (Akhoondzadeh, 2013). ANN model can handle well the noise in input data. ANFIS that integrates both ANN and fuzzy inference systems is more effective at modelling non-linear time series predictions than other methods. However, machine learning methods are always required to set certain empirical parameters, and appropriate selection of these parameters is a challenging task in phases of model training and prediction. The 15 Kalman filter method is an iterative optimization system with high predictive power and is also used for detecting surface temperature anomalies (Saradjian and Akhoondzadeh, 2011). This approach takes advantage of predicting nonlinear times series data with non-normal distribution, such as the occurrence of LST anomalies. These methods are more complex and have not been widely used in the anomaly detection as other above mentioned methods. The comparison among these methods with different independent properties should be conducted based on more earthquake cases in order to screen out more effective 20 approaches.

Issues in thermal anomaly detection
We review the possible precursors for pre-earthquake anomaly detection in Section 2, as well as state several frequently used anomaly detection methods in Section 3. Although these precursors and methods give us some uplifting results in which we can notice some significant anomalies prior to many earthquake cases, we still need to take some criticism and deeper insights 25 on them in order to achieve the further goal, i.e. forecasting moderate and strong earthquakes ahead. Accordingly, in this section, we point out the misunderstandings in the above stated studies from the perspective of data sources and several other aspects.
The anomalous phenomena often occur prior to various earthquake cases, whereas the features of these phenomena are often different. The interference of clouds is often highly serious, thereby resulting in faint and unstable evolution patterns in 30 18 anomalous areas, although certain cases show clear characteristics of this evolution. Even though many hypotheses have been used to explain the mechanism of pre-seismic anomalies, the physical mechanisms that drive various pre-seismic anomalies are yet insufficiently validated and consequently not widely accepted. Thus far, objective, stable, and testable detection methods or precursors are still unavailable for pre-seismic anomalies based on current research and technology. Therefore, the anomaly detection has not become a well-recognized technology in the scientific community. 5 1) Some important concerns of mentioned studies are the reasonable definition of "pre-seismic anomaly" and quantitatively measuring its quantity. Given that different definitions are used for different methods, data sources, or even earthquake cases, one method based on one type of data source may get a good result for one earthquake, but it may be invalid when using these standards in other earthquakes or the prediction. The arbitrary definition confuses anomalous signals by earthquake events and impedes identifying abnormal variations from complicated background signals. 10 2) Quantitatively validating results by anomaly detection methods is difficult partly because of the previous difficulty of the definition. Whether anomalies detected in one earthquake case are general or incidental phenomena is often controversial. One of the general strategies is based on the validation and confutation method for verification Lisi et al., 2010;Lisi et al., 2015;Xiong et al., 2013). In the validation phase, anomalies associated with earthquakes should be found; however, no anomalies should be confirmed in the confutation phase. Nevertheless, the validation and confutation approach 15 alone is not enough. Most research results are based on and limited to a handful of case studies. Therefore, obtaining convincing conclusion from statistical results is difficult when earthquake cases are insufficient in the statistics analysis (Pulinets and Dunajecka, 2007).
3) Providing complete spatiotemporal data is difficult because of atmospheric interference, cloud cover, and instrument performance. Satellite remote sensing on polar-orbiting or geostationary platforms provides the most important data sources 20 at regional or global scales, which are beneficial for global seismic monitoring studies. Surface data products, such as LST, can be used in analyzing pre-seismic anomalies to mitigate the atmospheric effect. However, the greatest weakness of TIR remote sensing is that it cannot penetrate clouds to obtain surface thermal radiation information. This weakness results in the existence of a large number of missing pixels in the data product, seriously affects anomaly detection, hinders the understanding of spatiotemporal evolution of pre-seismic anomalies in a broader time scale, and finally leads to many cases 25 of omission or false positive rates (Lisi et al., 2015). Therefore, addressing the significant issue of cloud interference is critical in the anomaly detection when monitoring the seismically active regions. 4) Other factors, including the surface (e.g., topography, geomorphology, and land covers) and atmosphere (e.g., cloud, precipitation, and wind), also strongly influence the change of surface thermal radiation (Chen et al., 2013;Chen et al., 2014).
Satellite thermal anomalies (e.g., air temperature or OLR anomalies) are vulnerable to local atmospheric conditions, such as 30 the inversion of air temperature, thereby resulting in inaccuracies in detecting earthquake anomalies. Thus, more attention should be given to the possibility that other causes other than seismic activities should be responsible for the extracted all kinds of anomalies. Especially, anthropogenic activities in factories and urban areas also have a significant effect on the anomaly detection (Piroddi and Ranieri, 2012). For instance, removing the normal fluctuations of thermal emission related to meteorological or artificial factors from the earthquake-imposed thermal anomalies is an urgent problem that must be solved. 5) Nowadays, the data with low spatial resolution have been successfully used in the anomaly detection, such as NOAA OLR 5 data with 2.5° spatial resolution (Liebmann, 1996). The spatial scale of OLR anomaly from 1°×1° NOAA data is approximately 1000-2000 km from epicenters (Xiong and Shen, 2017), which is an unexpected result for the seismic activity monitoring.
However, the use of satellite data with a high spatial resolution (e.g., 1 km) can provide richer and deeper information. This way, the spatial location of different anomalies can be better correlated with seismic activity areas, which is more conducive in studying the characteristics of spatial anomalies, rather than limited to single point or local areas. For example, high-or 10 low-temperature bands occur before rock bursts, and the main ruptures may occur along these band areas and result in temperature anomalies; however, data with coarse resolution are difficult to accurately capture such active tectonic positions.
6) The mechanism of pre-seismic anomalies is still inconclusive. Several mechanisms for generation of pre-seismic anomalies detected by satellites sensors have been proposed and are in discussion (Saraf et al., 2009;Tramutoli et al., 2013). The positive hole theory has been proposed to explain this phenomenon. The electronic charge carriers (positive holes) release when the 15 peroxy links break in the stressed rocks, arrive at the Earth's surface and lead to the ionization of air at the ground-air interface.
And the recombination of charge carriers at the surface leads to a spectroscopically distinct, non-thermal IR emission (Freund, 2011;Freund et al., 2009). Besides, a unified LAIC model is proposed in which the radon emission in fault zones plays an important role Pulinets and Ouzounov, 2011). Later, Wu et al. (2016) added the coversphere to the LAIC model after analyzing its importance in the understanding of mechanisms and geophysical processes in earthquake 20 preparation areas. The radon emission in fault zones plays an important role in the LAIC model, which however is physically impossible based on the p-hole model (Freund, 2011;Freund et al., 2009). Radon is very rare in the average crustal rocks.
Moreover, the measurements of radon emanation on the ground or in the underground shows that radon emission increases only by a factor of about 10 prior to an earthquake. These insufficient radon atoms cannot bring in a significant increase of the air conductivity by a factor 100-250. Therefore, further validation of these distinctive models is required from physical 25 simulation experiences and synergetic measurements of multiparameter in the field. This situation leads to weak physical mechanism of detection algorithms and arbitrary explanations about results of various anomalies. These defects seriously restrict the application and popularization of this approach.

Future development and perspectives
Thus far, no single measurable geophysical variable and data analysis method illustrates considerable potential for a 30 sufficiently reliable and effective operational earthquake forecasting practice. Forecasting moderate and strong earthquakes in 20 the future several months to days at the regional or global scale is of high social value, which is the advantage of satellite monitoring. Based on the discussion in the preceding section, the following perspectives are expected. 1) Passive microwave remote sensing can obtain surface BT data under clouds, which is less affected by atmospheric conditions. Although passive microwave data generally have lower spatial resolution (10-50 km in the nadir view) than that of TIR data (1-5 km), they remain sufficient for monitoring pre-seismic anomalies at a global scale (Lisi et al., 2015). The fusion of TIR 5 and passive microwave data is an important aspect for realizing the global observation of surface temperature (Duan et al., 2017;Wang et al., 2014). The data fusion technique can take advantage of the high precision and spatial resolution of satellite TIR data and the penetrating clouds ability of microwave data. The improvement of spatial resolution of prospective passive microwave sensors could be favorable to matching different data sources.
2) A variety of precursors has been used in the pre-seismic anomaly detection with different uncertainties. Thus, joint 10 application and analysis of multiple parameters could be a choice to enhance the robustness about the earthquake forecasting.
The results of multiparameter anomaly detection are combined to form a more reliable collective result, which reduces false anomaly recognitions and uncertainties, therefore improves the stability and accuracy of earthquake prediction. At the same time, ground measurements, such as surface and air temperatures along the most populous and active faults can provide a better understanding and supplement the missing data of satellite imageries due to persistent cloud cover. Furthermore, other 15 geophysical data, such as underground temperature, geomagnetism, gravity, properties of groundwater, and gases, can be combined with satellite data from the local, regional to global scales. This integrated study from different levels of the space, aerial platform, terrestrial surface, and underground will promote pre-seismic monitoring research (Venkatanathan et al., 2017).
3) A global-scale monitoring system based on a number of satellite data products, from polar-orbiting and geostationary satellites, can offer complete earthquake cases analysis that render a better solution for detecting and analyzing the signals 20 prior to major earthquakes. Furthermore, abundant satellite data, which are global long time series of continuous measurements, can support the sufficient long-term correlation research for various candidate precursors. Eleftheriou et al. (2016) analyzed M ≥ 4 earthquakes using OLR data in Greece for 10 years. Xiong and Shen (2017) also performed preliminary work on such global statistics of pre-seismic OLR anomalies. The statistical uncertainties can serve as a priori knowledge to determine weight coefficients of precursors in a multiparametric monitoring system. 25 4) In the near future, remote sensing data, which are applicable to seismic activity monitoring, will be further increased with higher spatial, temporal or spectral resolutions. Thus, emerging technologies can be integrated to improve the predictive efficacy. The ultrahigh spectral TIR data could be used to derive surface temperature, emissivity, atmospheric profiles of temperature and humidity, and concentrations of gas composition simultaneously. Applying these parameters in joint monitoring of the seismic activity is a possible development direction. New satellite project, such as TwinSat, which integrates 30 21 multiple observation technologies and ground measurements, can improve the understanding of seismic-triggered LAIC mechanism (Chmyrev et al., 2013). 5) Several methods can be synthetically applied to improve the efficacy of the anomaly detection. Because utilizing only a single approach to identify the pre-earthquake anomaly is not statistically robust (Bhardwaj et al., 2017). Moreover, integrating multiple methods has shown good potentialities in improving the capability of detecting actual anomalies. Given that 5 qualitative analysis is subjectively influenced, developing quantitative anomaly detection method is a promising direction, which is also beneficial in realizing the comprehensive application of various methods. (Akhoondzadeh, 2013;Saradjian and Akhoondzadeh, 2011). For example, probability models that can integrate multiple methods or even various precursor variables may provide a more credible prediction with quantitative statistical uncertainties, other than the qualitative voting selection.
Meanwhile, developing new algorithms based on novel data mining technique (e.g., machine learning), which explore the 10 possibility of enhancing the performance in earthquake prediction, can optimize this multimethod scheme.
6) The evaluation of anomaly detection algorithms can contribute to investigate the defects of methods and verify their validity.
Building the quantitative determination of indicators and anomaly-identifying standards in the evaluation is imperative, which is the basis for quantitatively estimating the efficacy of methods and comparability among them. According to these indicators and standards, the big data statistical analysis based on long-term series of remote sensing data and global earthquake cases is 15 an important means to establish the statistical relationship between pre-seismic anomalies and earthquake events. In view of mathematics, these steps can maximally eliminate subjectivity. Furthermore, the credibility of anomalies derived by satellite data can be improved only after several means of verification and in-depth reveal of the regularity and precursory characteristics of the earthquake. Analyses of detection results should combine with fault activities, crustal stress changes, and underground heat flow. The spatial distribution of various anomalies, such as LST anomaly, should be consistent with the fault 20 activity mode and its mechanical model. After excluding the influence of non-tectonic factors, the derived anomalies should be analyzed together with measurements from other means (e.g., meteorological data). More importantly, the proposed methods could be easily inspected by other researchers and ensure the reproducibility, thereby increasing statistical significance of detection results. Thus, anomaly detection criteria should be clearly established and strictly used in other earthquake case studies and subsequent earthquake prediction . 25   7) The study of geophysical mechanisms and development of theoretical models about pre-seismic anomalies should be strengthened or even updated. The numerical simulation based on knowledge of seismo-tectonics can be used to establish the relationship between anomalous signals and seismic events. The diagnostic index with practical value could be created based on this relationship, and the problem of anomaly index construction may be theoretically solved. Meanwhile, the former theories should also be examined with new minds and technologies. For example, the microfracturing theory is somehow 30 fragile (Freund, 2011). By the basic definition, fracturing can create new surfaces and is possible only if empty space is created 22 between the two sides of the crack. For the rocks at seismogenic depth (7-45 km and deeper), the overload is so large that the possibility of creating any micro-or macro-fracturing, is very small. In other word, the difference between hydrostatic and lithostatic pressure becomes so large that open porosity of rocks cannot be maintained. Besides, the synergistic observations of relevant parameters from underground to ionosphere in seismically active regions are necessary to validate these theoretical models. 5 Yet still, further efforts on satellite anomaly detection are required to study the physical mechanism, new satellite data, and anomaly detection approaches. Although various techniques with different measurements and approaches have been used in this field for decades and show the potential and possibility for the recognition of seismic precursors, no any technique has the full ability to forecast imminent strong earthquakes in a short-term (e.g., a few weeks) or long-term (e.g., several years) for now. And current pre-seismic anomalies monitoring techniques from optical satellite observations could also not forewarn the 10 earthquake or reduce the seismic risk in practice. Despite the difficulties in pre-seismic monitoring, previous studies have shown that to some extent, a link exists between a variety of anomalies of satellite observations and seismic activities.
Therefore, the practicality of the anomaly detection in earthquake monitoring could be advanced from many perspectives based on existing research foundation. The anomalies which might be induced by seismic activities theoretically should be contained within the temporal and spatial distributions of satellite spectral information. Thus, the researches on the pre-earthquake 15 anomalies monitoring from satellite platforms are of high scientific interest with the rapid development of remote sensing technology. Analyzing the correlation between different anomalies and seismicity provides a promising way for short-term earthquake prediction and accelerates the present capability on seismic hazard assessment and early warning. Earthquake Dynamics, Institute of Geology, China Earthquake Administration) who discussed the physical mechanism of the pre-seismic anomalies with us and helped to modify the corresponding sections of this manuscript. Special thanks are also given to two reviewers for their valuable comments and suggestions that greatly improve this paper.

Conflicts of Interest:
The authors declare no conflict of interest. 30      surface air temperature at a height of 2-meter (d) (Wu et al., 2016).