Based on a novel estimation of background-error covariances for assimilating
Argo profiles, an oceanographic three-dimensional variational (3DVAR) data
assimilation scheme was developed for the northwestern Pacific Ocean model
(NwPM) for potential use in operational predictions and maritime safety
applications. Temperature and salinity data extracted from Argo profiles
from January to December 2010 were assimilated into the NwPM. The results show that the average daily temperature (salinity) root
mean square error (RMSE) decreased from 0.99
Operational prediction systems for forecasting waves, currents and sea level variations are fundamental for maritime safety. They serve a wide range of applications, such as search and rescue, oil spill response, tourism-oriented bulletins and climate change monitoring, in some cases after the forecasts are downscaled into coastal hydrodynamic models. The Chinese Global Operational Oceanography Forecasting System (CGOFS), which is run at the National Marine Environmental Forecasting Center of China (NMEFC), predicts physical properties of the global oceans, such as temperature, salinity, currents, waves and sea ice. The CGOFS consists of a suite of nested model configurations. The operational northwestern Pacific Ocean model (NwPM) is a regional model within the CGOFS. It is based on the Regional Ocean Modeling System (ROMS), a free-surface, primitive equation ocean circulation model formulated using terrain-following coordinates. The NwPM produces daily analyses and forecasts, up to 5 days ahead, of the main ocean variables and provides boundary conditions for the East China Sea model (ECSM) and the South China Sea model (SCSM). It is thus a fundamental component of the operational marine environment and disaster forecasting chain and alert systems developed at the NMEFC.
Ocean forecasts require the specification of initial and boundary (at the surface and laterally) conditions. The accuracy of the forecasts depends on the accuracy of the initial and boundary conditions. While lateral and surface boundary conditions are usually taken from a global (or coarser resolution) ocean model and meteorological analyses and forecasts, respectively, data assimilation is a widely used and effective way of producing the best estimates of the state of the physical system to be used as initial conditions in the prognostic model.
Many data assimilation methods have been developed for combining model and observational data. These can be broadly split into three approaches: the Kalman filter and derived schemes, generally known as sequential schemes (Daley, 1991); optimal interpolation; and three-dimensional and four-dimensional variational methods (3DVAR and 4DVAR; Lorenc, 1986). The variational methods are based on the minimization of a cost function that weights the differences between the analysis and the observations and the differences between the analysis and a priori knowledge of the state of the ocean, namely the background, typically taken from a previous forecast. The ensemble Kalman filter (EnKF) was introduced by Evensen (2003) to avoid the explicit temporal propagation of error covariances in the Kalman filter, replacing it with covariances derived from an ensemble system. Depending on the resolution of the ocean configuration and the available computational resources, the EnKF may not be suitable for operational forecasting systems. As an approximation of the EnKF, an ensemble optimal interpolation (EnOI) scheme was applied to ROMS to assimilate along-track sea level anomaly (SLA) data (Lyu et al., 2014). ROMS is also equipped with a 4DVAR assimilation method (Tshimanga et al., 2008; Moore et al., 2011), which might be too computationally demanding for high-resolution configurations in the northwestern Pacific Ocean. 3DVAR represents a sound compromise between the sophistication and the computational requirements of the assimilation scheme.
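As an illustration of the variational idea described above, the following is a minimal sketch of a 3DVAR analysis on a toy three-level temperature column with a single observation. It assumes a linear observation operator, for which the minimizer of the cost function has a closed form. The matrices `B`, `R` and `H`, the numerical values, and the function names are hypothetical and are not taken from the system described in this paper.

```python
import numpy as np

def cost(x, xb, y, H, B_inv, R_inv):
    """Standard 3DVAR cost: background term plus observation term."""
    db = x - xb          # departure from the background
    do = H @ x - y       # departure from the observations
    return 0.5 * db @ B_inv @ db + 0.5 * do @ R_inv @ do

def analysis(xb, y, H, B, R):
    """Closed-form minimizer of the cost for a linear observation operator."""
    d = y - H @ xb                                   # misfit (innovation)
    K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)     # gain matrix
    return xb + K @ d

# Toy setup (all values are illustrative assumptions).
xb = np.array([20.0, 15.0, 10.0])          # background temperature at 3 levels
H = np.array([[0.0, 1.0, 0.0]])            # one observation at the middle level
y = np.array([16.0])                       # observed temperature
levels = np.arange(3)
B = np.exp(-0.5 * (levels[:, None] - levels[None, :]) ** 2)  # Gaussian vertical covariance
R = np.array([[0.25]])                     # observation-error variance

xa = analysis(xb, y, H, B, R)
B_inv, R_inv = np.linalg.inv(B), np.linalg.inv(R)

# The gradient of the cost function should vanish at the analysis.
grad = B_inv @ (xa - xb) + H.T @ R_inv @ (H @ xa - y)
```

In this toy case the observation corrects the middle level directly and, through the off-diagonal terms of `B`, spreads a smaller correction to the neighboring levels. This is the role the background-error covariance plays in the full system.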
Operational oceanography has benefited from the development of data assimilation schemes such as EnOI, EnKF and 3DVAR, which have led to improved forecast skill scores. Variational schemes have two main advantages over other methods. First, the variational solution uses all observations simultaneously, whereas the EnOI technique requires the data to be selected within artificial subdomains. Second, balance constraints, e.g., geostrophic and hydrostatic balances, can be embedded in the balance operators that are implicit in the background-error covariance formulation (e.g., Weaver et al., 2005).
Despite such advantages, variational methods still have weaknesses. For instance, because both imperfect observations and prior (e.g., background) information are inputs to the assimilation system, the quality of the output analysis depends crucially on the appropriateness of the prescribed errors. Unlike in the EnKF, these errors are usually defined as stationary and include only seasonal variations.
The 3DVAR data assimilation method is widely
used in oceanic operational forecasting systems at both global (e.g., Storto
et al., 2011; Waters et al., 2015) and regional scales (e.g., Li et al., 2008;
Dobricic and Pinardi, 2008). In this study we adapted an oceanographic
three-dimensional variational data assimilation scheme called OceanVar
(Dobricic and Pinardi, 2008) to the ROMS model in order to assimilate
temperature and salinity (
To better illustrate and evaluate the performance of the assimilation scheme, the NwPM is run at an eddy-resolving resolution. This system will be used in the future to increase the quality of initial conditions for daily forecasts, whose production has already started within CGOFS (v1.0).
The paper is organized as follows. Section 2 describes the components of the data assimilation scheme for assimilating Argo profiles in the northwestern Pacific. The results from data assimilation experiments are presented in Sect. 3, focusing on the performance of 3DVAR. Section 4 discusses the performance of the system in an operational configuration. Finally, Sect. 5 presents the conclusions.
The ocean model used in this work is ROMS (Schepetkin and McWilliams, 2005;
Malcolm et al., 2009), a free-surface and primitive equation ocean
circulation model formulated using terrain-following coordinates, which is
widely used in oceanographic studies (Wang et al., 2012; Lyu et al., 2014).
The model domain is the northwestern Pacific Ocean, which extends from
8
The model is initialized from rest and spun up for 10 years using the monthly climatological air–sea fluxes from the Comprehensive Ocean–Atmosphere Data Set (COADS; Clark et al., 1996) in order to obtain a fairly stable initial state. From January 1990 to December 2009, momentum and buoyancy air–sea fluxes were derived from the 6-hourly NCEP Climate Forecast System Reanalysis (CFSR) product (Saha et al., 2010). The model configuration implements open boundary conditions. Water level, temperature, salinity and velocity at the open boundaries are derived from Simple Ocean Data Assimilation (SODA; Carton and Giese, 2008). Monthly climatological freshwater inflows from the Yangtze River, Pearl River and Mekong River with zero salinity and monthly varying temperatures are prescribed at the upstream boundaries. According to previous validation exercises (Wang et al., 2016), the model simulates the northwestern Pacific Ocean well, especially in the subtropical Pacific region. The initial conditions for both the control (i.e., without assimilation) and the assimilation experiments are provided by the simulated ocean state valid on 1 January 2010. The control experiment for 2010 provides a basis for comparison.
The assimilation corrections are performed daily, using all the Argo profiles in the previous 1-day assimilation time window. The ocean model is then used to advance the ocean fields 1 day forward in time.
There are three main steps in the system, as shown in Fig. 2: (a) preparation of temperature and salinity observations from Argo profiles; (b) integration of the NwPM model, using the previous analysis increments to correct the initial conditions, and calculation of misfits; (c) running the data assimilation system to produce the analysis increments for the next model integration. The misfits are computed online by the model during step (b). As the misfits come from Argo floats and are evaluated during the model integration (i.e., before being incorporated into the data assimilation system in the next analysis step), they represent a fairly independent dataset for validation, because successive measurements from a float are sampled at different locations. Thus the temporal correlation of the observational error can reasonably be assumed to be negligible.
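The cycle described above can be sketched with a toy scalar model. This is only an illustration of the ordering of the steps (integrate from the corrected initial state, evaluate the misfit before the observation is assimilated, then compute the increment for the next integration). The relaxation model `step`, the constant `gain`, the biased forcing and all numerical values are hypothetical stand-ins for the NwPM and the 3DVAR analysis.

```python
import numpy as np

def step(x, forcing, rate=0.2):
    """Hypothetical one-day model: relax the state toward a forcing value."""
    return x + rate * (forcing - x)

def run_cycle(x0, obs, forcing, gain):
    """Daily cycle: apply previous increment, integrate, compute misfit, analyse."""
    x, increment, misfits = x0, 0.0, []
    for y in obs:
        x = step(x + increment, forcing)   # step (b): integrate from corrected state
        d = y - x                          # misfit, evaluated before assimilating y
        misfits.append(d)
        increment = gain * d               # step (c): increment for the next day
    return np.array(misfits)

def rmse(d):
    return float(np.sqrt(np.mean(d ** 2)))

# Synthetic "truth" and observations (step (a)); the model forcing is biased.
rng = np.random.default_rng(0)
truth, x_t = [], 15.0
for _ in range(30):
    x_t = step(x_t, forcing=18.0)
    truth.append(x_t)
obs = np.array(truth) + rng.normal(0.0, 0.1, size=30)

misfits_ctrl = run_cycle(15.0, obs, forcing=20.0, gain=0.0)  # no assimilation
misfits_da = run_cycle(15.0, obs, forcing=20.0, gain=0.5)    # with assimilation
```

Comparing `rmse(misfits_da)` with `rmse(misfits_ctrl)` mirrors the comparison between the assimilation and control experiments, with the misfits serving as the validation dataset.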
Flow chart of the system:
Argo is a global array of free-drifting profiling floats that measures the
temperature and salinity of the upper 2000 m of the ocean. Figure 1b and c show the horizontal distributions and temporal evolutions of the
Argo profiles in 2010, respectively. The profiles are quality-controlled and
disseminated by Ifremer/CORIOLIS (Cabanes et al., 2013). The Argo network
provides a fair coverage in the northwestern Pacific region and south of the
Sea of Japan. Only a few Argo profiles are available in the northern region of
the South China Sea. From January to December 2010, there were 9101
(1011064)
The basic goal of the 3DVAR system is to provide an “optimal” estimate of
the true oceanic state at analysis time through solving the assimilation
problem by minimizing the following cost function (e.g., Ide et al., 1997):
The background term of the cost function is pre-conditioned via a control
variable transformation (Lorenc, 1988); i.e., the cost function is minimized
over the control variable
Distribution of yearly mean background-error standard deviation
reconstructed from the EOFs:
Vertical distribution of misfits for temperature (
The specification of the background-error covariance matrix is one of the
most important aspects affecting the performance of any variational data
assimilation system. Therefore, an appropriate background-error covariance
matrix is crucial for our 3DVAR system. The formulation of the background
term of the cost function is described in Dobricic and Pinardi (2008). The
background-error covariance matrix is decomposed into horizontal
correlations and vertical covariances, which are assumed to be independent
of each other, i.e., separable, namely
For the vertical component of the background-error covariance matrix,
monthly multivariate empirical orthogonal functions (EOFs) are used as in
Barker et al. (2004), namely
For the northwestern Pacific region, seasonal differences of the model errors
are large. Therefore, we adopted monthly sets of EOFs to construct the
vertical background-error covariance matrix. Each monthly set consists of 20 EOFs with 100
Figure 3 shows the map of the yearly mean background-error standard
deviation reconstructed from the EOFs, where Fig. 3a refers to sea level
(m), Fig. 3b to temperature (
This section reports the results of the assimilation of in situ data from January to December 2010. We discuss the validation of the assimilation experiment and simulation (or control) experiment, where the simulated fields and the analysis fields are called SFs and AFs, respectively.
Argo profiles are used to validate the multivariate assimilation scheme. Argo misfits are independent observation–model departures because they are evaluated before being incorporated into the assimilation system.
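A minimal sketch of how such misfits could be summarized by depth is given below. It assumes the misfits are available as flat arrays of depth and observation-minus-model values. The function name, the bin edges and the sample numbers are hypothetical, and the mean misfit is used here as a simple bias measure.

```python
import numpy as np

def misfit_stats_by_depth(depth, misfit, bin_edges):
    """Return (rmse, mean misfit) of the misfits in each depth bin."""
    depth = np.asarray(depth, dtype=float)
    misfit = np.asarray(misfit, dtype=float)
    # digitize returns i with bin_edges[i-1] <= depth < bin_edges[i]
    idx = np.digitize(depth, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    rmse = np.full(n_bins, np.nan)
    bias = np.full(n_bins, np.nan)
    for k in range(n_bins):
        sel = idx == k          # values outside the edges match no bin
        if sel.any():
            rmse[k] = np.sqrt(np.mean(misfit[sel] ** 2))
            bias[k] = np.mean(misfit[sel])
    return rmse, bias

# Illustrative values only: four misfits in two depth bins (0-500 m, 500-2000 m).
depth = [10.0, 100.0, 600.0, 1500.0]
misfit = [1.0, -1.0, 0.5, -0.5]
rmse_profile, bias_profile = misfit_stats_by_depth(depth, misfit, [0.0, 500.0, 2000.0])
```

In this example the misfits in each bin cancel in the mean but not in the RMSE, which is why both statistics are useful when assessing vertical error structure.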
To validate the performance of the assimilation system, the vertical
distribution of
In order to further validate the vertical distribution of
The vertical RMSE and AE for temperature (
To investigate the performance of the assimilation system over time, we
compared the temporal evolution of
Temporal evolution of temperature (
To evaluate the performance of the assimilation system on temperature,
independent satellite sea surface temperature data (OISST,
1/4
Monthly mean temperature bias at surface (in
To evaluate the performance of the assimilation system on sea level,
independent Archiving, Validation and Interpretation of Satellite
Oceanographic data (AVISO, 1/4
Monthly mean sea level anomaly (SLA) bias (in meter) for (1) January,
(2) April, (3) July and (4) October:
Figure 9 shows the time evolution of SLA RMSE calculated from AF and SF. On
average, the RMSE decreases from 13.1 cm in the SF run to 12.8 cm in the AF
run. Because the greatest errors occur in the region of the Kuroshio
Extension as shown in Fig. 8, the RMSE decreases from 10.6 cm in the SF run
to 9.6 cm in the AF run, i.e., a 9.4 % reduction, when this region
(30–40
Temporal evolution of SLA RMSE (in meters). RMSE computed with the AVISO dataset for the AF (solid line) and SF (dotted line). The black line represents the RMSE in the entire domain and the gray line represents the RMSE in the region without the Kuroshio Extension (NKE).
Consistency checks were carried out by comparing the SF and AF monthly mean
temperature and salinity with the EN4.0.2 (1
To validate the sea surface salinity (SSS), Fig. 10 shows the monthly mean SSS bias computed from SF (Fig. 10a) and AF (Fig. 10b) with respect to EN4.0.2, where the numbers have the same meanings as in Fig. 7. As shown in Fig. 10, there are many regions, especially near the Equator, with a bias larger than 0.3 psu for both the SF run and the AF run. The results show that the SSS bias is reduced, on average, from 0.363 to 0.308 psu (i.e., a reduction of 15.2 %) in October (Fig. 10a4 and b4). Due to the lack of observations, no significant improvements occur in the South China Sea.
Monthly mean salinity bias at surface (in psu): (1) January, (2) April, (3) July and (4) October:
Distribution of temperature (in
To validate the vertical properties of the system, temperature and salinity
sections of 137
Distribution of salinity (in psu) at transect of 137
Vertical
Temporal evolution of temperature
Following the assimilation scheme in Sect. 2.1, the Argo misfits are independent observations that can be used for validation. To evaluate the full potential of the system, another analysis and forecast scheme was set up, illustrated by the dotted line in Fig. 2. In contrast to the original scheme (AF), the analysis increments are used to correct the initial conditions of the model at the beginning of the assimilation time window. Therefore, two sets of misfits are available: before (bDA) and after (aDA) the data assimilation correction, i.e., before and after the observations are incorporated into the system. To investigate the performance of this scheme, we assimilated Argo profiles in December 2010. A control run (CTRL) without data assimilation was also performed for validation purposes.
Figure 13a, b show the RMSE vertical profiles of temperature and salinity
for the AF, bDA, aDA and CTRL runs. Here the profiles are not independent
because they have been already assimilated into the model. The aDA therefore
provides a consistency check of the assimilation system rather than
independent validation metrics. For temperature, the data assimilation led
to a large improvement within the top 800 m. On average, the RMSE for
temperature was 0.58
Figure 13c, d show the time evolution of
In order to discuss the performance of data assimilation in operational
configurations, an experiment was set up for 2–31 December 2010. The
analysis frequency was daily, with daily forecasts of up to 5 days.
Figure 14 shows the temporal evolution of
We have implemented a 3DVAR scheme in ROMS that assimilates temperature and salinity observations from Argo profiles. This work represents a first step towards a fully operational analysis and forecast system developed at NMEFC for use in maritime safety applications. The data assimilation system was implemented in an eddy-resolving configuration of the northwestern Pacific from January to December 2010. A specific feature of our 3DVAR system is the separation of the background-error covariance matrix into vertical and horizontal modes in order to reduce the size of the data assimilation problem. Horizontal correlations are modeled as Gaussian functions through a first-order recursive filter, while vertical covariances are estimated from a long-term model simulation and formulated as monthly sets of EOFs.
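To illustrate how a first-order recursive filter can approximate a Gaussian correlation function, the following one-dimensional sketch applies repeated forward and backward sweeps to a unit impulse. The filter coefficient is derived here by matching the second moment of the filter kernel to that of a Gaussian with standard deviation `length_scale`. The function names, grid, number of passes and boundary treatment are illustrative assumptions and are not taken from the OceanVar code.

```python
import numpy as np

def rf_alpha(length_scale, dx, n_passes):
    """Filter coefficient such that n_passes forward-backward sweeps give a
    kernel whose variance equals length_scale**2 (second-moment matching)."""
    e = n_passes * dx ** 2 / length_scale ** 2
    return 1.0 + e - np.sqrt(e * (e + 2.0))

def recursive_filter_1d(field, alpha, n_passes):
    """Apply n_passes forward and backward first-order recursive sweeps."""
    out = np.asarray(field, dtype=float).copy()
    for _ in range(n_passes):
        for i in range(1, out.size):             # forward sweep
            out[i] = alpha * out[i - 1] + (1.0 - alpha) * out[i]
        for i in range(out.size - 2, -1, -1):    # backward sweep
            out[i] = alpha * out[i + 1] + (1.0 - alpha) * out[i]
    return out

# Filter a unit impulse placed far from the boundaries (illustrative values).
n, dx, length_scale, n_passes = 401, 1.0, 10.0, 4
center = n // 2
impulse = np.zeros(n)
impulse[center] = 1.0

alpha = rf_alpha(length_scale, dx, n_passes)
kernel = recursive_filter_1d(impulse, alpha, n_passes)

# Second moment of the resulting kernel, to compare with length_scale**2.
x = (np.arange(n) - center) * dx
kernel_variance = float(np.sum(x ** 2 * kernel) / np.sum(kernel))
```

The resulting kernel is symmetric and bell-shaped, and more passes bring it closer to a Gaussian at a cost that grows only linearly with the grid size. This efficiency is one reason recursive filters are a common choice for modeling horizontal correlations.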
After assimilating the Argo profiles, the average daily temperature
(salinity) RMSE decreased from 0.988
Satellite-derived SST from OISST, sea level anomalies from AVISO, and
temperature and salinity objective analyses from EN4.0.2 were collected for
validation. A comparison with these datasets showed
that the data assimilation provides a beneficial effect for the sea level,
temperature and salinity at the surface in the model region. By comparing
the assimilation experiment with the reprocessed dataset, the data
assimilation provided a good reproduction of the vertical structure across
the data-rich transect at 137
The potential of the data assimilation system was also discussed by
assessing the assimilation experiments with validating observations before
and after their ingestion in the system. The results show that the minimum
RMSE the assimilation system is able to reach is
The assimilation system was also tested in an operational framework for a 1-month period, where daily analysis cycles and 5-day forecasts were produced. The 3DVAR initialization improved the short-term predictability in the northwestern Pacific Ocean. It led to skill scores that beat those of a non-assimilative experiment for all 5 forecast days.
Overall, the 3DVAR assimilation system performed well in the assimilation experiment. These results encourage the implementation of the system in an operational environment for maritime safety applications. In further experiments, we plan to extend the assimilated observing networks to include sea level anomaly and sea surface temperature data from satellites. This may alleviate the biases occurring in the mesoscale-active region of the Kuroshio Extension, which are due to inaccuracies in the air–sea exchange fluxes and limitations in capturing the eddy-dominated ocean dynamics in that region.
The Argo profiles, which are quality-controlled and disseminated by Ifremer/CORIOLIS, are described in Cabanes et al. (2013) and cover the period from 1995 to the present. The atmospheric forcing data are from CFSR/NCEP (Saha et al., 2010) and cover the period from 1979 to the present. The SODA dataset is described in Carton and Giese (2008) and is available from 1970 to 2010. The bathymetry data are obtained from the British Oceanographic Data Centre (BODC). This version of the GEBCO_08 grid was released in November 2010 with a version code of 20100927. Information on the dataset is given by Hall (2002). The satellite daily OISST is described in Reynolds et al. (2007). The MGDSST is provided by the Japan Meteorological Agency (JMA) and is described in Kurihara et al. (2006). The reprocessed dataset is EN4.0.2, which is described in Good et al. (2013) and covers 1900 to 2015. The SLA dataset, which is provided by AVISO, is described in Le Traon et al. (2003).
This work was supported by the National Natural Science Foundation of China under contract nos. 41222038 and 41206023, the National Basic Research Program of China (973 program) under contract no. 2011CB403606, and the Strategic Priority Research Program of the Chinese Academy of Sciences through grant no. XDA1102010403.
Edited by: A. Olita
Reviewed by: B. Powell and three anonymous referees