This article describes the architecture of the sea situational awareness (SSA) platform, a major asset within TESSA, an industrial research project funded by the Italian Ministry of Education and Research. The main aim of the platform is to collect, transform and provide forecast and observational data as information suitable for delivery across a variety of channels, like web and mobile; specifically, the ability to produce and provide forecast information suitable for creating SSA-enabled applications has been a critical driving factor when designing and evolving the whole architecture. Thus, starting from functional and performance requirements, the platform architecture is described in terms of its main building blocks and the flows among them: front-end components that support end-user applications, and map and data analysis components that serve maps and answer data queries.
Focus is directed to key aspects and decisions about the main issues faced, like interoperability, scalability, efficiency and adaptability; insights about future work on this and related subjects are also considered. Some analysis results are provided in order to better characterize critical issues and their solutions.
TESSA (TEchnology for the Situational Sea Awareness) is an industrial research project under the National Operative Program “Ricerca & Competitività 2007–2013” of the Italian Ministry for Education, University and Research. TESSA aims at strengthening and consolidating services belonging to operational oceanography in Italy and, in particular, in its southern seas. TESSA integrates weather (e.g., wind, air temperature, precipitation, cloud cover, pressure), marine (e.g., wave height, period and direction) and ocean forecasts (e.g., currents, sea temperature) and analyses with advanced technological platforms that will allow an unprecedented dissemination of environmental information for “situational awareness” at sea. Situational awareness is a relatively recent concept and there has always been a lack of an agreed-upon definition because of the context-dependent nature of the concept (Sarter and Woods, 1991); Endsley and Garland (2000) defined it as “the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning and the projection of their status in the near future”. Endsley's definition introduces a theoretical approach to situational awareness based on information processing: it consists of a three-level model that can be considered as a chain composed of information perception, interpretation and prediction. The first level consists of the perception of elements in the environment, i.e., the simple initial collection of raw data, with no interpretation. Information interpretation is the second stage: collected data are integrated and analyzed in order to understand what is happening. The last stage is about the prediction of future status: based on data interpretation, it consists of the ability to predict the future status of elements in the environment.
The three-level model of situational awareness was developed initially in the aviation domain, but it has also been applied in many other safety-critical domains (Stanton et al., 2001), such as the maritime sector.
Within TESSA, we define sea situational awareness (SSA) as the capability to provide information about present and future sea conditions to support people during their activities at sea. This topic is strategically important for safe and optimal navigation, search and rescue, the assessment of the good environmental status of the marine ecosystem and the management of the sea territory. The lack of knowledge and awareness about sea conditions reduces readiness for reacting to emergencies and protecting the marine environment, with large socioeconomic impacts and damages. With the “Marine Knowledge 2020” communication (Green Paper, 2012), the European Commission outlined the importance of marine knowledge in helping EU member states meet targets on employment, innovation, education, social inclusion and combating climate change as outlined in the “Europe 2020” strategy. Marine knowledge would provide the foundation to facilitate the growth of a sustainable, job-creating “blue economy” in marine and maritime sectors by improving the competitiveness and efficiency of industry, public authorities and researchers.
It has been estimated that marine data fragmentation and inaccessibility cause a loss of EUR 300 million per year, an estimate that does not take into account the future growth of the marine economy and the consequent increased demand for data. Opening up marine data allows new operators to enter the market, while cross-system interoperability allows businesses and academics to develop new products and services based on data from different sources and of different types; the economic impact of such interoperability has been estimated to be on the order of EUR 200 million per year. In addition, uncertainty is one of the main problems for those responsible for designing offshore structures, managing fish stocks or designing marine protected areas. For example, it has been estimated that a 25 % reduction in the uncertainty of future sea level rise would allow public authorities responsible for coastal management to save approximately EUR 100 million per year.
A search for related or similar work on operational oceanography and SSA identified two broad research and operational fields. The first deals with the development and operation of services for intermediate users: individuals or organizations that need raw data in order to perform value-added analyses. Activities in this field are centered on designing and developing platforms able to produce raw data (observations or the output of forecasting models) and make them available to users, like research organizations, so that they can perform further processing and deliver new services to other end users. Studies in this context include Bahurel et al. (2009) and Moussat et al. (2016), describing the development of hardware infrastructure and software appliances for collecting data and making them available for discovery (e.g., a metadata service catalogue) and download.
The second research field is focused on the design and development of services directly oriented to creating and delivering value for end users. Projects in this field generally support well-defined scenarios, like oil spill (Zodiatis et al., 2016) or ship routing (Mannarini et al., 2016a). In those cases, stand-alone clients (like mobile apps) or dynamic websites are created and maintained in order to allow stakeholders to use the service.
In the land domain, there are many examples of IT solutions in the realm of situational awareness, developed with the aim of delivering information and services about the environment to stakeholders to support their decision making. Many works refer to platforms adopting a Web 2.0 visual analytics approach and GIS technology: this kind of solution has been applied both in web-based (Liu et al., 2009) and in mobile (Assilzadeh et al., 2010) situational awareness contexts.
From a technological point of view, one of the core activities within the TESSA project has been the development of a common platform for providing services and decision support systems (DSS). This base infrastructure has been designed to be extended and to support a variety of complex and different scenarios, like weather and marine forecast, ship routing, extreme conditions early warning, environmental quality, oil spill movement forecasts and search and rescue operations.
Therefore, the novelty of the TESSA project (and, in turn, the SSA platform) resides in the creation of a shared infrastructure designed to be extended by additional applications, either produced in the context of the project itself or by third parties. More specifically, the DSS that have been developed take full advantage of the existing ecosystem (which provides support for common concerns like security and specific map services), speeding up the development of each single service while increasing the total value of the SSA platform itself. This enables the platform to act as a single entry point through which professional and non-professional users can benefit from a collection of highly focused services, each dealing with a different aspect of SSA.
The TESSA project was born from the collaboration between advanced oceanographic research, scientific computing and information technology groups, aimed at developing advanced technological platforms (web-based, multi-target and multi-channel) for the production and provisioning of detailed information about sea conditions at different complexity levels, and at creating DSS software for ship routing, extreme conditions early warning, environmental quality, oil spill movement forecasts and search and rescue operations. Such a wide and complex set of scenarios has been addressed by designing and developing a single technological platform able to provide information services about sea conditions to multiple domain-specific applications.
This paper describes the main driving factors of the system and the most innovative solutions that have been implemented in order to satisfy the outlined requirements. Section 2 describes the motivations and objectives behind this system, Sect. 3 introduces related work in similar fields, Sect. 4 presents the general architecture, including front-end and map-related infrastructure, while Sect. 5 describes the data analysis subsystem. Section 6 provides insights about the issues faced in implementing a scalable platform, with particular emphasis on the map rendering issues. Finally, Sect. 7 outlines conclusions and future works.
The context of interest for SSA is wide, ranging from private activities like leisure (tourism, diving), sports and fishing to institutional activities like environmental protection, search and rescue, intervention in case of oil spill and business like safe navigation and offshore activities.
In that context, it becomes very important to properly and effectively deliver information about sea conditions. Within the TESSA project we translated this issue into the following objectives:

- Information is made available everywhere and anytime. Users would like to access information about sea conditions 24 h a day, 7 days a week, and also have data that are contextualized to the place where they are (“Tell me what the sea conditions in Otranto will be in the next three hours”).
- Information about present and future sea conditions is as easy as possible to understand: for example, for services targeted at the general public, by endorsing easy-to-use user interface (UI) paradigms to convey scientific information and, at the same time, by adopting conventions (like the naming of variables or of units of measurement) that are publicly recognized within scientific and professional communities.
- Information and services are provided both to end users and to service providers that will further transform sea condition data into other information or services.
- Information and services are provided via the most widespread devices (like smartphones or tablets), through easy-to-use services in line with modern web and mobile applications.

All those points have been considered as the main concerns in designing and implementing the SSA platform, a technological solution for delivering data and services for improved sea situational awareness. Those goals have been translated into the following system requirements:

- to transform data about present and future sea conditions into georeferenced information like maps, graphs and other additional knowledge not evident from raw data;
- to automatically download, process and publish environmental data every day;
- to provide different information formats, on the basis of channel capabilities (e.g., different bandwidths between web and mobile) or device features (e.g., display dimensions), minimizing publishing and downloading time;
- to provide interoperable services, making them available through de facto standards like Google Maps' Tile Map Service.
An example of an easy-to-use service is SeaConditions (Coppini et al., 2016), which provides ocean and weather forecasts for the Mediterranean Sea across different channels, like modern web browsers and native apps for the most widely available mobile platforms (Android, iOS). By itself, SeaConditions provides a custom user experience, exposing its own business logic and allowing the user to access well-defined use cases. This is made possible by the availability of a general software stack, called the SSA platform, designed to be reused in many situations envisioned by the TESSA project (like displaying maps).
Hence, the SSA platform is a software infrastructure that has been developed in the context of the TESSA project in order to collect, transform and provide forecast and observational data as information suitable for creating multi-channel applications like SeaConditions and VISIR (Mannarini et al., 2016b). Data analysis and presentation with care for efficiency, reliability and interoperability have been the main requirements driving the platform design and implementation. During the project lifespan, several prototypes have been developed to overcome the challenges that emerged along the way, mostly concerning the huge amount of data that has to be processed and provisioned. In this way, the architecture has emerged through constant evolution driven by new functional and nonfunctional requirements, instead of being a complete up-front design.
While many other toolchains for rendering tiled maps exist, both commercial and open source (like Mapnik), the SSA platform adopts a rendering chain developed ad hoc for TESSA, described in the following sections.
Figure 1. SSA platform architecture (color key: blue for front-end components, green for map components, red for external modules or externally produced data, brown/orange for the DSS-specific components and interactions).
Within the TESSA architecture, three main tiers can be identified (Fig. 1): the client tier (e.g., native apps downloaded from the respective mobile platform's store or web pages running within browsers' windows) representing client applications (like SeaConditions), the complex data analysis module (CDAM) tier, hosting forecast and DSS-specific models, and the SSA platform, processing the forecast data from the CDAM tier and provisioning map data to the client tier. The SSA platform communicates with the CDAM by fetching weather and marine forecast data as needed and by managing job submissions on behalf of the DSS applications.
At a glance, the SSA platform provides the building blocks for creating SSA applications, like standard services (e.g., the static and dynamic map services), hosting infrastructure (e.g., web server and messaging support), multi-channel availability (based on standard Internet protocols) and a personalized user experience (e.g., sharing user preferences across several applications and channels).
In order to make the data flow serviceable, there are several specialized software components involved, each one performing a well-defined task within the processing chain. These components can be grouped in two macro-areas: (1) front-end components, hosting applications and user-centered data, and (2) map and data analysis components, dealing with serving maps and querying data.
Front-end components include a web portal, which hosts the server and web client UIs of DSS applications, a message broker, which enables communication between the DSS and their CDAM counterparts, and a user database hosting user preferences (like application settings and favorite places).
Map components include a download daemon which fetches data from the CDAM, a batch rendering system which performs initial ingestion within the system, a map service featuring tiled maps for forecast data, and a map server serving static maps like bathymetry. Data analysis is provided by software modules that allow for on-the-fly queries of forecast data, according to a variety of patterns. The greater part of these components has been developed ad hoc for TESSA, leveraging open source software stacks and open internet-based protocols wherever possible in order to maximize reuse of well-known and proven base packages.
The web portal is also tasked with ensuring the right authorizations, so that users can access applications according to their given permissions. Front-end components are made available through an HTTP reverse proxy. A reverse proxy is a web server fetching resources from other servers on behalf of some client: it is often used in IT departments in order to provide a unique facade (e.g., a single host name) which hides several other servers (as happens within the TESSA architecture), thus improving system changeability.
The map service provides the HTTP endpoints for serving forecast and observational maps and related metadata through a custom RESTful (Fielding, 2000) API (Application Programming Interface, a collection of programmatic interfaces used for developing software applications).
It must be underlined that, while the web user interface of each DSS application is hosted within the portal, the computation-intensive part is not. In fact, the DSS are, by design, split into two parts: the user interface and the module running the algorithm, the latter being hosted within the CDAM tier. This deployment scheme is functional to reusing the computational resources of the CDAM cluster for running model software while freeing the SSA platform for performing its own data analysis and map-related tasks. Splitting each DSS is dictated by the different requirements of each application, where actions like submitting a request and getting a response may involve several seconds or minutes. The message broker makes it easier to build applications with such detached “submit-and-wait” logic: applications only have to assemble a message with their custom payload and queue it; the message broker will then track the request and ensure that the response from the CDAM is delivered to the client.
Front-end components deal with providing infrastructure services that allow hosted services to provide data to clients according to custom web APIs. DSS are applications hosted within the portal, while an HTTP reverse proxy hides the complexities of the underlying subsystems behind URLs based on a single host name. The web portal is a web container, based on Liferay.
Front-end applications, like all DSS, are hosted within the portal as standard portlets: mini web applications that can benefit from the infrastructure provided by portal containers, like security, data sources and so on. Within TESSA, DSS applications, like VISIR, may have a heavy-duty computational part that is hosted within the CDAM tier: a portlet's main purpose is to provide a UI, collect inputs, package them according to specific formats and queue them to the computing engine. In order to do this, a set of components has been designed and implemented to decouple the front end from the computing engine, so as to reuse the latter across different client applications, like web and mobile. Other DSS, like MySeaConditions, do not require any model computation and rely completely on services provided within the SSA platform tier.
The message broker is a middleware component that provides a channel between clients (e.g., web or mobile DSS applications) and the computation models, hosted within the CDAM tier. This allows client applications to host the user interface only, collecting input and presenting results while the actual computation is performed on a super-computing infrastructure that accesses both data and raw computing power.
In order for this model to work effectively, client applications assemble parameters into job requests and submit them to the message broker; this in turn checks the validity of the request and then queues it to the CDAM receiver. A hook, in the form of a callback, is added so that the CDAM can notify job completion, either successful or not, including the data payload, without any need for the SSA platform to perform continuous polling for results. Currently, software running in the client tier still has to poll in order to check for results; the implemented solution, however, confines periodic polling to the SSA platform tier only. Because the portal is already connected to a dedicated relational database server, data related to job requests are stored within a separate schema hosted on the same database server.
The message broker is not coupled in any way to any particular DSS and acts like a “store and forward” queue: it receives and stores requests, forwards them to the computing engine and waits for responses. At the end of the process, web and mobile clients may retrieve and process such results, presenting them in their specific way (e.g., showing drawing symbols over a map or displaying a data table).
An additional feature is the ability to throttle requests: the message broker allows for setting a maximum number of pending requests for each user–service pair (e.g., each user can only make one request at a time for the VISIR DSS). This allows us to limit unnecessary workload on the CDAM tier while also preventing stale data from accumulating; in any case, a data cleaning policy can also be configured in order to purge stale requests (e.g., removing any successful or failed request 24 h after its submission date).
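The broker's “store-and-forward” behavior and per-user throttling can be sketched as follows (a minimal in-memory sketch; class and method names are our own illustrative assumptions, not the platform's actual API):

```python
import uuid
from collections import defaultdict

class MessageBroker:
    """Store-and-forward queue decoupling DSS front ends from the CDAM tier."""

    def __init__(self, max_pending_per_user=1):
        self.max_pending = max_pending_per_user
        self.pending = defaultdict(int)   # (user, service) -> pending count
        self.requests = {}                # request id -> request record

    def submit(self, user, service, payload):
        """Queue a job request, enforcing the per-user/service throttle."""
        key = (user, service)
        if self.pending[key] >= self.max_pending:
            raise RuntimeError("too many pending requests for %s/%s" % key)
        request_id = str(uuid.uuid4())
        self.requests[request_id] = {"key": key, "payload": payload,
                                     "status": "queued", "result": None}
        self.pending[key] += 1
        # In the real system the request would now be forwarded to the CDAM.
        return request_id

    def complete(self, request_id, result, success=True):
        """Callback invoked when the CDAM notifies job completion."""
        record = self.requests[request_id]
        record["status"] = "done" if success else "failed"
        record["result"] = result
        self.pending[record["key"]] -= 1

    def poll(self, request_id):
        """Client-side polling, confined to the SSA platform tier."""
        record = self.requests[request_id]
        return record["status"], record["result"]
```

With a limit of one pending request per user–service pair, a second submission for the same pair is rejected until the first one completes.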
The map service is the software gateway to all the information stored within the SSA platform, and it is designed to provide dynamic maps (like daily updated environmental data), static maps (like bathymetry) and data analysis functions (like simple and advanced data querying) to external systems and end-user applications. From a high-level point of view, it can be decomposed into a web API (RESTful web service) provisioning dynamic and static maps, a batch rendering system that constantly updates the forecast data and a computing cluster that performs the actual rendering work.
The system delivers:

- tile-rendered forecast maps and associated metadata for the Mediterranean Basin, in order to allow applications to create client-side mash-ups;
- data querying functionality, allowing applications to browse data across the available forecast time ranges and supported variables;
- on-the-fly rendering of map regions that have not been pre-rendered by the batch rendering system.
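Tiled map services of this kind address tiles through the de facto Web Mercator “slippy map” scheme popularized by Google Maps' tile service. The standard coordinate conversion can be sketched as follows (these are the well-known public formulas, not the platform's own code):

```python
import math

def deg2tile(lat_deg, lon_deg, zoom):
    """Convert WGS84 coordinates to (x, y) tile indices at a zoom level,
    following the standard Web Mercator tiling: 2**zoom x 2**zoom tiles
    per level, x growing eastward and y growing southward."""
    lat = math.radians(lat_deg)
    n = 2 ** zoom
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat)) / math.pi) / 2.0 * n)
    return x, y

def tile_count(max_zoom):
    """Total number of tiles in a pyramid covering zoom levels 0..max_zoom."""
    return sum(4 ** z for z in range(max_zoom + 1))
```

For example, a point near Otranto (about 40° N, 18° E) falls in tile (17, 12) at zoom level 5; the quadrupling of tiles per level explains why rendering load grows so quickly with zoom depth.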
Data querying enables clients to ask for the actual values associated with given geographical coordinates and a set of variables, with the ability to specify start and end limits and get arrays of values for further processing (e.g., displaying an XY chart).
A RESTful API is provided to clients for querying available maps and accessing the data browsing functions, according to the supported environmental data (see Sect. 5).
In addition to dynamic maps, static maps like country boundaries and bathymetry are also provided in order to ease the creation of meaningful mash-ups within clients.
The batch rendering system periodically fetches environmental data from data centers into a local SSA platform environmental data storage and triggers their initial ingestion within the system; the process also includes basic integrity checks and partial rendering of maps. “Partial” means that the batch rendering system only pre-renders a limited set of map tiles, in order to save resources while still allowing rapid responses for the most frequently used maps. For sections of maps that still have to be rendered, on-the-fly rendering is triggered, queuing the task to the computing cluster. This allows for more efficient resource usage, requesting computations only when map tiles are effectively needed. Clients' requests are put on hold until the deferred rendering process has completed. The resulting tiles are then stored within the map store, so that subsequent requests for the same tiles will hit the cache and get served faster.
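The cache-or-render-on-the-fly logic just described can be sketched as follows (hypothetical names; in the real platform the miss path queues a task to the rendering cluster instead of rendering in-process):

```python
import threading

class TileCache:
    """Map store front end: serve pre-rendered tiles, render missing ones
    on the fly and cache the result for subsequent requests."""

    def __init__(self, renderer):
        self.renderer = renderer   # callable (layer, z, x, y) -> tile bytes
        self.store = {}            # (layer, z, x, y) -> tile bytes
        self.lock = threading.Lock()

    def get_tile(self, layer, z, x, y):
        key = (layer, z, x, y)
        with self.lock:
            tile = self.store.get(key)
        if tile is not None:
            return tile            # cache hit: fast path
        # Cache miss: the client request is held while rendering runs
        # (in the real platform, the task is queued to the computing cluster).
        tile = self.renderer(layer, z, x, y)
        with self.lock:
            self.store[key] = tile
        return tile
```

A second request for the same tile never reaches the renderer, which is exactly the behavior that makes pre-rendering only the popular tiles affordable.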
Because of the great deal of computations required for rendering maps, the rendering system has been developed to scale horizontally by simply adding more machines. This is fulfilled by a computing cluster that provides the raw power for the rendering tasks, serving both the batch and on-the-fly scenarios.
Map tile rendering is a natural candidate for parallel execution: the same tasks are iterated over and over on different data sets. Initial implementations used thread-based parallelism: a single machine with several CPU (Central Processing Unit, the execution unit for software instructions) cores rendered tiles concurrently. This approach, however, could not scale beyond the resources of a single machine.
The solution consisted of implementing a distributed rendering model: a
master node navigates the tile pyramid and generates all rendering tasks.
The latter are then queued to a rendering grid composed of worker nodes that
only have one single job: to render as many tiles as they can according to
their CPU cores. Scalability is then achieved by adding as many worker nodes as required, which are automatically made available to the system without a restart being required; similarly, a node may be turned off without the rendering cluster falling apart, albeit with less computational power available. Each worker node accesses data from the SSA platform data storage,
executes the rendering (employing different techniques as required) and
returns the result (consisting of image bytes plus metadata). Tasks are
defined by descriptors assembled by clients (e.g., the dynamic maps service
or the batch rendering system) and queued to the cluster. A built-in load
balancing feature ensures that tasks are dispatched in a roughly fair manner
across different nodes, so that no single node is particularly overloaded.
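The master/worker pattern can be sketched with a local thread pool standing in for the distributed rendering grid (an illustration of the pattern only, not of the actual cluster implementation; names are ours):

```python
from concurrent.futures import ThreadPoolExecutor

def tile_tasks(max_zoom):
    """Master: walk the tile pyramid and generate one task per tile."""
    for z in range(max_zoom + 1):
        for x in range(2 ** z):
            for y in range(2 ** z):
                yield (z, x, y)

def render_tile(task):
    """Worker: render a single tile (placeholder for the real plot code),
    returning the result as image bytes plus metadata."""
    z, x, y = task
    return (z, x, y, b"png-bytes")

def render_pyramid(max_zoom, workers=4):
    """Dispatch tasks across workers; the pool load-balances them roughly
    fairly, as the distributed grid does across nodes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_tile, tile_tasks(max_zoom)))
```

In the distributed version, the pool is replaced by a grid of worker nodes, so capacity grows by adding machines rather than threads.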
We integrated the Hazelcast in-memory data grid in order to implement the distributed task queue and the dynamic cluster membership described above.
Operation of the system is supported by means of system logs and email notifications: when issues happen (e.g., unavailability of data, invalid or corrupted files, internal errors), personnel can investigate the problem and act to fix it (e.g., re-running rendering jobs for variables that were not available at the expected time). Supporting operational teams has been crucial in order to improve the final availability of the platform services to end users, and we plan to improve this further by collecting statistics in order to formulate realistic service-level agreements with external users.
The aforementioned components are distributed across several nodes (virtual machines), with different hardware configurations (CPU, memory and disk capabilities) and the same operating system (Ubuntu Linux).
The portal node hosts Liferay, the message broker and the DSS (web APIs and UIs); the database node hosts the MySQL instance with the portal's and users' data; the file store node hosts the download daemon and provides storage for the NetCDF files, in addition to running the components belonging to the map service (including the GeoPortal), while the worker nodes implement the rendering logic. The portal, database and file store nodes all have four CPU cores and 16 GB of RAM (the file store has 200 GB of additional disk space for storing the most recently downloaded data).
Each worker node (with 8 GB of RAM, four CPU cores and about 30 GB of disk space) hosts a single rendering instance of the map service, each able to plot a number of tiles equal to the number of CPU cores (that is, four concurrent rendering operations per worker node). New worker nodes can be added easily in order to increase computational capacity and, collectively, all worker nodes cooperate to share the workload generated by the batch map pipeline and the on-the-fly rendering requests coming from the map service. The larger the number of map tiles to plot, the larger the required worker resources (see Sect. 6 for more information about the rendering process): the current rendering grid infrastructure scales linearly with the number of worker nodes by processing map tiles in parallel. Thread-level parallelism requires CPUs with multiple cores, while process-level parallelism (e.g., multiple nodes) can combine multiple instances hosted on different machines. While thread-level parallelism is somewhat more efficient than process-level parallelism (e.g., sharing data within a single process has less overhead than sharing across different processes and/or machines), machines with many cores can become exceedingly expensive, so combining simpler and cheaper machines becomes paramount. Therefore, the architecture is potentially cheaper to operate and extend than single-process solutions: Google uses the same basic concept for increasing the throughput of their search engine in an affordable way (Dean and Ghemawat, 2004).
In addition to map rendering, the SSA platform also provides services for querying the data, currently providing a sequence of environmental values for a given variable. As an improvement to data analysis, a subsystem (called ANSWER) for searching time slices matching specific conditions has been implemented (e.g., winds and sea currents matching externally provided thresholds) and made available to external systems (like MySeaConditions; Coppini et al., 2016).
The simplest data query involves fetching values for a given variable from each NetCDF file, a process that may require a few seconds to complete, depending on the current system load and on the complexity of the involved computations, which varies with the particular variable. In order to speed the entire computation up, a data file index is built and consulted before reading each file; file reading is performed using Unidata's NetCDF-Java library.
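The core of such a point query, locating the grid cell nearest to the requested coordinates and extracting its values across the forecast time steps, can be sketched in pure Python (the platform itself reads the files through NetCDF-Java; the function names and the in-memory array standing in for file contents are illustrative):

```python
from bisect import bisect_left

def nearest_index(axis, value):
    """Index of the grid coordinate closest to `value` on a sorted axis."""
    i = bisect_left(axis, value)
    if i == 0:
        return 0
    if i == len(axis):
        return len(axis) - 1
    # Pick whichever neighbour is closer to the requested value.
    return i if axis[i] - value < value - axis[i - 1] else i - 1

def extract_series(data, lats, lons, lat, lon):
    """Extract the time series of a variable at the grid point nearest to
    (lat, lon); `data` is a time x lat x lon array of values, as it would
    be read from a NetCDF file."""
    i, j = nearest_index(lats, lat), nearest_index(lons, lon)
    return [data[t][i][j] for t in range(len(data))]
```

The returned array of values is what a client would then feed to, for example, an XY chart.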
The concept of a tile as a pre-defined georeferenced bounding box has been used for data analysis too: these “data tiles” can be used for relatively fast execution of statistics computations (such as average, standard deviation, minimum and maximum values). Web APIs, either custom TESSA-defined or OGC's standards, make these features available to external applications, like SeaConditions. A significant limitation of the current implementation is that data queries are performed sequentially, taking no advantage of the rendering grid: this is something that we plan to improve in the future since the grid-based parallelism has been found to be a major asset within the platform and should be exploited in other scenarios.
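A sketch of the per-tile statistics computation, assuming the samples falling within a data tile's bounding box have already been extracted and missing values filtered out (function name and output layout are illustrative, not the platform's API):

```python
import math

def tile_statistics(values):
    """Summary statistics served for a 'data tile': `values` holds the
    variable samples falling within the tile's bounding box."""
    n = len(values)
    mean = sum(values) / n
    # Population variance over the tile's samples.
    variance = sum((v - mean) ** 2 for v in values) / n
    return {
        "count": n,
        "mean": mean,
        "std": math.sqrt(variance),
        "min": min(values),
        "max": max(values),
    }
```

Because each data tile covers a pre-defined bounding box, these aggregates can be computed (and cached) per tile rather than recomputed over arbitrary regions.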
In comparison, ANSWER provides more advanced features: it currently matches conditions for a given geographic point, while a prototype has been developed for analyzing entire areas in search of matching conditions (e.g., the most favorable places for sailing in the next few days); the latter is presented in the following subsection.
ANSWER is a module for verifying that arbitrary weather and/or marine and/or oceanographic conditions occur in delimited areas of interest and time intervals. It is tightly integrated within the map components, complementing the map infrastructure with advanced data query features.
ANSWER implements an algorithm which allows users (like MySeaConditions) to model profiles of customized thresholds by specifying geographical, desired weather and/or marine conditions and a forecast range (e.g., next 72 h). Based on these settings, the forecast data are analyzed in search for valid matches. This processing is based on forecast data for the Mediterranean Sea, i.e., the NetCDF files produced as part of the TESSA project.
There are two main scenarios that are enabled by ANSWER.

- Fixed region: the user selects a region of interest (ROI) and defines a set of thresholds for one or more forecast variables (e.g., “wind speed over 10 knots and air temperature over 20 °C”).
- Fixed time: the user selects a time range and defines a set of thresholds for one or more forecast variables (e.g., “wind speed over 10 knots and air temperature over 20 °C”).
In order to do this, ANSWER computes selection and intersection operations (algebra) across multiple layers of variables. In a typical scenario the user chooses the variable(s) and the related thresholds (greater than, less than, within a range), then ANSWER fetches the NetCDF files from the SSA data storage, extracting the involved portions and checking the values one geo-coordinate (latitude and longitude) at a time.
The algorithm then assigns a value of “1” if the condition is matched for the given value, or “0” if the value falls outside the wanted range: a new NetCDF file is then created, containing a matrix of 1's and 0's for each given point within the bounding box. This is the behavior of the selection operator. When only one variable is involved, the resulting NetCDF file is the final output of the algorithm (e.g., it may be displayed to the user). If more than one variable is involved, however, the selection process is repeated, producing one matrix of 1's and 0's for each variable.
The final result is then computed as an iterative intersection of all the matrices: a new binary file with 1 wherever all user-defined conditions occur (a logical “AND” operation) (Fig. 2).
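The selection and intersection operators described above can be sketched in a few lines of NumPy (a simplified illustration using in-memory arrays instead of NetCDF files; the variable names are hypothetical):

```python
import numpy as np

def select(field, low, high):
    # Selection operator: "1" where the value lies within [low, high], else "0".
    return ((field >= low) & (field <= high)).astype(np.uint8)

def intersect(masks):
    # Iterative intersection (logical AND) of the per-variable binary masks.
    result = masks[0]
    for m in masks[1:]:
        result = result & m
    return result

# Synthetic 2x3 grids for two variables (would be read from NetCDF in practice)
wave_height = np.array([[0.2, 0.6, 0.4],
                        [0.1, 0.3, 0.9]])
wind_speed = np.array([[2.0, 2.5, 4.0],
                       [1.0, 1.8, 2.2]])

# Points where wave height is in [0, 0.5] m AND wind speed is in [1.5, 3] m/s
mask = intersect([select(wave_height, 0.0, 0.5),
                  select(wind_speed, 1.5, 3.0)])
```

The resulting binary matrix plays the role of the final NetCDF file: a “1” marks a grid point where all user-defined conditions hold simultaneously.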
Example of intersection among three variables.
Two prototypes of ANSWER have been implemented, in GrADS and NCL, in order to evaluate performance: GrADS has proved to be the faster of the two. For example, considering an area roughly equivalent to a zoom level 5 tile, computing conditions for five different variables requires, on average, about 0.3 s in GrADS in comparison to the 1.3 s required by NCL.
Let us suppose that the user is a surfer interested in the following thresholds for wave height and wind speed and direction:
wave height in the range of 0 to 0.5 m; wind speed between 1.5 and 3 m s⁻¹; wind direction ranging from NE to SE.
In addition, let us also suppose that the user is interested only in the Apulian coasts (southern Italy). ANSWER will first act by means of its selection operator, producing NetCDF files and images that highlight the places where the user conditions are verified for each variable.
At this stage, the intersection operator overlaps the results in order to identify the ROIs that correspond to the user-defined conditions. Figure 3 helps visualize the process by showing the areas found to match wave height, wind direction and wind speed (Fig. 3a–c respectively) during the selection phase. Intersection is then performed iteratively between matrices “a” and “b” (resulting in matrix “d”) and, finally, between matrices “d” and “c”. Figure 3e represents the final result.
Selection operator: sample areas matching the conditions during selection.
Continuously producing and publishing maps presenting environmental data is a core business process of the SSA platform, and tile-based maps have already been used in other contexts, like Tarquini et al. (2008). A lot of effort has been put into the implementation of a system that can manage large amounts of scientific data in a reasonable time for daily updates while remaining as efficient as possible in its usage of computational resources.
Maps are collections of small image tiles: in TESSA, maps are defined for each pair.
The usage of raster images simplifies clients, which only need to fetch and present simple images: this is a trade-off that moves much of the rendering effort to the server side (thus requiring it to be adequately sized). Additionally, issues with the quality of the raster-based pictures have been found, like excessive pixel artifacts, lack of smoothness or excessive image blur (particularly evident in some cases, like stream lines).
Over the years, several vector-based tile formats have emerged: for example, ESRI announced in February 2016 support for vector-based tiles in their ArcGIS system.
Tiled map generation.
Conventions for indexing the images belonging to the tile pyramid have been defined in the past by different vendors, like Google and Microsoft, and have since been standardized by the Open Geospatial Consortium (OGC) as the OpenGIS Web Map Tile Service (WMTS) 1.0.0 specification.
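For reference, the widely used Web Mercator tiling convention maps a WGS84 coordinate to tile indices as follows (a generic sketch of the standard scheme shared by these vendors, not TESSA-specific code):

```python
import math

def latlon_to_tile(lat_deg, lon_deg, zoom):
    """Map a WGS84 coordinate to the (x, y) indices of the tile containing it
    at a given zoom level, following the common Web Mercator tiling scheme."""
    n = 2 ** zoom  # number of tiles per axis at this zoom level
    x = int((lon_deg + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat_deg)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y

# Tile containing Taranto (approx. 40.47 N, 17.23 E) at zoom level 5
tile = latlon_to_tile(40.47, 17.23, 5)
```

Each zoom level quadruples the number of tiles, which is why the rendering cost grows so quickly with the maximum zoom level discussed below.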
Within TESSA, the process of building a single tile is quite complex but can be abstracted as a fetch, process and store cycle that has to be repeated for each tile belonging to a single map (and, in turn, for each map that has to be rendered each day). The actual rendering is delegated to scientific tools run in batch mode (e.g., providing a script in order to get an image plot), like the Grid Analysis and Display System (GrADS).
The algorithm that has been devised is a parallel, recursive tree traversal of the whole tile pyramid (see Fig. 4), starting from zoom level 5 up to a maximum zoom level. The maximum zoom level depends on the data resolution: the higher the resolution, the higher the maximum zoom level. For example, sea temperature (150 m resolution) can be rendered up to zoom level 11, while precipitation (25 km resolution) can be rendered up to zoom level 9 (image quality starts to degrade after this level).
The algorithm starts by generating a rendering task for each single tile and its four child tiles, until the maximum zoom level is reached or the tile is filtered out (e.g., it is entirely on land while the variable being plotted is a marine one). Filtering by the land–sea mask avoids rendering about 13 % of the entire amount of tiles; the time necessary to determine whether a tile is completely land-based (or marine-based) is negligible in comparison to the time for plotting an empty tile (a difference of more than 2 orders of magnitude). This speedup has been made possible because the lookup table is kept entirely in memory during the rendering process.
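The recursive traversal with land–sea filtering can be sketched as follows (the callbacks is_all_land and render are illustrative assumptions standing in for the mask lookup and the plot engine, not the platform's actual API):

```python
def render_pyramid(tile, max_zoom, is_all_land, render):
    """Recursive tree traversal of the tile pyramid: render a tile, then
    descend into its four children, pruning tiles that lie fully on land.

    tile: (zoom, x, y) indices; returns the number of tiles rendered."""
    zoom, x, y = tile
    if zoom > max_zoom or is_all_land(tile):
        return 0  # pruned: past the maximum zoom or filtered by the mask
    render(tile)
    count = 1
    for dx in (0, 1):
        for dy in (0, 1):
            # Each child tile covers one quadrant of the parent at zoom + 1
            child = (zoom + 1, 2 * x + dx, 2 * y + dy)
            count += render_pyramid(child, max_zoom, is_all_land, render)
    return count

# Traverse from a single zoom-5 root tile down to zoom 7, with no land pruning
rendered = []
n = render_pyramid((5, 17, 12), 7,
                   is_all_land=lambda t: False,
                   render=rendered.append)
```

Because every subtree is independent, the recursive calls can be dispatched to a pool of workers, which is where the grid-based parallelism comes from.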
Each tile is rendered according to a predefined set of steps: GrADS or NCL scripts are instantiated according to the specific variable at hand, limiting the data area to the coordinates related to the specific tile; the script instance is executed by the specified plot engine, producing an image representing the environmental field in the given area; the image is post-processed in order to give it the correct size and proportions.
Introducing a new variable within the system requires that custom rendering scripts be written and then parameterized once the wanted result is achieved; these script templates are stored and retrieved at rendering time, with their placeholders being replaced with the actual values (e.g., NetCDF file names).
The entire map rendering process can be quite time consuming, and the duration has been found to be directly related to the data resolution (e.g., 150 m files take longer to process than 25 km data files), the area to plot (larger areas involve a larger amount of data to be analyzed and thus greater processing times), the number of files involved and the computations to be performed (e.g., computing vector direction and magnitude takes considerably longer than just plotting the data read from a given file). Parallel rendering, both at process and thread level, can cut times significantly: for example, the rendering of sea temperature tiles up to zoom 7 (510 tiles total) requires about 285 s when only one thread is used (no parallelism), about 147 s with two threads and about 77 s when four threads are used.
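Since the tile rendering tasks are independent, the thread-level parallelism can be sketched with a worker pool (render_tile here is a hypothetical stand-in for invoking the external plot engine on one tile):

```python
from concurrent.futures import ThreadPoolExecutor

def render_tile(tile):
    # Placeholder for running GrADS/NCL in batch mode on a single tile;
    # in the real platform this step spawns an external process and is
    # dominated by I/O and process start-up, so threads overlap well.
    zoom, x, y = tile
    return f"{zoom}/{x}/{y}.png"

# A batch of independent zoom-7 tile tasks
tiles = [(7, x, y) for x in range(8) for y in range(4)]

# Four workers roughly quartered the measured wall-clock time in our tests
with ThreadPoolExecutor(max_workers=4) as pool:
    images = list(pool.map(render_tile, tiles))
```

The near-linear speedup reported above (285 s, 147 s, 77 s for 1, 2 and 4 threads) is consistent with the tasks being embarrassingly parallel.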
At the beginning, requirements dictated only up to zoom level 8, and the whole rendering process happened during the night in “batch mode” on a single node: all data files were processed and the related maps published in about 4 h. With the increase in the number of zoom levels and data resolutions (e.g., waves at 150 m), rendering every single tile was no longer feasible and a different solution had to be found: on-demand rendering.
While tiles at zoom level 5 may require up to 3.5 s each to render on average (the worst-case scenario, for the sea currents variable), tiles at zoom level 10 only require 70 ms on average: the latter timing also applies to additional zoom levels, meaning that we have hit the fixed cost of starting the external tool (GrADS or NCL) and storing the resulting image, which cannot be further reduced.
An analysis of image transfers over the Internet found that each image was transferred from the platform (production environment) to the client (a web browser) in about 225 ms (with about 30 % variance depending on server load). Additionally, it became clear that rendering every single tile of every map is a waste of resources, since not all tiles are going to be browsed by users. This paved the way to the idea of “on-the-fly rendering” from a given zoom level onwards instead of upfront rendering: from zoom level 7 or 8 (depending on the variable, because of the different resolutions), on-the-fly rendering is the default setting, without any noticeable effect on the user experience. In order to minimize the rendering time for the topmost zoom levels, what was initially the nightly batch rendering of maps is now an initial “rendering seeding” phase before on-demand rendering steps in.
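The on-demand strategy amounts to a render-on-cache-miss policy, which could be sketched as follows (a minimal illustration assuming a filesystem tile cache; function and path names are hypothetical):

```python
import os
import tempfile

def get_tile(zoom, x, y, cache_dir, render):
    """Return the path of a tile image, rendering it on the fly when it is
    not already cached. Low-zoom tiles are expected to have been produced
    by the nightly "rendering seeding" phase, so they always hit the cache."""
    path = os.path.join(cache_dir, str(zoom), str(x), f"{y}.png")
    if not os.path.exists(path):
        # Cache miss: render on demand and store for subsequent requests,
        # paying the ~70 ms rendering cost only on the first access.
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(render(zoom, x, y))
    return path

cache = tempfile.mkdtemp()
fake_render = lambda z, x, y: b"PNG"  # stand-in for the real plot engine
p1 = get_tile(10, 600, 380, cache, fake_render)  # first request: rendered
p2 = get_tile(10, 600, 380, cache, fake_render)  # second request: cached
```

Since rendering a high-zoom tile (about 70 ms) is well below the observed network transfer time (about 225 ms), a cache miss is not noticeable to the user.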
This paper presented the main goals of and the design decisions faced during the evolution of the SSA platform. The main goal behind it is to provide a common infrastructure supporting the creation of application mash-ups that increase the sea situational awareness of a wide range of users. The platform has evolved considerably during the project, being refactored several times in order to match the increased data sizes and the performance requirements needed to guarantee daily availability to end users. Scalability, availability, interoperability and efficiency in resource usage have been primary concerns during the design.
It must be considered that the current implementation still has room for additional improvements and refinements, like grid-based data queries, client-based rendering using vector tiles instead of raster images and better health monitoring during the operational running of the system (which is critical in production and high-availability environments). These aspects will be part of future works, in addition to further improving interoperability with existing map rendering solutions and professional visualization systems. Finally, extending and consolidating the existing RESTful architecture, a business model based on “forecasts as a service”, using a license and delivery model based on software as a service (SaaS) principles, might be explored.
Forecast and observational data used by the SSA platform are produced by ECMWF and by the Copernicus Marine Environment Monitoring Service (CMEMS). Performance data of the SSA platform are publicly available at
This work was performed within the framework of the TESSA Project PON01_02823, supported by PON Ricerca & Competitività 2007–2013 and co-funded by the EU (Fondo Europeo di Sviluppo Regionale), MIUR (Ministero Italiano dell'Università e della Ricerca) and MISE (Ministero dello Sviluppo Economico). We would like to thank our project partners CMCC (Euro-Mediterranean Centre for Climate Change) and CNR (Italian Research Council) and our colleagues at Links.
We would also like to thank the Copernicus Marine Environment Monitoring Service (CMEMS) for the ocean forecasting products, Istituto Idrografico Marina Militare for the bathymetry data, Servizio Meteorologico Aeronautica Militare Italiano for the weather forecasting products, and Istituto Superiore per la Protezione e la Ricerca Ambientale for the sea level data.
Edited by: A. Olita. Reviewed by: four anonymous referees.