The MARS system is based on a client/server architecture, which makes it more flexible and adaptive to changes in a complex environment like ECMWF's.
The interaction between clients and servers is based on MARS requests. A client sends a MARS request to a server, which looks for the data online or off-line, depending on that server's capabilities. If the server cannot satisfy the request, the client contacts the next server in a pre-configured list.
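The fallback behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the MARS client's actual code (which is written in C): the `retrieve` function, the `Request` and `Server` types and the two toy servers are hypothetical names introduced here, and a server is assumed to signal "cannot satisfy the request" by returning `None`.

```python
from typing import Callable, Optional

# Hypothetical types: a request is a dict of keywords, and a server is any
# callable returning bytes, or None when it cannot satisfy the request.
Request = dict
Server = Callable[[Request], Optional[bytes]]


def retrieve(request: Request, servers: list[tuple[str, Server]]) -> tuple[str, bytes]:
    """Try each configured server in order and return the first hit."""
    for name, server in servers:
        data = server(request)
        if data is not None:
            return name, data
    raise LookupError("no configured server could satisfy the request")


# Toy servers standing in for an online store and a tape-backed archive.
def online_only(request: Request) -> Optional[bytes]:
    return b"recent" if request.get("date") == "today" else None


def archive(request: Request) -> Optional[bytes]:
    return b"archived"


servers = [("online", online_only), ("archive", archive)]
print(retrieve({"date": "today"}, servers))
print(retrieve({"date": "19990101"}, servers))
```

The order of the `servers` list stands in for the pre-configured server list: the first server able to answer wins, and later servers are contacted only after earlier ones fail.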
The MARS client is a C program linked with ecCodes, to handle GRIB and BUFR data, and the MIR package, to support interpolation and regridding operations. The MARS source code is also embedded in other applications, such as Metview.
Access at ECMWF
MARS can be executed either in batch or interactive mode. Usually, clients issue requests from:
- Workstations and servers, suitable for most retrievals
- Supercomputers, retrieving data to be used as input for models
- Metview, for interactive/batch visualisation/plotting and manipulation
It is recommended to use workstations or workstation servers for data retrieval instead of the supercomputers. This avoids unnecessary idle time when data has to be read from tape.
Member States/Co-operating States access
Most of the access from Member State users comes via the ecgate system, by logging in or submitting batch jobs to ECMWF's computers. Moreover, the ECMWF Web API service allows authorised users to retrieve and list MARS data from outside ECMWF and to transfer the data directly to their host.
Details about the architecture of the MARS server are given in a separate article. MARS has evolved since that article was written in 1999 (for example, software developed at ECMWF has replaced ObjectStore as the metadata manager), but the general architecture is still valid. In 2003, the MARS contents were migrated to the High Performance Storage System (HPSS) tape management software, which replaced Tivoli Storage Manager (TSM, previously known as ADSM).
The various servers are described below.
The main archive is the core of the MARS system and consists of the following hardware and software:
- Dedicated multiprocessor servers
- Several terabytes of disk space, used for temporary storage before data is written to tape and for caching while data is read from tape
- A set of automated tape libraries
- A set of applications written in C++ and linked with the MIR and ecCodes libraries
- High Performance Storage System (HPSS), which controls the tapes, the tape drives and the robotics
Some characteristics of the main archive system, such as request scheduling and data collocation, matter to users who want to optimise data retrieval.
Fields Data Base (FDB)
This is where models running at ECMWF write their outputs. It contains data produced by the most recent cycles.
Depending on the configuration and disk resources, it can contain up to several days of operational data and more recent research experiments. There are many Fields Data Bases: several per supercomputer or server able to run ECMWF's models.
The FDB is meant to provide very fast access, as all of its data resides online. This makes it well suited to retrieving model input data and to accessing the most recent cycles.
Reports Data Base
The Reports Data Base (RDB) contains online observations received via ECMWF's acquisition system. This system has been interfaced with MARS to allow real-time access to observations. Access to this server is meant for monitoring and operational archiving purposes only.
Interaction client/server: request execution
The MARS client is configured with a list of servers to query when looking for data. The following is a schematic view of the actions the MARS client performs for each request:
- Checks the request syntax with the help of the MARS language
- Prints the request to be processed and the number of fields expected
- Queries the cache system (if configured)
- Queries all supercomputers' Fields Data Bases (if the data are not cached)
- Queries the main archives (if data are not in the FDB)
- Transfers the data to the client and post-processes (if needed)
- Caches the data (if the cache is present)
- Reports on the result
Note that post-processing is done while the data is being transferred and before writing to disk.
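The lookup order and the streaming post-processing described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the real client. The names `execute` and `post_process` are hypothetical, as are the string request key and the toy sources. The syntax check and the expected field count are omitted, and the cache is assumed to hold the fields as delivered, that is, after post-processing.

```python
from typing import Callable, Iterable, Iterator, Optional

Field = bytes
# Hypothetical source signature: returns an iterable of fields, or None on a miss.
Source = Callable[[str], Optional[Iterable[Field]]]


def post_process(fields: Iterable[Field],
                 transform: Callable[[Field], Field]) -> Iterator[Field]:
    """Apply the transform lazily, one field at a time, as the data streams in."""
    for field in fields:
        yield transform(field)


def execute(request: str,
            sources: list[tuple[str, Source]],
            cache: dict[str, list[Field]],
            transform: Callable[[Field], Field] = lambda f: f):
    """Return (fields, log) for one request, following the order listed above."""
    log = [f"processing request: {request}"]
    if request in cache:                      # query the cache first
        log.append("found in cache")
        return list(cache[request]), log
    for name, lookup in sources:              # then the FDBs, then the main archive
        fields = lookup(request)
        if fields is None:
            continue
        # Fields are post-processed while being consumed, before being stored.
        result = list(post_process(fields, transform))
        cache[request] = result               # assumption: cache the delivered fields
        log.append(f"retrieved {len(result)} field(s) from {name}")
        return result, log
    log.append("no data found")
    return [], log


# Toy sources: the FDB only holds the latest data, the archive holds everything.
def fdb(request: str) -> Optional[Iterable[Field]]:
    return [b"t", b"z"] if request == "latest" else None


def main_archive(request: str) -> Optional[Iterable[Field]]:
    return iter([b"u"])


cache: dict[str, list[Field]] = {}
sources = [("FDB", fdb), ("main archive", main_archive)]
fields, log = execute("latest", sources, cache, transform=bytes.upper)
print(fields, log[-1])
```

Using a generator for `post_process` mirrors the note above: each field is transformed as it arrives rather than after the whole transfer has completed.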