Tokenised access to data cubes is a Beta service and is closely monitored. The Data Store Service administrator reserves the right to take necessary action at any time, including closing the service, in order to maintain the overall performance of the Data Store Service.
Target audience
Access to ARCO data is programmatic; therefore, users of these resources and this documentation are expected to have some relevant programming experience.
The Data Store Service (DSS) at ECMWF offers a subset of the data available in the Data Stores in an Analysis Ready, Cloud Optimised (ARCO) format. These ARCO data are typically provided as Zarr archives stored in S3-compatible object storage.
DSS catalogue entries with data available in ARCO format provide specific guidance on how to access the associated Zarr assets.
This page provides general documentation on the ARCO resources, together with guidance and best practices for their effective and efficient use.
What is ARCO, and why should I use it?
Analysis Ready, Cloud Optimised (ARCO) encompasses several aspects of modern data management and delivery.
Fundamentally, ARCO is about transforming traditional datasets into formats that work natively in modern cloud-based workflows, including web applications and interactive analysis environments.
ARCO is particularly well suited to:
- Applications requiring fast and frequent access, such as web applications
- Workflows that access relatively small subsets of data, such as extracting a time series at a single point
- Cloud-based data science workflows, such as interactive analysis environments
Applications that require downloading large volumes of data for repeated, offline processing may be better served by traditional data access methods via the relevant Data Store request service (e.g. cdsapi).
Analysis Ready
“Analysis ready” means that the data can be used directly in downstream applications without additional preprocessing.
Specifically, this means that there is:
- no need to decode packed variables
- no need to apply scale factors or offsets
- no need to interpret non-standard or obscure metadata representations
- no need to merge multiple files before use
This significantly reduces the workload for downstream applications.
For web applications in particular, it can eliminate the need for an intermediate processing layer, reducing system complexity and improving performance and fidelity.
Cloud optimised
“Cloud optimised” means the data is structured to work efficiently over internet connections. Key characteristics include:
- Support for parallel access: Multiple users or processes can read different parts of the data at the same time.
- Minimised data transfer: Only the data needed for a task is downloaded, reducing bandwidth usage.
- Lazy loading: Data is loaded only when it’s actually needed, not all at once.
- Avoidance of large monolithic files: Data is split into smaller pieces instead of one huge file, making it easier to handle and transfer.
This is typically achieved by chunking the data. Appropriate chunking allows downstream applications to download only the data required for a specific computation.
Data chunking
Data in Zarr archives is stored in chunks. A chunk is the smallest unit of data that can be transferred. Partial chunk downloads are not possible. Therefore, the chunking strategy used has a major impact on performance.
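The difference between the two layouts described below is easiest to see with a toy illustration. The following sketch (the variable name, dimensions and sizes are illustrative only, and it requires dask to be installed) uses xarray's chunk method to reproduce the two strategies:

```python
import numpy as np
import xarray as xr

# Toy dataset: 1000 time steps on a 100 x 100 grid
ds = xr.Dataset({"t2m": (("time", "lat", "lon"), np.zeros((1000, 100, 100)))})

# Geo-chunked: split the spatial dimensions and keep the full time axis,
# so the whole time series at one point sits in a single chunk
geo = ds.chunk({"time": -1, "lat": 10, "lon": 10})

# Time-chunked: split the time dimension and keep the full grid,
# so the full map for one time step sits in a single chunk
tim = ds.chunk({"time": 1, "lat": -1, "lon": -1})

print(geo["t2m"].chunks)  # ((1000,), (10, 10, ...), (10, 10, ...))
print(tim["t2m"].chunks)  # ((1, 1, ...), (100,), (100,))
```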
To address two common, but opposing, usage patterns, the DSS provides two versions of each Zarr archive: geo-chunked and time-chunked.
Geo-chunked
The geo-chunked archive version is chunked in the spatial dimensions and provides optimised access for long time series at a single point or small area.
For example, to produce time-series plots:
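(A minimal sketch; the URL, the variable name t2m, the coordinate names and the bearer-style token shown below are placeholders to be replaced with the details given on the dataset's catalogue entry, and plotting requires matplotlib.)

```python
import xarray as xr

API_KEY = "<your-api-key>"
# Placeholder URL: use the geo-chunked archive URL from the catalogue entry
URL = "https://<data-store-host>/<dataset>/geo-chunked.zarr"

ds = xr.open_zarr(
    URL,
    storage_options={"client_kwargs": {"headers": {"Authorization": f"Bearer {API_KEY}"}}},
)
# Only the chunks covering this single grid point are downloaded
ds["t2m"].sel(latitude=52.5, longitude=13.4, method="nearest").plot()
```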
Time-chunked
The time-chunked archive version is chunked in the time dimension and provides optimised access for large spatial regions over short time periods.
For example, to produce map plots:
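(Again a minimal sketch under the same placeholder assumptions as above; the time value is illustrative.)

```python
import xarray as xr

API_KEY = "<your-api-key>"
# Placeholder URL: use the time-chunked archive URL from the catalogue entry
URL = "https://<data-store-host>/<dataset>/time-chunked.zarr"

ds = xr.open_zarr(
    URL,
    storage_options={"client_kwargs": {"headers": {"Authorization": f"Bearer {API_KEY}"}}},
)
# Only the chunk(s) covering this single time step are downloaded
ds["t2m"].sel(time="2020-01-01T12:00").plot()
```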
Counter-intuitive chunking
The appropriate choice may appear counterintuitive:
For long time series at a single point or small area → use the geo-chunked Zarr archive.
For large spatial regions over short time periods → use the time-chunked Zarr archive.
Tokenised Access
Access to ARCO data is controlled via your Data Store Service credentials: your API key is used as your authorisation token.
The token must be included in the Authorization header of your HTTP requests. Examples demonstrating this are provided below for various access methods.
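As a quick illustration of the header itself (the full access examples follow below), the sketch shows a request for an archive's consolidated metadata; the URL, the bearer scheme and the .zmetadata key are assumptions to be adapted to the guidance on the dataset's catalogue entry:

```python
import requests

API_KEY = "<your-api-key>"
# Placeholder URL: use the Zarr archive URL from the dataset's catalogue entry
URL = "https://<data-store-host>/<dataset>/geo-chunked.zarr/.zmetadata"

# The API key is sent as the Authorization header of the HTTP request
response = requests.get(URL, headers={"Authorization": f"Bearer {API_KEY}"})
response.raise_for_status()
print(response.json())
```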
CDSAPI Key, ECMWF account and licensing
Your API key is available from your profile page on any of the Data Stores you have registered with, e.g. the CDS.
If you have not already registered with the Data Store Service, you must register an ECMWF account and use this to log in to one of the Data Store portals, e.g. the CDS or the ADS.
In addition to accepting the general DSS Terms and Conditions, you must also accept the licence associated with the dataset you are using from the relevant portal. Failure to accept the appropriate licence will result in authorisation errors.
Fair usage
Access to ARCO data is subject to fair usage policies.
Your access may be rate limited based on recent download volume. This is intended to prevent excessive or abusive use of the service.
The rate limiting mechanism is designed so that it does not impact typical usage patterns.
The specific parameters of the rate limiting are subject to change and are not publicly disclosed.
Access examples
This section provides examples of accessing ARCO data through various methods, such as Python and JavaScript.
Python: xarray
Requirements
You will need to install the following packages to access the Zarr archives with xarray; they can be installed with pip or conda.
xarray zarr httpio fsspec
Plug and play
The following example is an xarray get-started guide; however, it will not suffice for "heavy duty" or operational workflows. Please see Advanced Usage below for more robust workflow mechanisms.
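A minimal sketch might look like the following, assuming a bearer-style token and a placeholder archive URL (the real URL is given on the dataset's catalogue entry; the variable and coordinate names are also placeholders):

```python
import xarray as xr

API_KEY = "<your-api-key>"  # from your DSS profile page
# Placeholder URL: use the Zarr archive URL from the dataset's catalogue entry
URL = "https://<data-store-host>/<dataset>/geo-chunked.zarr"

ds = xr.open_zarr(
    URL,
    consolidated=True,
    storage_options={"client_kwargs": {"headers": {"Authorization": f"Bearer {API_KEY}"}}},
)
print(ds)

# Data is loaded lazily: only the chunks needed by a selection are downloaded
subset = ds["t2m"].sel(latitude=52.5, longitude=13.4, method="nearest")
subset.load()
```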
Advanced Usage
The xarray interface to Zarr does not offer any retry mechanism by default. Given the nature of remote data access, larger workflows may well encounter a failed transfer of a data chunk for any of a number of reasons, e.g. a temporary loss of connectivity.
To make your workflows more robust, you can include a retry mechanism as part of your connection to the Zarr archives. Below are two examples using existing open-source libraries.
obstore
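A sketch using obstore's built-in retry configuration together with zarr-python v3's experimental ObjectStore wrapper; the URL and header values are placeholders, and the exact configuration keys should be checked against the obstore documentation:

```python
from datetime import timedelta

import xarray as xr
from obstore.store import HTTPStore
from zarr.storage import ObjectStore  # experimental wrapper in zarr-python v3

API_KEY = "<your-api-key>"
# Placeholder URL: use the Zarr archive URL from the dataset's catalogue entry
URL = "https://<data-store-host>/<dataset>/geo-chunked.zarr"

store = HTTPStore.from_url(
    URL,
    client_options={"default_headers": {"Authorization": f"Bearer {API_KEY}"}},
    # Retries with exponential backoff, handled by obstore's underlying client
    retry_config={
        "max_retries": 5,
        "backoff": {
            "base": 2,
            "init_backoff": timedelta(seconds=1),
            "max_backoff": timedelta(seconds=30),
        },
        "retry_timeout": timedelta(minutes=2),
    },
)

ds = xr.open_zarr(ObjectStore(store, read_only=True), consolidated=True)
```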
aiohttp_retry
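Alternatively, fsspec's HTTP filesystem accepts a get_client factory, which can be used to hand chunk requests to aiohttp_retry's RetryClient. A sketch, again with placeholder URL and header values:

```python
import fsspec
import xarray as xr
from aiohttp_retry import ExponentialRetry, RetryClient

API_KEY = "<your-api-key>"
# Placeholder URL: use the Zarr archive URL from the dataset's catalogue entry
URL = "https://<data-store-host>/<dataset>/geo-chunked.zarr"

async def get_client(**kwargs):
    # Retry failed requests up to 5 times with exponential backoff
    return RetryClient(retry_options=ExponentialRetry(attempts=5), **kwargs)

fs = fsspec.filesystem(
    "https",
    get_client=get_client,
    client_kwargs={"headers": {"Authorization": f"Bearer {API_KEY}"}},
)
ds = xr.open_zarr(fs.get_mapper(URL), consolidated=True)
```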
JavaScript: zarrita
zarrita is a JavaScript toolkit for working with chunked, compressed, n-dimensional arrays in the Zarr format. It runs natively in the browser, making it possible to stream ARCO data directly to web applications with no intermediate processing layer.
This is the approach used by the Weather Replay application, which renders ERA5 weather data on a 3D globe by reading directly from the Zarr archive on the client side.
Requirements
Install zarrita via npm, yarn, etc.:
npm install zarrita
Plug and play
The following example is a zarrita get-started guide; however, it will not suffice for production web applications. Please see Advanced usage below for more robust mechanisms.
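A minimal sketch; the archive URL, the variable name t2m and the use of FetchStore's overrides option to attach the token are assumptions to be adapted to the dataset's catalogue entry:

```javascript
import * as zarr from "zarrita";

const API_KEY = "<your-api-key>";
// Placeholder URL: use the Zarr archive URL from the dataset's catalogue entry
const store = new zarr.FetchStore(
  "https://<data-store-host>/<dataset>/time-chunked.zarr",
  { overrides: { headers: { Authorization: `Bearer ${API_KEY}` } } },
);

const root = await zarr.open(store, { kind: "group" });
const arr = await zarr.open(root.resolve("t2m"), { kind: "array" });

// Fetch only the chunks covering the first time step
const view = await zarr.get(arr, [0, null, null]);
console.log(view.shape, view.data);
```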
Advanced usage
For production web applications, withConsolidated is particularly important: without it, opening each array in the hierarchy requires a separate metadata fetch over the network. Wrapping the store with withConsolidated loads the entire metadata tree in one request, after which open calls resolve locally.
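A sketch of the consolidated pattern, under the same placeholder assumptions as above:

```javascript
import * as zarr from "zarrita";

const store = new zarr.FetchStore(
  "https://<data-store-host>/<dataset>/time-chunked.zarr",
);

// One request loads the consolidated metadata for the whole hierarchy;
// use tryWithConsolidated instead to fall back gracefully if it is absent
const consolidated = await zarr.withConsolidated(store);

// Subsequent opens resolve locally against the cached metadata
const root = await zarr.open(consolidated, { kind: "group" });
const arr = await zarr.open(root.resolve("t2m"), { kind: "array" });
```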
See the zarrita cookbook for further examples.

