Identify the data you want

  • SFC, 6 parameters
  • Time: 00
  • Dates: 1 to 31 March 2017
  • all steps (0/to/90/by/1, 93/to/144/by/3 and 150/to/360/by/6)
  • all perturbed members (1/to/50)
  • Area: Europe
  • Output grid: regular lat/lon 0.5/0.5
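The three step ranges listed above can be expanded into the explicit slash-separated form that a MARS request accepts. The following is a minimal sketch, assuming `seq` and `paste` are available; the function name `build_steps` is illustrative and not part of MARS.

```shell
#!/bin/bash
# Expand the step ranges into one slash-separated list:
# hourly from 0 to 90, 3-hourly from 93 to 144, 6-hourly from 150 to 360.
build_steps() {
  { seq 0 1 90; seq 93 3 144; seq 150 6 360; } | paste -sd/ -
}

STEPS=$(build_steps)
echo "${STEPS}"

# Count the individual step values (one per line after splitting on "/")
echo "${STEPS}" | tr '/' '\n' | wc -l
```

The resulting list contains 145 step values and can be substituted into the STEP keyword of a request instead of typing it by hand.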

Use MARS list to find out the size

Using the MARS catalogue and its "View MARS request" functionality, you can create a list request covering all the data you want. Running it reports the size and distribution of the raw data in the archive. Note the LIST verb and the OUTPUT = COST keyword.

LIST,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    PARAM      = 134.128/151.128/165.128/166.128/246.228/247.228,
    DATE       = 20170301/to/20170331,
    TIME       = 0000,
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    OUTPUT     = cost,
    TARGET     = list.txt

When we run the list action, the following output is written to the file specified with the TARGET keyword (in this example, 'list.txt'):

size=4484574297000; # 4.48 TB
number_of_fields=1348500;
online_size=799241928660;
off_line_size=3685332368340;
number_of_tape_files=93;
number_of_disk_files=281;
number_of_online_fields=240330;
number_of_offline_fields=1108170;
number_of_tapes=32;

The data is too large to retrieve in a single request: 4.48 TB, split across 32 tapes.
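The cost file is a set of `key=value;` lines, so it can be summarised with standard tools. The sketch below assumes the file has exactly the layout shown above; the helper names `cost_value` and `summarise_cost` are hypothetical.

```shell
#!/bin/bash
# Print the numeric value of one key from a MARS cost file.
# Lines are expected to look like "key=value;" (optionally followed by a comment).
cost_value() {
  local key=$1 file=$2
  sed -n "s/^${key}=\([0-9]*\);.*/\1/p" "${file}"
}

# Print total size in TB, the share held on tape, and the number of tapes.
summarise_cost() {
  local file=$1
  local size offline tapes
  size=$(cost_value size "${file}")
  offline=$(cost_value off_line_size "${file}")
  tapes=$(cost_value number_of_tapes "${file}")
  awk -v s="${size}" -v o="${offline}" -v t="${tapes}" 'BEGIN {
    printf "total: %.2f TB, on tape: %.0f%%, tapes: %d\n", s / 1e12, 100 * o / s, t
  }'
}

# Only summarise when the cost file from the LIST request exists
if [ -f list.txt ]; then
  summarise_cost list.txt
fi
```

A large share of data on tape, combined with many tapes, is a sign that the request should be split before retrieval.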

Use the MARS catalogue to find out how the data are distributed in files and tapes

Using the MARS catalogue, browse to the data you want to retrieve until you reach the final stage which gives a selection and several options: http://apps.ecmwf.int/mars-catalogue/?stream=enfo&levtype=sfc&time=00%3A00%3A00&expver=1&month=mar&year=2017&date=2017-03-01&type=pf&class=od

In this particular case, different values of "Parameter", "Number" and "Step" can be selected for a specific date and time. All the fields you can choose on this page are stored in the same tape file, so you should retrieve as much data as possible from this page in a single MARS request.

Running a list request for the data we need from this final-stage page gives:

LIST,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    PARAM      = 134.128/151.128/165.128/166.128/246.228/247.228,
    DATE       = 20170301,
    TIME       = 0000,
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    OUTPUT     = cost,
    TARGET     = list2.txt

size=144663687000;
number_of_fields=43500;
online_size=43232826000;
off_line_size=101430861000;
number_of_tape_files=3;
number_of_disk_files=10;
number_of_online_fields=13000;
number_of_offline_fields=30500;
number_of_tapes=3;

In this case the raw data (without post-processing) is ~144 GB. The data transferred to your system will be smaller if you interpolate to lat/lon and/or filter the area.

The data is stored across 3 different tapes, which is sensible.
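The field counts reported by the two list requests can be cross-checked with simple arithmetic: parameters times steps times members gives the fields per date, and multiplying by the number of dates gives the month. This is a small sketch using only the numbers quoted on this page; the variable names are illustrative.

```shell
#!/bin/bash
params=6     # PARAM entries in the request
members=50   # NUMBER = 1/to/50
days=31      # 1 to 31 March 2017

# Steps: 0..90 by 1, 93..144 by 3, 150..360 by 6
steps=$(( (90 - 0) / 1 + 1 + (144 - 93) / 3 + 1 + (360 - 150) / 6 + 1 ))

fields_per_day=$(( params * steps * members ))
fields_per_month=$(( fields_per_day * days ))

# Average raw size of one field, from the single-date cost output
bytes_per_field=$(( 144663687000 / 43500 ))

echo "steps=${steps} per_day=${fields_per_day} per_month=${fields_per_month}"
echo "bytes_per_field=${bytes_per_field}"
```

This prints 145 steps, 43500 fields per date and 1348500 fields for the month, matching the `number_of_fields` values in both cost outputs, with an average raw field size of 3325602 bytes.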

The size of the file that will be transferred to your system can be estimated by downloading a single field with the post-processing keywords (usually GRID and AREA), then multiplying the size of that single-field file by the total number of fields you want to retrieve in a single request: number_of_fields=43500.

Now that we know the size of the data we want to retrieve in one go from the same hypercube, we can work out how best to iterate to retrieve the full month of data required.
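The single-field estimate can be scripted as follows. This is a sketch only: the file name `single_field.grib` is a hypothetical target of a one-field retrieval made with the same GRID and AREA, and the result is approximate because individual fields may differ slightly in size.

```shell
#!/bin/bash
# Estimate the transfer size in bytes: size of one post-processed field
# multiplied by the number of fields in the request.
estimate_transfer_bytes() {
  local sample=$1 nfields=$2
  local bytes
  bytes=$(wc -c < "${sample}")
  echo $(( bytes * nfields ))
}

# single_field.grib is an assumed name for the one-field test retrieval
if [ -f single_field.grib ]; then
  estimate_transfer_bytes single_field.grib 43500
fi
```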

Split the request into sensible chunks, iterating through the correct keywords

The browser can now be used to find out how the data are distributed in the MARS tree. Focussing on the "Current selection" section for this specific example shows:

Going from top to bottom, we iterate from the inner loop to the outer loop: "Time", "Date", etc.

The following example Bash script loops over a full month, retrieving the data for one date and time per request.

#!/bin/bash

#this example will filter the area of Europe (N/W/S/E) and interpolate the final fields to a lat/lon 0.5/0.5 degrees
AREA="73.5/-27/33/45"
GRID="0.5/0.5"
PARAMS="134.128/151.128/165.128/166.128/246.228/247.228"
TIMES="0000"
YEAR="2017"
MONTH="03"

#date loop
for y in ${YEAR}; do

  for m in ${MONTH}; do
    #get the number of days for this particular month/year
    days_per_month=$(cal ${m} ${y} | awk 'NF {DAYS = $NF}; END {print DAYS}')

    for my_date in $(seq -w 1 ${days_per_month}); do
      #build the full date from the loop variables, so several years or months also work
      my_date=${y}${m}${my_date}

      #time loop
      for my_time in ${TIMES}; do
        cat << EOF > my_request_${my_date}_${my_time}.mars
RETRIEVE,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    GRID       = ${GRID},
    AREA       = ${AREA},
    PARAM      = ${PARAMS},
    DATE       = ${my_date},
    TIME       = ${my_time},
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    TARGET     = enfo_pf_${my_date}_${my_time}.grib
EOF
        mars my_request_${my_date}_${my_time}.mars
        if [ $? -eq 0 ]; then
          rm -f my_request_${my_date}_${my_time}.mars
        fi
      done
    done
  done
done      
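Because the script deletes a request file only when `mars` returns success, any `my_request_*.mars` file left behind marks a date and time whose retrieval failed. The sketch below lists those leftovers so that they can be resubmitted; the function name `list_failed_requests` is illustrative.

```shell
#!/bin/bash
# List request files left behind by failed retrievals in a directory
# (defaults to the current directory).
list_failed_requests() {
  local dir=${1:-.}
  local f
  for f in "${dir}"/my_request_*.mars; do
    # Skip the unexpanded pattern when nothing matches
    [ -e "${f}" ] || continue
    basename "${f}"
  done
}

list_failed_requests .
```

Each listed file can then be passed to `mars` again, in the same way as in the loop above.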
