
Identify the data you want

  • Surface fields (LEVTYPE = SFC), 6 parameters
  • Time 00, every day of March 2017
  • All steps (0/to/90/by/1, 93/to/144/by/3 and 150/to/360/by/6)
  • All ensemble members (NUMBER = 1/to/50)
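
The three step ranges above can be expanded into the explicit slash-separated list used in the requests on this page instead of typing it by hand. This is a minimal sketch, assuming the standard seq and paste utilities are available; the variable name STEPS is only illustrative.

```shell
#!/bin/bash
# Expand the three step ranges into one slash-separated list.
# "seq FIRST INCREMENT LAST" prints one value per line;
# "paste -sd/ -" joins all lines into a single line separated by "/".
STEPS=$( { seq 0 1 90; seq 93 3 144; seq 150 6 360; } | paste -sd/ - )

# Show the resulting list, ready to be used as the STEP value
echo "${STEPS}"

# Count the steps in the list
echo "${STEPS}" | tr '/' '\n' | wc -l
```

The list contains 145 steps: 91 hourly, 18 three-hourly and 36 six-hourly.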

Use MARS list to find out the size

 

LIST,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    PARAM      = 134.128/151.128/165.128/166.128/246.228/247.228,
    DATE       = 20170301/to/20170331,
    TIME       = 0000,
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    OUTPUT     = cost,
    TARGET     = list.txt

When we run the list action this is the output:

size=4484574297000; # 4.48 TB
number_of_fields=1348500;
online_size=799241928660;
off_line_size=3685332368340;
number_of_tape_files=93;
number_of_disk_files=281;
number_of_online_fields=240330;
number_of_offline_fields=1108170;
number_of_tapes=32;

The data is too large to retrieve in a single request: 4.48 TB, spread across 32 tapes.
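
The cost output can also be summarised with a small helper rather than read by eye. The sketch below assumes the file has exactly the key=value; layout shown above; the function name summarise_cost is hypothetical and not part of MARS.

```shell
#!/bin/bash
# Summarise a cost file written by a LIST request with OUTPUT = cost.
# Assumes one "key=value;" pair per line, as in the output shown above.
summarise_cost() {
  # $1: path to the cost file (for example list.txt)
  awk -F'[=;]' '
    /^size=/             { size   = $2 }
    /^online_size=/      { online = $2 }
    /^number_of_fields=/ { fields = $2 }
    /^number_of_tapes=/  { tapes  = $2 }
    END {
      if (size > 0) {
        printf "fields=%d tapes=%d size=%.2f TB online=%.1f%%\n",
               fields, tapes, size / 1e12, 100 * online / size
      }
    }' "$1"
}

# Only run the summary if the LIST target from the request above exists
if [ -f list.txt ]; then
  summarise_cost list.txt
fi
```

The online percentage gives a rough idea of how much of the request is already on disk and how much has to be read from tape.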

Use the MARS catalogue to find out how the data is distributed across files and tapes

Using the MARS catalogue, we browse to the data we want until we reach the final stage, which shows the current selection and the remaining options: http://apps.ecmwf.int/mars-catalogue/?stream=enfo&levtype=sfc&time=00%3A00%3A00&expver=1&month=mar&year=2017&date=2017-03-01&type=pf&class=od

In this particular case the remaining choices are "Parameter", "Number" and "Step". This means that all the fields you can select on this page are in the same tape file. A single request should therefore retrieve as much data as possible from this page.

Running the list action for the data we need from this final-stage page gives:

LIST,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    PARAM      = 134.128/151.128/165.128/166.128/246.228/247.228,
    DATE       = 20170301,
    TIME       = 0000,
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    OUTPUT     = cost,
    TARGET     = list2.txt

size=144663687000;
number_of_fields=43500;
online_size=43232826000;
off_line_size=101430861000;
number_of_tape_files=3;
number_of_disk_files=10;
number_of_online_fields=13000;
number_of_offline_fields=30500;
number_of_tapes=3;

In this case the raw data (without post-processing) is ~144 GB. The retrieved volume would be smaller if you interpolate to lat/lon and/or restrict the area.

The data is spread across 3 tapes, which is a sensible amount for one request.
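
The figures in the two cost outputs can be cross-checked with simple arithmetic: 145 steps, 6 parameters and 50 members give the number of fields per day, and 31 daily requests add up to the monthly total. The sketch below only restates numbers already shown on this page; the variable names are illustrative.

```shell
#!/bin/bash
# Cross-check the LIST cost outputs using the request dimensions.
N_STEPS=145                  # 91 hourly + 18 three-hourly + 36 six-hourly steps
N_PARAMS=6                   # parameters in PARAM
N_MEMBERS=50                 # NUMBER = 1/to/50
N_DAYS=31                    # days in March 2017
SIZE_PER_DAY=144663687000    # bytes, "size" from the single-date LIST output

FIELDS_PER_DAY=$(( N_STEPS * N_PARAMS * N_MEMBERS ))
FIELDS_PER_MONTH=$(( FIELDS_PER_DAY * N_DAYS ))
SIZE_PER_MONTH=$(( SIZE_PER_DAY * N_DAYS ))
BYTES_PER_FIELD=$(( SIZE_PER_DAY / FIELDS_PER_DAY ))

echo "fields per day:   ${FIELDS_PER_DAY}"
echo "fields per month: ${FIELDS_PER_MONTH}"
echo "bytes per month:  ${SIZE_PER_MONTH}"
echo "bytes per field:  ${BYTES_PER_FIELD}"
```

The per-day and per-month values agree with number_of_fields and size in the two outputs above, so one request per date splits the month into 31 equal chunks of about 144 GB each.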

Now that we know the size of the data we want to retrieve in one go from the same hypercube, we can work out how to iterate to get the full month.

Split the request in sensible chunks iterating through the correct keywords

Using the browser we know how the data is distributed in the MARS tree. Now we focus on the current selection section. In this case:

Reading the selection from top to bottom, we iterate from the inner loop to the outer loop: "Time", then "Date", and so on.

This example Bash script retrieves one time of one day per request, looping over a whole month:

#!/bin/bash

PARAMS="134.128/151.128/165.128/166.128/246.228/247.228"
TIMES="0000"
YEAR="2017"
MONTH="03"

#date loop
for y in ${YEAR}; do

  for m in ${MONTH}; do
    #get the number of days for this particular month/year
    days_per_month=$(cal ${m} ${y} | awk 'NF {DAYS = $NF}; END {print DAYS}')

    for my_date in $(seq -w 1 ${days_per_month}); do
      # build YYYYMMDD from the loop variables, not from the full lists
      my_date=${y}${m}${my_date}

      # time loop
      for my_time in ${TIMES}; do
        cat << EOF > my_request_${my_date}_${my_time}.mars
RETRIEVE,
    CLASS      = OD,
    TYPE       = PF,
    STREAM     = ENFO,
    EXPVER     = 0001,
    LEVTYPE    = SFC,
    PARAM      = ${PARAMS},
    DATE       = ${my_date},
    TIME       = ${my_time},
    STEP       = 0/1/2/3/4/5/6/7/8/9/10/11/12/13/14/15/16/17/18/19/20/21/22/23/24/25/26/27/28/29/30/31/32/33/34/35/36/37/38/39/40/41/42/43/44/45/46/47/48/49/50/51/52/53/54/55/56/57/58/59/60/61/62/63/64/65/66/67/68/69/70/71/72/73/74/75/76/77/78/79/80/81/82/83/84/85/86/87/88/89/90/93/96/99/102/105/108/111/114/117/120/123/126/129/132/135/138/141/144/150/156/162/168/174/180/186/192/198/204/210/216/222/228/234/240/246/252/258/264/270/276/282/288/294/300/306/312/318/324/330/336/342/348/354/360,
    NUMBER     = 1/to/50,
    TARGET     = enfo_pf_${my_date}_${my_time}.grib
EOF
        # run the request; keep the request file if MARS fails so it can be re-run
        if mars my_request_${my_date}_${my_time}.mars; then
          rm -f my_request_${my_date}_${my_time}.mars
        fi
      done
    done
  done
done      
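
After the loop has finished it can be useful to confirm that every daily target file was actually written. The following is a minimal sketch, assuming the target naming used in the script above (enfo_pf_YYYYMMDD_HHMM.grib); the function name check_targets is hypothetical.

```shell
#!/bin/bash
# Report dates whose target file is missing or empty.
# Usage: check_targets YEAR MONTH DAYS_IN_MONTH TIME [DIRECTORY]
check_targets() {
  local y=$1 m=$2 days=$3 t=$4 dir=${5:-.}
  local d f missing=0
  # seq -w zero-pads the day to the width of DAYS_IN_MONTH (01, 02, ...)
  for d in $(seq -w 1 "${days}"); do
    f="${dir}/enfo_pf_${y}${m}${d}_${t}.grib"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "${f}" ]; then
      echo "missing or empty: ${f}"
      missing=$(( missing + 1 ))
    fi
  done
  # Return 0 if every file is present, 1 otherwise
  return $(( missing > 0 ))
}

# Example call for the month retrieved above:
# check_targets 2017 03 31 0000
```

Any date reported here still has its request file on disk, because the script only deletes request files after a successful run, so that request can simply be re-run.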

 

 
