Based on various questions from NWP users, this webpage answers frequently asked questions (FAQ) about the Aeolus L2B BUFR format and its use for NWP.

The BUFR template document is here: AE-TN-ECMWF-L2BP_0073-L2B_BUFR_template_20151123_v1.00.pdf.  The L2B EE format IODD also provides information (indirectly) on what the L2B BUFR contains: AE_IF_ECMWF_L2BP_001_IODD_20180215_v3.10.pdf, i.e. a subset of the L2B EE format data is copied over to the L2B BUFR.  There is also a new technical note on the EE-to-BUFR conversion tool: AE-TN-KNMI-BUFR-001_BufrConverter_SUM-20180413_v1.0.pdf

  • Why are there extra unexpanded descriptors in the Aeolus BUFR files compared to the WMO BUFR template?
    • These changes (which came with v3.00 of the L2B processing) are modifications to the original published template. We found that the WMO-approved template cannot store the satellite range for Aeolus because the decision was taken to go for a lower orbit of 320 km (rather than 408 km); this change happened after we designed the template.  The new satellite range (distance) values are outside the allowed ranges of the BUFR template. The 20xxxx descriptors modify the number of bits and the scaling so that the numbers fit again. The only thing you need to do is allow the changed number of unexpanded descriptors in your code (63 instead of 51).  After expansion, the number of descriptors will be identical to the original template, i.e. it will be reduced to 51.  You should not need any changed BUFR tables at all, since these descriptors are part of the official BUFR file format itself.

    • You should also know that we plan 2 more updates with the same method in our upcoming software version (L2Bp v3.10, to be released around late February 2019). With that update we will also modify the derivative values (dhlos_dT and dhlos_dp) for the Rayleigh channel (adding another 8 20xxxx descriptors) to increase their precision (by mistake we did not provide enough).  For this planned change you will need to change the number of unexpanded descriptors in your code to 71.  You should not need any changed BUFR tables to decode these changed files.

    • Note there is a mistake in the L2Bp v3.01 EE output, and hence in the BUFR, which means the dhlos_dp value is written out in units of m/s/hPa, rather than being converted to the correct units defined in the IODD (m/s/Pa).  This bug is fixed in the upcoming L2Bp v3.10.

    • Since getting formal registration by WMO is a very lengthy process, and we may need other modifications that we do not yet foresee, we decided to not request any changes with WMO for the time being.
    • Miscellaneous advice: If you mix the Aeolus BUFR messages with other observations in the same BUFR file, then recognizing the data type is not as straightforward now, but it is not too difficult. For example, look at the 13th descriptor, "005069 ADM RECEIVER CHANNEL": it is unique to the Aeolus template, will never be used by any other observation type, and will not shift position due to the changes mentioned above.
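
The advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the input is assumed to be the list of unexpanded descriptors as integers (for example as returned by ecCodes for the "unexpandedDescriptors" key), and the check looks for the presence of 005069 anywhere in the list rather than at a fixed position. The unit helper applies only to files affected by the L2Bp v3.01 dhlos_dp issue.

```python
# Unexpanded descriptor counts quoted in this FAQ:
# 51 (original template), 63 (L2Bp v3.00), 71 (planned for L2Bp v3.10).
KNOWN_UNEXPANDED_COUNTS = {51, 63, 71}

# 0 05 069 "ADM RECEIVER CHANNEL" written as an integer (005069 -> 5069).
RECEIVER_CHANNEL_DESCRIPTOR = 5069


def is_aeolus_l2b(unexpanded_descriptors):
    """Return True if the message carries the Aeolus-specific descriptor."""
    return RECEIVER_CHANNEL_DESCRIPTOR in (int(d) for d in unexpanded_descriptors)


def has_known_descriptor_count(unexpanded_descriptors):
    """Return True if the descriptor count is one of those listed in this FAQ."""
    return len(unexpanded_descriptors) in KNOWN_UNEXPANDED_COUNTS


def dhlos_dp_per_hpa_to_per_pa(value_per_hpa):
    """Convert a sensitivity in m/s/hPa to m/s/Pa (1 hPa = 100 Pa)."""
    return value_per_hpa / 100.0
```

A decoder that previously insisted on exactly 51 unexpanded descriptors could instead call has_known_descriptor_count, so that files from the different L2Bp versions are all accepted.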

  • Is there a description or interpretation of the L2B BUFR variables for use in NWP?
    • No official user document describing the meaning/interpretation of the data in the L2B BUFR files is available yet (however, this webpage attempts to provide some guidance).  The BUFR template document, linked above, helps to some extent.
    • Some basic advice:
      • The Aeolus L2B observations are provided as individual wind observations, each with a geolocation (not a profile with one geolocation).  Each BUFR message consists of a number of observations.
      • L2B HLOS winds are classified into different types e.g. Rayleigh-clear, Rayleigh-cloudy, Mie-clear and Mie-cloudy.  This is given by the combination of "receiverChannel" and "lidarL2bClassificationType".
      • The geolocation of the HLOS winds needed for an observation operator is described by the: "latitude", "longitude", "timeIncrement" (relative to the reference time given at the start of each message), "height" (altitude relative to geoid) and laser pointing information ("bearingOrAzimuth" and "elevation").
      • For a basic "point-like" wind observation operator we recommend using the "coordinatesSignificance" values appropriate for the vertical and horizontal centre-of-gravity of the observation.  There is also geolocation information for the start/end of the range-bin (horizontally) and the top/bottom of the range-bin (vertically), in case people wish to use an averaging-type (2D) observation operator rather than point-like winds.
      • Typically it is expected that L2B HLOS winds will have a horizontal averaging length-scale of around 80 km.  However, because the measurements (3 km scale chunks from which the observations are constructed) within an 80 km group are classified into clear and cloudy, it is possible to get observations smaller than 80 km (particularly for the Mie channel).  Where the air is all clear or all cloudy, you get the full 80 km observations.
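
As an illustration of combining "receiverChannel" and "lidarL2bClassificationType" into a wind type, here is a minimal Python sketch. The numeric code values (0/1) are assumptions made for the example and must be checked against the BUFR code tables for these two descriptors; the function name is hypothetical.

```python
# ASSUMED code values, for illustration only: verify against the BUFR
# code tables for receiverChannel and lidarL2bClassificationType.
CHANNEL_NAMES = {0: "Mie", 1: "Rayleigh"}
CLASSIFICATION_NAMES = {0: "clear", 1: "cloudy"}


def wind_type(receiver_channel, classification_type):
    """Return e.g. 'Rayleigh-clear', or None for missing/unknown codes."""
    channel = CHANNEL_NAMES.get(receiver_channel)
    scene = CLASSIFICATION_NAMES.get(classification_type)
    if channel is None or scene is None:
        return None
    return f"{channel}-{scene}"
```

Returning None for unrecognised codes lets the caller reject such observations explicitly instead of silently assigning them to a type.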
  • What is the vertical geolocation for the HLOS wind observation?
    • The true vertical geolocation information for Aeolus is the geometric height (relative to the EGM96 geoid) of the lidar range-bin i.e. descriptor 0 07 071 Height (high resolution), for the Vertical Centre of Gravity of the observation.  There is also height for the top and bottom of the range-bin, if one likes to consider a more complex "vertical averaging" observation operator.
    • The geometric height of the range-bin should be converted to an appropriate geolocation variable for assimilation in your model.  Some models fundamentally use pressure as the vertical reference, e.g. the ECMWF IFS model.  Hence in the ECMWF model assimilation code we convert the geometric height to geopotential, and then a standard ECMWF routine obtains a pressure from the geopotential (using the background forecast) as part of the ECMWF IFS's pre-processing steps.  Alternatively, and more correctly, one could forward model the geometric heights of the model levels and interpolate model winds to the Aeolus observation geometric height.  The Met Office model (UM) has geopotential height as the vertical co-ordinate, so they don't need to convert to a pressure value (but do need to convert the observation geometric height to geopotential height using standard formulae).
    • Any pressure geolocation that we could provide with Aeolus L2B winds would depend on a priori information from some NWP model, hence I don't think it's a good idea that we provide this.  We want the observations (in BUFR) to be as independent of NWP model as possible.
    • The L2B Rayleigh winds (molecular scattering) are corrected for a sensitivity to temperature (Doppler broadening) and pressure (Brillouin scattering) via the use of a priori NWP information.  Note that the Mie winds are not affected by temperature and pressure, i.e. they don't need a correction.  The Rayleigh winds would be biased if we didn't account for this (by e.g. 5 m/s).  The L2B processing documentation, e.g. the ATBD, describes the Rayleigh-Brillouin correction.

    • There is no atmospheric pressure value provided for the vertical geolocation.  The descriptor 10004 in the Aeolus L2B BUFR is actually the reference pressure used for correcting the Rayleigh HLOS winds for a sensitivity to atmospheric pressure (a so-called Rayleigh-Brillouin correction - see the ATBD in the L2Bp v3.00 documentation).  However, I guess this is not written explicitly anywhere for the BUFR description (e.g. in the documentation TN7.3 ADM-Aeolus Level-2B BUFR description, it is not clear).  Hence users can mistake it for the proper vertical geolocation of the observation.  Our BUFR specialist at ECMWF (who defined the template with us) decided to use the already available (in BUFR) pressure descriptor rather than create a new one specially for Aeolus - the same goes for the temperature descriptor. 
    • Associated with these "reference" pressure and temperature values there are other descriptors: "derivative wind to pressure", and "derivative wind to temperature".  The idea of providing these "reference" values is that users may wish to correct (linear correction) the Aeolus Rayleigh HLOS winds to a value using their own model's pressure and temperature data or use them in a Rayleigh wind observation operator.
    • Aeolus L2B Rayleigh winds rely on the AUX_MET (auxiliary meteorological) data, i.e. a priori information on the temperature and pressure at the observation's location, which is input to the L2Bp.  We create the AUX_MET data from the ECMWF short-range forecasts (i.e. background forecast).  However, there was a concern that other users of Aeolus L2B data might want to use their own model's temperature and pressure data, or even include the sensitivity to temperature and pressure in the observation operator.  To include it in the forward model, one can use the "reference" temperature and pressure values, and the linear sensitivity of the HLOS wind to temperature and pressure (the partial derivatives dHLOS/dT and dHLOS/dp).  N.B. the typical dHLOS/dT is around 0.1 m/s per K, so not huge!
    • Because the Mie HLOS winds do not require AUX_MET data T and p for any corrections, they may not have these "reference" values available.  I'd have to check the code, but I guess the "reference" T and pressure are not guaranteed to be available with each observation, which is consistent with what users have reported.
    • Not recommended: Of course the "reference pressure" with the Rayleigh winds (via ECMWF model) could be used if you really want to - but then you introduce some dependence on the ECMWF model, and it sometimes has missing values, so it can't be relied upon.
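
The linear correction mentioned above can be sketched as a first-order adjustment from the "reference" temperature and pressure to a user's own model values. This is a minimal illustration, not the L2B processor's code: the function name and argument names are hypothetical, the sign convention (derivative multiplied by model minus reference) is an assumption that should be checked against the ATBD, and pressures are assumed to be in Pa with dHLOS/dp in m/s/Pa.

```python
def adjust_rayleigh_hlos(hlos, t_ref, p_ref, dhlos_dt, dhlos_dp, t_model, p_model):
    """First-order adjustment of a Rayleigh HLOS wind (m/s).

    t_* in K, p_* in Pa, dhlos_dt in m/s/K, dhlos_dp in m/s/Pa.
    Returns None if any input is missing, since the reference values
    are not guaranteed to be present for every observation.
    """
    values = (hlos, t_ref, p_ref, dhlos_dt, dhlos_dp, t_model, p_model)
    if any(v is None for v in values):
        return None
    return hlos + dhlos_dt * (t_model - t_ref) + dhlos_dp * (p_model - p_ref)
```

For example, with dHLOS/dT = 0.1 m/s per K and a model temperature 2 K warmer than the reference (same pressure), the adjustment is 0.2 m/s.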
  • Will the Aeolus BUFR be available on the GTS?
    • It should be available on the GTS in NRT (BUFR produced by ECMWF and then disseminated to GTS via EUMETSAT), but this will only begin when the quality of the observations is deemed to be sufficient for dissemination to the general public, which may be many months after launch (a decision for ECMWF and ESA).  CAL/VAL teams will have early access though.
    • The BUFR files to be sent on the GTS are produced at ECMWF via the L2B Earth Explorer to BUFR converter tool.  Since each L2B EE file is typically one orbit's worth of data, then each BUFR file will also typically be one orbit of data.
  • How do we forward model the HLOS wind?  i.e. how to interpret the HLOS wind and geolocation azimuth and elevation angles?
    • See aeolus_obs_operator.pdf for some guidance on the geometry of the measurement and the observation operator.
    • Aeolus L2B BUFR doesn't provide a vector wind or a u-wind observation, but a horizontal line-of-sight (HLOS) wind component, which can easily be converted to LOS wind if you wish, using the elevation angle.  HLOS wind is just a projection of the LOS wind onto the horizontal plane.

    • The basic forward model from the NWP model horizontal vector wind (u,v) to the Aeolus HLOS wind is: HLOS_wind = -u.sin(azimuth) - v.cos(azimuth)
    • This formula arises because of the definitions of Aeolus' azimuth angle and the sign convention for HLOS wind:
      • The azimuth angle is measured from north clockwise but based on the target to satellite pointing vector (rather than satellite to target as you would imagine).
      • HLOS wind is defined to be positive when blowing away from the instrument (i.e. reduced frequency of Doppler shift).
    • If your NWP model has the vertical wind component available, you may wish to forward model HLOS wind also including the vertical wind "contamination" (which is what Aeolus actually measures), in which case you should include the vertical wind component and elevation angle in the forward model i.e. HLOS_wind = -u.sin(azimuth) - v.cos(azimuth) -w.cot(elevation).  We assume generally that because the w component averaged over 80 km is small in the conditions Aeolus will sample, then we can ignore the vertical wind.
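
The basic horizontal forward model above can be written as a short Python function. This is a minimal sketch: the function name is hypothetical, the azimuth is assumed to be supplied in degrees, and the vertical wind term is deliberately left out.

```python
import math


def hlos_from_uv(u, v, azimuth_deg):
    """Basic forward model: HLOS = -u*sin(azimuth) - v*cos(azimuth).

    u, v        : model zonal and meridional wind components (m/s)
    azimuth_deg : Aeolus azimuth, clockwise from north, defined along the
                  target-to-satellite pointing vector (assumed in degrees)
    """
    azimuth = math.radians(azimuth_deg)
    return -u * math.sin(azimuth) - v * math.cos(azimuth)
```

With an azimuth of 90 degrees (satellite due east of the target), a westerly wind of u = 10 m/s blows towards the instrument and the function returns about -10 m/s, consistent with HLOS being positive when the wind blows away from the instrument.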

  • Why are there many invalid winds in the L2B BUFR (according to validity flags)?
    • You can expect a fair number of good Mie winds (particularly Mie-cloudy), but also a fair number of invalid ones.  Note there are 4 types of winds available: Mie-cloudy, Mie-clear, Rayleigh-cloudy, Rayleigh-clear.
      Generally we expect the Mie-cloudy and Rayleigh-clear winds to be the best quality, as you can imagine given the origin of the backscatter signal.  It is possible to get some reasonable Mie-clear winds through misclassification of the measurement-level data - we classify measurement-bins into clear or cloudy based on their estimated scattering ratio (a L1B variable).  Rayleigh-cloudy winds are generally biased due to an imperfect correction for the Mie backscatter in the Rayleigh channel signals.

    • In the latest code (from L2Bp v3.00, rather than v2.30 that produced the BUFR you look at) we by default set all Mie-clear and all Rayleigh-cloudy winds to invalid to guide users to avoid them for DA (initially at least).

    • We recommend using the L2B HLOS wind estimated errors for QC (as provided in the BUFR).  They are derived from propagation of errors from the spectrometer counts assuming Poisson noise statistics (see the ATBD).
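
A simple screening step based on the validity flag and the estimated error could look like the sketch below. The dictionary keys ("valid", "error") and the function name are hypothetical, chosen only for this example, and no particular error threshold is implied: max_error is left for the user to choose, possibly separately for Mie and Rayleigh winds.

```python
def select_winds(observations, max_error):
    """Keep observations flagged valid whose estimated error is <= max_error.

    observations : iterable of dicts with hypothetical keys
                   'valid' (bool) and 'error' (m/s, or None if missing)
    max_error    : user-chosen threshold in m/s
    """
    selected = []
    for ob in observations:
        if not ob["valid"]:
            continue  # flagged invalid by the L2B processor
        if ob["error"] is None or ob["error"] > max_error:
            continue  # missing or too large estimated error
        selected.append(ob)
    return selected
```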
  • Can you explain the L2B BUFR profile numbering system?
    • We added profile information for those who are restricted to assimilating the data as if they were old-style radiosonde profiles, i.e. one geolocation per profile.  Also, it might be useful if users prefer to assimilate wind shear (if we had varying bias along the orbit, we thought wind shear assimilation might be a solution).

    • Each L2B Earth Explorer file starts afresh in terms of the profile number and the associated wind ID numbers.  We process each L1B EE file to one L2B EE file which then gets converted to one L2B BUFR file.  Each L2B BUFR file will be sent to (or retrieved by) EUMETSAT for distribution on the GTS - I'm not sure yet how they will split the data.
      Typically there is one L1B EE file per orbit (Svalbard to Svalbard ground station), but occasionally there are many orbits per L1B EE file, due to blind orbits when they can't dump the telemetry for a number of passes.

      e.g. looking at an ASCII dump of one real L2B EE file, there are 908 profiles, the last profile has:

      L2BC%Rayleigh_Wind_Profile_MDS(908)%start_of_obs_datetime = 07-OCT-2018 22:42:50.227159
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lat_start = -0999000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lat_average = -0081067802
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lat_stop = +0999000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lon_start = -0999000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lon_average = +0009943565
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_lon_stop = +0007517272
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_DateTime_Start   = 07-OCT-2018 22:42:50.227159
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_DateTime_Average = 07-OCT-2018 22:43:01.467466
      L2BC%Rayleigh_Wind_Profile_MDS(908)%Profile_DateTime_Stop    = 07-OCT-2018 22:43:01.711158
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%Channel                    = Rayleigh_Channel
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%Obs_Type                   = Obs_Type_clear_returns
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%num_winds_in_profile = +013
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%profile_id_number          = +0000000908
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(01) = +0000019407
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(02) = +0000019408
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(03) = +0000019409
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(04) = +0000019410
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(05) = +0000019411
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(06) = +0000019412
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(07) = +0000019413
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(08) = +0000019414
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(09) = +0000019415
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(10) = +0000019416
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(11) = +0000019417
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(12) = +0000019418
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(13) = +0000019419
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(14) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(15) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(16) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(17) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(18) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(19) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(20) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(21) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(22) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(23) = +0000000000
      L2BC%Rayleigh_Wind_Profile_MDS(908)%L2B_Wind_Profile%wind_result_id_number(24) = +0000000000

      The wind_result_id_number will start again from 1 in the next L2B EE file, as does the effective profile count.

      In the L2B BUFR we store the profile number of each wind result.
      e.g. for one BUFR message (a dump of the BUFR to ASCII, using ECMWF's eccodes bufr_dump tool):



                                  {
                                    "key" : "profileNumber",
                                    "value" :
                                    [
                                      32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
                                      32, 32, 32, 32, 32, 32, 32, 32, 32, 32,
                                      32, 32, 32, 33, 33, 33, 33, 33, 33, 34,
                                      34, 34, 34, 34, 34, 34, 34, 34, 34, 34,
                                      34, 34, 34, 34, 34, 34, 34, 34, 34, 34,
                                      34
                                    ],
                                    "units" : "Numeric"
                                  },
                                  [

                                    {
                                      "key" : "observationIdentifier",
                                      "value" :
                                      [
                                        453, 454, 455, 456, 457, 458, 459, 460, 461, 462,
                                        463, 464, 465, 466, 467, 468, 469, 470, 471, 472,
                                        473, 474, 475, 476, 477, 478, 479, 480, 481, 482,
                                        483, 484, 485, 486, 487, 488, 489, 490, 491, 492,
                                        493, 494, 495, 496, 497, 498, 499, 500, 501, 502,
                                        503
                                      ],
                                      "units" : "Numeric"
                                    },

      So the profileNumber is equivalent to the index of the profile in the L2B EE file and the observationIdentifier is equivalent to the wind_result_id_number.
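
To rebuild profiles from a decoded message, the two arrays can be paired element by element. The sketch below uses the values from the dump above, typed in by hand; the function name is hypothetical, and in practice the arrays would come from your BUFR decoder.

```python
# Values copied from the bufr_dump output shown above (one BUFR message).
profile_number = [32] * 23 + [33] * 6 + [34] * 22
observation_identifier = list(range(453, 504))


def group_by_profile(profile_numbers, observation_ids):
    """Map each profileNumber to the list of observationIdentifier values."""
    if len(profile_numbers) != len(observation_ids):
        raise ValueError("arrays must have the same length")
    groups = {}
    for profile, obs_id in zip(profile_numbers, observation_ids):
        groups.setdefault(profile, []).append(obs_id)
    return groups


profiles = group_by_profile(profile_number, observation_identifier)
for profile, ids in profiles.items():
    print(profile, len(ids), ids[0], ids[-1])
```

For this message, profile 32 holds observation identifiers 453 to 475, profile 33 holds 476 to 481, and profile 34 holds 482 to 503. Because the numbering restarts in each L2B EE file, such groupings are only meaningful within one file.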