Occasionally, we receive reports of negative precipitation totals being computed from IFS output encoded in GRIB. Although such reports often refer to "negative precipitation accumulation", the same issue can affect any field accumulated from the start of the forecast, and small but spurious positive accumulations are also possible. Spurious positive accumulations can lead to small increases in, say, solar radiation during night-time hours when a zero increase is expected. This document explains why this happens and the circumstances in which it occurs.

The effect is often blamed on the interpolation.  However, it is a side-effect of the GRIB packing and can also occur when the field is stored on the native model octahedral reduced Gaussian grid.

Note that the problem described here affects only fields accumulated from the start of the forecast.

In particular, the ERA5 short forecast accumulations do not suffer from this problem because they are packed and archived as accumulations since the previous post-processing (archiving) step, i.e. they are already de-accumulated before packing.

GRIB packing discretises the values

When data are packed in GRIB using the simple packing method, the values are packed by storing differences from the reference value (the minimum of all the values) to a specified fixed number of bits referred to as the "bits per value" (ecCodes key: bitsPerValue).  Thus the packing introduces a discretisation of the values with an associated packing error (ecCodes key:  packingError) such that:

\( \mathrm{\mathbf{packed\ value}} - \mathrm{\mathbf{packing\ error}} < \mathrm{\mathbf{true\ value}} \leq \mathrm{\mathbf{packed\ value}} + \mathrm{\mathbf{packing\ error}} \)

The packing error depends on the range of field values and the number of bits per value used to pack the values.
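As a rough illustration of this discretisation, the sketch below simulates simple packing in pure Python. It is a simplified model, not the ecCodes implementation: the function name `simple_pack` is hypothetical, the decimal scale factor is assumed to be zero, and the spacing between packed values is assumed to be the smallest power of two for which the full range of values fits into the given number of bits.

```python
import math


def simple_pack(values, bits_per_value):
    """Simulate GRIB simple packing (simplified; decimal scale factor assumed 0).

    Returns (decoded_values, packing_error).
    """
    reference = min(values)              # reference value: the field minimum
    value_range = max(values) - reference
    max_int = 2 ** bits_per_value - 1    # largest integer that fits in the bits

    if value_range == 0:
        # A constant field is represented exactly by the reference value alone.
        return list(values), 0.0

    # Assumption: use the smallest power-of-two spacing for which the range fits.
    exponent = 0
    while value_range > max_int * 2.0 ** exponent:
        exponent += 1
    while value_range <= max_int * 2.0 ** (exponent - 1):
        exponent -= 1
    spacing = 2.0 ** exponent

    # Each value is stored as a whole number of spacings above the reference.
    counts = [math.floor((v - reference) / spacing + 0.5) for v in values]
    decoded = [reference + n * spacing for n in counts]
    return decoded, spacing / 2


decoded, error = simple_pack([0.0, 4.55, 10.0], bits_per_value=8)
print(decoded, error)
```

In this model every decoded value lies within one packing error of the true value, which is the relation stated above.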

Packing error increases with increasing range of values

The issue for fields accumulated from the start of the forecast is that the packing parameters - and hence the discretisation and the packing error - change as the range of the values to be packed increases. Because the bits per value remains constant, the effect is to increase the packing error periodically by a factor of two as the range increases. This is illustrated in Figure 1, which shows how the packing error (the red line) increases as the values of the sunshine duration field increase throughout a 10-day forecast.

Figure 1

Figure 1 shows how the packing error (red line) increases as the range of values of the sunshine duration field (blue line) increases throughout a 10-day forecast.

Each time the range of values increases by a factor of two (indicated by the points where the blue line crosses the horizontal dashed lines) the packing error also increases by a factor of two.

Although the packing error is less than 1 second in the first 24 hours of the forecast, it is 8 seconds by the end of the forecast.
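The doubling can be reproduced with a short sketch. It assumes the same simplified model of simple packing as elsewhere in this document's examples: the spacing between packed values is the smallest power of two for which the range fits into the available bits, and the packing error is half that spacing. The function name `packing_error` is hypothetical, and the choice of 16 bits per value is an assumption, since the value used for Figure 1 is not stated.

```python
def packing_error(value_range, bits_per_value):
    """Packing error for a given range of values (simplified model)."""
    if value_range <= 0:
        return 0.0
    max_int = 2 ** bits_per_value - 1
    # Smallest power-of-two spacing for which the whole range fits.
    exponent = 0
    while value_range > max_int * 2.0 ** exponent:
        exponent += 1
    while value_range <= max_int * 2.0 ** (exponent - 1):
        exponent -= 1
    return 2.0 ** exponent / 2


BITS = 16  # assumption: not stated in the text

# Hypothetical ranges (in seconds), each twice the previous one.
ranges = [50_000.0, 100_000.0, 200_000.0, 400_000.0, 800_000.0]
errors = [packing_error(r, BITS) for r in ranges]
for r, e in zip(ranges, errors):
    print(f"range={r:9.0f} s  packing error={e} s")
```

Under these assumptions each doubling of the range doubles the packing error, and a range of 800000 seconds gives a packing error of 8 seconds, the same size as the value quoted above for the end of the forecast.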

Packed values depend on the packing error

The increase in packing error also has an effect on the packed values themselves.  This is illustrated in the diagrams below.

Figure 2(a):  In this case, the packing error=0.5 (represented by the green boxes) and hence values are packed to the nearest 1.0.

Note that the true "exact model" values represented at the top are not stored exactly.  The value 291.7 yields the decoded value 292.0 on unpacking whereas 297.4 becomes 297.0 on unpacking. 

Figure 2(b): At certain specific points as the range of values increases, the packing error increases by a factor of two.  In this case, the packing error=1.0 and so values are packed to the nearest 2.0. 

Here we see that a true input value of 294.6, which yields a decoded value of 295.0 on unpacking in case (a), yields the lower decoded value of 294.0 in case (b) even though the true input value is unchanged.

Similarly, the true input value of 297.4, which yields a decoded value of 297.0 on unpacking in the case shown in Figure 2(a), yields a higher decoded value of 298.0 in the case illustrated in Figure 2(b).
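The two cases in Figure 2 can be checked with a few lines of Python. This is only a sketch: the function name `decode_after_packing` is hypothetical, and the reference value is assumed to be 0.0 so that the packed values fall on whole multiples of twice the packing error, as in the figure.

```python
import math


def decode_after_packing(value, packing_error, reference=0.0):
    """Value recovered after packing to the nearest multiple of 2*packing_error."""
    spacing = 2 * packing_error
    count = math.floor((value - reference) / spacing + 0.5)
    return reference + count * spacing


for true_value in (291.7, 294.6, 297.4):
    case_a = decode_after_packing(true_value, packing_error=0.5)
    case_b = decode_after_packing(true_value, packing_error=1.0)
    print(f"true={true_value}  (a) {case_a}  (b) {case_b}")
```

The same true value can therefore move down (294.6) or up (297.4) when the packing error doubles.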

How small spurious negative - and positive - values appear when de-accumulating an accumulated field

The example below for a small grid of 9 points shows how the packing error and hence the packed values change as the range of the values increases.  The deaccumulated packed values are the difference between the packed values at the current step and those at the previous step.  Here, a "step" can be considered as any time step where the data has been stored, and hence packed, in GRIB. Typically, this will be a forecast time step in hours.  In this example, the values are packed allowing for 8 bits per value.

Step=0

Accumulated model values:
  0.00000   0.00000   0.00000
  0.00000   0.00000   0.00000
  0.00000   0.00000   0.00000

Accumulated packed GRIB values:
  0.00000   0.00000   0.00000
  0.00000   0.00000   0.00000
  0.00000   0.00000   0.00000

No deaccumulated packed values or packing error are listed for this step.

  • All accumulations start with a zero value at step=0
  • The values at all points are stored exactly when packed in GRIB

Step=1

Accumulated model values:
  0.00000   1.50000  10.00000
  3.25000   4.55000   5.90000
  2.20000   0.03125   2.00000

Accumulated packed GRIB values:
  0.00000   1.50000  10.00000
  3.25000   4.56250   5.87500
  2.18750   0.06250   2.00000

Deaccumulated packed values:
  0.00000   1.50000  10.00000
  3.25000   4.56250   5.87500
  2.18750   0.06250   2.00000

Packing error: 0.03125

  • The model values increase at all points except the north west
  • Not all values can be represented exactly using 8 bits per value
  • In particular, the value of 0.031250 at the southern point is exactly the packing error and its packed value becomes 0.062500
  • In this case, the packed values are all multiples of 2*packing error=0.0625;  model values of 4.55, 5.9 and 2.2 are not stored precisely but only to the nearest multiple of 0.0625.

Step=2

Accumulated model values:
  0.00000   1.50000  20.00000
  3.25000   4.55000   5.90000
  2.20000   0.03125   2.00000

Accumulated packed GRIB values:
  0.00000   1.50000  20.00000
  3.25000   4.50000   5.87500
  2.25000   0.00000   2.00000

Deaccumulated packed values:
  0.00000   0.00000  10.00000
  0.00000  -0.06250   0.00000
  0.06250  -0.06250   0.00000

Packing error: 0.06250

  • Only the model value at the north east point increases
  • The increase in the range of values increases the packing error by a factor of 2
  • Now all packed values are multiples of 2*packing error=0.125
  • The value at the central point of 4.55 is now packed as 4.5 compared to a packed value of 4.5625 at the previous step - a negative accumulation of -0.0625
  • The value at the south west point of 2.2 is now packed as 2.25 compared to a packed value of 2.1875 at the previous step - a positive accumulation of +0.0625

Step=3

Accumulated model values:
  0.00000   1.50000  40.00000
  3.25000   4.55000   5.90000
  2.20000   0.03125   2.00000

Accumulated packed GRIB values:
  0.00000   1.50000  40.00000
  3.25000   4.50000   6.00000
  2.25000   0.00000   2.00000

Deaccumulated packed values:
  0.00000   0.00000  20.00000
  0.00000   0.00000   0.12500
  0.00000   0.00000   0.00000

Packing error: 0.12500

  • Again only the model value at the north east point increases further from 20.0 to 40.0
  • The increase in the range of values increases the packing error.
  • Now all packed values are multiples of 2*packing error=0.25
  • The model value at the eastern point is unchanged at 5.9 but now its packed value is 6.0 giving an apparent positive accumulation  of +0.125
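The 9-point example can be reproduced with the sketch below. It uses a simplified model of simple packing rather than ecCodes itself: the helper `simple_pack` is hypothetical, the decimal scale factor is assumed to be zero, and the spacing between packed values is assumed to be the smallest power of two for which the range fits into 8 bits. Points are listed row by row from north west to south east.

```python
import math


def simple_pack(values, bits_per_value):
    """Simplified simple packing; returns (decoded_values, packing_error)."""
    reference = min(values)
    value_range = max(values) - reference
    max_int = 2 ** bits_per_value - 1
    if value_range == 0:
        return list(values), 0.0
    exponent = 0
    while value_range > max_int * 2.0 ** exponent:
        exponent += 1
    while value_range <= max_int * 2.0 ** (exponent - 1):
        exponent -= 1
    spacing = 2.0 ** exponent
    decoded = [reference + math.floor((v - reference) / spacing + 0.5) * spacing
               for v in values]
    return decoded, spacing / 2


# Accumulated model values at steps 0 to 3; only the north east point changes
# after step 1.
base = [0.0, 1.5, 10.0, 3.25, 4.55, 5.9, 2.2, 0.03125, 2.0]
model_steps = [
    [0.0] * 9,
    base,
    base[:2] + [20.0] + base[3:],
    base[:2] + [40.0] + base[3:],
]

results = []      # one (packed, deaccumulated, packing_error) tuple per step
previous = None
for step, model in enumerate(model_steps):
    packed, error = simple_pack(model, bits_per_value=8)
    # De-accumulate by subtracting the packed values of the previous step.
    deacc = None if previous is None else [p - q for p, q in zip(packed, previous)]
    results.append((packed, deacc, error))
    print(f"step={step} packing error={error} deaccumulated={deacc}")
    previous = packed
```

Under these assumptions the sketch gives the same packing errors and the same spurious increments as the tables above: -0.0625 and +0.0625 at step 2, and +0.125 at step 3.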


Handling small precipitation accumulations - a real example

For accumulated fields, such as precipitation, the GRIB packing can have a strange effect on the totals, even when the data are used on the native model reduced Gaussian grid. In some cases, the values concerned are sufficiently small that the issue can be ignored, but for others - for example when looking at the frequency of zero rain, or light drizzle on forecast day 10 - action is needed. The figures below highlight the issue and its impacts:

Figure 3(a): Illustration showing the change in packing error (y-axis) at increasing lead times (x-axis). The plot shows the correspondence between the lead times where negative precipitation accumulations occur (red points) and the points where the packing error, and hence the discretisation of the field, changes (indicated by the "step" changes in the blue line). The negative accumulations occur due to the subtraction of values from two different levels of discretisation. The example shown is for total precipitation from a 10-day ENS forecast control run. Refer also to the annotation.

Figure 3(b): Ambiguity in identifying zero rain areas on day 10 in one example ENS forecast control run.

  • Green shows where there is zero rainfall in the field; this value is reliable.
  • Pink shows where there are negative (=-0.015mm) rainfall totals; these values should be set to zero.
  • Blue shows where there are positive (=+0.015mm) rainfall totals;  these also need to be set to zero because we do not know whether or not they represent zero.

On the map plot example in Figure 3(b) one can see that a large part of the world (blue & pink areas) is affected by the issue.  The graph in Figure 3(a) shows how the capacity of the model output to represent small values in 12h (or indeed any) periods diminishes at longer lead times. The discretisation will naturally be worse at the end of, say, the monthly forecast. This also means that the total areas of zero rain in a given period will become larger as the forecast progresses, purely because of the GRIB packing.

For some products or for verification this behaviour is undesirable. The strategy used by ECMWF to overcome this is to set to zero all values that are less than a positive threshold, x, in an accumulated field computed by subtraction. This threshold needs to be large enough to cater for the maximum possible discretisation at the end of the period in question.

Based on the specific analysis presented here, a discretisation level of ~0.03mm is reached for precipitation totals above 1000mm (from T+0); this is the final step on the example graph. In this case, it is recommended to set the threshold, x, to, say, 0.04 or 0.08 to cover this. These recommendations could be revised following analysis of additional forecasts: this particular analysis is based on one month of 00 UTC ENS forecast control runs, and the graph shown represents the worst case among these. Note also that they apply only to 10-day forecasts.
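A minimal sketch of this thresholding is given below. The function name `deaccumulate` and the sample values are hypothetical, the fields are plain Python lists of accumulated precipitation assumed to be in mm, and x=0.04 is used only as an example value.

```python
def deaccumulate(current, previous, threshold):
    """Subtract two accumulated fields and zero all increments below threshold."""
    result = []
    for cur, prev in zip(current, previous):
        increment = cur - prev
        # Increments below the threshold, including all negative ones, cannot
        # be distinguished from zero at this discretisation, so they are zeroed.
        result.append(increment if increment >= threshold else 0.0)
    return result


# Hypothetical accumulated precipitation (mm) at two consecutive steps.
previous = [0.0, 12.500, 3.015, 7.000]
current = [0.0, 12.485, 3.030, 9.500]

print(deaccumulate(current, previous, threshold=0.04))
```

Here the spurious increments of about -0.015 and +0.015 are both set to zero, while the genuine increment of 2.5 is kept.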

With regard to current ECMWF forecast products, very small values are sometimes seen on the precipitation scale of the ENS meteograms during dry spells. In such instances, the discretisation may affect the plotting. For the point rainfall products, which allow for multiplication of forecast values by up to ~25 in rare cases, ECMWF uses x=0.04 as the threshold.