...

  1. By default, with no options, grib_compare performs a bit-by-bit comparison of the two messages. If the messages are found to be bitwise different, grib_compare switches to a "key based" mode to find out which coded keys are different. To see how grib_compare works, we first set shortName=2d (2 metre dew point temperature) in the file regular_latlon_surface.grib1:

    Code Block
     > grib_set -s shortName=2d regular_latlon_surface.grib1 2d.grib1
    

    Then we can compare the two fields with grib_compare

    Code Block
    > grib_compare regular_latlon_surface.grib1 2d.grib1
     
    -- GRIB #1 -- shortName=2t paramId=167 stepRange=0 levelType=sfc level=0 packingType=grid_simple gridType=regular_ll --
    long [indicatorOfParameter]: [167] != [168]

    In the output we see that the only "coded" key with different values in the two messages is indicatorOfParameter, which is the key that carries the parameter information. The comparison can be forced to succeed by listing the keys with different values in the -b option:

    Code Block
    > grib_compare -b indicatorOfParameter regular_latlon_surface.grib1 2d.grib1
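
The two-stage logic described in this item (bitwise comparison first, then a key-by-key report, with some keys optionally excluded) can be sketched in a few lines of Python. This is only an illustration of the idea, not how grib_compare is implemented: the function name `compare_messages`, the byte strings and the dictionaries of keys are all hypothetical stand-ins for real GRIB messages.

```python
def compare_messages(raw_a, raw_b, keys_a, keys_b, ignore=()):
    """Return the sorted list of coded keys whose values differ.

    raw_a, raw_b   -- the encoded messages as bytes (toy data here)
    keys_a, keys_b -- hypothetical dicts holding the decoded coded keys
    ignore         -- keys to skip, playing the role of the -b option
    """
    # Stage 1: bit-by-bit comparison. Identical bytes: nothing to report.
    if raw_a == raw_b:
        return []
    # Stage 2: key based mode. Report every coded key whose value differs.
    differing = []
    for key in sorted(set(keys_a) | set(keys_b)):
        if key in ignore:
            continue
        if keys_a.get(key) != keys_b.get(key):
            differing.append(key)
    return differing


# Toy stand-ins for the two messages of the example above.
first = {"indicatorOfParameter": 167, "level": 0}
second = {"indicatorOfParameter": 168, "level": 0}

print(compare_messages(b"\x01\xa7", b"\x01\xa8", first, second))
print(compare_messages(b"\x01\xa7", b"\x01\xa8", first, second,
                       ignore={"indicatorOfParameter"}))
```

The first call reports indicatorOfParameter; the second reports nothing, because the only differing key is excluded.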


  2. Some options are provided to compare only a set of keys in the messages. The option -H compares only the headers coded in the message; it does not compare the data values. The option "-c key1:[i|d|s|n],key2:[i|d|s|n],... " can be used to compare a set of keys or namespaces. The letter after the colon is optional; it forces the type used in the comparison, which is otherwise assumed to be the native type of the key. The possible types are:

    • :i -> integer
    • :d -> floating point (C type double)
    • :s -> string
    • :n -> namespace.
    When the type "n" is used, all the keys belonging to the specified namespace are compared, each with its own native type. To illustrate how these options work, we change the values coded in a message using grib_filter with the following rules file (see grib_filter):
    Code Block
    set bitsPerValue=10;
    set values={1,2.5,3,4,5,6,70};
    write "first.grib1";
    set values={1,2.5,5,4,5,6,70};
    write "second.grib1";
    We first compare the two files using the -H option (only headers are compared)
    Code Block
    > grib_compare -H first.grib1 second.grib1
    The comparison is successful because the data are not compared. To compare only the data we have to compare the "data namespace".
    Code Block
    > grib_compare -c data:n first.grib1 second.grib1
     
    -- GRIB #1 -- shortName=t paramId=130 stepRange=0 levelType=ml level=1 packingType=grid_simple gridType=reduced_gg --
    double [packedValues]: 1 out of 7 different
     max absolute diff. = 2.0000000000000000e+00, relative diff. = 0.4
        max diff. element 2: 3.00000000000000000000e+00 5.00000000000000000000e+00
        tolerance=0.0000000000000000e+00 packingError: [0.0625005] [0.0625005]
        values max= [70]  [70]         min= [1] [1]
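
The figures in this output can be checked by hand from the two sets of values written by the rules file. The sketch below assumes that the relative difference is taken with respect to the larger of the two magnitudes; that definition is an assumption, chosen because it reproduces the 0.4 reported above (2/5). The helper names are hypothetical, and the real tool compares the packed values rather than Python lists.

```python
first = [1, 2.5, 3, 4, 5, 6, 70]
second = [1, 2.5, 5, 4, 5, 6, 70]


def relative_diff(a, b):
    """Difference relative to the larger magnitude (assumed definition)."""
    largest = max(abs(a), abs(b))
    return abs(a - b) / largest if largest else 0.0


def summarize(xs, ys):
    """Return (number differing, max abs diff, its index, max rel diff)."""
    abs_diffs = [abs(x - y) for x, y in zip(xs, ys)]
    count = sum(1 for d in abs_diffs if d > 0)
    max_abs = max(abs_diffs)
    index = abs_diffs.index(max_abs)
    max_rel = max(relative_diff(x, y) for x, y in zip(xs, ys))
    return count, max_abs, index, max_rel


def within_tolerance(xs, ys, absolute=0.0, relative=None):
    """True when every pair of values passes the chosen tolerance."""
    if relative is not None:
        return all(relative_diff(x, y) <= relative for x, y in zip(xs, ys))
    return all(abs(x - y) <= absolute for x, y in zip(xs, ys))


print(summarize(first, second))
print(within_tolerance(first, second))              # zero tolerance
print(within_tolerance(first, second, absolute=2))
```

With zero tolerance the single differing element makes the check fail, while an absolute tolerance of 2 or a relative tolerance of 0.4 is enough to accept it.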

    The comparison shows that one of the seven values is different when the (default) absolute tolerance of 0 is used. We can change the tolerance with the -A option:
    Code Block
    > grib_compare -A 2 -c data:n first.grib1 second.grib1
    and we see that the comparison is successful if the absolute tolerance is set to 2. We can also set the relative tolerance for each key with the option -R:
    Code Block
    > grib_compare -R packedValues=0.4 -c data:n first.grib1 second.grib1

    and we again get a successful comparison, because the relative difference between corresponding values does not exceed the relative tolerance. Another possible choice is to set the tolerance equal to the packingError, which is the error due to the packing algorithm. If we change the decimalPrecision of a packed field, we introduce a packing error that is sometimes bigger than the original packing error.

    Code Block
    > grib_set -s changeDecimalPrecision=0 first.grib1 third.grib1

    and we compare the two fields using the -P option (tolerance=packingError).
    Code Block
    > grib_compare -P -c data:n first.grib1 third.grib1

    The comparison is successful because the difference is within the bigger of the two packing errors. With the option -P the comparison fails only if the original coded data are different, not if the packing precision has changed. If we compare the fields again without the -P option:
    Code Block
    > grib_compare -c data:n first.grib1 third.grib1
     
    -- GRIB #1 -- shortName=t paramId=130 stepRange=0 levelType=ml level=1 packingType=grid_simple gridType=reduced_gg --
    double [packedValues]: 1 out of 7 different
     max absolute diff. = 5.0000000000000000e-01, relative diff. = 0.166667
        max diff. element 1: 2.50000000000000000000e+00 3.00000000000000000000e+00
        tolerance=0.0000000000000000e+00 packingError: [0.0625005] [0.5]
        values max= [70]  [70]         min= [1] [1]

    we see that one value is different and that the maximum absolute difference equals the bigger packing error (max diff=0.5, packingError=0.5). The packing error was chosen to be 0.5 by setting decimalPrecision to 0, which means that we don't need to preserve any decimal figures.
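
The effect of -P can be illustrated with a small Python sketch. It models decimalPrecision=0 as plain rounding to zero decimal places (round half up, so that 2.5 becomes 3 as in the output above) and takes the bigger of the two reported packing errors as the tolerance. Both choices are simplifying assumptions for illustration: the real packing works on scaled integers, and `keep_decimals` and `within_packing_error` are hypothetical helper names.

```python
import math


def keep_decimals(values, decimals):
    """Round half up to the given number of decimal places (assumed model)."""
    scale = 10 ** decimals
    return [math.floor(v * scale + 0.5) / scale for v in values]


def within_packing_error(xs, ys, packing_errors):
    """Compare with a tolerance equal to the bigger packing error."""
    tolerance = max(packing_errors)
    return all(abs(x - y) <= tolerance for x, y in zip(xs, ys))


first = [1, 2.5, 3, 4, 5, 6, 70]
third = keep_decimals(first, 0)

print(third)
# Packing errors as reported in the output above.
print(within_packing_error(first, third, (0.0625005, 0.5)))
# With no tolerance at all the same pair of fields is rejected.
print(within_packing_error(first, third, (0.0, 0.0)))
```

Only the second element changes (2.5 becomes 3), so the largest difference is 0.5: accepted when the tolerance is the packing error, rejected when the tolerance is zero.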

  3. When we already know that the fields are not numerically identical but have similar statistical characteristics, we can compare their statistics namespaces:

    Code Block
    > grib_compare -c statistics:n first.grib1 third.grib1
     
    -- GRIB #1 -- shortName=t paramId=130 stepRange=0 levelType=ml level=1 packingType=grid_simple gridType=reduced_gg --
    double [avg]: [1.30714285714285711748e+01] != [1.31428571428571423496e+01]
        absolute diff. = 0.0714286, relative diff. = 0.00543478
        tolerance=0
    double [sd]: [2.32907531796090587761e+01] != [2.32589679873534969090e+01]
        absolute diff. = 0.0317852, relative diff. = 0.00136471
        tolerance=0
    double [skew]: [2.02295027950165895447e+00] != [2.02385673400705590197e+00]
        absolute diff. = 0.000906455, relative diff. = 0.000447885
        tolerance=0
    double [kurt]: [2.12697527593972246507e+00] != [2.12906658242618895827e+00]
        absolute diff. = 0.00209131, relative diff. = 0.000982264
        tolerance=0

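    and we see that maximum, minimum, average, standard deviation, skewness and kurtosis are compared (maximum and minimum are equal in the two fields, so they are not reported). While the values differ by up to 0.5, the differences in the statistics are much smaller: the largest, for avg, is about 0.071. With an absolute tolerance of 0.052, only avg is still reported as different: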
    Code Block
    > grib_compare -A 0.052 -c statistics:n first.grib1 third.grib1
     
    -- GRIB #1 -- shortName=t paramId=130 stepRange=0 levelType=ml level=1 packingType=grid_simple gridType=reduced_gg --
    double [avg]: [1.30714285714285711748e+01] != [1.31428571428571423496e+01]
        absolute diff. = 0.0714286, relative diff. = 0.00543478
        tolerance=0.052
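
The avg and sd figures in these outputs can be reproduced from the raw values. The sketch below assumes that sd is the population standard deviation (dividing by N), which matches the numbers shown, and that third.grib1 holds the values of first.grib1 rounded to zero decimals. Both are inferences from the output, not statements about the library internals; skew and kurt are left out because their exact definitions are not given here.

```python
import math

first = [1, 2.5, 3, 4, 5, 6, 70]
third = [1, 3, 3, 4, 5, 6, 70]  # assumed content after decimalPrecision=0


def avg_and_sd(values):
    """Mean and population standard deviation (divide by N)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, math.sqrt(variance)


for name, a, b in zip(("avg", "sd"), avg_and_sd(first), avg_and_sd(third)):
    print(f"{name}: {a:.6f} vs {b:.6f}, absolute diff = {abs(a - b):.6f}")
```

The avg difference of about 0.071 exceeds the 0.052 tolerance, while the sd difference of about 0.032 does not, which is consistent with avg being the only key still reported.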

    The statistics namespace is also available for spherical harmonics data. It provides information about the field in geographic space, but the statistics are computed in spectral space for performance reasons.

...