Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

The outline of this page is :

1) Problem description

2) Program flow

3) Test data file and caveats


Data date of  predefined data set is: 2019-10-15 till 2019-10-17

1) Description


The WIGOS id contains four parts such as 0-2XXXX-0-YYYYY, 

...

old stations and their  WIGOS ids.


2)Program description

Code Block
languagepy
'''
Created on 22 Oct 2019


# Copyright 2005-2018 ECMWF.
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction
   
This is a test program to encode Wigos Synop
requires
   
1) ecCodes version 2.14.81 or above (available at https://confluence.ecmwf.int/display/ECC/Releases)
2) python3.6.8-01
   
To run the program
   
-i <input bufr >./addWigosProg.py  -m <mode [web|json]>  -l <logFile>  -o <output BUFR file>i synop_multi_subset.bufr -o out_synop_multisubset.bufr  -w WIGOS_TEMP_IDENT.csv
   
      
Uses BUFR version 4 template  and adds the WIGOS Identifier 301150
REQUIRES TablesVersionNumber above 28
   
Author : Roberto Ribas Garcia ECMWF 28/10/2019

'''


from eccodes import *
import argparse 
import json 
import re 
import pandas as pd 
import numpy as np 
import Modifications
    performance improvement ( uses skipExtraKeyAttributes)  and codes_clone   04/11/2019
    changes for SYNOP and TEMP messages                                       05/11/2019
    fixed codes_clone issue                                                   05/11/2019

'''
from eccodes import *
import argparse 
import json 
import re 
import pandas as pd 
import numpy as np 
import logging 
import requests 
import os 

def read_cmd_line():
    p=argparse.ArgumentParser()
    p.add_argument("-i","--input",help="input bufr file")
    p.add_argument("-o","--output",help="output bufr file with wigos")
    p.add_argument("-m","--mode",choices=["web","json"],help=" wigos source [ json file or web ]")
    p.add_argument("-l","--logfile",help="log file ")
    args=p.parse_args()
    return args 
    
def read_oscar_json(jsonFile):
    with open(jsonFile,"r") as f:
        jtext=json.load(f)
    return jtext 

def read_oscar_web(oscarURL="https://oscar.wmo.int/surface/rest/api/search/station?"):
    r=requests.get(oscarURL)
    jtext=json.loads(r.text)
    return jtext 

def parse_json_into_dataframe(jtext):
    '''
    parses the JSON from the file wigosJsonFile
    filters the stations by wigosStationIdentifiers key in the dictionaries
    '''
    
    wigosStations=[]
    nowigosStations=[]
    for d in jtext:
        if "wigosStationIdentifiers" in d.keys():
            wigosStations.append(d)
        else:
            nowigosStations.append(d)
    
    '''
    uses only the wigos 0-20XXX-0-YYYYY (surface)
    '''
    p=re.compile("0-20\d{3}-0-\d{5}")

    fwigosStations=[]
    for d in wigosStations:
        wigosInfo=d["wigosStationIdentifiers"]
        for e in wigosInfo:
            if e["primary"]==True:
                wigosId=e["wigosStationIdentifier"]
                if p.match(wigosId):
                    wigosParts=wigosId.split("-")
                    d["wigosIdentifierSeries"]=wigosParts[0]
                    d["wigosIssuerOfIdentifier"]=wigosParts[1]
                    d["wigosIssueNumber"]=wigosParts[2]
                    d["wigosLocalIdentifierCharacter"]=wigosParts[3]
                    d["oldID"]=wigosParts[3][-5:]
                    fwigosStations.append(d)
                    
    df=pd.DataFrame(fwigosStations)
    df=df[["longitude","latitude","name","wigosStationIdentifiers","wigosIdentifierSeries","wigosIssuerOfIdentifier","wigosIssueNumber",
           "wigosLocalIdentifierCharacter","oldID"]]  
    return df

def get_ident(bid):
    '''
    gets the ident of the message by combining blockNumber and stationNumber keys from the input BUFR file
    the ident may be single valued or multivalued ( only single valued are considered further) considered further)
    
    '''
    ident=None 
    if ( codes_is_defined(bid, "blockNumber") and codes_is_defined(bid,"stationNumber") ):
    '''
    ident=None 
    if ( blockNumber=codes_isget_definedarray(bid, "blockNumber") and 
        stationNumber=codes_isget_definedarray(bid,"stationNumber")
        if len(blockNumber)==1 and len(stationNumber):
==1:
            blockNumber=codes_get_array(bid,"blockNumber")
ident="{0:02d}{1:03d}".format(int(blockNumber),int(stationNumber))
        elif stationNumber=codes_get_array(bid,"stationNumber")
len(blockNumber)==1 and len(stationNumber)!=1:
           if len(blockNumber)==1 and =np.repeat(blockNumber,len(stationNumber))==1:
            ident=[str("{0:02d}{1:03d}".format(intb,s)) for b,s in zip(blockNumber),int(stationNumber))
 
                  elif len(blockNumber)==1 if b!=CODES_MISSING_LONG and len(stationNumber)s!=1:CODES_MISSING_LONG] 
        elif len(blockNumber)!=1   blockNumber=np.repeat(blockNumber,and len(stationNumber))!=1:
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber) 
                   if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG] 
        elif len(blockNumber)!=1 and len(stationNumber)!=1:'''
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber) 
    here only the first element of the list is returned to the main program
        this avoids lists being used in the if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG]
 dataframe query and breaking the logic
        '''
   
    return ident 

def copy_header(bid,obidif isinstance(ident,list):
    '''
    this function copies the header keys 
    ''' ident=ident[0]
    return ident 


    bhc=codes_get(bid,"bufrHeaderCentre")
    codes_set(obid,"bufrHeaderCentre",bhc)

def add_wigos_info(ident,bid,odf,obid):
    bhsc=codes_get(bid,"bufrHeaderSubCentre")'''
    codes_set(obid,"bufrHeaderSubCentre",bhsc)
    usn=codes_get(bid,"updateSequenceNumber")
    codes_set(obid,"updateSequenceNumber",usn)
    dc=codes_get(bid,"dataCategory")add the wigos information to the message ident pointed by bid
    the odf contains the WIGOS information for ident 
    codes_set(obid,"dataCategory",dc)
   obid is the output handle
    dsc=codes_get(bid,"dataSubCategory")
'''
   
  codes_set(obid,"dataSubCategory",dsc)  
    year=if codes_is_getdefined(bid, "typicalYearshortDelayedDescriptorReplicationFactor"):
    codes_set(obid,"typicalYear",year)
    monthshortDelayed=codes_get_array(bid,"typicalMonthshortDelayedDescriptorReplicationFactor")
    codes_set(obid,"typicalMonth",month)
else:
        day=codes_get(bid,"typicalDay")
shortDelayed=None 

    if codes_is_setdefined(obidbid, "typicalDaydelayedDescriptorReplicationFactor",day):
    hour    delayedDesc=codes_get_array(bid,"typicalHourdelayedDescriptorReplicationFactor")
    codes_set(obid,"typicalHour",hour)
else:
        delayedDesc=None 
       tmin=codes_get(bid,"typicalMinute") 
    if codes_is_setdefined(obidbid, "typicalMinuteextendedDelayedDescriptorReplicationFactor",tmin):
    sec    extDelayedDesc=codes_get_array(bid,"typicalSecondextendedDelayedDescriptorReplicationFactor")
    codes_set(obid,"typicalSecond",sec)
else:
       return extDelayedDesc=None 

    
    

def add_wigos_info(ident,bid,wdf,obid):    nsubsets=codes_get(bid,"numberOfSubsets")
    '''compressed=codes_get(bid,"compressedData")
    add
 the wigos information to the message ident pointed by bid masterTablesVersionNumber=codes_get(bid,"masterTablesVersionNumber")
    if masterTablesVersionNumber<28:
    the wdf is the wholemasterTablesVersionNumber=28
 wigos dataframe and obid is the output bid
    '''unexpandedDescriptors=codes_get_array(bid,"unexpandedDescriptors")
    
    outUD=list(unexpandedDescriptors)
    if codes_is_defined(bid, "shortDelayedDescriptorReplicationFactor"):outUD.insert(0,301150)
        shortDelayed=codes_get_array(bid,"shortDelayedDescriptorReplicationFactor")
    else:'''
    only treat the uncompressed messages with 1 shortDelayed=Nonesubset 

    for if codes_is_defined(bid, "delayedDescriptorReplicationFactor"):
        delayedDesc=codes_get_array(bid,"delayedDescriptorReplicationFactor")future add treatment of compressed messages with more than 1 subset
    else:'''
    
    delayedDesc=None if compressed==0 and nsubsets==1:
        '''
    

    IMPORTANT, takes into account 
    nsubsets=codes_get(bid,"numberOfSubsets")
    compressed=codes_get(bid,"compressedData")delayed replications ( all possible cases) to accommodate
    
    masterTablesVersionNumber=codes_get(bid,"masterTablesVersionNumber")
    if masterTablesVersionNumber<28:SYNOP + TEMP messages 
        masterTablesVersionNumber=28'''
        if shortDelayed is not None:
    unexpandedDescriptors=codes_get_array(bid,"unexpandedDescriptors")
     outUD=list(unexpandedDescriptors)
    outUD.insert(0,301150codes_set_array(obid,"inputShortDelayedDescriptorReplicationFactor",shortDelayed)
        
if delayedDesc is not '''None:
    only treat the uncompressed messages with 1 subset  codes_set_array(obid,"inputDelayedDescriptorReplicationFactor",delayedDesc)
    for future add treatment of compressed messages with more than 1 subset
if extDelayedDesc is not None:
     '''
    
    if compressed==0 and nsubsets==1:
codes_set_array(obid,"inputExtendedDelayedDescriptorReplicationFactor",extDelayedDesc)
            if

 shortDelayed is not None:
    codes_set(obid,"masterTablesVersionNumber",masterTablesVersionNumber)
        codes_set_array(obid,"inputShortDelayedDescriptorReplicationFactornumberOfSubsets",shortDelayednsubsets)
        if
 delayedDesc is not None:
    
        codes_set_array(obid, "inputDelayedDescriptorReplicationFactorunexpandedDescriptors",delayedDesc)outUD)
        wis=odf["wigosIdentifierSeries"].values 
        copy_header(bid,obid)
if len(wis)!=1:
          codes_set(obid,"masterTablesVersionNumber",masterTablesVersionNumber)  wis=wis[0]
        codes_set(obid,"numberOfSubsets",nsubsetswigosIdentifierSeries",int(wis))
        odf=wdf.query("oldID=='{0}'".format(ident))wid=odf["wigosIssuerOfIdentifier"].values 
        if not odf.empty:
len(wid)!=1:
            wid=wid[0]
        codes_set_array(obid, "unexpandedDescriptorswigosIssuerOfIdentifier",outUDint(wid))
            wiswin=odf["wigosIdentifierSerieswigosIssueNumber"].values 
            if len(wiswin)!=1:
                wis=wiswin=win[0]
            codes_set(obid,"wigosIdentifierSerieswigosIssueNumber",int(wiswin))
            
       wid wlid=odf["wigosIssuerOfIdentifierwigosLocalIdentifierCharacter"].values 
            if len(wid)!=1:wlid="{0:5}".format(wlid[0])
                wid=wid[0]
 logging.info(" wlid here {0}".format(wlid))
           codes_set(obid,"wigosIssuerOfIdentifierwigosLocalIdentifierCharacter",intstr(widwlid))
        codes_bufr_copy_data(bid,obid)
    else:
     win=odf["wigosIssueNumber"].values 
  logging.info(" skipping compressed  message id {0} with {1}  if len(win)!=1:subsets ".format(ident,nsubsets))
    
    return 
       win=win[0]
     

def main():
    print("ecCodes version {0}".format(codes_set(obid,"wigosIssueNumber",int(win))            _get_api_version()))
    args=read_cmd_line()
    logfile=args.logfile 
       wlid=odf["wigosLocalIdentifierCharacter"].values logging.basicConfig(filename=logfile,level=logging.INFO,filemode="w")
    
        wlid="{0:5}".format(wlid[0])infile=args.input 
    
    outfile=args.output 
   
 logging.info(" wlid here {0}".format(wlid)) mode=args.mode 
    if mode=="web":
        codes_set(obid,"wigosLocalIdentifierCharacter",str(wlid))
jtext=read_oscar_web()
        cdirectory=os.getcwd()
        codes_bufr_copy_data(bid,obidoscarFile=os.path.join(cdirectory,"oscar.json")
        else:
   with open(oscarFile,"w") as f:
         logging.info(" wigos {0} is empty for ident {1}".format(ident,odf["wigosLocalIdentifierCharacter"].values)) json.dump(jtext,f)
    else:
        loggingcdirectory=os.infogetcwd(")
 skipping compressed  message id {0} with {1} subsets ".format(ident,nsubsets))oscarFile=os.path.join(cdirectory,"oscar.json")
    
    return obid
    
     

def main()with open(oscarFile,"r") as f:
    args=read_cmd_line()
    logfile=args.logfile 
    logging.basicConfig(filename=logfile,level=logging.INFO,filemode="w"jtext=json.load(f)
    
      infile=args.input 
       
    outfile=args.output    
      wigosDf=parse_json_into_dataframe(jtext)
    mode=args.mode 
    if mode=="web":f=open(infile,"rb")
        jtext=read_oscar_web(nmsg=codes_count_in_file(f)
        cdirectory=os.getcwd(fout=open(outfile,"wb")
    for i   oscarFile=os.path.join(cdirectory,"oscar.json")in range(0,nmsg):
        with open(oscarFile,"w") as f:bid=codes_bufr_new_from_file(f)
            json.dump(jtext,fobid=codes_clone(bid)
    else:
    codes_set(bid,    cdirectory=os.getcwd('skipExtraKeyAttributes', 1)
        oscarFile=os.path.join(cdirectory,"oscar.json"codes_set(bid,"unpack",1)
        with open(oscarFile,"r") as f:ident=get_ident(bid)
       
        if ident:
   jtext=json.load(f)
         logging.info (" 
\t message {0} ident    
        {1} ".format(i+1,ident))

    wigosDf=parse_json_into_dataframe(jtext)
    
    fodf=openwigosDf.query(infile,"rb")
    nmsg=codes_count_in_file(f)
    fout=open(outfile,"wb")
"oldID=='{0}'".format(ident))           for i in range(0,nmsg):
  
      obid=codes_bufr_new_from_samples("BUFR4")
      if  bid=codes_bufr_new_from_file(f)not odf.empty:
        codes_set(bid,"unpack",1)
        ident=get_ident(bidadd_wigos_info(ident,bid, odf,obid)
        if ident:
       codes_write(obid,fout)
     logging.info (" \t message {0} ident {1} ".format(i+1,ident)) else:
            add_wigos_info(ident,bid, wigosDf, obid)
     logging.info(" wigos {0} is empty for  codes_write(obid,foutident {1}".format(ident,odf["wigosLocalIdentifierCharacter"].values))
    
        else:
            logging.info ("message {0} rejected ".format(i+1))
        codes_release(obid)        
        codes_release(bid)
    f.close()    
   
    print (" finished")


if __name__ == '__main__':
    main()


The program can be called with the following arguments

...

that are uncompressed ( compressed =0) and single subset ( numberOfSubsets=1) if their ident matches the ones in wigosDf.

5) a new function ( copy_header) was added to avoid changing the header of the message. Now, it copies the keys from bid to obid except  typicalDate which is read onlyIf  get_ident function founds many idents on a message only returns the first one.


During program execution a log  file is generated containing information about the processing.

...

  • Only uncompressed messages  (compressed =0) and  single subset (numberOfSubsets=1) are considered
  • The Oscar information retrieved from the web server has to be cleared for this program to work. This is the goal of the function parse_json_into_dataframe that uses regular expressions to filter out the WIGOS data.
  • When setting the WIGOS information It is important to preserve the data types , for example "wigosLocalIdentifierCharacter" is a character string. 
  • The masterTablesVersionNumber must be above 28 otherwise no WIGOS ids can be added. This is done in the add_wigos_info function that updates the table version number key for each message processed.


Results


The output file contains 22724 messagesfile contains 19543  SYNOP messages obtained from running the program on a input BUFR file containing raw SYNOP data received through GTS




View file
nameout_synop_wigos.bufr
height250

This file contains 7 TEMP messages obtained running the program on a BUFR file containing raw TEMP messages.

View file
namenewOutputout_temp_wigos.bufr
height250