
...

The outline of this page is:

1) Problem description

2) Program flow

3) Test data file and caveats

The predefined data set covers the dates 2019-10-15 to 2019-10-17.

1) Description

The WIGOS id contains four parts such as 0-2XXXX-0-YYYYY,

...

old stations and their WIGOS ids.
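As a minimal sketch of the four-part structure described above, the snippet below splits a WIGOS id into its parts and derives the old five-digit station ident from the last one. The function name `split_wigos_id` and the sample id are illustrative assumptions; the pattern and the key names follow those used by the program in section 2.

```python
import re

# Surface-station pattern used on this page: 0-20XXX-0-YYYYY
SURFACE_WIGOS = re.compile(r"0-20\d{3}-0-\d{5}")


def split_wigos_id(wigos_id):
    """Return the four WIGOS parts plus the old 5-digit ident, or None.

    Hypothetical helper for illustration only.
    """
    if not SURFACE_WIGOS.match(wigos_id):
        return None
    series, issuer, issue_number, local_id = wigos_id.split("-")
    return {
        "wigosIdentifierSeries": int(series),
        "wigosIssuerOfIdentifier": int(issuer),
        "wigosIssueNumber": int(issue_number),
        # kept as a string so that leading zeros survive
        "wigosLocalIdentifierCharacter": local_id,
        "oldID": local_id[-5:],
    }


parts = split_wigos_id("0-20000-0-03772")  # sample id, for illustration
print(parts)
```

The local identifier stays a string while the other three parts become integers, which mirrors the data types the program passes to ecCodes.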

2) Program description

Code Block

language	py

'''
Created on 22 Oct 2019


# Copyright 2005-2018 ECMWF.
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction

This is a test program to encode Wigos Synop
requires

1) ecCodes version 2.14.1 or above (available at https://confluence.ecmwf.int/display/ECC/Releases)
2) python3.6.8-01

To run the program

./addWigosProg.py -i <input bufr> -m <mode [web|json]> -l <logFile> -o <output BUFR file>

Uses BUFR version 4 template and adds the WIGOS Identifier 301150
REQUIRES TablesVersionNumber 28 or above

Author : Roberto Ribas Garcia ECMWF 28/10/2019

Modifications
    performance improvement (uses skipExtraKeyAttributes) and codes_clone   04/11/2019
    changes for SYNOP and TEMP messages                                     05/11/2019
    fixed codes_clone issue                                                 05/11/2019

'''
from eccodes import *
import argparse
import json
import re
import pandas as pd
import numpy as np
import logging
import requests
import os

def read_cmd_line():
    p=argparse.ArgumentParser()
    p.add_argument("-i","--input",help="input bufr file")
    p.add_argument("-o","--output",help="output bufr file with wigos")
    p.add_argument("-m","--mode",choices=["web","json"],help=" wigos source [ json file or web ]")
    p.add_argument("-l","--logfile",help="log file ")
    args=p.parse_args()
    return args

def read_oscar_json(jsonFile):
    with open(jsonFile,"r") as f:
        jtext=json.load(f)
    return jtext

def read_oscar_web(oscarURL="https://oscar.wmo.int/surface/rest/api/search/station?"):
    r=requests.get(oscarURL)
    jtext=json.loads(r.text)
    return jtext

def parse_json_into_dataframe(jtext):
    '''
    parses the JSON from the file wigosJsonFile
    filters the stations by wigosStationIdentifiers key in the dictionaries
    '''
    wigosStations=[]
    nowigosStations=[]
    for d in jtext:
        if "wigosStationIdentifiers" in d.keys():
            wigosStations.append(d)
        else:
            nowigosStations.append(d)

    # uses only the wigos 0-20XXX-0-YYYYY (surface)
    # raw string, so that \d is not treated as a string escape
    p=re.compile(r"0-20\d{3}-0-\d{5}")

    fwigosStations=[]
    for d in wigosStations:
        wigosInfo=d["wigosStationIdentifiers"]
        for e in wigosInfo:
            if e["primary"]==True:
                wigosId=e["wigosStationIdentifier"]
                if p.match(wigosId):
                    wigosParts=wigosId.split("-")
                    d["wigosIdentifierSeries"]=wigosParts[0]
                    d["wigosIssuerOfIdentifier"]=wigosParts[1]
                    d["wigosIssueNumber"]=wigosParts[2]
                    d["wigosLocalIdentifierCharacter"]=wigosParts[3]
                    # old 5 digit ident (blockNumber+stationNumber)
                    d["oldID"]=wigosParts[3][-5:]
                    fwigosStations.append(d)

    df=pd.DataFrame(fwigosStations)
    df=df[["longitude","latitude","name","wigosStationIdentifiers","wigosIdentifierSeries","wigosIssuerOfIdentifier","wigosIssueNumber",
           "wigosLocalIdentifierCharacter","oldID"]]
    return df

def get_ident(bid):
    '''
    gets the ident of the message by combining blockNumber and stationNumber keys from the input BUFR file
    the ident may be single valued or multivalued ( only single valued are considered further)
    '''
    ident=None
    if ( codes_is_defined(bid, "blockNumber") and codes_is_defined(bid,"stationNumber") ):
        blockNumber=codes_get_array(bid,"blockNumber")
        stationNumber=codes_get_array(bid,"stationNumber")
        if len(blockNumber)==1 and len(stationNumber)==1:
            # index the arrays explicitly instead of calling int() on a 1-element array
            ident="{0:02d}{1:03d}".format(int(blockNumber[0]),int(stationNumber[0]))
        elif len(blockNumber)==1 and len(stationNumber)!=1:
            blockNumber=np.repeat(blockNumber,len(stationNumber))
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber)
                   if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG]
        elif len(blockNumber)!=1 and len(stationNumber)!=1:
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber)
                   if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG]
        # here only the first element of the list is returned to the main program
        # this avoids lists being used in the dataframe query and breaking the logic
        # (an empty list, i.e. all values missing, gives None)
        if isinstance(ident,list):
            ident=ident[0] if ident else None
    return ident

def add_wigos_info(ident,bid,odf,obid):
    '''
    add the wigos information to the message ident pointed by bid
    the odf contains the WIGOS information for ident
    obid is the output handle
    '''
    if codes_is_defined(bid, "shortDelayedDescriptorReplicationFactor"):
        shortDelayed=codes_get_array(bid,"shortDelayedDescriptorReplicationFactor")
    else:
        shortDelayed=None

    if codes_is_defined(bid, "delayedDescriptorReplicationFactor"):
        delayedDesc=codes_get_array(bid,"delayedDescriptorReplicationFactor")
    else:
        delayedDesc=None

    if codes_is_defined(bid, "extendedDelayedDescriptorReplicationFactor"):
        extDelayedDesc=codes_get_array(bid,"extendedDelayedDescriptorReplicationFactor")
    else:
        extDelayedDesc=None

    nsubsets=codes_get(bid,"numberOfSubsets")
    compressed=codes_get(bid,"compressedData")

    # the WIGOS sequence 301150 needs table version 28 or above
    masterTablesVersionNumber=codes_get(bid,"masterTablesVersionNumber")
    if masterTablesVersionNumber<28:
        masterTablesVersionNumber=28

    # prepend the WIGOS sequence to the descriptors of the input message
    unexpandedDescriptors=codes_get_array(bid,"unexpandedDescriptors")
    outUD=list(unexpandedDescriptors)
    outUD.insert(0,301150)

    # only treat the uncompressed messages with 1 subset
    # for future add treatment of compressed messages with more than 1 subset
    if compressed==0 and nsubsets==1:
        # IMPORTANT, takes into account delayed replications ( all possible cases) to accommodate
        # SYNOP + TEMP messages
        if shortDelayed is not None:
            codes_set_array(obid,"inputShortDelayedDescriptorReplicationFactor",shortDelayed)
        if delayedDesc is not None:
            codes_set_array(obid,"inputDelayedDescriptorReplicationFactor",delayedDesc)
        if extDelayedDesc is not None:
            codes_set_array(obid,"inputExtendedDelayedDescriptorReplicationFactor",extDelayedDesc)

        codes_set(obid,"masterTablesVersionNumber",masterTablesVersionNumber)
        codes_set(obid,"numberOfSubsets",nsubsets)
        codes_set_array(obid, "unexpandedDescriptors",outUD)

        # odf may hold several rows for the same ident, the first one is used
        wis=odf["wigosIdentifierSeries"].values[0]
        codes_set(obid,"wigosIdentifierSeries",int(wis))
        wid=odf["wigosIssuerOfIdentifier"].values[0]
        codes_set(obid,"wigosIssuerOfIdentifier",int(wid))
        win=odf["wigosIssueNumber"].values[0]
        codes_set(obid,"wigosIssueNumber",int(win))
        wlid=odf["wigosLocalIdentifierCharacter"].values
        wlid="{0:5}".format(wlid[0])
        logging.info(" wlid here {0}".format(wlid))
        codes_set(obid,"wigosLocalIdentifierCharacter",str(wlid))
        codes_bufr_copy_data(bid,obid)
    else:
        logging.info(" skipping compressed  message id {0} with {1} subsets ".format(ident,nsubsets))
    return obid

def main():
    args=read_cmd_line()
    logfile=args.logfile
    logging.basicConfig(filename=logfile,level=logging.INFO,filemode="w")

    infile=args.input
    outfile=args.output
    mode=args.mode
    if mode=="web":
        # download the station list and keep a local copy as oscar.json
        jtext=read_oscar_web()
        cdirectory=os.getcwd()
        oscarFile=os.path.join(cdirectory,"oscar.json")
        with open(oscarFile,"w") as f:
            json.dump(jtext,f)
    else:
        # reuse the oscar.json saved by a previous run in web mode
        cdirectory=os.getcwd()
        oscarFile=os.path.join(cdirectory,"oscar.json")
        with open(oscarFile,"r") as f:
            jtext=json.load(f)

    wigosDf=parse_json_into_dataframe(jtext)

    f=open(infile,"rb")
    nmsg=codes_count_in_file(f)
    fout=open(outfile,"wb")
    for i in range(0,nmsg):
        bid=codes_bufr_new_from_file(f)
        # the clone keeps the header keys of the input message
        obid=codes_clone(bid)
        codes_set(bid,'skipExtraKeyAttributes', 1)
        codes_set(bid,"unpack",1)
        ident=get_ident(bid)

        if ident:
            logging.info (" \t message {0} ident {1} ".format(i+1,ident))
            odf=wigosDf.query("oldID=='{0}'".format(ident))
            if not odf.empty:
                add_wigos_info(ident,bid, odf,obid)
                codes_write(obid,fout)
            else:
                logging.info(" wigos {0} is empty for ident {1}".format(ident,odf["wigosLocalIdentifierCharacter"].values))
        else:
            logging.info ("message {0} rejected ".format(i+1))
        codes_release(obid)
        codes_release(bid)
    f.close()
    fout.close()

    print (" finished")


if __name__ == '__main__':
    main()

The program can be called with the following arguments

...

4) For each message, create the message identifier (concatenation of blockNumber and stationNumber) and add the WIGOS information to the messages that are uncompressed (compressedData=0) and single subset (numberOfSubsets=1), if their ident matches one of those in wigosDf.

5) If the get_ident function finds several idents in a message, it returns only the first one.
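The ident handling in steps 4 and 5 can be sketched without ecCodes as follows. This is an illustration under assumptions, not the program's own code: `build_ident` is a hypothetical helper, plain lists stand in for the arrays returned by ecCodes, and `MISSING` is assumed to play the role of ecCodes' `CODES_MISSING_LONG`.

```python
# Assumption: stands in for ecCodes' CODES_MISSING_LONG
MISSING = 2147483647


def build_ident(block_numbers, station_numbers, missing=MISSING):
    """Return the first 5-character ident found in a message, or None.

    The ident is blockNumber (2 digits) followed by stationNumber (3 digits).
    """
    # A single block number may apply to several station numbers
    if len(block_numbers) == 1 and len(station_numbers) > 1:
        block_numbers = list(block_numbers) * len(station_numbers)
    idents = [
        "{0:02d}{1:03d}".format(int(b), int(s))
        for b, s in zip(block_numbers, station_numbers)
        if b != missing and s != missing
    ]
    # Only the first ident is kept, as in step 5
    return idents[0] if idents else None


print(build_ident([3], [772]))
print(build_ident([3], [MISSING, 5, 772]))
```

Returning a single string rather than a list keeps the later lookup against the `oldID` column a simple equality test.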

During program execution a log file is generated containing information about the processing.

At this point some caveats are needed:

Only uncompressed messages (compressedData=0) with a single subset (numberOfSubsets=1) are considered.
The OSCAR information retrieved from the web server has to be cleaned before this program can use it. This is the job of the function parse_json_into_dataframe, which uses a regular expression to keep only the stations whose primary WIGOS id has the surface form 0-20XXX-0-YYYYY.
When setting the WIGOS information it is important to preserve the data types; for example, "wigosLocalIdentifierCharacter" is a character string, while the other three parts are set as integers.
The masterTablesVersionNumber must be 28 or above, otherwise no WIGOS ids can be added. The add_wigos_info function takes care of this by raising the table version number to 28 for each processed message that has a lower one.
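A minimal sketch of the cleaning and version caveats above, using only the standard library. The record layout mirrors the keys the program reads from the OSCAR JSON, but the sample records, `clean_oscar` and `output_table_version` are hypothetical and for illustration only.

```python
import re

# Same surface pattern the program filters on: 0-20XXX-0-YYYYY
SURFACE_WIGOS = re.compile(r"0-20\d{3}-0-\d{5}")


def clean_oscar(records):
    """Map the old 5-digit ident to the primary surface WIGOS id."""
    lookup = {}
    for rec in records:
        # Records without the key are skipped, like nowigosStations in the program
        for entry in rec.get("wigosStationIdentifiers", []):
            wigos_id = entry.get("wigosStationIdentifier", "")
            if entry.get("primary") and SURFACE_WIGOS.match(wigos_id):
                lookup[wigos_id.split("-")[3][-5:]] = wigos_id
    return lookup


def output_table_version(version, minimum=28):
    """Raise the table version to the minimum that can carry WIGOS ids."""
    return max(version, minimum)


# Made-up records in the assumed OSCAR layout
sample = [
    {"name": "A", "wigosStationIdentifiers": [
        {"primary": True, "wigosStationIdentifier": "0-20000-0-03772"}]},
    {"name": "B", "wigosStationIdentifiers": [
        {"primary": False, "wigosStationIdentifier": "0-20000-0-10384"}]},
    {"name": "C"},
]
lookup = clean_oscar(sample)
print(lookup, output_table_version(14), output_table_version(31))
```

A dictionary is used here only to keep the sketch free of dependencies; the program itself builds a pandas DataFrame and queries it by `oldID`.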

Results

The output file contains 19543 SYNOP messages obtained by running the program on an input BUFR file containing raw SYNOP data received through the GTS.

View file

name	out_synop_wigos.bufr
height	250

The following file contains 7 TEMP messages obtained by running the program on a BUFR file containing raw TEMP messages.

View file

name	out_temp_wigos.bufr
height	250

