As providers start to add WIGOS ids to their data, the need has arisen to test how the NWP model copes with them.

To this end, a Python3 program has been created to add WIGOS ids to the current SYNOP messages received at ECMWF.

The outline of this page is

1) Problem description

2) Program flow

3) Test data file and caveats


1) Description


The WIGOS id contains four parts, for example 0-2XXXX-0-YYYYY:

wigosIdentifierSeries   Issuer of Identifier   Issue Number   Local Identifier
0                       2XXXX                  0              YYYYY


The OSCAR REST API was used to obtain a list of all the WIGOS ids available at the moment. From this information only the surface observations 0-20000-0-YYYYY were used (the filter in the program accepts any issuer of the form 20XXX).
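A minimal sketch of this filtering step, assuming the OSCAR response is a list of station records shaped like the ones the program reads (a wigosStationIdentifiers list whose entries carry primary and wigosStationIdentifier). The sample records and the helper name surface_wigos_ids are hypothetical and only illustrate the idea.

```python
import re

# Same pattern as the program uses for surface stations: 0-20XXX-0-YYYYY
SURFACE_PATTERN = re.compile(r"0-20\d{3}-0-\d{5}")


def surface_wigos_ids(stations):
    """Return the primary WIGOS ids that match the surface pattern."""
    selected = []
    for station in stations:
        # Stations without any WIGOS id are skipped
        for entry in station.get("wigosStationIdentifiers", []):
            wigos_id = entry.get("wigosStationIdentifier", "")
            if entry.get("primary") and SURFACE_PATTERN.match(wigos_id):
                selected.append(wigos_id)
    return selected


# Hypothetical records, for illustration only
sample = [
    {"name": "A", "wigosStationIdentifiers": [
        {"primary": True, "wigosStationIdentifier": "0-20000-0-03772"}]},
    {"name": "B", "wigosStationIdentifiers": [
        {"primary": True, "wigosStationIdentifier": "0-826-0-12345"}]},
    {"name": "C"},  # no WIGOS id at all
]
print(surface_wigos_ids(sample))
```

Only station A survives the filter: station B has a different issuer and station C has no WIGOS id.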

The last part of the WIGOS id (the local identifier) matches the current BUFR message identifier (the concatenation of blockNumber and stationNumber) and is used to map the old stations to their WIGOS ids.
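The matching rule can be sketched in a few lines. The two helper names and the example station (block 3, station 772) are hypothetical; the formatting and the use of the last five characters follow what the program does.

```python
def bufr_ident(block_number, station_number):
    """Concatenate blockNumber (2 digits) and stationNumber (3 digits)."""
    return "{0:02d}{1:03d}".format(block_number, station_number)


def old_id_from_wigos(wigos_id):
    """Return the last five characters of the local identifier of a WIGOS id."""
    series, issuer, issue_number, local_id = wigos_id.split("-")
    return local_id[-5:]


# Hypothetical station: block 3, station 772
ident = bufr_ident(3, 772)
print(ident, old_id_from_wigos("0-20000-0-03772") == ident)
```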


2) Program description

'''
Created on 22 Oct 2019


# Copyright 2005-2018 ECMWF.
# This software is licensed under the terms of the Apache Licence Version 2.0
# which can be obtained at http://www.apache.org/licenses/LICENSE-2.0.
# In applying this licence, ECMWF does not waive the privileges and immunities
# granted to it by virtue of its status as an intergovernmental organisation
# nor does it submit to any jurisdiction
   
This is a test program to encode Wigos Synop
requires
   
1) ecCodes version 2.8 or above (available at https://confluence.ecmwf.int/display/ECC/Releases)
2) python2.7
   
To run the program
   
   ./wigosTemp.py  -i synop_multi_subset.bufr -o out_synop_multisubset.bufr  -m json -l wigos.log
      
Uses the BUFR edition 4 sample and adds the WIGOS identifier sequence 301150
REQUIRES masterTablesVersionNumber 28 or above (lower values are raised to 28)
   
Author : Roberto Ribas Garcia ECMWF 12/09/2019

'''
from eccodes import *
import argparse 
import json 
import re 
import pandas as pd 
import numpy as np 
import logging 
import requests 
import os 

def read_cmd_line():
    p=argparse.ArgumentParser()
    p.add_argument("-i","--input",help="input bufr file")
    p.add_argument("-o","--output",help="output bufr file with wigos")
    p.add_argument("-m","--mode",choices=["web","json"],help=" wigos source [ json file or web ]")
    p.add_argument("-l","--logfile",help="log file ")
    args=p.parse_args()
    return args 
    
def read_oscar_json(jsonFile):
    with open(jsonFile,"r") as f:
        jtext=json.load(f)
    return jtext 

def read_oscar_web(oscarURL="https://oscar.wmo.int/surface/rest/api/search/station?"):
    r=requests.get(oscarURL)
    jtext=json.loads(r.text)
    return jtext 

def parse_json_into_dataframe(jtext):
    '''
    parses the JSON from the file wigosJsonFile
    filters the stations by wigosStationIdentifiers key in the dictionaries
    '''
    
    wigosStations=[]
    nowigosStations=[]
    for d in jtext:
        if "wigosStationIdentifiers" in d.keys():
            wigosStations.append(d)
        else:
            nowigosStations.append(d)
    
    '''
    uses only the wigos 0-20XXX-0-YYYYY (surface)
    '''
    p=re.compile(r"0-20\d{3}-0-\d{5}")  # raw string so that \d is not treated as a string escape

    fwigosStations=[]
    for d in wigosStations:
        wigosInfo=d["wigosStationIdentifiers"]
        for e in wigosInfo:
            if e["primary"]==True:
                wigosId=e["wigosStationIdentifier"]
                if p.match(wigosId):
                    wigosParts=wigosId.split("-")
                    d["wigosIdentifierSeries"]=wigosParts[0]
                    d["wigosIssuerOfIdentifier"]=wigosParts[1]
                    d["wigosIssueNumber"]=wigosParts[2]
                    d["wigosLocalIdentifierCharacter"]=wigosParts[3]
                    d["oldID"]=wigosParts[3][-5:]
                    fwigosStations.append(d)
                    
    df=pd.DataFrame(fwigosStations)
    df=df[["longitude","latitude","name","wigosStationIdentifiers","wigosIdentifierSeries","wigosIssuerOfIdentifier","wigosIssueNumber",
           "wigosLocalIdentifierCharacter","oldID"]]  
    return df

def get_ident(bid):
    '''
    gets the ident of the message by combining blockNumber and stationNumber keys from the input BUFR file
    the ident may be single valued or multivalued ( only single valued are considered further)
    '''
    ident=None 
    if ( codes_is_defined(bid, "blockNumber") and codes_is_defined(bid,"stationNumber") ):
        blockNumber=codes_get_array(bid,"blockNumber")
        stationNumber=codes_get_array(bid,"stationNumber")
        if len(blockNumber)==1 and len(stationNumber)==1:
            # index the one-element arrays explicitly instead of calling int() on the array
            ident="{0:02d}{1:03d}".format(int(blockNumber[0]),int(stationNumber[0]))
        elif len(blockNumber)==1 and len(stationNumber)!=1:
            blockNumber=np.repeat(blockNumber,len(stationNumber))
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber) 
                   if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG] 
        elif len(blockNumber)!=1 and len(stationNumber)!=1:
            ident=[str("{0:02d}{1:03d}".format(b,s)) for b,s in zip(blockNumber,stationNumber) 
                   if b!=CODES_MISSING_LONG and s!=CODES_MISSING_LONG]
            
    return ident 

def add_wigos_info(ident,bid,wdf,obid):
    '''
    add the wigos information to the message ident pointed by bid
    the wdf is the whole wigos dataframe and obid is the output bid
    '''
    
    
    if codes_is_defined(bid, "shortDelayedDescriptorReplicationFactor"):
        shortDelayed=codes_get_array(bid,"shortDelayedDescriptorReplicationFactor")
    else:
        shortDelayed=None 

    if codes_is_defined(bid, "delayedDescriptorReplicationFactor"):
        delayedDesc=codes_get_array(bid,"delayedDescriptorReplicationFactor")
    else:
        delayedDesc=None 
       
       
    nsubsets=codes_get(bid,"numberOfSubsets")
    compressed=codes_get(bid,"compressedData")
    
    masterTablesVersionNumber=codes_get(bid,"masterTablesVersionNumber")
    if masterTablesVersionNumber<28:
        masterTablesVersionNumber=28
        
    unexpandedDescriptors=codes_get_array(bid,"unexpandedDescriptors")
    outUD=list(unexpandedDescriptors)
    outUD.insert(0,301150)
        
    '''
    only treat the uncompressed messages with 1 subset 
    for future add treatment of compressed messages with more than 1 subset
    '''
    
    if compressed==0 and nsubsets==1:
        if shortDelayed is not None:
            codes_set_array(obid,"inputShortDelayedDescriptorReplicationFactor",shortDelayed)
        if delayedDesc is not None:
            codes_set_array(obid,"inputDelayedDescriptorReplicationFactor",delayedDesc)
        codes_set(obid,"masterTablesVersionNumber",masterTablesVersionNumber)
        codes_set(obid,"numberOfSubsets",nsubsets)
        odf=wdf.query("oldID=='{0}'".format(ident))
        if not odf.empty:
            codes_set_array(obid, "unexpandedDescriptors",outUD)
            # take the first matching row for every WIGOS part
            wis=odf["wigosIdentifierSeries"].values[0]
            codes_set(obid,"wigosIdentifierSeries",int(wis))
            wid=odf["wigosIssuerOfIdentifier"].values[0]
            codes_set(obid,"wigosIssuerOfIdentifier",int(wid))
            win=odf["wigosIssueNumber"].values[0]
            codes_set(obid,"wigosIssueNumber",int(win))
            wlid=odf["wigosLocalIdentifierCharacter"].values[0]
            wlid="{0:5}".format(wlid)
            logging.info(" wlid here {0}".format(wlid))
            codes_set(obid,"wigosLocalIdentifierCharacter",str(wlid))
            codes_bufr_copy_data(bid,obid)
        else:
            logging.info(" no wigos id found for ident {0}".format(ident))
    else:
        logging.info(" skipping compressed  message id {0} with {1} subsets ".format(ident,nsubsets))
    
    return obid
    
     

def main():
    args=read_cmd_line()
    logfile=args.logfile 
    logging.basicConfig(filename=logfile,level=logging.INFO,filemode="w")
    
    infile=args.input 
    
    outfile=args.output 
   
    mode=args.mode 
    if mode=="web":
        jtext=read_oscar_web()
        cdirectory=os.getcwd()
        oscarFile=os.path.join(cdirectory,"oscar.json")
        with open(oscarFile,"w") as f:
            json.dump(jtext,f)
    else:
        cdirectory=os.getcwd()
        oscarFile=os.path.join(cdirectory,"oscar.json")
        with open(oscarFile,"r") as f:
            jtext=json.load(f)
           
       
        
    wigosDf=parse_json_into_dataframe(jtext)
    
    f=open(infile,"rb")
    nmsg=codes_count_in_file(f)
    fout=open(outfile,"wb")
    for i in range(0,nmsg):
        obid=codes_bufr_new_from_samples("BUFR4")
        bid=codes_bufr_new_from_file(f)
        codes_set(bid,"unpack",1)
        ident=get_ident(bid)
        if ident:
            logging.info (" \t message {0} ident {1} ".format(i+1,ident))
            add_wigos_info(ident,bid, wigosDf, obid)
            codes_write(obid,fout)
    
        else:
            logging.info ("message {0} rejected ".format(i+1))
        codes_release(obid)        
        codes_release(bid)
    f.close()
    fout.close()  # close the output file so that all messages are flushed to disk
   
    print (" finished")


if __name__ == '__main__':
    main()

The program accepts the following arguments:

-i  input BUFR file containing SYNOP messages without WIGOS ids

-o  output BUFR file that will contain the SYNOP messages with WIGOS ids

-m  mode: web makes the program connect to the OSCAR server and save the result as oscar.json in the current directory; json makes the program read that oscar.json file instead. The json mode was added to speed up development by avoiding reloading the OSCAR data from the web.

-l  log file to which the progress of the conversion is written

The program flow is as follows:

1) read the command line arguments

2) read the OSCAR information from the web or from the JSON file and store it in a pandas DataFrame that serves as the mapping

3) open the input BUFR file

4) for each message, find the message identifier (the concatenation of blockNumber and stationNumber), then call the function add_wigos_info with this identifier, the wigosDf DataFrame (the mapping) and the input and output BUFR handles bid and obid

4.a) for each message, the add_wigos_info function has to

       check whether the delayed descriptor replication factors are present and, if so, set them in the output message

       look up the message identifier ident in the wigosDf DataFrame and, if it is found, add the WIGOS information retrieved from wigosDf

       copy the rest of the data to the output message
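The lookup in step 4.a can be illustrated without ecCodes or pandas. The sketch below uses a plain dictionary in place of the wigosDf DataFrame and returns the key/value pairs that the program passes to codes_set. The helper names build_mapping and wigos_keys_for and the sample id are hypothetical; the BUFR key names are the ones used in the program.

```python
def build_mapping(wigos_ids):
    """Map the old five-digit ident to the four parts of the WIGOS id."""
    mapping = {}
    for wigos_id in wigos_ids:
        series, issuer, issue_number, local_id = wigos_id.split("-")
        mapping[local_id[-5:]] = (int(series), int(issuer), int(issue_number), local_id)
    return mapping


def wigos_keys_for(ident, mapping):
    """Return the BUFR keys to set for ident, or None when there is no match."""
    if ident not in mapping:
        return None
    series, issuer, issue_number, local_id = mapping[ident]
    return {
        "wigosIdentifierSeries": series,
        "wigosIssuerOfIdentifier": issuer,
        "wigosIssueNumber": issue_number,
        # padded to at least five characters, as in the program
        "wigosLocalIdentifierCharacter": "{0:5}".format(local_id),
    }


mapping = build_mapping(["0-20000-0-03772"])  # hypothetical id
print(wigos_keys_for("03772", mapping))
print(wigos_keys_for("99999", mapping))
```

A None result corresponds to the branch in which the program only logs that no WIGOS id was found for the message.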


At this point some caveats are needed:

  • Only uncompressed messages (compressedData=0) with a single subset (numberOfSubsets=1) are considered.
  • The OSCAR information retrieved from the web server has to be cleaned for this program to work: only stations with a primary WIGOS id matching the surface pattern are kept.
  • The masterTablesVersionNumber must be 28 or above, otherwise no WIGOS ids can be added; the program raises lower values to 28.
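The first and last caveats can be expressed as two small checks. This is only a sketch of the conditions the program applies; the function names and the constant name are hypothetical.

```python
MIN_TABLES_VERSION = 28  # lowest masterTablesVersionNumber the program writes


def is_supported(compressed, number_of_subsets):
    """Only uncompressed messages with exactly one subset are converted."""
    return compressed == 0 and number_of_subsets == 1


def output_tables_version(input_version):
    """Raise older table versions to the minimum needed for WIGOS ids."""
    return max(input_version, MIN_TABLES_VERSION)


print(is_supported(0, 1), is_supported(1, 10))
print(output_tables_version(14), output_tables_version(31))
```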