The odb import tool can import data in the "wide" format (option "-f wide" of the sql tool):
$ ./odb sql select \* -i 2000010106.1.0.odb -f wide -o 2000010106.1.0.csv
$ head -n 1 2000010106.1.0.csv
expver@desc:string andate@desc:integer antime@desc:integer seqno@hdr:integer obstype@hdr:integer obschar@hdr:Bitfield[codetype:9;instype:10;retrtype:6;geoarea:6] subtype@hdr:integer date@hdr:integer time@hdr:integer rdbflag@hdr:Bitfield[lat_humon:1;lat_qcsub:1;lat_override:1;lat_flag:2;lat_hqc_flag:1;lon_humon:1;lon_qcsub:1;lon_override:1;lon_flag:2;lon_hqc_flag:1;date_humon:1;date_qcsub:1;date_override:1;date_flag:2;date_hqc_flag:1;time_humon:1;time_qcsub:1;time_override:1;time_flag:2;time_hqc_flag:1;stalt_humon:1;stalt_qcsub:1;stalt_override:1;stalt_flag:2;stalt_hqc_flag:1] status@hdr:Bitfield[active:1;passive:1;rejected:1;blacklisted:1;monthly:1;constant:1;experimental:1;whitelist:1] ...
The header of the text format is a list of column descriptions, each of the form <column-name>:<type>
The type can be:
- REAL
- DOUBLE
- INTEGER
- STRING
- BITFIELD
In the last case, BITFIELD, the list of fields and their sizes in bits follows, in square brackets, for example:
rdbflag@hdr:Bitfield[lat_humon:1;lat_qcsub:1;lat_override:1;lat_flag:2;lat_hqc_flag:1;lon_humon:1;lon_qcsub:1;lon_override:1;lon_flag:2;lon_hqc_flag:1;date_humon:1;date_qcsub:1;date_override:1;date_flag:2;date_hqc_flag:1;time_humon:1;time_qcsub:1;time_override:1;time_flag:2;time_hqc_flag:1;stalt_humon:1;stalt_qcsub:1;stalt_override:1;stalt_flag:2;stalt_hqc_flag:1]
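The column description format above can be illustrated with a small parser. This is only a sketch in Python, not part of the odb tool: the function name parse_column is hypothetical, and converting the type name to upper case is an assumption made here so that "Bitfield" in the header can be compared with BITFIELD in the list above.

```python
import re


def parse_column(description):
    """Split '<column-name>:<type>' into (name, type_name, bitfield_members).

    bitfield_members is a list of (field_name, size_in_bits) tuples and is
    empty for every type except BITFIELD.
    """
    # The column name ends at the first colon; the rest is the type.
    name, _, type_spec = description.partition(":")
    match = re.fullmatch(r"(\w+)(?:\[(.*)\])?", type_spec)
    if not name or match is None:
        raise ValueError(f"not a column description: {description!r}")

    # Assumption: upper-case the type so it matches the list of types above.
    type_name = match.group(1).upper()

    members = []
    if match.group(2):
        # Bitfield members look like 'lat_flag:2', separated by ';'.
        for member in match.group(2).split(";"):
            member_name, _, size = member.partition(":")
            members.append((member_name, int(size)))
    return name, type_name, members


if __name__ == "__main__":
    column = "obschar@hdr:Bitfield[codetype:9;instype:10;retrtype:6;geoarea:6]"
    name, type_name, members = parse_column(column)
    print(name, type_name, members)
    print("total bits:", sum(size for _, size in members))
```

Summing the member sizes gives the number of bits the bitfield uses, which can be handy when checking a hand-written header.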
CSV text data (TAB-delimited, like the file produced by the odb sql tool in the example above) can therefore be imported into ODB as follows:
$ ./odb import -d TAB 2000010106.1.0.csv 2000010106.1.0.imported.odb
The delimiter can be changed with the option -d; the default is ','.
Regarding the data in the CSV file, note the current limitation that string values can be at most 8 characters long.
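Since strings are limited to 8 characters, it may be worth checking a file before importing it. The following Python sketch is not part of the odb tool; the function name find_long_strings is hypothetical, and it assumes that string values appear unquoted in the file and that the type in the header may be written in any letter case.

```python
import csv
import io

MAX_STRING_LENGTH = 8  # the limit stated in the text above


def find_long_strings(text, delimiter=","):
    """Return (row_number, column_description, value) for each string value
    longer than MAX_STRING_LENGTH. Data rows are numbered from 1."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)

    # Indexes of the columns whose declared type is STRING.
    string_columns = [
        index
        for index, description in enumerate(header)
        if description.partition(":")[2].upper() == "STRING"
    ]

    problems = []
    for row_number, row in enumerate(reader, start=1):
        for index in string_columns:
            # Skip short rows instead of failing on them.
            if index < len(row) and len(row[index]) > MAX_STRING_LENGTH:
                problems.append((row_number, header[index], row[index]))
    return problems


if __name__ == "__main__":
    sample = (
        "expver@desc:string\tandate@desc:integer\n"
        "0001\t20000101\n"
        "toolongvalue\t20000101\n"
    )
    for problem in find_long_strings(sample, delimiter="\t"):
        print(problem)
```

For a real file, the text can be read with open(path).read() and passed in with the same delimiter that will be given to the -d option.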
Converting from other binary formats, such as netCDF, to ODB via an intermediate ASCII file should be avoided because of the loss of precision (unless the data is printed with full precision).
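The precision issue can be shown with a short Python sketch. It is independent of the odb tool and assumes the values are IEEE 754 double-precision numbers, for which 17 significant digits are enough to recover the original value from text; the function name round_trips is hypothetical.

```python
def round_trips(value, significant_digits):
    """Return True if printing value with the given number of significant
    digits and reading the text back yields exactly the same double."""
    text = f"{value:.{significant_digits}g}"
    return float(text) == value


if __name__ == "__main__":
    value = 0.1 + 0.2  # stored as 0.30000000000000004
    for digits in (6, 17):
        print(digits, f"{value:.{digits}g}", round_trips(value, digits))
```

With 6 significant digits the text reads 0.3, which is a different double than the original, whereas 17 significant digits reproduce the value exactly.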