Parasolascii: A simple parser for the POLDER/PARASOL format

25 July 2012

Language/Format: C
Application type(s): Data Conversion
Related project(s): PARASOL, POLDER

PARASOLASCII

Section: User Commands (1)

NAME

parasolascii – a data dumper of the POLDER/PARASOL format into a text stream

parasolascii displays the data from PARASOL binary files as plain text. You specify the datasets you want displayed, in the order you wish, and they are extracted into tab-separated columns in that same order. You are then free to redirect the data into the tool of your choice (a plotter, a data filter, or a hand-made program). Calls to parasolascii are, of course, scriptable.

DOWNLOAD

parasolascii-1.13.2.tar.gz

https://www.icare.univ-lille.fr/parasol/ (for details on the PARASOL mission and data)

REQUIREMENTS

any Unix or Unix-like system

INSTALLATION

Run the install.sh script and follow the instructions. Most of the time, choice 2 (installing the tool in your personal directory) is sufficient. Your $HOME/bin directory must be in your PATH to run the command from anywhere.

SYNOPSIS

parasolascii [OPTIONS] PARASOL_FILE [dataset1 dataset2 …]

DESCRIPTION

extracts datasets from PARASOL_FILE and displays them in readable form on the standard output (screen or file).

PARASOL_FILE must be a PARASOL DATA FILE (with a D at the end of the name) or a PARASOL LEADER FILE (with an L at the end of the name). The data file contains the actual data, and the leader file contains metadata. Whichever file you give as argument, both of them must be available in the same directory.
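Since the data file and the leader file differ only by the final letter of their names, a wrapper script can check that both are present before calling parasolascii. The following is a minimal sketch in Python and is not part of the tool. The helper names companion_file and pair_is_complete are hypothetical, and the sketch assumes that the final letter is the only difference between the two names.

```python
from pathlib import Path


def companion_file(path):
    """Return the leader file for a data file, or the data file for a leader file."""
    p = Path(path)
    last = p.name[-1:]
    if last == "D":
        return p.with_name(p.name[:-1] + "L")
    if last == "L":
        return p.with_name(p.name[:-1] + "D")
    raise ValueError(f"{p.name!r} does not end with D or L")


def pair_is_complete(path):
    """True when both the given file and its companion exist on disk."""
    p = Path(path)
    return p.is_file() and companion_file(p).is_file()


print(companion_file("2007_04_09/P3L2TOGC054111KD").name)
```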

Example: Let’s assume that your PARASOL data are stored under directories such as PARASOL/YYYY_MM_DD.

To extract the aerosol optical thickness at 865 nm (total and fine mode) and the Angstrom exponent between 865 and 670 nm from the file P3L2TOGC054111KD of 9 April 2007 (a LEVEL-2 aerosol product over ocean), simply type the command below (the names and meanings of all the parameters are listed when you give no parameter as input):

parasolascii -c -g -u -999 /PATH/TO/PARASOL/2007_04_09/P3L2TOGC054111KD tau_a865 tau_ap865 a

The option -c will display the pixel coordinates in the POLDER grid (line then column) in the first two columns of output.

The option -g will display the corresponding geographic coordinates (latitude then longitude) in the next two columns of output.

Up to version 1.9.29, the option -u -999 was very important, since without it the lines containing missing values were discarded by default. From version 1.10.1 onward, missing data are displayed by default as -999. (this can be changed by giving -u another value, and the former behaviour can be restored with -u discard).
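The column layout described above (row, column, latitude, longitude, then one column per requested dataset) can be read back by a hand-made program along these lines. This is only a sketch in Python: the function name parse_output is hypothetical, the sample line is invented for illustration and is not real PARASOL data, and fields are split on any whitespace so that padded values are tolerated.

```python
def parse_output(text, names, missing="-999"):
    """Parse parasolascii output produced with -c -g into a list of dicts.

    Columns are assumed to be: row, column, latitude, longitude, then one
    value per requested dataset, in the order given on the command line.
    """
    missing_value = float(missing)
    records = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4 + len(names):
            continue  # skip blank or unexpected lines
        record = {
            "row": int(fields[0]),
            "col": int(fields[1]),
            "lat": float(fields[2]),
            "lon": float(fields[3]),
        }
        for name, raw in zip(names, fields[4:]):
            value = float(raw)
            # Values equal to the missing marker are mapped to None.
            record[name] = None if value == missing_value else value
        records.append(record)
    return records


# Hypothetical sample line, made up for illustration (not real PARASOL data).
sample = "1520\t3301\t12.345\t-16.910\t0.153\t0.041\t-999\n"
rows = parse_output(sample, ["tau_a865", "tau_ap865", "a"])
print(rows[0]["tau_a865"], rows[0]["a"])
```

If another value is given to -u, the same value should be passed as the missing argument.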

From version 1.10.1 onward, wildcards are supported for directional parameters:

parasolascii -c -g P3L1TBG1054111KD rad490P_dir*

is a shorter and less tedious way to type:

parasolascii -c -g P3L1TBG1054111KD rad490P_dir01 rad490P_dir02 … rad490P_dir16

NB. Wildcard arguments may clash with some shells (especially csh and tcsh). A workaround in that case is to quote them like this:

parasolascii -c -g P3L1TBG1054111KD 'rad490P_dir*'

(see also the BUGS section)
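A script that prefers to avoid wildcards altogether can build the explicit list of directional names itself. The sketch below is in Python; the function name directional_names is hypothetical, and the count of 16 directions and the two-digit numbering are taken from the rad490P_dir example above, so they may not hold for every dataset.

```python
def directional_names(prefix, count=16):
    """Expand a directional dataset prefix into explicit names (01..count)."""
    return [f"{prefix}{i:02d}" for i in range(1, count + 1)]


names = directional_names("rad490P_dir")
print(names[0], names[-1], len(names))
```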

Other common usages:

parasolascii

Called without arguments, the tool displays its inline help (which may be more up-to-date than this page).

parasolascii /PATH/TO/PARASOL/2007_04_09/P3L2TOGC054111KD

displays the datasets available in the file (their names and meanings)

parasolascii -h P3L2TOGC

displays the datasets available in P3L2TOGC files (it is equivalent to the previous command, but no file is needed). See the inline help for the arguments accepted by -h.

parasolascii -d /PATH/TO/PARASOL/2007_04_09/P3L2TOGC054111KD

displays the UTC dates and times of the file:

2007-04-09 14:23:01 2007-04-09 14:41:31 2007-04-09 15:05:58

(date and time of the beginning of acquisition, of the passage over the equator, then of the end of acquisition)
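The three timestamps are easy to pick up in a script. The following Python sketch assumes the default date and time formats (no -T option) and whitespace-separated fields, as in the sample output above; the function name parse_file_dates is hypothetical.

```python
from datetime import datetime


def parse_file_dates(line):
    """Split the output of 'parasolascii -d FILE' into three datetime objects."""
    tokens = line.split()
    if len(tokens) != 6:
        raise ValueError("expected three 'date time' pairs")
    stamps = [f"{tokens[i]} {tokens[i + 1]}" for i in range(0, 6, 2)]
    return tuple(datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps)


start, equator, end = parse_file_dates(
    "2007-04-09 14:23:01 2007-04-09 14:41:31 2007-04-09 15:05:58"
)
# Duration of the acquisition, as a timedelta.
print(end - start)
```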

parasolascii -N /PATH/TO/PARASOL/2007_04_09/P3L2TOGC054111KD

-16.910

(node longitude of the file, that is to say, the longitude of the satellite when it crosses the equator, in degrees east of Greenwich)

OPTIONS

-V: displays the version, and nothing else
-c: displays the coordinates in the POLDER/PARASOL grid (row,column), in the fields (1,2)
-S: displays the coordinates in the POLDER/PARASOL grid (row, column) of the current pixel in the fields (1,2) and those of the superpixel (S for superpixel) it belongs to in the fields (3,4). Overrides the -c option if present.

Available for full resolution products only.

-C range: displays the coordinates and data only within the given zone

range = min_row[:min_column],max_row[:max_column] (for a zone)
= row:col (for exactly one pixel)

The -c option is automatically enabled by -C.

-g: displays the geographic coordinates (latitude, longitude), in the fields (1,2)

Longitudes are counted positively east of Greenwich and negatively west of it.

If the options -c and -g are used together, the geographic coordinates will be displayed in the fields (3,4).

-G range: displays geographic coordinates and data only within the given zone

range = min_lat[:min_lon],max_lat[:max_lon] (for a zone)
= lat:lon (for exactly one pixel)

The -g option is automatically enabled by -G.
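The range syntax shared by -C and -G can be checked in a wrapper script before the tool is called. The Python sketch below only illustrates the syntax as documented here; how parasolascii itself parses or validates a range is not known, and the function name parse_range is hypothetical. An omitted second bound is returned as None, without guessing what the tool substitutes for it.

```python
def parse_range(text, convert=int):
    """Parse 'min[:min2],max[:max2]' (zone) or 'a:b' (single pixel).

    Returns ((min1, min2), (max1, max2)); an omitted second bound is None.
    Use convert=int for -C (row, column) and convert=float for -G (lat, lon).
    """
    def parse_corner(corner):
        first, sep, second = corner.partition(":")
        return convert(first), (convert(second) if sep else None)

    if "," in text:
        low, high = text.split(",")
        return parse_corner(low), parse_corner(high)
    corner = parse_corner(text)
    if corner[1] is None:
        raise ValueError("a single pixel needs both coordinates")
    return corner, corner


print(parse_range("100:200,150:260"))
print(parse_range("-10.5,20", convert=float))
```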

-t: displays the acquisition time for each pixel

CAUTION: THIS OPTION IS (AND WILL STAY) EXPERIMENTAL; YOU SHOULD AVOID IT IF POSSIBLE. The time is not directly available in the products; it is currently estimated by linear interpolation on the rows between the first and last acquisition times. It is ONLY intended to be an estimate, since pixels contain data that are in fact directional and are not acquired at the same time. The estimation method may also be subject to change.

This can’t be helped unless times are made available in the products. The -t option is valid only for LEVEL1 and LEVEL2 products. Unless the -T option is used, the default time format is “%H:%M:%S”.
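The principle of a linear interpolation on the rows can be sketched as follows, in Python. This is not the code used by parasolascii: which rows are associated with the first and last acquisition times, and whether time increases or decreases with the row number, are assumptions here, and the row bounds in the example are invented.

```python
from datetime import datetime


def estimate_pixel_time(row, first_row, last_row, t_start, t_end):
    """Linearly interpolate an acquisition time from the row number."""
    if last_row == first_row:
        return t_start
    fraction = (row - first_row) / (last_row - first_row)
    return t_start + fraction * (t_end - t_start)


t0 = datetime(2007, 4, 9, 14, 23, 1)
t1 = datetime(2007, 4, 9, 15, 5, 58)
# Row bounds below are hypothetical, chosen only to make the arithmetic easy.
print(estimate_pixel_time(500, 0, 1000, t0, t1).strftime("%H:%M:%S"))
```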

-d: displays the acquisition date for each pixel

Unless the -T option is used, the default date format is “%F”. With no parameter, the date and time of the start of acquisition, the reference date and time (date and time of the passage over the equator for LEVEL1 and LEVEL2) and the date and time of the end of acquisition are displayed in this order (these are UTC values).
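To preview what a format string produces before passing it to -T, the same strftime-style specifiers can be tried in Python, as in the sketch below. The combined format on the last line is only an example of what one might try; whether parasolascii accepts every specifier of the date command is not verified here.

```python
from datetime import datetime

t = datetime(2007, 4, 9, 14, 41, 31)

# "%H:%M:%S" is the documented default for -t.
print(t.strftime("%H:%M:%S"))
# "%Y-%m-%d" is the portable spelling of "%F", the documented default for -d.
print(t.strftime("%Y-%m-%d"))
# A combined date-and-time format of the kind that could be tried with -T.
print(t.strftime("%Y-%m-%dT%H:%M:%SZ"))
```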

-X: swaps (X for eXchanges) rows and columns and latitudes and longitudes in the output

by default, parasolascii displays rows before columns and latitudes before longitudes; some tools expect the reverse. This is what the -X option is for.

-N: displays the node longitude (between -180 and +180), disabled for LEVEL3 products
-T time_fmt: sets the format used to display times and/or dates
(same syntax as the Unix date command)
-b: displays raw integer values (not scaled) from the POLDER/PARASOL data file
-F fmt: output format for the parameter values (in printf-style), default is “%-9g”
-F g:fmt: output format for the geographic coordinates (in printf-style);

The default is “% .3f % .3f ”. Be aware that there must be two format specifiers, one for latitude and one for longitude.
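The effect of the two default formats can be previewed with Python's % operator, which follows the printf conventions for these conversions. This is a sketch only: the tool is written in C and may differ in edge cases, and the values are invented.

```python
value_fmt = "%-9g"          # documented default for -F
geo_fmt = "% .3f % .3f "    # documented default for -F g:

# repr() makes the padding and the trailing space visible.
print(repr(value_fmt % 0.153))
print(repr(geo_fmt % (12.345, -16.91)))
```

The value is left-justified in a field of 9 characters, and the space flag in “% .3f” reserves room for a sign so that positive and negative coordinates line up.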

-m value: missing values are displayed as value
-r value: out of range values are displayed as value
-u value: undefined or unknown values (i.e. missing or out of range) are displayed as value

This is the same as using -m and -r together with the same value. By default, missing values are displayed as -999.; the special value ‘discard’ (without quotes) discards lines with missing values.

-h P3L1TBG1|P3L2TRGB|…: help on parameters (list of parameters for the specified thematic)

(see the inline help for the PARASOL naming scheme)

-B 1-16: dqx bit to print

By default, all bits in dqx are printed, with the most significant bit on the left. LSB=1, MSB=16.
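The bit numbering can be illustrated with a short Python sketch. It assumes a 16-bit dqx word, as implied by MSB=16; the function names and the sample value are hypothetical.

```python
def dqx_binary(dqx, width=16):
    """Render dqx as a binary string, most significant bit on the left."""
    return format(dqx, f"0{width}b")


def dqx_bit(dqx, n):
    """Return bit n of dqx, numbered as in -B (1 = LSB, 16 = MSB)."""
    if not 1 <= n <= 16:
        raise ValueError("bit number must be between 1 and 16")
    return (dqx >> (n - 1)) & 1


dqx = 0b1000000000000101  # hypothetical quality word
print(dqx_binary(dqx), dqx, dqx_bit(dqx, 1), dqx_bit(dqx, 2), dqx_bit(dqx, 16))
```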

-x: displays dqx in decimal form, instead of binary form
-D: debug mode

prints the slope, offset and other debugging information for each parameter
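The -b and -D options suggest that the stored integers are converted to physical values through a slope and an offset. The exact formula used by parasolascii is not given on this page; the Python sketch below assumes the common linear convention physical = slope * raw + offset, and its numbers are invented.

```python
def scale(raw, slope, offset):
    """Convert a raw integer count to a physical value (assumed linear form)."""
    return slope * raw + offset


# Hypothetical slope and offset, not taken from any PARASOL parameter file.
print(scale(200, 0.5, -10.0))
```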

-P: displays the path to the parameter files (defined at the installation of the software)
-p param_descriptor_file: reads the POLDER/PARASOL file using a given parameter file. Intended mainly for debugging purposes.

The usage of this option is discouraged unless you do know what you’re doing, and won’t be documented further.

COPYRIGHT

This software is governed by the CeCILL license under French law and abiding by the rules of distribution of free software. You can use, modify and/or redistribute the software under the terms of the CeCILL license as circulated by CEA, CNRS and INRIA at the following URL: cecill.info.

The tool is free of charge and provided with NO GUARANTEE for any purpose.

BUGS

The names of many datasets are ugly and not always consistent; they were inherited from the in-house naming schemes of the different teams developing the products. They will be maintained for backward compatibility, but may be made obsolete (not deleted!) in a future version in favour of better chosen and more consistent names (e.g. aot865_fine for the aerosol optical thickness of the fine mode at 865 nm, instead of tau_ap865, which is meaningful only to French users).

The option -T affects both the dates (option -d) and the times (option -t).

A major design flaw of the PARASOL data files is that they are not self-describing. For that reason, if one of the PARASOL products changes format, parasolascii may become unable to read it correctly (this would be true of any other tool trying to read the PARASOL data). This is unlikely to happen for LEVEL1 products (which are stable) but can still happen for LEVEL2 files. In that case the only solution is to contact the author and beg him for an update (I do not refuse fees!), or to update the code yourself (you have the source code).

Up to version 1.9.29, output lines containing missing data were discarded by default. The option -u allowed the user to display them with the value of one’s choice, but that default was an unfortunate choice, since it went against the principle of least surprise. From version 1.10.1 onward, the default behaviour is to display missing values as -999. (this can still be changed with -u), except when all the data on the line are missing (in that case the line is discarded). This seems a good compromise.
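The two behaviours can be compared in a few lines of Python. This is a sketch of the rules as described in this section, not the tool's code; the function name apply_missing_policy is hypothetical, missing values are represented by None, and the sample rows are invented.

```python
def apply_missing_policy(rows, policy="default"):
    """Filter rows of values, where None marks a missing value.

    'default' drops a row only when every value is missing (1.10.1 and later);
    'discard' drops a row as soon as one value is missing (like -u discard).
    """
    if policy == "default":
        return [r for r in rows if any(v is not None for v in r)]
    if policy == "discard":
        return [r for r in rows if all(v is not None for v in r)]
    raise ValueError(f"unknown policy: {policy!r}")


rows = [[0.15, 0.04], [0.20, None], [None, None]]
print(len(apply_missing_policy(rows)), len(apply_missing_policy(rows, "discard")))
```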

Wildcards have been supported since version 1.10.1. They work well with Bourne- and Korn-derived shells, provided that no file in the directory where the command is called matches the wildcarded parameters. csh and tcsh look for files matching wildcarded arguments and will complain if they don’t find any. A workaround is to quote the wildcarded arguments (this should work with any shell). Another solution with csh and tcsh is to use the command set noglob to prevent the shell from expanding the arguments unexpectedly (do this in a shell script only, and only if you don’t need file expansion in your script).
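When the command line is generated by a program, the quoting can be delegated to a library. The Python sketch below uses shlex.quote from the standard library, which targets POSIX shells; the file and dataset names are those of the example in the DESCRIPTION section.

```python
import shlex

args = ["parasolascii", "-c", "-g", "P3L1TBG1054111KD", "rad490P_dir*"]

# Quote every argument so that a shell passes the wildcard through untouched.
command = " ".join(shlex.quote(a) for a in args)
print(command)
```

Alternatively, passing the argument list directly to subprocess.run (without a shell) avoids shell expansion altogether.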

Please report any bug or issue you find to fabrice.ducos@univ-lille.fr

Author(s): Fabrice Ducos (ICARE – LOA)