
csv-calc

csv-calc is an application that calculates statistics (such as mean, median, size and standard deviation) on multiple fields of an input file. Input records can be grouped by id, block, or both.

One drawback of csv-calc is that it only outputs the statistics for each id and block. The input records themselves are not preserved. This means that you cannot use csv-calc in a pipeline whose later stages still need the input records.

csv-calc --append

The --append option to csv-calc passes through the input stream, adding to every record the relevant statistics for its id and block.

For example:

> echo -e "1,0\n2,0\n3,1\n4,1" | csv-calc mean --fields=a,id
Output (mean, id):
1.5,0
3.5,1
 
> echo -e "1,0\n2,0\n3,1\n4,1" | csv-calc mean --fields=a,id --append
Output (a, id, mean):
1,0,1.5
2,0,1.5
3,1,3.5
4,1,3.5
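
The pass-through behaviour can be illustrated with a small Python sketch. It is not csv-calc's implementation; it only mimics what the example above shows for the mean operation grouped by id, and the function name append_mean is made up for this illustration.

```python
from collections import defaultdict


def append_mean(lines):
    """Return each 'a,id' record with the mean of a for its id appended.

    Illustration only: mimics the --append example, not csv-calc itself.
    """
    records = [line.split(",") for line in lines]

    # First pass: accumulate the sum and count of a for every id.
    totals = defaultdict(lambda: [0.0, 0])
    for a, record_id in records:
        totals[record_id][0] += float(a)
        totals[record_id][1] += 1

    # Second pass: emit every input record unchanged, followed by its group's mean.
    return [
        f"{a},{record_id},{totals[record_id][0] / totals[record_id][1]:g}"
        for a, record_id in records
    ]


print("\n".join(append_mean(["1,0", "2,0", "3,1", "4,1"])))
```

The sketch holds all records in memory because a group's mean is only known once every record of that group has been read.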

Keeping track of fields and formats

Another challenge for csv-calc users is the large number of fields that it generates (it applies every operation to every indicated field).

There are now --output-fields and --output-format options to show what kind of output a given csv-calc command will produce.

Examples:

> csv-calc mean,diameter --fields=t,a,id,block --binary=t,d,ui,ui --output-fields
t/mean,a/mean,t/diameter,a/diameter,id,block
 
> csv-calc mean,diameter --fields=t,a,id,block --binary=t,d,ui,ui --output-format
t,d,d,d,ui,ui

With --append, these fields are appended to the input fields, so id and block are not repeated:

> csv-calc mean,diameter --fields=t,a,id,block --binary=t,d,ui,ui --output-fields --append
t/mean,a/mean,t/diameter,a/diameter

> csv-calc mean,diameter --fields=t,a,id,block --binary=t,d,ui,ui --output-format --append
t,d,d,d
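
The naming pattern in the --output-fields examples above can be reproduced with a short Python sketch. It is inferred only from those examples (one field/operation name per value field, ordered by operation, with id and block listed last unless --append is given); the function output_fields is hypothetical, and operations not shown here may be named differently by csv-calc.

```python
def output_fields(operations, fields, append=False):
    """Build output field names following the pattern seen in the examples.

    Assumption: every operation yields exactly one output per value field.
    """
    keys = ("id", "block")
    # Value fields are all input fields that are not grouping keys.
    values = [field for field in fields if field not in keys]

    # Operation-major order: all fields for the first operation, then the next.
    names = [f"{field}/{op}" for op in operations for field in values]

    # Without --append the grouping keys follow the statistics;
    # with --append they are already present in the input record.
    if not append:
        names += [field for field in fields if field in keys]
    return ",".join(names)


print(output_fields(["mean", "diameter"], ["t", "a", "id", "block"]))
print(output_fields(["mean", "diameter"], ["t", "a", "id", "block"], append=True))
```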