
The math-array utility in snark is a trivial wrapper for a range of numpy array operations. The main purpose of math-array is to make it easy to run array operations on streams of data compatible with the csv-style utilities in comma and snark.

math-array does not attempt to substitute numpy functionality. If you need something customised, just write your own python code as usual.

Currently, it exposes three operations:

  • split
  • transpose
  • (relatively) arbitrary numpy array operation



> ( echo some_other_stuff,0,1,2,3,4,5; echo more_other_stuff,6,7,8,9,10,11 ) | csv-to-bin s[32],6f | math-array split --shape 3,2 --header-size 32 | csv-from-bin s[32],2f


> # transpose
> ( echo 0,1,2,3,4,5; echo 6,7,8,9,10,11 ) | csv-to-bin 6f | math-array transpose --to-axes 1,0 --shape 3,2 | csv-from-bin 6f
> # the record has not only the array, but also other fields
> ( echo some_other_stuff,0,1,2,3,4,5; echo more_other_stuff,6,7,8,9,10,11 ) | csv-to-bin s[32],6f | math-array transpose --to-axes 1,0 --shape 3,2 --header-size 32 | csv-from-bin s[32],6f

(Relatively) arbitrary numpy array operation

> # swapaxes
> ( echo some_other_stuff,0,1,2,3,4,5; echo more_other_stuff,6,7,8,9,10,11 ) | csv-to-bin s[32],6f | math-array "np.swapaxes, axis1 = 0, axis2 = 1" --shape 3,2 --header-size 32 | csv-from-bin s[32],6f

See math-array --help for more details.
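
To make the effect of the split and transpose examples above easier to follow, here is a minimal pure-Python sketch of what they compute on a single record. The function names are hypothetical, and the split behaviour (one output record per array row, with the header repeated) is an assumption inferred from the s[32],2f output format; math-array itself works on binary streams through numpy.

```python
# Pure-Python sketch of the split and transpose examples (assumed semantics).

def reshape(flat, rows, cols):
    """View a flat record as a rows x cols array (row-major, as numpy does)."""
    assert len(flat) == rows * cols
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]

def transpose_record(flat, rows, cols):
    """Swap the two axes and flatten again, as in --to-axes 1,0 --shape rows,cols."""
    array = reshape(flat, rows, cols)
    return [array[r][c] for c in range(cols) for r in range(rows)]

def split_record(header, flat, rows, cols):
    """Emit one output record per array row, repeating the header (assumption)."""
    return [[header] + row for row in reshape(flat, rows, cols)]

print(transpose_record([0, 1, 2, 3, 4, 5], 3, 2))  # [0, 2, 4, 1, 3, 5]
for record in split_record("some_other_stuff", [0, 1, 2, 3, 4, 5], 3, 2):
    print(record)
```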


Among other things, csv-paste can number the lines of its output. Now, if there are several instances of line-number among the command line parameters, each of them can take its own parameters. Examples:

> # append single line number
> seq 0 11 | csv-paste - line-number
> # number blocks of records
> seq 0 12 | csv-paste - line-number --size 3
> # create multiple indices (e.g. if you need to express multidimensional array indices)
> seq 0 11 | csv-paste - "line-number;size=4" "line-number;size=4;index"
> # reverse indices (e.g. to use with csv-blocks down your pipeline)
> seq 0 11 | csv-paste - "line-number;size=4" "line-number;size=4;index;reverse"

As with other comma utilities, all csv-paste operations work on ASCII or binary data. See csv-paste --help for more configuration possibilities.
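
The numbering options in the examples above can be sketched in Python as follows. This is only an illustration under assumed semantics (size gives the block number, index gives the position inside the block, reverse counts that position down); the function line_number is hypothetical, so check csv-paste --help for the actual behaviour.

```python
def line_number(n, size=1, index=False, reverse=False):
    """Value appended to record n (0-based) by one line-number instance."""
    if not index:
        return n // size                 # block number
    i = n % size                         # position inside the block
    return size - 1 - i if reverse else i

# mimics: seq 0 11 | csv-paste - "line-number;size=4" "line-number;size=4;index;reverse"
for n in range(12):
    print(n, line_number(n, size=4), line_number(n, size=4, index=True, reverse=True), sep=",")
```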

csv-thin thins down high bandwidth data by a given rate.

A new option, --period, allows you to specify the period of output, regardless of the rate of the input data (assuming that it's at least as fast as the desired output rate).

Using csv-paste for a high-rate input source you can try it with:

csv-paste line-number | csv-time-stamp | csv-thin --period 0.1

By default it uses wall-clock time for clocking the data. Alternatively, which is useful with pre-captured data, you can use a time field in the data:

csv-paste line-number | csv-time-stamp | head -200000 > data.csv
cat data.csv | csv-thin --period 0.1 --fields t
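
The idea of thinning by period on a time field can be sketched as below. This is a simplified illustration, not the csv-thin implementation: the function thin_by_period is hypothetical, and timestamps are taken as integer milliseconds to keep the arithmetic exact.

```python
def thin_by_period(records, period_ms):
    """Yield (t_ms, value) records at most once per period_ms.

    records is an iterable of (timestamp in milliseconds, value) in time order.
    """
    next_due = None
    for t, value in records:
        if next_due is None or t >= next_due:
            yield t, value
            next_due = t + period_ms

records = [(i * 10, i) for i in range(100)]              # 100 Hz input for one second
print([v for _, v in thin_by_period(records, 100)])      # one record every 0.1 s
```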

Multiple rectangular regions can be specified in roi operation of cv-calc (like the draw operation), so that:

  • everything outside these regions in the input images is set to zero, or
  • these regions are cropped out of input images into separate images (the arguments prefixed to input will be removed).

All images in the input stream must have the same number of regions. Any region with zero width or height (e.g. 0,0,0,0) will be ignored and, if needed, can be used as padding so that all images have the same number of regions.

If all the bounding boxes for an image have zero area, then the whole image will be set to zero.
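
The masking rules above can be illustrated with a small Python sketch. The function mask_outside is hypothetical, and the rectangle layout (min_x, min_y, max_x, max_y, with the max corner exclusive) is an assumption of this sketch, not something confirmed by cv-calc --help.

```python
def mask_outside(image, rectangles):
    """Zero every pixel that lies outside all of the given rectangles.

    image is a list of rows; each rectangle is (min_x, min_y, max_x, max_y)
    with the max corner exclusive (an assumption of this sketch).
    """
    # rectangles with zero width or height are ignored
    valid = [r for r in rectangles if r[2] > r[0] and r[3] > r[1]]

    def inside(x, y):
        return any(x0 <= x < x1 and y0 <= y < y1 for x0, y0, x1, y1 in valid)

    return [[v if inside(x, y) else 0 for x, v in enumerate(row)]
            for y, row in enumerate(image)]

image = [[9] * 6 for _ in range(4)]
for row in mask_outside(image, [(0, 0, 2, 2), (0, 0, 0, 0), (4, 1, 6, 3)]):
    print(row)
```

With only zero-area rectangles, no pixel is inside any region, so the whole image comes out as zeros, matching the rule stated above.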

> # mask in 2 rectangles
> cat images.bin \
  | csv-paste "value=250,675,488,903,596,604,784,900;binary=8ui" "-;binary=t,3ui,s[5013504]" \
  | cv-calc roi  --fields=rectangles,t,rows,cols,type --rectangles=2 --binary=8ui,t,3ui \
  | csv-bin-cut '8ui,t,3ui,s[5013504]' --fields 9-13 \
  > masked.bin

> # crop out 2 rectangles ( csv-bin-cut not needed in this case )
> cat images.bin \
  | csv-paste "value=250,675,488,903,596,604,784,900;binary=8ui" "-;binary=t,3ui,s[5013504]" \
  | cv-calc roi --crop --fields=rectangles,t,rows,cols,type --rectangles=2 --binary=8ui,t,3ui \
  > cropped.bin



If you are putting together a training dataset for classification or object detection, you may need to create a uniformly distributed random selection of image crops from your image data.

The following pipeline helps you to do it. It picks random images, cuts 4 random patches of size 300x200 from each of them, and saves them as png files in the current directory.

(Note: the index parameter in file=png,,index is required, because otherwise the patches cut out of the same image would all have the same filename.)

> cat your-image-data.bin | cv-calc thin --rate 0.01 | cv-calc random-crop --width 300 --height 200 --count 4 | cv-cat "file=png,index"

See cv-calc --help for more configuration options.
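
For intuition, uniformly distributed crop positions can be drawn as in the sketch below. It is only an illustration of the idea; the function random_crops and its handling of images smaller than the patch are assumptions, not the behaviour of cv-calc random-crop.

```python
import random

def random_crops(rows, cols, height, width, count, rng=random):
    """Return count top-left corners (x, y) of patches that fit inside the image."""
    if height > rows or width > cols:
        return []                        # the patch does not fit (assumed handling)
    return [(rng.randint(0, cols - width), rng.randint(0, rows - height))
            for _ in range(count)]

rng = random.Random(0)                   # fixed seed only to make the demo repeatable
print(random_crops(rows=1080, cols=1920, height=200, width=300, count=4, rng=rng))
```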


io-cat: now can wait

To recap: io-cat is a utility extending cat functionality towards merging live streams. io-cat semantics is the same as cat on files, but it can merge streams, too, e.g. merge three streams:

> cat some-file.csv | io-cat - tcp:localhost:12345 local:some/socket > merged.csv

It supports a couple of simple merge policies: first come first serve by default, or round robin: e.g. try:

> yes STDIN | io-cat -  <( yes ANOTHER-STREAM ) <( yes THIRD-STREAM ) --round-robin 1 | head
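
The round-robin policy in the command above can be sketched in Python as follows. The function round_robin is hypothetical and works on in-memory iterators rather than live streams, so it shows only the merge order, not the I/O handling of io-cat.

```python
from itertools import islice, repeat

def round_robin(streams, chunk=1):
    """Merge iterators by taking chunk records from each stream in turn."""
    streams = [iter(s) for s in streams]
    while streams:
        for s in list(streams):
            taken = list(islice(s, chunk))
            yield from taken
            if len(taken) < chunk:       # stream exhausted: stop reading from it
                streams.remove(s)

merged = round_robin([repeat("STDIN"), repeat("ANOTHER-STREAM"), repeat("THIRD-STREAM")])
print(list(islice(merged, 6)))
```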

Now, io-cat also can wait for publishing servers to start, using io-cat --connect-attempts option, e.g:

> io-cat tcp:localhost:8888 --connect-attempts unlimited -v
io-cat: stream 0 (tcp:localhost:8888): connecting, attempt 1 of unlimited...
io-cat: stream 0 (tcp:localhost:8888): failed to connect
io-cat: stream 0 (tcp:localhost:8888): connecting, attempt 2 of unlimited...
io-cat: stream 0 (tcp:localhost:8888): failed to connect

See io-cat --help for more configuration options.

Last but not least, broadly, the right approach to persistent clients would be using a publish/subscribe middleware, of your liking. ZeroMQ is a light-weight choice (and comma zero-cat supports a core subset of it). However, if you just want to quickly cobble together simple merging of multiple streams, potentially from heterogeneous sources, io-cat is there for you.


control-speed utility sets the speed of each waypoint in the path based on its position in a curve.

turn operation calculates the angle at each waypoint with respect to its adjacent waypoints and assigns the speed according to given maximum lateral acceleration. By passing --stop-on-sharp-turn or --pivot, control-speed can implement spot turn by outputting an extra waypoint with relative heading and no speed, for each sharp turn in the trajectory.

$ ( echo '0.0,0.0'; echo '0.3,0.3'; echo '0.6,0.6'; echo '0.6,0.9'; echo '0.6,1.2'; echo '0.9,1.2'; echo '1.2,1.2'; echo '1.5,0.9'; echo '1.8,0.6' ) > trajectory.csv

# moderate speed
$ control-speed turn --max-acceleration=0.5 --approach-speed=0.2 --fields=x,y --speed=1 < trajectory.csv > speed-turn.csv

# stop on sharp turns
$ control-speed turn --max-acceleration=0.5 --approach-speed=0.2 --fields=x,y --speed=1 --pivot < trajectory.csv > speed-pivot.csv

# visualise with trajectory as blue and speed as z axis in yellow
$ view-points "trajectory.csv;fields=x,y;shape=lines;title=trajectory" <( echo 0,0,begin )";fields=x,y,label;weight=8;color=red;title=origin" "speed-pivot.csv;fields=x,y,z;shape=lines;color=yellow;title=turn"
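
As a rough illustration of how a lateral acceleration limit translates into speed, the sketch below uses the relation a = v^2 / r with r taken as the radius of the circle through a waypoint and its two neighbours. This is an assumed stand-in: control-speed works from the angle at each waypoint, and its exact formula is not reproduced here. The function turn_speeds is hypothetical.

```python
import math

def turn_speeds(points, max_acceleration, max_speed):
    """Speed at each waypoint from the circle through it and its neighbours.

    Uses a_lateral = v^2 / r, hence v = sqrt(a_max * r), capped at max_speed.
    """
    speeds = [max_speed] * len(points)
    for i in range(1, len(points) - 1):
        a, b, c = points[i - 1], points[i], points[i + 1]
        ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
        # twice the area of triangle a, b, c
        area2 = abs((b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]))
        if area2 > 1e-12:                               # skip collinear waypoints
            radius = ab * bc * ca / (2 * area2)         # circumradius
            speeds[i] = min(max_speed, math.sqrt(max_acceleration * radius))
    return speeds

trajectory = [(0.0, 0.0), (0.3, 0.3), (0.6, 0.6), (0.6, 0.9), (0.6, 1.2),
              (0.9, 1.2), (1.2, 1.2), (1.5, 0.9), (1.8, 0.6)]
for point, speed in zip(trajectory, turn_speeds(trajectory, 0.5, 1.0)):
    print(point, round(speed, 3))
```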


control-speed decelerate operation moderates the sudden decrease in speed in the trajectory by a given deceleration.

$ control-speed decelerate --fields=x,y,speed --deceleration=0.5 < speed-pivot.csv > speed-decelerate.csv
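
The smoothing idea can be sketched as a backward pass over the waypoints: each speed is capped so that the speed at the next waypoint is reachable with the given deceleration. This is a minimal sketch of that idea under constant deceleration between waypoints; the function decelerate is hypothetical and not the control-speed code.

```python
import math

def decelerate(points, speeds, deceleration):
    """Cap each speed so that the next waypoint's speed is reachable.

    From v_next^2 = v^2 - 2 * a * d, the largest admissible v is
    sqrt(v_next^2 + 2 * a * d), where d is the distance to the next waypoint.
    """
    result = list(speeds)
    for i in range(len(points) - 2, -1, -1):
        d = math.dist(points[i], points[i + 1])
        limit = math.sqrt(result[i + 1] ** 2 + 2 * deceleration * d)
        result[i] = min(result[i], limit)
    return result

points = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(decelerate(points, [1.0, 1.0, 1.0, 0.0], deceleration=0.125))
```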

# visualise with speed as z-axis and orange color as the decelerated speed
$ view-points "trajectory.csv;fields=x,y;shape=lines;title=trajectory" <( echo 0,0,begin )";fields=x,y,label;weight=8;color=red;title=origin" \
    "speed-pivot.csv;fields=x,y,z;shape=lines;color=yellow;title=turn" "speed-decelerate.csv;fields=x,y,z;shape=lines;color=orange;title=decelerate"

If you need to quickly deploy a bunch of services for line-based or fixed-width data over TCP, local sockets, ZeroMQ, etc., now you can use io-topics, a utility in comma. You can deploy services that run continuously or that start only if there is at least one client (e.g. if they are too resource greedy).

Perhaps it is not a replacement for a more proper middleware like ROS or simply systemd, but the advantages of io-topics are its light weight, its ad-hoc nature, and its ability to run a mix of transport protocols.

Try the following toy example of io-topics publish:

> # run publisher with topics a and b, with b on demand
> io-topics publish --config <( echo "a/command=csv-paste line-number"; echo "a/port=8888"; echo "b/command=csv-paste line-number"; echo "b/port=9999"; echo "b/on_demand=1" )
io-topics: publish: will run 'comma_execute_and_wait --group' with commands:
io-topics: publish:    io-publish tcp:8888   -- csv-paste line-number
io-topics: publish:    io-publish tcp:9999  --on-demand -- csv-paste line-number
> # in a different shell, observe that topic a keeps running even if no-one is listening,
> # whereas topic b runs only if at least one client is connected:
> socat tcp:localhost:8888 - | head -n5 # the output will most likely not start from 0, since the service keeps running even if there are no clients connected
> socat tcp:localhost:9999 - | head -n5 # whenever the first client connects, will start from 0, since it runs only if at least one client is connected

You also can create - on the fly, if you want - a light-weight subscriber, as in example below. Run publishing as in the example above and then run io-topics cat:

> io-topics cat --config <( echo "a/command=head -n5 > a.csv"; echo "a/address=tcp:localhost:8888"; echo "b/command=head -n5 > b.csv"; echo "b/address=tcp:localhost:9999" )
io-topics: cat: will run 'comma_execute_and_wait --group' with commands:
io-topics: cat:     bash -c io-cat tcp:localhost:8888   | head -n5 > a.csv
io-topics: cat:     bash -c io-cat tcp:localhost:9999   | head -n5 > b.csv
> # check output            
> cat a.csv 
> cat b.csv 

If you would like to suspend your log playback (e.g. for a demo, while visualising a point cloud stream or any other kind of CSV data, or while browsing your data), you can now use csv-play --interactive or csv-play -i, pressing <whitespace> to pause and resume. Try running the example below:

> echo 0 | csv-repeat --period 0.1 --yes | csv-time-stamp | csv-play --interactive
csv-play: running in interactive mode; press <whitespace> to pause or resume
csv-play: paused
csv-play: resumed

Press left or down arrow keys to output one record at a time. (Keys for outputting one block at a time: todo.)

Sometimes, one may need to repeat the same record, just as linux yes does. The problem with yes is that you cannot tell it to repeat at a given time interval.

Now, csv-repeat --ignore-eof can do it for you, which is useful for example, if you need to quickly fudge a sort of heartbeat stream, a simulated data stream, or alike:

> echo hello | csv-repeat --period 0.1 --ignore-eof | head -n5
> echo hello | csv-repeat --period 0.1 --ignore-eof | csv-time-stamp | head -n5

Binary mode is supported as usual.

points-calc nearest-(min/max) and percentile operations search within a given radius around each input point. This can take a lot of time for a large amount of input data.

One way to speed things up is to, instead of finding the nearest min to each point in a given radius, find the minimum for the 27 voxels in the neighbourhood of the voxel containing the point. That computed value is assigned to each point in that voxel.

This optimization is used when points-calc nearest-(min/max) or points-calc percentile is given --fast command line argument. For example

> points-calc nearest-min --full --fast --fields x,y,scalar --radius 1
> points-calc percentile --percentile=0.03 --fast --fields x,y,scalar --radius 1

On a large point cloud, like that of rose street (*.csv.gz ), the optimized operations were found to be 20 times faster for extrema and more than 100 times faster for percentiles.
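
The voxel-based shortcut described above can be sketched in Python as follows. This is an illustration only: the function fast_nearest_min is hypothetical, and taking the voxel size equal to the radius is an assumption of the sketch.

```python
import math
from collections import defaultdict
from itertools import product

def fast_nearest_min(points, radius):
    """For each (x, y, z, scalar), return the min scalar over its 27-voxel neighbourhood."""
    def voxel(p):
        return tuple(math.floor(c / radius) for c in p[:3])

    voxel_min = defaultdict(lambda: math.inf)
    for p in points:                                    # minimum per voxel
        v = voxel(p)
        voxel_min[v] = min(voxel_min[v], p[3])

    cache = {}
    def neighbourhood_min(v):                           # computed once per voxel
        if v not in cache:
            cache[v] = min(voxel_min.get(tuple(a + b for a, b in zip(v, d)), math.inf)
                           for d in product((-1, 0, 1), repeat=3))
        return cache[v]

    return [neighbourhood_min(voxel(p)) for p in points]

points = [(0.1, 0.1, 0.1, 5), (0.9, 0.2, 0.3, 3), (1.5, 0.1, 0.1, 7), (10, 10, 10, 1)]
print(fast_nearest_min(points, radius=1))               # [3, 3, 3, 1]
```

The speed-up comes from doing the neighbourhood search once per voxel instead of once per point; the price is that the result is the same for all points in a voxel.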

Assume you would like to quickly find additive changes in the scene. For example you have a static point cloud of empty car park, and would like to extract the parked cars from a stream of lidar data. If the extraction does not have to be perfect, a quick way of doing it would be using points-join --not-matching. A simple example:

> # make sample point clouds
> for i in {20..30}; do for j in {0..50}; do for k in {0..50}; do echo $i,$j,$k; done; done; done > minuend.csv
> for i in {0..50}; do for j in {20..30}; do for k in {20..30}; do echo $i,$j,$k; done; done; done > subtrahend.csv
> cat minuend.csv | points-join subtrahend.csv --radius 0.51 --not-matching | view-points "minuend.csv;colour=red;hide" "subtrahend.csv;colour=yellow;hide" "-;colour=white;title=difference"

The described car park scenario would look like:

> cat carpark-with-cars.csv | points-join --fields x,y,z "empty-carpark.csv;fields=x,y,z" --radius 0.1 --not-matching > cars-only.csv

The crude part is of course in choosing --radius value: it should be such that the spheres of a given radius around the subtrahend point cloud sufficiently overlap to capture all the points belonging to it. But then the points that are closer than the radius to the subtrahend point cloud will be filtered out, too. E.g. in the car park example above, the wheels of the cars will be chopped off at 10cm above the ground. To avoid this problem, you could for example erode somehow the subtrahend point cloud by the radius.
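
The filtering and the side effect of the radius can be seen in a brute-force Python sketch. The function not_matching is hypothetical and checks every pair of points; it says nothing about how points-join is implemented.

```python
import math

def not_matching(minuend, subtrahend, radius):
    """Keep the minuend points that have no subtrahend point within radius."""
    return [p for p in minuend
            if all(math.dist(p, q) > radius for q in subtrahend)]

ground = [(x, 0, 0) for x in range(3)]                  # the "empty car park"
scene = ground + [(1, 0, 0.05), (1, 0, 0.5)]            # plus two points on a "car"
print(not_matching(scene, ground, radius=0.1))          # [(1, 0, 0.5)]
```

The car point at 5cm above the ground is filtered out together with the ground, which is the chopped-off-wheels effect mentioned above.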

The described approach may be crude, but it is quick and suitable for many practical purposes.

Of course, for more sophisticated change detection in point clouds, which is more accurate and takes into account view points, occlusions, additions and deletions of objects in the scene, etc, you could use points-detect-change.

Assume that you happened to know the coordinates of your sensor in some Cartesian coordinate system, and you want to derive the coordinates of your robot centre. For the robot configuration you know the offset of the sensor from the robot centre, but not the other way around. The solution:

get inverse offset
# assume this is the offset of the sensor from the robot centre

inversed_offset=$( echo "0,0,0,0,0,0" | points-frame --to="$offset" --fields="x,y,z,roll,pitch,yaw" --precision=16 )

Now inversed_offset is the position (and pose) of the robot centre in the coordinate system associated with the sensor.
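
The geometry behind the inverse offset can be checked with a planar sketch. To stay short, it uses poses (x, y, yaw) instead of the full (x, y, z, roll, pitch, yaw), which is a simplification; the functions compose and inverse are hypothetical helpers, not part of points-frame.

```python
import math

def compose(a, b):
    """Pose b, given in the frame of pose a, expressed in the parent frame of a."""
    ax, ay, ayaw = a
    bx, by, byaw = b
    c, s = math.cos(ayaw), math.sin(ayaw)
    return (ax + c * bx - s * by, ay + s * bx + c * by, ayaw + byaw)

def inverse(pose):
    """Pose of the parent frame as seen from pose: (-R^T t, -yaw)."""
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (-(c * x + s * y), -(-s * x + c * y), -yaw)

offset = (1.0, 2.0, 0.3)                    # sensor relative to the robot centre
centre = (10.0, 20.0, 1.0)                  # robot centre in the world frame
sensor = compose(centre, offset)            # sensor in the world frame
restored = compose(sensor, inverse(offset)) # back to the robot centre
print(centre, restored)
```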

Step by step demo:

  • Start with some coordinates (navigation data in the world frame; the specific coordinate system does not matter). A sample data file is attached to this page:

    get nav
    cat nav.bin | csv-from-bin t,6d | head -n 2

    The example uses binary, but this is up to you. The nav.bin file contains the trajectory of the robot centre (GPS unit) in the world frame.

  • Get the coordinates of the sensor in the world frame:

    get sensor trajectory
    cat nav.bin | csv-paste "-;binary=t,6d" "value=$offset;binary=6d" \
        | points-frame --from --fields=",frame,x,y,z,roll,pitch,yaw" --binary="t,6d,6d" \
        | csv-shuffle --fields="t,,,,,,,,,,,,,x,y,z,roll,pitch,yaw" --binary="t,6d,6d,6d" --output-fields="t,x,y,z,roll,pitch,yaw" > sensor.bin

    Now sensor.bin is the trajectory of the sensor in the world frame. We want to get the trajectory of the robot centre from these data.

  • Just do it:

    get centre coordinates back
    cat sensor.bin | csv-paste "-;binary=t,6d" "value=$inversed_offset;binary=6d" \
        | points-frame --from --fields=",frame,x,y,z,roll,pitch,yaw" --binary="t,6d,6d" \
        | csv-shuffle --fields="t,,,,,,,,,,,,,x,y,z,roll,pitch,yaw" --binary="t,6d,6d,6d" --output-fields="t,x,y,z,roll,pitch,yaw" > restored.bin

    Note the use of inversed_offset.

  • Verify by comparing to the original nav.bin:

    cat nav.bin \
        | csv-paste "-;binary=t,6d" "restored.bin;binary=t,6d" \
        | csv-eval --full-xpath --binary="t,6d,t,6d" --fields="f/t,f/x,f/y,f/z,f/roll,f/pitch,f/yaw,s/t,s/x,s/y,s/z,s/roll,s/pitch,s/yaw" \
            "dx = abs(f_x - s_x); dy = abs(f_y - s_y); dz = abs(f_z - s_z); droll = abs(f_roll - s_roll); dpitch = abs(f_pitch - s_pitch); dyaw = abs(f_yaw - s_yaw);" \
        | csv-shuffle --fields=",,,,,,,,,,,,,,dx,dy,dz,droll,dpitch,dyaw" --binary="t,6d,t,6d,6d" --output-fields="dx,dy,dz,droll,dpitch,dyaw" \
        | csv-calc --fields="dx,dy,dz,droll,dpitch,dyaw" --binary="6d" mean \
        | csv-from-bin 6d

    The output is on the order of \( 10^{-16} \) . The precision is defined by the accuracy of inversed_offset calculations above. If the --precision=16 option were not given, the comparison would be valid up to \( 10^{-12} \) or so.

Rabbit MQ


Rabbit MQ is an open source message queue service.

It implements AMQP 0-9-1.

programming tutorials:


For Ubuntu:


sudo apt-get install rabbitmq-server
# check service is installed and running
service rabbitmq-server status
# for python clients
sudo pip install pika


(for other platforms see installation from the website)


rabbit-cat is a light rabbit MQ client in python available in comma.

See rabbit-cat -h for examples.

example 1

For receiver run:


rabbit-cat listen localhost --queue="queue1"


For sender in a separate terminal:


echo "hello world!" | rabbit-cat send localhost --queue="queue1" --routing-key="queue1"



Suppose, the GPS unit on a vehicle is offset from the vehicle geometrical centre. Therefore, you most likely need to convert the GPS trajectory as 6DOF points (x,y,z,roll,pitch,yaw) to the trajectory of the vehicle centre.

The other (almost identical) use case: you have got a trajectory from Visual SLAM relative to a lidar and now want to convert it into the vehicle centre trajectory.

Now, it can be done as follows:

> gps_unit_offset=1,2,3,0.1,0.2,0.3
> cat gps_unit_trajectory.csv | points-frame --position $gps_unit_offset --fields frame

In the past, in such a situation, one would need to jump through hoops with points-frame as follows:

> cat gps_unit_trajectory.csv | csv-paste - value=$gps_unit_offset | points-frame --fields frame,position | csv-shuffle --fields ,,,,,,,,,,,,x,y,z,roll,pitch,yaw --output-fields x,y,z,roll,pitch,yaw

When joining two point clouds, if you would like to output a few nearest points, now you can use points-join with --size option:

> # single nearest point (same as before):
> echo 0,0,0 | points-join <( echo 0,0,1; echo 0,0,2; echo 0,0,3 ) --radius 5
> # up to a given number of nearest points:
> echo 0,0,0 | points-join <( echo 0,0,1; echo 0,0,2; echo 0,0,3 ) --radius 5 --size 2
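
The effect of --size can be sketched with a brute-force Python function. The name nearest is hypothetical, and the sketch only mirrors the selection logic (up to a given number of points within the radius, nearest first), not the points-join output format.

```python
import math

def nearest(point, cloud, radius, size=1):
    """Up to size points of cloud within radius of point, nearest first."""
    in_range = [q for q in cloud if math.dist(point, q) <= radius]
    return sorted(in_range, key=lambda q: math.dist(point, q))[:size]

cloud = [(0, 0, 1), (0, 0, 2), (0, 0, 3)]
print(nearest((0, 0, 0), cloud, radius=5))              # [(0, 0, 1)]
print(nearest((0, 0, 0), cloud, radius=5, size=2))      # [(0, 0, 1), (0, 0, 2)]
```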

Suppose you have two point clouds, cloud 1 and cloud 2. Suppose, for each point P from cloud 1, you would like to get all the points from cloud 2 that are not farther than a given radius from P.

Then, you could use points-join --all:

> cat cloud-1.csv | points-join cloud-2.csv --radius 1.5 --all

Now, you also can specify variable radius for points in cloud 1. (E.g. your radius may vary depending on your point cloud density or structure, as it happened in our use case.)

Then you could run:

> cat cloud-1.csv | points-join --fields x,y,z,radius cloud-2.csv --radius 1.5 --all

(Note that, as a limitation, the point-specific radius should not exceed --radius value.)

Sometimes, when you run some slow processing of a point cloud and output the result in a file, you may want to monitor the progress. Then, the following trick may help you:

> some_slow_processing_script > points.csv &
> ( i=0; while true; do cat points.csv | csv-paste value=$i -; (( ++i )); sleep 30; done ) | view-points --fields block,,,,x,y,z

Suppose, you need to go through a dataset to pick images for your classification training data.

cv-cat view can be used for basic browsing, selecting images, and assigning a numeric label to them:

cat images.bin | cv-cat "view=0,,png"

The command above will show the image and wait for a key press:

Press whitespace to save the file as <timestamp>.png, e.g. 20170101T123456.222222.png

Press numerical keys 0-9: save the file as <timestamp>.<num>.png, e.g. if you press 5: 20170101T123456.222222.5.png

Press <Esc> to exit.

Press any other key to show the next frame without saving.


view parameters have the following meaning:

The first parameter is the wait in milliseconds for key press, 0 to wait indefinitely.

The second parameter is the window title (irrelevant for labelling).

The third parameter is the image extension e.g. png, jpg ...; default ppm.


A few new features have been added to cv-cat accumulate filter.

Before, it accumulated images as sliding window of a given size. Now, you could also ask for fixed layout of the accumulated image. It sounds confusing, but try to run the following commands (press any key to move to the next image)

> # make sense of the input
> ( yes 255 | csv-to-bin ub ) | cv-cat --input 'no-header;rows=64;cols=64;type=ub' 'count;view=0;null'
> # accumulate as sliding window of size 4
> ( yes 255 | csv-to-bin ub ) | cv-cat --input 'no-header;rows=64;cols=64;type=ub' 'count;accumulate=4;view=0;null'
> # accumulate as sliding window of size 4 in reverse order
> ( yes 255 | csv-to-bin ub ) | cv-cat --input 'no-header;rows=64;cols=64;type=ub' 'count;accumulate=4,,reverse;view=0;null'
> # accumulate images in fixed order
> ( yes 255 | csv-to-bin ub ) | cv-cat --input 'no-header;rows=64;cols=64;type=ub' 'count;accumulate=4,fixed;view=0;null'
> # accumulate images in fixed order (reverse)
> ( yes 255 | csv-to-bin ub ) | cv-cat --input 'no-header;rows=64;cols=64;type=ub' 'count;accumulate=4,fixed,reverse;view=0;null'

For example, if you want to create an image from fixed number of tiles, you could run something like this:

> ( yes 255 | csv-to-bin ub ) | cv-cat --fps 1 --input 'no-header;rows=64;cols=64;type=ub' 'count;accumulate=4,fixed;untile=2,2;view=0;null'

Say, you process images, but would like to view them in the middle of your pipeline in a different way (e.g. increase their brightness, resize, etc).

Now, you can do it with cv-cat tee. For example:

> # make a test image
> ( echo 20170101T000000,64,64,0 | csv-to-bin t,3ui; yes 255 | head -n $(( 64 * 64 )) | csv-to-bin ub ) > image.bin
> # observe that the images viewed in tee are passed unmodified down the main pipeline for further processing
> for i in {1..100}; do cat image.bin; done | cv-cat --fps 1 "count;tee=invert|view;resize=2" | cv-cat "view;null"

You could specify (almost) any pipeline in your tee filter, but viewing and, perhaps, saving intermediate images in files seem so far the main use cases.

Recently, we found that cv-cat view stopped working properly, when used several times in the same cv-cat call.

Something like

> cat images.bin | cv-cat "view;invert;view;null"

would either crash or behave in an undefined way. All our debugging has pointed to some sort of race condition in the underlying cv::imshow() call or deeper in X-windows-related stuff; thus, at the moment, it seems to be out of our control.

Use the following instead:

> cat images.bin | cv-cat "view;invert" | cv-cat "view;null"

cv-cat is now able to perform pixel clustering by color using the k-means algorithm.

For example:

> cv-cat --file rippa.png "convert-to=f,0.0039;kmeans=4;view;null" --stay

input image:

output image (4 clusters):

A new convenience utility ros-from-csv is now available in snark. It reads CSV records and converts them into ROS messages with the usual conveniences of csv streams (customised fields, binary format, stream buffering/flushing, etc).

Disclaimer: ros-from-csv is a python application and therefore may not perform well on streams that require high bandwidth or low latency.

You could try it out, using the ROS tutorial Understanding Topics.

Run ROS Tutorial nodes:

> # in a new shell
> roscore
> # in a new shell
> rosrun turtlesim turtle_teleop_key


Send your own messages on the topic, using ros-from-csv:

> echo 1,2,3,4,5,6 | ros-from-csv /turtle1/cmd_vel

Or do a dry run:

> echo 1,2,3,4,5,6 | ros-from-csv /turtle1/cmd_vel --dry
  x: 1.0
  y: 2.0
  z: 3.0
  x: 4.0
  y: 5.0
  z: 6.0

You also can explicitly specify message type:

> # dry run
> echo 1,2,3 | ros-from-csv --type geometry_msgs.msg.Point --dry
x: 1.0
y: 2.0
z: 3.0
> # send to a topic
> echo 1,2,3 | ros-from-csv --type geometry_msgs.msg.Point some-topic

A new convenience utility ros-to-csv is now available in snark. It outputs as CSV the ROS messages from rosbags or from topics published online.

You could try it out, using the ROS tutorial Understanding Topics.

Run ROS Tutorial nodes:

> # in a new shell
> roscore
> # in a new shell
> rosrun turtlesim turtlesim_node
> # in a new shell
> rosrun turtlesim turtle_teleop_key

Run ros-to-csv; then, in the shell where you run turtle_teleop_key, press arrow keys to observe something like:

> # in a new shell
> ros-to-csv /turtle1/cmd_vel --verbose
ros-to-csv: listening to topic '/turtle1/cmd_vel'...

If you log some data in a rosbag:

> # in a new shell
> rosbag record /turtle1/cmd_vel

You could convert it to csv with a command like:

> ros-to-csv /turtle1/cmd_vel --bag 2017-11-06-14-43-34.bag

Sometimes, you have a large file or input stream that is mostly sorted, which you would like to fully sort (e.g. in ascending order).

More formally, suppose you know that for any record Rn in your stream and any record Rm such that m - n > N, Rn < Rm, where N is a constant.

Now, you can sort such a stream, using csv-sort --sliding-window=<N>:


> ( echo 3; echo 1; echo 2; echo 5; echo 4 ) | csv-sort --sliding-window 3 --fields a
> ( echo 4; echo 5; echo 2; echo 1; echo 3 ) | csv-sort --sliding-window 3 --fields a --reverse

As usual, you can sort by multiple key fields (e.g. csv-sort --sliding-window=10 --fields=a,b,c), sort block by block (e.g. csv-sort --sliding-window=10 --fields=t,block), etc.
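
One common way to sort a mostly sorted stream with bounded memory is a min-heap of limited size, sketched below. This shows the general technique only; the function sliding_window_sort is hypothetical, and csv-sort may implement --sliding-window differently.

```python
import heapq

def sliding_window_sort(records, window):
    """Sort a stream in which no record is more than window places out of order."""
    heap = []
    for record in records:
        heapq.heappush(heap, record)
        if len(heap) > window:           # the smallest buffered record is now final
            yield heapq.heappop(heap)
    while heap:                          # flush the tail at the end of the stream
        yield heapq.heappop(heap)

print(list(sliding_window_sort([3, 1, 2, 5, 4], window=3)))    # [1, 2, 3, 4, 5]
```

Only about window records are held in memory at any time, which is what makes this approach usable on large files and live streams.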

Sometimes, you have a large file or input stream that is mostly sorted by some fields with just a few records out of order now and then. You may not care about those few outliers, all you want is most of your data sorted.

Now, you can discard the records out of order, using csv-sort, e.g:

> ( echo 0; echo 1; echo 2; echo 1; echo 3 ) | csv-sort --discard-out-of-order --fields a
> ( echo 3; echo 2; echo 1; echo 2; echo 0 ) | csv-sort --discard-out-of-order --fields a --reverse

As usual, you can sort by multiple key fields (e.g. csv-sort --discard-out-of-order --fields=a,b,c), sort block by block (e.g. csv-sort --discard-out-of-order --fields=t,block), etc.
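
The discarding logic can be sketched in a few lines of Python. The function discard_out_of_order is hypothetical, and keeping records whose key equals the previous one is an assumption of the sketch.

```python
def discard_out_of_order(records, reverse=False):
    """Drop every record that would break the ordering of what was already output."""
    last = None
    for record in records:
        if reverse:
            in_order = last is None or record <= last
        else:
            in_order = last is None or record >= last
        if in_order:
            yield record
            last = record

print(list(discard_out_of_order([0, 1, 2, 1, 3])))                 # [0, 1, 2, 3]
print(list(discard_out_of_order([3, 2, 1, 2, 0], reverse=True)))   # [3, 2, 1, 0]
```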

The ratio and linear-combination operations of cv-cat have been extended to support assignment to multiple channels. Previously, these operations would take up to 4 input channels (symbolically always named r, g, b, and a, regardless of the actual contents of the data) and produce a single-channel, grey-scale output. Now you can assign up to four channels:

ratio syntax
... | cv-cat "ratio=(r-b)/(r+b),(r-g)/(r+g),r+b,r+g"

The right-hand side of the ratio / linear combination operations contains comma-separated expressions defining each of the output channels through the input channels. The number of output channels is the number of comma-separated fields; it may differ from the number of input channels. As a shortcut, an empty field, such as in

ratio syntax shortcut
... | cv-cat "ratio=,r+g+b,"

is interpreted as channel pass-through. In the example above the output has three channels, with channels 0 and 2 assigned verbatim to the input channels 0 and 2 (r and b, symbolically), and the channel 1 (symbolic g) assigned to the sum of all three channels.

As yet another shortcut, cv-cat provides a shuffle operation that re-arranges the input channels without changing their values:

shuffle syntax
... | cv-cat "shuffle=b,g,r,r"

In this case, the order of the first 3 channels is reversed, while the former channel r is also duplicated into channel 3 (alpha). Internally, shuffling is implemented as a restricted case of linear combination, and therefore, other usual rules apply: the number of output channels is up to 4, it does not depend on the number of input channels, and an empty field in the right-hand side is interpreted as channel pass-through.
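
The channel mapping rules for shuffle, including the pass-through shortcut, can be illustrated on a single pixel. The function shuffle below is a hypothetical Python sketch of those rules, not the cv-cat implementation.

```python
CHANNELS = "rgba"                        # symbolic names of input channels 0..3

def shuffle(pixel, spec):
    """Rearrange a pixel (tuple of channel values) as in shuffle=<spec>."""
    out = []
    for i, name in enumerate(spec.split(",")):
        # an empty field passes the channel at the same position through
        out.append(pixel[i] if name == "" else pixel[CHANNELS.index(name)])
    return tuple(out)

print(shuffle((10, 20, 30), "b,g,r,r"))  # (30, 20, 10, 10)
print(shuffle((10, 20, 30), ",r,"))      # (10, 10, 30)
```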

When using view-points, there often is a need to quickly visualise or hide several point clouds or other graphic primitives.

Now, you can group data in view-points, using groups key word. A source can be assigned to one or more groups by using the groups arguments. Basic usage is:

view-points "...;groups=g1,g2"

For example if we have two graphs as follows:

$ cat <<EOF > edges01.csv

$ cat <<EOF > nodes01.csv

$ cat <<EOF > edges02.csv

$ cat <<EOF > nodes02.csv

We can separate the graphs as well as group together nodes and edges of different graphs as follows:

$ view-points "nodes01.csv;fields=x,y,z,label;colour=yellow;weight=5;groups=graph01,nodes,all" \
	"edges01.csv;fields=first/x,first/y,first/z,second/x,second/y,second/z;shape=line;colour=yellow;shape=line;groups=graph01,edges,all" \
	"nodes02.csv;fields=x,y,z,label;colour=green;weight=5;groups=graph02,nodes,all" \

Try to switch on/off checkboxes for various groups (e.g. "graph01", "nodes", etc) and observe the effect.

A quick note on new operations in the cv-calc utility. Time does not permit presenting proper examples, but hopefully cv-calc --help will be sufficient to give you an idea.

cv-calc grep

Output only those input images that conform to a certain condition. Currently, only min/max number or ratio of non-zero pixels is supported, but the condition can be any set of filters applied to the input image (see cv-cat --help --verbose for the list of the filters available).

Example: Output only images that have at least 60% of pixels darker than a given threshold:

> cat images.bin | cv-calc grep --filters="convert-to=f,0.0039;invert;threshold=0.555" --non-zero=ratio,0.6
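
The condition in the example (at least 60% of pixels darker than a threshold) can be expressed directly in Python for illustration. The functions dark_ratio and grep are hypothetical and operate on plain lists of grey values; they do not reproduce the cv-cat filter chain used in the command above.

```python
def dark_ratio(image, threshold):
    """Fraction of pixels strictly darker than threshold."""
    pixels = [v for row in image for v in row]
    return sum(v < threshold for v in pixels) / len(pixels)

def grep(images, threshold, min_ratio):
    """Keep only the images with at least min_ratio of dark pixels."""
    return [image for image in images if dark_ratio(image, threshold) >= min_ratio]

dark = [[10, 20], [30, 200]]             # 3 of 4 pixels are dark
bright = [[200, 210], [10, 220]]         # 1 of 4 pixels is dark
print(grep([dark, bright], threshold=114, min_ratio=0.6))      # [[[10, 20], [30, 200]]]
```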

cv-calc stride

Stride over the input image with a given kernel (just like a convolution stride) and output the resulting images.

cv-calc thin

Thin the image stream by a given rate or a desired frames-per-second number.


