This blog entry describes a rather subtle bug that leads to unexpected behaviour when handling the PIPE signal in C++.
First, a brief reminder of how and when the PIPE signal is used. Assume we have a pipeline of commands:
The commands generate and process textual output, but we take only the first 2 lines. Once head has received two lines of output, it terminates. On the next write, the standard output of command has no recipient, so the operating system sends a PIPE signal to command, which by default terminates it. The point to note is that the signal is sent only when a command attempts to write something to standard output. As long as the command stays silent, no signal is sent and the command keeps running, e.g.
runs for 10 seconds, even if the "recipient" of the output exits after only 1 second.
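To make the "signal only on write" rule concrete, here is a minimal sketch. It is not taken from comma, and the function name is made up for illustration. It creates a pipe, closes the reading end, and only then writes. SIGPIPE is ignored for the duration of the call, so the failed write shows up as an EPIPE error instead of terminating the process.

```cpp
#include <errno.h>
#include <signal.h>
#include <unistd.h>

// Hypothetical helper: returns the errno observed when writing to a pipe
// whose reading end is already closed, 0 if the write unexpectedly succeeds,
// or -1 if the pipe could not be created.
int errno_of_write_to_closed_pipe()
{
    int fds[2];
    if( ::pipe( fds ) != 0 ) { return -1; }

    // Ignore SIGPIPE temporarily, so that the failure is reported as EPIPE
    // rather than by killing the process.
    struct sigaction ignore = {}, previous = {};
    ignore.sa_handler = SIG_IGN;
    sigemptyset( &ignore.sa_mask );
    ::sigaction( SIGPIPE, &ignore, &previous );

    ::close( fds[0] ); // the "recipient" goes away; nothing happens yet

    const char line[] = "hello\n";
    errno = 0;
    const ssize_t written = ::write( fds[1], line, sizeof( line ) - 1 ); // only now the problem surfaces
    const int result = written < 0 ? errno : 0;

    ::close( fds[1] );
    ::sigaction( SIGPIPE, &previous, nullptr ); // put the old disposition back
    return result;
}
```

Closing the reading end has no visible effect on the writer; the error (or, with the default disposition, the signal) appears only at the write call.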
With this pattern in mind, consider the following snippet of code:
The code is copied from the csv-to-bin utility at git revision c2521b3d83ee5f77cb1edf3fe7d42b767b4a392b. The exact details of the signal_flag class are not relevant; it suffices to say that on receipt of an INT, TERM, or PIPE signal it starts evaluating to logical "true", and execution then resumes from the place where the signal was received. If you want to follow the problem hands-on, check out the code with git checkout c2521b3. To return the code to the current (HEAD) revision, run git checkout master.
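For readers who have not seen this pattern before, a flag of that kind could look roughly like the sketch below. This is an assumption-laden illustration of the idea, not the actual comma implementation: the namespace, the handler name, and the internals are all invented here.

```cpp
#include <signal.h>

namespace sketch { // hypothetical; not the real comma::signal_flag

// Written by the handler, read by the main code. sig_atomic_t is the type
// that can safely be written from inside a signal handler.
static volatile sig_atomic_t signal_received = 0;

// The handler only records that a signal arrived; execution then resumes
// at the point where the signal was received.
static void remember_signal( int ) { signal_received = 1; }

class signal_flag
{
public:
    // Install the recording handler for INT, TERM, and PIPE.
    signal_flag()
    {
        struct sigaction action = {};
        action.sa_handler = remember_signal;
        sigemptyset( &action.sa_mask );
        const int signals[] = { SIGINT, SIGTERM, SIGPIPE };
        for( int signo : signals ) { ::sigaction( signo, &action, nullptr ); }
    }

    // Evaluates to true once any of the handled signals has been received.
    explicit operator bool() const { return signal_received != 0; }
};

} // namespace sketch
```

A main loop would then test the flag on every iteration and bail out when it becomes true. Note that this sketch never restores the previous handlers, which matters for the discussion further down.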
Now consider the following script using csv-to-bin:
Let us invoke the script in the following pattern:
The expected sequence of events is:
- initially we see lines "./count-bin.sh: output 0" from the script itself (on the standard error) and ">>> 0" from csv-from-bin on standard output
- after two iterations (two lines on standard output), head terminates
- when csv-from-bin attempts to write its output on the next iteration (counter n is 3), the pipe is closed and there is no recipient; therefore, csv-from-bin receives a PIPE signal and terminates; we shall see output from the script itself (on standard error) but no line ">>> 2" on standard output
- finally, on the next iteration there is no recipient for the output from the script itself, and therefore, csv-to-bin shall receive a PIPE signal and terminate with the "interrupted by signal" message, the script shall write the "output failed" message and exit
So far so good. The actual output, however, is:
The script keeps running: csv-to-bin apparently never receives SIGPIPE, although the head and csv-from-bin processes are gone (this can be confirmed by looking at the process tree from a separate terminal).
So, what went wrong?
The standard output is buffered by default. Therefore, no actual write is made in the main loop of csv-to-bin (unless the '--flush' option is used or the buffer fills up, which does not happen in our example), and with no write, no signal is sent.
Once all the input is processed, the main loop terminates and proceeds to the "return 0" line. Again, nothing is written yet and no signal sent.
Finally, the main function exits. At this point, C++ invokes the destructors of all the global objects, including the output streams, and only now is the output actually written. This is the time when csv-to-bin encounters the lack of an output recipient and gets a PIPE signal. However, by this time we are well out of the userland code. The signal is received, but nothing is left to act on it. For the end user, it looks as if csv-to-bin receives a signal and ignores it, exiting with the status of 0, which was already set by "return 0" before the signal arrived.
From the point of view of the count-bin.sh script, the csv-to-bin call was a success, and therefore the script keeps running, contrary to what we expected to achieve by using "head -n 2".
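The delay between "writing" and actually writing can be reproduced in isolation. The following sketch uses a C stdio stream on top of a pipe instead of std::cout, purely because that is easier to set up in a few lines; the struct and function names are made up. SIGPIPE is ignored here (rather than handled by a flag), so the failure is visible as an error code.

```cpp
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

struct buffered_write_result
{
    bool put_succeeded;   // did writing into the stdio buffer report success?
    bool flush_succeeded; // did the flush, i.e. the real write(), succeed?
    int flush_errno;      // errno left behind by the flush
};

// Hypothetical helper: writes one short line to a buffered stream whose
// reader is already gone, then flushes it.
buffered_write_result buffered_write_to_closed_pipe()
{
    buffered_write_result result = { false, false, 0 };
    int fds[2];
    if( ::pipe( fds ) != 0 ) { return result; }

    struct sigaction ignore = {}, previous = {};
    ignore.sa_handler = SIG_IGN; // report EPIPE instead of terminating
    sigemptyset( &ignore.sa_mask );
    ::sigaction( SIGPIPE, &ignore, &previous );

    ::close( fds[0] ); // no recipient any more

    FILE* out = fdopen( fds[1], "w" ); // a pipe is not a terminal, so the stream is fully buffered
    if( out != nullptr )
    {
        result.put_succeeded = fputs( "0\n", out ) >= 0; // lands in the buffer only
        errno = 0;
        result.flush_succeeded = fflush( out ) == 0;     // the first real write()
        result.flush_errno = errno;
        fclose( out );
    }
    else
    {
        ::close( fds[1] );
    }
    ::sigaction( SIGPIPE, &previous, nullptr );
    return result;
}
```

The buffered write reports success even though nobody is listening; the problem is discovered only by the flush. In csv-to-bin that flush happens after main has returned.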
Depending on your requirements, any of the following approaches can be used:
Do not handle PIPE signal
This is the simplest way and it has been implemented in the current version of csv-to-bin and other comma applications. If no user handler is set for SIGPIPE, the default behaviour applies: on receipt of SIGPIPE the program terminates, and the shell reports an exit status of 141 (128 plus the signal number, 13). Unless the user must do something really special on receiving the signal, e.g., write a log file, sync a database, and so on, there is no need to handle PIPE (or any other signal for that matter) explicitly.
Flush after yourself
Nuff said. If you do need to handle SIGPIPE, make sure that every output is flushed (or not buffered in the first place). The flush will trigger a PIPE signal if no-one reads your output. Note that performance may be badly affected by this approach.
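As a rough illustration, a "flush after yourself" output routine might look like the sketch below. The function name is invented for this post. It assumes that SIGPIPE is handled or ignored, so that a write to a closed pipe fails and leaves the stream in an error state instead of terminating the process.

```cpp
#include <iostream>
#include <string>

// Writes one record and flushes immediately, so that a closed pipe is
// noticed on this very record rather than when the buffer is finally drained.
// Returns false if the stream is in a failed state after the flush.
bool write_and_flush( std::ostream& os, const std::string& record )
{
    os << record << '\n';
    os.flush();
    return os.good();
}

// Convenience overload for the common case of writing to standard output.
bool write_and_flush( const std::string& record ) { return write_and_flush( std::cout, record ); }
```

The caller would stop its main loop as soon as the function returns false (or as soon as the signal flag becomes true). The price is one system call per record, which is where the performance hit comes from.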
Change the signal handler to perform the necessary last-minute action after receiving SIGPIPE and then re-send the signal to the process itself. In this case, the utility will also terminate with exit status of 141.
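A handler of this kind could be sketched as follows. The names are hypothetical and the "last-minute action" is reduced to printing a message. The second function exists only for demonstration: it runs the handler in a child process and reports which signal terminated the child.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Handler: do the last-minute work, then let the default action terminate us.
void cleanup_and_reraise( int signo )
{
    // Only async-signal-safe calls are allowed here; write() is one of them.
    const char message[] = "interrupted by signal\n";
    const ssize_t unused = ::write( STDERR_FILENO, message, sizeof( message ) - 1 );
    (void)unused;
    ::signal( signo, SIG_DFL ); // restore the default disposition...
    ::raise( signo );           // ...and re-send the signal to ourselves
}

// Demonstration only: returns the number of the signal that terminated the
// child, 0 if the child exited normally, or -1 on error.
int signal_that_terminated_child()
{
    const pid_t pid = ::fork();
    if( pid < 0 ) { return -1; }
    if( pid == 0 )
    {
        struct sigaction action = {};
        action.sa_handler = cleanup_and_reraise;
        sigemptyset( &action.sa_mask );
        ::sigaction( SIGPIPE, &action, nullptr );
        ::raise( SIGPIPE ); // stands in for a write to a closed pipe
        ::_exit( 0 );       // reached only if the signal did not terminate us
    }
    int status = 0;
    if( ::waitpid( pid, &status, 0 ) != pid ) { return -1; }
    return WIFSIGNALED( status ) ? WTERMSIG( status ) : 0;
}
```

Because the process dies from the signal itself rather than from a plain exit, the parent sees the same termination status as with the default handler.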
Restore the default signal handler
The custom signal handler is instantiated in the constructor of signal_flag object. Once the object is out of scope, it shall restore the default handler. This shall be the default implementation but has not been done yet. This approach is more appropriate for longer-running applications that must handle signals during some special sections of the code. Once out of the special section, the default handler shall apply. The special handler shall perform the necessary last-minute actions and then re-send the signal to the application.
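Since this has not been implemented yet, the following is only a sketch of how such a scoped handler might be written; the class and helper names are assumptions made for this example. The constructor remembers whatever handler was installed before, and the destructor puts it back.

```cpp
#include <signal.h>

using handler_type = void ( * )( int );

// Installs a handler for one signal and restores the previous disposition
// when the object goes out of scope.
class scoped_signal_handler
{
public:
    scoped_signal_handler( int signo, handler_type handler ) : signo_( signo ), previous_()
    {
        struct sigaction action = {};
        action.sa_handler = handler;
        sigemptyset( &action.sa_mask );
        ::sigaction( signo_, &action, &previous_ ); // remember what was there before
    }

    ~scoped_signal_handler() { ::sigaction( signo_, &previous_, nullptr ); }

    // Copying would lead to the old handler being restored twice.
    scoped_signal_handler( const scoped_signal_handler& ) = delete;
    scoped_signal_handler& operator=( const scoped_signal_handler& ) = delete;

private:
    int signo_;
    struct sigaction previous_;
};

// Helper for demonstration: the handler currently installed for signo.
handler_type current_handler( int signo )
{
    struct sigaction current = {};
    ::sigaction( signo, nullptr, &current );
    return current.sa_handler;
}
```

The special section of the code would simply be a block that declares such an object at the top; leaving the block, whether normally or by exception, brings back the handler that was in place before.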