## Ssh, tty, stdout and stderr

From: andrew cooke <andrew@...>

Date: Wed, 19 Nov 2014 20:32:11 -0300

The following code runs a calibration on a seismometer (this is the system I
developed at ISTI):

ssi_calconfig -n
ssi_caladdsignal -w random -D 60 -a 5 -s KIV1 -c SHZ
ssi_calrunat -n

It starts a new calibration, adds a random waveform (duration 60s, amplitude
5%) and then runs the calibration immediately.

Under the hood, the first two commands add entries to a database that
describes the calibration; the third command starts a scheduler that executes
the commands necessary to trigger the hardware, collect the data, calculate
the results, etc.

Separately, a co-worker developed a GUI for this system.  The GUI can control
calibrations on a remote machine.  When it does so, it executes the commands
above over ssh.

The following script (I've removed connection details) emulates that:

#!/bin/bash
ssh ... ssi_calconfig -n
ssh ... ssi_caladdsignal -w random -D 60 -a 5 -s KIV1 -c SHZ
ssh ... ssi_calrunat -n

Unfortunately, both the script and the GUI have the same problem: one program
crashes.  That program is forked from the scheduler, which is itself forked
from ssi_calrunat.

As far as I could tell, the only thing unusual about this program was that it
called a library that wrote an error message to stdout (this is just crappy
programming - everything else uses a logging system - but for whatever reason
we had to use this library).

After searching the net, I decided this was somehow related to ttys - that
seemed to be the only "real" difference between executing commands locally
(which worked) and remotely (which failed).

My hypothesis was supported by this script, which worked:

#!/bin/bash
ssh -t ... ssi_calconfig -n
ssh -t ... ssi_caladdsignal -w random -D 60 -a 5 -s KIV1 -c SHZ
ssh -t ... nohup ssi_calrunat -n

However, when we modified the GUI to include the changes above, the program
continued to crash!

I finally fixed the issue by replacing "-t" with "-tt".  The man page says:

Multiple -t options force tty allocation, even if ssh has no local tty.

so I guess the difference is that a shell script run from a terminal gives ssh
a local tty, while a process started by a Java GUI has none.
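
One way to see what "no local tty" means, without involving the GUI, is to
check whether stdin is a terminal.  With a single "-t", ssh declines to
allocate a pseudo-terminal when its own stdin is not a terminal, and prints a
warning saying so.  The sketch below only illustrates that check; it is my
assumption about the mechanism, not something taken from the GUI code, and
fd_kind is a hypothetical helper name.  Redirecting stdin from /dev/null stands
in for a process launched by a GUI with no terminal attached.

```shell
#!/bin/bash
# fd_kind is a hypothetical helper (not part of ssh or the ssi_* tools).
# It reports whether the given file descriptor is attached to a terminal.
fd_kind() {
    if [ -t "$1" ]; then
        echo "tty"
    else
        echo "not a tty"
    fi
}

# Emulate a process with no local tty by taking stdin from /dev/null.
fd_kind 0 </dev/null    # prints "not a tty"
```

If this explanation is right, running the "-t" script with stdin redirected
from /dev/null should reproduce the GUI's failure from an ordinary terminal,
and the "-tt" version should still work.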

Andrew

PS I was criticized for wasting time "doing computer science" because I
tracked this down rather than "just removing the print statement" from the
library that was crashing.  I can see the concern, but (1) if that is the
concern, be honest and say we don't have time and that an ugly hack will have
to do, and (2) just because you don't understand something doesn't mean it's
"computer science".