Elixir/Ports and external process wiring: Difference between revisions

Line 112:

It's possible to send a signal by shelling out to unix <code>kill PID</code>, but BEAM doesn't expose the child process ID and doesn't include any built-in functions to send a signal to an OS process. Clearly we're expected to do this another way. Another problem with "kill" is that we want the external process to stop no matter how badly the BEAM is damaged, so we shouldn't rely on stored data or on running final clean-up logic before exiting.

To debug what happens during <code>port_close</code> and to eliminate variables, I tried ~~to spawn~~ <code>sleep 60</code> ~~using the same Port command,~~ and I found that it behaves exactly the same way, hanging until ~~the~~ sleep ends naturally regardless of what happened in Elixir or whether its pipes are still open. This happens to have been a lucky choice as I learned later: "sleep" is ~~unusual in the same way as~~ rsync but its behavior is much simpler to reason about.

To debug what happens during <code>port_close</code> and to eliminate variables, I tried spawning <code>sleep 60</code> instead of rsync and found that it behaves in exactly the same way: it hangs until <code>sleep</code> ends naturally, regardless of what happens in Elixir or whether its pipes are still open. This turned out to be a lucky choice, as I learned later: <code>sleep</code> is daemon-like and so resembles rsync, but its behavior is much simpler to reason about.

== Bad assumption: pipe-like processes ==

A pipeline program like <code>gzip</code> or <code>cat</code> is built to read from its input and write to its output. We can roughly group the different styles of command-line application into "pipeline" programs which read and write, "interactive" programs which require user input, and "daemon" programs which are designed to run in the background. Some programs support multiple modes depending on the arguments given at launch, or by detecting the terminal using <code>isatty</code><ref>[https://man.archlinux.org/man/isatty.3.en docs for <code>isatty</code>]</ref>. The BEAM is currently optimized to interface with pipeline programs and it assumes that the external process will stop when its "standard input" is closed.

A typical pipeline program will stop once it detects that input has ended, by ~~making regular C system calls to~~ <code>read</code><ref>[https://man.archlinux.org/man/read.2 libc <code>read</code> docs]</ref>:<syntaxhighlight lang="c">

A typical pipeline program will stop once it detects that input has ended, for example by calling <code>read</code><ref>[https://man.archlinux.org/man/read.2 libc <code>read</code> docs]</ref> in a loop:<syntaxhighlight lang="c">

~~ssize_t n_read~~ = read (input_desc, buf, bufsize);

size_read = read (input_desc, buf, bufsize);

if (~~n_read~~ < 0) { error... }

if (size_read < 0) { error... }

if (~~n_read~~ == 0) { end of file... }

if (size_read == 0) { end of file... }

</syntaxhighlight>When the program uses blocking I/O, reading zero bytes indicates the end of file. There are also programs which do asynchronous I/O using <code>O_NONBLOCK</code><ref>[https://man.archlinux.org/man/open.2.en#O_NONBLOCK O_NONBLOCK docs]</ref>, and these might rely on the <code>HUP</code> hang-up signal which is normally sent when input is closed.

</syntaxhighlight>

But here we'll focus on how processes can more generally affect each other through pipes. Surprising answer: without much effect! You can experiment with the <code>/dev/null</code> device which behaves like a closed pipe, for example compare these two commands:<syntaxhighlight lang="shell">

If the program does blocking I/O, then a zero-byte <code>read</code> indicates the end of file condition. A program which does asynchronous I/O with <code>O_NONBLOCK</code><ref>[https://man.archlinux.org/man/open.2.en#O_NONBLOCK O_NONBLOCK docs]</ref> might instead detect EOF by listening for the <code>HUP</code> hang-up signal which is normally sent when input is closed.
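The blocking case can be tried out in isolation. The following is a minimal sketch, not taken from any real program: the helper name <code>drain_until_eof</code> is hypothetical, and it assumes a POSIX system. It reads from a file descriptor until <code>read</code> returns zero, which for a pipe happens once every write end has been closed.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: read from fd until end of file.
 * Returns the total number of bytes seen, or -1 on a read error. */
long drain_until_eof(int fd) {
    char buf[4096];
    long total = 0;

    for (;;) {
        ssize_t size_read = read(fd, buf, sizeof buf);
        if (size_read < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted by a signal, try again */
            }
            return -1;          /* genuine error */
        }
        if (size_read == 0) {
            return total;       /* end of file: the writer closed its end */
        }
        total += size_read;
    }
}
```

A pipeline program built around a loop like this exits soon after its input is closed, which is the behavior the BEAM relies on.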

But here we'll focus on how processes can more generally affect each other through pipes. Surprising answer: without much effect! You can experiment with the <code>/dev/null</code> device which behaves like a closed pipe, for example compare these two commands:

cat < /dev/null

Line 138:

Line 142:

A small shim can adapt a daemon-like program to behave more like a pipeline. The shim is sensitive to stdin closing or SIGHUP, and when this is detected it converts this into a stronger signal like SIGTERM which it forwards to its own child. This is the idea behind a suggested shell script<ref>[https://hexdocs.pm/elixir/1.19.0/Port.html#module-orphan-operating-system-processes Elixir Port docs showing a shim script]</ref> for Elixir, and the <code>erlexec</code><ref name=":0">[https://hexdocs.pm/erlexec/readme.html <code>erlexec</code> library]</ref> library. The opposite adapter can be found in the [[w:nohup|nohup]] shell command and the grimsby<ref>[https://github.com/shortishly/grimsby <code>grimsby</code> library]</ref> library: these will keep standard in and/or standard out open for the child process even after the parent exits, so that a pipe-like program can behave more like a daemon.

I used the shim approach in my rsync library and it includes a small C program<ref>[https://gitlab.com/adamwight/rsync_ex/-/blob/main/src/main.c?ref_type=heads rsync_ex C shim program]</ref> which wraps rsync and makes it sensitive to BEAM <code>port_close</code>. It's featherweight, leaving pipes unchanged as it passes control to ~~rsync—its only real effect is~~ to ~~convert~~ SIGHUP ~~to SIGKILL~~ (~~but should have been~~ SIGTERM~~, see the sidebar discussion of different signals below~~).

I used the shim approach in my rsync library and it includes a small C program<ref>[https://gitlab.com/adamwight/rsync_ex/-/blob/main/src/main.c?ref_type=heads rsync_ex C shim program]</ref> which wraps rsync and makes it sensitive to BEAM <code>port_close</code>. It's featherweight, leaving pipes unchanged as it passes control to rsync. Here are the business parts:<syntaxhighlight lang="c">// Set up a fail-safe to self-signal with HUP if the controlling process dies.

prctl(PR_SET_PDEATHSIG, SIGHUP);</syntaxhighlight><syntaxhighlight lang="c">

void handle_signal(int signum) {

  if (signum == SIGHUP && child_pid > 0) {

    // Send the child TERM so that rsync can perform clean-up such as shutting down a remote server.

    kill(child_pid, SIGTERM);

  }

}

</syntaxhighlight>

== Reliable clean up ==

@@ Line 112: / Line 112: @@
 It's possible to send a signal by shelling out to unix <code>kill PID</code>, but BEAM doesn't expose the child process ID and doesn't include any built-in functions to send a signal to an OS process.  Clearly we're expected to do this another way.  Another problem with "kill" is that we want the external process to stop no matter how badly the BEAM is damaged, so we shouldn't rely on stored data or on running final clean-up logic before exiting.
-To debug what happens during <code>port_close</code> and to eliminate variables, I tried to spawn  <code>sleep 60</code> using the same Port command, and I found that it behaves exactly the same way, hanging until the sleep ends naturally regardless of what happened in Elixir or whether its pipes are still open.  This happens to have been a lucky choice as I learned later: "sleep" is unusual in the same way as rsync but its behavior is much simpler to reason about.
+To debug what happens during <code>port_close</code> and to eliminate variables, I tried spawning  <code>sleep 60</code> instead of rsync and found that it behaves in exactly the same way: it hangs until <code>sleep</code> ends naturally, regardless of what happens in Elixir or whether its pipes are still open.  This turned out to be a lucky choice, as I learned later: <code>sleep</code> is daemon-like and so resembles rsync, but its behavior is much simpler to reason about.
 == Bad assumption: pipe-like processes ==
 A pipeline program like <code>gzip</code> or <code>cat</code> is built to read from its input and write to its output.  We can roughly group the different styles of command-line application into "pipeline" programs which read and write, "interactive" programs which require user input, and "daemon" programs which are designed to run in the background.  Some programs support multiple modes depending on the arguments given at launch, or by detecting the terminal using <code>isatty</code><ref>[https://man.archlinux.org/man/isatty.3.en docs for <code>isatty</code>]</ref>.  The BEAM is currently optimized to interface with pipeline programs and it assumes that the external process will stop when its "standard input" is closed.
-A typical pipeline program will stop once it detects that input has ended, by making regular C system calls to <code>read</code><ref>[https://man.archlinux.org/man/read.2 libc <code>read</code> docs]</ref>:<syntaxhighlight lang="c">
+A typical pipeline program will stop once it detects that input has ended, for example by calling <code>read</code><ref>[https://man.archlinux.org/man/read.2 libc <code>read</code> docs]</ref> in a loop:<syntaxhighlight lang="c">
-ssize_t n_read = read (input_desc, buf, bufsize);
+size_read = read (input_desc, buf, bufsize);
-if (n_read < 0) { error... }
+if (size_read < 0) { error... }
-if (n_read == 0) { end of file... }
+if (size_read == 0) { end of file... }
-</syntaxhighlight>When the program uses blocking I/O, reading zero bytes indicates the end of file.  There are also programs which do asynchronous I/O using <code>O_NONBLOCK</code><ref>[https://man.archlinux.org/man/open.2.en#O_NONBLOCK O_NONBLOCK docs]</ref>, and these might rely on the <code>HUP</code> hang-up signal which is normally sent when input is closed.
+</syntaxhighlight>
-But here we'll focus on how processes can more generally affect each other through pipes.  Surprising answer: without much effect!  You can experiment with the <code>/dev/null</code> device which behaves like a closed pipe, for example compare these two commands:<syntaxhighlight lang="shell">
+If the program does blocking I/O, then a zero-byte <code>read</code> indicates the end of file condition.  A program which does asynchronous I/O with <code>O_NONBLOCK</code><ref>[https://man.archlinux.org/man/open.2.en#O_NONBLOCK O_NONBLOCK docs]</ref> might instead detect EOF by listening for the <code>HUP</code> hang-up signal which is normally sent when input is closed.
+But here we'll focus on how processes can more generally affect each other through pipes.  Surprising answer: without much effect!  You can experiment with the <code>/dev/null</code> device which behaves like a closed pipe, for example compare these two commands:
+<syntaxhighlight lang="shell">
 cat < /dev/null
@@ Line 138: / Line 142: @@
 A small shim can adapt a daemon-like program to behave more like a pipeline.  The shim is sensitive to stdin closing or SIGHUP, and when this is detected it converts this into a stronger signal like SIGTERM which it forwards to its own child.  This is the idea behind a suggested shell script<ref>[https://hexdocs.pm/elixir/1.19.0/Port.html#module-orphan-operating-system-processes Elixir Port docs showing a shim script]</ref> for Elixir, and the <code>erlexec</code><ref name=":0">[https://hexdocs.pm/erlexec/readme.html <code>erlexec</code> library]</ref> library.  The opposite adapter can be found in the [[w:nohup|nohup]] shell command and the grimsby<ref>[https://github.com/shortishly/grimsby <code>grimsby</code> library]</ref> library: these will keep standard in and/or standard out open for the child process even after the parent exits, so that a pipe-like program can behave more like a daemon.
-I used the shim approach in my rsync library and it includes a small C program<ref>[https://gitlab.com/adamwight/rsync_ex/-/blob/main/src/main.c?ref_type=heads rsync_ex C shim program]</ref> which wraps rsync and makes it sensitive to BEAM <code>port_close</code>.  It's featherweight, leaving pipes unchanged as it passes control to rsync—its only real effect is to convert SIGHUP to SIGKILL (but should have been SIGTERM, see the sidebar discussion of different signals below).
+I used the shim approach in my rsync library and it includes a small C program<ref>[https://gitlab.com/adamwight/rsync_ex/-/blob/main/src/main.c?ref_type=heads rsync_ex C shim program]</ref> which wraps rsync and makes it sensitive to BEAM <code>port_close</code>.  It's featherweight, leaving pipes unchanged as it passes control to rsync.  Here are the business parts:<syntaxhighlight lang="c">// Set up a fail-safe to self-signal with HUP if the controlling process dies.
+prctl(PR_SET_PDEATHSIG, SIGHUP);</syntaxhighlight><syntaxhighlight lang="c">
+void handle_signal(int signum) {
+  if (signum == SIGHUP && child_pid > 0) {
+    // Send the child TERM so that rsync can perform clean-up such as shutting down a remote server.
+    kill(child_pid, SIGTERM);
+  }
+}
+</syntaxhighlight>
 == Reliable clean up ==