Elixir/Ports and external process wiring
This deceptively simple programming adventure veers unexpectedly into piping and signaling between Unix processes.
== Context: controlling "rsync" ==
{{Project|source=https://gitlab.com/adamwight/rsync_ex/|status=beta|url=https://hexdocs.pm/rsync/Rsync.html}}
My exploration began while I was writing a beta-quality rsync library for Elixir that transfers files in the background and can monitor progress. I hoped to learn how to interface better with long-lived external processes, and I got more than I wished for.
[[File:Monkey eating.jpg|alt=A Toque macaque (Macaca radiata) Monkey eating peanuts. Pictured in Bangalore, India|right|400x400px]]
Starting rsync should be as easy as calling out to a shell:<syntaxhighlight lang="elixir">
System.shell("rsync -a source target")
</syntaxhighlight>
This has a few shortcomings, starting with filename escaping, so at a minimum we should use <code>System.cmd</code>:<syntaxhighlight lang="elixir">
System.find_executable(rsync_path)
# The arguments must be a flat list of strings.
|> System.cmd(["-a", source, target])
</syntaxhighlight>
However, this job would block until the transfer is finished, and we would get no feedback until completion.
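To illustrate the escaping point, here is a minimal sketch of how <code>System.cmd</code> hands each argument to the program untouched, with no shell in between. It assumes a Unix-like system with a <code>printf</code> executable on the PATH, and the filename is made up for the demonstration.

```elixir
# Assumption: a Unix-like system with a `printf` executable on the PATH.
# The filename is hypothetical and chosen to contain shell metacharacters.
tricky_name = "my file; echo oops.txt"

# Each list element is passed to the program as exactly one argument,
# so the space and the semicolon are never interpreted by a shell.
{output, exit_status} = System.cmd("printf", ["%s", tricky_name])

IO.inspect({output, exit_status})
```

The same string interpolated into a <code>System.shell</code> command line would be split at the space and the semicolon unless it were quoted by hand.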
Elixir's low-level <code>Port.open</code> maps directly to ERTS <code>open_port</code><ref>https://www.erlang.org/doc/apps/erts/erlang.html#open_port/2</ref>, which provides more flexibility. Here we have a command turning some knobs:<syntaxhighlight lang="elixir">
Port.open(
  {:spawn_executable, rsync_path},
  [
    # ...
  ]
)
</syntaxhighlight>
Progress lines have a fairly self-explanatory format:
<syntaxhighlight lang="text">
3,342,336 33% 3.14MB/s 0:00:02
</syntaxhighlight>
{{Aside|text=
rsync has a variety of progress options. We chose overall progress above, so the percentage means "overall percent complete".
Here is the menu:
; <code>--info=progress2</code> : report overall progress
; <code>--progress</code> : report statistics per file
; <code>--itemize-changes</code> : list the operations taken on each file
}}
Each rsync output line is sent to the library's <code>handle_info</code> callback as <code>{:data, line}</code>, and once the transfer is finished the callback receives a final <code>{:exit_status, status_code}</code>.
Here we extract the <code>percent_done</code> column and strictly reject any other output:
<syntaxhighlight lang="elixir">
with terms when terms != [] <- String.split(line, ~r"\s", trim: true),
     percent_done_text when is_binary(percent_done_text) <- Enum.at(terms, 1),
     {percent_done, "%"} <- Float.parse(percent_done_text) do
  percent_done
else
  _ ->
    {:unknown, line}
end
</syntaxhighlight>
The <code>trim</code> option lets us ignore spacing and newline trickery, including the leading carriage return you can see in this line from rsync's source:
<syntaxhighlight lang="c">
rprintf(FCLIENT, "\r%15s %3d%% %7.2f%s %s%s", ...);
</syntaxhighlight>
{{Aside|text=
On the terminal, rsync progress lines are updated in place by emitting the fun [[w:Carriage return|carriage return]] control character <code>0x0d</code> or <code>\r</code>, as you see above. The character seems to be named after pushing the physical paper carriage of a typewriter backwards without feeding a new line. On the terminal this overwrites the current line!
[[w:Newline#Issues with different newline formats|Disagreements about carriage return]] vs. newline have caused eye-rolling since the dawn of personal computing.
}}
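To see the parsing expression above handle that leading carriage return, here is a small self-contained sketch. The module name <code>ProgressLine</code> and the function <code>parse/1</code> are hypothetical wrappers added for the demonstration, and the sample strings are made up to resemble rsync output.

```elixir
defmodule ProgressLine do
  # Hypothetical wrapper around the `with` expression shown in the article.
  # Returns the percentage as a float, or {:unknown, line} for anything else.
  def parse(line) do
    with terms when terms != [] <- String.split(line, ~r"\s", trim: true),
         percent_done_text when is_binary(percent_done_text) <- Enum.at(terms, 1),
         {percent_done, "%"} <- Float.parse(percent_done_text) do
      percent_done
    else
      _ ->
        {:unknown, line}
    end
  end
end

# A progress line with the leading "\r" and the column padding.
IO.inspect(ProgressLine.parse("\r      3,342,336  33%    3.14MB/s    0:00:02"))
# → 33.0

# Any other output is rejected rather than guessed at.
IO.inspect(ProgressLine.parse("sending incremental file list\n"))
# → {:unknown, "sending incremental file list\n"}
```

Because <code>\s</code> also matches <code>\r</code>, the carriage return produces only an empty leading field, which <code>trim: true</code> discards.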
One more comment about this carriage return: it's a byte in the binary data coming over the pipe from rsync, but it plays a "control" function because of how it will be interpreted by the tty. A repeated theme is that data and control are leaky categories.
This is where Erlang/OTP really starts to shine: by opening the port inside of a dedicated gen_server<ref>https://www.erlang.org/doc/apps/stdlib/gen_server.html</ref> we have a separate process communicating with rsync, which receives an asynchronous message like <code>{:data, text_line}</code> for each progress line. It's easy to parse the line, update some internal state and optionally send a progress summary to the code calling the library.
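A rough sketch of that arrangement follows. It is not the library's actual code: the module name <code>PortWatcher</code>, the <code>{:line, ...}</code> and <code>{:done, ...}</code> messages and the port options are all assumptions chosen for illustration, and the demo runs <code>sh</code> instead of rsync so it works on any Unix-like system. Note that the raw port messages arrive wrapped with the port itself, as <code>{port, {:data, ...}}</code>.

```elixir
defmodule PortWatcher do
  use GenServer

  # Hypothetical API: run `executable` with `args` and forward every
  # output line, then the exit status, to the `notify` pid.
  def start_link({executable, args, notify}) do
    GenServer.start_link(__MODULE__, {executable, args, notify})
  end

  @impl true
  def init({executable, args, notify}) do
    # Assumed options, not the author's elided list: binary data,
    # newline-delimited chunks of up to 1024 bytes, and an exit status message.
    port =
      Port.open(
        {:spawn_executable, executable},
        [:binary, :exit_status, {:line, 1024}, {:args, args}]
      )

    {:ok, %{port: port, notify: notify}}
  end

  @impl true
  # With the {:line, n} option, data arrives as {:eol, line} or {:noeol, line}.
  def handle_info({port, {:data, {_eol_flag, line}}}, %{port: port} = state) do
    send(state.notify, {:line, line})
    {:noreply, state}
  end

  def handle_info({port, {:exit_status, status}}, %{port: port} = state) do
    send(state.notify, {:done, status})
    {:stop, :normal, state}
  end

  # Ignore anything that does not come from our port.
  def handle_info(_other, state), do: {:noreply, state}
end

# Demo, assuming `sh` is available on the PATH.
sh = System.find_executable("sh")
{:ok, _pid} = PortWatcher.start_link({sh, ["-c", "echo 10%; echo 20%"], self()})

for _ <- 1..3 do
  receive do
    message -> IO.inspect(message)
  after
    5_000 -> IO.puts("timed out")
  end
end
```

The caller never blocks on the external program. It just receives ordinary messages, which in this demo should be the two output lines followed by the exit status.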