Elixir/Ports and external process wiring: Difference between revisions

Adamw (talk | contribs)
future directions section
Adamw (talk | contribs)
c/e first page
{{Project|source=https://gitlab.com/adamwight/rsync_ex/|status=beta|url=https://hexdocs.pm/rsync/Rsync.html}}


My exploration began while writing a beta-quality Elixir library that uses rsync to transfer files in the background and monitor progress.
 
{{Aside|text=[[w:rsync|Rsync]] is usually the best tool for file transfer, locally or over a network.  It can resume incomplete transfers and synchronize directories efficiently, and it's complex enough that nobody is reimplementing it in pure Erlang any time soon.}}
 
I was excited to learn how to interface with long-lived external processes, and this project offered more than I hoped for.


[[File:Monkey eating.jpg|alt=A Toque macaque (Macaca radiata) Monkey eating peanuts. Pictured in Bangalore, India|right|300x300px]]
=== Naive shelling ===


Starting rsync should be as easy as calling out to a shell:<syntaxhighlight lang="elixir">
System.shell("rsync -a source target")
</syntaxhighlight>
This has a few shortcomings, starting with how we pass the filenames.  It's possible to build a dynamic path with string interpolation like <code>#{source}</code>, but this gets risky: consider what happens if the filenames include whitespace or even special shell characters such as ";".
 
=== Safe path handling ===
Let's skip ahead to <code>System.cmd</code>, which takes a raw argv without shell expansion and so can't be fooled by special characters in the path arguments:<syntaxhighlight lang="elixir">
System.find_executable(rsync_path)
# argv must be a flat list of strings, each passed to rsync verbatim
|> System.cmd(["-a", source, target])
</syntaxhighlight>For a short job this would be fine, but during longer transfers our program loses control and gets no feedback: we have to wait indefinitely for the monolithic command to finish.
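
To make the difference between the two calls concrete without touching any real files, here is a minimal sketch that swaps rsync for <code>echo</code>.  It assumes a Unix-like system with <code>echo</code> on the PATH; the module <code>ArgvDemo</code> and its helper functions are hypothetical names used only for this illustration:<syntaxhighlight lang="elixir">
defmodule ArgvDemo do
  # Hypothetical helper: interpolates the name into a shell command line,
  # so the shell gets to interpret any special characters in it.
  def via_shell(name) do
    {output, 0} = System.shell("echo #{name}")
    output
  end

  # Hypothetical helper: passes the name as a single argv element.
  # No shell is involved, so the string arrives untouched.
  def via_cmd(name) do
    {output, 0} = System.cmd("echo", [name])
    output
  end
end

tricky = "report; echo injected"

IO.inspect(ArgvDemo.via_shell(tricky))
# => "report\ninjected\n"   (the shell ran a second command)

IO.inspect(ArgvDemo.via_cmd(tricky))
# => "report; echo injected\n"   (one literal argument)
</syntaxhighlight>With the shell, the ";" splits the string into two commands; with a raw argv the same string stays a single, harmless argument.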


=== Asynchronous call and communication ===
To run an external process asynchronously we will reach for Elixir's low-level <code>Port.open</code>, which passes all of its parameters directly<ref>See the [https://github.com/elixir-lang/elixir/blob/809b035dccf046b7b7b4422f42cfb6d075df71d2/lib/elixir/lib/port.ex#L232 port.ex source code]</ref> to ERTS <code>open_port</code><ref>https://www.erlang.org/doc/apps/erts/erlang.html#open_port/2</ref>.  These functions are tremendously flexible; here we turn a few knobs:<syntaxhighlight lang="elixir">
Port.open(
   {:spawn_executable, rsync_path},