Linking Elixir processes together
Intro#
If you don’t know much about Elixir processes yet, check out [my post about processes]({% post_url 2015-10-22-elixir-pingpong-table %}) and come back afterwards. I assume that you know how spawn/1 and spawn/3 work and that you know the basic concepts of process communication.
Connecting processes together#
Processes are very often related to each other, and we also know that they do not share state. Right now you should be asking: “How do I make sure that the life of one process depends on another?” Erlang provides a mechanism called linking. In Elixir we can use it by calling spawn_link. Let’s check what the docs have to say about that function.
In my opinion, that description is quite vague, so let’s put it to the test.
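A minimal sketch of such a test, assuming the linked process simply computes 1 + 2 and sends the result back to its parent (the `{:result, value}` message shape is a choice made for this illustration):

```elixir
parent = self()

# The linked process computes 1 + 2 and sends the result to its parent.
spawn_link(fn -> send(parent, {:result, 1 + 2}) end)

result =
  receive do
    {:result, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(result)
```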
We received 3, which is OK, but
Q: what if we call spawn instead of spawn_link?
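The same sketch with spawn in place of spawn_link, under the same assumptions as before:

```elixir
parent = self()

# Same work as before, but the new process is not linked to its parent.
spawn(fn -> send(parent, {:result, 1 + 2}) end)

result =
  receive do
    {:result, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(result)
```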
Q: The result is exactly the same, so what is the exact purpose of spawn_link then?
A: The answer is: “Error handling”
Or, to be more specific: “spawn_link will notify the linked process about an abnormal exit of the dependent process”. But before going any deeper, we need to understand what actually happens when a process finishes its work.
When a process ends#
When a process finishes its work, it exits. This is a different mechanism from exceptions, and with it we can detect when something wrong (or unexpected) happens. A process that finishes its work exits as if it had called exit(:normal), which tells its linked processes that the job has been done. Any argument to exit/1 other than :normal is treated as an error. The Elixir shell is a process too, so we can link to it as well.
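A small sketch of a normal exit, written as a script rather than a shell session. It assumes nothing beyond spawn/1, spawn_link/1 and exit/1; the short sleep is only there to give both processes time to finish:

```elixir
# Both processes exit with reason :normal.
spawn(fn -> exit(:normal) end)
spawn_link(fn -> exit(:normal) end)

# Give both processes a moment to finish.
Process.sleep(100)

# If this line is reached, the calling process survived the linked exit.
still_alive = Process.alive?(self())
IO.inspect(still_alive)
```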
So far there is no difference between spawn/1 and spawn_link/1, because we exit the process with the :normal reason. But what would happen for other reasons?
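Running this directly in the shell takes the shell down, so the sketch below uses a stand-in "parent" process in place of the shell. The monitor (spawn_monitor/1) is only a way to observe why the parent died, and :oops is an arbitrary reason chosen for illustration:

```elixir
# The parent stands in for the shell: it does not trap exits.
{parent, ref} =
  spawn_monitor(fn ->
    spawn_link(fn -> exit(:oops) end)
    # Without the link, this process would sleep for the full five seconds.
    Process.sleep(5_000)
  end)

# The monitor reports the reason the parent terminated with.
parent_exit_reason =
  receive do
    {:DOWN, ^ref, :process, ^parent, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect(parent_exit_reason)
```

The parent never calls exit/1 itself; it terminates with the reason carried by the exit signal of its linked child.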
BINGO! Our shell process does not trap the exit signal, so the link propagates the exit reason further, to the process that observes the Elixir shell. The “observer” catches the error and restarts the shell process. But how do we actually handle exit messages from linked processes?
Trapping exits#
Now that we know how to track exits from linked processes, the question is:
Q: How to actually react to failures of linked processes?
A: trap_exit
Each process has a set of flags that customize its properties, such as minimal heap size, priority level, exit trapping, and many more advanced things. Erlang’s documentation lists them all. The most interesting part is this:
Setting the trapping flag to true means that exit signals arriving at a process are converted to {'EXIT', From, Reason} messages, which can be received as ordinary messages. If trap_exit is set to false, the process exits if it receives an exit signal other than normal and the exit signal is propagated to its linked processes. Application processes should normally not trap exits.
So setting the trapping flag is not a common thing to do, because OTP provides special building blocks for managing failures of other processes: supervisors. I will describe them in the next blog post, but for the time being we will go against the wind, which is what curious programmers like most. Let’s start trapping exits from linked processes by setting the proper flag.
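A minimal sketch of setting the flag with Process.flag/2, which returns the previous value of the flag. It assumes a fresh process that has not enabled trapping yet:

```elixir
# First call: trapping was off, so the previous value is false.
previous = Process.flag(:trap_exit, true)

# Second call: trapping is already on, so the previous value is true.
previous_again = Process.flag(:trap_exit, true)

IO.inspect({previous, previous_again})
```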
As the description says, the returned value is false, which is the previous state of that flag. If you called it again, it would return true. Now that we have trapping enabled, let’s link to a process which calculates 1 + 1 and finishes its work. We call flush afterwards to print the messages waiting in the mailbox of the shell process.
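A sketch of that step as a script. Since flush is a shell helper, this version uses a plain receive to take the message out of the mailbox:

```elixir
Process.flag(:trap_exit, true)

# The linked process calculates 1 + 1 and then finishes.
pid = spawn_link(fn -> 1 + 1 end)

# With trapping enabled, the exit signal arrives as an ordinary message.
message =
  receive do
    msg -> msg
  after
    1_000 -> :timeout
  end

IO.inspect(message)
```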
The shell receives {:EXIT, #PID<0.145.0>, :normal} for the spawned process. This means that the process finished its work without any problems. Now, let’s replicate that behavior more explicitly.
Now let’s try to exit the process with a reason other than :normal.
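A small sketch covering both cases: the explicit exit(:normal) and an abnormal reason. Here :oops is an arbitrary atom chosen for illustration, and trapped_exit is a hypothetical helper, not part of any library:

```elixir
Process.flag(:trap_exit, true)

# Hypothetical helper: link to a process that exits with `reason`
# and return the reason carried by the resulting :EXIT message.
trapped_exit = fn reason ->
  pid = spawn_link(fn -> exit(reason) end)

  receive do
    {:EXIT, ^pid, trapped} -> trapped
  after
    1_000 -> :timeout
  end
end

normal = trapped_exit.(:normal)
abnormal = trapped_exit.(:oops)

IO.inspect({normal, abnormal})
```

In both cases the trapping process stays alive and simply reads the reason from its mailbox.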
Great, we intercept the exit, and a receive clause that matches the {:EXIT, pid, reason} message gives us a way to react to process exits. But what about exceptions? They cause processes to die too!
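A minimal sketch with a raised exception, assuming an arbitrary "boom" message. The exit reason of a process that dies from an exception is a tuple of the exception struct and the stacktrace; the runtime may also log the crash to standard error:

```elixir
Process.flag(:trap_exit, true)

# The linked process dies because of an unhandled exception.
pid = spawn_link(fn -> raise "boom" end)

exit_reason =
  receive do
    {:EXIT, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect(exit_reason)
```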
Great, exceptions are trapped as well. We can react to them when processes exit.
Visualizing links#
Now that we have some knowledge about links, I’ll show a visualization of a simple demo. The LinksTest module creates a chain of linked processes and then tracks their exits.
Here is the code:
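A minimal sketch of such a module. Only the LinksTest name and the :chain_breaks_here reason come from the text; the chain/2 function, the observer argument and the {:exit_seen, n, reason} message are assumptions made so that the exits can be collected in one place:

```elixir
defmodule LinksTest do
  # Last process in the chain: exits with an abnormal reason.
  def chain(0, _observer), do: exit(:chain_breaks_here)

  # Every other process traps exits, links to the next one and waits.
  def chain(n, observer) do
    Process.flag(:trap_exit, true)
    child = spawn_link(__MODULE__, :chain, [n - 1, observer])

    receive do
      {:EXIT, ^child, reason} ->
        # Report what was seen; returning afterwards ends this process
        # with reason :normal.
        send(observer, {:exit_seen, n, reason})
    end
  end
end

spawn(LinksTest, :chain, [3, self()])

# Collect one report per trapping process, keyed by its position in the chain.
seen =
  for _ <- 1..3, into: %{} do
    receive do
      {:exit_seen, n, reason} -> {n, reason}
    after
      1_000 -> {:timeout, :timeout}
    end
  end

IO.inspect(seen)
```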
And here is the demo
The interesting fact is that after the exit with :chain_breaks_here, the next exits are :normal. This is because the first process that catches the :chain_breaks_here exit reason consumes it and then exits normally, so the error is swallowed by the first process that catches it. If we didn’t trap exits in the chain and the last process exited normally (:normal), the other processes would not exit at all: an exit signal with reason :normal is ignored by a linked process that does not trap exits. In other words, after the last process calls exit(:normal), the rest of the chain stays up. Let’s demo that as well!
Here is the code:
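A minimal sketch of the non-trapping variant. The NormalChain module, its chain/2 function and the {:started, n, pid} message are hypothetical names introduced for this illustration:

```elixir
defmodule NormalChain do
  # Last process in the chain: announces itself, then exits normally.
  def chain(0, observer) do
    send(observer, {:started, 0, self()})
    exit(:normal)
  end

  # Other processes do not trap exits; they link to the next one and idle.
  def chain(n, observer) do
    send(observer, {:started, n, self()})
    spawn_link(__MODULE__, :chain, [n - 1, observer])
    Process.sleep(:infinity)
  end
end

spawn(NormalChain, :chain, [2, self()])

# Collect the PID of every process in the chain, keyed by position.
pids =
  for _ <- 0..2, into: %{} do
    receive do
      {:started, n, pid} -> {n, pid}
    end
  end

# Give the last process time to exit and its signal time to travel.
Process.sleep(100)

alive = Map.new(pids, fn {n, pid} -> {n, Process.alive?(pid)} end)
IO.inspect(alive)
```

Only the last process (position 0) is gone; the processes linked to it keep running.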
And here is the demo
Links are bidirectional#
So far we have been killing processes that are children of other processes. The question is:
Q: If I kill the parent, do the linked children die as well?
A: Yes, links are bidirectional
Let’s create a tree of processes and kill the root of that tree:
The code is fairly simple: the create_graph function creates a kernel process, which spawns and links 2 node processes, and each node creates 3 leaves. Then we kill the kernel PID and see what happens. Hard to imagine? Let’s visualize that!
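The tree described above could be sketched as follows. Only the create_graph name comes from the text; the LinkTree module, the observer argument, the {:started, pid} messages and the stand-in "shell" process (used so that the script itself survives) are assumptions made for this illustration:

```elixir
defmodule LinkTree do
  # Spawns the kernel linked to the caller and returns the kernel PID.
  def create_graph(observer) do
    spawn_link(fn -> kernel(observer) end)
  end

  defp kernel(observer) do
    for _ <- 1..2, do: spawn_link(fn -> node_proc(observer) end)
    Process.sleep(:infinity)
  end

  defp node_proc(observer) do
    send(observer, {:started, self()})
    for _ <- 1..3, do: spawn_link(fn -> leaf(observer) end)
    Process.sleep(:infinity)
  end

  defp leaf(observer) do
    send(observer, {:started, self()})
    Process.sleep(:infinity)
  end
end

observer = self()

# Stand-in for the shell: linked to the kernel, monitored by the script.
{shell, shell_ref} =
  spawn_monitor(fn ->
    send(observer, {:kernel, LinkTree.create_graph(observer)})
    Process.sleep(:infinity)
  end)

kernel =
  receive do
    {:kernel, pid} -> pid
  end

# 2 nodes + 6 leaves report their PIDs.
descendants =
  for _ <- 1..8 do
    receive do
      {:started, pid} -> pid
    end
  end

Process.exit(kernel, :kill)

shell_exit_reason =
  receive do
    {:DOWN, ^shell_ref, :process, ^shell, reason} -> reason
  after
    1_000 -> :timeout
  end

# Give the exit signals time to reach the leaves.
Process.sleep(100)
any_alive = Enum.any?(descendants, &Process.alive?/1)

IO.inspect({shell_exit_reason, any_alive})
```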
Everything happens as planned and the entire tree is killed. The death of the kernel process causes a chain reaction and everything dies. Even our shell, which is linked with the kernel, dies as well!