Debugging Applications in Production

This evening, I noticed that an application running on a cluster of machines I look after was behaving oddly. It was using huge amounts of RAM and then crashing. What made this more unusual was that the code was written in Elixir and had been in production use for several years without any issue.

I looked in the server logs but could not see anything out of the ordinary, so I connected to the remote Elixir console and poked around a bit. In the end I came up with a nice little code fragment to find processes that are using lots of memory and produce a stack trace to help with tracking down the issue.

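# List processes using more than 10,000,000 bytes, with their info and current stack trace.
# Process.info/2 returns nil for a process that has already exited, so the filter matches explicitly.
# The trailing |> lets this paste into the console as a single expression.
Process.list() |>
  Enum.map(fn pid -> {pid, Process.info(pid, :memory)} end) |>
  Enum.filter(fn {_pid, info} -> match?({:memory, bytes} when bytes > 10_000_000, info) end) |>
  Enum.map(fn {pid, {:memory, bytes}} -> {pid, bytes, Process.info(pid), Process.info(pid, :current_stacktrace)} end)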
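If you do not know in advance what threshold to filter on, a variation is to sort every process by memory and keep only the largest few. This is a rough sketch rather than what I actually ran: the name top_by_memory and the default of five results are made up for illustration, and it assumes Elixir 1.10 or later for the :desc sort option.

```elixir
# Hypothetical helper: returns the `count` processes using the most memory, largest first.
# Written as an anonymous function so it can be pasted into a remote console
# without defining a module on the production node.
top_by_memory = fn count ->
  Process.list()
  |> Enum.flat_map(fn pid ->
    # Process.info/2 returns nil if the process has exited in the meantime.
    case Process.info(pid, [:memory, :registered_name, :current_stacktrace]) do
      nil -> []
      info -> [{pid, Map.new(info)}]
    end
  end)
  |> Enum.sort_by(fn {_pid, info} -> info.memory end, :desc)
  |> Enum.take(count)
end

top_by_memory.(5)
```

Each result is a {pid, info} tuple, where info is a map with the :memory, :registered_name and :current_stacktrace keys.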

With the output from the code above in hand, fixing the issue was fairly simple 🙂
