How to debug deadlocks or fatal errors
A few weeks back I was working on parallel execution in a gem called Dynflow and I ran into a deadlock. As you may know, a deadlock is reported as an exception of class fatal. This special exception cannot be rescued; in fact, no rescue block is evaluated when fatal is raised. This makes debugging it pretty hard.
Let's start with a simple example that generates a deadlock.
    require 'thread'
    # cannot join on current main thread, it would wait forever
    Thread.current.join
    ./fatal.rb:3:in `join': deadlock detected (fatal)
            from ./fatal.rb:3:in `<top (required)>'
            from -e:1:in `load'
            from -e:1:in `<main>'
Fortunately, even though rescue blocks are not evaluated, ensure blocks are. I am using RubyMine, though, and for some reason a standard debugger breakpoint does not work in the ensure block on line 6 of the following example.
    require 'thread'

    begin
      Thread.current.join
    ensure
      p $! if $!
    end
This produces the following output without stopping on line 6.
    #<fatal: deadlock detected>
    ./fatal.rb:4:in `join': deadlock detected (fatal)
            from ./fatal.rb:4:in `<top (required)>'
            from -e:1:in `load'
            from -e:1:in `<main>'
I could not find a solution by googling, but there is a nice trick: Pry can be used in the ensure block.
    require 'thread'
    require 'pry' # provides binding.pry

    begin
      Thread.current.join
    ensure
      # $! is set only while an exception is propagating
      binding.pry if $!
    end
It will start a pry session right after the deadlock is raised, giving you an opportunity to inspect the still-running Ruby process and find out what is wrong. It's also very useful to combine pry with a gem called pry-stack_explorer to be able to inspect the current stack like in a debugger.
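The trick relies on two things: ensure blocks still run, and $! holds the exception that is currently propagating (or nil when there is none). Here is a minimal sketch of that mechanism in isolation. It uses an ordinary ArgumentError instead of a real deadlock so that it can run to completion, and the helper name report_in_flight_exception is made up for this illustration.

```ruby
# Hypothetical helper (not part of any library): runs the block and records
# any exception that is still propagating when the ensure block executes.
def report_in_flight_exception(log)
  yield
ensure
  # $! is the exception currently being raised, or nil if there is none.
  log << "#{$!.class}: #{$!.message}" if $!
end

log = []

begin
  report_in_flight_exception(log) { raise ArgumentError, 'boom' }
rescue ArgumentError
  # Unlike fatal, an ordinary exception can still be rescued by the caller.
end

# No exception is raised here, so nothing is recorded.
report_in_flight_exception(log) { :ok }

p log
```

In a real debugging session you would replace the logging line with binding.pry if $!, as in the example above.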
In the end it gave me enough information to find the problem. Hopefully it will save you some time if you run into a similar issue.
Note: These examples were run on Ruby 1.9.3. In Ruby 2.0.0, Thread.current.join raises a nice ThreadError, which is a subclass of StandardError and can be debugged or inspected by the usual means. Nevertheless, similar deadlock-like problems can arise in different situations in Ruby 2.0.0 too.
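As a small sketch of the behaviour described in the note, assuming Ruby 2.0.0 or later, the same join call can be handled with a plain rescue clause. The variable name error is only for illustration.

```ruby
# Assumes Ruby 2.0.0 or later, where joining the current thread raises
# ThreadError instead of the unrescuable fatal.
error = begin
  Thread.current.join
  nil
rescue ThreadError => e
  e
end

p error.class
p error.is_a?(StandardError)
```

Because ThreadError is a StandardError, a bare rescue would catch it as well, and a regular debugger breakpoint inside the rescue clause is an option.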