Tuesday, November 9, 2010

Help with Python threading deadlock

Hello,
Some months ago I started developing the mirror-selector daemon.
The main work is done: it can run for a few hours, serving a few thousand requests, but it then gets into a thread deadlock (the web server stops responding, and gdb shows all the threads waiting on a semaphore lock).
This is my first project using multi-threading with Python and I am having a hard time finding the bug. I have recently introduced some debug code that I hope will print the stack trace for every thread and give me a hint about the cause. It is most likely related to Queue management.
The mirror-selector code is not that complex, and it's available from bzr ("bzr branch lp:mirror-selector"). If you are experienced with Python threading and can spare a few minutes reviewing the code for a possible cause, it would be helpful.
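
For readers wondering what such debug code might look like, here is a minimal sketch. It is not the code from mirror-selector; the function name `format_thread_stacks` is made up for illustration. It relies on `sys._current_frames()`, which returns the topmost stack frame of each running thread.

```python
import sys
import threading
import traceback


def format_thread_stacks():
    """Return a text report with the current stack trace of every thread."""
    # Map thread identifiers to names so the report is easier to read.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    # sys._current_frames() maps each thread id to its topmost stack frame.
    for thread_id, frame in sys._current_frames().items():
        lines.append("Thread %s (%s):" % (names.get(thread_id, "unknown"), thread_id))
        # format_stack() walks from the given frame back to the thread's entry point.
        lines.extend(entry.rstrip("\n") for entry in traceback.format_stack(frame))
    return "\n".join(lines)
```

Calling this from a signal handler or a dedicated debug request, and writing the result to the log, would show where each thread is blocked at the moment the server stops responding.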

Thanks

3 comments:

  1. I have found the cause of the problem. There was an uncaught exception raised in the HTTP client code; the exception was killing the queue-consuming threads, leading to a deadlock.

  2. Don't use threading with Python. The GIL (Global Interpreter Lock) ensures that only one thread is really running at a time.

    Instead, use multiprocessing.Pool and subprocesses. That way things actually run in parallel.

  3. I understand that we can't achieve real parallelism because of the GIL, but since most of the threads are expected to block on I/O, threading provides acceptable performance.
    Anyway, I will look into multiprocessing.Pool; hopefully I will add an option to select between multithreading and multiprocessing. Thanks for the feedback.

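
The failure described in the first comment (an uncaught exception killing the threads that consume a queue) can be guarded against roughly as sketched below. This is only an illustration under assumptions, not the actual mirror-selector fix: the names `worker`, `run_all` and `handle` are hypothetical, and the module is called `queue` on Python 3 (`Queue` on Python 2).

```python
import queue
import threading


def worker(tasks, handle, errors):
    """Consume items from `tasks` until a None sentinel arrives."""
    while True:
        item = tasks.get()
        try:
            if item is None:
                return  # sentinel: stop this worker
            handle(item)
        except Exception as exc:
            # Record the failure instead of letting it kill the thread.
            errors.append((item, exc))
        finally:
            # Always mark the item as done so tasks.join() cannot hang.
            tasks.task_done()


def run_all(items, handle, num_workers=4):
    """Process `items` with a pool of threads; return the (item, error) pairs."""
    tasks = queue.Queue()
    errors = []
    threads = [
        threading.Thread(target=worker, args=(tasks, handle, errors))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    tasks.join()  # wait until every queued item has been handled
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return errors
```

The point of the `try`/`except`/`finally` is that a failing item is logged and skipped while the worker keeps consuming; without it, each exception removes one consumer until nothing is left to drain the queue and every producer blocks.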