Crash at shutdown

Gennadiy Rozental-2
Hi,

I need help with a crash I observe on both Windows and Linux. I admit my use
case is somewhat complicated, but that's reality. My problem consists of 3
components. Let's call them A, B, and C:

A is a Python *extension* library (dll/so). This is really just an infrastructure
layer which helps with developing Python extensions like B+C. I am using Python
2.6 BTW.

B is the entry point into a huge extension library which consists of multiple
dlls/sos (one of them is C). B is also a dll/so.

C is a library within my project which is responsible for communication with
Python *embedded* scripts. C uses Boost.Python to export a number of symbols
into Python, but the issue occurs with just one as well. C is linked with the
Boost.Python library and the Python interpreter.

Now imagine that I have a Python script which does essentially this:

import A
B=A.load('B')
B.load('C')

I run this script and observe a crash at Python shutdown. Here are more details
about what is going on in this script:

Line 1 loads the Python extension A
Line 2 loads library B using dlopen
Line 3 loads library C using dlopen
Object B is released. This is done in two steps: first we invoke B.shutdown(),
which unloads C from memory using dlclose; next B itself is unloaded from memory
using dlclose
A is unloaded
The Python interpreter is shut down

The crash occurs at the very last step, in a rather obscure place within the
Python interpreter's shutdown routines.

I found several "changes" which prevent the crash from happening:

1. I can stop exporting Boost.Python symbols in C.
2. I can skip unloading C in B.shutdown().
3. I can link C into B. As a result, line 2 loads both B and C together,
line 3 does nothing, B.shutdown() does nothing, and C is unloaded along with B
when we call dlclose on B.

At this point I am inclined to believe that something like this is happening:

When I execute a Boost.Python export statement in C, it adds some records to
Boost.Python and the Python interpreter. When C is unloaded from memory, these
records are somehow not cleaned up. By the time these records do get cleaned up,
C is already unloaded from memory, and either Boost.Python or the Python
interpreter corrupts memory.
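
To make the suspected sequence concrete, here is a toy model in plain Python.
It is only an illustration of the hypothesis above: FakeLibrary, Registry and
simulate are made-up names, not Boost.Python or Python APIs, and a real crash
would be a jump into unmapped memory rather than a Python exception.

```python
# Toy model only: plain Python objects stand in for shared libraries and for
# a process-wide registry. None of these names come from Boost.Python.

class FakeLibrary:
    """Stands in for a dlopen'ed library whose code can be unmapped."""

    def __init__(self, name):
        self.name = name
        self.loaded = True

    def unload(self):
        # Models dlclose: the library's code is no longer available.
        self.loaded = False

    def call(self, what):
        if not self.loaded:
            # In a real process this would be a call into unmapped memory.
            raise RuntimeError(
                "call into unloaded library %s (%s)" % (self.name, what))
        return "%s:%s" % (self.name, what)


class Registry:
    """Stands in for a global table that outlives the libraries using it."""

    def __init__(self):
        self.entries = []

    def register(self, library, what):
        self.entries.append((library, what))

    def cleanup(self):
        # Runs at "interpreter shutdown" and touches every registered entry.
        return [library.call(what) for library, what in self.entries]


def simulate(unload_c_early):
    registry = Registry()
    lib_c = FakeLibrary("C")
    registry.register(lib_c, "converter")  # the "export statement" in C
    if unload_c_early:
        lib_c.unload()  # B.shutdown() calling dlclose on C
    try:
        registry.cleanup()
        return "clean shutdown"
    except RuntimeError as exc:
        return "crash: %s" % exc
```

In this model simulate(False) corresponds to workaround 2 (C stays loaded until
the end), while simulate(True) corresponds to the failing sequence.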


Any help is appreciated,
Gennadiy

_______________________________________________
Cplusplus-sig mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/cplusplus-sig

Re: Crash at shutdown

Jim Bosch-2
On 02/10/2012 04:57 PM, Gennadiy Rozental wrote:

> Hi,
>
> I need help with a crash I observe on both Windows and Linux. I admit my use
> case is somewhat complicated, but that's reality. My problem consists of 3
> components. Let's call them A, B, and C:
>
> A is a Python *extension* library (dll/so). This is really just an infrastructure
> layer which helps with developing Python extensions like B+C. I am using Python
> 2.6 BTW.
>
> B is the entry point into a huge extension library which consists of multiple
> dlls/sos (one of them is C). B is also a dll/so.
>
> C is a library within my project which is responsible for communication with
> Python *embedded* scripts. C uses Boost.Python to export a number of symbols
> into Python, but the issue occurs with just one as well. C is linked with the
> Boost.Python library and the Python interpreter.
>
> Now imagine that I have a Python script which does essentially this:
>
> import A
> B=A.load('B')
> B.load('C')
>
> I run this script and observe a crash at Python shutdown. Here are more details
> about what is going on in this script:
>
> Line 1 loads the Python extension A
> Line 2 loads library B using dlopen
> Line 3 loads library C using dlopen
> Object B is released. This is done in two steps: first we invoke B.shutdown(),
> which unloads C from memory using dlclose; next B itself is unloaded from memory
> using dlclose
> A is unloaded
> The Python interpreter is shut down
>
> The crash occurs at the very last step, in a rather obscure place within the
> Python interpreter's shutdown routines.
>
> I found several "changes" which prevent the crash from happening:
>
> 1. I can stop exporting Boost.Python symbols in C.
> 2. I can skip unloading C in B.shutdown().
> 3. I can link C into B. As a result, line 2 loads both B and C together,
> line 3 does nothing, B.shutdown() does nothing, and C is unloaded along with B
> when we call dlclose on B.
>
> At this point I am inclined to believe that something like this is happening:
>
> When I execute a Boost.Python export statement in C, it adds some records to
> Boost.Python and the Python interpreter. When C is unloaded from memory, these
> records are somehow not cleaned up. By the time these records do get cleaned up,
> C is already unloaded from memory, and either Boost.Python or the Python
> interpreter corrupts memory.
>

This scenario sounds very possible, and a good candidate for those
"records" is the Boost.Python converter registry.  That's supposed to be
a global registry (in the boost_python shared library) that's shared by
all modules, and it mostly works quite well when the only dlopens
involved are the ones the Python interpreter uses when importing
modules.  I could definitely imagine it being corrupted by doing a
dlclose on the shared library that exported some classes using Boost.Python.

I'm pretty sure there's no programmatic way to remove something from the
registry, and to add such a feature you'd have to modify the
Boost.Python sources and recompile the shared library.  If you're
willing to do that, that might be a way out.

If it's at all possible, I think the safest bet would be to refactor
things so that everything that gets exported to Python happens within a
separate module that would be imported by the Python scripts, so you
only rely on Python's own dlopen calls when it involves Boost.Python
wrappers.  If that's not feasible, you might try putting the wrapper
code in a function in a library that never gets unloaded, even if that
function is called by some library that may be unloaded.
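
One way to get "a library that never gets unloaded" on Linux is the
RTLD_NODELETE dlopen flag, which asks the loader to keep the library mapped
even after dlclose. The sketch below uses ctypes only to keep the example in
Python; load_pinned is a hypothetical helper, and it assumes Python 3.3 or
later (where os.RTLD_NODELETE exists), not the 2.6 from the original post. In
a setup like the one described, the same flag would presumably go into the
C/C++ code that calls dlopen on the library holding the wrappers.

```python
import ctypes
import os


def load_pinned(path):
    """Load a shared library so that a later dlclose does not unmap it.

    `path` may be None, which on POSIX systems refers to the main program.
    """
    # RTLD_NODELETE is a dlopen flag on Linux and some other systems; the os
    # module exposes it only where the platform defines it, so fall back to
    # 0 (no pinning) elsewhere.
    nodelete = getattr(os, "RTLD_NODELETE", 0)
    mode = ctypes.DEFAULT_MODE | nodelete
    # On Windows ctypes ignores `mode`, so this sketch only pins on POSIX.
    return ctypes.CDLL(path, mode=mode)
```

Whether keeping the wrapper library mapped is enough to avoid the shutdown
crash is an assumption here; it matches workaround 2 from the original post,
where skipping the dlclose on C made the crash go away.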

Good luck - sounds like a hard problem to solve!

Jim Bosch

Re: Crash at shutdown

Niall Douglas
On 10 Feb 2012 at 21:23, Jim Bosch wrote:

> > When I execute a Boost.Python export statement in C, it adds some records to
> > Boost.Python and the Python interpreter. When C is unloaded from memory, these
> > records are somehow not cleaned up. By the time these records do get cleaned up,
> > C is already unloaded from memory, and either Boost.Python or the Python
> > interpreter corrupts memory.
>
> I'm pretty sure there's no programmatic way to remove something from the
> registry, and to add such a feature you'd have to modify the
> Boost.Python sources and recompile the shared library.  If you're
> willing to do that, that might be a way out.

I was about to say the same as Jim, except to add that this is really
another example of why BPL lacks Py_Finalize() support if I remember
correctly. BPL can set itself up, but it's a one way action - it
cannot unwind itself.

> If it's at all possible, I think the safest bet would be to refactor
> things so that everything that gets exported to Python happens within a
> separate module that would be imported by the Python scripts, so you
> only rely on Python's own dlopen calls when it involves Boost.Python
> wrappers.  If that's not feasible, you might try putting the wrapper
> code in a function in a library that never gets unloaded, even if that
> function is called by some library that may be unloaded.

The other thing to try is to force an immediate exit without
unwinding the DLL list :) e.g. TerminateProcess(self). Obviously, do
this after all buffers and such have been flushed, but just before
DLL unload.
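
From the Python side, the closest analogue I know of is os._exit, which ends
the process without running atexit handlers or interpreter finalization. A
minimal sketch, with flush_and_exit as a made-up helper name; it may not match
TerminateProcess exactly on Windows, and it only flushes the Python-level
streams, so any C or C++ buffers would need flushing separately.

```python
import os
import sys


def flush_and_exit(status=0):
    """Flush Python-level output, then end the process immediately."""
    # os._exit skips atexit handlers and interpreter finalization, so
    # anything still sitting in these buffers would otherwise be lost.
    for stream in (sys.stdout, sys.stderr):
        try:
            stream.flush()
        except (AttributeError, ValueError, OSError):
            pass  # stream is missing or already closed
    os._exit(status)
```

It would be called as the very last statement of the script, after all the
real work is done and before B is released.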

A further option, on Windows, is to manually increment the DLL's
reference count to ensure the DLL never gets kicked out :) This works
well for this type of situation. Sometimes when working with other
people's broken libraries it's just easiest.
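
A related technique, swapped in here for the plain reference-count bump, is
to pin the DLL with GetModuleHandleExW and GET_MODULE_HANDLE_EX_FLAG_PIN,
which keeps the module loaded until the process ends no matter how often
FreeLibrary is called. The sketch below does this through ctypes;
pin_loaded_dll is a hypothetical helper, the DLL must already be loaded, and
on non-Windows platforms the function simply returns False. An extra
LoadLibrary call on the same DLL would be the literal "increment the
reference count" variant.

```python
import sys

GET_MODULE_HANDLE_EX_FLAG_PIN = 0x00000001  # value from the Windows SDK headers


def pin_loaded_dll(module_name):
    """Ask Windows to keep an already loaded DLL mapped until process exit.

    Returns True on success, False on failure or on non-Windows platforms.
    """
    if sys.platform != "win32":
        return False

    # Imported here because ctypes.WinDLL exists only on Windows.
    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetModuleHandleExW.argtypes = [
        wintypes.DWORD,
        wintypes.LPCWSTR,
        ctypes.POINTER(wintypes.HMODULE),
    ]
    kernel32.GetModuleHandleExW.restype = wintypes.BOOL

    handle = wintypes.HMODULE()
    ok = kernel32.GetModuleHandleExW(
        GET_MODULE_HANDLE_EX_FLAG_PIN, module_name, ctypes.byref(handle))
    return bool(ok)
```

In the scenario from this thread it would be called with the file name of
library C, right after C has been loaded.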

BTW I agree that code which cannot shut itself down cleanly is broken
and its authors should hang their heads appropriately and treat it as
a bug to be fixed rather than a feature to be added. That said, it
can be hard to anticipate every possible shutdown use case. I've
certainly been wowed by some user reports regarding my own libraries
coming in at times.

Niall

--
Technology & Consulting Services - ned Productions Limited.
http://www.nedproductions.biz/. VAT reg: IE 9708311Q. Company no:
472909.


