[Boost.Python] How to adjust Python reference counts for hybrid objects?


Per Knudsgaard

   Hi,

 

   I have run into a small problem with my understanding of Python reference counts with Python/C++ hybrid objects.  I have an abstract interface called A and an associated factory interface called AFactory.  I am looking to implement various A+AFactory combinations in both C++ and Python.  I am using Boost's intrusive smart pointers to make life a little more interesting.

   I have attached two files that illustrate my problem (they are a little long, sorry about that, but I couldn’t come up with a better example).  On Linux, compile the cpp file with:

 

$ g++ foo.cpp -g -fPIC -shared -I/usr/include/python2.7 -lpython2.7 -lboost_python -o module.so

 

   Run the py file with a single argument (‘crash’ to produce a segfault, anything else to have it work).

 

   What am I doing?  The abstract factory interface looks roughly like this (with most of the pointer stuff removed):

 

class AFactory {
  public:
    virtual A::Ptr newA() = 0;
};

 

   It has a single function that returns an intrusive pointer to the A interface.  I have an AFactoryWrap class that will forward the call to a Python implementation.  I derive it in Python as AFactoryDerived (where ADerived is the Python implementation of the A interface):

 

tmp = ADerived()

class AFactoryDerived( module.AFactory ):
    def __init__( self ):
        module.AFactory.__init__( self )

    def newA( self ):
        if sys.argv[1] == 'crash':
            return ADerived() # segfaults
        else:
            return tmp        # works

 

   I then trigger the whole thing from a simple C++ function (that I also call from Python, just to add to the fun):

 

void f2( const AFactory::Ptr &factory ) {
    A::Ptr a = factory->newA();
    a->f();
}

 

   It gets a new instance of A from the factory and calls a virtual function on it, implemented in Python, that prints “In f” on stdout.

 

   It works if the instance of ADerived is persistent (global variable) and segfaults if it is temporary (local to the newA call).  That makes me suspect that the problem is with Python reference counts and that I somehow need to maintain a reference from the C++ side so it doesn’t get garbage collected.  The boost::python::handle<> class would seem like the way to go, except I can’t seem to find any decent examples of how to use it.
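
   The lifetime difference described above can be reproduced in pure Python, without Boost.Python.  The following is a minimal sketch, not the code from foo.py: ADerived here is a plain Python class, and borrow() is a hypothetical helper standing in for C++ code that keeps only a raw, non-owning pointer, modelled with a weak reference.  It assumes CPython, where an object is destroyed as soon as its last reference goes away.

```python
import gc
import weakref


class ADerived:
    """Stand-in for the Python implementation of A (no C++ base here)."""

    def f(self):
        return "In f"


tmp = ADerived()  # module-level reference keeps this instance alive


def new_a(crash):
    # Mirrors newA(): either a fresh temporary or the persistent global.
    return ADerived() if crash else tmp


def borrow(obj):
    # Hypothetical model of C++ keeping a raw pointer: a weak reference
    # observes the object without owning it.
    return weakref.ref(obj)


persistent = borrow(new_a(crash=False))
temporary = borrow(new_a(crash=True))
gc.collect()  # not needed on CPython, harmless elsewhere

print(persistent() is tmp)  # is the global instance still reachable?
print(temporary() is None)  # has the temporary already been destroyed?
```

   If the analogy holds, the C++ side has to become an owner of the Python object for as long as it keeps a pointer to it.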

 

   Am I on the right track?  I would assume that the handle<> should be added to the Wrapper constructors, and the ptrRelease function enhanced to count C++ and Python references, but the syntax for creating a handle and obtaining Python reference counts is escaping me.

 

   Thanks,

 

   -- Per.


_______________________________________________
Cplusplus-sig mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/cplusplus-sig

Attachments: foo.cpp (2K), foo.py (724 bytes)
Re: [Boost.Python] How to adjust Python reference counts for hybrid objects?

Per Knudsgaard

   After playing around with this some more, I seem to be right in thinking that this is a problem with Python reference counts.  I have tried three different approaches to solving the problem:

 

1.       My first thought was to get hold of a PyObject *self pointer in the constructor and increment its reference count, forcing the Python object to persist despite going out of scope in Python.  I found two ways to do that, described in [1] and [2].  They seem to fail for reasons described in the answers to [2].

2.       My second thought was to increment the reference count the moment the object enters into C++.  This stopped the crash, but if I instrument the __del__ method in the Python object, it gets called before I increment the reference count.  I think that is a strong indication that this method left me with memory corruption so I am hesitant to follow this path.

3.       The third choice is that I implement a C++ method that marks the object as “shared” and increments its reference count.  I have to make an explicit call to that method in the Python constructor and I check if the object has been marked shared before passing it into C++.  I have added the appropriate smart pointer logic to maintain the extra reference count as long as there are C++ pointers around.

 

   The third option seems to work.  Constructors and destructors are called in reasonable ways, indicating that objects are neither deleted prematurely nor kept around for too long.  I just don’t like it because I would prefer the integration to be transparent: the person implementing the derived class in Python should not need to know that (s)he is deriving it from a C++ class.

 

   Which brings me back to the question: how is one supposed to handle this case?  My current solution is neither crashing nor leaking memory, but it isn’t very elegant.  It kind of stands out against the elegance of Boost and Boost.Python.

 

   Thanks,

 

   -- Per.

 

   [1] http://wiki.python.org/moin/boost.python/HowTo#ownership_of_C.2B-.2B-_object_extended_in_Python

   [2] http://mail.python.org/pipermail/cplusplus-sig/2007-March/011790.html

 

Re: [Boost.Python] How to adjust Python reference counts for hybrid objects?

Per Knudsgaard

   I have gotten (3) to work, but I am wondering about two things.  Since I couldn’t get the constructor to work, I went for a modifier that returns the object.  That allows me to do the following (where NewObject is a Python object that derives from a C++ object):

 

class Factory:
    def create( self ):
        return NewObject().share()

 

   The share function looks like this:

 

PyObject *share() {
    PyObject *owner = boost::python::detail::wrapper_base_::get_owner( *this );
    Py_INCREF(owner);
    Py_INCREF(owner);
    // Fix the C++ reference count

    return owner;
}

 

   So, two questions:

* Is this a legal/recommended way to do a modifier?

* If I only do a single INCREF, then the object is destroyed while executing the NewObject().share() call.  Why do I need two INCREFs?

 

   Anyway, the unit tests are passing, and I now have intrusive pointers with shared ownership between C++ and Python.  Being forced to call the share method is a little annoying, but it works.  Thanks for making it possible.
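
   The reference-count arithmetic in share() can be checked from Python.  The sketch below assumes CPython and uses ctypes.pythonapi to call Py_IncRef and Py_DecRef, the function forms of the Py_INCREF and Py_DECREF macros.  It shows the accounting under one possible explanation for the second INCREF, which is an assumption and not something established here: that the raw PyObject * returned from share() is taken over by the caller as a new reference, so one increment pays for the return value and the other remains for the C++ side.  The NewObject class is a plain Python stand-in.

```python
import ctypes
import sys

# Function forms of Py_INCREF / Py_DECREF, taking a PyObject * argument.
incref = ctypes.pythonapi.Py_IncRef
incref.argtypes = [ctypes.py_object]
incref.restype = None
decref = ctypes.pythonapi.Py_DecRef
decref.argtypes = [ctypes.py_object]
decref.restype = None


class NewObject:
    """Plain Python stand-in; the real class derives from a C++ base."""


obj = NewObject()
base = sys.getrefcount(obj)

incref(obj)  # first Py_INCREF in share()
incref(obj)  # second Py_INCREF in share()
after_share = sys.getrefcount(obj) - base

decref(obj)  # assumption: the caller consumes the returned reference
held_by_cpp = sys.getrefcount(obj) - base

decref(obj)  # the last C++ pointer goes away
after_release = sys.getrefcount(obj) - base

print(after_share, held_by_cpp, after_release)
```

   If that assumption is right, a single INCREF would leave nothing for C++ once the caller releases the returned reference, which would be consistent with the premature destruction described above.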

 

   -- Per.

 

Re: [Boost.Python] How to adjust Python reference counts for hybrid objects?

Jim Bosch-2
On 11/14/2011 09:54 PM, Per Knudsgaard wrote:
> I have gotten (3) to work, but I am wondering about two things.
> Since I couldn't get the constructor to work, I went for a modifier
> that returns the object.

Sorry no one responded to your questions earlier. I haven't looked too
closely yet, but I thought I'd point out that you have basically written
a big workaround for the fact that you're using intrusive_ptr instead of
shared_ptr. Pretty much all of the things you're trying to do work
out-of-the-box with shared_ptr, and you don't have to worry about the
reference count yourself at all.

Unfortunately, similar support for intrusive_ptr isn't really present in
Boost.Python, but that's because intrusive_ptr isn't nearly as flexible.

>
> So, two questions:
>
> *         Is this a legal/recommended way to do a modifier?

Legal, yes.  I think.  If switching to shared_ptr is a possibility, I'd
consider that the recommended way to deal with the entire problem.
Without it, I think you're in rather uncharted territory.

> *         If I only do a single INCREF, then the object is destroyed
> while executing the NewObject().share() call.  Why do I need two
> INCREFs?

I'm a bit stumped by this, too, but I haven't looked as closely as I
might have if you didn't seem to have a satisfactory solution or if I
didn't think shared_ptr was really the way to go here.  Part of the
explanation might be the fact that Boost.Python doesn't hold a reference
to what you get out of get_owner(), so you don't have one either, and
that might do funny things when the only Python reference is a temporary
object you're calling the share() method on.  But I still would have
expected that to be safe.

HTH

Jim
Re: [Boost.Python] How to adjust Python reference counts for hybrid objects?

Per Knudsgaard
   Thanks for your answer.

   I generally prefer intrusive pointers for a number of reasons:  They are safer (no problems converting between raw and managed).  They are generally faster (no cache misses because the reference count is embedded in your object).  They don't require extra allocations (saving a few bytes for every object).  None of those are really that significant here, but since I have no libraries or third party objects in my application, I don't encounter the most obvious downside.

   That said, after writing the email last night, I realized that I could make a minor change to PtrBase which eliminates the need for the share function.  It has a slight performance cost (one extra indirection, which also applies in the pure C++ case and bugs me a little), but it made the code much cleaner.  The strange thing is that I no longer need to call the Py_INCREF function twice.  Ah, well.

   I have mostly used SWIG in the past so I am on a learning curve with Boost.Python.  Apart from somewhat cryptic error messages, I like it.  At some point I will start digging in the code to see how the details are working.  Maybe I can make some charts as I go along :)

   -- Per.
