Writing to numpy array: good practices?

Jonas Einarsson
Dear list,

First, sorry if this is a double-post, I got confused with the subscription. Anyhow, I seek an opinion on good practice.

I'd like to write simple programs that
1) (In Python) allocate a numpy array,
2) (In C/C++) fill said numpy array with data.

To this end I use Boost.Python to compile an extension module. I use the (possibly obsolete?) boost/python/numeric.hpp to allow passing an ndarray to my C-functions. Then I use the numpy C API directly to extract a pointer to the underlying data.

This seemingly works very well, and I can check for correct dimensions and data types, etcetera.

As documentation is scarce, I ask you if this is an acceptable procedure? Any pitfalls nearby?

Sample code: C++

#include <stdexcept>
#include <boost/python.hpp>
#include <boost/python/numeric.hpp>
#include <numpy/arrayobject.h>

using namespace boost::python;

void fill_array(numeric::array& y)
{
    const int ndims = 2;

    // Get pointer to np array
    PyArrayObject* a = (PyArrayObject*)PyArray_FROM_O(y.ptr());
    // std::runtime_error is used for the checks below because
    // std::exception has no portable constructor taking a message.
    if (a == NULL) {
        throw std::runtime_error("Could not get NP array.");
    }
    if (a->descr->elsize != sizeof(double)) {
        throw std::runtime_error("Must be double ndarray");
    }
    if (a->nd != ndims) {
        throw std::runtime_error("Wrong dimension on array.");
    }
    int rows = *(a->dimensions);
    int cols = *(a->dimensions+1);
    double* data = (double*)a->data;

    // Row-major fill of the underlying buffer
    for (int i = 0; i < rows; i++) {
        for (int j = 0; j < cols; j++) {
            *(data + i*cols + j) = really_cool_function(i,j);
        }
    }
}

BOOST_PYTHON_MODULE(Practical01)
{
    import_array();
    boost::python::numeric::array::set_module_and_type("numpy", "ndarray");

    def("fill_array",&fill_array);
}



In Python, this could be used as follows:

import Practical01
import numpy
import matplotlib.pyplot as plt
import matplotlib.cm as colormaps
import time


w = 500
h = 500
large_array = numpy.ones((h, w))

t1 = time.time()
Practical01.fill_array(large_array)
t2 = time.time()
print('Horrible calculation took %0.3f ms' % ((t2 - t1) * 1000.0))

plt.imshow(large_array,cmap=colormaps.gray)
plt.show()


Simplicity is a major factor for me. I don't want a complete wrapper for ndarrays, I just want to compute and shuffle data to Python for further processing. Letting Python handle allocation and garbage collection also seems like a good idea.

Sincerely,
Jonas Einarsson
_______________________________________________
Cplusplus-sig mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/cplusplus-sig

Re: Writing to numpy array: good practices?

Jim Bosch-2
On 10/11/2011 10:39 AM, Jonas Einarsson wrote:

> Dear list,
>
> First, sorry if this is a double-post, I got confused with the
> subscription. Anyhow, I seek an opinion on good practice.
>
> I'd like to write simple programs that
> 1) (In Python) allocates numpy array,
> 2) (In C/C++) fills said numpy array with data.
>
> To this end I use Boost.Python to compile an extension module. I use the
> (possibly obsolete?) boost/python/numeric.hpp to allow passing an
> ndarray to my C-functions. Then I use the numpy C API directly to
> extract a pointer to the underlying data.
>
> This seemingly works very well, and I can check for correct dimensions
> and data types, etcetera.
>
> As documentation is scarce, I ask you if this is an acceptable
> procedure? Any pitfalls nearby?

This is very much an acceptable procedure.  It is a fairly low-level
one, so you may want to be a little more careful in some respects (see
below, and take a closer look at the Numpy C-API documentation).  But
the principle is fine.

>
> Sample code: C++
>
> void fill_array(numeric::array& y)

I'd recommend just passing boost::python::object, and using
PyArray_Check() to ensure that it is indeed an array; I really don't
know how good the old numeric interface is at matching the right types.
But maybe I'm unnecessarily distrustful on that point.  Alternatively,
you could use one of the Numpy C-API functions to get an array from just
about anything.

> {
> const int ndims = 2;
>
> // Get pointer to np array
> PyArrayObject* a = (PyArrayObject*)PyArray_FROM_O(y.ptr());

You might be leaking memory by throwing exceptions after this point; I'd
suggest making "a" a boost::python::handle<>, which will automatically
propagate a raised Python exception if you pass it a null pointer.
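
To illustrate the leak and the fix, here is a minimal sketch that compiles without the Python headers. FakeObject, acquire and release are hypothetical stand-ins for a PyObject, a call that returns a new reference (as PyArray_FROM_O does), and Py_DECREF. std::unique_ptr plays the role that boost::python::handle<> would play in the real extension; it is not the Boost.Python API itself.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Hypothetical stand-in for a reference-counted Python object.
struct FakeObject { int refcount; };

// Stand-in for a call that returns a NEW reference the caller must release.
inline FakeObject* acquire(FakeObject& o) { ++o.refcount; return &o; }
// Stand-in for Py_DECREF.
inline void release(FakeObject* o) { assert(o->refcount > 0); --o->refcount; }

// Raw pointer: the early throw skips release(), so the reference leaks.
inline void leaky(FakeObject& o, bool fail) {
    FakeObject* a = acquire(o);
    if (fail) throw std::runtime_error("validation failed");
    release(a);
}

// RAII owner: the deleter runs on every exit path, including exceptions.
struct Releaser { void operator()(FakeObject* o) const { release(o); } };
using OwnedRef = std::unique_ptr<FakeObject, Releaser>;

inline void guarded(FakeObject& o, bool fail) {
    OwnedRef a(acquire(o));
    if (fail) throw std::runtime_error("validation failed");
}

// Helper: run fn on a fresh object (refcount 1) and report the final count.
inline int refcount_after(void (*fn)(FakeObject&, bool), bool fail) {
    FakeObject o{1};
    try { fn(o, fail); } catch (const std::runtime_error&) {}
    return o.refcount;
}
```

With these stand-ins, refcount_after(leaky, true) leaves the count one too high, while guarded restores it whether or not the check throws.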

You should probably use something other than PyArray_FROM_O
(PyArray_FromAny or PyArray_FROM_OTF, for instance), to ensure that the
flags on the numpy array are what you're expecting.  You can also have
numpy do a check on the number of dimensions and the data type at the
same time.

> if (a == NULL) {
>                throw std::exception("Could not get NP array.");
>        }
> if (a->descr->elsize != sizeof(double))
> {
> throw std::exception("Must be double ndarray");
> }
> if (a->nd != ndims)
> {
> throw std::exception("Wrong dimension on array.");
> }
> int rows = *(a->dimensions);
> int cols = *(a->dimensions+1);
> double* data = (double*)a->data;
>
> for (int i = 0; i < rows; i++)
> {
> for (int j = 0; j < cols; j++)
> {
> *(data + i*cols + j) = really_cool_function(i,j);

This works for most ndarrays (those that are C_CONTIGUOUS), but it won't
work for all of them.  It will fail if you pass in an array you've
called transpose() on, for instance.  What you really want to do is
multiply the indices by the strides.  There are macros to do this in the
Numpy C-API (PyArray_GETPTR1 through PyArray_GETPTR4, one per number of
dimensions).  I'd recommend you use those.
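
The stride arithmetic can be sketched without numpy as follows. View2D and the helper functions are hypothetical names for illustration only; the struct just holds what matters from an ndarray here: a base pointer, a shape, and strides in bytes. element_ptr does the same address computation as the two-index macro PyArray_GETPTR2, and transposed() swaps shape and strides the way a transposed view does, without copying data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the relevant ndarray fields (not a numpy type).
struct View2D {
    char* data;                 // base address of the buffer
    std::ptrdiff_t shape[2];    // number of rows, number of columns
    std::ptrdiff_t strides[2];  // step in BYTES along each axis
};

// Address of element (i, j): base + i*stride0 + j*stride1, in bytes.
inline double* element_ptr(const View2D& v, std::ptrdiff_t i, std::ptrdiff_t j) {
    assert(i >= 0 && i < v.shape[0] && j >= 0 && j < v.shape[1]);
    return reinterpret_cast<double*>(v.data + i * v.strides[0] + j * v.strides[1]);
}

// C-contiguous (row-major) view over an existing buffer of doubles.
inline View2D row_major(double* buf, std::ptrdiff_t rows, std::ptrdiff_t cols) {
    const std::ptrdiff_t item = static_cast<std::ptrdiff_t>(sizeof(double));
    return View2D{reinterpret_cast<char*>(buf), {rows, cols}, {cols * item, item}};
}

// Transposed view of the same buffer: swap shape and strides, copy nothing.
inline View2D transposed(const View2D& v) {
    return View2D{v.data, {v.shape[1], v.shape[0]}, {v.strides[1], v.strides[0]}};
}

// Write 10*i + j into every element, honouring the strides.
inline void fill(const View2D& v) {
    for (std::ptrdiff_t i = 0; i < v.shape[0]; ++i)
        for (std::ptrdiff_t j = 0; j < v.shape[1]; ++j)
            *element_ptr(v, i, j) = 10.0 * i + j;
}

// Helper: fill a rows x cols buffer through its transposed view, return it.
inline std::vector<double> fill_transposed(std::ptrdiff_t rows, std::ptrdiff_t cols) {
    std::vector<double> buf(static_cast<std::size_t>(rows * cols), -1.0);
    fill(transposed(row_major(buf.data(), rows, cols)));
    return buf;
}
```

For a 2x3 buffer viewed as its 3x2 transpose, element (2, 0) of the view lives at flat index 2 of the buffer, whereas the contiguous formula i*cols + j would have written it to index 4.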

> }
> }
> }
>

<snip>

>
>
> Simplicity is a major factor for me. I don't want a complete wrapper for
> ndarrays, I just want to compute and shuffle data to Python for further
> processing. Letting Python handle allocation and garbage collection also
> seems like a good idea.
>

This may be the best approach for you now in that case.  There are also
efforts underway to make the Numpy C-API available through a
boost::python interface (https://svn.boost.org/svn/boost/sandbox/numpy),
but it's not entirely stable yet.

Jim