The answer is no, it is not really possible in a useful way; a newsgroup post addresses essentially the same question.
It wouldn’t be possible to have a direct array (allocated in a single chunk) of `Child`s. Partly this is because, if somewhere else ever gets a reference to a `Child` in the array, that `Child` has to be kept alive (but not the whole array), which wouldn’t be possible to ensure if they were all allocated in the same chunk of memory. Additionally, were the array to be resized (if this is a requirement) then it would invalidate any other references to the objects within the array.
Therefore you’re left with having an array of pointers to `Child`. Such a structure would be fine, but internally would look almost exactly like a Python list (so there’s really no benefit to doing something more complicated in Cython…).
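As a rough illustration of the lifetime argument, here is a minimal pure-Python sketch (also valid Cython). The `Child` class and its `ident` attribute are hypothetical stand-ins for the class in the question. It shows that a list holds references to separately allocated objects, so one element can outlive the container, which is what a single-chunk array could not offer.

```python
import gc
import weakref


class Child:
    """Hypothetical stand-in for the cdef class in the question."""

    def __init__(self, ident):
        self.ident = ident


children = [Child(i) for i in range(3)]
kept = children[1]  # a second reference to one element
watchers = [weakref.ref(c) for c in children]  # observe lifetimes without owning

del children  # drop the container; only `kept` still owns an element
gc.collect()  # not needed on CPython, but makes the result deterministic elsewhere

# True where the object is still alive after the list has gone
alive = [w() is not None for w in watchers]
print(alive)
```

Only the element that is still referenced survives; the other two are freed individually, which is possible because each one is its own allocation.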
There are a few sensible workarounds:
-
The workaround suggested in the newsgroup post is just to use a Python list. You could also use a NumPy array with `dtype=object`. If you need to access a cdef function in the class you can do a cast first:
    cdef Child c = <Child?>a[0]  # omit the ? if you don't want
                                 # the overhead of checking the type.
    c.some_cdef_function()
Internally both these options are stored as a C array of `PyObject` pointers to your `Child` objects and so are not as inefficient as you probably assume.
-
A further possibility might be to store your data as a C struct (`cdef struct ChildStruct: ....`) which can be readily stored as an array. When you need a Python interface to that struct you can either define `Child` so it contains a copy of `ChildStruct` (but modifications won’t propagate back to your original array), or a pointer to `ChildStruct` (but you need to be careful to ensure that the memory is not freed while the `Child` pointing to it is alive).
-
You could use a NumPy structured array – this is pretty similar to using an array of C structs except NumPy handles the memory, and provides a Python interface.
-
The memoryview syntax in your question is valid: `cdef Child[:] array_of_child`. This can be initialized from a NumPy array of dtype `object`:
    array_of_child = np.array([Child() for i in range(100)])
In terms of data structure, this is an array of pointers (i.e. the same as a Python list, but can be multi-dimensional). It avoids the need for `<Child>` casting. The important thing it doesn’t do is any kind of type-checking: if you feed an object that isn’t a `Child` into the array then it won’t notice (because the underlying `dtype` is `object`), but will give nonsense answers or segmentation faults.
In my view this approach gives you a false sense of security about two things: first that you have made a more efficient data structure (you haven’t, it’s basically the same as a list); second that you have any kind of type safety. However, it does exist. (If you want to use memoryviews, e.g. for multi-dimensional arrays, it would probably be better to use a memoryview of type `object`; this is honest about the underlying dtype.)
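For the NumPy structured array workaround, a minimal sketch might look like the following. The field names (`age`, `height`) are assumptions made for illustration, since the question does not say what a `Child` holds. It shows the records living in one contiguous block that NumPy owns, and the same view-versus-copy distinction described for the C struct option.

```python
import numpy as np

# Hypothetical fields; substitute whatever data a Child actually carries.
child_dtype = np.dtype([("age", np.int32), ("height", np.float64)])

# One contiguous, NumPy-owned block of 3 records (no per-element Python objects).
children = np.zeros(3, dtype=child_dtype)
children["age"] = [4, 7, 9]

# Integer indexing gives a record that acts as a view into the array,
# so writing through it changes the original storage.
record = children[1]
record["height"] = 1.25

# .item() converts the record to a plain Python tuple, i.e. an independent copy.
as_tuple = children[1].item()

print(children.itemsize, children.nbytes)
print(as_tuple)
```

In Cython the same array can then be typed with a matching packed struct memoryview if fast element access is needed, although that step is not shown here.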