The main problem is that your objects are scattered across memory, with the attribute stored in each object's dictionary. But for an array to work, the values must be stored in one contiguous data buffer.
I have explored this in other SO questions, but the ones you found are earlier ones. Even so, I don't have much to add to the obvious list comprehension:
np.array([a_stub.x for a_stub in stubs])
Alternatives using itertools or fromiter should not change the speed much, because the time goes into the a_stub.x attribute access rather than into the iteration mechanism. You can verify that by timing something simpler, like
np.array([1 for _ in range(len(stubs))])
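To make that comparison concrete, here is a minimal timing sketch. The Stub class and the list size are hypothetical stand-ins for the objects in the question, and the numbers will vary by machine, so treat it as a way to check the claim rather than as a benchmark result.

```python
import timeit
import numpy as np

class Stub:
    # Hypothetical stand-in for the objects in the question
    def __init__(self, x):
        self.x = x

stubs = [Stub(float(i)) for i in range(10000)]

def with_attribute():
    # List comprehension that reads the attribute from every object
    return np.array([a_stub.x for a_stub in stubs])

def with_fromiter():
    # Same attribute access, but fed to numpy through a generator
    return np.fromiter((a_stub.x for a_stub in stubs),
                       dtype=float, count=len(stubs))

def constant_only():
    # Same iteration, but no attribute access at all
    return np.array([1 for _ in range(len(stubs))])

for fn in (with_attribute, with_fromiter, constant_only):
    seconds = timeit.timeit(fn, number=100)
    print(f"{fn.__name__}: {seconds:.4f} s for 100 runs")
```

If the first two timings are close to each other, the choice between a list comprehension and fromiter is not where the time is going.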
I suspect the best option is to use one or more arrays as the primary storage and refactor your class so that the attribute is fetched from that storage.
If you know that you will have 10 objects, create an empty array of that size. When you create an object, you assign it a unique index. The x attribute can be a property whose getter/setter accesses the data[i] element of that array. By making x a property instead of a primary attribute, you should be able to keep most of the object machinery. And you can experiment with different storage methods by changing just a couple of methods.
I tried sketching this out using a class attribute as the primary array storage, but I still have some bugs.
A class with an x attribute that accesses an array:
import numpy as np

class MyObj(object):
    xdata = np.zeros(10)    # shared storage for the x value of every instance

    def __init__(self, idx, x):
        self._idx = idx
        self.set_x(x)

    def set_x(self, x):
        self.xdata[self._idx] = x

    def get_x(self):
        return self.xdata[self._idx]

    def __repr__(self):
        return "<obj>x=%s" % self.get_x()

    x = property(get_x, set_x)

In [67]: objs = [MyObj(i, 3*i) for i in range(10)]
In [68]: objs
Out[68]: [<obj>x=0.0, <obj>x=3.0, <obj>x=6.0, ... <obj>x=27.0]
In [69]: objs[3].x
Out[69]: 9.0
In [70]: objs[3].xdata
Out[70]: array([ 0., 3., 6., 9., 12., 15., 18., 21., 24., 27.])
In [71]: objs[3].xdata += 3
In [72]: [o.x for o in objs]
Out[72]: [3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0]
Changing the array in place is the easiest. But you can also replace the array itself (and thus "grow" the class):
In [79]: MyObj.xdata=np.ones((20,))
In [80]: a = MyObj(11,25)
In [81]: a
Out[81]: <obj>x=25.0
In [82]: MyObj.xdata
Out[82]: array([ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 25., 1., 1., 1., 1., 1., 1., 1., 1.])
In [83]: [o.x for o in objs]
Out[83]: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
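Replacing the array with np.ones discards the values the existing objects had. One possible way to grow the storage while keeping them is sketched here. The grow classmethod is a hypothetical helper, not part of the session above, and the property looks the array up on the class so that a replaced array is picked up.

```python
import numpy as np

class MyObj(object):
    xdata = np.zeros(10)

    def __init__(self, idx, x):
        self._idx = idx
        self.x = x

    @property
    def x(self):
        # Look the array up on the class so a replaced array is seen
        return type(self).xdata[self._idx]

    @x.setter
    def x(self, value):
        type(self).xdata[self._idx] = value

    @classmethod
    def grow(cls, new_size):
        # Hypothetical helper: allocate a larger array, copy old values over
        if new_size < cls.xdata.size:
            raise ValueError("new_size must not be smaller than current size")
        bigger = np.zeros(new_size)
        bigger[:cls.xdata.size] = cls.xdata
        cls.xdata = bigger

objs = [MyObj(i, 3 * i) for i in range(10)]
MyObj.grow(20)                 # existing values survive the reallocation
objs.append(MyObj(11, 25))
print([o.x for o in objs])
print(MyObj.xdata)
```

The copy costs time proportional to the array size, so if many objects are added one by one it is probably better to grow in large steps than on every insertion.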
We must be careful about attribute modification. For example, I tried
objs[3].xdata += 3
intending to change xdata for the whole class. But this ended up assigning a new xdata attribute on just that object. We should also be able to increment the object index automatically (these days I am more familiar with numpy methods than with Python class structures).
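The two points above can be sketched together. This is only an illustration under my own assumptions: the _next_idx counter built on itertools.count is a hypothetical way to hand out indices, and the class is a trimmed-down variant of MyObj rather than the exact code from the session.

```python
import itertools
import numpy as np

class MyObj(object):
    xdata = np.zeros(10)
    _next_idx = itertools.count()   # hypothetical counter handing out indices

    def __init__(self, x):
        self._idx = next(self._next_idx)
        self.x = x

    @property
    def x(self):
        return self.xdata[self._idx]

    @x.setter
    def x(self, value):
        self.xdata[self._idx] = value

objs = [MyObj(3 * i) for i in range(10)]

# Augmented assignment through an instance updates the shared array in place,
# but it also binds an attribute named xdata on that one instance.
objs[3].xdata += 3
print('xdata' in vars(objs[3]))
print('xdata' in vars(objs[4]))

# Remove the stray instance attribute, then modify through the class instead.
del objs[3].xdata
MyObj.xdata += 3
print('xdata' in vars(objs[3]))
print([o.x for o in objs])
```

Going through MyObj.xdata keeps a single shared array and avoids the per-object attribute. The counter never reuses an index, so deleting objects would leave holes in the array.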
If I replace the getter with one that returns a slice:
def get_x(self):
    return self.xdata[self._idx:self._idx+1]

In [107]: objs=[MyObj(i,i*3) for i in range(10)]
In [109]: objs
Out[109]: [<obj>x=[ 0.], <obj>x=[ 3.], ... <obj>x=[ 27.]]
np.info (or .__array_interface__) gives me information about the xdata array, including its pointer to the data buffer:
In [110]: np.info(MyObj.xdata)
class: ndarray
shape: (10,)
strides: (8,)
itemsize: 8
aligned: True
contiguous: True
fortran: True
data pointer: 0xabf0a70
byteorder: little
byteswap: False
type: float64
The slice for the first object points to the same place:
In [111]: np.info(objs[0].x)
class: ndarray
shape: (1,)
strides: (8,)
itemsize: 8
....
data pointer: 0xabf0a70
...
The next object points to the next float (8 bytes further on):
In [112]: np.info(objs[1].x)
class: ndarray
shape: (1,)
...
data pointer: 0xabf0a78
....
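The same pointer comparison can be done programmatically instead of reading the np.info output by eye. This is a small standalone sketch on a plain array; the address helper is a name I made up for the example.

```python
import numpy as np

xdata = np.arange(10, dtype=np.float64) * 3

first = xdata[0:1]     # one-element slices, like the slice-returning getter
second = xdata[1:2]

def address(arr):
    # First entry of the 'data' tuple is the buffer address as an integer
    return arr.__array_interface__['data'][0]

print(hex(address(xdata)), hex(address(first)), hex(address(second)))
print(address(second) - address(first) == xdata.itemsize)
print(np.shares_memory(first, xdata))

# Writing through the view changes the underlying array
first[0] = 100.0
print(xdata[0])
```

Because the slices are views, each object holds a live window into the shared buffer rather than a copy of its value.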
I'm not sure whether accessing the data through a slice/view is worth it.