Naming dimensions in Python?

Something I would really appreciate is the ability to name the dimensions of an array in Python. For example, I have a NumPy array with 3 dimensions, and I regularly need to sum it along one specific dimension.

So, with an ndarray a, I can do:

    np.sum(a, axis=2)

if my relevant dimension is the last one, but I want to make it "position independent": the user can provide any array, as long as they label the relevant dimension "DI" (for example, for "Dimension of Interest"). So basically, I would like to be able to write:

    np.sum(a, axis="DI")

Something like netCDF's named dimensions, but I do not want to implement the full capabilities of netCDF.

2 answers

@M456's idea is clever, but if you use the same naming scheme for multiple arrays, I think a simpler solution would be to just use a dictionary:

    axes = {'DA': 0, 'DB': 1}
    a.sum(axes['DA'])

or even just variables:

    DA, DB, DC = range(3)
    a.sum(DA)

If the axis you want is the last (or penultimate, etc.) one, just use -1 (or -2, etc.):

    a.shape  # (2, 3, 4)
    np.all(a.sum(2) == a.sum(-1))  # True
    np.all(a.sum(0) == a.sum(-3))  # True
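
To get the "position independent" behaviour the question asks for, the dictionary idea can be applied per array instead of globally. The following is only a minimal sketch: `sum_over` and the names "x" and "y" are made up for illustration, and it assumes the caller passes one name per dimension along with the array.

```python
import numpy as np

def sum_over(a, dim_names, name):
    """Sum `a` over the axis labelled `name` (hypothetical helper, not a NumPy API)."""
    if len(dim_names) != a.ndim:
        raise ValueError("need exactly one name per dimension")
    # tuple.index gives the position of the named dimension
    return a.sum(axis=dim_names.index(name))

a = np.arange(24).reshape(2, 3, 4)
# Same data, with the dimension of interest moved to the front.
b = np.moveaxis(a, 2, 0)

r1 = sum_over(a, ("x", "y", "DI"), "DI")
r2 = sum_over(b, ("DI", "x", "y"), "DI")
print(np.array_equal(r1, r2))
```

Both calls reduce the same dimension even though it sits at a different position in each array.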

You can write a thin wrapper subclass of np.ndarray, but keeping the correspondence between dimensions and names can be tricky.

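    import numpy as np

    class NamedArray(np.ndarray):
        def __new__(cls, *args, **kwargs):
            # args[0] is the shape; np.ndarray allocates uninitialized memory
            obj = np.ndarray(args[0], **kwargs).view(cls)
            return obj

        def __init__(self, *args, **kwargs):
            # optional second positional argument: one name per dimension
            self.dim_names = None
            if len(args) == 2:
                self.dim_names = args[1]

        def __array_finalize__(self, obj):
            # views and reduction results skip __init__, so they start unnamed
            self.dim_names = None

        def sum(self, *args, **kwargs):
            # translate a dimension name into its integer position
            axis = kwargs.get('axis')
            if self.dim_names is not None and isinstance(axis, str):
                kwargs['axis'] = self.dim_names.index(axis)
            return super().sum(*args, **kwargs)

    # no dimension names; shape (1, 2, 3)
    a = NamedArray([1, 2, 3], dtype=np.float32)
    # same shape, with dimension names
    b = NamedArray([1, 2, 3], ('d1', 'd2', 'd3'), dtype=np.float32)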
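
The tricky part mentioned above is that the names must change whenever the shape does. One way to sidestep subclassing is to wrap the array instead and drop the reduced name explicitly. This is only a sketch under that assumption: `Labelled` is a hypothetical class, not part of NumPy, and it handles nothing but `sum`.

```python
import numpy as np

class Labelled:
    """Hypothetical wrapper (not a NumPy API): an array plus one name per axis."""

    def __init__(self, data, dim_names):
        self.data = np.asarray(data)
        self.dim_names = tuple(dim_names)
        if len(self.dim_names) != self.data.ndim:
            raise ValueError("need exactly one name per dimension")

    def sum(self, axis):
        # Accept either a dimension name or an integer position.
        if isinstance(axis, str):
            axis = self.dim_names.index(axis)
        # Let NumPy validate the axis before the names are touched.
        summed = self.data.sum(axis=axis)
        if axis < 0:
            axis += self.data.ndim
        # The reduced dimension disappears, so its name must go too.
        remaining = self.dim_names[:axis] + self.dim_names[axis + 1:]
        return Labelled(summed, remaining)

x = Labelled(np.ones((2, 3, 4)), ("d1", "d2", "DI"))
y = x.sum("DI")   # names left: ("d1", "d2")
z = y.sum("d1")   # names left: ("d2",)
print(y.dim_names, z.dim_names)
```

Because each reduction returns a new wrapper with updated names, chained sums by name keep working, at the cost of not being usable directly as an ndarray.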

Edit: Pandas DataFrame is currently pretty close to what the OP asked.
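
As a rough illustration of that remark, assuming pandas is installed: a DataFrame only has two axes, which can be addressed as "index" and "columns", while a 3-D array can be flattened into a Series whose MultiIndex levels carry the names. The level names "x" and "y" below are made up for the example.

```python
import numpy as np
import pandas as pd

a = np.arange(24).reshape(2, 3, 4)

# 2-D case: a DataFrame's two axes can be referred to by name.
df = pd.DataFrame(a[0])
row_totals = df.sum(axis="columns")  # same values as a[0].sum(axis=1)

# 3-D case: flatten to a Series whose index levels are named.
idx = pd.MultiIndex.from_product(
    [range(2), range(3), range(4)], names=["x", "y", "DI"]
)
s = pd.Series(a.ravel(), index=idx)
# Summing over "DI" amounts to grouping by every other level.
summed = s.groupby(level=["x", "y"]).sum()
print(summed.to_numpy().reshape(2, 3))
```

The result no longer depends on where "DI" sits, only on its name, though the data is now stored in long form instead of as a 3-D block.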


Source: https://habr.com/ru/post/1479382/

