Converting incomplete nested lists into a rectangular ndarray

In Python (also using numpy), I have a list of lists of lists, each of which has a different length.

    [
        [["header1", "header2"], ["---"], [], ["item1", "value1"]],
        [["header1", "header2", "header3"], ["item2", "value2"],
         ["item3", "value3", "value4", "value5"]],
    ]

I want to make this data structure rectangular, i.e. ensure that len(list[x]) is constant for all x, len(list[x][y]) is constant for all x, y, and so on.

(This is because I want to import the data structure into numpy)

I can think of various un-Pythonic ways to do this (iterate over the structure, record the maximum length at each level, then make a second pass and pad the values with None), but there should be a better way.

(I would also like the solution not to depend on the dimension of the structure, i.e. it should work on lists of such structures ...)

Is there an easy way to do this that I am missing?

1 answer

You can create an ndarray with the required shape and copy your list into it element by element. Since your list is incomplete, some indices will not exist; catch the resulting IndexError in a try/except block and fill those positions with a padding value.

Using numpy.ndenumerate makes it easy to extend this to more dimensions (by adding more indices i, j, k, l, m, n, ... to the for loop below):

    import numpy as np

    test = [
        [["header1", "header2"], ["---"], [], ["item1", "value1"]],
        [["header1", "header2", "header3"], ["item2", "value2"],
         ["item3", "value3", "value4", "value5"]],
    ]

    # Pre-allocate the target array; 'U20' holds strings of up to 20 characters.
    collector = np.empty((2, 4, 4), dtype='U20')
    for (i, j, k), v in np.ndenumerate(collector):
        try:
            collector[i, j, k] = test[i][j][k]
        except IndexError:
            # This position does not exist in the ragged list: pad it.
            collector[i, j, k] = ''

    print(repr(collector))
    # array([[['header1', 'header2', '', ''],
    #         ['---', '', '', ''],
    #         ['', '', '', ''],
    #         ['item1', 'value1', '', '']],
    #
    #        [['header1', 'header2', 'header3', ''],
    #         ['item2', 'value2', '', ''],
    #         ['item3', 'value3', 'value4', 'value5'],
    #         ['', '', '', '']]], dtype='<U20')
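The shape (2, 4, 4) above is hard-coded, and the loop is tied to three indices. If the nesting depth is not known in advance, something along the following lines might work. This is only a sketch: the helper names max_shape and to_rectangular are made up for illustration, and it assumes that all leaf values sit at the same nesting depth and that every container is a plain list.

```python
import numpy as np

def max_shape(nested):
    """Return the largest length found at each nesting level (hypothetical helper)."""
    if not isinstance(nested, list):
        return ()  # a leaf (e.g. a string) contributes no dimension
    child_shapes = [max_shape(item) for item in nested]
    depth = max((len(s) for s in child_shapes), default=0)
    # Pad shallower shapes with zeros so every level can be compared.
    padded = [s + (0,) * (depth - len(s)) for s in child_shapes]
    inner = tuple(max(column) for column in zip(*padded))
    return (len(nested),) + inner

def to_rectangular(nested, fill="", dtype=object):
    """Copy a ragged nested list into an ndarray, padding gaps with `fill`."""
    shape = max_shape(nested)
    out = np.full(shape, fill, dtype=dtype)
    for index in np.ndindex(*shape):
        item = nested
        try:
            for i in index:  # walk down one level per index component
                item = item[i]
        except IndexError:
            continue  # position missing in the ragged input: keep the fill value
        out[index] = item
    return out

test = [
    [["header1", "header2"], ["---"], [], ["item1", "value1"]],
    [["header1", "header2", "header3"], ["item2", "value2"],
     ["item3", "value3", "value4", "value5"]],
]

result = to_rectangular(test)
print(result.shape)     # (2, 4, 4)
print(result[1, 2, 3])  # value5
```

The default dtype=object avoids truncating long strings; passing a fixed-width dtype such as 'U20' should give an array like the one in the answer above. Passing fill=None pads with None instead, as the question suggested.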

Source: https://habr.com/ru/post/945782/

