The pythonic way of calculating array offsets

Question

I am trying to calculate the start and offset of variable-sized array chunks and store them in a dictionary. Below is the likely non-pythonic way I currently do this. I'm not sure whether I should be looking at map, a lambda function, or a list comprehension to make the code more pythonic.

Essentially, I need to cut the array into pieces based on its total size and store the values xstart, ystart, x_number_of_rows_to_read, y_number_of_columns_to_read in the dictionary. The total size is variable. I cannot load the entire array into memory and use numpy indexing, though I would definitely like to. The start and offset values are then used to read each piece of the array into numpy.

    intervalx = xsize / xsegment  # Get the size of the chunks
    intervaly = ysize / ysegment  # Get the size of the chunks

    # Setup to segment the image storing the start values and key into a dictionary.
    xstart = 0
    ystart = 0
    key = 0

    d = defaultdict(list)

    for y in xrange(0, ysize, intervaly):
        if y + (intervaly * 2) < ysize:
            numberofrows = intervaly
        else:
            numberofrows = ysize - y

        for x in xrange(0, xsize, intervalx):
            if x + (intervalx * 2) < xsize:
                numberofcolumns = intervalx
            else:
                numberofcolumns = xsize - x
            l = [x, y, numberofcolumns, numberofrows]
            d[key].append(l)
            key += 1
    return d

I understand that xrange is not ideal if the code is later ported to Python 3.

python arrays numpy

Jzl5325 Jul 18 '12 at 20:38

4 answers

Although this does not change your algorithm, a more pythonic way to write your if / else statements is:

 numberofrows = intervaly if y + intervaly * 2 < ysize else ysize - y

instead of this:

    if y + (intervaly * 2) < ysize:
        numberofrows = intervaly
    else:
        numberofrows = ysize - y

(and similarly for the other if / else statement).

kamek Jul 18 '12 at 20:51

Have you considered using np.memmap to load the chunks dynamically? You would then only need to work out the offsets you need on the fly, instead of chunking the array up front and storing all the offsets.

http://docs.scipy.org/doc/numpy/reference/generated/numpy.memmap.html
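As a rough illustration of the memmap suggestion, here is a minimal sketch assuming Python 3 and a raw binary file whose dtype and shape are known in advance. The file name, array sizes, and offset values are made up for the example; a real image format with a header would also need the `offset` argument or a dedicated reader.

```python
import os
import tempfile

import numpy as np

# Hypothetical setup: write a small raw array to disk so the sketch is self-contained.
ysize, xsize = 6, 8
path = os.path.join(tempfile.mkdtemp(), "image.dat")
np.arange(ysize * xsize, dtype=np.int32).reshape(ysize, xsize).tofile(path)

# Map the file instead of reading it; dtype and shape must match how it was written.
image = np.memmap(path, dtype=np.int32, mode="r", shape=(ysize, xsize))

# Offsets decided on the fly: start at column 4, row 2, read 3 columns and 2 rows.
xstart, ystart, ncols, nrows = 4, 2, 3, 2

# Slicing the memmap touches only the requested region; np.array copies it into memory.
chunk = np.array(image[ystart:ystart + nrows, xstart:xstart + ncols])
print(chunk)
```

Only the sliced region is copied into a regular in-memory array, so the full array never has to fit in RAM.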

Joshdel Jul 18 '12 at 20:52

This is a long one-liner:

 d = [(x,y,min(x+xinterval,xsize)-x,min(y+yinterval,ysize)-y) for x in xrange(0,xsize,xinterval) for y in xrange(0,ysize,yinterval)]
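For readers on Python 3, the same comprehension might be wrapped in a function as sketched below. The function name `chunk_offsets` and the sample sizes are hypothetical; `range` replaces `xrange`, and the loop order (x outermost) follows the one-liner above rather than the code in the question.

```python
def chunk_offsets(xsize, ysize, xinterval, yinterval):
    """Return (xstart, ystart, ncols, nrows) tuples covering an xsize-by-ysize grid."""
    return [
        # min() clips the chunk at the array edge, so only edge chunks are smaller.
        (x, y, min(x + xinterval, xsize) - x, min(y + yinterval, ysize) - y)
        for x in range(0, xsize, xinterval)
        for y in range(0, ysize, yinterval)
    ]


# Made-up sizes: a 10-column by 5-row array cut into chunks of at most 4 x 3.
offsets = chunk_offsets(10, 5, 4, 3)
for entry in offsets:
    print(entry)
```

With these sample sizes the chunks tile the array without overlapping, and the ones along the right and bottom edges are clipped to the remaining columns and rows.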

Marco de wit Jul 18 '12 at 21:03

mgilson · Accepted Answer · 2012-07-18T20:46:34+0000

This code looks fine except for the use of defaultdict. A list seems like a much better data structure because:

Your keys are sequential integers.
You store a list whose only element is another list in your dict.

One thing you could do:

Use the ternary operator (I'm not sure this is an improvement, but it results in fewer lines of code).

Here's a modified version of your code with my little suggestions.

    intervalx = xsize / xsegment  # Get the size of the chunks
    intervaly = ysize / ysegment  # Get the size of the chunks

    # Setup to segment the image storing the start values and key into a dictionary.
    xstart = 0
    ystart = 0

    output = []

    for y in xrange(0, ysize, intervaly):
        numberofrows = intervaly if y + (intervaly * 2) < ysize else ysize - y
        for x in xrange(0, xsize, intervalx):
            numberofcolumns = intervalx if x + (intervalx * 2) < xsize else xsize - x
            lst = [x, y, numberofcolumns, numberofrows]
            output.append(lst)
            # If it doesn't make any difference to your program, the above 2 lines could read:
            #   tple = (x, y, numberofcolumns, numberofrows)
            #   output.append(tple)
            # This will be slightly more efficient (tuple creation is faster than list creation)
            # and less memory hungry. In other words, if it doesn't need to be a list due
            # to other constraints (e.g. you append to it later), you should make it a tuple.

Now, to get your data, you can do offset_list = output[5] instead of offset_list = d[5][0].
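To make the difference in access patterns concrete, here is a small Python 3 sketch with made-up offset tuples. It builds both structures from the same data and shows that the extra `[0]` is only needed for the defaultdict version.

```python
from collections import defaultdict

# Made-up (xstart, ystart, ncols, nrows) entries, purely for illustration.
chunks = [(0, 0, 4, 3), (4, 0, 4, 3), (8, 0, 2, 3)]

# Question's layout: sequential integer keys, each mapping to a one-element list.
d = defaultdict(list)
for key, chunk in enumerate(chunks):
    d[key].append(chunk)

# Answer's layout: a plain list, indexed by position.
output = list(chunks)

print(d[1][0])    # needs the extra [0] to unwrap the inner list
print(output[1])  # same entry, one index
```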
