Replace list of rooms with flat subbands

Question

Replace list of rooms with flat subbands

Given a list of numbers, for example:

lst = [0, 10, 15, 17]

I need a list containing elements from i -> i + 3 for all i in lst . If there are overlapping ranges, I would like them to be combined.

So, in the above example, we first get:

 [0, 1, 2, 3, 10, 11, 12, 13, 15, 16, 17, 18, 17, 18, 19, 20]

But for the last two groups, the ranges overlap, so when you combine them, you have:

 [0, 1, 2, 3, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20]

This is my desired result.

This is what I was thinking:

 from collections import OrderedDict res = list(OrderedDict.fromkeys([y for x in lst for y in range(x, x + 4)]).keys()) print(res) = [0, 1, 2, 3, 10, 11, 12, 13, 15, 16, 17, 18, 19, 20]

However, this is slow ( 10000 loops, best of 3: 56 µs per loop ). I would like it to be possible with a numpy solution or a python solution that would be faster than that.

+5

python list numpy

cᴏʟᴅsᴘᴇᴇᴅ Aug 28 '17 at 16:58

source share

1 answer

Divakar · Accepted Answer · 2017-08-28T17:01:06+0000

Approach # 1: One approach based on broadcasted summation, and then using np.unique to get unique numbers -

 np.unique(np.asarray(lst)[:,None] + np.arange(4))

Approach No. 2: Another, based on broadcast summation and then masking -

 def mask_app(lst, interval_len = 4): arr = np.array(lst) r = np.arange(interval_len) ranged_vals = arr[:,None] + r a_diff = arr[1:] - arr[:-1] valid_mask = np.vstack((a_diff[:,None] > r, np.ones(interval_len,dtype=bool))) return ranged_vals[valid_mask]

Runtime test

Original approach -

 from collections import OrderedDict def org_app(lst): list(OrderedDict.fromkeys([y for x in lst for y in range(x, x + 4)]).keys())

Dates -

 In [409]: n = 10000 In [410]: lst = np.unique(np.random.randint(0,4*n,(n))).tolist() In [411]: %timeit org_app(lst) ...: %timeit np.unique(np.asarray(lst)[:,None] + np.arange(4)) ...: %timeit mask_app(lst, interval_len = 4) ...: 10 loops, best of 3: 32.7 ms per loop 1000 loops, best of 3: 1.03 ms per loop 1000 loops, best of 3: 671 µs per loop In [412]: n = 100000 In [413]: lst = np.unique(np.random.randint(0,4*n,(n))).tolist() In [414]: %timeit org_app(lst) ...: %timeit np.unique(np.asarray(lst)[:,None] + np.arange(4)) ...: %timeit mask_app(lst, interval_len = 4) ...: 1 loop, best of 3: 350 ms per loop 100 loops, best of 3: 14.7 ms per loop 100 loops, best of 3: 9.73 ms per loop

The bottleneck with the two hosted approaches seems to be related to the conversion to array , although this seems to pay off well afterwards. Just to give an idea of the time taken to convert for the last data set -

 In [415]: %timeit np.array(lst) 100 loops, best of 3: 5.6 ms per loop

Replace list of rooms with flat subbands

More articles: