Algorithm for partitioning an array into "almost equal", uniform subarrays

Given an array with N elements, I am looking for M (M < N) consecutive subarrays of equal length, or of lengths that differ by at most 1. For example, if N = 12 and M = 4, all subarrays will have the same length N / M = 3. If N = 100 and M = 12, I expect subarrays of lengths 8 and 9, and both sizes should be evenly distributed across the original array. This simple task turned out to be a little subtle to implement. I came up with an adaptation of the linear Bresenham algorithm, which looks like this in C++:

  #include <cstddef>   // size_t
  #include <cstdint>   // intptr_t
  #include <vector>

  /// The function suggests how an array with num_data items can be
  /// subdivided into consecutively arranged groups (intervals) of
  /// equal or "similar" length. The number of intervals is specified
  /// by the parameter num_intervals. The result is stored in an array
  /// with (num_intervals + 1) items, each of which holds the start
  /// index of an interval; the last, additional item is a sentinel
  /// that contains the value num_data.
  ///
  /// Example:
  ///
  ///   Input:  num_data ........... 14,
  ///           num_intervals ...... 4
  ///
  ///   Result: result_start_idx ... [ 0, 3, 7, 10, 14 ]
  ///
  void create_uniform_intervals( const size_t num_data,
                                 const size_t num_intervals,
                                 std::vector<size_t>& result_start_idx )
  {
      const size_t avg_interval_len = num_data / num_intervals;
      // remainder: the number of intervals that get one extra element
      const size_t last_interval_len = num_data % num_intervals;

      // establish the new size of the result vector
      result_start_idx.resize( num_intervals + 1 );

      // write the sentinel value at the end:
      result_start_idx[ num_intervals ] = num_data;

      size_t offset = 0;  // current offset

      // use the Bresenham line algorithm to distribute
      // last_interval_len extra elements over num_intervals:
      intptr_t error = num_intervals / 2;

      for( size_t i = 0; i < num_intervals; i++ )
      {
          result_start_idx[ i ] = offset;
          offset += avg_interval_len;
          error  -= last_interval_len;
          if( error < 0 )
          {
              offset++;
              error += num_intervals;
          } // if
      } // for
  }

This code produces the following interval lengths for N = 100, M = 12: 8 9 8 8 9 8 8 9 8 8 9 8
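As a sanity check (my addition, not part of the original post), the interval lengths are simply the adjacent differences of the start indices; a small helper with a hypothetical name might look like this:

```cpp
#include <cstddef>
#include <vector>

// interval_lengths: recover the length of each interval from a
// start-index vector like the one create_uniform_intervals() fills.
// (Hypothetical helper, not part of the original post.)
std::vector<size_t> interval_lengths(const std::vector<size_t>& start_idx) {
    std::vector<size_t> lengths;
    for (size_t i = 0; i + 1 < start_idx.size(); ++i)
        lengths.push_back(start_idx[i + 1] - start_idx[i]);
    return lengths;
}
```

For the documented example [ 0, 3, 7, 10, 14 ] this yields the lengths {3, 4, 3, 4}, which differ by at most 1 as required.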

The real issue is that I don't know the precise name of this problem, which made it hard to search for.

  • Are there other algorithms that accomplish this task?
  • What are they called? Knowing the names would also help me find other uses.

I need this algorithm as part of a larger data-clustering algorithm. I suspect it could also be useful for implementing parallel sorting (?).

2 answers

If your language has integer division that truncates, a simple way to calculate the size of partition i is via (N*i+N)/M - (N*i)/M. For example, the Python program

  N = 100; M = 12
  for i in range(M):
      print((N*i + N)//M - (N*i)//M)

displays the numbers 8 8 9 8 8 9 8 8 9 8 8 9. With N=12;M=5 it gives 2 2 3 2 3. With N=12;M=3 it gives 4 4 4.

If your partition numbers are 1-based instead of 0-based, the expression is instead (N*i)/M - (N*i-N)/M.
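This identity translates directly to C++, where / on unsigned integers already truncates. A minimal sketch (the function name is mine, not from the answer):

```cpp
#include <cstddef>

// partition_size: length of the 0-based partition i when N items are
// split into M near-equal consecutive parts, using the truncating
// integer-division identity from the answer above.
size_t partition_size(size_t N, size_t M, size_t i) {
    return (N * i + N) / M - (N * i) / M;
}
```

A nice property of this formulation: summing partition_size(N, M, i) over i = 0..M-1 telescopes to (N*M)/M = N, so no element is ever lost or counted twice.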


Space-filling curves and fractals subdivide the plane and reduce complexity. Examples are the Z-order curve, the Hilbert curve, and the Morton curve.
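For illustration only (not part of the original answer): a Z-order/Morton index for 2-D points can be computed by interleaving the bits of the coordinates. This sketch uses the classic "magic masks" bit-spreading trick:

```cpp
#include <cstdint>

// morton2d: interleave the 16 bits of x and y into a single 32-bit
// z-order (Morton) index -- one way space-filling curves linearize
// 2-D data. x occupies the even bit positions, y the odd ones.
uint32_t morton2d(uint16_t x, uint16_t y) {
    auto spread = [](uint32_t v) -> uint32_t {
        // spread the low 16 bits of v out to the even bit positions
        v = (v | (v << 8)) & 0x00FF00FF;
        v = (v | (v << 4)) & 0x0F0F0F0F;
        v = (v | (v << 2)) & 0x33333333;
        v = (v | (v << 1)) & 0x55555555;
        return v;
    };
    return spread(x) | (spread(y) << 1);
}
```

Sorting points by this index groups spatially nearby points together, which is one way such curves "reduce complexity" when partitioning 2-D data.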


Source: https://habr.com/ru/post/901975/
