Given an array of N elements, I want to partition it into M (M < N) consecutive subarrays whose lengths are equal or differ by at most 1. For example, if N = 12 and M = 4, all subarrays have the same length N / M = 3. If N = 100 and M = 12, I expect subarrays of lengths 8 and 9, with both sizes evenly distributed across the original array. This seemingly simple task turned out to be a little subtle to implement. I came up with an adaptation of the linear Bresenham algorithm, which looks like this in C++:
```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

/// Suggests how an array of num_data items can be subdivided into
/// consecutive groups (intervals) of equal or "similar" length. The
/// number of intervals is given by num_intervals. The result is stored
/// in an array of (num_intervals + 1) items, each holding the start
/// index of an interval; the last item is a sentinel containing num_data.
///
/// Example:
///
///   Input:  num_data ........... 14
///           num_intervals ...... 4
///
///   Result: result_start_idx ... [ 0, 3, 7, 10, 14 ]
///
void create_uniform_intervals( const size_t num_data,
                               const size_t num_intervals,
                               std::vector<size_t>& result_start_idx )
{
    const size_t avg_interval_len = num_data / num_intervals;
    const size_t remainder        = num_data % num_intervals;

    // establish the new size of the result vector
    result_start_idx.resize( num_intervals + 1 );

    // write the sentinel value at the end:
    result_start_idx[ num_intervals ] = num_data;

    size_t offset = 0;  // current offset

    // use the Bresenham line algorithm to distribute
    // the remainder evenly over num_intervals:
    intptr_t error = num_intervals / 2;

    for( size_t i = 0; i < num_intervals; i++ )
    {
        result_start_idx[ i ] = offset;
        offset += avg_interval_len;
        error  -= remainder;
        if( error < 0 )
        {
            offset++;
            error += num_intervals;
        } // if
    } // for
}
```
With N = 100 and M = 12, this code produces the interval lengths: 8 9 8 8 9 8 8 9 8 8 9 8
The real problem is that I don't know the precise name of this task, so it has been hard to search for.
- Are there other algorithms to accomplish such a task?
- What are they called? Knowing the names might also point me to other applications.
I need this as a building block of a larger data clustering algorithm. I suspect it could also be useful for implementing parallel sorting.