Date Range Merge / Merge Algorithm

I am trying to find a better way to combine date ranges into a single database record (array element).

This is the data that I have:

Array ( [0] => Array ( [id] => 18298 [start_date] => 2011-07-09 [end_date] => 2011-10-01 ) [1] => Array ( [id] => 18297 [start_date] => 2011-06-01 [end_date] => 2011-06-30 ) [2] => Array ( [id] => 17113 [start_date] => 2011-03-31 [end_date] => 2011-05-31 ) [3] => Array ( [id] => 20555 [start_date] => 2011-01-03 [end_date] => 2011-03-31 ) ) 

And after we combine them, the array (or database) should look like this:

 Array ( [0] => Array ( [merged_ids] => 18298 [start_date] => 2011-07-09 [end_date] => 2011-10-01 ) [1] => Array ( [merged_ids] => 18297, 17113, 20555 [start_date] => 2011-01-03 [end_date] => 2011-06-30 ) ) 

Is there any algorithm for going through all the elements / ranges and combining them? Which way is better / easier to do - through a database (MYSQL) or coding (PHP)?

Any recommendations are greatly appreciated.

Thanks!

UPDATE: Sorry, I did not provide enough information: we must combine any continuous and overlapping date ranges.

+4
source share
4 answers

Sort by start date.

Then go and check if there is a start date for the next item before or immediately after the current end date. If so, merge the next into the current one. Then continue.

+11
source

I wrote a function that merges / merges a list of ranges. It is written in Python, but it is easy to rewrite it in PHP. Here's the full code: https://gist.github.com/barszczmm/8447665 and here's a simplified algorithm (still in Python):

 list_of_ranges.sort() # sort input list new_list_of_ranges = [] # output list new_range_item_start = None new_range_item_end = None length = len(list_of_ranges) for i, range_item in enumerate(list_of_ranges): if new_range_item_start is None: new_range_item_start = range_item[0] new_range_item_end = range_item[1] elif new_range_item_end >= range_item[0]: new_range_item_end = max(range_item[1], new_range_item_end) else: new_list_of_ranges.append((new_range_item_start, new_range_item_end)) new_range_item_start = range_item[0] new_range_item_end = range_item[1] # save if this is last item if i + 1 == length: new_list_of_ranges.append((new_range_item_start, new_range_item_end)) 
0
source

The implementation is similar to:

 function mergeDateTimeRanges($ranges) { $retVal = []; //sort date ranges by begin time usort($ranges, function ($a, $b) { return strcmp($a['begin_at'], $b['begin_at']); }); $currentRange = []; foreach ($ranges as $range) { // bypass invalid value if ($range['begin_at'] >= $range['end_at']) { continue; } //fill in the first element if (empty($currentRange)) { $currentRange = $range; continue; } if ($currentRange['end_at'] < $range['begin_at']) { $retVal[] = $currentRange; $currentRange = $range; } elseif ($currentRange['end_at'] < $range['end_at']) { $currentRange ['end_at'] = $range['end_at']; } } if ($currentRange) { $retVal[] = $currentRange; } return $retVal; } 
0
source

My method will generate a combined array where, by overlapping or sequential dates, are grouped together, and the group identifiers are stored as a string of comma-separated values.

I use the modern "spaceship operator" ( <=> ) to compare usort() . If your code runs below version php7, you can use:

 usort($array,function($a,$b){ return strcmp($a['start_date'],$b['start_date']); }); 

See inline comments for a step-by-step explanation.

Code: ( Demo )

 function mergeRanges($array){ usort($array,function($a,$b){ return $a['start_date']<=>$b['start_date']; }); // order by start_date ASC foreach($array as $i=>$row){ if($i && $row['start_date']<=date('Ym-d',strtotime("{$result[$x]['end_date']} +1 day"))){ // not the first iteration and dates are within current group range if($row['end_date']>$result[$x]['end_date']){ // only if current end_date is greater than existing end_date $result[$x]['end_date']=$row['end_date']; // overwrite end_date with new end_date in group } $result[$x]['merged_ids'][]=$row['id']; // append id to merged_ids subarray }else{ // first iteration or out of range; start new group if($i){ // if not first iteration $result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']); // convert previous group id elements to csv string }else{ // first iteration $x=-1; // declare $x as -1 so that it becomes 0 when incremented with ++$x } $result[++$x]=['merged_ids'=>[$row['id']],'start_date'=>$row['start_date'],'end_date'=>$row['end_date']]; // declare new group } } $result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']); // convert final merged_ids subarray to csv string return $result; } $array=[ ['id'=>18298,'start_date'=>'2011-07-09','end_date'=>'2011-10-01'], ['id'=>18297,'start_date'=>'2011-06-01','end_date'=>'2011-06-30'], ['id'=>17113,'start_date'=>'2011-03-31','end_date'=>'2011-05-31'], // tests that 17113 and 18297 belong in same group ['id'=>20556,'start_date'=>'2011-02-03','end_date'=>'2011-02-13'], // tests that "fully overlapped" date range is included ['id'=>20555,'start_date'=>'2011-01-03','end_date'=>'2011-03-31'] ]; print_r(mergeRanges($array)); 

Output:

 Array ( [0] => Array ( [merged_ids] => 20555, 20556, 17113, 18297 [start_date] => 2011-01-03 [end_date] => 2011-06-30 ) [1] => Array ( [merged_ids] => 18298 [start_date] => 2011-07-09 [end_date] => 2011-10-01 ) ) 
0
source

Source: https://habr.com/ru/post/1339391/


All Articles