I am trying to find the best way on how to merge date ranges into one database record (array element).
This is the data I have:
Array
(
[0] => Array
(
[id] => 18298
[start_date] => 2011-07-09
[end_date] => 2011-10-01
)
[1] => Array
(
[id] => 18297
[start_date] => 2011-06-01
[end_date] => 2011-06-30
)
[2] => Array
(
[id] => 17113
[start_date] => 2011-03-31
[end_date] => 2011-05-31
)
[3] => Array
(
[id] => 20555
[start_date] => 2011-01-03
[end_date] => 2011-03-31
)
)
And af开发者_如何学编程ter we combine them, array (or database) should look like this:
Array
(
[0] => Array
(
[merged_ids] => 18298
[start_date] => 2011-07-09
[end_date] => 2011-10-01
)
[1] => Array
(
[merged_ids] => 18297, 17113, 20555
[start_date] => 2011-01-03
[end_date] => 2011-06-30
)
)
Is there any algorithm to go through all elements/ranges and combine them? Which way is better/easier to do - through database (MYSQL) or coding (PHP)?
Any advise is highly appreciated.
Thanks!
UPDATE: Sorry, I didn't provide enough info: we should merge any continuous and overlapping date ranges.
Sort by start date.
Then iterate through and check for if the next item's start date is before or directly after the current one's end date. If it is, then merge the next one into the current one. Then continue.
I've written function which combines/merges list of ranges. It is written in Python, but it should be easy to rewrite it in PHP. Here's full code: https://gist.github.com/barszczmm/8447665 and here's simplified algorithm (still in Python):
list_of_ranges.sort() # sort input list
new_list_of_ranges = [] # output list
new_range_item_start = None
new_range_item_end = None
length = len(list_of_ranges)
for i, range_item in enumerate(list_of_ranges):
if new_range_item_start is None:
new_range_item_start = range_item[0]
new_range_item_end = range_item[1]
elif new_range_item_end >= range_item[0]:
new_range_item_end = max(range_item[1], new_range_item_end)
else:
new_list_of_ranges.append((new_range_item_start, new_range_item_end))
new_range_item_start = range_item[0]
new_range_item_end = range_item[1]
# save if this is last item
if i + 1 == length:
new_list_of_ranges.append((new_range_item_start, new_range_item_end))
The implementation is like:
function mergeDateTimeRanges($ranges)
{
$retVal = [];
//sort date ranges by begin time
usort($ranges, function ($a, $b) {
return strcmp($a['begin_at'], $b['begin_at']);
});
$currentRange = [];
foreach ($ranges as $range) {
// bypass invalid value
if ($range['begin_at'] >= $range['end_at']) {
continue;
}
//fill in the first element
if (empty($currentRange)) {
$currentRange = $range;
continue;
}
if ($currentRange['end_at'] < $range['begin_at']) {
$retVal[] = $currentRange;
$currentRange = $range;
} elseif ($currentRange['end_at'] < $range['end_at']) {
$currentRange ['end_at'] = $range['end_at'];
}
}
if ($currentRange) {
$retVal[] = $currentRange;
}
return $retVal;
}
My method will generate a merged array where by overlapping or consecutive dates are grouped together and the group's ids are stored as a string of comma-space separated values.
I am using the modern "spaceship operator" (<=>
) for usort()
s comparison. If your code is running on a version below php7, you can use:
usort($array,function($a,$b){ return strcmp($a['start_date'],$b['start_date']); });
See inline comments for step-by-step explanation.
Code: (Demo)
function mergeRanges($array){
usort($array,function($a,$b){ return $a['start_date']<=>$b['start_date']; }); // order by start_date ASC
foreach($array as $i=>$row){
if($i && $row['start_date']<=date('Y-m-d',strtotime("{$result[$x]['end_date']} +1 day"))){ // not the first iteration and dates are within current group's range
if($row['end_date']>$result[$x]['end_date']){ // only if current end_date is greater than existing end_date
$result[$x]['end_date']=$row['end_date']; // overwrite end_date with new end_date in group
}
$result[$x]['merged_ids'][]=$row['id']; // append id to merged_ids subarray
}else{ // first iteration or out of range; start new group
if($i){ // if not first iteration
$result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']); // convert previous group's id elements to csv string
}else{ // first iteration
$x=-1; // declare $x as -1 so that it becomes 0 when incremented with ++$x
}
$result[++$x]=['merged_ids'=>[$row['id']],'start_date'=>$row['start_date'],'end_date'=>$row['end_date']]; // declare new group
}
}
$result[$x]['merged_ids']=implode(', ',$result[$x]['merged_ids']); // convert final merged_ids subarray to csv string
return $result;
}
$array=[
['id'=>18298,'start_date'=>'2011-07-09','end_date'=>'2011-10-01'],
['id'=>18297,'start_date'=>'2011-06-01','end_date'=>'2011-06-30'],
['id'=>17113,'start_date'=>'2011-03-31','end_date'=>'2011-05-31'], // tests that 17113 and 18297 belong in same group
['id'=>20556,'start_date'=>'2011-02-03','end_date'=>'2011-02-13'], // tests that "fully overlapped" date range is included
['id'=>20555,'start_date'=>'2011-01-03','end_date'=>'2011-03-31']
];
print_r(mergeRanges($array));
Output:
Array
(
[0] => Array
(
[merged_ids] => 20555, 20556, 17113, 18297
[start_date] => 2011-01-03
[end_date] => 2011-06-30
)
[1] => Array
(
[merged_ids] => 18298
[start_date] => 2011-07-09
[end_date] => 2011-10-01
)
)
精彩评论