Find all intersections of all range sets in PostgreSQL

I am looking for an effective way to find all intersections between sets of time intervals. It should work with PostgreSQL 9.2.

Say, ranges are times when a person can meet. Each person can have one or more time ranges when they are available. I want to find all the time periods when a meeting can take place (i.e. when all people are available).

This is what I still have. This seems to work, but I don't think it is very effective as it takes into account the availability of one person at a time.

WITH RECURSIVE td AS ( -- Test data. Returns: -- ["2014-01-20 00:00:00","2014-01-31 00:00:00") -- ["2014-02-01 00:00:00","2014-02-20 00:00:00") -- ["2014-04-15 00:00:00","2014-04-20 00:00:00") SELECT 1 AS entity_id, '2014-01-01'::timestamp AS begin_time, '2014-01-31'::timestamp AS end_time UNION SELECT 1, '2014-02-01', '2014-02-28' UNION SELECT 1, '2014-04-01', '2014-04-30' UNION SELECT 2, '2014-01-15', '2014-02-20' UNION SELECT 2, '2014-04-15', '2014-05-05' UNION SELECT 3, '2014-01-20', '2014-04-20' ) , ranges AS ( -- Convert to tsrange type SELECT entity_id, tsrange(begin_time, end_time) AS the_range FROM td ) , min_max AS ( SELECT MIN(entity_id), MAX(entity_id) FROM td ) , inter AS ( -- Ranges for the lowest ID SELECT entity_id AS last_id, the_range FROM ranges r WHERE r.entity_id = (SELECT min FROM min_max) UNION ALL -- Iteratively intersect with ranges for the next higher ID SELECT entity_id, r.the_range * i.the_range FROM ranges r JOIN inter i ON r.the_range && i.the_range WHERE r.entity_id > i.last_id AND NOT EXISTS ( SELECT * FROM ranges r2 WHERE r2.entity_id < r.entity_id AND r2.entity_id > i.last_id ) ) -- Take the final set of intersections SELECT * FROM inter WHERE last_id = (SELECT max FROM min_max) ORDER BY the_range; 
+6
source share
3 answers

I created the tsrange_interception_agg aggregate

 create function tsrange_interception ( internal_state tsrange, next_data_values tsrange ) returns tsrange as $$ select internal_state * next_data_values; $$ language sql; create aggregate tsrange_interception_agg (tsrange) ( sfunc = tsrange_interception, stype = tsrange, initcond = $$[-infinity, infinity]$$ ); 

Then this request

 with td (id, begin_time, end_time) as ( values (1, '2014-01-01'::timestamp, '2014-01-31'::timestamp), (1, '2014-02-01', '2014-02-28'), (1, '2014-04-01', '2014-04-30'), (2, '2014-01-15', '2014-02-20'), (2, '2014-04-15', '2014-05-05'), (3, '2014-01-20', '2014-04-20') ), ranges as ( select id, row_number() over(partition by id) as rn, tsrange(begin_time, end_time) as tr from td ), cr as ( select r0.tr tr0, r1.tr as tr1 from ranges r0 cross join ranges r1 where r0.id < r1.id and r0.tr && r1.tr and r0.id = (select min(id) from td) ) select tr0 * tsrange_interception_agg(tr1) as interseptions from cr group by tr0 having count(*) = (select count(distinct id) from td) - 1 ; interseptions ----------------------------------------------- ["2014-02-01 00:00:00","2014-02-20 00:00:00") ["2014-01-20 00:00:00","2014-01-31 00:00:00") ["2014-04-15 00:00:00","2014-04-20 00:00:00") 
+7
source

If you have a fixed number of objects that you want to cross-reference, you can use the cross-connection for each of them and construct the intersection (using the * operator on ranges).

Using a cross join like this is probably less efficient. The following example is more about explaining a more complex example below.

 WITH td AS ( SELECT 1 AS entity_id, '2014-01-01'::timestamp AS begin_time, '2014-01-31'::timestamp AS end_time UNION SELECT 1, '2014-02-01', '2014-02-28' UNION SELECT 1, '2014-04-01', '2014-04-30' UNION SELECT 2, '2014-01-15', '2014-02-20' UNION SELECT 2, '2014-04-15', '2014-05-05' UNION SELECT 4, '2014-01-20', '2014-04-20' ) ,ranges AS ( -- Convert to tsrange type SELECT entity_id, tsrange(begin_time, end_time) AS the_range FROM td ) SELECT r1.the_range * r2.the_range * r3.the_range AS r FROM ranges r1 CROSS JOIN ranges r2 CROSS JOIN ranges r3 WHERE r1.entity_id=1 AND r2.entity_id=2 AND r3.entity_id=4 AND NOT isempty(r1.the_range * r2.the_range * r3.the_range) ORDER BY r 

In this case, multiple cross-connects are probably less efficient, because in reality you do not need to have all the possible combinations of each range in reality, since isempty(r1.the_range * r2.the_range) enough to make isempty(r1.the_range * r2.the_range * r3.the_range) true.

I do not think that you can avoid access to each person on time, since you want all of them to meet in any case.

Which can help create a phased set of intersections by cross-connecting the accessibility of each person with the previous subset that you calculated using another recursive CTE ( intersections in the example below). Then you create the intersections step by step and get rid of empty ranges, like stored arrays:

 WITH RECURSIVE td AS ( SELECT 1 AS entity_id, '2014-01-01'::timestamp AS begin_time, '2014-01-31'::timestamp AS end_time UNION SELECT 1, '2014-02-01', '2014-02-28' UNION SELECT 1, '2014-04-01', '2014-04-30' UNION SELECT 2, '2014-01-15', '2014-02-20' UNION SELECT 2, '2014-04-15', '2014-05-05' UNION SELECT 4, '2014-01-20', '2014-04-20' ) ,ranges AS ( -- Convert to tsrange type SELECT entity_id, tsrange(begin_time, end_time) AS the_range FROM td ) ,ranges_arrays AS ( -- Prepare an array of all possible intervals per entity SELECT entity_id, array_agg(the_range) AS ranges_arr FROM ranges GROUP BY entity_id ) ,numbered_ranges_arrays AS ( -- We'll join using pos+1 next, so we want continuous integers -- I've changed the example entity_id from 3 to 4 to demonstrate this. SELECT ROW_NUMBER() OVER () AS pos, entity_id, ranges_arr FROM ranges_arrays ) ,intersections (pos, subranges) AS ( -- We start off with the infinite range. SELECT 0::bigint, ARRAY['[,)'::tsrange] UNION ALL -- Then, we unnest the previous intermediate result, -- cross join it against the array of ranges from the -- next row in numbered_ranges_arrays (joined via pos+1). -- We take the intersection and remove the empty array. SELECT r.pos, ARRAY(SELECT x * y FROM unnest(r.ranges_arr) x CROSS JOIN unnest(i.subranges) y WHERE NOT isempty(x * y)) FROM numbered_ranges_arrays r INNER JOIN intersections i ON r.pos=i.pos+1 ) ,last_intersections AS ( -- We just really want the result from the last operation (with the max pos). SELECT subranges FROM intersections ORDER BY pos DESC LIMIT 1 ) SELECT unnest(subranges) r FROM last_intersections ORDER BY r 

I'm not sure if this is likely to work better, unfortunately. You will probably need a larger dataset to have meaningful tests.

+1
source

OK, I wrote and tested this in TSQL, but it should work, or at least be close enough for you to translate backwards, these are all pretty vanilla constructs. Except maybe between, but it can be broken down into <and . (thanks @Horse)

 WITH cteSched AS ( --Schedule for everyone -- Test data. Returns: -- ["2014-01-20 00:00:00","2014-01-31 00:00:00") -- ["2014-02-01 00:00:00","2014-02-20 00:00:00") -- ["2014-04-15 00:00:00","2014-04-20 00:00:00") SELECT 1 AS entity_id, '2014-01-01' AS begin_time, '2014-01-31' AS end_time UNION SELECT 1, '2014-02-01', '2014-02-28' UNION SELECT 1, '2014-04-01', '2014-04-30' UNION SELECT 2, '2014-01-15', '2014-02-20' UNION SELECT 2, '2014-04-15', '2014-05-05' UNION SELECT 3, '2014-01-20', '2014-04-20' ), cteReq as ( --List of people to schedule (or is everyone in Sched required? Not clear, doesn't hurt) SELECT 1 as entity_id UNION SELECT 2 UNION SELECT 3 ), cteBegins as ( SELECT distinct begin_time FROM cteSched as T WHERE NOT EXISTS (SELECT entity_id FROM cteReq as R WHERE NOT EXISTS (SELECT * FROM cteSched as X WHERE X.entity_id = R.entity_id AND T.begin_time BETWEEN X.begin_time AND X.end_time )) ) SELECT B.begin_time, MIN(S.end_time ) as end_time FROM cteBegins as B cross join cteSched as S WHERE B.begin_time between S.begin_time and S.end_time GROUP BY B.begin_time -- NOTE: This assume users do not have schedules that overlap with themselves! That is, nothing like -- John is available 2014-01-01 to 2014-01-15 and 2014-01-10 to 2014-01-20. 

EDIT: adding output from above (when running on SQL-Server 2008R2)
begin_time end_time
2014-01-20 2014-01-31
2014-02-01 2014-02-20
2014-04-15 2014-04-20

0
source

Source: https://habr.com/ru/post/972846/


All Articles