T-SQL to update a table to remove overlapping timelines

I was wondering if anyone can help me with this SQL statement?

Let's say I have a SQL Server 2008 table as follows:

    id    -- INT PRIMARY KEY
    dtIn  -- DATETIME2
    dtOut -- DATETIME2
    type  -- INT

    id  dtIn   dtOut  type
    1   05:00  10:00  1
    2   08:00  16:00  2
    3   02:00  08:00  1
    4   07:30  11:00  1
    5   07:00  12:00  2

I need to remove any time overlaps in the above table. This can be illustrated in this diagram:

[Image: diagram of the overlapping time ranges]

So, I came up with this SQL:

    UPDATE [table] AS t
    SET dtOut = (SELECT MIN(dtIn)
                 FROM [table]
                 WHERE type = t.type
                   AND t.dtIn >= dtIn
                   AND t.dtIn < dtOut)
    WHERE type = t.type
      AND t.dtIn >= dtIn
      AND t.dtIn < dtOut

But that will not work. Any idea what I'm doing wrong here?

EDIT

Well, it took me a while to get to this. This seems to be working SQL for what I need:

    --BEGIN TRANSACTION;

    --delete identical dtIn
    DELETE dT1
    FROM tbl dT1
    WHERE EXISTS (
        SELECT *
        FROM tbl dT2
        WHERE dT1.Type = dT2.Type
          AND dT1.dtIn = dT2.dtIn
          AND (
                dT1.dtOut < dT2.dtOut
                OR (dT1.dtOut = dT2.dtOut AND dT1.id < dT2.id)
              )
    );

    --adjust dtOuts to the max dates for overlapping section
    UPDATE tbl
    SET dtOut = COALESCE((
        SELECT MAX(dtOut)
        FROM tbl as t1
        WHERE t1.type = tbl.type
          AND t1.dtIn < tbl.dtOut
          AND t1.dtOut > tbl.dtIn
    ), dtOut);

    -- Do the actual updates of dtOut
    UPDATE tbl
    SET dtOut = COALESCE((
        SELECT MIN(dtIn)
        FROM tbl as t2
        WHERE t2.type = tbl.type
          AND t2.id <> tbl.id
          AND t2.dtIn >= tbl.dtIn
          AND t2.dtIn < tbl.dtOut
    ), dtOut);

    --COMMIT TRANSACTION;
2 answers

Off the top of my head, I think one of Joe Celko's books had this as an example problem. You may be able to find the excerpt on Google.

This might be closer. I don't think you were really approaching the subquery in the right way.

    -- [table] is bracketed throughout because TABLE is a reserved word
    UPDATE [table]
    SET dtOut = (
        SELECT MIN(t2.dtIn)
        FROM [table] as t2
        WHERE t2.id <> [table].id
          AND t2.type = [table].type
          AND [table].dtIn < t2.dtIn
          AND t2.dtIn < [table].dtOut
          AND [table].dtOut <= t2.dtOut
    )
    WHERE EXISTS (
        SELECT 1
        FROM [table] as t3
        WHERE t3.type = [table].type
          AND t3.id <> [table].id
          AND [table].dtIn < t3.dtIn
          AND t3.dtIn < [table].dtOut
          AND [table].dtOut <= t3.dtOut
    )

EDIT I had overlooked the id column at the top of the page, so comparing ids is obviously a better check than making sure the endpoints don't match. The solution is probably simpler if you can assume that no two rows of the same type have the same dtIn.

By the way, there is no reason to use CROSS APPLY when a subquery will do exactly the same job.

EDIT 2 I did a little testing, and I think my query handles the scenario in your diagram. There is one case where it may not do what you want.

For a given type, think of the last two segments, S1 and S2, in order of start time. S2 starts after S1, but imagine that it also ends before S1 does. S2 is then completely contained in the interval of S1, so either it is irrelevant or the information for the two segments has to be split into a third segment, and the problem becomes more complicated.

So this solution simply assumes that such segments can be ignored.
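To make the contained-segment case concrete, here is a minimal sketch of a query that lists such rows so you can inspect them before running any update. The two sample rows and the aliases inner_seg and outer_seg are made up for illustration; only the column names come from the question.

```sql
-- Hypothetical sample data: S2 (id 2) lies entirely inside S1 (id 1).
DECLARE @T TABLE (id INT PRIMARY KEY, dtIn DATETIME2, dtOut DATETIME2, type INT);

INSERT @T VALUES
    (1, '02:00', '10:00', 1),  -- S1
    (2, '04:00', '06:00', 1);  -- S2, starts after S1 and ends before it

-- List every row that is strictly contained in another row of the same type.
SELECT inner_seg.id AS containedId,
       outer_seg.id AS containerId
FROM @T AS inner_seg
JOIN @T AS outer_seg
  ON outer_seg.type  = inner_seg.type
 AND outer_seg.id   <> inner_seg.id
 AND outer_seg.dtIn  < inner_seg.dtIn    -- container starts earlier
 AND inner_seg.dtOut < outer_seg.dtOut;  -- and ends later
```

With these two rows the query should report row 2 as contained in row 1. The update above leaves both rows untouched, because S2 fails its `table.dtOut <= t2.dtOut` condition.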


EDIT 3 Based on the comment about merging the updates

SQLFiddle posted by the OP

    -- eliminate redundant rows
    DELETE dT1
    FROM tbl dT1  -- kept: without this FROM clause the alias dT1 does not resolve
    WHERE EXISTS (
        SELECT *
        FROM tbl dT2
        WHERE dT1.Type = dT2.Type
          AND dT1.dtIn = dT2.dtIn
          AND (
                dT1.dtOut < dT2.dtOut
                OR (dT1.dtOut = dT2.dtOut AND dT1.id < dT2.id)
              )
    );

    --adjust dtOuts to the max dates
    UPDATE tbl
    SET dtOut = COALESCE((
        SELECT MAX(dtOut)
        FROM tbl as t1
        WHERE t1.type = tbl.type
    ), dtOut);

    -- Do the actual updates of dtOut
    UPDATE tbl
    SET dtOut = COALESCE((
        SELECT MIN(dtIn)
        FROM tbl as t2
        WHERE t2.type = tbl.type
          AND t2.id <> tbl.id
          AND t2.dtIn >= tbl.dtIn
          AND t2.dtIn < tbl.dtOut
    ), dtOut);

Alternatively, either one of the two updates below should be able to replace the two updates above.

    UPDATE tbl
    SET dtOut = (
        SELECT COALESCE(
                   MIN(dtIn),
                   /* as long as there is no GROUP BY, there is always one row */
                   (SELECT MAX(dtOut) FROM tbl as tmax WHERE tmax.type = tbl.type)
               )
        FROM tbl as tmin
        WHERE tmin.type = tbl.type
          AND tmin.dtIn > tbl.dtIn
        /*
        regarding the original condition in the second update:
            t2.dtIn >= tbl.dtIn AND t2.dtIn < tbl.dtOut
        dtIns can't be equal because you already deleted those,
        and if dtIn is guaranteed to be less than dtOut,
        it is also automatically always less than max(dtOut)
        */
    );

    UPDATE tbl
    SET dtOut = COALESCE(
        (
            SELECT MIN(dtIn)
            FROM tbl as tmin
            WHERE tmin.type = tbl.type
              AND tmin.dtIn > tbl.dtIn
        ),
        (
            SELECT MAX(dtOut)
            FROM tbl as tmax
            WHERE tmax.type = tbl.type
        )
    );

I think CROSS APPLY can do the trick:

    DECLARE @T TABLE (ID INT, DTIn DATETIME2, dtOut DATETIME2, Type INT)

    INSERT @T VALUES
        (1, '05:00', '10:00', 1),
        (2, '08:00', '16:00', 2),
        (3, '02:00', '08:00', 1),
        (4, '07:30', '11:00', 1),
        (5, '07:00', '12:00', 2)

    UPDATE @T
    SET DtOut = T3.DtOut
    FROM @T T1
    CROSS APPLY (
        SELECT MIN(DtIn) [DtOut]
        FROM @T T2
        WHERE T2.Type = T1.Type
          AND T2.DtIn > T1.dtIn
          AND T2.DtIn < T1.dtOut
    ) T3
    WHERE T3.dtOut IS NOT NULL

    SELECT * FROM @T
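If you want to confirm that no overlaps remain after an update like this, a self-join check along the following lines might help. This is only a sketch: the rows below are what I would expect the update above to leave behind for the sample data (an assumption worked out by hand, not output copied from a server), and the aliases A and B are arbitrary.

```sql
-- Assumed post-update state of the sample data (worked out by hand).
DECLARE @T TABLE (ID INT, DtIn DATETIME2, DtOut DATETIME2, Type INT);

INSERT @T VALUES
    (1, '05:00', '07:30', 1),
    (2, '08:00', '16:00', 2),
    (3, '02:00', '05:00', 1),
    (4, '07:30', '11:00', 1),
    (5, '07:00', '08:00', 2);

-- Two rows of the same type overlap when each one starts before the other ends.
SELECT A.ID AS FirstID,
       B.ID AS SecondID
FROM @T AS A
JOIN @T AS B
  ON B.Type = A.Type
 AND A.ID   < B.ID       -- report each pair only once
 AND A.DtIn < B.DtOut
 AND B.DtIn < A.DtOut;
```

An empty result means no two rows of the same type overlap. Rows that merely touch (one ends exactly when the next begins) are not reported, because both comparisons are strict.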

Source: https://habr.com/ru/post/920923/

