The following query identifies all possible rearrangements of orders within 60 minutes of each other:
DECLARE @orders TABLE (CustomerId INT, OrderId INT, OrderDate DATETIME) INSERT INTO @orders VALUES (1, 1, '2012-03-01 01:58:00'), (1, 2, '2012-03-01 02:01:00'), (1, 3, '2012-03-01 02:03:00'), (2, 4, '2012-03-01 02:15:00'), (3, 5, '2012-03-01 02:18:00'), (3, 6, '2012-03-01 04:30:00'), (4, 7, '2012-03-01 04:35:00'), (5, 8, '2012-03-01 04:38:00'), (6, 9, '2012-03-01 04:58:00'), (6, 10, '2012-03-01 04:59:00'); with ProximityOrderCascade(CustomerId, OrderId, ProximateOrderId, MinutesDifference, OrderDate, ProximateOrderDate) as ( select o.customerid, o.orderid, null, null, o.orderdate, o.orderdate from @orders o union all select o.customerid, o.orderid, p.orderid, datediff(minute, p.OrderDate, o.OrderDate), o.OrderDate, p.OrderDate from ProximityOrderCascade p inner join @orders o on p.customerid = o.customerid and abs(datediff(minute, p.OrderDate, o.OrderDate)) between 0 and 60 and o.orderid <> p.orderid where proximateorderid is null ) select * from ProximityOrderCascade where not ProximateOrderId is null
From there, you can convert the results to a query of your choice. The results of this function only identify customers 1 and 6 as having โduplicateโ orders.
CustomerId OrderId ProximateOrderId MinutesDifference OrderDate ProximateOrderDate ----------- ----------- ---------------- ----------------- ----------------------- ----------------------- 6 9 10 -1 2012-03-01 04:58:00.000 2012-03-01 04:59:00.000 6 10 9 1 2012-03-01 04:59:00.000 2012-03-01 04:58:00.000 1 1 3 -5 2012-03-01 01:58:00.000 2012-03-01 02:03:00.000 1 2 3 -2 2012-03-01 02:01:00.000 2012-03-01 02:03:00.000 1 1 2 -3 2012-03-01 01:58:00.000 2012-03-01 02:01:00.000 1 3 2 2 2012-03-01 02:03:00.000 2012-03-01 02:01:00.000 1 2 1 3 2012-03-01 02:01:00.000 2012-03-01 01:58:00.000 1 3 1 5 2012-03-01 02:03:00.000 2012-03-01 01:58:00.000 (8 row(s) affected)