SQL Server 2005 recursive query with loops in data - is this possible?

Question

I have a standard boss/subordinate employee table. I need to select a boss (specified by ID) and all his subordinates (and their subordinates, and so on). Unfortunately, the real-world data contains some loops (for example, both company owners have each other set as their boss). A simple recursive query with a CTE chokes on this (maximum recursion level of 100 exceeded). Can the employees still be selected? I don't care about the order in which they are selected, only that each one is selected exactly once.

Added: You want my query? Umm... OK... I thought it was pretty obvious, but here it is:

    with UserTbl as -- Selects an employee and his subordinates.
    (
        select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = @UserID
        union all
        select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])
    )
    select * from UserTbl

Added 2: Oh, in case it wasn't clear: this is a production system, and I have to do a small upgrade (basically add a kind of report). So I would prefer not to modify the data if that can be avoided.

sql loops sql-server recursion common-table-expression

Vilx- Jul 28 '09 at 9:22

10 answers

Hossein · Answer 1 · 2011-12-16T03:51:52+0000

I know it has been a while, but I thought I should share my experience, since I tried every one of these solutions. Here is a summary of my findings (and maybe of this post?):

Adding a column with the current path did work, but it turned out not to be an option for me.
I could not find a way to do it using a CTE.
I wrote a recursive SQL function that adds employee IDs to a table. To get around the circular references, it has to make sure that no duplicate IDs are added to the table. Performance was average, but not what I wanted.

Having done all of that, I came up with the idea of dumping the whole subset of [eligible] employees to code (C#) and filtering them there with a recursive method. Then I wrote the filtered list of employees to a DataTable and passed it to my stored procedure as a temp table. To my disbelief, this proved to be the fastest and most flexible method for both small and relatively large tables (I tried tables of up to 35,000 rows).

Adriaan stander · Answer 2 · 2009-07-28T10:10:31+0000

This will work for the initial recursive link, but might not work for longer chains of links:

    DECLARE @Table TABLE( ID INT, PARENTID INT )

    INSERT INTO @Table (ID,PARENTID) SELECT 1, 2
    INSERT INTO @Table (ID,PARENTID) SELECT 2, 1
    INSERT INTO @Table (ID,PARENTID) SELECT 3, 1
    INSERT INTO @Table (ID,PARENTID) SELECT 4, 3
    INSERT INTO @Table (ID,PARENTID) SELECT 5, 2

    SELECT * FROM @Table

    DECLARE @ID INT
    SELECT @ID = 1

    ;WITH boss (ID,PARENTID) AS
    (
        SELECT ID, PARENTID FROM @Table WHERE PARENTID = @ID
    ),
    bossChild (ID,PARENTID) AS
    (
        SELECT ID, PARENTID FROM boss
        UNION ALL
        SELECT t.ID, t.PARENTID
        FROM @Table t
        INNER JOIN bossChild b ON t.PARENTID = b.ID
        WHERE t.ID NOT IN (SELECT PARENTID FROM boss)
    )
    SELECT * FROM bossChild
    OPTION (MAXRECURSION 0)

What I would recommend is to use a WHILE loop and insert rows into a temp table only if the ID does not already exist there, which removes the endless loops.

van · Answer 3 · 2009-07-28T10:40:01+0000

Not a general solution, but it may work for your case: in your query, change this:

 select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID])

to become:

 select a.[User_ID], a.[Manager_ID] from [User] a join UserTbl b on (a.[Manager_ID]=b.[User_ID]) and a.[User_ID] <> @UserID

Phil factor · Answer 4 · 2009-07-28T18:01:22+0000

You do not have to do this recursively. It can be done in a WHILE loop. I guarantee it will be quicker: it has been for me every time I have timed the two techniques. It sounds inefficient, but it is not, because the number of loop iterations equals the depth of the hierarchy. On each iteration you can check for a cycle and correct it where it occurs. You can also put a constraint on the temporary table to raise an error if a cycle occurs, though you seem to prefer something that deals with cycles more gracefully. You can also raise an error when the WHILE loop has iterated over a certain number of levels (to catch an undetected cycle? Oh boy, it sometimes happens).

The trick is to insert repeatedly into a temporary table (which is primed with the root entries), including a column with the current iteration number, and to do an inner join between the most recent results in the temporary table and the child entries in the original table. Just break out of the loop when @@ROWCOUNT = 0! Simple, eh?
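The loop described above could be sketched roughly as follows. This is only an illustration under assumptions, not Phil Factor's own code: the table variables `@User` and `@Found`, the `[Level]` column, and the sample rows are hypothetical stand-ins for the real `[User]` table, and the syntax is kept to what SQL Server 2005 accepts.

```sql
-- Hypothetical sample data: users 1 and 2 are each other's manager (a cycle).
DECLARE @User TABLE ([User_ID] INT PRIMARY KEY, [Manager_ID] INT);
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 1, 2;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 2, 1;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 3, 1;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 4, 3;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 5, 2;

DECLARE @UserID INT, @Level INT, @Rows INT;
SELECT @UserID = 1, @Level = 0;

-- Result table (assumed name). The primary key doubles as a safety net:
-- inserting the same user twice would raise an error instead of looping.
DECLARE @Found TABLE ([User_ID] INT PRIMARY KEY, [Level] INT);

-- Prime the table with the root entry.
INSERT INTO @Found ([User_ID], [Level]) VALUES (@UserID, 0);
SET @Rows = @@ROWCOUNT;

WHILE @Rows > 0
BEGIN
    SET @Level = @Level + 1;

    -- Join only the rows found in the previous iteration to their children,
    -- skipping anyone who is already in the result table.
    INSERT INTO @Found ([User_ID], [Level])
    SELECT u.[User_ID], @Level
    FROM @User AS u
    JOIN @Found AS f ON f.[User_ID] = u.[Manager_ID]
    WHERE f.[Level] = @Level - 1
      AND NOT EXISTS (SELECT 1 FROM @Found AS x WHERE x.[User_ID] = u.[User_ID]);

    -- Capture the row count immediately; the loop ends when nothing was added.
    SET @Rows = @@ROWCOUNT;
END;

SELECT [User_ID], [Level] FROM @Found ORDER BY [Level], [User_ID];
```

With these sample rows, each of the five users should be returned once, and the loop runs once per hierarchy level plus one final empty pass. Filtering on `f.[Level] = @Level - 1` keeps each iteration from re-joining rows that were already expanded.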

Jose Chama · Answer 5 · 2009-12-31T17:13:33+0000

I know you asked this question a while ago, but here is a solution that may work for detecting infinite recursive loops. I build up a path and check in the CTE's WHERE condition whether the user ID is already in the path; if it is, that user is not processed again. Hope this helps.

Jose

    DECLARE @Table TABLE( USER_ID INT, MANAGER_ID INT )

    INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 1, 2
    INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 2, 1
    INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 3, 1
    INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 4, 3
    INSERT INTO @Table (USER_ID,MANAGER_ID) SELECT 5, 2

    DECLARE @UserID INT
    SELECT @UserID = 1

    ;with UserTbl as -- Selects an employee and his subordinates.
    (
        select '/'+cast( a.USER_ID as varchar(max)) as [path], a.[User_ID], a.[Manager_ID]
        from @Table a
        where [User_ID] = @UserID
        union all
        select b.[path] +'/'+ cast( a.USER_ID as varchar(max)) as [path], a.[User_ID], a.[Manager_ID]
        from @Table a
        inner join UserTbl b on (a.[Manager_ID]=b.[User_ID])
        -- Skip users already on the path. The trailing '/' is appended so that
        -- the last ID on the path is matched too (a user who is his own manager).
        where charindex('/'+cast( a.USER_ID as varchar(max))+'/', b.[path]+'/') = 0
    )
    select * from UserTbl

Mladen prajdic · Answer 6 · 2009-07-28T09:41:33+0000

Basically, if you have loops like this in the data, you will have to do the retrieval logic yourself. You could use one CTE to get only the subordinates and another to get the bosses.

Another idea is to have a dummy row as the boss of both company owners, so that they are not each other's boss, which is ridiculous. This is my preferred option.

AK · Answer 7 · 2009-07-28T13:27:37+0000

The preferred solution is to clean up the data and make sure you do not get any loops in the future. That can be accomplished with a trigger or with a UDF wrapped in a CHECK constraint.
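Before the data can be cleaned up, the cycles have to be found. One possible way to list the users that sit on a cycle is sketched below. It is an assumption-laden illustration rather than anything from this answer: the table variable `@User`, the sample rows, and the names `Chain`, `Start_ID`, `Next_ID` and `[Path]` are all hypothetical, and it reuses the path-string technique from the other answers.

```sql
-- Hypothetical sample data: users 1 and 2 are each other's manager (a cycle).
DECLARE @User TABLE ([User_ID] INT PRIMARY KEY, [Manager_ID] INT);
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 1, 2;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 2, 1;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 3, 1;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 4, 3;
INSERT INTO @User ([User_ID], [Manager_ID]) SELECT 5, 2;

WITH Chain AS
(
    -- Start one chain at every user.
    SELECT u.[User_ID]    AS Start_ID,
           u.[Manager_ID] AS Next_ID,
           CAST('/' + CAST(u.[User_ID] AS VARCHAR(20)) + '/' AS VARCHAR(MAX)) AS [Path]
    FROM @User AS u
    UNION ALL
    -- Step up to the next manager; stop as soon as a user would repeat,
    -- which is what guarantees that the recursion terminates.
    SELECT c.Start_ID,
           m.[Manager_ID],
           c.[Path] + CAST(m.[User_ID] AS VARCHAR(20)) + '/'
    FROM Chain AS c
    JOIN @User AS m ON m.[User_ID] = c.Next_ID
    WHERE CHARINDEX('/' + CAST(m.[User_ID] AS VARCHAR(20)) + '/', c.[Path]) = 0
)
SELECT Start_ID, [Path]
FROM Chain
WHERE Next_ID = Start_ID      -- the chain has come back to its starting user
OPTION (MAXRECURSION 0);
```

With these sample rows, only users 1 and 2 should be reported, since they are the ones on the cycle; users 3, 4 and 5 merely report into it. Walking a chain from every user can be expensive on a large table, so this is more of a one-off diagnostic than something to run routinely.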

However, you can also use a multi-statement UDF, as I demonstrated here: Avoid infinite loops. Part one

You can add the NOT IN () clause to the union to filter the loops.

Matijs · Answer 8 · 2009-07-28T13:52:58+0000

You need some way to keep your recursive query from adding a user ID that is already in the set. However, since subqueries and double references to the recursive table are not allowed (thanks van), you need another way to remove the users who are already on the list.

The solution is to use EXCEPT to remove these rows. This should work according to the manual: multiple recursive statements linked with union-type operators are allowed. Removing the users already on the list means that after a certain number of iterations the recursive result set comes back empty and the recursion stops.

    with UserTbl as -- Selects an employee and his subordinates.
    (
        select a.[User_ID], a.[Manager_ID] from [User] a WHERE [User_ID] = @UserID
        union all
        (
            select a.[User_ID], a.[Manager_ID]
            from [User] a
            join UserTbl b on (a.[Manager_ID]=b.[User_ID])
            where a.[User_ID] not in (select [User_ID] from UserTbl)
            EXCEPT
            select a.[User_ID], a.[Manager_ID] from UserTbl a
        )
    )
    select * from UserTbl;

Another option is to hard-code a level variable that stops the query after a fixed number of iterations, or to use the MAXRECURSION query hint, but I guess that is not what you want.

Shannon severance · Answer 9 · 2009-07-28T17:11:57+0000

I can think of two approaches.

1) Produce more rows than you want, but include a check to make sure the recursion does not go too deep. Then remove the duplicate user records.

2) Use a string to hold the users already visited. This is like the NOT IN subquery idea that did not work.

Approach 1:

    ; with TooMuchHierarchy as
    (
        select "User_ID", Manager_ID, 0 as Depth
        from "User"
        WHERE "User_ID" = @UserID
        union all
        select U."User_ID", U.Manager_ID, M.Depth + 1 as Depth
        from TooMuchHierarchy M
        inner join "User" U on U.Manager_ID = M."user_id"
        where Depth < 100 -- Warning MAGIC NUMBER!!
    )
    , AddMaxDepth as
    (
        select "User_ID", Manager_id, Depth
            , max(depth) over (partition by "User_ID") as MaxDepth
        from TooMuchHierarchy
    )
    select "user_id", Manager_Id
    from AddMaxDepth
    where Depth = MaxDepth

The where Depth < 100 is what keeps you from getting the maximum recursion error. Make the number smaller and fewer records are produced that then have to be thrown away. Make it too small and some employees will not be returned, so make sure it is at least as large as the depth of the org chart being stored. It is a bit of a maintenance nightmare as the company grows. If it needs to be bigger, add option (maxrecursion ... number ...) to the whole query to allow more recursion.

Approach 2:

    ; with Hierarchy as
    (
        select "User_ID", Manager_ID
            , '#' + cast("user_id" as varchar(max)) + '#' as user_id_list
        from "User"
        WHERE "User_ID" = @UserID
        union all
        select U."User_ID", U.Manager_ID
            , M.user_id_list + '#' + cast(U."user_id" as varchar(max)) + '#' as user_id_list
        from Hierarchy M
        inner join "User" U on U.Manager_ID = M."user_id"
        where user_id_list not like '%#' + cast(U."User_id" as varchar(max)) + '#%'
    )
    select "user_id", Manager_Id
    from Hierarchy

Myitchychin · Answer 10 · 2009-07-28T20:02:16+0000

This is the code I used on a project to walk hierarchical relationship trees up and down.

User-defined function to get the subordinates:

    CREATE FUNCTION fn_UserSubordinates(@User_ID INT)
    RETURNS @SubordinateUsers TABLE (User_ID INT, Distance INT)
    AS
    BEGIN
        IF @User_ID IS NULL RETURN

        INSERT INTO @SubordinateUsers (User_ID, Distance) VALUES ( @User_ID, 0)

        DECLARE @Distance INT, @Finished BIT
        SELECT @Distance = 1, @Finished = 0

        WHILE @Finished = 0
        BEGIN
            -- Add the direct reports of everyone found so far,
            -- skipping users that are already in the result table.
            INSERT INTO @SubordinateUsers
            SELECT S.User_ID, @Distance
            FROM Users AS S
            JOIN @SubordinateUsers AS C ON C.User_ID = S.Manager_ID
            LEFT JOIN @SubordinateUsers AS C2 ON C2.User_ID = S.User_ID
            WHERE C2.User_ID IS NULL

            IF @@RowCount = 0 SET @Finished = 1
            SET @Distance = @Distance + 1
        END

        RETURN
    END

User-defined function to get the managers:

    CREATE FUNCTION fn_UserManagers(@User_ID INT)
    RETURNS @User TABLE (User_ID INT, Distance INT)
    AS
    BEGIN
        IF @User_ID IS NULL RETURN

        DECLARE @Manager_ID INT
        -- Table name aligned with fn_UserSubordinates (the original read UserClasses).
        SELECT @Manager_ID = Manager_ID FROM Users WITH (NOLOCK) WHERE User_ID = @User_ID

        -- Recurse up the chain of managers. This does not guard against cycles.
        -- The original inserted into @UserClasses, which is never declared.
        INSERT INTO @User (User_ID, Distance)
        SELECT User_ID, Distance + 1 FROM dbo.fn_UserManagers(@Manager_ID)

        INSERT INTO @User (User_ID, Distance) VALUES (@User_ID, 0)
        RETURN
    END
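A possible way to call the subordinates function from the report query is sketched below. It assumes that `fn_UserSubordinates` has been created in the `dbo` schema and that the table is named `Users` as in the function body; the `@UserID` value is only an example.

```sql
-- Assumed example value; in the report this would be the selected boss.
DECLARE @UserID INT;
SET @UserID = 1;

-- One row per employee under @UserID (including @UserID at distance 0),
-- joined back to the table to pick up the other columns.
SELECT u.[User_ID], u.[Manager_ID], s.Distance
FROM dbo.fn_UserSubordinates(@UserID) AS s
JOIN Users AS u ON u.[User_ID] = s.[User_ID]
ORDER BY s.Distance, u.[User_ID];
```

Because the function only inserts users that are not yet in its result table, each employee should appear once even when the data contains a cycle.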
