I hope I understood correctly:
You start with type = "start" and go down the hierarchy, where out-data is the identifier of the next node. This line has an undefined depth and must end with node with type = "user".
In your second example, there are 2 Tasks with type = "user", but only one of them is referred to as out-data in a higher node chain.
My example will filter the second with an optional EXISTS clause.
The first CTE DerivedTable consists of a query, which you can also use in isolation. It will open full information in a tabular format.
The second CTE is recursive, starts from the beginning and crosses the chain. The Rank column is the order of the chains.
The third CTE adds the inverse rank since you seem to be only interested in the latter. You can get it through WHERE RevRank=1
DECLARE @process TABLE(ID INT IDENTITY, workflowXML XML); INSERT INTO @process(workflowXML) VALUES ('<process> <Event type="start" id="StartEvent_1"> <outgoing>SequenceFlow_0h5l5vu</outgoing> </Event> <Flow type="sequence" id="SequenceFlow_0h5l5vu" sourceRef="StartEvent_1" targetRef="Task_1qc93ha"/> <Flow type="sequence" id="SequenceFlow_120gi3p" sourceRef="Task_1qc93ha" targetRef="Task_0x1pjee"/> <Task type="service" id="Task_1qc93ha"> <incoming>SequenceFlow_0h5l5vu</incoming> <outgoing>SequenceFlow_120gi3p</outgoing> </Task> <Task type="user" id="Task_0x1pjee"> <incoming>SequenceFlow_120gi3p</incoming> </Task> </process>') ,('<process id="Process_1" isExecutable="false"> <Event type="start" id="StartEvent_142xowk"> <outgoing>SequenceFlow_03yocm5</outgoing> </Event> <Flow type="sequence" id="SequenceFlow_03yocm5" sourceRef="StartEvent_142xowk" targetRef="Task_12g1q69"/> <Task type="user" id="Task_0x1pjee"> <incoming>SequenceFlow_120gi3p</incoming> </Task> <Task type="user" id="Task_12g1q69"> <incoming>SequenceFlow_03yocm5</incoming> </Task> </process>');
is a request:
WITH DerivedTable AS ( SELECT prTbl.ID AS tblID ,nd.value('local-name(.)','nvarchar(max)') AS [Name] ,nd.value('@type','nvarchar(max)') AS [Type] ,nd.value('@id','nvarchar(max)') AS Id ,COALESCE(nd.value('@sourceRef','nvarchar(max)') ,nd.value('(incoming)[1]','nvarchar(max)')) AS [In] ,COALESCE(nd.value('@targetRef','nvarchar(max)') ,nd.value('(outgoing)[1]','nvarchar(max)')) AS [Out] FROM @process AS prTbl CROSS APPLY prTbl.workflowXML.nodes('process') AS A(pr) CROSS APPLY pr.nodes('*') AS B(nd) ) ,recCTE AS ( SELECT tblID,[Name],[Type],Id,[In],[Out],1 AS [Rank] FROM DerivedTable WHERE [Type]='start' UNION ALL SELECT x.tblID,x.[Name],x.[Type],x.Id,x.[In],x.[Out],r.[Rank]+1 FROM recCTE AS r INNER JOIN DerivedTable AS x ON x.[Id]=r.[Out] AND EXISTS(SELECT 1 FROM DerivedTable AS y WHERE y.tblID=x.tblID AND y.[Out]=x.[Id]) ) ,ReverseRank AS ( SELECT * ,ROW_NUMBER() OVER(PARTITION BY tblID ORDER BY [Rank] DESC) AS RevRank FROM recCTE ) SELECT * FROM ReverseRank ORDER BY tblID,[Rank]
Result (your expected result is RevRank = 1):
+-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | tblID | Rank | RevRank | Name | Type | Id | In | Out | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 1 | 1 | 5 | Event | start | StartEvent_1 | NULL | SequenceFlow_0h5l5vu | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 1 | 2 | 4 | Flow | sequence | SequenceFlow_0h5l5vu | StartEvent_1 | Task_1qc93ha | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 1 | 3 | 3 | Task | service | Task_1qc93ha | SequenceFlow_0h5l5vu | SequenceFlow_120gi3p | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 1 | 4 | 2 | Flow | sequence | SequenceFlow_120gi3p | Task_1qc93ha | Task_0x1pjee | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 1 | 5 | 1 | Task | user | Task_0x1pjee | SequenceFlow_120gi3p | NULL | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 2 | 1 | 3 | Event | start | StartEvent_142xowk | NULL | SequenceFlow_03yocm5 | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 2 | 2 | 2 | Flow | sequence | SequenceFlow_03yocm5 | StartEvent_142xowk | Task_12g1q69 | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+ | 2 | 3 | 1 | Task | user | Task_12g1q69 | SequenceFlow_03yocm5 | NULL | +-------+------+---------+-------+----------+----------------------+----------------------+----------------------+
UPDATE: Your comment
I checked my request with XML from your comment:
INSERT INTO @process(workflowXML) VALUES ('<process> <Event type="start" id="e1"> <outgoing>s1</outgoing> </Event> <Flow type="sequence" id="s1" sourceRef="e1" targetRef="t1" /> <Flow type="sequence" id="s3" sourceRef="t1" targetRef="t2" /> <Task type="user" id="t3"> <incoming>s2</incoming> </Task> <Task type="user" id="t1"> <incoming>s1</incoming> <outgoing>s3</outgoing> </Task> <Flow type="sequence" id="s2" sourceRef="t2" targetRef="t3" /> <Task type="service" id="t2"> <incoming>s3</incoming> <outgoing>s2</outgoing> </Task> </process>');
This is the result.
+-------+-------+----------+----+------+------+------+---------+ | tblID | Name | Type | Id | In | Out | Rank | RevRank | +-------+-------+----------+----+------+------+------+---------+ | 1 | Event | start | e1 | NULL | s1 | 1 | 7 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Flow | sequence | s1 | e1 | t1 | 2 | 6 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Task | user | t1 | s1 | s3 | 3 | 5 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Flow | sequence | s3 | t1 | t2 | 4 | 4 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Task | service | t2 | s3 | s2 | 5 | 3 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Flow | sequence | s2 | t2 | t3 | 6 | 2 | +-------+-------+----------+----+------+------+------+---------+ | 1 | Task | user | t3 | s2 | NULL | 7 | 1 | +-------+-------+----------+----+------+------+------+---------+
If I understand the logic correctly, my query works very well:
- event id = e1 points to s1
- Thread s1 points to t1
- Task t1 points to s3
- Stream s3 points to t2
- Task t2 points to s2
- Stream s2 points to t3
- Task t3 is the end
The only thing I see differently is the fact that Task t1 was already type = "user". If you want - in any case - the highest task of the user, you can remove ReverseRank -CTE and set the final SELECT as
SELECT t.* FROM recCTE AS t WHERE t.[Rank]<=ISNULL((SELECT MIN(x.[Rank]) FROM recCTE AS x WHERE x.tblID=t.tblID AND x.[Type]='user' AND x.[Name]='Task'),999) ORDER BY t.tblID,t.[Rank]
Now task t1 will be the last result, since all subsequent ranks are filtered out.