I think this problem is related to the query optimization that Azure Data Lake Analytics performs, but let's see.
I have two separate queries (TVFs) that do aggregation, and a final query that joins the two together for the final results. So:
Table > Header Query
Table > Detail Query
Result = Header Query + Detail Query
To check all the logic, I ran the two secondary queries separately with a filter, saved the results to files, and then used those saved files as the sources for the final query. These are the total durations (in minutes):
Header Query 1.4 (408 rows)
Detail Query 0.9 (3298 rows)
Final Query 0.9 (408 rows)
So I know that, at most, I can get my result in about 3.5 minutes. However, I do not want to create new intermediary files. I want to use the TVFs directly in the final query.
With the TVFs in the final query, the job reaches approximately 97% progress in about 1.5 minutes. But then all hell breaks loose! The last node is a collection with 2500 vertices, and its compute time is 16 minutes. So my question is ... WHY?
Is this a case where I don't understand some fundamental concept about how Azure works?
Can anyone explain what is going on? Any help is appreciated.
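For reference, the file-based check described above can be sketched roughly as follows. This is only an illustration: the path `/tmp/header.csv`, the reduced column list, and the column types are assumptions, not taken from the actual jobs.

```
// Job 1: materialize the header TVF result to a file.
// The output path is a placeholder.
@Header =
    SELECT [CTNNumber], [CTNCycleNo], [SeqStart], [SeqEnd]
    FROM [Play].[getCycles3]("") AS X;

OUTPUT @Header
TO "/tmp/header.csv"
USING Outputters.Csv(outputHeader : true);

// Job 2 (a separate script): read the file back instead of calling the TVF.
// The column types below are assumed.
@Header =
    EXTRACT [CTNNumber] string,
            [CTNCycleNo] int,
            [SeqStart] int,
            [SeqEnd] int
    FROM "/tmp/header.csv"
    USING Extractors.Csv(skipFirstNRows : 1);
```

The detail query would follow the same pattern with `[Play].[getRaw]`, and the final join then reads from the two extracted rowsets.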
Final query:
@Header =
    SELECT [CTNNumber],
           [CTNCycleNo],
           [SeqStart],
           [SeqEnd],
           [StartUTC],
           [EndUTC],
           [StartLoc],
           [StartType],
           [EndLoc],
           [EndType],
           [Start Step],
           [Start Ctn Status],
           [Start Fill Status],
           [EndStep],
           [End Ctn Status],
           [End Fill Status]
    FROM [Play].[getCycles3]
         ("") AS X;
@Detail =
    SELECT [CTNNumber],
           [SeqNo] AS [SeqNo],
           [LocationType],
           [LocationID],
           [BizstepDescription],
           [ContainerStatus],
           [FillStatus],
           [UTCTimeStampforEvent]
    FROM [Play].[getRaw]
         ("") AS Z;
@result =
    SELECT
        H.[CTNNumber], H.[CTNCycleNo], H.[SeqStart], H.[SeqEnd]
        ,COUNT([D].[SeqNo]) AS [SeqCount]
        //, COUNT(DISTINCT [LocationID]) AS [
    FROM
        @Header AS [H]
    INNER JOIN
        @Detail AS [D]
    ON
        [H].[CTNNumber] == [D].[CTNNumber]
    WHERE
        [D].[SeqNo] >= [H].[SeqStart] AND
        [D].[SeqNo] <= [H].[SeqEnd]
    GROUP BY
        H.[CTNNumber], H.[CTNCycleNo], H.[SeqStart], H.[SeqEnd]
    ;
