SQL Challenge / Puzzle: stack trace set. How to find the top element at every moment in time?

  • In my real life, nested range merging was used. I drew a few sketches and then saw the similarities between the ranges starting and ending for PUSH and POP operations. I realized that solving this problem would also solve the original problem.

  • The op column may be removed from the question. When val is NULL, then this is a POP operation, otherwise it is a PUSH operation.

Puzzle

The stack_trace table contains the following columns:

  • I am an integer value that represents a point in time.
  • op - 2 possible actions: I ("in" / "push") and O ("out" / "pop").
  • val - the value entered by the "in" / "push" or NULL operation for the "out" / "pop" operation.

    The goal is to find what was the value at the top of the stack at each point in time ( i ).

eg.

(NULL values ​​are presented here as empty spaces)

Data:

i op val -- -- -- 1 IA 2 IB 3 O 4 IC 5 O 6 O 

desired result:

 i top_of_stack_val -- ---------------- 1 A 2 B 3 A 4 C 5 A 6 

Requirements

  • The solution should be the only SQL query (subqueries in order).
  • Only the following sentences are allowed: SELECT , FROM , WHERE , GROUP , HAVING , ORDER BY .
  • Using WITH (CTE - Common Table Expression) is not allowed .
  • Using T-SQL, PL / SQL, etc. not allowed .
  • The use of UDF (Custom Functions) is not permitted .
  • The use of variables is not allowed .

Sample data

 create table stack_trace ( i int ,op char(1) ,val char(1) ) ; insert into stack_trace (i,op,val) values (1,'I','A'); insert into stack_trace (i,op,val) values (2,'I','B'); insert into stack_trace (i,op,val) values (3,'I','C'); insert into stack_trace (i,op,val) values (4,'I','D'); insert into stack_trace (i,op,val) values (5,'I','E'); insert into stack_trace (i,op) values (6,'O'); insert into stack_trace (i,op) values (7,'O'); insert into stack_trace (i,op) values (8,'O'); insert into stack_trace (i,op,val) values (9,'I','F'); insert into stack_trace (i,op) values (10,'O'); insert into stack_trace (i,op,val) values (11,'I','G'); insert into stack_trace (i,op,val) values (12,'I','H'); insert into stack_trace (i,op) values (13,'O'); insert into stack_trace (i,op) values (14,'O'); insert into stack_trace (i,op,val) values (15,'I','I'); insert into stack_trace (i,op,val) values (16,'I','J'); insert into stack_trace (i,op,val) values (17,'I','K'); insert into stack_trace (i,op,val) values (18,'I','L'); insert into stack_trace (i,op,val) values (19,'I','M'); insert into stack_trace (i,op) values (20,'O'); insert into stack_trace (i,op,val) values (21,'I','N'); insert into stack_trace (i,op) values (22,'O'); insert into stack_trace (i,op,val) values (23,'I','O'); insert into stack_trace (i,op) values (24,'O'); insert into stack_trace (i,op,val) values (25,'I','P'); insert into stack_trace (i,op) values (26,'O'); insert into stack_trace (i,op) values (27,'O'); insert into stack_trace (i,op,val) values (28,'I','Q'); insert into stack_trace (i,op,val) values (29,'I','R'); insert into stack_trace (i,op) values (30,'O'); insert into stack_trace (i,op) values (31,'O'); insert into stack_trace (i,op) values (32,'O'); insert into stack_trace (i,op) values (33,'O'); insert into stack_trace (i,op) values (34,'O'); insert into stack_trace (i,op) values (35,'O'); insert into stack_trace (i,op,val) values (36,'I','S'); insert into stack_trace (i,op) values (37,'O'); insert into stack_trace (i,op) values (38,'O'); insert into stack_trace (i,op,val) values (39,'I','T'); insert into stack_trace (i,op,val) values (40,'I','U'); insert into stack_trace (i,op) values (41,'O'); insert into stack_trace (i,op,val) values (42,'I','V'); insert into stack_trace (i,op,val) values (43,'I','W'); insert into stack_trace (i,op,val) values (44,'I','X'); insert into stack_trace (i,op) values (45,'O'); insert into stack_trace (i,op) values (46,'O'); insert into stack_trace (i,op,val) values (47,'I','Y'); insert into stack_trace (i,op) values (48,'O'); insert into stack_trace (i,op) values (49,'O'); insert into stack_trace (i,op,val) values (50,'I','Z'); insert into stack_trace (i,op) values (51,'O'); insert into stack_trace (i,op) values (52,'O'); 

Necessary Results

 i top_of_stack_val -- ---------------- 1 A 2 B 3 C 4 D 5 E 6 D 7 C 8 B 9 F 10 B 11 G 12 H 13 G 14 B 15 I 16 J 17 K 18 L 19 M 20 L 21 N 22 L 23 O 24 L 25 P 26 L 27 K 28 Q 29 R 30 Q 31 K 32 J 33 I 34 B 35 A 36 S 37 A 38 39 T 40 U 41 T 42 V 43 W 44 X 45 W 46 V 47 Y 48 V 49 T 50 Z 51 T 52 
+5
source share
5 answers

Personally, I doubt that you will eventually find SQL that you can only use in all SQL Server, Teradata, Postgres, and Oracle, and which has acceptable performance if there is a table at all.

The SQL Server ( demo ) solution will be as follows:

 SELECT i, SUBSTRING(MAX(FORMAT(i, 'D10') + val) OVER (PARTITION BY Pos ORDER BY i ROWS UNBOUNDED PRECEDING), 11, 8000) FROM (SELECT st.*, sum(CASE WHEN op = 'I' THEN 1 ELSE -1 END) OVER (ORDER BY i ROWS UNBOUNDED PRECEDING) AS pos FROM stack_trace st) t1 ORDER BY i; 
+3
source

This is a good puzzle.

As the main Teradata DBMS, I wrote a solution for it using analytical functions (TD14.10 +):

 SELECT dt.*, -- find the last item in the stack with the same position Last_Value(val IGNORE NULLS) Over (PARTITION BY pos ORDER BY i) AS top_of_stack_val FROM ( SELECT st.*, -- calculate the number of items in the stack Sum(CASE WHEN op = 'I' THEN 1 ELSE -1 end) Over (ORDER BY i ROWS Unbounded Preceding) AS pos FROM stack_trace AS st ) AS dt; 

This solution also works for Oracle, but PostgreSQL and SQL Server do not support the IGNORE NULLS parameter for LAST_VALUE , and emulating it is quite complicated, for example, see Itzk Ben-Gan. The last non-NULL puzzle

Edit: Actually this is not that complex, I forgot Itzik 2nd solution, the old piggyback trick ;-)

Martin Smith's approach will work for all four DBMSs.

+5
source

Although this requires an additional step -

Generic solution for Hive , Teradata strong>, Oracle , SQL Server and PostgreSQL . >

 select si ,min (s.val) over ( partition by s.depth ,s.depth_val_seq ) as top_of_stack_val from (select si ,s.val ,s.depth ,count (s.val) over ( partition by s.depth order by si rows between unbounded preceding and current row ) as depth_val_seq from (select si ,s.val ,sum (case s.op when 'I' then 1 else -1 end) over ( order by si rows between unbounded preceding and current row ) as depth from stack_trace s ) s ) s order by i ; 
+3
source

This is a really interesting problem. I would do this to keep track of each item position on the stack. You can do this using the total amount:

 select st.*, sum(case when op = 'I' then 1 else -1 end) over (order by i) as pos from stack_trace st; 

Alas, at the moment, I think you need a rather complicated join or subquery to figure out the most recent value that relates to pos . Here is one way:

 with st as ( select st.*, sum(case when op = 'I' then 1 else -1 end) over (order by i) as pos from stack_trace st ) select st.*, (select val from st st2 where st2.i <= st.id and st2.pos = st.pos and st2.val is not null order by i desc fetch first 1 row only ) as top_of_stack_val from st; 
+1
source

Teradata h1>

 select si ,first_value (s.val) over ( partition by s.depth order by si reset when s.op = 'I' ) as top_of_stack_val from (select si ,s.val ,s.op ,sum (case s.op when 'I' then 1 else -1 end) over ( order by si rows unbounded preceding ) as depth from stack_trace s ) s order by i ; 
0
source

Source: https://habr.com/ru/post/1257901/


All Articles