SQL - find missing int values in a mostly ordered sequential series

I run a message-based system in which a sequence of unique integer identifiers will be fully represented at the end of the day, although they will not necessarily arrive in order.

I am looking for help finding the missing identifiers in this series using SQL. If my column values look like the ones below, how can I find which identifiers are missing from the sequence, in this case 6?

The sequence starts and ends at an arbitrary point every day, so the min and max will differ for each run. Coming from a Perl background, my first instinct was to throw a regular expression at it.

 ids
 1
 2
 3
 5
 4
 7
 9
 8
 10

Help would be greatly appreciated.

Edit: We run Oracle.

Edit2: Thanks everyone. I will work through your solutions next week in the office.

Edit3: For now I have settled on something like the query below. ORIG_ID is the original id column, and MY_TABLE is the source table. Looking more closely at my data, the strings contain a variety of cases beyond plain numbers. In some cases there is a prefix or suffix of non-numeric characters. In others, dashes or spaces are mixed into the numeric identifier. In addition, identifiers periodically appear several times, so I included DISTINCT.

I would appreciate further input, especially regarding the best way to strip out the non-numeric characters.

 SELECT
   CASE
     WHEN NUMERIC_ID + 1 = NEXT_ID - 1 THEN TO_CHAR( NUMERIC_ID + 1 )
     ELSE TO_CHAR( NUMERIC_ID + 1 ) || '-' || TO_CHAR( NEXT_ID - 1 )
   END MISSING_SEQUENCES
 FROM (
   SELECT
     NUMERIC_ID,
     LEAD (NUMERIC_ID, 1, NULL) OVER ( ORDER BY NUMERIC_ID ASC ) AS NEXT_ID
   FROM (
     SELECT DISTINCT
       TO_NUMBER( REGEXP_REPLACE(ORIG_ID,'[^[:digit:]]','') ) AS NUMERIC_ID
     FROM MY_TABLE
   )
 )
 WHERE NEXT_ID != NUMERIC_ID + 1
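For anyone who wants to prototype the clean-up step outside the database, here is a minimal Python sketch of the same idea as the innermost SELECT DISTINCT: strip everything that is not a digit, convert to a number, and de-duplicate. The function name `normalize_ids` and the sample strings are made up for illustration; they are not from the original data.

```python
import re

def normalize_ids(raw_ids):
    """Strip non-digit characters, skip values with no digits, de-duplicate, sort."""
    numeric = set()
    for raw in raw_ids:
        digits = re.sub(r"[^0-9]", "", raw)  # same intent as '[^[:digit:]]'
        if digits:  # a string with no digits at all is skipped
            numeric.add(int(digits))
    return sorted(numeric)

# Hypothetical raw values: prefixes, suffixes, embedded spaces/dashes, duplicates.
sample = ["A-001", "2", "00 3", "5x", "4", "5", "7", "n/a"]
print(normalize_ids(sample))  # [1, 2, 3, 4, 5, 7]
```

One thing this makes easy to see: blindly removing separators can merge characters that were never meant to form one number. A value like "12-3" becomes 123, so it is worth checking whether dashes in the real data separate an id from something else.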
6 answers

I've been there.

FOR ORACLE:

I found this extremely useful query on the web a while ago and saved it. I don't remember the site now, but you can search for "GAP ANALYSIS" on Google.

 SELECT
   CASE
     WHEN ids + 1 = lead_no - 1 THEN TO_CHAR (ids + 1)
     ELSE TO_CHAR (ids + 1) || '-' || TO_CHAR (lead_no - 1)
   END Missing_track_no
 FROM (
   SELECT ids,
          LEAD (ids, 1, NULL) OVER (ORDER BY ids ASC) lead_no
   FROM YOURTABLE
 )
 WHERE lead_no != ids + 1

Here is the result:

 MISSING_TRACK_NO
 ----------------
 6

If there were several gaps, say 2, 6, 7 and 9 were missing, then the result would be:

 MISSING_TRACK_NO
 ----------------
 2
 6-7
 9
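To make the logic of that query easier to follow, here is a small Python sketch of the same technique: sort the ids, pair each one with its successor (the role LEAD plays), and emit either a single number or a "first-last" range wherever the successor is not exactly one higher. The function name `missing_ranges` is hypothetical, and unlike the query above the sketch de-duplicates its input first.

```python
def missing_ranges(ids):
    """Return gaps as strings: '6' for a single missing id, '6-7' for a run."""
    ordered = sorted(set(ids))
    ranges = []
    # zip(ordered, ordered[1:]) pairs each id with the next one, like LEAD(ids, 1).
    for current, following in zip(ordered, ordered[1:]):
        if following != current + 1:
            first, last = current + 1, following - 1
            ranges.append(str(first) if first == last else f"{first}-{last}")
    return ranges

print(missing_ranges([1, 2, 3, 5, 4, 7, 9, 8, 10]))  # ['6']
print(missing_ranges([1, 3, 4, 5, 8, 10]))           # ['2', '6-7', '9']
```

Like the SQL, this only looks between the smallest and largest id that actually arrived, so it cannot tell you about ids missing before the first or after the last one.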

This is sometimes called an exclusion join. That is, attempt a join and return only the rows for which there is no match.

 SELECT t1.value-1
 FROM ThisTable AS t1
 LEFT OUTER JOIN ThisTable AS t2 ON t1.value = t2.value+1
 WHERE t2.value IS NULL

Note that this will always report at least one row, which will be one less than the MIN value.

Also, if there is a gap of two or more numbers, it will report only one missing value for that gap (the last one).
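Both caveats are easy to reproduce. The sketch below runs the exclusion join against an in-memory SQLite database through Python's standard `sqlite3` module; SQLite is used here only because it needs no server, and the single-column table layout and the sample values are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout: one integer column named "value".
conn.execute("CREATE TABLE ThisTable (value INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO ThisTable (value) VALUES (?)",
    [(v,) for v in (1, 2, 4, 6, 7, 8, 11)],  # missing: 3, 5, 9, 10
)

rows = conn.execute(
    """
    SELECT t1.value - 1
    FROM ThisTable AS t1
    LEFT OUTER JOIN ThisTable AS t2 ON t1.value = t2.value + 1
    WHERE t2.value IS NULL
    ORDER BY t1.value
    """
).fetchall()
reported = [row[0] for row in rows]
print(reported)  # [0, 3, 5, 10]
```

The 0 is the "one less than MIN" row, and for the two-number gap 9 and 10 only 10 shows up, so this form is best suited to data where gaps are known to be a single id wide.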


You did not specify your DBMS, so I assume PostgreSQL:

 select aid as missing_id
 from generate_series(
        (select min(id) from message),
        (select max(id) from message)) as aid
 left join message m on m.id = aid
 where m.id is null;

This will report every missing value in the sequence between the minimum and maximum id in your table (including gaps larger than one):

 psql (9.1.1)
 Type "help" for help.

 postgres=> select * from message;
  id
 ----
   1
   2
   3
   4
   5
   7
   8
   9
  11
  14
 (10 rows)


 postgres=> select aid as missing_id
 postgres-> from generate_series((select min(id) from message), (select max(id) from message)) as aid
 postgres-> left join message m on m.id = aid
 postgres-> where m.id is null;
  missing_id
 ------------
           6
          10
          12
          13
 (4 rows)
 postgres=>
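The idea behind generate_series is simply "build the full range from min to max, then keep whatever is not in the table". A minimal Python sketch of that set difference, using the same ten ids as the session above (the function name `missing_ids` is made up for illustration):

```python
def missing_ids(ids):
    """Every integer between min(ids) and max(ids) that is not present in ids."""
    present = set(ids)
    if not present:
        return []
    # range(min, max + 1) plays the role of generate_series(min, max).
    return [n for n in range(min(present), max(present) + 1) if n not in present]

print(missing_ids([1, 2, 3, 4, 5, 7, 8, 9, 11, 14]))  # [6, 10, 12, 13]
```

Because the full range is materialized, this lists each missing id individually rather than as ranges, which is convenient for short gaps but can get large if min and max are far apart.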

I tried it in MySQL and it worked.

 mysql> select * from sequence;
 +--------+
 | number |
 +--------+
 |      1 |
 |      2 |
 |      4 |
 |      6 |
 |      7 |
 |      8 |
 +--------+
 6 rows in set (0.00 sec)

 mysql> SELECT t1.number - 1 FROM sequence AS t1 LEFT OUTER JOIN sequence AS t2 ON t1.number = t2.number +1 WHERE t2.number IS NULL;
 +---------------+
 | t1.number - 1 |
 +---------------+
 |             0 |
 |             3 |
 |             5 |
 +---------------+
 3 rows in set (0.00 sec)
 SET search_path='tmp';

 DROP table tmp.table_name CASCADE;
 CREATE table tmp.table_name ( num INTEGER NOT NULL PRIMARY KEY);

 -- make some data
 INSERT INTO tmp.table_name(num) SELECT generate_series(1,20);

 -- create some gaps
 DELETE FROM tmp.table_name WHERE random() < 0.3 ;

 SELECT * FROM table_name;

 -- EXPLAIN ANALYZE
 WITH zbot AS (
     SELECT 1+tn.num AS num
     FROM table_name tn
     WHERE NOT EXISTS (
         SELECT * FROM table_name nx
         WHERE nx.num = tn.num+1
         )
     )
 , ztop AS (
     SELECT -1+tn.num AS num
     FROM table_name tn
     WHERE NOT EXISTS (
         SELECT * FROM table_name nx
         WHERE nx.num = tn.num-1
         )
     )
 SELECT zbot.num AS bot
     ,ztop.num AS top
 FROM zbot, ztop
 WHERE zbot.num <= ztop.num
 AND NOT EXISTS ( SELECT *
     FROM table_name nx
     WHERE nx.num >= zbot.num
     AND nx.num <= ztop.num
     )
 ORDER BY bot,top
     ;

Result:

 CREATE TABLE
 INSERT 0 20
 DELETE 9
  num
 -----
    1
    2
    6
    7
   10
   11
   13
   14
   15
   18
   19
 (11 rows)

  bot | top
 -----+-----
    3 |   5
    8 |   9
   12 |  12
   16 |  17
 (4 rows)

Note: recursive CTE is also possible (and probably shorter).

UPDATE: here is the recursive CTE:

 WITH RECURSIVE tree AS (
     SELECT 1+num AS num
     FROM table_name t0
     UNION
     SELECT 1+num
     FROM tree tt
     WHERE EXISTS ( SELECT *
         FROM table_name xt
         WHERE xt.num > tt.num
         )
     )
 SELECT * FROM tree
 WHERE NOT EXISTS (
     SELECT *
     FROM table_name nx
     WHERE nx.num = tree.num
     )
 ORDER BY num
     ;

Results: (same data)

  num
 -----
    3
    4
    5
    8
    9
   12
   16
   17
   20
 (9 rows)
 select student_key, next_student_key
 from (
   select student_key,
          -- alias must match the name used by the outer query
          lead(student_key) over (order by student_key) next_student_key
   from student_table
 )
 where student_key <> next_student_key - 1;

Source: https://habr.com/ru/post/913409/

