SQL query to find missing sequential numbers

I have data in an SQL table (shown in a screenshot that is not reproduced here).

I need to query the data so that I get a list of the missing "familyid" values for each employee.

For example, for employee 1021 I should get the identifiers missing from the sequence, 2 and 5, and for employee 1027 I should get the missing numbers 1 and 6.

Any pointers on how to write such a query?

Appreciate any help.

+6
5 answers

Find the first missing value

I would use the ROW_NUMBER function to assign the "correct" sequence number, assuming that the sequence restarts for each employee and starts at 0:

    SELECT e.id, e.name, e.employee_number, e.relation, e.familyid,
           ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e

Then I would filter the result set to include only the rows whose familyid does not match the expected sequence number:

    SELECT *
    FROM (
        SELECT e.id, e.name, e.employee_number, e.relation, e.familyid,
               ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
    WHERE a.familyid <> a.sequenceid

From there it is easy to group by employee_number and find the first missing sequence number for each employee:

    SELECT a.employee_number, MIN(a.sequenceid) AS first_missing
    FROM (
        SELECT e.id, e.name, e.employee_number, e.relation, e.familyid,
               ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
    WHERE a.familyid <> a.sequenceid
    GROUP BY a.employee_number
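
To try the "first missing value" query without a SQL Server instance, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The sample rows are hypothetical (the original table is not shown); they are chosen so that employee 1021 lacks 2 and 5 and employee 1027 lacks 1 and 6, with familyid assumed to start at 0. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data: (employee_number, familyid); familyid starts at 0.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO employee_members VALUES (?, ?)", ROWS)

FIRST_MISSING_SQL = """
SELECT a.employee_number, MIN(a.sequenceid) AS first_missing
FROM (
    SELECT e.employee_number,
           e.familyid,
           -- expected 0-based position of this row within its employee
           ROW_NUMBER() OVER (PARTITION BY e.employee_number
                              ORDER BY e.familyid) - 1 AS sequenceid
    FROM employee_members e
) a
WHERE a.familyid <> a.sequenceid
GROUP BY a.employee_number
ORDER BY a.employee_number
"""

first_missing = conn.execute(FIRST_MISSING_SQL).fetchall()
print(first_missing)  # [(1021, 2), (1027, 1)]
```

Every row after the first gap has a familyid larger than its row number, so the smallest mismatching row number is the first missing value.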

Find all missing values

Extending the previous query, we can find a missing value each time the difference between familyid and sequenceid changes:

    -- Warning: this is totally untested :-/
    -- MIN(familyid) - 1 is the value just before each run resumes;
    -- a gap wider than one number yields only its last missing value.
    SELECT b.employee_number, MIN(b.familyid) - 1 AS missing
    FROM (
        SELECT a.*, a.familyid - a.sequenceid AS displacement
        FROM (
            SELECT e.*,
                   ROW_NUMBER() OVER (PARTITION BY e.employee_number ORDER BY e.familyid) - 1 AS sequenceid
            FROM employee_members e
        ) a
    ) b
    WHERE b.displacement <> 0
    GROUP BY b.employee_number, b.displacement
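
The displacement idea can be checked with a small sketch using Python's sqlite3 module. The data is hypothetical (familyid assumed to start at 0), and this version reports MIN(familyid) - 1 for each displacement group, which is the value immediately before the point where the sequence resumes. A gap wider than one number therefore yields only its last missing value.

```python
import sqlite3

# Hypothetical sample data: (employee_number, familyid); familyid starts at 0.
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO employee_members VALUES (?, ?)", ROWS)

GAPS_SQL = """
SELECT b.employee_number, MIN(b.familyid) - 1 AS missing
FROM (
    SELECT a.*, a.familyid - a.sequenceid AS displacement
    FROM (
        SELECT e.employee_number,
               e.familyid,
               ROW_NUMBER() OVER (PARTITION BY e.employee_number
                                  ORDER BY e.familyid) - 1 AS sequenceid
        FROM employee_members e
    ) a
) b
WHERE b.displacement <> 0          -- rows located after at least one gap
GROUP BY b.employee_number, b.displacement
ORDER BY b.employee_number, missing
"""

gaps = conn.execute(GAPS_SQL).fetchall()
print(gaps)  # [(1021, 2), (1021, 5), (1027, 1), (1027, 6)]
```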
+3

Here is one approach. Calculate the maximum family ID for each employee. Then join that to a list of numbers running up to the maximum family ID. The result has one row for each employee and each expected family ID.

Left outer join this to the source data on employee and familyid. The rows with no match are the missing values:

    with nums as (
        select 1 as n
        union all
        select n + 1 from nums where n < 20
    )
    select en.employee, n.n as MissingFamilyId
    from (select employee, min(familyid) as minfi, max(familyid) as maxfi
          from t
          group by employee
         ) en
    join nums n on n.n <= en.maxfi
    left outer join t on t.employee = en.employee and t.familyid = n.n
    where t.employee is null;

Note that this will not work if the missing familyid is the last number in the sequence, because nothing in the data marks where the sequence should end. But this may be the best you can do with your data structure.

The query above also assumes that there are no more than 20 family members.
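
As a rough, runnable illustration of the numbers-list approach, the sketch below uses Python's sqlite3 module. The table name t and its columns follow the query above; the rows are hypothetical. SQLite spells the recursive CTE as WITH RECURSIVE, whereas SQL Server uses plain WITH.

```python
import sqlite3

# Hypothetical sample data: (employee, familyid).
ROWS = [(1021, f) for f in (0, 1, 3, 4, 6)] + [(1027, f) for f in (0, 2, 3, 4, 5, 7)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (employee INTEGER, familyid INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)

MISSING_SQL = """
WITH RECURSIVE nums(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM nums WHERE n < 20   -- cap of 20 family members
)
SELECT en.employee, n.n AS missing_familyid
FROM (SELECT employee, MAX(familyid) AS maxfi
      FROM t
      GROUP BY employee) en
JOIN nums n ON n.n <= en.maxfi            -- every expected familyid
LEFT JOIN t ON t.employee = en.employee AND t.familyid = n.n
WHERE t.employee IS NULL                  -- expected but not present
ORDER BY en.employee, n.n
"""

missing = conn.execute(MISSING_SQL).fetchall()
print(missing)  # [(1021, 2), (1021, 5), (1027, 1), (1027, 6)]
```

Unlike the approaches that compare neighbouring rows, this one reports every number inside a gap, not just one per gap.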

+3

This will work: select all the dependents and left join each one to the row with the preceding familyid. If that row does not exist, output the result:

    SELECT 'Missing Prior', t1.*
    FROM employee_members t1
    LEFT JOIN employee_members t2
        ON t1.employee_number = t2.employee_number
       AND (t1.familyid - 1) = t2.familyid
    WHERE t2.employee_number IS NULL
      AND t1.relation = 'Dependent'

Another version that shows you the missing number:

    SELECT t1.employee_number, t1.familyid - 1 AS Missing_Member
    FROM employee_members t1
    LEFT JOIN employee_members t2
        ON t1.employee_number = t2.employee_number
       AND (t1.familyid - 1) = t2.familyid
    WHERE t2.employee_number IS NULL
      AND t1.relation = 'Dependent'
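
A quick way to see the self-join at work is the following sketch with Python's sqlite3 module. The rows are hypothetical, and so is the 'Employee' relation value used for familyid 0; only 'Dependent' appears in the query above.

```python
import sqlite3

# Hypothetical sample data; familyid 0 is assumed to be the employee's own row.
FAMILY = {1021: (0, 1, 3, 4, 6), 1027: (0, 2, 3, 4, 5, 7)}
ROWS = [(emp, fid, "Employee" if fid == 0 else "Dependent")
        for emp, fids in FAMILY.items() for fid in fids]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_members (employee_number INTEGER, familyid INTEGER, relation TEXT)"
)
conn.executemany("INSERT INTO employee_members VALUES (?, ?, ?)", ROWS)

MISSING_PRIOR_SQL = """
SELECT t1.employee_number, t1.familyid - 1 AS missing_member
FROM employee_members t1
LEFT JOIN employee_members t2
       ON t1.employee_number = t2.employee_number
      AND t1.familyid - 1 = t2.familyid      -- look for the preceding row
WHERE t2.employee_number IS NULL             -- no preceding row found
  AND t1.relation = 'Dependent'
ORDER BY t1.employee_number, missing_member
"""

missing_prior = conn.execute(MISSING_PRIOR_SQL).fetchall()
print(missing_prior)  # [(1021, 2), (1021, 5), (1027, 1), (1027, 6)]
```

Because each row is compared only with its immediate predecessor, a gap of several consecutive numbers is reported once, as its last missing value.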
+2

Another solution: create a table holding all possible values of the sequence (you can use an IDENTITY column for this). Then left join it to the source table and keep the rows where the source table side is null.

    DECLARE @Seq TABLE (id INT IDENTITY(1, 1))
    DECLARE @iter INT = 1

    -- Fill @Seq with 1..MAX(id) using the identity column
    WHILE @iter <= ( SELECT MAX([your ID column]) FROM [Offending Table] )
    BEGIN
        INSERT @Seq DEFAULT VALUES
        SET @iter = @iter + 1
    END

    -- Sequence values with no matching row are the missing IDs
    SELECT s.id
    FROM @Seq s
    LEFT JOIN [Offending Table] ot ON s.id = ot.[your ID column]
    WHERE ot.[your ID column] IS NULL
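
The same sequence-table idea can be sketched with Python's sqlite3 module. The table names offending and seq and the sample IDs are hypothetical stand-ins for the placeholders above, and the WHILE loop is replaced by a bulk insert from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with gaps at 3, 5 and 6.
conn.execute("CREATE TABLE offending (id INTEGER)")
conn.executemany("INSERT INTO offending VALUES (?)", [(i,) for i in (1, 2, 4, 7)])

# Build the full sequence 1..MAX(id) in a helper table.
max_id = conn.execute("SELECT MAX(id) FROM offending").fetchone()[0]
conn.execute("CREATE TABLE seq (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO seq VALUES (?)", [(i,) for i in range(1, max_id + 1)])

# Sequence values without a matching source row are the missing IDs.
missing_ids = [row[0] for row in conn.execute(
    """
    SELECT s.id
    FROM seq s
    LEFT JOIN offending ot ON s.id = ot.id
    WHERE ot.id IS NULL
    ORDER BY s.id
    """
)]
print(missing_ids)  # [3, 5, 6]
```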
+1

This query returns the list of missing "familyid" values for each employee using a recursive CTE.

QUERY:

    WITH emp_grp (EmployeeID, MaxFamilyID) AS (
        SELECT e2.EmployeeID, MAX(e2.FamilyID) MaxFamilyID
        FROM employee_number e2
        GROUP BY e2.EmployeeID
    ),
    emp_mem AS (
        SELECT EmployeeID, 0 AS FamilyID, MaxFamilyID
        FROM emp_grp
        UNION ALL
        SELECT EmployeeID, FamilyID + 1 AS FamilyID, MaxFamilyID
        FROM emp_mem
        WHERE emp_mem.FamilyID < MaxFamilyID
    )
    SELECT emp_mem.EmployeeID, emp_mem.FamilyID
    FROM emp_mem
    LEFT JOIN employee_number emp_num
        ON emp_mem.EmployeeID = emp_num.EmployeeID
       AND emp_mem.FamilyID = emp_num.FamilyID
    WHERE emp_num.EmployeeID IS NULL
    ORDER BY emp_mem.EmployeeID, emp_mem.FamilyID
    OPTION (MAXRECURSION 32767)

OUTPUT:

    EmployeeID  FamilyID
    ----------- -----------
    1021        2
    1021        5
    1027        1
    1027        6
0

Source: https://habr.com/ru/post/948848/

