Oracle SQL query to compare male and female salaries for each job in the company

returns one row for each Job_ID with the following columns:

β€’ J ob ID

β€’ Average length of service for women in this job in years rounded to the tenth of a year

β€’ The average salary of women for this job

β€’ Average length of service for men in this job in years rounded to the tenth of a year

β€’ The average salary of men for this job

β€’ The difference in the average salaries of men and women for this job, so a positive number means that the average salary of women is higher, and a negative number means that the average salary of men is higher.

Table

HR.EMPLOYEES

Name Null Type -------------- -------- ------------ EMPLOYEE_ID NOT NULL NUMBER(6) FIRST_NAME VARCHAR2(20) LAST_NAME NOT NULL VARCHAR2(25) EMAIL NOT NULL VARCHAR2(25) PHONE_NUMBER VARCHAR2(20) HIRE_DATE NOT NULL DATE JOB_ID NOT NULL VARCHAR2(10) SALARY NUMBER(8,2) COMMISSION_PCT NUMBER(2,2) MANAGER_ID NUMBER(6) DEPARTMENT_ID NUMBER(4) GENDER CHAR(1) 

What I have so far , but this returns a row M and F for each job_id I need columns M and F

 SELECT gender, job_id, ROUND(AVG(salary),0) as avg_job_salary, (SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'M') AS avg_m_salary, (SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'F') AS avg_f_salary, ROUND(AVG(days_of_svc/365),1) AS avg_years_svc FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc FROM hr.employees) GROUP BY job_id,gender ORDER BY job_id, gender; 

or version 2

 SELECT gender, job_id, ROUND(AVG(salary),0) as avg_job_salary, ((SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'F') - (SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'M')) as diff, ROUND(AVG(days_of_svc/365),1) AS avg_years_svc FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc FROM hr.employees) GROUP BY job_id,gender ORDER BY job_id, gender; 

example of expected result lines

 JOB_ID F_AVG_LENGTH F_AVG_SAL M_AVG_LENGTH M_AVG_SAL DIFFERENCE ------ ------------ --------- ------------ --------- ---------- MAILCLERK 24.1 48000 23.4 47000 1000 CASHIER 4.6 12000 4.4 13500 -1500 
+5
source share
3 answers

On my machine, I made a copy of HR.EMPLOYEES in my circuit, I called clone HR_EMPLOYEES . Then I added a column for GENDER , since on my copy of Oracle the HR.EMPLOYEES table HR.EMPLOYEES not have a GENDER column. I populated the column with my best guesses, just for testing.

In Oracle 11, you can use the PIVOT operation, which simplifies the work. I divided the average length of stay by 365.25, so it is expressed in years, not days. Please note that there are many tasks in which either there are no males or no women working in them, therefore there are many NULL results. I suggested that you want them to be shown too - otherwise they can be ignored.

 select job_id, round(F_AVG_TENURE_D/365.25, 1) as f_avg_length, round(F_AVG_SALARY) as f_avg_salary, round(M_AVG_TENURE_D/365.25, 1) as m_avg_length, round(M_AVG_SALARY) as m_avg_salary, round(F_AVG_SALARY - M_AVG_SALARY) as avg_sal_diff from ( select job_id, gender, sysdate - hire_date as tenure, salary from hr_employees ) pivot (avg(tenure) as avg_tenure_d, avg(salary) as avg_salary for gender in ('F' as F, 'M' as M)) order by avg_sal_diff desc nulls last, job_id -- ORDER BY is optional ; 

Output

 JOB_ID F_AVG_LENGTH F_AVG_SALARY M_AVG_LENGTH M_AVG_SALARY AVG_SAL_DIFF ---------- ------------ ------------ ------------ ------------ ------------ SH_CLERK 11.2 3511 9.9 2973 538 ST_MAN 12.3 7467 10.3 7000 467 ST_CLERK 10.5 2883 10.8 2743 140 PU_CLERK 11.6 2833 10 2700 133 AD_VP 11.1 17000 15.8 17000 0 SA_REP 10.3 8244 10.6 8471 -228 SA_MAN 10.3 12000 10.9 12333 -333 IT_PROG 10.2 4500 10.5 6600 -2100 AC_ACCOUNT 14.4 8300 AC_MGR 14.4 12008 AD_ASST 13.1 4400 AD_PRES 13.4 24000 FI_ACCOUNT 11.2 7920 FI_MGR 14.2 12008 HR_REP 14.4 6500 MK_MAN 12.7 13000 MK_REP 11.2 6000 PR_REP 14.4 10000 PU_MAN 13.9 11000 19 rows selected. 
+1
source

Your expected result does not require an expression as an output column.

Therefore, you need to remove it from the select and group by statements:

 SELECT job_id, ROUND(AVG(salary),0) as avg_job_salary, (SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'M') AS avg_m_salary, (SELECT ROUND(AVG(salary),0) FROM hr.employees WHERE gender = 'F') AS avg_f_salary, ROUND(AVG(days_of_svc/365),1) AS avg_years_svc FROM (SELECT job_id, salary, gender, (SYSDATE-hire_date) AS days_of_svc FROM hr.employees) GROUP BY job_id ORDER BY job_id; 

In this approach, there are too many failures in the same table, although in the subquery. Try to optimize it.

EDIT:

 select job_id,ROUND(AVG(salary),0) avg_job_salary, round(avg(case when gender='M' then Salary end),0) avg_m_salary, round(avg(case when gender='F' then Salary end),0) avg_f_salary, round(avg(case when gender='F' then Salary end),0) - round(avg(case when gender='M' then Salary end),0) diff_in_avg ROUND(AVG((SYSDATE-HIRE_DATE)/365),1) AS avg_years_svc from hr.employees group by JOB_ID order by JOB_ID; 

I assume that your formula for calculating avg_years_svc is already as expected, and bifurcation is not required for male and female candidates.

EDIT 2:

The PIVOT function may help. This can help you double the average values ​​for men and women.

 Select job_id,avg_m_salary,avg_f_salary, avg_f_salary - avg_m_salary diff_salary from( select job_id,salary from hr.employees) PIVOT (avg(salary) for gender in('M' as avg_m_salary,'F' as avg_f_salary)); 
+1
source
 SELECT Q.*, Q.M_AVG_SAL-Q.F_AVG_SAL DIFFERENCE FROM ( SELECT JOB_ID , SUM(CASE WHEN gender = 'M' THEN SALARY ELSE 0 END)/SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) M_AVG_SAL , ROUND(SUM(CASE WHEN gender = 'M' THEN SVC_LEN ELSE 0 END)/SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END), 1) M_AVG_LENGTH , SUM(CASE WHEN gender = 'F' THEN SALARY ELSE 0 END)/SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) F_AVG_SAL , ROUND(SUM(CASE WHEN gender = 'F' THEN SVC_LEN ELSE 0 END)/SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END), 1) F_AVG_LENGTH FROM ( SELECT JOB_ID, SALARY , MONTHS_BETWEEN(SYSDATE, p.HIRE_DATE)/12 svc_len, GENDER FROM EMPLOYEES p ) GROUP BY JOB_ID ) Q; 

Awwww, I forgot about dividing by zero, therefore:

 SELECT JOB_ID , DECODE(F_CNT, 0, 0, ROUND(F_LEN/F_CNT, 1)) F_AVG_LENGTH , DECODE(F_CNT, 0, 0, F_SAL/F_CNT) F_AVG_SAL , DECODE(M_CNT, 0, 0, ROUND(M_LEN/M_CNT, 1)) M_AVG_LENGTH , DECODE(M_CNT, 0, 0, M_SAL/M_CNT) M_AVG_SAL , DECODE(M_CNT, 0, 0, M_SAL/M_CNT) - DECODE(F_CNT, 0, 0, F_SAL/F_CNT) DIFFERENCE FROM ( SELECT JOB_ID , SUM(CASE WHEN gender = 'M' THEN SALARY ELSE 0 END) M_SAL , SUM(CASE WHEN gender = 'M' THEN 1 ELSE 0 END) M_CNT , SUM(CASE WHEN gender = 'M' THEN SVC_LEN ELSE 0 END) M_LEN , SUM(CASE WHEN gender = 'F' THEN SALARY ELSE 0 END) F_SAL , SUM(CASE WHEN gender = 'F' THEN 1 ELSE 0 END) F_CNT , SUM(CASE WHEN gender = 'F' THEN SVC_LEN ELSE 0 END) F_LEN FROM ( SELECT JOB_ID, SALARY, GENDER , MONTHS_BETWEEN(SYSDATE, p.HIRE_DATE)/12 svc_len FROM EMPLOYEES p ) GROUP BY JOB_ID ) Q; 
0
source

Source: https://habr.com/ru/post/1259076/


All Articles