Combining duplicate records into one record in the same table

I have a database table that contains demographic records. Some participants have multiple (duplicate) records, as in the example below.

NOTE.
Gender:
119 = Man
118 = Woman

Race:
255 = White
253 = Asian

UrbanRural:
331 = Urban
332 = Rural

participantid, gender, race, urbanrural, moduletypeid, hibernateid, and more fields
1, 119, 0, 331, 1, 1, .....
1, 119, 255, 0, 2, 2, .....
1, 0, 255, 331, 3, 3, .....
1, 119, 253, 331, 0, 4, .....

I want to combine these rows into a single record per participant, like this:

participantid, gender, race, urbanrural, moduletypeid, hibernateid, and more fields
1, 119, 255, 331, 1, 1, .....



Try this:

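-- 0 marks a missing value, so max() picks a real code over 0
select participantid, max(gender), max(race), max(urbanrural),
-- treat 0 as null so that min() returns the lowest real moduletypeid
min(case moduletypeid when 0 then null else moduletypeid end), min(hibernateid), ...
from yourtable
group by participantid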

This assumes that, for moduletypeid, you want 1 (the lowest non-zero value), so 0 is treated as null (hence the case expression).
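
As a quick check, the aggregation can be run against the four sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module with an in-memory table. It assumes that 0 means "missing" and that the largest code should win for gender, race, and urbanrural, which is a choice made here to reproduce the expected row and not something the question states.

```python
import sqlite3

# In-memory SQLite table holding the four sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable (participantid int, gender int, race int,"
    " urbanrural int, moduletypeid int, hibernateid int)"
)
conn.executemany(
    "insert into yourtable values (?, ?, ?, ?, ?, ?)",
    [
        (1, 119, 0, 331, 1, 1),
        (1, 119, 255, 0, 2, 2),
        (1, 0, 255, 331, 3, 3),
        (1, 119, 253, 331, 0, 4),
    ],
)

# Assumption: 0 means "missing". max() prefers any real code over 0;
# the case expression turns 0 into null so min() returns the lowest real value.
merged = conn.execute(
    """
    select participantid, max(gender), max(race), max(urbanrural),
           min(case moduletypeid when 0 then null else moduletypeid end),
           min(hibernateid)
    from yourtable
    group by participantid
    """
).fetchall()

print(merged)  # [(1, 119, 255, 331, 1, 1)]
```

Note that when two rows disagree on a real code (race 255 versus 253 here), max() simply keeps the larger number, so a different rule may be needed if the larger code is not the preferred one.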


In Postgres 9.1+:

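WITH duplicates AS (
  -- one row per duplicated key: the id to keep and the id to delete
  SELECT desired_unique_key, count(*) AS count_of_same_key,
         min(st.id) AS keep_id, max(st.id) AS delete_id
  FROM source_table st
  GROUP BY desired_unique_key
  HAVING count(*) > 1
),
deleted_dupes AS (
  -- removes only the highest id per key; rerun if a key has more than two rows
  DELETE FROM source_table st
  WHERE st.id IN (SELECT delete_id FROM duplicates)
)
-- field and WHATEVER are placeholders for the column to merge and its new value
UPDATE source_table st
  SET field = WHATEVER
  FROM duplicates d
  WHERE st.id = d.keep_id;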
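
The same keep-one, delete-the-rest idea can be tried outside Postgres. The sketch below is not the data-modifying CTE itself: SQLite does not support that form, so it swaps in two ordinary statements inside one transaction, using Python's sqlite3 module. The table layout, the sample rows, and the choice of max(field) as the merged value are all hypothetical, and unlike the statement above it deletes every extra row per key in one pass.

```python
import sqlite3

# Hypothetical table mirroring the names used in the Postgres statement.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_table"
    " (id INTEGER PRIMARY KEY, desired_unique_key INT, field INT)"
)
conn.executemany(
    "INSERT INTO source_table VALUES (?, ?, ?)",
    [(1, 10, 0), (2, 10, 255), (3, 10, 0), (4, 20, 331)],
)

with conn:  # commits on success, rolls back if either statement fails
    # Step 1: copy the merged value onto the row with the lowest id per key.
    conn.execute(
        """
        UPDATE source_table
        SET field = (SELECT max(s.field) FROM source_table s
                     WHERE s.desired_unique_key = source_table.desired_unique_key)
        WHERE id IN (SELECT min(id) FROM source_table GROUP BY desired_unique_key)
        """
    )
    # Step 2: delete every row that is not the lowest id for its key.
    conn.execute(
        """
        DELETE FROM source_table
        WHERE id NOT IN (SELECT min(id) FROM source_table GROUP BY desired_unique_key)
        """
    )

rows = conn.execute("SELECT * FROM source_table ORDER BY id").fetchall()
print(rows)  # [(1, 10, 255), (4, 20, 331)]
```

Running the update before the delete matters here: the merged value has to be computed while the duplicate rows still exist.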

So you want to find the duplicate rows and then delete them?

If yes, try the following:

SELECT T1.* FROM table_name T1, table_name T2
WHERE T1.dupe_field = T2.dupe_field
AND T1.other_dupe_field = T2.other_dupe_field
AND T1.primary_key > T2.primary_key;

Change the names of the tables and fields to suit your own table structure.

Run this SELECT first to confirm that it returns the duplicates you want to delete, then change it to a DELETE to remove them.
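
The preview-then-delete workflow can be sketched as follows, again with Python's sqlite3 module and an in-memory table. The sample rows are invented for illustration. The duplicate test is the same as in the self-join (same dupe_field and other_dupe_field, higher primary_key), but it is written with EXISTS, an alternative form chosen here because the plain self-join lists a row once for every lower-keyed match when a group has more than two rows.

```python
import sqlite3

# Hypothetical table using the placeholder names from the query above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (primary_key INTEGER PRIMARY KEY,"
    " dupe_field TEXT, other_dupe_field TEXT)"
)
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?, ?)",
    [(1, "a", "x"), (2, "a", "x"), (3, "a", "x"), (4, "b", "y")],
)

# A row is a duplicate if another row has the same two fields and a lower key.
is_duplicate = """
    EXISTS (SELECT 1 FROM table_name T2
            WHERE T2.dupe_field = table_name.dupe_field
              AND T2.other_dupe_field = table_name.other_dupe_field
              AND T2.primary_key < table_name.primary_key)
"""

# Step 1: preview the rows that would be removed.
to_delete = conn.execute(
    f"SELECT primary_key FROM table_name WHERE {is_duplicate} ORDER BY primary_key"
).fetchall()
print(to_delete)  # [(2,), (3,)]

# Step 2: delete them with the same predicate.
conn.execute(f"DELETE FROM table_name WHERE {is_duplicate}")
remaining = conn.execute(
    "SELECT primary_key FROM table_name ORDER BY primary_key"
).fetchall()
print(remaining)  # [(1,), (4,)]
```

Because the predicate always spares the lowest key in each group, one row per group survives no matter how many duplicates there were.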


Source: https://habr.com/ru/post/1758677/

