Removing duplicate entries in the join table

Question

I have a HABTM relationship between a user and a role.

A user can have the administrator role (role_id = 1) or the regular user role (role_id = 2).

In the join table, roles_users, I have duplicate entries.

I want to delete duplicate entries like 1: 1, 2: 4.

Two questions:

Where is the best place to run a SQL script that removes the duplicates: a migration or a standalone script?
What SQL query removes the duplicates?

mysql ruby-on-rails duplicate-removal has-and-belongs-to-many

keruilin Feb 13 '11 at 21:35

2 answers

The simplest is copying data to a new table, minus duplicates:

 CREATE TABLE roles_users2 AS SELECT DISTINCT * FROM roles_users

Then you can do one of the following:

Drop the old table, rename the new table to the old name, and add the indexes.
Truncate the old table and insert the rows from roles_users2 back into roles_users.
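
The first option can be tried out end to end with a small, self-contained sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for MySQL, and the sample rows are made up for illustration. SQLite renames a table with ALTER TABLE ... RENAME TO, whereas MySQL would use RENAME TABLE.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roles_users (role_id INTEGER, user_id INTEGER)")

# Made-up sample rows: (1, 1) and (2, 4) each appear twice.
conn.executemany(
    "INSERT INTO roles_users (role_id, user_id) VALUES (?, ?)",
    [(1, 1), (1, 1), (2, 4), (2, 4), (2, 5)],
)

# Copy only the distinct rows into a new table.
conn.execute("CREATE TABLE roles_users2 AS SELECT DISTINCT * FROM roles_users")

# Swap the de-duplicated table in for the old one.
conn.execute("DROP TABLE roles_users")
conn.execute("ALTER TABLE roles_users2 RENAME TO roles_users")
conn.commit()

rows = conn.execute(
    "SELECT role_id, user_id FROM roles_users ORDER BY role_id, user_id"
).fetchall()
print(rows)  # [(1, 1), (2, 4), (2, 5)]
```

As the answer notes, a table created with CREATE TABLE ... AS SELECT does not carry over the old table's indexes, so they have to be added again after the swap.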

Mark Byers Feb 13 '11 at 21:37

John Douthat · Accepted Answer · 2011-02-13T21:42:26+0000

 CREATE TABLE roles_users2 LIKE roles_users; -- this ensures indexes are preserved
 INSERT INTO roles_users2 SELECT DISTINCT * FROM roles_users;
 DROP TABLE roles_users;
 RENAME TABLE roles_users2 TO roles_users;

And, to prevent duplicate rows in the future:

 ALTER TABLE roles_users ADD UNIQUE INDEX (role_id, user_id);

Or you can do it all in one step with ALTER IGNORE TABLE:

ALTER IGNORE TABLE roles_users ADD UNIQUE INDEX (role_id, user_id);

IGNORE is a MySQL extension to standard SQL. It controls how ALTER TABLE works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. If IGNORE is not specified, the copy is aborted and rolled back if duplicate-key errors occur. If IGNORE is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
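
To see what the unique index on (role_id, user_id) does once the duplicates are gone, here is a minimal sketch. It again uses Python's sqlite3 module with an in-memory database as a stand-in for MySQL. The index name is a hypothetical one chosen for the example, and SQLite creates the index with CREATE UNIQUE INDEX rather than ALTER TABLE ... ADD UNIQUE INDEX.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE roles_users (role_id INTEGER, user_id INTEGER)")

# Hypothetical index name; the columns match the answer's unique index.
conn.execute(
    "CREATE UNIQUE INDEX index_roles_users_on_role_id_and_user_id "
    "ON roles_users (role_id, user_id)"
)

conn.execute("INSERT INTO roles_users (role_id, user_id) VALUES (1, 1)")

# A second identical pair violates the unique index and is rejected.
try:
    conn.execute("INSERT INTO roles_users (role_id, user_id) VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM roles_users").fetchone()[0]
print(duplicate_rejected, row_count)  # True 1
```

With the index in place the database itself refuses a repeated pair, so the join table cannot accumulate duplicates again regardless of what the application code does.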
