In a table with more than 100k rows, how can I efficiently shuffle the values of a specific column?
Table definition:
CREATE TABLE person
(
    id integer NOT NULL,
    first_name character varying,
    last_name character varying,
    CONSTRAINT person_pkey PRIMARY KEY (id)
);
To anonymize the data, I have to shuffle the values of the "first_name" column in place (I am not allowed to create a new table).
My attempt:
with
first_names as (
    select row_number() over (order by random()),
           first_name as new_first_name
    from person
),
ids as (
    select row_number() over (order by random()),
           id as ref_id
    from person
)
update person
set first_name = new_first_name
from first_names, ids
where id = ref_id;
It takes a few hours to run.
Is there a more efficient way to do this?
Serge