R - group by variable, and then assign a unique identifier

I am interested in de-identifying a sensitive dataset that contains both time-varying and time-fixed values. I want to (a) group all cases by social security number, (b) assign a unique identifier to each group, and then (c) delete the social security number.

Here is an example dataset:

personal_id  gender  temperature
111-11-1111  M       99.6
999-999-999  F       98.2
111-11-1111  M       97.8
999-999-999  F       98.3
888-88-8888  F       99.0
111-11-1111  M       98.9

Any solutions would be much appreciated.

2 answers

dplyr has a function, group_indices, to create unique group identifiers:

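library(dplyr)
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))

# Number each distinct personal_id (grouping first avoids the deprecated
# group_indices(personal_id) form in dplyr >= 1.0.0)
data$group_id <- data %>% group_by(personal_id) %>% group_indices()
# Drop the sensitive column
data <- data %>% select(-personal_id)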

data
  gender temperature group_id
1      M        99.6        1
2      F        98.2        3
3      M        97.8        2
4      M        95.5        1
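In newer dplyr versions (1.0.0 and later), the same numbering can also be done inside a pipeline with cur_group_id(). The following is only a sketch of that alternative on the same toy data; the name deidentified is made up for illustration:

```r
library(dplyr)

# Same toy data as above (made-up IDs, not real SSNs)
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))

deidentified <- data %>%
  group_by(personal_id) %>%
  mutate(group_id = cur_group_id()) %>%  # one integer per distinct personal_id
  ungroup() %>%
  select(-personal_id)                   # drop the sensitive column

deidentified
```

Rows that shared a personal_id end up with the same group_id, and the original column is gone from the result.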

Using the dplyr package:

library(dplyr)
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))

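First, extract the distinct personal_id values to build a lookup table:

# factor() is needed because data.frame() no longer converts strings to factors by default (R >= 4.0)
cases <- data.frame(levels = levels(factor(data$personal_id)))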

Then, add an id column based on the row names:

cases <- cases %>%
    mutate(id = rownames(cases))

This gives:

       levels id
1 111-111-111  1
2 222-222-222  2
3 999-999-999  3

Next, join the lookup table back onto the original data frame:

data <- left_join(data, cases, by = c("personal_id" = "levels"))

Then, combine id and gender into a unique identifier:

mutate(UID = paste(id, gender, sep=""))

Finally, drop the personal_id and id columns:

select(-personal_id, -id)

Or, all in one pipeline (instead of running the separate steps):

data <- left_join(data, cases, by = c("personal_id" = "levels")) %>%
        mutate(UID = paste(id, gender, sep="")) %>%
        select(-personal_id, -id)

Result:

  gender temperature UID
1      M        99.6  1M
2      F        98.2  3F
3      M        97.8  2M
4      M        95.5  1M
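The lookup-table steps above amount to numbering the sorted distinct personal_id values, which is what factor levels already do. A base R sketch of the same idea, without dplyr, might look like this (same toy data; not necessarily how the answer's author would write it):

```r
# Same toy data as above (made-up IDs, not real SSNs)
data <- data.frame(personal_id = c("111-111-111", "999-999-999", "222-222-222", "111-111-111"),
                   gender = c("M", "F", "M", "M"),
                   temperature = c(99.6, 98.2, 97.8, 95.5))

# factor() sorts the distinct IDs; as.integer() returns each row's level index
data$id <- as.integer(factor(data$personal_id))

# Combine the numeric id with gender, as in the pipeline above
data$UID <- paste0(data$id, data$gender)

# Remove the sensitive column and the intermediate id
data$personal_id <- NULL
data$id <- NULL

data
```

Note that ids assigned this way follow the sort order of the original values, so the numbering itself still reflects that ordering.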

Source: https://habr.com/ru/post/1655553/

