Adding a new column using lapply and gregexpr

I have one main list containing several sub-lists, each of which holds a number of data.frames. See the example below:

sublist1 <- list(data.frame('Position' = c(1,2,3), 'Color' = c("black-white-
silver-red","black-white-red","black-white")),
             data.frame('Position' = c(1,2,3), 'Color' = c("black-white-
pink-gold-red","black-white","black")) )

sublist2 <- list(data.frame('Position' = c(1,2,3), 'Color' = c("black-
silver-red","black-white-red","white")),
             data.frame('Position' = c(1,2,3), 'Color' = c("pink-gold-
red","black-white","black-white")) )

mainList <- list(sublist1, sublist2)

I am trying to add a new column called Color_Count to each data.frame, holding the number of different colors in each row. Ideally, the output would look like this:

> mainList
[[1]]
[[1]][[1]]
Position                  Color Color_Count
1        1 black-white-silver-red           4
2        2        black-white-red           3
3        3            black-white           2

[[1]][[2]]
Position                     Color Color_Count
1        1 black-white-pink-gold-red           5
2        2               black-white           2
3        3                     black           1
....

I tried using the gregexpr function as well as lapply, but the result never looks the way I want.

I would really appreciate help here. Thank you in advance.

Yours faithfully,

2 answers

If we can assume that each color is separated by a dash “-”, we can simply count the number of dashes in the “Color” column and add 1:

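# Recurse through nested lists; at each data.frame, add dash count + 1.
foo <- function(lst, col) {
  lapply(lst, function(x) {
    if (!is.data.frame(x)) return(foo(x, col))  # still a list: descend
    transform(x, ColorCount = stringr::str_count(x[[col]], "-") + 1)
  })
}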

foo(mainList, "Color")

#[[1]]
#[[1]][[1]]
#  Position                    Color ColorCount
#1        1 black-white-\nsilver-red          4
#2        2          black-white-red          3
#3        3              black-white          2
#
#[[1]][[2]]
#  Position Color                     ColorCount
#1        1 black-white-pink-gold-red          5
#2        2 black-white                        2
#3        3 black                              1
#...

I used stringr here, but the same count can be done in base R or with stringi.
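Since the question mentions gregexpr, here is a minimal base R sketch of the same "dashes + 1" idea. The helper names count_colors and add_color_count are hypothetical and not part of the original answer, and the sketch assumes the column is named Color and that colors are always separated by a single dash.

```r
# Hypothetical helper: count dash-separated items using only base R.
count_colors <- function(x) {
  x <- as.character(x)                    # factors -> character
  hits <- gregexpr("-", x, fixed = TRUE)  # match positions for each string
  # gregexpr() returns -1 when nothing matches, so count positive positions
  n_dashes <- vapply(hits, function(m) sum(m > 0), integer(1))
  n_dashes + 1L
}

# Hypothetical helper: walk the nested list and add Color_Count
# to every data.frame that is found.
add_color_count <- function(lst) {
  lapply(lst, function(x) {
    if (!is.data.frame(x)) return(add_color_count(x))
    x$Color_Count <- count_colors(x$Color)
    x
  })
}

count_colors(c("black-white-silver-red", "black-white", "black"))
# 4 2 1
```

Calling add_color_count(mainList) should then return the same nested structure with the extra column, without any package dependency.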

foo is recursive: it calls itself on every nested list and only adds the column once it reaches the data.frames.


Another option is to use rbindlist() from data.table together with the dplyr package:

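library(data.table)  # rbindlist()
library(dplyr)       # %>% and mutate()

df <- lapply(mainList, rbindlist, idcol = "sub.id") %>% 
  rbindlist(idcol = "id") %>% 
  mutate(
    # strip line breaks and whitespace left inside the color strings
    Color = stringi::stri_replace_all(Color, "", regex = "\\n|\\s"),
    # number of colors = number of dashes + 1
    Color_Count = stringi::stri_count(Color, regex = "-") + 1
  )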

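rbindlist() stacks the data.frames into a single table, and the id and sub.id columns record which list and sub-list each row came from.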

Result:

head(df)
# id sub.id Position                     Color Color_Count
#  1      1        1    black-white-silver-red           4
#  1      1        2           black-white-red           3
#  1      1        3               black-white           2
#  1      2        1 black-white-pink-gold-red           5
#  1      2        2               black-white           2
#  1      2        3                     black           1
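The question asked for the nested list shape, while this approach returns one flat table. If the nesting is needed again, one possible sketch is to split on the two id columns in base R. The small df below is only a stand-in built from the printed rows above, and the name nested is hypothetical.

```r
# Stand-in for the flat result (values copied from the printed output above)
df <- data.frame(
  id          = c(1, 1, 1, 1, 1, 1),
  sub.id      = c(1, 1, 1, 2, 2, 2),
  Position    = c(1, 2, 3, 1, 2, 3),
  Color       = c("black-white-silver-red", "black-white-red", "black-white",
                  "black-white-pink-gold-red", "black-white", "black"),
  Color_Count = c(4, 3, 2, 5, 2, 1)
)

# Split by id, then by sub.id, and drop the two id columns again
nested <- lapply(split(df, df$id), function(by_id) {
  lapply(split(by_id, by_id$sub.id), function(d) {
    d <- d[setdiff(names(d), c("id", "sub.id"))]
    rownames(d) <- NULL  # restart row names at 1 in each piece
    d
  })
})

nested[["1"]][["2"]]
```

With a real data.table result it may be safer to call as.data.frame() on it first, so that split() and the column subsetting behave as in plain data.frames.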

Source: https://habr.com/ru/post/1693627/

