Trim trailing spaces in factor labels using dplyr chaining

I have loaded a data frame that has trailing spaces in its factor labels. I am trying to remove these trailing spaces from every factor in the data frame, but have not yet succeeded.

Reproducible example

library(dplyr)

lvls <- c('a   ',
          'b   ',
          'c   ')
set.seed(314)
raw <- data.frame(a = factor(sample(lvls, 100, replace = TRUE)),
                  b = sample(1:100, 100))

proc <- raw %>% mutate_each(funs(ifelse(is.factor(.),
                                        factor(as.character(trimws(.)),
                                               labels=unique(as.character(.))),
                                        .))) 

str(proc)

gives

'data.frame':   100 obs. of  2 variables:
 $ a: int  1 1 1 1 1 1 1 1 1 1 ...
 $ b: int  31 31 31 31 31 31 31 31 31 31 ...

This is wrong on two levels: the factor has lost its labels, and only the first observation is kept, repeated 100 times.
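Both symptoms can be reproduced outside of dplyr. The sketch below uses a made-up three-element factor f (not part of the original data) and relies on the documented behaviour that ifelse() returns a result shaped like its test: is.factor(f) is a single TRUE, so only the first element comes back, and it comes back as the factor's integer code rather than its label. A plain if/else, shown last, does not have this problem.

```r
# Hypothetical three-element factor with trailing spaces in its levels
f <- factor(c("x ", "y ", "x "))

# is.factor(f) is a single TRUE, so ifelse() returns one element only,
# and that element is the integer code of f[1], not its label
res <- ifelse(is.factor(f), f, NA)

length(res)     # 1
is.factor(res)  # FALSE

# A plain if/else returns the whole vector and keeps the factor class
fixed <- if (is.factor(f)) factor(trimws(f)) else f
levels(fixed)
```

Inside mutate(), that single value is then recycled to the full column length, which would explain the repeated 1s and 31s in the output above.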

+4
2 answers

mutate_if is your friend. If you don't mind converting to character, you can simply use

raw %>% mutate_if(is.factor, trimws)

which suggests that you can simply convert back to a factor:

raw %>% mutate_if(is.factor, funs(factor(trimws(.))))

If you want to keep the factor type without a round trip through character, you can use the slightly more cryptic

raw %>% mutate_if(is.factor, funs(`levels<-`(., trimws(levels(.)))))
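In current dplyr releases, funs() is deprecated and the scoped verbs such as mutate_if() are superseded by across(). A minimal sketch of the same level-trimming step in that style, assuming dplyr 1.0.0 or later and the data from the question:

```r
library(dplyr)

# Same data as in the question
lvls <- c('a   ', 'b   ', 'c   ')
set.seed(314)
raw <- data.frame(a = factor(sample(lvls, 100, replace = TRUE)),
                  b = sample(1:100, 100))

# Trim the levels of every factor column; other columns are left untouched
proc <- raw %>%
  mutate(across(where(is.factor), ~ `levels<-`(.x, trimws(levels(.x)))))

levels(proc$a)
str(proc)
```

Here where(is.factor) plays the role of the predicate in mutate_if(), and the ~ lambda replaces funs().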

The base R equivalent would be

raw[] <- lapply(raw, function(x){if (is.factor(x)) {levels(x) <- trimws(levels(x))} ; x})

or, if you only have a single factor column, simply:

levels(raw$a) <- trimws(levels(raw$a))
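One side effect of assigning trimmed levels is worth knowing about: if two levels differ only in whitespace, they become the same label and are merged into one level. A small sketch with a made-up factor f (hypothetical data, not from the question):

```r
# Hypothetical factor whose first two levels differ only in trailing whitespace
f <- factor(c("a ", "a  ", "b"))
nlevels(f)   # 3

# After trimming, "a " and "a  " collapse into the single level "a"
levels(f) <- trimws(levels(f))
nlevels(f)   # 2
as.character(f)
```

This is usually what is wanted when cleaning up labels, but it does change the number of levels.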

Edit: with forcats::fct_relabel (forcats is part of the tidyverse), this becomes:

raw %>% mutate_if(is.factor, fct_relabel, trimws)

or, for a single column,

raw %>% mutate(a = fct_relabel(a, trimws))

fct_relabel also accepts a purrr-style lambda such as ~trimws(.x), though it is not needed here.

+8

How about this, in base R?

l = lapply(raw, function(x) {if(is.factor(x)){x <- trimws(x)};x})
head(as.data.frame(l))
#  a  b
#1 a 31
#2 a 55
#3 c 68
#4 a 18
#5 a 72
#6 a 64
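Note that trimws() applied directly to a factor returns a character vector, so this approach changes the column type. Whether as.data.frame() then turns it back into a factor depends on the stringsAsFactors default, which is FALSE from R 4.0.0 onward. A minimal sketch on a small stand-in data frame (hypothetical data, smaller than the one in the question), with an explicit factor() call to restore the type:

```r
# Hypothetical small stand-in for the data frame in the question
raw <- data.frame(a = factor(c("a   ", "b   ", "a   ")), b = 1:3)

# trimws() on a factor yields character, not factor
trimmed <- trimws(raw$a)
class(trimmed)   # "character"

# Wrap the result in factor() to get a factor back
refactored <- factor(trimmed)
levels(refactored)
```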
+1

Source: https://habr.com/ru/post/1665755/

