The fastest way to get a class vector from names in R

Suppose I have the following vector in R (my levels are obviously A, B and C):

c("A_1", "A_2", "B_1", "C_1", "C_2")

What is the most efficient way to convert it to a class vector of numbers like the following?

c(1, 1, 2, 3, 3)

I feel that it should be a one-liner (probably a combination of factor and grep), but I could not come up with it.

Thanks!

+3
2 answers

A simple solution:

x <- c("A_1", "A_2", "B_1", "C_1", "C_2")


x.out <- as.numeric(factor(substr(x, 1, 1)))

If your data is more varied than this, let me know, and we can work on a more robust solution.
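One detail worth checking with this approach is how the numbers are assigned. The sketch below uses a hypothetical input (not from the question) whose prefixes are not in alphabetical order; `by_level` and `by_appearance` are illustrative names. It contrasts the codes from `factor()`, which sorts its levels, with codes numbered by first appearance using `match()` and `unique()`.

```r
# Hypothetical input whose prefixes are not in alphabetical order
x <- c("C_1", "A_1", "A_2", "B_1")
prefix <- substr(x, 1, 1)

# factor() sorts its levels (A, B, C), so codes follow alphabetical order
by_level <- as.integer(factor(prefix))
by_level
# [1] 3 1 1 2

# match() against unique() numbers the classes in order of first appearance
by_appearance <- match(prefix, unique(prefix))
by_appearance
# [1] 1 2 2 3
```

For the vector in the question both variants give 1 1 2 3 3, because its prefixes already appear in sorted order.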

+5

There is a (more general) regex approach that does not require specifying the width of the leading string.

One option is to delete the underscore and everything after it:

> as.numeric(factor(sub("_.+", "" , x)))
[1] 1 1 2 3 3

Alternatively, capture the leading part in parentheses and return it with the R back-reference "\\1":

> as.numeric(factor(sub("(^.+)_.+$", "\\1" , x)))
[1] 1 1 2 3 3
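
The two patterns agree on the vector in the question, but they can differ when a name contains more than one underscore. The following is a small sketch with hypothetical names (`y`, `first_cut` and `last_cut` are illustrative, not from the question): the first pattern cuts at the first underscore, while the greedy `.+` in the second cuts at the last one.

```r
# Hypothetical names with longer prefixes and more than one underscore
y <- c("AB_1_x", "AB_2_y", "C_1_x")

# Delete from the first underscore onward
first_cut <- sub("_.+", "", y)
first_cut
# [1] "AB" "AB" "C"

# The greedy capture group keeps everything up to the last underscore
last_cut <- sub("(^.+)_.+$", "\\1", y)
last_cut
# [1] "AB_1" "AB_2" "C_1"

as.integer(factor(first_cut))
# [1] 1 1 2
as.integer(factor(last_cut))
# [1] 1 2 3
```

Which variant is appropriate depends on whether the class is meant to be everything before the first underscore or everything before the last one.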
+2

Source: https://habr.com/ru/post/1779702/

