Add a new level to a factor and replace an existing one.

Question

I am having a lot of trouble working with factor level names in a data frame.

I have a large data frame in which one of the columns is a factor with a LOT of levels.

The problem is that some of this data is duplicated, and the next step in my analysis does not accept duplicate data. Therefore, I need to change the name of the duplicated level so that I can proceed to the next step.

Let me give you a small example:

Let's say we have this simple single-column data frame:

 > df
   col_foo
 1    bar1
 2    bar2
 3    bar3
 4    bar2
 5    bar4
 6    bar5
 7    bar3

If we look at the column, we will see that it is a factor with 5 different levels.

 > df$col_foo
 [1] bar1 bar2 bar3 bar2 bar4 bar5 bar3
 Levels: bar1 bar2 bar3 bar4 bar5

Now for the problem: notice that the values bar2 and bar3 each appear twice. I want to know how to add a new level name, something like bar2_X, and replace only the duplicate entry with it. The data frame should then look like this:

 > df
   col_foo
 1    bar1
 2    bar2
 3    bar3
 4  bar2_X
 5    bar4
 6    bar5
 7  bar3_X

Is this possible? I can't change the class of the column; it must remain a factor, so solutions that change the class do not solve my problem unless the result can be converted back to a factor.

thanks

+6

r

Lianzinho Oct 27 '11 at 16:55

3 answers

Call make.names with unique = TRUE on your column.

 df$col_foo <- factor(make.names(df$col_foo, unique = TRUE))
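As a quick check of what this produces on the question's sample data, here is a small sketch. The data frame construction is an assumption rebuilt from the question. Note that make.names appends suffixes such as .1 rather than _X, and it also rewrites values that are not syntactically valid R names, so it may change more than just the duplicates on real data.

```r
# Sample data rebuilt from the question (assumed construction)
df <- data.frame(
  col_foo = factor(c("bar1", "bar2", "bar3", "bar2", "bar4", "bar5", "bar3"))
)

# make.names() coerces the factor to character;
# unique = TRUE appends .1, .2, ... to repeated values
df$col_foo <- factor(make.names(df$col_foo, unique = TRUE))

as.character(df$col_foo)
# "bar1" "bar2" "bar3" "bar2.1" "bar4" "bar5" "bar3.1"
```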

+10

Richie Cotton Oct 27 '11 at 17:22


You can edit the levels of the factor variable:

 levels(df$col_foo) <- c(levels(df$col_foo),"bar2_X","bar3_X")

and then change the repeating levels to one of the new levels you added.
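The second step is not spelled out above; one possible way to do it, sketched on the question's sample data, is to locate the repeats with duplicated and assign the new level names to just those rows. The variable name dup and the data frame construction are illustrative assumptions, and the new levels are typed by hand here, which would not scale to a factor with many levels.

```r
# Sample data rebuilt from the question (assumed construction)
df <- data.frame(
  col_foo = factor(c("bar1", "bar2", "bar3", "bar2", "bar4", "bar5", "bar3"))
)

# Step 1: extend the set of allowed levels
levels(df$col_foo) <- c(levels(df$col_foo), "bar2_X", "bar3_X")

# Step 2: find the second and later occurrences of each value
dup <- duplicated(df$col_foo)

# Assign the suffixed names; they are valid because the levels now exist
df$col_foo[dup] <- paste0(as.character(df$col_foo[dup]), "_X")

as.character(df$col_foo)
# "bar1" "bar2" "bar3" "bar2_X" "bar4" "bar5" "bar3_X"
```

The column stays a factor throughout, since only its levels and individual entries are modified.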

+2

user2288947 Oct 05 '16 at 20:59

Greg Snow · Accepted Answer · 2011-10-27T17:23:10+0000

If you want all entries to be unique, then a factor gains you nothing; you could just use a character variable.

Probably the easiest way to do what you want is to coerce to character, use the duplicated function to find the duplicates and paste something onto the end of them, and then, if you want, use factor to convert back to a factor. Maybe something like:

 df$col_foo <- factor( ifelse( duplicated(df$col_foo), paste(df$col_foo, '_x', sep=''), as.character(df$col_foo)))
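For reference, here is the same idea written out step by step on the question's sample data, using the _X suffix the question asked for. This is only a sketch: the names vals, dup and tripled are illustrative. One caveat is that a value appearing three or more times would get the same suffix on every repeat, so the result would still contain duplicates; make.unique with a custom sep is one alternative for that case.

```r
# Sample data rebuilt from the question (assumed construction)
df <- data.frame(
  col_foo = factor(c("bar1", "bar2", "bar3", "bar2", "bar4", "bar5", "bar3"))
)

vals <- as.character(df$col_foo)      # coerce to character
dup  <- duplicated(vals)              # TRUE for second and later occurrences
vals[dup] <- paste0(vals[dup], "_X")  # suffix only the duplicates
df$col_foo <- factor(vals)            # convert back to a factor

as.character(df$col_foo)
# "bar1" "bar2" "bar3" "bar2_X" "bar4" "bar5" "bar3_X"

# If a value can repeat more than once, numbered suffixes keep everything unique
tripled <- make.unique(c("bar2", "bar2", "bar2"), sep = "_")
tripled
# "bar2" "bar2_1" "bar2_2"
```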
