Recoding dummy variables into an ordered factor

I need help with coding factors for logistic regression.

I have six dummy variables representing income brackets. I want to convert them into one ordered factor for use in logistic regression.

My data frame looks like this:

      INC1 INC2 INC3 INC4 INC5 INC6
    1    0    0    1    0    0    0
    2   NA   NA   NA   NA   NA   NA
    3    0    0    0    0    0    1
    4    0    0    0    0    0    1
    5    0    0    1    0    0    0
    6    0    0    0    1    0    0
    7    0    0    1    0    0    0
    8    0    0    0    1    0    0

I want it to look like this:

       INC
    1 INC3
    2   NA
    3 INC6
    4 INC6
    5 INC3
    6 INC4
    7 INC3
    8 INC4

This should be a common (and simple) operation, but my searches did not turn up a concise answer on how to do this recoding. Any help is greatly appreciated.

1 answer

Here is a solution, based on another answer, that preserves the NA values and converts the result into an ordered factor.

    > inc
      INC1 INC2 INC3 INC4 INC5 INC6
    1    0    0    1    0    0    0
    2   NA   NA   NA   NA   NA   NA
    3    0    0    0    0    0    1
    4    0    0    0    0    0    1
    5    0    0    1    0    0    0
    6    0    0    0    1    0    0
    7    0    0    1    0    0    0
    8    0    0    0    1    0    0
    > inc$F = factor(apply(inc, 1, function(x) names(x)[x == 1]),levels=names(inc),ordered=TRUE)
    > inc
      INC1 INC2 INC3 INC4 INC5 INC6    F
    1    0    0    1    0    0    0 INC3
    2   NA   NA   NA   NA   NA   NA <NA>
    3    0    0    0    0    0    1 INC6
    4    0    0    0    0    0    1 INC6
    5    0    0    1    0    0    0 INC3
    6    0    0    0    1    0    0 INC4
    7    0    0    1    0    0    0 INC3
    8    0    0    0    1    0    0 INC4
    > inc$F
    [1] INC3 <NA> INC6 INC6 INC3 INC4 INC3 INC4
    Levels: INC1 < INC2 < INC3 < INC4 < INC5 < INC6

It will break if any row contains more than one 1.
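If rows with more than one 1 are a concern, a stricter variant could check for them explicitly and fail with a clear message. The following is only a sketch under assumptions: `dummies_to_ordered()` is a hypothetical helper name (not part of base R), the data frame is rebuilt from the example in the question, and rows with no 1 at all (including all-NA rows) are mapped to NA.

```r
# Rebuild the example data from the question (row 2 is entirely NA).
inc <- data.frame(
  INC1 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC2 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC3 = c(1, NA, 0, 0, 1, 0, 1, 0),
  INC4 = c(0, NA, 0, 0, 0, 1, 0, 1),
  INC5 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC6 = c(0, NA, 1, 1, 0, 0, 0, 0)
)

# Hypothetical helper: turn mutually exclusive 0/1 columns into one
# ordered factor whose levels follow the column order of `df`.
dummies_to_ordered <- function(df) {
  m <- as.matrix(df)
  hits <- rowSums(m == 1, na.rm = TRUE)  # number of 1s in each row
  if (any(hits > 1)) {
    stop("rows with more than one 1: ",
         paste(which(hits > 1), collapse = ", "))
  }
  # Row/column positions of every 1; NA cells are skipped by which().
  pos <- which(m == 1, arr.ind = TRUE)
  idx <- rep(NA_integer_, nrow(m))       # rows without a 1 stay NA
  idx[pos[, "row"]] <- pos[, "col"]
  factor(colnames(m)[idx], levels = colnames(m), ordered = TRUE)
}

# Pass only the dummy columns so the call can be re-run after F exists.
inc$F <- dummies_to_ordered(inc[paste0("INC", 1:6)])
```

Selecting the dummy columns by name keeps the new F column out of the input, so the assignment can be repeated safely.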


Source: https://habr.com/ru/post/985970/

