Long if-else chain and recoding in R

Question

I know that my problem is simple, but not for me. Here is a small data set.

    mark1 <- c("AB", "BB", "AB", "BB", "BB", "AB", "--", "BB")
    mark2 <- c("AB", "AB", "AA", "BB", "BB", "AA", "--", "BB")
    mark3 <- c("BB", "AB", "AA", "BB", "BB", "AA", "--", "BB")
    mark4 <- c("AA", "AB", "AA", "BB", "BB", "AA", "--", "BB")
    mark5 <- c("AB", "AB", "AA", "BB", "BB", "AA", "--", "BB")
    mark6 <- c("--", "BB", "AA", "BB", "BB", "AA", "--", "BB")
    mark7 <- c("AB", "--", "AA", "BB", "BB", "AA", "--", "BB")
    mark8 <- c("BB", "AA", "AA", "BB", "BB", "AA", "--", "BB")
    mymark <- data.frame(mark1, mark2, mark3, mark4, mark5, mark6, mark7, mark8)
    tmymark <- data.frame(t(mymark))
    names(tmymark) <- c("P1", "P2", "I1", "I2", "I3", "I4", "KL", "MN")

Thus, the data set will look like this:

          P1 P2 I1 I2 I3 I4 KL MN
    mark1 AB BB AB BB BB AB -- BB
    mark2 AB AB AA BB BB AA -- BB
    mark3 BB AB AA BB BB AA -- BB
    mark4 AA AB AA BB BB AA -- BB
    mark5 AB AB AA BB BB AA -- BB
    mark6 -- BB AA BB BB AA -- BB
    mark7 AB -- AA BB BB AA -- BB
    mark8 BB AA AA BB BB AA -- BB

I want to classify mark1 to mark8 based on a comparison of P1 and P2. This is the code I wrote to create the new variable:

    loctype <- NULL
    if (tmymark$P1 == "AB" & tmymark$P2 == "AB") {
      loctype = "<hkxhk>"
    } else if (tmymark$P1 == "AB" & tmymark$P2 == "BB") {
      loctype = "<lmxll>"
    } else if (tmymark$P1 == "AA" & tmymark$P2 == "AB") {
      loctype = "<nnxnp>"
    } else if (tmymark$P1 == "AA" & tmymark$P2 == "BB") {
      loctype = "MN"
    } else if (tmymark$P1 == "BB" & tmymark$P2 == "AA") {
      loctype = "MN"
    } else if (tmymark$P1 == "--" & tmymark$P2 == "AA") {
      loctype = "NR"
    } else if (tmymark$P1 == "AA" & tmymark$P2 == "--") {
      loctype = "NR"
    } else {
      cat("error wrong input in P1 or P2")
    }

What I am trying to do is compare the values of P1 and P2 and generate a new variable. For example, if tmymark$P1 == "AB" and tmymark$P2 == "AB", loctype should be "<hkxhk>". If not, the second condition applies, and so on.

Here is my error message.

    Warning messages:
    1: In if (tmymark$P1 == "AB" & tmymark$P2 == "AB") { :
      the condition has length > 1 and only the first element will be used
    2: In if (tmymark$P1 == "AB" & tmymark$P2 == "BB") { :
      the condition has length > 1 and only the first element will be used

Once the loctype vector is generated, I want to recode tmymark using the information in this variable:

    tmymark1 <- data.frame(loctype, tmymark)
    require(car)
    for (i in 2:length(tmymark)) {
      if (loctype == "<hkxhk>") {
        tmymark[[i]] <- recode(x, "AB" = "hk", "BA" = "hk", "AA" = "hh", "BB" = "kk")
      } else if (loctype == "<lmxll>") {
        tmymark[[i]] <- recode(x, "AB" = "lm", "BA" = "lm", "AA" = "--", "BB" = "kk")
      } else if (loctype == "<nnxnp>") {
        tmymark[[i]] <- recode(x, "AB" = "np", "BA" = "np", "AA" = "nn", "BB" = "--")
      } else if (loctype == "MN") {
        tmymark[[i]] <- "--"
      } else if (loctype == "NR") {
        tmymark[[i]] <- "NA"
      } else {
        cat("error wrong input code")
      }
    }

Am I on the right track?

Edit: expected result

          loctype P1 P2 I1 I2 I3 I4 KL MN
    mark1 <lmxmm> lm mm lm mm mm lm -- mm
    mark2 <hkxhk> hk hk hh kk kk hh -- kk
    mark3 <nnxnp> nn np nn -- -- nn -- --
    and so on

loops r

jon Feb 09 '12 at 15:41

2 answers

Your first error occurs because if needs a single logical value (or an expression that evaluates to one), whereas your comparison returns one value per row. Instead, you can use ifelse(), which is a "vectorized" if:

    loctype <- ifelse(tmymark$P1 == "AB" & tmymark$P2 == "AB", "<hkxhk>", else clauses...)
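As a rough sketch of how the nested form might look, here are the first three rules from the question applied to its P1 and P2 columns. The vectors p1 and p2 are copied by hand from the question's data rather than taken from tmymark, and every pair not covered by these three rules falls through to NA.

```r
# P1 and P2 columns of the question's tmymark, as plain character vectors
p1 <- c("AB", "AB", "BB", "AA", "AB", "--", "AB", "BB")
p2 <- c("BB", "AB", "AB", "AB", "AB", "BB", "--", "AA")

# The comparison yields one logical value per row, which is why if() complains
cond <- p1 == "AB" & p2 == "AB"
length(cond)

# ifelse() tests element-wise; nesting it plays the role of "else if"
loctype <- ifelse(p1 == "AB" & p2 == "AB", "<hkxhk>",
           ifelse(p1 == "AB" & p2 == "BB", "<lmxll>",
           ifelse(p1 == "AA" & p2 == "AB", "<nnxnp>",
                  NA_character_)))
loctype
```

Each additional rule adds another level of nesting, which is the main reason to prefer a lookup table once there are more than a few cases.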

To avoid the long chain of if() else() (or nested ifelse() calls), you can use match(). Create a data frame of your expected combinations of P1 and P2, and add a column for the desired loctype:

    matches <- data.frame(
      p1p2 = c('AB AB', 'AB BB', 'AA AB', 'AA BB', 'BB AA', '-- AA', 'AA --'),
      loctype = c('<hkxhk>', '<lmxll>', '<nnxnp>', 'MN', 'MN', 'NR', 'NR'))
    # matches$loctype is a vector, so it takes a single index (no trailing comma)
    loctype <- matches$loctype[match(paste(tmymark$P1, tmymark$P2), matches$p1p2)]
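The asker's original code printed an error for unexpected parent combinations. With match(), such rows simply come back as NA, so one possible way to surface them is sketched here. The vectors p1 and p2 are copied by hand from the question's data, and the names idx and unmatched are made up for this illustration.

```r
# P1 and P2 columns of the question's tmymark, as plain character vectors
p1 <- c("AB", "AB", "BB", "AA", "AB", "--", "AB", "BB")
p2 <- c("BB", "AB", "AB", "AB", "AB", "BB", "--", "AA")

matches <- data.frame(
  p1p2 = c("AB AB", "AB BB", "AA AB", "AA BB", "BB AA", "-- AA", "AA --"),
  loctype = c("<hkxhk>", "<lmxll>", "<nnxnp>", "MN", "MN", "NR", "NR"),
  stringsAsFactors = FALSE
)

# match() gives the lookup row for each P1/P2 pair, or NA when the pair is absent
idx <- match(paste(p1, p2), matches$p1p2)
loctype <- matches$loctype[idx]

# Report rows whose pair is missing from the lookup table (illustrative names)
unmatched <- which(is.na(idx))
if (length(unmatched) > 0) {
  message("No loctype for rows: ", paste(unmatched, collapse = ", "))
}
```

Whether an unmatched pair should stop the script or just be reported depends on the data; message() is used here only to mirror the cat() call in the question.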

The second part can be done in several ways, but I am drawing a blank on a "neat and tidy" one.

Justin Feb 09 '12 at 16:00

Aaron · Accepted Answer · 2012-02-09T16:33:26+0000

match is definitely the way to go. I would make two data frames as keys, for example:

    key <- data.frame(
      P1 = c("AB", "AB", "AA", "AA", "BB", "--", "AA"),
      P2 = c("AB", "BB", "AB", "BB", "AA", "AA", "--"),
      loctype = c("<hkxhk>", "<lmxll>", "<nnxnp>", "MN", "MN", "NR", "NR"))
    key2 <- cbind(
      `<hkxhk>` = c("hk", "hk", "hh", "kk"),
      `<lmxll>` = c("lm", "lm", "--", "kk"),
      `<nnxnp>` = c("np", "np", "nn", "--"),
      MN = rep("--", 4),
      NR = rep("NA", 4)
    )
    rownames(key2) = c("AB", "BA", "AA", "BB")

Then use match on key to get the loctype (as Justin recommends), and use it again on the row and column names of key2. Matrix indexing with those two sets of positions then pulls the desired replacement value out of key2.

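    # Look up the loctype for each row from its P1/P2 pair
    loctype <- key$loctype[match(with(tmymark, paste(P1, P2, sep="\b")),
                                 with(key, paste(P1, P2, sep="\b")))]
    # Row of key2 for every cell of tmymark (column-major order)
    ii <- match(as.vector(as.matrix(tmymark)), rownames(key2))
    # Column of key2 for every cell: the per-row loctype, repeated once per column
    # (ncol and nrow are both 8 for this data)
    jj <- rep(match(loctype, colnames(key2)), ncol(tmymark))
    out <- as.data.frame(matrix(key2[cbind(ii,jj)], nrow=nrow(tmymark)))
    colnames(out) <- colnames(tmymark)
    rownames(out) <- rownames(tmymark)
    out$loctype <- loctype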

The result then looks like this. The missing values are there because my keys have no entries for those combinations.

    > print(out, na="")
          P1 P2 I1 I2 I3 I4 KL MN loctype
    mark1 lm kk lm kk kk lm    kk <lmxll>
    mark2 hk hk hh kk kk hh    kk <hkxhk>
    mark3
    mark4 nn np nn -- -- nn    -- <nnxnp>
    mark5 hk hk hh kk kk hh    kk <hkxhk>
    mark6
    mark7
    mark8 -- -- -- -- -- --    --      MN
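
The key step in this answer is indexing a matrix with a two-column matrix of (row, column) positions, which picks one cell per pair. A stripped-down sketch with a made-up 2 x 2 lookup shows the mechanics, including how a code with no matching row becomes NA. The names lookup, genotype, type and recoded are illustrative only.

```r
# Tiny lookup: rows are genotype codes, columns are locus types
lookup <- cbind(
  `<hkxhk>` = c("hk", "hh"),
  `<nnxnp>` = c("np", "nn")
)
rownames(lookup) <- c("AB", "AA")

# One genotype and one locus type per cell to be recoded
genotype <- c("AB", "AA", "--", "AA")
type     <- c("<hkxhk>", "<hkxhk>", "<hkxhk>", "<nnxnp>")

ii <- match(genotype, rownames(lookup))  # row positions; "--" has no row, so NA
jj <- match(type, colnames(lookup))      # column positions

# A two-column index matrix selects lookup[ii[k], jj[k]] for each k
recoded <- lookup[cbind(ii, jj)]
recoded
```

An NA in either index column yields NA in the result, which explains the blank cells in the printed output.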
