I am writing a function that uses k-means to determine the bin widths used to convert a continuous measurement (a predicted probability) into an integer (one of three bins). I came across a boundary case in which my algorithm can (correctly) predict the same probability for the whole set, and I want to handle this situation. I use the binning() function from the rattle package as follows:
    btsKmeansBin <- function(x, k = 3, default = c(0, 0.3, 0.5, 1)) {
      result <- binning(x, bins = k, method = "kmeans", ordered = T)
      bins <- attr(result, "breaks")
      attr(bins, "names") <- NULL
      bins <- bins[order(bins)]
      bins[1] <- 0
      bins[length(bins)] <- 1
      return(bins)
    }
Run this function on x <- c(.5,.5,.5,.5,.5,.5) and you will get an error at the order(bins) step, because bins will be NULL and is therefore not a vector.
Obviously, if x has only one distinct value, k-means cannot work. In this case, I would like to return the default bins. When this happens, binning() displays the message "Warning: variable is not considered." I would therefore like to use tryCatch to handle this warning, but wrapping the line result <- ... in the following code does not work as I expect:
    ...
    tryCatch({
      result <- binning(x, bins = k, method = "kmeans", ordered = T)
    }, warning = function(w) {
      warn(sprintf("%s. Using default values", w))
      return(default)
    }, error = function(e) {
      stop(e)
    })
    ...
The warning is printed as if I had not used tryCatch, and the code moves past the return and again throws an error from order. I tried a number of variations, but nothing worked. What am I missing here?
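A minimal base-R sketch of the two cases that may be in play, written without rattle. The functions `signalsWarning`, `printsWarning`, and `withDefault` are hypothetical stand-ins invented for illustration; whether binning() signals a real warning or merely prints its message is an assumption that would need checking against the rattle source. The sketch shows that a `warning =` handler only runs for a condition raised with warning(), and that the handler's value becomes the value of tryCatch() rather than the return value of the enclosing function.

```r
# Hypothetical stand-ins for binning(); neither is part of rattle.
signalsWarning <- function(x) {
  warning("variable is not considered")  # raises a real warning condition
  NULL
}
printsWarning <- function(x) {
  cat("Warning: variable is not considered.\n")  # plain output, no condition
  NULL
}

# Evaluate f(x); if a warning is signalled, fall back to `default`.
withDefault <- function(f, x, default = c(0, 0.3, 0.5, 1)) {
  tryCatch(
    f(x),
    warning = function(w) {
      # The handler's value becomes the value of tryCatch(); it does not
      # return from withDefault() by itself.
      message(conditionMessage(w), ". Using default values")
      default
    }
  )
}

x <- c(.5, .5, .5, .5, .5, .5)
caught <- withDefault(signalsWarning, x)  # handler runs, default is returned
missed <- withDefault(printsWarning, x)   # handler never runs, NULL is returned
```

Under these assumptions, the value of tryCatch() has to be used (assigned or returned) for the fallback to take effect, and text that is only printed cannot be intercepted by a `warning =` handler at all.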