Error BTYD: pnbd.EstimateParameters: L-BFGS-B requires final values โ€‹โ€‹of 'fn'

Usage > params <- pnbd.EstimateParameters (cal.cbs)

from the BTYD package I get the following error:

" optim (logparams, pnbd.eLL, cal.cbs = cal.cbs, max.param.value = max.param.value, L-BFGS-B needs the final values' fn ". "

What does it mean? What could be causing this error? My cbs (client with sufficient statistics) matrix has 21394 3 large with the desired columns: x, tx, T.cal.

Cbs info:

  • max (cal.cbs $ x) = 302
  • min (cal.cbs $ x) = 0
  • max (cal.cbs $ tx) = 89
  • min (cal.cbs $ tx) = 0
  • max (cal.cbs $ T.cal) = 89.57143
  • min (cal.cbs $ T.cal) = 0
+5
source share
2 answers

I spent some time reading and modifying the source code and finally found a calculation error when calculating the probability of registration and correction. You can run the code below and call the pnbd.EstimateParameters.ori () function to try again. It works for my case (I had the same error as yours).

pnbd.cbs.LL.ori = function (params, cal.cbs) { dc.check.model.params(c("r", "alpha", "s", "beta"), params, "pnbd.cbs.LL") tryCatch(x <- cal.cbs[, "x"], error = function(e) stop("Error in pnbd.cbs.LL: cal.cbs must have a frequency column labelled \"x\"")) tryCatch(tx <- cal.cbs[, "tx"], error = function(e) stop("Error in pnbd.cbs.LL: cal.cbs must have a recency column labelled \"tx\"")) tryCatch(T.cal <- cal.cbs[, "T.cal"], error = function(e) stop("Error in pnbd.cbs.LL: cal.cbs must have a column for length of time observed labelled \"T.cal\"")) if ("custs" %in% colnames(cal.cbs)) { custs <- cal.cbs[, "custs"] } else { custs <- rep(1, length(x)) } return(sum(custs * pnbd.LL.ori(params, x, tx, T.cal)))## changed } pnbd.LL.ori = function (params, x, tx, T.cal) { max.length <- max(length(x), length(tx), length(T.cal)) if (max.length%%length(x)) warning("Maximum vector length not a multiple of the length of x") if (max.length%%length(tx)) warning("Maximum vector length not a multiple of the length of tx") if (max.length%%length(T.cal)) warning("Maximum vector length not a multiple of the length of T.cal") dc.check.model.params(c("r", "alpha", "s", "beta"), params, "pnbd.LL") if (any(x < 0) || !is.numeric(x)) stop("x must be numeric and may not contain negative numbers.") if (any(tx < 0) || !is.numeric(tx)) stop("tx must be numeric and may not contain negative numbers.") if (any(T.cal < 0) || !is.numeric(T.cal)) stop("T.cal must be numeric and may not contain negative numbers.") x <- rep(x, length.out = max.length) tx <- rep(tx, length.out = max.length) T.cal <- rep(T.cal, length.out = max.length) r <- params[1] alpha <- params[2] s <- params[3] beta <- params[4] maxab <- max(alpha, beta) absab <- abs(alpha - beta) param2 <- s + 1 if (alpha < beta) { param2 <- r + x } part1 <- r * log(alpha) + s * log(beta) - lgamma(r) + lgamma(r + x) part2 <- -(r + x) * log(alpha + T.cal) - s * log(beta + T.cal) if (absab == 0) { F1 <- -(r + s + x) * log(maxab + tx) F2 <- -(r + s + x) * log(maxab + T.cal) partF <- subLogs.ori(F1, F2)## changed } else { F1 <- hyperg_2F1(r + s + x, param2, r + s + x + 1, absab/(maxab + tx))/((maxab + tx)^(r + s + x)) F2 <- hyperg_2F1(r + s + x, param2, r + s + x + 1, absab/(maxab + T.cal))/((maxab + T.cal)^(r + s + x)) partF <- log(F1 - F2) } part3 <- log(s) - log(r + s + x) + partF ## modified result = part1+ part2+ log(1 + exp(part3 - part2)) return(result) } subLogs.ori = function (loga, logb) { ## this function is modified myvec = loga - logb sel = myvec <30 result = rep(0,length(myvec)) result[sel] = logb[sel] + log(exp(loga[sel] - logb[sel]) - 1) result[!sel] = loga[!sel] return(result) } pnbd.EstimateParameters.ori = function (cal.cbs, par.start = c(1, 1, 1, 1), max.param.value = 10000) { dc.check.model.params(c("r", "alpha", "s", "beta"), par.start, "pnbd.EstimateParameters") pnbd.eLL <- function(params, cal.cbs, max.param.value) { params <- exp(params) params[params > max.param.value] <- max.param.value return(-1 * pnbd.cbs.LL.ori(params, cal.cbs))## changed } logparams <- log(par.start) results <- optim(logparams, pnbd.eLL, cal.cbs = cal.cbs, max.param.value = max.param.value, method = "L-BFGS-B") estimated.params <- exp(results$par) estimated.params[estimated.params > max.param.value] <- max.param.value return(estimated.params) } params <- pnbd.EstimateParameters.ori(cal.cbs) 
+2
source

Can you try this:

 R> cal.cbs1 = subset(cal.cbs, x<100) R> params <- pnbd.EstimateParameters(cal.cbs1) 

Presumably, customers should not make more purchases (x) than the number of days (T.cal) that he / she observed, because the algorithm assumes that each user can make no more than one purchase per day. If one user makes more than two purchases, the dc.MergeTransactionsOnSameDate function should combine them with one purchase per day. So first try with a little x first.

I also believe that there is some computational error with the pnbd.EstimateParaters () function, as it is called the optim () function in R. The error message that you see refers to the optim () function.

0
source

Source: https://habr.com/ru/post/1204337/


All Articles