I am getting the error below, which appears to be related to NULL values being forced into a data frame. The dataset contains zeros; I tried using the is.na() and is.null() functions to replace the null values with something else, but the error persists. The data is stored in HDFS in pig.hive format. My code is included below. The code works fine if I remove v[, 25] from the key.
Code:
AM = c("AN")
UK = c("PP")

sample.map <- function(k, v) {
  key <- data.frame(acc   = v[!which(is.na(v[,1])), 1],
                    year  = substr(v[!which(is.na(v[,1])), 2], 1, 4),
                    month = substr(v[!which(is.na(v[,1])), 2], 5, 6))
  value <- data.frame(v[,3], count = 1)
  keyval(key, value)
}

sample.reduce <- function(key, v) {
  AT <- sum(v[which(v[,1] %in% AM == "TRUE"), 2])
  UnknownT <- sum(v[which(v[,1] %in% UK == "TRUE"), 2])
  Total <- AT + UnknownT
  d <- data.frame(AT, UnknownT, Total)
  keyval(key, d)
}

out <- mapreduce(input = "/user/hduser/input",
                 output = "/user/hduser/output",
                 input.format = make.input.format("pig.hive", sep = "\u0001"),
                 output.format = make.output.format("csv", sep = ","),
                 map = sample.map,
                 reduce = sample.reduce)
Error:
Warning in asMethod(object) : NAs introduced by coercion
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
  data length is not a multiple of split variable
Warning in rmr.split(x, x, FALSE, keep.rownames = FALSE) :
  number of items to replace is not a multiple of replacement length
Warning in split.default(1:rmr.length(y), unique(ind), drop = TRUE) :
  data length is not a multiple of split variable
Warning in rmr.split(v, ind, lossy = lossy, keep.rownames = TRUE) :
  number of items to replace is not a multiple of replacement length
Error in as(x, class(k)) :
  no method or default for coercing "NULL" to "data.frame"
Calls: <Anonymous> ... apply.reduce -> c.keyval -> reduce.keyval -> lapply -> FUN -> as
No traceback available
UPDATE: I added sample data and edited the code above. Hope this helps!
Sample data:
NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"
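
To make the sample data easier to experiment with, here is a minimal local sketch in plain R (no Hadoop or rmr2). It assumes the NULL entries arrive as the literal string "NULL", which I have not verified for the pig.hive input format, and it parses the rows with read.csv purely for illustration. It compares the row filter used in the map function, !which(is.na(...)), with a plain !is.na(...) filter. Whether this is connected to the error above is not certain.

```r
# Sample rows from above, read locally (assumption: NULL is the literal text "NULL").
txt <- 'NULL,"2014-03-14","PP"
345689202,"2014-03-14","AN"
234539390,"2014-03-14","PP"
123125444,"2014-03-14","AN"
NULL,"2014-03-14","AN"
901828393,"2014-03-14","AN"'

v <- read.csv(text = txt, header = FALSE, stringsAsFactors = FALSE)

# Column 1 is read as character because of the "NULL" strings; converting it to
# numeric turns them into NA (and normally warns "NAs introduced by coercion").
acc <- suppressWarnings(as.numeric(v[, 1]))

# Filter as written in sample.map: which() returns row positions, and negating
# positive numbers gives FALSE for each of them, so no rows are selected.
bad <- v[!which(is.na(acc)), ]

# Plain logical filter: keeps the rows whose first column is not NA.
good <- v[!is.na(acc), ]

print(nrow(bad))
print(nrow(good))
```

With this sample, the two filters select different numbers of rows, so it may be worth checking what the key data frame actually contains inside the map function.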