I am using logistic regression with L-BFGS to classify examples into one of two categories. When I train the model, I get many warnings like this:
WARN scheduler.TaskSetManager: Stage 132 contains a task of very large size (109 KB). The maximum recommended task size is 100 KB.
WARN scheduler.TaskSetManager: Stage 134 contains a task of very large size (102 KB). The maximum recommended task size is 100 KB.
WARN scheduler.TaskSetManager: Stage 136 contains a task of very large size (109 KB). The maximum recommended task size is 100 KB.
I have about 94 features and about 7500 training examples. Is there some other argument I should pass to break the tasks up into smaller pieces?
Also, is this just a warning that can be ignored in the worst case, or does it interfere with training?
I call my trainer like this:
import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS
val lr_lbfgs = new LogisticRegressionWithLBFGS().setNumClasses(2)
lr_lbfgs.optimizer.setRegParam(reg).setNumIterations(numIterations)
val model = lr_lbfgs.run(trainingData)
In addition, my driver and executor memory are both 20G, which I set as arguments to spark-submit.