Task size too large

I use logistic regression with LBFGS to classify examples into one of two categories. When I train the model, I get many warnings of this kind:

WARN scheduler.TaskSetManager: Stage 132 contains a task of very large size (109 KB). The maximum recommended task size is 100 KB.
WARN scheduler.TaskSetManager: Stage 134 contains a task of very large size (102 KB). The maximum recommended task size is 100 KB.
WARN scheduler.TaskSetManager: Stage 136 contains a task of very large size (109 KB). The maximum recommended task size is 100 KB.

I have about 94 features and about 7,500 training examples. Is there some argument I need to pass in order to break the tasks into smaller pieces?
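For illustration, a minimal sketch of one way tasks end up smaller, under the assumption that trainingData was built with sc.parallelize from an in-memory collection (in that case each task embeds its slice of the data, so requesting more partitions shrinks every task; labeledExamples and numPartitions are placeholder names):

import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical: labeledExamples is a local Seq[LabeledPoint] on the driver.
// More slices means each serialized task carries a smaller chunk of data.
val numPartitions = 64 // placeholder; tune to your cluster
val trainingData = sc.parallelize(labeledExamples, numPartitions)

// For an RDD that already exists, repartition has a similar effect:
val trainingDataSplit = trainingData.repartition(numPartitions)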

Also, is this just a warning that, in the worst case, can be ignored? Or does it interfere with training?

This is how I invoke the trainer:

import org.apache.spark.mllib.classification.LogisticRegressionWithLBFGS

val lr_lbfgs = new LogisticRegressionWithLBFGS().setNumClasses(2)
lr_lbfgs.optimizer.setRegParam(reg).setNumIterations(numIterations)
val model = lr_lbfgs.run(trainingData)

In addition, my driver and executor memory is 20G, which I set as arguments to spark-submit.
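For reference, a sketch of the corresponding spark-submit invocation; the class name and jar path are placeholders, not from the original post:

spark-submit \
  --class com.example.TrainLR \
  --driver-memory 20G \
  --executor-memory 20G \
  target/my-app.jar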

1 answer

Spark serializes each task, together with everything its closure references, and ships it to the executors; the warning fires when that serialized size exceeds the recommended 100 KB. It is only a warning: it can be ignored, and it does not interfere with training.
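Not part of the original answer, but a common way to shrink task closures is to broadcast any large driver-side object instead of capturing it directly; a minimal sketch, where bigLocalArray and process are placeholder names:

// Hypothetical: bigLocalArray stands for any large object defined on the driver.
val bc = sc.broadcast(bigLocalArray)
// Tasks now ship only a small broadcast handle, not the full array.
val result = trainingData.map(p => process(p, bc.value))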


Source: https://habr.com/ru/post/1662293/
