So, as input I have a data set and several ML algorithms (with parameter settings) to run with scikit-learn. I have made several attempts to do this as efficiently as possible, but at this point I still don't have a proper way to evaluate my results. Also, I don't have much background in this area, so I need some help getting things clear.
Basically, I want to know how the tasks get distributed so that all available resources are used as much as possible, and what is actually handled implicitly (for example by Spark) and what is not.
This is my scenario:

I need to train many different decision tree models (as many as there are combinations of the possible parameters), many different Random Forest models, and so on.
One of my attempts is to parallelize the list of ML algorithms with Spark:
spark.parallelize(algorithms).map(lambda algorithm: run_experiment(dataframe, algorithm))
Inside run_experiment I create a GridSearchCV for the given ML algorithm and its parameter grid. I also set n_jobs=-1 to (try to) achieve maximum parallelism.
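To make this concrete, here is a simplified sketch of what I mean; the estimators and grids below are just placeholders, and algorithms is assumed to be a list of (estimator, parameter grid) pairs:

    from sklearn.tree import DecisionTreeClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import GridSearchCV

    # placeholder list of (estimator, parameter grid) pairs
    algorithms = [
        (DecisionTreeClassifier(), {"max_depth": [3, 5, 10], "criterion": ["gini", "entropy"]}),
        (RandomForestClassifier(), {"n_estimators": [50, 100], "max_depth": [5, 10]}),
    ]

    def run_experiment(data, algorithm):
        estimator, param_grid = algorithm
        X, y = data                                  # plain local arrays on the worker
        # n_jobs=-1: use all cores of whatever machine this runs on
        search = GridSearchCV(estimator, param_grid, cv=5, n_jobs=-1)
        search.fit(X, y)
        return type(estimator).__name__, search.best_params_, search.best_score_

My understanding is that each element of algorithms becomes one Spark task, and inside that task GridSearchCV tries to use all the cores of whichever worker the task landed on, but this is exactly the part I'm unsure about.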
So the question is: when this runs on my Spark cluster, how is the work actually distributed? Does this approach make sense, or am I missing something?

For example, when a single Random Forest is trained, will that training be spread over several nodes, or does each model stay on one node? In other words, I'm not sure whether the parallelism happens across models, across parameter combinations, or inside a single model.
Also, instead of my parallelize/for-loop approach around GridSearchCV, would it be better to use Databricks' spark-sklearn package to integrate Spark with scikit-learn? As far as I can tell, it would be used roughly like this:
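(This is only a sketch based on my reading of the spark-sklearn README; sc is the SparkContext and X, y are ordinary local arrays.)

    from sklearn.ensemble import RandomForestClassifier
    from spark_sklearn import GridSearchCV  # same interface as sklearn's, but fans the grid out over Spark

    param_grid = {"n_estimators": [50, 100], "max_depth": [5, 10]}
    gs = GridSearchCV(sc, RandomForestClassifier(), param_grid)  # sc: the SparkContext
    gs.fit(X, y)                                                 # X, y: ordinary local arrays
    print(gs.best_params_)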

Or, to distribute the training of the ML algorithms, should I drop scikit-learn and use Spark MLlib instead? What would be the advantages/disadvantages?
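For comparison, my understanding is that the MLlib route would look roughly like this (the column names and the grid are placeholders), with the training of even a single model distributed over the cluster:

    from pyspark.ml.classification import RandomForestClassifier
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    rf = RandomForestClassifier(labelCol="label", featuresCol="features")
    grid = (ParamGridBuilder()
            .addGrid(rf.numTrees, [50, 100])
            .addGrid(rf.maxDepth, [5, 10])
            .build())
    cv = CrossValidator(estimator=rf, estimatorParamMaps=grid,
                        evaluator=BinaryClassificationEvaluator(), numFolds=3)
    model = cv.fit(train_df)  # train_df: a Spark DataFrame; the training itself runs on the cluster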
Maybe I'm overcomplicating things, but these points are still confusing to me. Any help is appreciated.
Note: I also posted this question on CS stackexchange.