I would go with hyperopt:
https://github.com/hyperopt/hyperopt
It is open source and did a great job for me. If you decide to use it and need help, I can clarify.
When you want to compare "max_depth": [2, 4, 6], you can naively solve the problem by training three models, one for each maximum depth, and seeing which model gives the best results.
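A minimal sketch of that naive approach (assuming you have feature matrix `X` and labels `y`, and that AUC is your metric):

```python
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Train one model per candidate max_depth and compare cross-validated scores.
for max_depth in [2, 4, 6]:
    model = XGBClassifier(max_depth=max_depth)
    score = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    print(f"max_depth={max_depth}: mean AUC = {score:.4f}")
```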
But "max_depth" is not the only hyperparameter you should consider. There are many other hyper parameters, such as: eta (learning rate), gamma, min_child_weight, subsample , etc. Some of them continue, and some are discrete. (assuming that you know your target functions and scorecards)
You can read about all of them here: https://github.com/dmlc/xgboost/blob/master/doc/parameter.md
When you look at all these parameters together, the search space they create is enormous. You cannot search it manually (and an "expert" cannot give you much better starting values).
Thus, hyperopt gives you a neat solution for this: it searches the space in a way that is neither purely random nor an exhaustive grid. All you have to do is define the parameters and their ranges (see the sketch below).
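A rough sketch of what that looks like (not the exact code from the link below; `X` and `y` are placeholders for your data, and the ranges are just illustrative):

```python
import xgboost as xgb
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.model_selection import cross_val_score

# Define the search space: discrete parameters via quniform, continuous via (log)uniform.
space = {
    "max_depth": hp.quniform("max_depth", 2, 10, 1),            # discrete
    "eta": hp.loguniform("eta", -5, 0),                          # continuous (learning rate)
    "gamma": hp.uniform("gamma", 0, 5),                          # continuous
    "min_child_weight": hp.quniform("min_child_weight", 1, 10, 1),
    "subsample": hp.uniform("subsample", 0.5, 1.0),
}

def objective(params):
    # quniform returns floats, so cast the integer-valued parameters.
    params["max_depth"] = int(params["max_depth"])
    params["min_child_weight"] = int(params["min_child_weight"])
    model = xgb.XGBClassifier(n_estimators=200, **params)
    score = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
    # hyperopt minimizes the loss, so negate the score.
    return {"loss": -score, "status": STATUS_OK}

trials = Trials()
best = fmin(objective, space, algo=tpe.suggest, max_evals=100, trials=trials)
print(best)
```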
Here you can find a sample code: https://github.com/bamine/Kaggle-stuff/blob/master/otto/hyperopt_xgboost.py
I can tell you from my own experience that it worked better than Bayesian optimization on my models. Give it a few hours or days of trial and error, and contact me if you run into problems you cannot solve.
Good luck