# Optimizers

## RandomSearch(Hyperopt)Optimizer

This optimizer produces hyperparameter configurations by random sampling. First, the type of ML algorithm is sampled uniformly from all selected algorithms. Then each hyperparameter value is sampled uniformly from the appropriate range. This optimizer handles selection of the ML algorithm and all types of hyperparameters. Its computational complexity is independent of the number of previous trials.
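The two-stage sampling described above can be sketched as follows. This is a toy illustration, not the hyperopt implementation; the algorithm names, hyperparameter names, and ranges in `SEARCH_SPACE` are hypothetical.

```python
import random

# Hypothetical search space: one uniform range per hyperparameter,
# grouped by ML algorithm (names are illustrative only).
SEARCH_SPACE = {
    "random_forest": {"n_estimators": (10, 500), "max_depth": (2, 20)},
    "svm": {"C": (0.001, 100.0), "gamma": (0.0001, 1.0)},
}

def sample_configuration(space, rng=random):
    """First sample the algorithm uniformly, then each hyperparameter uniformly."""
    algorithm = rng.choice(sorted(space))     # uniform over selected algorithms
    params = {name: rng.uniform(lo, hi)       # uniform over each range
              for name, (lo, hi) in space[algorithm].items()}
    return algorithm, params

algorithm, params = sample_configuration(SEARCH_SPACE)
```

Because each configuration is drawn independently of all previous ones, the cost per suggestion stays constant regardless of how many trials have already run.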

## AnnealHyperoptOptimizer

This optimizer is similar to the previous one, but gradually diverges from uniform distributions. Each time a new hyperparameter configuration is evaluated, the internal sampling distributions shift slightly towards the values that performed well.
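The idea of drifting away from a uniform distribution can be sketched in a simplified one-dimensional form. This is not hyperopt's actual anneal implementation; the geometric shrinking schedule below is an assumption made for illustration.

```python
import random

def anneal_sample(lo, hi, history, rng=random, shrink=0.9):
    """Toy annealing sampler for one continuous hyperparameter.

    With no history it samples uniformly; otherwise it narrows the sampling
    window around the best value seen so far, shrinking the window
    geometrically with the number of evaluated trials.
    """
    if not history:
        return rng.uniform(lo, hi)
    best_value, _ = min(history, key=lambda pair: pair[1])  # (value, loss) pairs
    width = (hi - lo) * shrink ** len(history)
    return min(hi, max(lo, rng.uniform(best_value - width, best_value + width)))

# history holds (hyperparameter value, observed loss) pairs
history = [(0.3, 1.2), (0.7, 0.4), (0.65, 0.5)]
x = anneal_sample(0.0, 1.0, history)
```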

## TPEHyperoptOptimizer

This optimizer implements the Tree-structured Parzen Estimator (TPE) algorithm. It is a form of Bayesian optimization and belongs to the family of sequential model-based optimization (SMBO) algorithms. It samples new hyperparameter configurations from a tree-based model of the distribution over all possible configurations. After each new configuration is evaluated, the model is updated based on Bayes' theorem and the Expected Improvement criterion. This approach leads to a trade-off between exploration and exploitation of the search space: points with a higher expected improvement over the current maximum score are chosen. This optimizer handles selection of the ML algorithm and all types of hyperparameters. Its computational complexity is linear in the number of previous trials.

Implementation: https://github.com/hyperopt/hyperopt

Paper: J. Bergstra, R. Bardenet, Y. Bengio and B. Kégl. Algorithms for Hyper-parameter Optimization. Proc. Neural Information Processing Systems 24 (NIPS 2011)
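The core TPE step can be sketched in a toy one-dimensional form: past trials are split into "good" and "bad" sets by a loss quantile, a Parzen (kernel density) estimate is fitted to each set, and the next point maximizes the density ratio l(x)/g(x). The fixed bandwidth and the candidate count are simplifying assumptions, and the linear cost in the number of trials is visible in the density sums.

```python
import math
import random

def parzen_density(x, points, bandwidth=0.1):
    """Parzen (kernel density) estimate with Gaussian kernels."""
    return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2)
               for p in points) / len(points)

def tpe_suggest(trials, lo, hi, gamma=0.25, n_candidates=50, rng=random):
    """Toy 1-D TPE step over (value, loss) pairs: fit l(x) to the best
    gamma-fraction of trials and g(x) to the rest, then return the
    candidate maximizing l(x)/g(x)."""
    trials = sorted(trials, key=lambda t: t[1])          # sort by loss
    n_good = max(1, int(gamma * len(trials)))
    good = [x for x, _ in trials[:n_good]]
    bad = [x for x, _ in trials[n_good:]] or good        # avoid an empty set
    candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
    return max(candidates,
               key=lambda x: parzen_density(x, good)
                             / (parzen_density(x, bad) + 1e-12))
```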

## NelderMeadOptimizer

The Nelder-Mead algorithm is a derivative-free method for finding the optimum of a function. In our case, this function is the selected performance measure of hyperparameter configurations. It is often characterised as a local search method. It supports searching for configurations of only a single ML algorithm and handles only continuous hyperparameters. To compensate for these shortcomings, we run this (and similar subsequent) methods separately for each ML algorithm selected in the search space. The computational complexity of the Nelder-Mead optimizer is independent of the number of previous trials.

Paper: J. Nelder, R. Mead. A simplex method for function minimization. Computer Journal, 7 (1965)
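A minimal two-dimensional Nelder-Mead (reflection, expansion, contraction, shrink) can be sketched as below; the objective stands in for the performance measure of two continuous hyperparameters, and the coefficients are the textbook defaults.

```python
def nelder_mead(f, simplex, n_iter=200, alpha=1.0, gamma=2.0, rho=0.5, sigma=0.5):
    """Minimal Nelder-Mead minimization over 2 continuous variables.
    `simplex` is a list of three starting vertices."""
    pts = [list(p) for p in simplex]
    for _ in range(n_iter):
        pts.sort(key=f)
        best, worst = pts[0], pts[-1]
        # centroid of all vertices except the worst
        centroid = [sum(p[i] for p in pts[:-1]) / (len(pts) - 1) for i in range(2)]
        reflected = [centroid[i] + alpha * (centroid[i] - worst[i]) for i in range(2)]
        if f(reflected) < f(best):
            # reflection is the new best: try expanding further
            expanded = [centroid[i] + gamma * (reflected[i] - centroid[i])
                        for i in range(2)]
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(pts[-2]):
            pts[-1] = reflected
        else:
            # contract towards the centroid, or shrink the whole simplex
            contracted = [centroid[i] + rho * (worst[i] - centroid[i])
                          for i in range(2)]
            if f(contracted) < f(worst):
                pts[-1] = contracted
            else:
                pts = [best] + [[best[i] + sigma * (p[i] - best[i]) for i in range(2)]
                                for p in pts[1:]]
    pts.sort(key=f)
    return pts[0]

best = nelder_mead(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                   [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)])
```

Note that each iteration only evaluates a handful of new points and keeps no model of past trials, which is why the per-iteration cost does not grow with the number of previous trials.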

## PSOOptimizer

The Particle Swarm optimizer works by iteratively trying to improve a set of candidate solutions (hyperparameter configurations in our case), called particles. Each particle has a mass and a velocity and moves through the search space according to certain rules, so the hyperparameter values of each particle do not change dramatically between steps. The Particle Swarm optimizer supports searching for configurations of only a single ML algorithm and handles only continuous hyperparameters.

## LIPOOptimizer

This optimizer also belongs to the SMBO category, but is not based on the Bayesian paradigm. It uses local linear and quadratic functions to approximate the hyperparameter configuration performance surface. The LIPO optimizer supports searching for configurations of only a single ML algorithm and handles discrete and continuous hyperparameters.

Description: http://blog.dlib.net/2017/12/a-global-optimization-algorithm-worth.html

Paper: C. Malherbe, N. Vayatis. Global optimization of Lipschitz functions. ICML 2017.
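The Lipschitz-bound part of LIPO can be sketched in one dimension: if the objective has Lipschitz constant k, then U(x) = min_i (f(x_i) + k·|x − x_i|) is an upper bound on f everywhere, and the next point is chosen where this bound is highest. This omits dlib's local quadratic models and assumes k is known, which the real algorithm estimates.

```python
import random

def lipo_suggest(f_evals, lo, hi, lipschitz_k=10.0, n_candidates=200, rng=random):
    """Toy 1-D LIPO step (maximization) over past (x, f(x)) evaluations:
    return the random candidate with the highest Lipschitz upper bound."""
    def upper_bound(x):
        # each past evaluation caps how large f(x) can be k-Lipschitz-away
        return min(fx + lipschitz_k * abs(x - xi) for xi, fx in f_evals)
    candidates = [rng.uniform(lo, hi) for _ in range(n_candidates)]
    return max(candidates, key=upper_bound)
```

Points far from all previous evaluations get a loose (high) bound, so the rule naturally balances exploring unvisited regions against refining near good observations.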

## SkOptBaseOptimizer

This optimizer is based on the scikit-optimize package. It is also model-based (SMBO) and Bayesian. Internally, it uses Gaussian Processes to model the distribution over all possible hyperparameter configurations. The SkOpt optimizer supports searching for configurations of only a single ML algorithm and handles discrete and continuous hyperparameters. Internally, we use a separate model for each ML algorithm selected in the search space. Due to the use of Gaussian Processes, each iteration of this optimizer is cubic in the number of previous trials.

Implementation: https://github.com/scikit-optimize/scikit-optimize
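To see where the cubic cost comes from, the Gaussian Process posterior mean can be sketched in one dimension: it requires solving a linear system in the n×n kernel matrix of all previous trials, which takes O(n³) time. This is a from-scratch toy, not scikit-optimize's implementation; the RBF length scale and noise level are illustrative assumptions.

```python
import math

def rbf(a, b, length=0.3):
    """RBF (squared-exponential) kernel on scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Gaussian elimination with partial pivoting; O(n^3) in the system size."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_posterior_mean(x_star, xs, ys, noise=1e-6):
    """Posterior mean k*^T K^{-1} y of a GP with an RBF kernel; the
    K^{-1} y solve over all previous trials is the cubic step."""
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, ys)
    return sum(rbf(x_star, xi) * a for xi, a in zip(xs, alpha))
```

With near-zero noise the posterior mean interpolates the observed scores, and an acquisition function (e.g. expected improvement, not shown) would then pick the next configuration from this surrogate.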