The online book Probabilistic Programming & Bayesian Methods for Hackers, written by Cam Davidson-Pilon and many contributors, describes itself as “An intro into Bayesian methods and probabilistic programming from a computation/understanding-first, mathematics-second point of view”. It is written in a friendly, easy-to-follow style as a collection of IPython notebooks, one per chapter. Interested readers can download the notebooks and experiment with the methods they present.
Many learning procedures require the specification of hyper-parameters. The usual practice is to measure the performance of the learning method on a validation set and then invoke an optimization procedure to select good hyper-parameter values. A traditional approach is to cover the range of the hyper-parameters with a grid, fit a model at each grid point, and select the values for which the validation error is minimal. Here are some recent interesting posts on hyper-parameter optimization:
- James Bergstra and Yoshua Bengio suggested using random search for tuning hyper-parameters. This procedure can find models as good as those computed by grid search at a fraction of the computational cost. A simple explanation of this good performance is given in a nice post by Alice Zheng, which also describes other tuning strategies.
- Daniel Saltiel argues that random search is no better than grid search and proposes using Bayesian optimization instead. The author presents the method at a conceptual level and points to two open-source libraries with an IPython notebook example.
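To make the grid-versus-random comparison concrete, here is a minimal sketch in plain Python. The `validation_error` function is a hypothetical error surface standing in for "train a model and score it on the validation set" (its shape and minimum are invented for illustration); both strategies are given the same evaluation budget.

```python
import random

def validation_error(lr, reg):
    # Hypothetical validation-error surface, minimized at lr=0.1, reg=0.01.
    # In practice this would train a model and score it on held-out data.
    return (lr - 0.1) ** 2 + (reg - 0.01) ** 2

# Grid search: evaluate every combination on a fixed grid (16 evaluations).
lrs = [0.001, 0.01, 0.1, 1.0]
regs = [0.001, 0.01, 0.1, 1.0]
grid_best = min(
    ((lr, reg) for lr in lrs for reg in regs),
    key=lambda p: validation_error(*p),
)

# Random search: sample the same number of points uniformly at random.
random.seed(0)
candidates = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(16)]
random_best = min(candidates, key=lambda p: validation_error(*p))

print("grid best:", grid_best)
print("random best:", random_best)
```

Grid search can only ever return a point that lies on the grid, while random search explores the space more densely along each individual dimension, which is the intuition behind its good performance when only a few hyper-parameters actually matter.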