Gaussian Processes for Machine Learning

Carl Edward Rasmussen and Christopher K. I. Williams
MIT Press, 2006. ISBN-10 0-262-18253-X, ISBN-13 978-0-262-18253-9.

Book description

Winner, 2009 DeGroot Prize for the best book in statistical science, awarded by the International Society for Bayesian Analysis.

Gaussian processes (GPs) provide a principled, practical, probabilistic approach to learning in kernel machines. GPs have received increased attention in the machine-learning community over the past decade, and this book provides a long-needed systematic and unified treatment of theoretical and practical aspects of GPs in machine learning. The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics.

The book deals with the supervised-learning problem for both regression and classification, and includes detailed algorithms. A wide variety of covariance (kernel) functions are presented and their properties discussed. Model selection is discussed from both a Bayesian and a classical perspective. Many connections to other well-known techniques from machine learning and statistics are discussed, including support-vector machines, neural networks, splines, regularization networks, and relevance vector machines. Theoretical issues including learning curves and the PAC-Bayesian framework are treated, and several approximation methods for learning with large datasets are discussed. The book contains illustrative examples and exercises, and code and datasets are available on the Web. Appendixes provide mathematical background and a discussion of Gaussian Markov processes.

About the Authors

Carl Edward Rasmussen is a Lecturer at the Department of Engineering, University of Cambridge, and Adjunct Research Scientist at the Max Planck Institute for Biological Cybernetics, Tübingen.

Christopher K. I. Williams is Professor of Machine Learning and Director of the Institute for Adaptive and Neural Computation in the School of Informatics, University of Edinburgh.


The whole book as a single pdf file.

List of contents and individual chapters in pdf format


Table of Contents
Series Foreword
Symbols and Notation
Introduction
1.1 A Pictorial Introduction to Bayesian Modelling
1.2 Roadmap
Regression
2.1 Weight-space View
2.2 Function-space View
2.3 Varying the Hyperparameters
2.4 Decision Theory for Regression
2.5 An Example Application
2.6 Smoothing, Weight Functions and Equivalent Kernels
2.7 History and Related Work
2.8 Appendix: Infinite Radial Basis Function Networks
2.9 Exercises
Classification
3.1 Classification Problems
3.2 Linear Models for Classification
3.3 Gaussian Process Classification
3.4 The Laplace Approximation for the Binary GP Classifier
3.5 Multi-class Laplace Approximation
3.6 Expectation Propagation
3.7 Experiments
3.8 Discussion
3.9 Appendix: Moment Derivations
3.10 Exercises
Covariance Functions
4.1 Preliminaries
4.2 Examples of Covariance Functions
4.3 Eigenfunction Analysis of Kernels
4.4 Kernels for Non-vectorial Inputs
4.5 Exercises
Model Selection and Adaptation of Hyperparameters
5.1 The Model Selection Problem
5.2 Bayesian Model Selection
5.3 Cross-validation
5.4 Model Selection for GP Regression
5.5 Model Selection for GP Classification
5.6 Exercises
Relationships between GPs and Other Models
6.1 Reproducing Kernel Hilbert Spaces
6.2 Regularization
6.3 Spline Models
6.4 Support Vector Machines
6.5 Least-Squares Classification
6.6 Relevance Vector Machines
6.7 Exercises
Theoretical Perspectives
7.1 The Equivalent Kernel
7.2 Asymptotic Analysis
7.3 Average-case Learning Curves
7.4 PAC-Bayesian Analysis
7.5 Comparison with Other Supervised Learning Methods
7.6 Appendix: Learning Curve for the Ornstein-Uhlenbeck Process
7.7 Exercises
Approximation Methods for Large Datasets
8.1 Reduced-rank Approximations of the Gram Matrix
8.2 Greedy Approximation
8.3 Approximations for GPR with Fixed Hyperparameters
8.4 Approximations for GPC with Fixed Hyperparameters
8.5 Approximating the Marginal Likelihood and its Derivatives
8.6 Appendix: Equivalence of SR and GPR using the Nyström Approximate Kernel
8.7 Exercises
Further Issues and Conclusions
9.1 Multiple Outputs
9.2 Noise Models with Dependencies
9.3 Non-Gaussian Likelihoods
9.4 Derivative Observations
9.5 Prediction with Uncertain Inputs
9.6 Mixtures of Gaussian Processes
9.7 Global Optimization
9.8 Evaluation of Integrals
9.9 Student’s t Process
9.10 Invariances
9.11 Latent Variable Models
9.12 Conclusions and Future Directions
Mathematical Background
Gaussian Markov Processes
Datasets and Code
Author Index
Subject Index