GPML Extensions
===============

This repository contains a collection of extensions to the popular
GPML toolbox for Gaussian process inference in `MATLAB`, available
here:

http://www.gaussianprocess.org/gpml/code/matlab/doc/

We provide code for:

* Incorporating arbitrary hyperparameter priors ![p(\theta)][1] into
  any inference method, allowing for MAP rather than MLE inference
  during hyperparameter learning.
* An extended API for mean and covariance functions for computing
  Hessians with respect to their hyperparameters.
* An extended API for inference methods to compute the (potentially
  approximate) Hessian of the negative log likelihood/posterior with
  respect to hyperparameters.
* Implementations of several new mean and covariance functions.
* A number of additional utilities, for example, rank-one updates for
  quickly updating a posterior when performing online GP regression.

Hyperparameter Priors
---------------------

We establish a new, simple API for specifying arbitrary hyperparameter
priors ![p(\theta)][1]. The API is:

    [nlZ, dnlZ, HnlZ] = prior(hyperparameters)

where the input is:

* `hyperparameters`: a GPML hyperparameter struct specifying ![\theta][2]

and the outputs are:

* `nlZ`: the negative of the log prior evaluated at ![\theta][2],
  ![-\log p(\theta)][3]
* `dnlZ`: a struct containing the gradient of the negative log prior
  evaluated at ![\theta][2],
  ![\nabla -\log p(\theta)][4]
* `HnlZ`: (optional) a struct containing the Hessian of the negative
  log prior evaluated at ![\theta][2],
  ![H( -\log p(\theta) )][5]

The `dnlZ` struct is specified in the same way as is typical for GPML
(for example, as the second output of `gp.m` in training mode), and,
if needed, the Hessian is specified as described in the section on
Hessians below.
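
For example, a minimal sketch of building and evaluating a prior
directly, using the `get_prior`, `independent_prior`, and
`gaussian_prior` functions described below:

    % place a standard normal prior on a single mean hyperparameter
    priors.mean = {get_prior(@gaussian_prior, 0, 1)};
    prior = get_prior(@independent_prior, priors);

    % evaluate the negative log prior and its gradient
    theta.mean = 0.5;
    [nlZ, dnlZ] = prior(theta);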

We provide an implementation of a flexible family of such priors in
`independent_prior.m`. This implements a meta-prior of the form

![p(\theta) = \prod_i p(\theta_i)][6]

where we have placed independent priors on each hyperparameter
![\theta_i][7]. Several elementwise priors are provided, including:

* normal priors (see `gaussian_prior`): ![p(\theta_i) = N(\mu, \sigma^2)][8]
* uniform priors (see `uniform_prior`): ![p(\theta_i) = U(\ell, u)][9]
* Laplace priors (see `laplace_prior`): ![p(\theta_i) = Laplace(\mu, b)][10]
* improper constant priors (see `constant_prior`): ![p(\theta_i) = 1][11]

Once a hyperparameter prior is specified, a special meta-inference
method, `inference_with_prior`, allows the user to incorporate the
prior into any GPML inference method. Aside from extra inputs
specifying the inference method and prior, the API of
`inference_with_prior` is identical to the standard GPML inference
method API, except that the negative log likelihood `nlZ`, its
gradient `dnlZ`, and, optionally, its Hessian `HnlZ` are replaced
with the equivalent expressions for the negative (unnormalized) log
posterior ![-\log p(y | X, \theta) - \log p(\theta)][12].
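
A minimal sketch of calling `inference_with_prior` directly; the
placement of the two extra inputs at the front of the argument list is
an assumption here (see `inference_with_prior.m` for the definitive
interface):

    % the extra leading arguments (assumed order) select the base
    % inference method and the hyperparameter prior
    [posterior, nlZ, dnlZ] = inference_with_prior(@infExact, prior, ...
        hyperparameters, mean_function, covariance_function, [], x, y);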

Here is a demonstration of incorporating a hyperparameter prior into a
GPML model:

    inference_method    = @infExact;
    mean_function       = {@meanConst};
    covariance_function = {@covSEiso};

    % initial hyperparameters
    offset       = 1;
    length_scale = 1;
    output_scale = 1;
    noise_std    = 0.05;

    hyperparameters.mean = offset;
    hyperparameters.cov  = log([length_scale; output_scale]);
    hyperparameters.lik  = log(noise_std);

    % add normal priors to each hyperparameter
    priors.mean = {get_prior(@gaussian_prior, 0, 1)};
    priors.cov  = {get_prior(@gaussian_prior, 0, 1), ...
                   get_prior(@gaussian_prior, 0, 1)};
    priors.lik  = {get_prior(@gaussian_prior, log(0.01), 1)};

    % add prior to inference method
    prior = get_prior(@independent_prior, priors);
    inference_method = add_prior_to_inference_method(inference_method, prior);

    % find MAP hyperparameters
    map_hyperparameters = minimize(hyperparameters, @gp, 50, inference_method, ...
            mean_function, covariance_function, [], x, y);
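
The MAP hyperparameters can then be used as usual; for example, a
sketch of standard GPML prediction at hypothetical test inputs
`x_star`:

    % predictions with the MAP hyperparameters; the wrapped inference
    % method still returns a standard GPML posterior
    [y_mean, y_var] = gp(map_hyperparameters, inference_method, ...
            mean_function, covariance_function, [], x, y, x_star);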

Hessians
--------

We establish a simple extension to the GPML API for mean and
covariance functions, allowing us to compute their Hessians with
respect to their hyperparameters.

To compute the second (mixed) partial derivative of ![\mu(x)][13] with
respect to the pair ![(\theta_i, \theta_j)][14],

![\partial^2 / \partial \theta_i \partial \theta_j \mu(x)][15]

the interface is:

    mu = mean_function(hyperparameters, x, i, j)

which differs from the interface for computing gradients only by the
additional input argument `j`. Several mean function implementations
compliant with this extended interface are provided:

* `zero_mean`: a drop-in replacement for `meanZero`
* `constant_mean`: a drop-in replacement for `meanConst`
* `linear_mean`: a drop-in replacement for `meanLinear`
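
For example, a sketch of requesting a second derivative from
`constant_mean`, which is zero here because a constant mean is linear
in its hyperparameter:

    hyp = 1;            % the constant offset hyperparameter
    x   = randn(5, 2);  % five 2-dimensional example inputs

    % second partial derivative with respect to (theta_1, theta_1)
    d2m = constant_mean(hyp, x, 1, 1);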

Similarly, to compute the second (mixed) partial derivative of
![K(x, z)][16] with respect to the pair ![(\theta_i, \theta_j)][14],

![\partial^2 / \partial \theta_i \partial \theta_j K(x, z)][17]

the interface is:

    K = covariance_function(hyperparameters, x, z, i, j)

Several covariance function implementations compliant with this
extended interface are provided:

* `isotropic_sqdexp_covariance`: a drop-in replacement for `covSEiso`
* `ard_sqdexp_covariance`: a drop-in replacement for `covSEard`
* `factor_sqdexp_covariance`: an implementation of a squared
  exponential "factor" covariance, where an isotropic squared
  exponential is applied to data after a linear map to a
  lower-dimensional space.
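
A corresponding sketch for `isotropic_sqdexp_covariance`, computing
the mixed partial with respect to the log length scale and log output
scale; passing `[]` for `z` follows the usual GPML convention for
training-set covariances:

    hyp = log([1; 1]);  % [log(length scale); log(output scale)]
    x   = randn(5, 2);

    % mixed second partial of K(x, x) with respect to (theta_1, theta_2)
    d2K = isotropic_sqdexp_covariance(hyp, x, [], 1, 2);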

These Hessians can ultimately be used to compute the Hessian of the
log likelihood with respect to the hyperparameters. In particular, we
provide:

* `exact_inference`: a drop-in replacement for `infExact`
* `laplace_inference`: a drop-in replacement for `infLaplace`

Both support the extended inference method API

    [posterior, nlZ, dnlZ, HnlZ] = ...
        inference_method(hyperparameters, mean_function, ...
                         covariance_function, likelihood, x, y);

The last output, `HnlZ`, is a struct describing the Hessian of the
negative log likelihood with respect to ![\theta][2], including with
respect to "off-block-diagonal" terms such as mean/covariance,
mean/likelihood, and covariance/likelihood hyperparameter pairs. See
`hessians.m` for a description of this struct.
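
As a sanity check, entries of `HnlZ` can be compared against finite
differences of the gradient; a sketch, assuming the model and data
from the demonstration above and the standard GPML likelihood
`@likGauss`:

    % Hessian from the extended inference API
    [~, ~, dnlZ, HnlZ] = exact_inference(hyperparameters, mean_function, ...
        covariance_function, @likGauss, x, y);

    % finite-difference approximation to one covariance/covariance entry
    delta = 1e-6;
    perturbed = hyperparameters;
    perturbed.cov(1) = perturbed.cov(1) + delta;
    [~, ~, dnlZ_perturbed] = exact_inference(perturbed, mean_function, ...
        covariance_function, @likGauss, x, y);

    % compare this to the corresponding entry of HnlZ
    % (see hessians.m for the layout of the HnlZ struct)
    approximation = (dnlZ_perturbed.cov(1) - dnlZ.cov(1)) / delta;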

Other
-----

A number of additional files are included, providing further
functionality. These include:

* New mean functions:
    * `step_mean`: a simple "step" changepoint mean
    * `discrete_mean`/`fixed_discrete_mean`: free-form mean vectors
      for discrete data; `discrete_mean` treats the entries of this
      vector as hyperparameters, enabling learning.
* New covariance functions:
    * `discrete_covariance`/`fixed_discrete_covariance`: free-form
      covariance matrices for discrete data; `discrete_covariance`
      treats the entries of this matrix as hyperparameters, enabling
      learning. A log-Cholesky parameterization is used, allowing for
      unconstrained optimization.
    * `scaled_covariance`: a meta-covariance for modeling functions of
      the form ![a(x)f(x)][18], where ![p(f) = GP(\mu, K)][19] and
      ![a(x)][20] is a fixed function. ![a(x)][20] is specified as a
      GPML mean function, and ![K(x, x')][21] is specified as a GPML
      covariance function. `scaled_covariance` computes the gradient
      of the covariance with respect to both the parameters of
      ![a(x)][20] and ![K(x, x')][21].
* Rank-one updates of GPML posterior structs: `update_posterior`
  allows the user to update an existing GPML posterior struct (for
  regression with Gaussian observation noise) given a single new
  observation ![(x*, y*)][22]. This can significantly decrease the
  total time needed to perform sequential online GP regression.
* Computing likelihoods assuming datasets are from independent draws
  from a joint GP prior: see `gp_likelihood_independent` for more
  information.

[1]: http://latex.codecogs.com/svg.latex?p(%5Ctheta)
[2]: http://latex.codecogs.com/svg.latex?%5Ctheta
[3]: http://latex.codecogs.com/svg.latex?-%5Clog%20p(%5Ctheta)
[4]: http://latex.codecogs.com/svg.latex?%5Cnabla%20-%5Clog%20p(%5Ctheta)
[5]: http://latex.codecogs.com/svg.latex?H%5Cbigl(-%5Clog%20p(%5Ctheta)%20%5Cbigr)
[6]: http://latex.codecogs.com/svg.latex?p(%5Ctheta)%20%3D%20%5Cprod_i%20p(%5Ctheta_i)
[7]: http://latex.codecogs.com/svg.latex?%5Ctheta_i
[8]: http://latex.codecogs.com/svg.latex?p(%5Ctheta_i%20%5Cmid%20%5Cmu%2C%20%5Csigma%5E2)%20%3D%20%5Cmathcal%7BN%7D(%5Ctheta_i%3B%20%5Cmu%2C%20%5Csigma%5E2)%20
[9]: http://latex.codecogs.com/svg.latex?p(%5Ctheta_i%20%5Cmid%20%5Cell%2C%20u)%20%3D%20%5Cmathcal%7BU%7D(%5Cell%2C%20u)
[10]: http://latex.codecogs.com/svg.latex?p(%5Ctheta_i%20%5Cmid%20%5Cmu%2C%20b)%20%3D%20%5Ctext%7BLaplace%7D(%5Cmu%2C%20b)
[11]: http://latex.codecogs.com/svg.latex?p(%5Ctheta_i)%20%3D%201
[12]: http://latex.codecogs.com/svg.latex?-%5Clog%20p(y%20%5Cmid%20X%2C%20%5Ctheta)%20-%20%5Clog%20p(%5Ctheta)
[13]: http://latex.codecogs.com/svg.latex?%5Cmu(x)
[14]: http://latex.codecogs.com/svg.latex?(%5Ctheta_i%2C%20%5Ctheta_j)
[15]: http://latex.codecogs.com/svg.latex?%5Cfrac%7B%5Cpartial%5E2%7D%7B%5Cpartial%20%5Ctheta_i%20%5Cpartial%20%5Ctheta_j%7D%20%5Cmu(x)
[16]: http://latex.codecogs.com/svg.latex?K(x%2C%20z)
[17]: http://latex.codecogs.com/svg.latex?%5Cfrac%7B%5Cpartial%5E2%7D%7B%5Cpartial%20%5Ctheta_i%20%5Cpartial%20%5Ctheta_j%7D%20K(x%2C%20z)
[18]: http://latex.codecogs.com/svg.latex?a(x)f(x)
[19]: http://latex.codecogs.com/svg.latex?p(f)%20%3D%20%5Cmathcal%7BGP%7D(f%3B%20%5Cmu%2C%20K)
[20]: http://latex.codecogs.com/svg.latex?a(x)
[21]: http://latex.codecogs.com/svg.latex?K(x%2C%20x%27)
[22]: http://latex.codecogs.com/svg.latex?(x%5E%5Cast%2C%20y%5E%5Cast)
