Covariance Functions

This section covers covariance functions, a key concept in Gaussian processes. A covariance function, or kernel, takes two inputs and returns a measure of their similarity. In this library, the cov_func_curry argument of the model class accepts a covariance class whose __init__ method takes a single argument (the length scale) and which implements a function k(x, y) \(\rightarrow\) float. If the length scale is not supplied as an argument, it is calculated automatically. Alternatively, use the cov_func argument to pass an instance of a Covariance class.

Here is how you can pass a pre-existing covariance function class (which is the default behavior):

Pass a predefined covariance function class (Default behavior)
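from mellon.cov import Matern52

length_scale = 1.0  # example value, chosen here only for illustration
cov_func_curry = Matern52          # pass the class; the length scale is filled in later
cov_func = Matern52(length_scale)  # or pass an instance with a fixed length scale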

If you want to write a custom covariance function k(x, y) \(\rightarrow\) float, you can do so by inheriting from the Covariance base class. The Covariance base class’s __call__ method will call the function k.

Write a custom covariance function \(k(x, y) \rightarrow\) float and inherit from the Covariance base class
from mellon import distance
from mellon import Covariance  # The Covariance base class __call__ method calls k.
import jax.numpy as jnp

class Matern52(Covariance):
    def __init__(self, ls=1.0):
        super().__init__()
        self.ls = ls

    def k(self, x, y):
        r = distance(x, y) / self.ls  # distance is imported directly above
        similarity = (
            jnp.sqrt(5.0) * r + jnp.square(jnp.sqrt(5.0) * r) / 3 + 1
        ) * jnp.exp(-jnp.sqrt(5.0) * r)
        return similarity
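
To illustrate the dispatch pattern described above without depending on mellon, here is a minimal plain-Python sketch. ToyCovariance and ToyExponential are hypothetical names invented for this example and are not part of the library; the real Covariance base class does more than this (for instance, it works on arrays of points).

```python
import math


class ToyCovariance:
    """Hypothetical stand-in for a covariance base class (not mellon's)."""

    def __call__(self, x, y):
        # Calling the instance forwards to the subclass's k method.
        return self.k(x, y)

    def k(self, x, y):
        raise NotImplementedError("subclasses must implement k(x, y)")


class ToyExponential(ToyCovariance):
    """Toy kernel exp(-|x - y| / ls) on scalar inputs."""

    def __init__(self, ls=1.0):
        super().__init__()
        self.ls = ls

    def k(self, x, y):
        return math.exp(-abs(x - y) / self.ls)


kernel = ToyExponential(ls=2.0)
same_point = kernel(1.0, 1.0)  # identical inputs give similarity 1.0
far_apart = kernel(0.0, 10.0)  # distant inputs give a value near 0
```

Because only k is overridden, any behavior the base class attaches to __call__ is inherited by every custom kernel.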

The Covariance class also supports arithmetic operations such as addition, multiplication, and exponentiation with the +, *, and ** operators, respectively:

Combining two covariance functions.
from mellon.cov import Matern52, ExpQuad
cov_func = Matern52(length_scale)*.7 + ExpQuad(length_scale)*.3
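
One way such operator support can work is through Python's __add__, __mul__, and __pow__ hooks. The sketch below is an assumption about the general technique, not mellon's implementation; ToyKernel is a hypothetical class that wraps a plain function of two scalars.

```python
import math


class ToyKernel:
    """Hypothetical kernel wrapper supporting +, * and ** (not mellon's API)."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, x, y):
        return self.fn(x, y)

    @staticmethod
    def _as_fn(other):
        # Treat plain numbers as constant functions so that kernel * 0.7 works.
        if isinstance(other, ToyKernel):
            return other.fn
        return lambda x, y: other

    def __add__(self, other):
        g = self._as_fn(other)
        return ToyKernel(lambda x, y: self.fn(x, y) + g(x, y))

    def __mul__(self, other):
        g = self._as_fn(other)
        return ToyKernel(lambda x, y: self.fn(x, y) * g(x, y))

    def __pow__(self, exponent):
        return ToyKernel(lambda x, y: self.fn(x, y) ** exponent)


# Two scalar-input kernels with unit length scale.
exp_quad = ToyKernel(lambda x, y: math.exp(-((x - y) ** 2) / 2.0))
exponential = ToyKernel(lambda x, y: math.exp(-abs(x - y) / 2.0))

# Weighted sum with the same shape as the expression above.
mixed = exp_quad * 0.7 + exponential * 0.3
value = mixed(0.0, 1.0)
```

With weights that sum to one, the combined kernel still returns a similarity of 1 for identical inputs.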

Implemented Covariance Functions

class mellon.cov.ExpQuad(ls=1.0, active_dims=None)

Bases: Covariance

Exponentiated Quadratic kernel, also known as the squared exponential or the Gaussian kernel.

The kernel is defined as:

\[e^{-\frac{||x-y||^2}{2 l^2}}\]

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
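
The effect of the two parameters can be seen in a small plain-Python sketch of the formula above. The function exp_quad here is a hypothetical re-implementation written for illustration, not the library class, and it only handles active_dims given as a list of indices.

```python
import math


def exp_quad(x, y, ls=1.0, active_dims=None):
    """Evaluate exp(-||x - y||^2 / (2 ls^2)) over the selected dimensions."""
    if active_dims is not None:
        # Keep only the coordinates listed in active_dims.
        x = [x[i] for i in active_dims]
        y = [y[i] for i in active_dims]
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * ls ** 2))


x = [0.0, 0.0]
y = [1.0, 3.0]

narrow = exp_quad(x, y, ls=0.5)  # small ls: similarity decays quickly
wide = exp_quad(x, y, ls=5.0)    # large ls: similarity decays slowly
first_dim_only = exp_quad(x, y, active_dims=[0])  # ignores the second coordinate
```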

k(x, y)

Compute the Exponentiated Quadratic kernel function between inputs x and y.

The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

Generate a function to compute the gradient of the Exponentiated Quadratic kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Exponentiated Quadratic kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Exponentiated Quadratic kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Exponentiated Quadratic kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable
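
The "function that returns a function" shape of k_grad can be sketched in plain Python using the closed-form gradient of the Exponentiated Quadratic formula. The names exp_quad and exp_quad_grad are hypothetical and written for this example; how the library itself computes the gradient is not stated here.

```python
import math


def exp_quad(x, y, ls=1.0):
    """Evaluate exp(-||x - y||^2 / (2 ls^2)) for two equal-length sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * ls ** 2))


def exp_quad_grad(x, ls=1.0):
    """Return a function y -> gradient of exp_quad(x, y) with respect to y."""

    def grad(y):
        k = exp_quad(x, y, ls)
        # d/dy exp(-||x - y||^2 / (2 ls^2)) = k * (x - y) / ls^2
        return [k * (a - b) / ls ** 2 for a, b in zip(x, y)]

    return grad


x = [0.0, 1.0]
y = [0.5, -0.5]

grad_at_x = exp_quad_grad(x, ls=2.0)  # x is fixed once
gradient = grad_at_x(y)               # then evaluated at any y
```

Fixing x first lets the same callable be reused for many values of y.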

class mellon.cov.Exponential(ls=1.0, active_dims=None)

Bases: Covariance

Exponential kernel.

The kernel is defined as:

\[e^{-\frac{||x-y||}{2l}}\]

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
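
As a quick numerical reading of the formula above, the following plain-Python sketch evaluates it directly. The function name exponential is hypothetical and stands in for the library class.

```python
import math


def exponential(x, y, ls=1.0):
    """Evaluate exp(-||x - y|| / (2 ls)) for two equal-length sequences."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-dist / (2.0 * ls))


x = [0.0, 0.0]
y = [3.0, 4.0]  # Euclidean distance 5

at_default = exponential(x, y)       # exp(-5 / 2)
at_wide = exponential(x, y, ls=2.5)  # exp(-5 / 5) = exp(-1)
```

Under this definition the similarity drops by a factor of e each time the distance grows by 2l.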

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.

k(x, y)

Compute the Exponential kernel function between inputs x and y.

The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

Generate a function to compute the gradient of the Exponential kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Exponential kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Exponential kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Exponential kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable

class mellon.cov.Linear(ls=1.0, active_dims=None)

Bases: Covariance

Implementation of the Linear kernel.

The Linear kernel function is defined as:

\[\frac{x \cdot y}{l}\]

where x and y are input vectors and l is the length-scale.

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.

k(x, y)

Compute the Linear kernel between inputs x and y.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

Generate a function to compute the gradient of the Linear kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Linear kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Linear kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Linear kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable

class mellon.cov.Matern32(ls=1.0, active_dims=None)

Bases: Covariance

Implementation of the Matern-3/2 kernel function, a member of the Matern family of kernels.

The Matern-3/2 kernel function is defined as:

\[(1 + \frac{\sqrt{3}||x-y||}{l}) \cdot e^{-\frac{\sqrt{3}||x-y||}{l}}\]

where x and y are input vectors and l is the length-scale.

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
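
The formula above can be checked numerically with a short plain-Python sketch. The function matern32 is a hypothetical re-implementation for illustration, not the library class.

```python
import math


def matern32(x, y, ls=1.0):
    """Evaluate (1 + sqrt(3) r / ls) * exp(-sqrt(3) r / ls), with r = ||x - y||."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    scaled = math.sqrt(3.0) * r / ls
    return (1.0 + scaled) * math.exp(-scaled)


origin = [0.0, 0.0]
at_zero = matern32(origin, origin)                # identical inputs
at_one_ls = matern32(origin, [2.0, 0.0], ls=2.0)  # distance equal to ls
```

Because the formula depends on x and y only through the ratio of their distance to l, scaling both the distance and the length scale by the same factor leaves the value unchanged.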

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.

k(x, y)

Compute the Matern-3/2 kernel function between inputs x and y.

The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

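Generate a function to compute the gradient of the Matern-3/2 kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Matern-3/2 kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Matern-3/2 kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Matern-3/2 kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable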

class mellon.cov.Matern52(ls=1.0, active_dims=None)

Bases: Covariance

Implementation of the Matern-5/2 kernel function, a member of the Matern family of kernels.

The Matern-5/2 kernel function is defined as:

\[(1 + \frac{\sqrt{5}||x-y||}{l} + \frac{5||x-y||^2}{3l^2}) \cdot e^{-\frac{\sqrt{5}||x-y||}{l}}\]

where x and y are input vectors and l is the length-scale.

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.

k(x, y)

Compute the Matern-5/2 kernel function between inputs x and y.

The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

Generate a function to compute the gradient of the Matern-5/2 kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Matern-5/2 kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Matern-5/2 kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Matern-5/2 kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable
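
For the Matern-5/2 formula the gradient with respect to y has a simple closed form, which the plain-Python sketch below derives and checks against central finite differences. All names here (matern52, matern52_grad, finite_difference) are hypothetical helpers for this example; the library may compute the gradient differently.

```python
import math

SQRT5 = math.sqrt(5.0)


def matern52(x, y, ls=1.0):
    """Evaluate the Matern-5/2 formula for two equal-length sequences."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    s = SQRT5 * r / ls
    return (1.0 + s + s ** 2 / 3.0) * math.exp(-s)


def matern52_grad(x, ls=1.0):
    """Return a function y -> gradient of matern52(x, y) with respect to y."""

    def grad(y):
        r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
        s = SQRT5 * r / ls
        # Differentiating the formula with respect to y gives
        # (5 / (3 ls^2)) * (1 + s) * exp(-s) * (x - y).
        coeff = 5.0 / (3.0 * ls ** 2) * (1.0 + s) * math.exp(-s)
        return [coeff * (a - b) for a, b in zip(x, y)]

    return grad


def finite_difference(f, y, eps=1e-6):
    """Central-difference gradient of f at y, used only as a sanity check."""
    out = []
    for i in range(len(y)):
        up, down = list(y), list(y)
        up[i] += eps
        down[i] -= eps
        out.append((f(up) - f(down)) / (2.0 * eps))
    return out


x = [1.0, 0.0]
y = [0.0, 2.0]
analytic = matern52_grad(x, ls=1.5)(y)
numeric = finite_difference(lambda v: matern52(x, v, ls=1.5), y)
```

The closed form has no division by the distance, so it is well defined at y = x, where it evaluates to the zero vector.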

class mellon.cov.RatQuad(alpha=1.0, ls=1.0, active_dims=None)

Bases: Covariance

Rational Quadratic kernel.

The kernel is defined as:

\[(1 + \frac{||x-y||^2}{2 \alpha l^2})^{-\alpha}\]

This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
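
The role of alpha can be illustrated with a plain-Python sketch. It assumes the conventional rational quadratic form, in which the exponent is -alpha; rat_quad and exp_quad are hypothetical functions written for this example, and the inputs use the default length scale of 1.

```python
import math


def rat_quad(x, y, alpha=1.0, ls=1.0):
    """Rational quadratic kernel, assuming the conventional exponent -alpha."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return (1.0 + sq_dist / (2.0 * alpha * ls ** 2)) ** (-alpha)


def exp_quad(x, y, ls=1.0):
    """Exponentiated quadratic kernel, used here only for comparison."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * ls ** 2))


x, y = [0.0], [1.5]
heavy_tail = rat_quad(x, y, alpha=0.5)     # small alpha: slower decay
near_gaussian = rat_quad(x, y, alpha=1e6)  # large alpha approaches exp_quad
gaussian = exp_quad(x, y)
```

Under that assumption, small alpha gives heavier tails than the Exponentiated Quadratic kernel, and the two coincide in the limit of large alpha.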

Parameters:
  • ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.

  • alpha (float, optional) – The alpha parameter of the Rational Quadratic kernel. Default is 1.0.

  • active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.

k(x, y)

Compute the Rational Quadratic kernel function between inputs x and y.

The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.

Parameters:
  • x (array-like) – First input array.

  • y (array-like) – Second input array.

Returns:

similarity – The computed kernel function.

Return type:

float

k_grad(x)

Generate a function to compute the gradient of the Rational Quadratic kernel function.

This method returns a callable that, when given an array y, computes the gradient of the Rational Quadratic kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.

Parameters:

x (array-like) – The fixed input array used as the first argument in the Rational Quadratic kernel. Its shape should be compatible with the active dimensions of the kernel.

Returns:

A function that takes an array y as input and returns the gradient of the Rational Quadratic kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.

Return type:

Callable