Covariance Functions
In this section, we explore covariance functions, a key concept in Gaussian processes. A covariance function, or kernel, takes two inputs and returns a measure of the similarity between them. In this library, the cov_func_curry argument of the model class accepts a covariance class whose __init__ method takes a single argument (the length scale) and which implements a function k(x, y) \(\rightarrow\) float. If the length scale is not supplied as an argument, it is calculated automatically. Alternatively, you can use the cov_func argument to pass an instance of a Covariance class.
Here is how you can pass a pre-existing covariance function class (which is the default behavior):
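from mellon.cov import Matern52
length_scale = 1.0  # example value, chosen only for illustration
# Option 1: pass the class itself and let the model supply the length scale.
cov_func_curry = Matern52
# Option 2: pass an instance whose length scale is already set.
cov_func = Matern52(length_scale)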
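The difference between the two arguments can be pictured with a small plain-Python sketch. `ToyKernel`, `build_from_curry`, and the `fallback_ls` value are hypothetical stand-ins and not part of mellon; the sketch only illustrates that a curry is a class still waiting for its length scale, while an instance is ready to call.

```python
import math


class ToyKernel:
    """Hypothetical stand-in for a covariance class (not mellon's code)."""

    def __init__(self, ls=1.0):
        self.ls = ls

    def __call__(self, x, y):
        # Squared-exponential similarity between two scalars.
        return math.exp(-((x - y) ** 2) / (2 * self.ls**2))


def build_from_curry(cov_func_curry, ls=None, fallback_ls=2.0):
    """Instantiate the class, filling in a length scale when none is given.

    fallback_ls is an arbitrary placeholder for whatever automatic
    estimate a real model would compute.
    """
    return cov_func_curry(fallback_ls if ls is None else ls)


# The class is passed around uninstantiated; the length scale is filled in later.
cov_from_curry = build_from_curry(ToyKernel)
# The instance already carries a user-chosen length scale.
cov_instance = ToyKernel(0.5)

value_curry = cov_from_curry(0.0, 1.0)
value_instance = cov_instance(0.0, 1.0)
```

Both objects end up callable on a pair of inputs; they differ only in who chose the length scale.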
If you want to write a custom covariance function k(x, y) \(\rightarrow\) float, you can do so by inheriting from the Covariance base class. The Covariance base class’s __call__ method will call the function k.
from mellon import distance
from mellon import Covariance  # The Covariance base class __call__ method calls k.
import jax.numpy as jnp

class Matern52(Covariance):
    def __init__(self, ls=1.0):
        super().__init__()
        self.ls = ls

    def k(self, x, y):
        # Distance between the inputs, scaled by the length scale.
        r = distance(x, y) / self.ls
        similarity = (
            jnp.sqrt(5.0) * r + jnp.square(jnp.sqrt(5.0) * r) / 3 + 1
        ) * jnp.exp(-jnp.sqrt(5.0) * r)
        return similarity
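The delegation from __call__ to k can be illustrated without the library. The following is a minimal sketch in plain Python: `SketchCovariance`, `SketchMatern52`, and `euclidean` are hypothetical names, and the real base class does more than this (for example, handling arrays of points).

```python
import math


class SketchCovariance:
    """Hypothetical base class: __call__ simply forwards to k."""

    def __call__(self, x, y):
        return self.k(x, y)

    def k(self, x, y):
        raise NotImplementedError("subclasses must implement k(x, y)")


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


class SketchMatern52(SketchCovariance):
    def __init__(self, ls=1.0):
        super().__init__()
        self.ls = ls

    def k(self, x, y):
        # Same Matern-5/2 expression as above, for a single pair of points.
        s = math.sqrt(5.0) * euclidean(x, y) / self.ls
        return (1.0 + s + s**2 / 3.0) * math.exp(-s)


kernel = SketchMatern52(ls=2.0)
same_point = kernel([0.0, 0.0], [0.0, 0.0])   # identical inputs
far_apart = kernel([0.0, 0.0], [3.0, 4.0])    # inputs at distance 5
```

Calling the instance and calling its k method give the same value, which is why a subclass only needs to define k.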
The Covariance class also supports arithmetic operations such as addition, multiplication, and exponentiation with the +, *, and ** operators, respectively:
from mellon.cov import Matern52, ExpQuad
cov_func = Matern52(length_scale) * 0.7 + ExpQuad(length_scale) * 0.3
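One way such arithmetic could be supported is through Python's operator overloading. The sketch below is an assumption about the general technique, not mellon's implementation; the `Kernel` wrapper and the two scalar kernels are hypothetical.

```python
import math


class Kernel:
    """Hypothetical minimal kernel wrapper supporting +, * and **."""

    def __init__(self, func):
        self.func = func

    def __call__(self, x, y):
        return self.func(x, y)

    @staticmethod
    def _as_func(other):
        # Plain numbers are treated as constant kernels.
        return other if callable(other) else (lambda x, y: other)

    def __add__(self, other):
        g = self._as_func(other)
        return Kernel(lambda x, y: self(x, y) + g(x, y))

    def __mul__(self, other):
        g = self._as_func(other)
        return Kernel(lambda x, y: self(x, y) * g(x, y))

    def __pow__(self, exponent):
        return Kernel(lambda x, y: self(x, y) ** exponent)


exp_quad = Kernel(lambda x, y: math.exp(-((x - y) ** 2) / 2.0))
exponential = Kernel(lambda x, y: math.exp(-abs(x - y) / 2.0))

# Weighted mixture, mirroring the 0.7 / 0.3 combination above.
mixture = exp_quad * 0.7 + exponential * 0.3
mixture_at_same_point = mixture(1.0, 1.0)
squared = (exp_quad ** 2)(0.0, 1.0)
```

Because the weights sum to one, the mixture still returns a similarity of one for identical inputs.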
Implemented Covariance Functions
- class mellon.cov.ExpQuad(ls=1.0, active_dims=None)
Bases: Covariance
Exponentiated Quadratic kernel, also known as the squared exponential or the Gaussian kernel.
The kernel is defined as:
\[e^{-\frac{||x-y||^2}{2 l^2}}\]
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Exponentiated Quadratic kernel function between inputs x and y.
The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
- k_grad(x)
Generate a function to compute the gradient of the Exponentiated Quadratic kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Exponentiated Quadratic kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Exponentiated Quadratic kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
A function that takes an array y as input and returns the gradient of the Exponentiated Quadratic kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable
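The effect of active_dims on the Exponentiated Quadratic kernel can be sketched in plain Python. This is an illustration of the formula above, not the library's code: the function name `exp_quad` is hypothetical, and the sketch only accepts a list of indices for active_dims, whereas the documented parameter also allows a slice or scalar.

```python
import math


def exp_quad(x, y, ls=1.0, active_dims=None):
    """Sketch of exp(-||x - y||^2 / (2 ls^2)) over the selected dimensions."""
    dims = range(len(x)) if active_dims is None else active_dims
    sq_dist = sum((x[d] - y[d]) ** 2 for d in dims)
    return math.exp(-sq_dist / (2.0 * ls**2))


x = [0.0, 0.0, 5.0]
y = [1.0, 1.0, -5.0]

# All three coordinates contribute; the large gap in the third dominates.
all_dims = exp_quad(x, y)
# Only the first two coordinates contribute; the third is ignored.
first_two = exp_quad(x, y, active_dims=[0, 1])
```

Restricting the kernel to the first two dimensions makes the two points look far more similar, because the coordinate in which they differ most is no longer measured.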
- class mellon.cov.Exponential(ls=1.0, active_dims=None)
Bases: Covariance
Exponential kernel.
The kernel is defined as:
\[e^{-\frac{||x-y||}{2l}}\]
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Exponential kernel function between inputs x and y.
The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
- k_grad(x)
Generate a function to compute the gradient of the Exponential kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Exponential kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Exponential kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
A function that takes an array y as input and returns the gradient of the Exponential kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable
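The pattern of k_grad, fixing x and returning a function of y, can be sketched for the Exponential kernel as follows. This is a hand-derived illustration in plain Python under the formula stated above; the function names are hypothetical, and it does not show how the library computes gradients. The analytic gradient is checked against central finite differences.

```python
import math


def _dist(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def exponential(x, y, ls=1.0):
    """Sketch of exp(-||x - y|| / (2 ls))."""
    return math.exp(-_dist(x, y) / (2.0 * ls))


def exponential_grad(x, ls=1.0):
    """Return a function of y giving the gradient of the kernel w.r.t. y."""

    def grad(y):
        r = _dist(x, y)
        k = math.exp(-r / (2.0 * ls))
        # d/dy exp(-r / (2 ls)) = k * (x - y) / (2 ls r); undefined at r = 0.
        return [k * (a - b) / (2.0 * ls * r) for a, b in zip(x, y)]

    return grad


x = [0.0, 0.0]
y = [3.0, 4.0]
analytic = exponential_grad(x, ls=2.0)(y)

# Central finite differences as an independent check of the analytic gradient.
h = 1e-6
numeric = []
for i in range(len(y)):
    y_plus = [v + h if j == i else v for j, v in enumerate(y)]
    y_minus = [v - h if j == i else v for j, v in enumerate(y)]
    numeric.append(
        (exponential(x, y_plus, ls=2.0) - exponential(x, y_minus, ls=2.0)) / (2 * h)
    )
```

Both gradient components are negative here: moving y further from x in either coordinate lowers the similarity.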
- class mellon.cov.Linear(ls=1.0, active_dims=None)
Bases: Covariance
Implementation of the Linear kernel.
The Linear kernel function is defined as:
\[\frac{x \cdot y}{l^2}\]
where x and y are input vectors and l is the length-scale.
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Linear kernel between inputs x and y.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
- k_grad(x)
Generate a function to compute the gradient of the Linear kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Linear kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Linear kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
A function that takes an array y as input and returns the gradient of the Linear kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable
- class mellon.cov.Matern32(ls=1.0, active_dims=None)
Bases: Covariance
Implementation of the Matern-3/2 kernel function, a member of the Matern family of kernels.
The Matern-3/2 kernel function is defined as:
\[(1 + \frac{\sqrt{3}||x-y||}{l}) \cdot e^{-\frac{\sqrt{3}||x-y||}{l}}\]
where x and y are input vectors and l is the length-scale.
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Matern-3/2 kernel function between inputs x and y.
The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
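- k_grad(x)
Generate a function to compute the gradient of the Matern-3/2 kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Matern-3/2 kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Matern-3/2 kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
k_grad – A function that takes an array y as input and returns the gradient of the Matern-3/2 kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable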
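Evaluating a kernel on every pair of points yields the covariance (Gram) matrix used in a Gaussian process. The sketch below does this for the Matern-3/2 formula in plain Python; the function name `matern32`, the example points, and the length scale are hypothetical choices for illustration, not the library's code.

```python
import math


def matern32(x, y, ls=1.0):
    """Sketch of (1 + sqrt(3) r / ls) * exp(-sqrt(3) r / ls), r = ||x - y||."""
    r = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    s = math.sqrt(3.0) * r / ls
    return (1.0 + s) * math.exp(-s)


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]

# Pairwise evaluation gives the covariance (Gram) matrix of the points.
gram = [[matern32(p, q, ls=1.5) for q in points] for p in points]
```

The resulting matrix is symmetric with ones on the diagonal, and entries shrink as the corresponding points move further apart.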
- class mellon.cov.Matern52(ls=1.0, active_dims=None)
Bases: Covariance
Implementation of the Matern-5/2 kernel function, a member of the Matern family of kernels.
The Matern-5/2 kernel function is defined as:
\[(1 + \frac{\sqrt{5}||x-y||}{l} + \frac{5||x-y||^2}{3l^2}) \cdot e^{-\frac{\sqrt{5}||x-y||}{l}}\]
where x and y are input vectors and l is the length-scale.
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Matern-5/2 kernel function between inputs x and y.
The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
- k_grad(x)
Generate a function to compute the gradient of the Matern-5/2 kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Matern-5/2 kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Matern-5/2 kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
A function that takes an array y as input and returns the gradient of the Matern-5/2 kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable
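The statement that larger length scales give wider kernels can be checked numerically from the Matern-5/2 formula. The following plain-Python sketch evaluates the formula at a fixed distance for three length scales; the function name `matern52_of_distance` and the chosen values are hypothetical.

```python
import math


def matern52_of_distance(r, ls=1.0):
    """Sketch of the Matern-5/2 formula as a function of distance r."""
    s = math.sqrt(5.0) * r / ls
    return (1.0 + s + s**2 / 3.0) * math.exp(-s)


distance = 1.0
narrow = matern52_of_distance(distance, ls=0.5)   # short length scale
default = matern52_of_distance(distance, ls=1.0)  # default length scale
wide = matern52_of_distance(distance, ls=4.0)     # long length scale
```

At the same distance, the similarity grows with the length scale, so points that look unrelated under a short length scale can still be strongly correlated under a long one.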
- class mellon.cov.RatQuad(alpha=1.0, ls=1.0, active_dims=None)
Bases: Covariance
Rational Quadratic kernel.
The kernel is defined as:
\[(1 + \frac{||x-y||^2}{2 \alpha l^2})^{-\alpha}\]
This class can be used as a function curry, meaning it can be called like a function on two inputs x and y.
- Parameters:
alpha (float, optional) – The alpha parameter of the Rational Quadratic kernel. Default is 1.0.
ls (float, optional) – The length-scale parameter, which controls the width of the kernel. Larger values result in wider kernels, and smaller values in narrower kernels. Default is 1.0.
active_dims (array-like, slice or scalar, optional) – The indices of the active dimensions. If specified, the kernel function will only be computed over these dimensions. Default is None, which means all dimensions are active.
- k(x, y)
Compute the Rational Quadratic kernel function between inputs x and y.
The kernel function is computed over the active dimensions, specified by the active_dims parameter during initialization.
- Parameters:
x (array-like) – First input array.
y (array-like) – Second input array.
- Returns:
similarity – The computed kernel function.
- Return type:
float
- k_grad(x)
Generate a function to compute the gradient of the Rational Quadratic kernel function.
This method returns a callable that, when given an array y, computes the gradient of the Rational Quadratic kernel function with respect to y, considering x as the fixed input. The computation is restricted to the active dimensions specified in the covariance function instance.
- Parameters:
x (array-like) – The fixed input array used as the first argument in the Rational Quadratic kernel. Its shape should be compatible with the active dimensions of the kernel.
- Returns:
A function that takes an array y as input and returns the gradient of the Rational Quadratic kernel function with respect to y, evaluated at the pair (x, y). The gradient is computed only over the active dimensions.
- Return type:
Callable
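The role of alpha can be illustrated by comparing the Rational Quadratic kernel with the Exponentiated Quadratic kernel. The sketch below assumes the standard Rational Quadratic form, with exponent -alpha, and uses hypothetical function names; it is a plain-Python illustration rather than the library's code. Under that assumption, the two kernels agree more and more closely as alpha grows.

```python
import math


def rat_quad_of_distance(r, alpha=1.0, ls=1.0):
    """Standard rational quadratic form, assuming the exponent is -alpha."""
    return (1.0 + r**2 / (2.0 * alpha * ls**2)) ** (-alpha)


def exp_quad_of_distance(r, ls=1.0):
    """Exponentiated quadratic form exp(-r^2 / (2 ls^2))."""
    return math.exp(-(r**2) / (2.0 * ls**2))


r = 1.5
target = exp_quad_of_distance(r)

# Absolute gap to the exponentiated quadratic value for increasing alpha.
gaps = [
    abs(rat_quad_of_distance(r, alpha=a) - target)
    for a in (1.0, 10.0, 1000.0)
]
```

Small values of alpha give heavier tails than the Exponentiated Quadratic kernel, so distant points keep more of their similarity.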