pymor.algorithms.ml.vkoga.regressor

Module Contents

class pymor.algorithms.ml.vkoga.regressor.VKOGARegressor(kernel=GaussianKernel(), criterion='fp', max_centers=20, tol=1e-06, reg=1e-12)[source]

Bases: pymor.core.base.BasicObject

Scikit-learn-style regressor using the VKOGASurrogate.

The regressor uses pyMOR's weak_greedy algorithm in its fit method to select centers according to the given criterion. A usage sketch is given after the parameter list below.

The algorithm is described in [WH13] and [SH21].

Parameters:
  • kernel – Kernel to use in the regressor. The kernel is assumed to have scalar-valued output; for vector-valued targets, the interpolant uses vector-valued coefficients, i.e., the prediction is computed as a linear combination of kernel evaluations with vector-valued weights. The kernel needs to follow the scikit-learn kernel interface; in particular, a __call__ method for (vectorized) evaluation of the kernel and a diag method for computing the diagonal of the kernel matrix are required. For convenience, a Gaussian kernel is provided in pymor.algorithms.ml.vkoga.kernels.

  • criterion – Selection criterion for the greedy algorithm. Possible values are 'fp', 'f' and 'p'.

  • max_centers – Maximum number of selected centers in the greedy algorithm.

  • tol – Tolerance for the weak greedy algorithm.

  • reg – Regularization parameter for the kernel interpolation.
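A minimal usage sketch (the toy data and the assumed array shapes, (n_samples, dim) inputs and (n_samples, output_dim) targets, are illustrative assumptions; the import paths follow the module paths documented on this page):

    import numpy as np

    from pymor.algorithms.ml.vkoga.kernels import GaussianKernel
    from pymor.algorithms.ml.vkoga.regressor import VKOGARegressor

    # toy data: two-dimensional inputs, a single (scalar) target component
    rng = np.random.default_rng(0)
    X = rng.uniform(-1., 1., size=(200, 2))
    Y = np.sin(np.pi * X[:, 0:1]) * X[:, 1:2]

    # greedy center selection with the 'fp' criterion and at most 50 centers
    regressor = VKOGARegressor(kernel=GaussianKernel(), criterion='fp',
                               max_centers=50, tol=1e-6, reg=1e-12)
    regressor.fit(X, Y)            # returns the trained regressor
    Y_pred = regressor.predict(X)  # prediction obtained by the VKOGASurrogate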

Methods

  • fit – Fit VKOGA surrogate using pyMOR's weak greedy algorithm.

  • predict – Predict the target for the input X.

fit(X, Y)[source]

Fit VKOGA surrogate using pyMOR’s weak greedy algorithm.

Parameters:
  • X – Training inputs.

  • Y – Training targets.

Returns:

The trained regressor.

predict(X)[source]

Predict the target for the input X.

Parameters:

X – Input for which to compute the prediction.

Returns:

Prediction obtained by the VKOGASurrogate.

class pymor.algorithms.ml.vkoga.regressor.VKOGASurrogate(kernel, X_train, F_train, criterion='fp', reg=1e-12)[source]

Bases: pymor.algorithms.greedy.WeakGreedySurrogate

Surrogate for the weak greedy error used when training the VKOGARegressor.

Not intended to be used directly.

Methods

  • evaluate – Evaluate the surrogate for given parameters.

  • extend – Extends the kernel interpolant by adding a new center and updating all quantities.

  • predict

evaluate(mus, return_all_values=False)[source]

Evaluate the surrogate for given parameters.

Parameters:
  • mus – List of parameters for which to estimate the approximation error. When parallelization is used, mus can be a RemoteObject.

  • return_all_values – See below.

Returns:

  • If return_all_values is True, a NumPy array of the estimated errors.

  • If return_all_values is False, the maximum estimated error as first return value and the corresponding parameter as second return value (see the schematic sketch below).
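Although VKOGASurrogate is not intended for direct use, the following schematic sketch illustrates the two return modes and the interplay of evaluate and extend in a weak greedy loop. The direct construction of the surrogate, the toy data and the calling convention for mus (a plain NumPy array of training inputs is passed here) are assumptions made purely for illustration; in practice, VKOGARegressor.fit drives this loop via pyMOR's weak_greedy:

    import numpy as np

    from pymor.algorithms.ml.vkoga.kernels import GaussianKernel
    from pymor.algorithms.ml.vkoga.regressor import VKOGASurrogate

    # toy candidate set and targets (assumed shapes and calling conventions)
    X_train = np.linspace(-1., 1., 100).reshape(-1, 1)
    F_train = np.sin(np.pi * X_train)
    surrogate = VKOGASurrogate(GaussianKernel(), X_train, F_train, criterion='fp')

    # return_all_values=True: estimated error for every candidate parameter
    errors = surrogate.evaluate(X_train, return_all_values=True)

    # default: maximum estimated error and the parameter at which it is attained
    max_err, worst_mu = surrogate.evaluate(X_train)

    # a weak greedy loop repeatedly extends by the worst-approximated center
    for _ in range(20):
        max_err, worst_mu = surrogate.evaluate(X_train)
        if max_err <= 1e-6:
            break
        surrogate.extend(worst_mu)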

extend(mu)[source]

Extends the kernel interpolant by adding a new center and updating all quantities.

Incrementally add a new center \(\mu\) with the corresponding function value from the training data. This method updates

  • the selected centers (self._centers)

  • the indices of the selected centers in the training set (self._centers_idx)

  • the coefficients of the interpolant (self._coefficients)

  • the (squared) power function evaluations on the training set (self._power2)

  • the Newton basis evaluations on the training set (self._V)

  • the residual \(f - s_f\) at the training points (self.res)

  • the inverse of the Cholesky factor of the kernel matrix (self._C)

  • the coefficients in the Newton basis (self._z).

In the following derivation we leave out the regularization term for simplicity and restrict ourselves to scalar kernels. Let \(K_n\) denote the full kernel matrix for the current set of selected centers \(X_n\). Since \(K_n\) is a kernel matrix, it is in particular symmetric positive-definite, so it admits a Cholesky decomposition \(K_n=L_nL_n^\top\). The inverse of the Cholesky factor, \(C_n := L_{n}^{-1}\), is used to efficiently compute and update the coefficients of the kernel interpolant. The formulas for computing and updating \(C_n\) are given below.
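As a small plain-NumPy illustration of why \(C_n\) is useful (a Gaussian kernel with unit shape parameter is assumed purely for this example): since \(K_n^{-1} = C_n^\top C_n\), the interpolation system can be solved with two matrix-vector products involving the triangular factor.

    import numpy as np

    # three one-dimensional centers and the kernel matrix for k(x, y) = exp(-|x - y|^2)
    X = np.array([[0.0], [0.5], [1.0]])
    K = np.exp(-(X - X.T) ** 2)    # kernel matrix K_n
    L = np.linalg.cholesky(K)      # K_n = L_n L_n^T
    C = np.linalg.inv(L)           # C_n = L_n^{-1}

    f = np.array([0.0, 1.0, 0.0])  # target values at the centers
    c = C.T @ (C @ f)              # solves K_n c = f, since K_n^{-1} = C_n^T C_n
    assert np.allclose(K @ c, f)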

Updating the coefficients of the kernel interpolant: Let \(c_n\in\mathbb{R}^n\) denote the coefficients of the kernel interpolant for the first \(n\) selected centers (\(X_n\)), i.e.

\[K_n c_n = f_n,\]

where \(f_n\in\mathbb{R}^n\) corresponds to the vector of target values at the selected centers. The kernel interpolant is then given as

\[s_f^n(x) := \sum\limits_{i=1}^{n} (c_n)_i k(x, x_i),\]

where \(X_n=\{x_1,\dots,x_n\}\subset\mathbb{R}^d\) is the set containing the \(n\) selected centers. When adding \(\mu\) as a new center, i.e. \(X_{n+1}=X_n\cup \{\mu\}\), the new coefficient vector \(c_{n+1}\in\mathbb{R}^{n+1}\) is given as

\[\begin{split}c_{n+1} = C_{n+1}^\top \begin{bmatrix} \bar{z}_n \\ z_{n+1} \end{bmatrix},\end{split}\]

where \(\bar{z}_n=[z_1,\ldots,z_n]^\top\in\mathbb{R}^n\) collects the coefficients of the current interpolant in the Newton basis (see self._z below) and \(z_{n+1}\in\mathbb{R}\) is a new coefficient, which is computed via the residual \(r_n\) and the power function \(P_n\) for the centers \(X_n\) evaluated at \(\mu\) (the definitions of these quantities are given below):

\[\begin{align*} z_{n+1} = \frac{r_n(\mu)}{P_n(\mu)}. \end{align*}\]

The residual \(r_n=f-s_f^n\) is defined as the difference between the target function \(f\) and the current kernel interpolant \(s_f^n\) and can be updated via the following recursion:

\[\begin{align*} r_n(x) = r_{n-1}(x) - z_n V_n(x) \end{align*}\]

with initial residual \(r_0 = f\). Here, \(V_n\) denotes the \(n\)-th function of the Newton basis of

\[\begin{align*} V(X_n) := \mathrm{span}\{k(\,\cdot\,,x_1), \dots, k(\,\cdot\,,x_n)\}. \end{align*}\]

The first Newton basis function is given as

\[\begin{align*} V_{1}(x)= \frac{k(x, x_1)}{\sqrt{k(x_1, x_1)}}. \end{align*}\]

The subsequent Newton basis functions are computed via the Gram-Schmidt algorithm as

\[\begin{align*} V_{n+1}(x) = \frac{k(x, \mu) - \sum_{j=1}^n V_j(x) V_j(\mu)}{P_n(\mu)}. \end{align*}\]

The power function \(P_n\) measures, for \(x\in\mathbb{R}^d\), how well the current subspace can represent \(k(\,\cdot\,,x)\), i.e., the projection error in the reproducing kernel Hilbert space. The initial squared power function is given as

\[\begin{align*} P_0^2(x) = k(x, x) \end{align*}\]

and the incremental update of the squared power function \(P_n^2\) can be computed by

\[\begin{align*} P_n^2(x) = P_{n-1}^2(x) - V_n(x)^2. \end{align*}\]

We are going to track this quantity evaluated at all training points during the iterations of the greedy algorithm.

What remains is the computation of the inverse Cholesky factor \(C_n\): initially, it is given as

\[\begin{align*} C_1 = \frac{1}{V_1(\mu)} \end{align*}\]

and is updated via

\[\begin{split}C_{n+1} = \begin{bmatrix} C_n & 0 \\ -\frac{1}{V_{n+1}(\mu)}\bar{v}_{n}(\mu) C_n & \frac{1}{V_{n+1}(\mu)} \end{bmatrix},\end{split}\]

where \(\bar{v}_{n}(\mu) := [V_{1}(\mu),\ldots,V_{n}(\mu)]\in\mathbb{R}^{1\times n}\) denotes the row vector of Newton basis evaluations at \(\mu\).
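Putting the preceding update formulas together, a plain-NumPy sketch of the incremental extension step might look as follows. This is an illustrative toy reimplementation, not pyMOR's actual VKOGASurrogate: the class name, the attribute layout, the Gaussian kernel with unit shape parameter, the scalar targets, the omitted regularization and the simple power-function-based ('p'-type) selection rule are all assumptions made for this example.

    import numpy as np

    def gauss_kernel(A, B):
        """Gaussian kernel matrix k(a, b) = exp(-|a - b|^2) (unit shape parameter assumed)."""
        return np.exp(-((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1))

    class ToyVKOGAState:
        """Tracks the quantities listed above for scalar targets (no regularization)."""

        def __init__(self, X_train, f_train):
            self.X_train, self.f_train = X_train, f_train
            self.centers_idx = []                     # indices of the selected centers
            self.V = np.zeros((len(X_train), 0))      # Newton basis evaluations on the training set
            self.power2 = np.ones(len(X_train))       # P_0^2(x) = k(x, x) = 1 for this kernel
            self.res = f_train.copy()                 # residual r_0 = f
            self.C = np.zeros((0, 0))                 # inverse Cholesky factor C_n
            self.z = np.zeros(0)                      # coefficients in the Newton basis
            self.coefficients = np.zeros(0)           # kernel-basis coefficients c_n

        def extend(self, i):
            """Add training point i as a new center and update all tracked quantities."""
            P_mu = np.sqrt(self.power2[i])                                   # P_n(mu)
            k_col = gauss_kernel(self.X_train, self.X_train[i:i + 1])[:, 0]  # k(., mu) on the training set
            V_new = (k_col - self.V @ self.V[i]) / P_mu                      # Newton basis V_{n+1}
            z_new = self.res[i] / P_mu                                       # z_{n+1} = r_n(mu) / P_n(mu)
            self.res = self.res - z_new * V_new                              # r_{n+1} = r_n - z_{n+1} V_{n+1}
            self.power2 = self.power2 - V_new ** 2                           # P_{n+1}^2 = P_n^2 - V_{n+1}^2
            n = self.C.shape[0]
            self.C = np.block([[self.C, np.zeros((n, 1))],                   # inverse Cholesky update
                               [-(self.V[i] @ self.C / P_mu)[None, :], np.array([[1. / P_mu]])]])
            self.V = np.column_stack([self.V, V_new])
            self.z = np.append(self.z, z_new)
            self.coefficients = self.C.T @ self.z                            # c_{n+1} = C_{n+1}^T [z_1, ..., z_{n+1}]^T
            self.centers_idx.append(i)

    # toy training data (candidate centers) and target values
    rng = np.random.default_rng(0)
    X_train = rng.uniform(-1., 1., size=(50, 2))
    f_train = np.sin(np.pi * X_train[:, 0]) * X_train[:, 1]

    state = ToyVKOGAState(X_train, f_train)
    for _ in range(5):                              # 'p'-type selection for illustration:
        state.extend(int(np.argmax(state.power2)))  # add the point with the largest power function value

    # the interpolant now reproduces f at the selected centers
    s_f = gauss_kernel(X_train, X_train[state.centers_idx]) @ state.coefficients
    assert np.allclose(s_f[state.centers_idx], f_train[state.centers_idx])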

Parameters:

mu – Parameter (center) to add.

predict(X)[source]