Run this tutorial

Click here to run this tutorial on binder: try on binder
Please note that starting the notebook server may take a couple of minutes.

Tutorial: VectorArray Basics

What are VectorArrays and what are they used for in pyMOR?

While pyMOR uses NumPy arrays to represent extrinsic quantities like inputs or outputs of a Model, the Model’s internal state is always represented by VectorArrays, ordered collections of abstract vectors of the same dimension. The reason for this is that when pyMOR is used in conjunction with an external solver that implements the FOM, state space vectors must be compatible with the solver’s internal data structures, for instance when some Operator appearing in the FOM shall be applied to the vector. Instead of constantly converting between NumPy arrays and the solver’s data structures – which might even be impossible in some cases – pyMOR uses VectorArrays to access these vectors through a unified interface. So as soon as you get in touch with a model’s state-space data, e.g., by calling its solve method, you will work with VectorArrays. Note that this also the case for ROMs. pyMOR does not distinguish between FOMs and ROMs, so even though no external solver is involved, the ROM state is still represented by VectorArrays.

VectorSpaces / How to create VectorArrays

VectorArrays are never instantiated directly but are always created by the VectorSpace they belong to. You will find VectorSpaces as the source and range space of Operators or as the solution_space of Models. In the following, we will work with NumpyVectorArrays that internally store the vectors as two-dimensional NumPy arrays. The corresponding VectorSpace is called NumpyVectorSpace. To make a NumpyVectorSpace, we need to specify the dimension of the contained vectors:

from pymor.basic import NumpyVectorSpace
import numpy as np

space = NumpyVectorSpace(7)

The dimension can be accessed via the dim attribute:

space.dim
7

Our first VectorArray will only contain three zero vectors:

U = space.zeros(3)

To check that we indeed have created three zero vectors of dimension 7, we can do the following:

print(len(U))
print(U.dim)
print(U.norm())
3
7
[0. 0. 0.]

Just zero vectors are boring. To create vectors with different entries, we can use the following methods:

print(space.ones(1))
print(space.full(42, count=4))
print(space.random(2))
[[1. 1. 1. 1. 1. 1. 1.]]
[[42 42 42 42 42 42 42]
 [42 42 42 42 42 42 42]
 [42 42 42 42 42 42 42]
 [42 42 42 42 42 42 42]]
[[0.77395605 0.43887844 0.85859792 0.69736803 0.09417735 0.97562235
  0.7611397 ]
 [0.78606431 0.12811363 0.45038594 0.37079802 0.92676499 0.64386512
  0.82276161]]

Sometimes, for instance when accumulating vectors in a loop, we want to initialize a VectorArray with no vectors in it. We can use the empty method for that:

U.empty()
NumpyVectorArray(NumpyVectorSpace(7), [], _len=0)

You might wonder how to create VectorArrays with more interesting data. When implementing Operators or, for instance, the solve method, you will get in touch with data from the actual solver backend you are using, that needs to be wrapped into a corresponding VectorArray. For that, every VectorSpace has a make_array method that takes care of the wrapping for you. In case of NumpyVectorSpace, the backend is NumPy and the data is given as NumPy arrays:

space.make_array(np.arange(0, 14).reshape((2, 7)))
NumpyVectorArray(
    NumpyVectorSpace(7),
    [[ 0  1  2  3  4  5  6]
     [ 7  8  9 10 11 12 13]],
    _len=2)

Converting NumPy arrays to VectorArrays

Some but not all VectorArrays can be initialized from NumPy arrays. For these arrays, the corresponding VectorSpace implements the from_numpy method:

space = NumpyVectorSpace(4)
numpy_array = np.linspace(0, 4, 8).reshape((2, 4))
print(numpy_array)

vector_array = space.from_numpy(numpy_array)
vector_array
[[0.         0.57142857 1.14285714 1.71428571]
 [2.28571429 2.85714286 3.42857143 4.        ]]
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[0.         0.57142857 1.14285714 1.71428571]
     [2.28571429 2.85714286 3.42857143 4.        ]],
    _len=2)

Warning

The generated VectorArray might take ownership of the provided data. In particular, this is the case for NumpyVectorArrays. So if you change the VectorArray, the original array will change and vice versa:

numpy_array[0, 0] = 99
vector_array
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[99.          0.57142857  1.14285714  1.71428571]
     [ 2.28571429  2.85714286  3.42857143  4.        ]],
    _len=2)

To avoid this problem, you can set ensure_copy to True:

numpy_array = np.linspace(0, 4, 8).reshape((2, 4))
vector_array = space.from_numpy(numpy_array, ensure_copy=True)

numpy_array[0, 0] = 99
print(numpy_array)
vector_array
[[99.          0.57142857  1.14285714  1.71428571]
 [ 2.28571429  2.85714286  3.42857143  4.        ]]
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[0.         0.57142857 1.14285714 1.71428571]
     [2.28571429 2.85714286 3.42857143 4.        ]],
    _len=2)

If you quickly want to create a NumpyVectorArray, you can also use from_numpy as a class method of NumpyVectorSpace, which will infer the dimension of the space from the data:

vector_array = NumpyVectorSpace.from_numpy(numpy_array)
vector_array
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[99.          0.57142857  1.14285714  1.71428571]
     [ 2.28571429  2.85714286  3.42857143  4.        ]],
    _len=2)

Warning

Do not use from_numpy in generic code that is supposed to work with arbitrary external solvers that might not support converting data from NumPy or only at high costs. Using it is fine, however, when you know that you are dealing with NumpyVectorArrays, for instance when building ROMs.

Converting VectorArrays to NumPy arrays

Some VectorArrays also implement a to_numpy method that returns the internal data as a NumPy array:

space = NumpyVectorSpace(4)
vector_array = space.random(3)
vector_array
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[0.4434142  0.22723872 0.55458479 0.06381726]
     [0.82763117 0.6316644  0.75808774 0.35452597]
     [0.97069802 0.89312112 0.7783835  0.19463871]],
    _len=3)
array = vector_array.to_numpy()
array
array([[0.4434142 , 0.22723872, 0.55458479, 0.06381726],
       [0.82763117, 0.6316644 , 0.75808774, 0.35452597],
       [0.97069802, 0.89312112, 0.7783835 , 0.19463871]])

Warning

Again, the returned NumPy array might share data with the VectorArray:

array[:] = 0
vector_array
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[0. 0. 0. 0.]
     [0. 0. 0. 0.]
     [0. 0. 0. 0.]],
    _len=3)

As with from_numpy we can use the ensure_copy parameter to make sure we get a copy:

vector_array = space.random(3)
array = vector_array.to_numpy(ensure_copy=True)
array[:] = 0
vector_array
NumpyVectorArray(
    NumpyVectorSpace(4),
    [[0.466721   0.04380377 0.15428949 0.68304895]
     [0.74476216 0.96750973 0.32582536 0.37045971]
     [0.46955581 0.18947136 0.12992151 0.47570493]],
    _len=3)

Rows and columns

First time pyMOR users coming from numerical linear algebra often say that pyMOR uses row vectors instead column vectors. However, it is more useful to think of VectorArrays as simple lists of vectors, that do not have any notion of being a row or column vector. When a matrix Operator is applied to a VectorArray, think of a for-loop, where the corresponding linear Operator is applied individually to each vector in the array, not of a matrix-matrix product. What is true, however, is that from_numpy / to_numpy interpret VectorArrays as matrices of row vectors. The reason for that is that NumPy prefers a C-like memory layout for matrices, where the individual rows are stored consecutively in memory. (In contrast, Matlab uses a Fortran-like memory layout, where the columns are stored consecutively in memory.)

Basic operations

space = NumpyVectorSpace(5)

U = space.ones(2)
V = space.full(2)
W = space.random(1)
print(U)
print(V)
print(W)
[[1. 1. 1. 1. 1.]
 [1. 1. 1. 1. 1.]]
[[2 2 2 2 2]]
[[0.22690935 0.66981399 0.43715192 0.8326782  0.7002651 ]]

We can append the vectors from one array to another:

V.append(space.full(3))
print(V)
[[2 2 2 2 2]
 [3 3 3 3 3]]

The original array is left unchanged. If we do not want to copy the vectors but remove them from the original array, we can use the remove_from_other argument.

We can add VectorArrays and multiply by a scalar:

print(U + V)
print(U * 2)
[[3. 3. 3. 3. 3.]
 [4. 4. 4. 4. 4.]]
[[2. 2. 2. 2. 2.]
 [2. 2. 2. 2. 2.]]

All operations are vectorized as they are for NumPy arrays. Addition and other operations also allow ‘broadcasting’ a single vector to the entire array:

print(V + W)
[[2.22690935 2.66981399 2.43715192 2.8326782  2.7002651 ]
 [3.22690935 3.66981399 3.43715192 3.8326782  3.7002651 ]]

VectorArrays can be scaled in place using the scal method:

U.scal(3)
print(U)
[[3. 3. 3. 3. 3.]
 [3. 3. 3. 3. 3.]]

Adding a scalar multiple can be achieved using axpy:

U.axpy(2, V)
print(U)
[[7. 7. 7. 7. 7.]
 [9. 9. 9. 9. 9.]]

The same could be achieved with:

U = space.ones(2)
U += 2 * V
print(U)
[[5. 5. 5. 5. 5.]
 [7. 7. 7. 7. 7.]]

However, this would create an unnecessary temporary for 2 * V. axpy also allows broadcasting or specifying multiple coefficients:

U = space.ones(2)
U.axpy(2, W)
print(U)

U = space.ones(2)
U.axpy(np.array([1, -1]), V)
print(U)
[[1.4538187  2.33962799 1.87430384 2.66535639 2.4005302 ]
 [1.4538187  2.33962799 1.87430384 2.66535639 2.4005302 ]]
[[ 3.  3.  3.  3.  3.]
 [-2. -2. -2. -2. -2.]]

Often, a VectorArray represents a basis, and we want to form linear combinations w.r.t. this basis. For this, we can use the lincomb method:

V.lincomb(np.array([2,3]))
NumpyVectorArray(NumpyVectorSpace(5), [[13 13 13 13 13]], _len=1)

This can also be vectorized:

V.lincomb(np.array([[2,3],
                    [1,0],
                    [0,1]]))
NumpyVectorArray(
    NumpyVectorSpace(5),
    [[13 13 13 13 13]
     [ 2  2  2  2  2]
     [ 3  3  3  3  3]],
    _len=3)

Inner products can be computed using the inner method:

print(U)
print(V)
U.inner(V)
[[ 3.  3.  3.  3.  3.]
 [-2. -2. -2. -2. -2.]]
[[2 2 2 2 2]
 [3 3 3 3 3]]
array([[ 30.,  45.],
       [-20., -30.]])

inner returns the matrix of all possible inner products between vectors in U and V. pairwise_inner only returns the inner product between the i-th vector in U and the i-th vector in V:

U.pairwise_inner(V)
array([ 30., -30.])

Norms can be conveniently computed using using the norm or norm2 methods:

print(U.norm())
print(np.sqrt(U.norm2()))
print(np.sqrt(U.pairwise_inner(U)))
[6.70820393 4.47213595]
[6.70820393 4.47213595]
[6.70820393 4.47213595]

The norm and inner methods have an optional product argument that can be used to compute norms and inner products w.r.t. the specified product Operator:

from pymor.operators.numpy import NumpyMatrixOperator
mat = np.diag(np.arange(5) + 1.)
print(mat)
prod = NumpyMatrixOperator(mat)
print(U)
print(U.norm2(product=prod))
[[1. 0. 0. 0. 0.]
 [0. 2. 0. 0. 0.]
 [0. 0. 3. 0. 0.]
 [0. 0. 0. 4. 0.]
 [0. 0. 0. 0. 5.]]
[[ 3.  3.  3.  3.  3.]
 [-2. -2. -2. -2. -2.]]
[135.  60.]

Finally, we can use the gramian method as a shorthand to compute the matrix of all inner products of vectors in U:

print(U.gramian())
[[ 45. -30.]
 [-30.  20.]]

Copies

Remember that assignment using = never copies data in Python, but only assigns a new name to an existing object:

space = NumpyVectorSpace(3)
U = space.ones()
V = U
U *= 0
print(U)
print(V)
[[0. 0. 0.]]
[[0. 0. 0.]]

To get a new array we use the copy method:

U = space.ones()
V = U.copy()
U *= 0
print(U)
print(V)
[[0. 0. 0.]]
[[1. 1. 1.]]

VectorArrays use copy-on-write semantics. This means that just calling copy will not copy any actual data. Only when one of the arrays is modified, the data is copied. Sometimes, this behavior is not desired. Then deep=True can be specified to force an immediate copy of the data.

Indexing and views

Like NumPy arrays, VectorArrays can be indexes with positive or negative integers, slices or lists of integers. This always results in a new VectorArray that is a view onto the data in the original array.

space = NumpyVectorSpace(4)
U = space.ones(5)
print(U)
U[1].scal(2)
U[-1].scal(42)
print(U)
print(U[:3].gramian())
print(U[[1,-1]].lincomb(np.array([1,1])))
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
[[ 1.  1.  1.  1.]
 [ 2.  2.  2.  2.]
 [ 1.  1.  1.  1.]
 [ 1.  1.  1.  1.]
 [42. 42. 42. 42.]]
[[ 4.  8.  4.]
 [ 8. 16.  8.]
 [ 4.  8.  4.]]
[[44. 44. 44. 44.]]

Note

Indexing with lists of indices creates a view as well. This is different from NumPy where ‘advanced indexing’ always yields a copy.

V = U[[3,0]]
print(V.is_view)
print(V.base is U)
V.scal(0)
print(U)
True
True
[[ 0.  0.  0.  0.]
 [ 2.  2.  2.  2.]
 [ 1.  1.  1.  1.]
 [ 0.  0.  0.  0.]
 [42. 42. 42. 42.]]

DOF access

All interface methods of VectorArray we have covered so far operate on abstract vectors and either return None, new VectorArrays or scalar quantities based on inner products. In general, there is no way to extract the actual (potentially high-dimensional) data stored in a VectorArray. This is a deliberate design decision in pyMOR as the computationally heavy operations on this data should be handled by the respective backend. In particular, note again that from_numpy / to_numpy should only be used for NumpyVectorArrays, debugging or other special cases. Usually, algorithms should not rely on these methods.

However, sometimes it is necessary to access selected degrees of freedom (entries) of the vectors stored in the VectorArray, in particular for techniques such as empirical interpolation. For that reason, most VectorArrays implement the dofs method, which allows extracting the values of certain degrees of freedom:

space = NumpyVectorSpace(6)
U = space.from_numpy(np.arange(6) * 2.)
U.append(space.full(-1))
print(U)
print(U.dofs(np.array([3, 5])))
[[ 0.  2.  4.  6.  8. 10.]
 [-1. -1. -1. -1. -1. -1.]]
[[ 6. 10.]
 [-1. -1.]]

Related methods are sup_norm and amax:

print(U.sup_norm())
dofs, values = U.amax()
print(dofs, values)
for i in range(len(U)):
    print(np.abs(U[i].dofs([dofs[i]])) == values[i])
[10.  1.]
[5 0] [10.  1.]
[[ True]]
[[ True]]

By speaking of degrees of freedom, we assume that our vectors are coefficient vectors w.r.t. some basis and that dofs returns those coefficients. We further assume that, if defined, from_numpy returns the NumPy array of the values at all degrees of freedom and that inner is just the Euclidean inner product of these coefficient vectors:

numpy_array = U.dofs(np.arange(U.dim))
print(numpy_array)
print(numpy_array == U.to_numpy())
print(numpy_array @ numpy_array.T == U.inner(U))
[[ 0.  2.  4.  6.  8. 10.]
 [-1. -1. -1. -1. -1. -1.]]
[[ True  True  True  True  True  True]
 [ True  True  True  True  True  True]]
[[ True  True]
 [ True  True]]

Warning

Theoretically, dofs allows to extract all data from a VectorArray by calling U.dofs(np.arange(U.dim)) as above. However, doing so is strongly discouraged and might lead to bad performance as dofs is designed to be used to only extract a small amount of degrees of freedom.

Complex Numbers

Like NumPy, VectorArrays transparently handle complex entries and VectorSpaces do not distinguish between being complex or not:

space = NumpyVectorSpace(3)
U = space.ones(2)
U[1].scal(1j)
print(U)
[[1.+0.j 1.+0.j 1.+0.j]
 [0.+1.j 0.+1.j 0.+1.j]]

This also works for external solvers that do not support complex numbers themselves. In that case, pyMOR automatically manages pairs of real vectors to represent the real and imaginary parts and translates all operations on these ‘complexified’ vectors to the corresponding operations on the real and complex parts. For any VectorArray, those can be accessed through the real and imag attributes:

U.real, U.imag
(NumpyVectorArray(
     NumpyVectorSpace(3),
     [[1. 1. 1.]
      [0. 0. 0.]],
     _len=2),
 NumpyVectorArray(
     NumpyVectorSpace(3),
     [[0. 0. 0.]
      [1. 1. 1.]],
     _len=2))

Even when the array is real, real will alwasy return a copy of the original array. When it comes to inner products, the convention in pyMOR is that inner products are anti-linear in the first argument:

print((U * 1j).inner(U) == U.inner(U) * (-1j))
print(U.inner(U * 1j) == U.inner(U) * 1j)
[[ True  True]
 [ True  True]]
[[ True  True]
 [ True  True]]

Internals / other types of arrays

Finally, we want to take brief look at the internals of VectorArrays and also try one other type of array. Let’s first build another NumpyVectorArray.

space = NumpyVectorSpace(4)
U = space.random(3)

So where does U actually store its data? NumpyVectorArray is a subclass of the VectorArray interface class, which takes care of arguments checking and indexing. The actual work, however, is performed by an implementation object, which also holds the data:

type(U.impl)
pymor.vectorarrays.numpy.NumpyVectorArrayImpl

In the case of NumpyVectorArrayImpl, the data is stored in the private _array attribute:

U.impl._array
array([[0.31236664, 0.8322598 , 0.80476436, 0.38747838],
       [0.2883281 , 0.6824955 , 0.13975248, 0.1999082 ],
       [0.00736227, 0.78692438, 0.66485086, 0.70516538]])

However, you should never access this attribute directly and use from_numpy / to_numpy instead, as _array may not be what you expect: the array creation methods empty, zeros, ones, full have an additional reserve argument, which allows the user to give the VectorArray implementation a hint how large the array might grow.+ The implementation can use this to over-allocate memory to improve performance:

V = space.ones(reserve=5)
V.impl._array
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

Here, _array is large enough to store 5 vectors, even though the array only contains one vector:

print(len(V))
print(V.to_numpy())
1
[[1. 1. 1. 1.]]

If we append one vector, the second row in the array is overwritten:

V.append(space.full(2))
V.impl._array
array([[1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

As NumPy arrays store their data consecutively in memory, this can significantly improve performance of iterative algorithms where new vectors are repeatedly appended to a given array. However, this is only useful for implementations which do store the data consecutively in memory.

We will now look at another type of VectorArray, which uses Python lists of one-dimensional NumPy arrays to store the data as a list of vectors:

from pymor.vectorarrays.list import NumpyListVectorSpace
space = NumpyListVectorSpace(4)
W = space.random(3)

W behaves as any other array. It also implements to_numpy so that we can look at the data directly:

W.to_numpy()
array([[0.78072903, 0.45891578, 0.5687412 , 0.139797  ],
       [0.11453007, 0.66840296, 0.47109621, 0.56523611],
       [0.76499886, 0.63471832, 0.5535794 , 0.55920716]])

However, the implementation class now holds a list of vector objects:

print(type(W))
print(type(W.impl))
print(W.impl._list)
print(type(W.impl._list))
<class 'pymor.vectorarrays.list.ListVectorArray'>
<class 'pymor.vectorarrays.list.ListVectorArrayImpl'>
[NumpyVector('??'), NumpyVector('??'), NumpyVector('??')]
<class 'list'>

Each of these NumpyVector objects now stores one of the vectors in the array:

W.impl._list[1]._array
array([0.11453007, 0.66840296, 0.47109621, 0.56523611])

Ending this tutorial, we want to take a brief look how one of the interface methods is actually implemented by the different array types. We choose pairwise_inner as the implementation is particularly simple. Here is the implementation for NumpyVectorArray:

from pymor.tools.formatsrc import print_source
print_source(U.impl.pairwise_inner)
    def pairwise_inner(self, other, ind, oind):
        A = self._array[:self._len] if ind is None else self._array[ind]
        B = other._array[:other._len] if oind is None else other._array[oind]

        # .conj() is a no-op on non-complex data types
        return np.sum(A.conj() * B, axis=1)

We see the actual call to NumPy that does all computations in the last line. The implementation has to take into account that one of the arrays might be a view, in which case the selected indices are given by ind/oind. Also, reserve might have been used when the array was created, making _array larger than it actually is. The true length is stored in the _len attribute, so that the first _len vectors from the array are selected.

Here is the code for the list-based array:

print_source(W.impl.pairwise_inner)
    def pairwise_inner(self, other, ind, oind):
        return np.array([a.inner(b) for a, b in zip(self._indexed(ind), other._indexed(oind))])

W.impl is a generic ListVectorArrayImpl object, which provides a common implementation for all VectorArrays based on Python lists. The implementation just loops over all pairs a, b of vector objects in both arrays and calls a’s inner method, which computes the actual inner product. In case of NumpyVector, the code looks like this:

print_source(W.impl._list[0].inner)
    def inner(self, other):
        assert self.dim == other.dim
        return np.sum(self._array.conj() * other._array)

As most PDE solver libraries only know of single vectors and have no notion of an array of vectors, VectorArrays for external solvers are typically based on ListVectorArray / ListVectorArrayImpl. In that case, only the interface methods for a single Vector object have to be implemented. Note that for NumPy-based data, NumpyVectorArray is the default VectorArray type used throughout pyMOR, as it can benefit from vectorized operations on the underlying NumPy array. NumpyListVectorSpace is only implemented in pyMOR for testing purposes.