Run this tutorial
Click here to run this tutorial on binder:Tutorial: VectorArray Basics¶
What are VectorArrays and what are they used for in pyMOR?¶
While pyMOR uses NumPy
arrays to represent extrinsic quantities like
inputs or outputs of a Model
, the Model’s internal state is always represented
by VectorArrays
, ordered collections of abstract vectors of the same
dimension.
The reason for this is that when pyMOR is used in conjunction with an external
solver that implements the FOM, state space vectors must be compatible with the
solver’s internal data structures, for instance when some Operator
appearing in the FOM shall be applied to the vector.
Instead of constantly converting between NumPy arrays and the solver’s data structures
– which might even be impossible in some cases – pyMOR uses VectorArrays
to access
these vectors through a unified interface.
So as soon as you get in touch with a model’s state-space data, e.g., by calling its
solve
method, you will work with VectorArrays
.
Note that this also the case for ROMs. pyMOR does not distinguish between FOMs and ROMs,
so even though no external solver is involved, the ROM state is still represented by
VectorArrays
.
VectorSpaces / How to create VectorArrays¶
VectorArrays
are never instantiated directly but are always created by the
VectorSpace
they belong to.
You will find VectorSpaces
as the source
and range
space of Operators
or as the
solution_space
of Models
.
In the following, we will work with NumpyVectorArrays
that internally store the
vectors as two-dimensional NumPy arrays.
The corresponding VectorSpace
is called NumpyVectorSpace
.
To make a NumpyVectorSpace
, we need to specify the dimension of the contained vectors:
from pymor.basic import NumpyVectorSpace
import numpy as np
space = NumpyVectorSpace(7)
The dimension can be accessed via the dim
attribute:
space.dim
7
Our first VectorArray
will only contain three zero vectors:
U = space.zeros(3)
To check that we indeed have created three zero vectors of dimension 7, we can do the following:
print(len(U))
print(U.dim)
print(U.norm())
3
7
[0. 0. 0.]
Just zero vectors are boring. To create vectors with different entries, we can use the following methods:
print(space.ones(1))
print(space.full(42, count=4))
print(space.random(2))
[[1. 1. 1. 1. 1. 1. 1.]]
[[42 42 42 42 42 42 42]
[42 42 42 42 42 42 42]
[42 42 42 42 42 42 42]
[42 42 42 42 42 42 42]]
[[0.77395605 0.43887844 0.85859792 0.69736803 0.09417735 0.97562235
0.7611397 ]
[0.78606431 0.12811363 0.45038594 0.37079802 0.92676499 0.64386512
0.82276161]]
Sometimes, for instance when accumulating vectors in a loop, we want to initialize a
VectorArray
with no vectors in it.
We can use the empty
method for that:
U.empty()
NumpyVectorArray(NumpyVectorSpace(7), [], _len=0)
You might wonder how to create VectorArrays
with more interesting data.
When implementing Operators
or, for instance, the
solve
method, you will get in touch with data
from the actual solver backend you are using, that needs to be wrapped into a
corresponding VectorArray
.
For that, every VectorSpace
has a
make_array
method that takes care of the
wrapping for you.
In case of NumpyVectorSpace
, the backend is NumPy and the data is given as NumPy arrays:
space.make_array(np.arange(0, 14).reshape((2, 7)))
NumpyVectorArray(
NumpyVectorSpace(7),
[[ 0 1 2 3 4 5 6]
[ 7 8 9 10 11 12 13]],
_len=2)
Converting NumPy arrays to VectorArrays¶
Some but not all VectorArrays
can be initialized from NumPy arrays.
For these arrays, the corresponding VectorSpace
implements the
from_numpy
method:
space = NumpyVectorSpace(4)
numpy_array = np.linspace(0, 4, 8).reshape((2, 4))
print(numpy_array)
vector_array = space.from_numpy(numpy_array)
vector_array
[[0. 0.57142857 1.14285714 1.71428571]
[2.28571429 2.85714286 3.42857143 4. ]]
NumpyVectorArray(
NumpyVectorSpace(4),
[[0. 0.57142857 1.14285714 1.71428571]
[2.28571429 2.85714286 3.42857143 4. ]],
_len=2)
Warning
The generated VectorArray
might take ownership of the provided data.
In particular, this is the case for NumpyVectorArrays
.
So if you change the VectorArray
, the original array will change and vice versa:
numpy_array[0, 0] = 99
vector_array
NumpyVectorArray(
NumpyVectorSpace(4),
[[99. 0.57142857 1.14285714 1.71428571]
[ 2.28571429 2.85714286 3.42857143 4. ]],
_len=2)
To avoid this problem, you can set ensure_copy
to True
:
numpy_array = np.linspace(0, 4, 8).reshape((2, 4))
vector_array = space.from_numpy(numpy_array, ensure_copy=True)
numpy_array[0, 0] = 99
print(numpy_array)
vector_array
[[99. 0.57142857 1.14285714 1.71428571]
[ 2.28571429 2.85714286 3.42857143 4. ]]
NumpyVectorArray(
NumpyVectorSpace(4),
[[0. 0.57142857 1.14285714 1.71428571]
[2.28571429 2.85714286 3.42857143 4. ]],
_len=2)
If you quickly want to create a NumpyVectorArray
, you can also use
from_numpy
as a class method of NumpyVectorSpace
, which will infer the dimension of the
space from the data:
vector_array = NumpyVectorSpace.from_numpy(numpy_array)
vector_array
NumpyVectorArray(
NumpyVectorSpace(4),
[[99. 0.57142857 1.14285714 1.71428571]
[ 2.28571429 2.85714286 3.42857143 4. ]],
_len=2)
Warning
Do not use from_numpy
in generic code that is supposed to work with arbitrary external solvers that might
not support converting data from NumPy or only at high costs.
Using it is fine, however, when you know that you are dealing with NumpyVectorArrays
,
for instance when building ROMs.
Converting VectorArrays to NumPy arrays¶
Some VectorArrays
also implement a
to_numpy
method that returns the
internal data as a NumPy array:
space = NumpyVectorSpace(4)
vector_array = space.random(3)
vector_array
NumpyVectorArray(
NumpyVectorSpace(4),
[[0.4434142 0.22723872 0.55458479 0.06381726]
[0.82763117 0.6316644 0.75808774 0.35452597]
[0.97069802 0.89312112 0.7783835 0.19463871]],
_len=3)
array = vector_array.to_numpy()
array
array([[0.4434142 , 0.22723872, 0.55458479, 0.06381726],
[0.82763117, 0.6316644 , 0.75808774, 0.35452597],
[0.97069802, 0.89312112, 0.7783835 , 0.19463871]])
Warning
Again, the returned NumPy array might share data with the VectorArray
:
array[:] = 0
vector_array
NumpyVectorArray(
NumpyVectorSpace(4),
[[0. 0. 0. 0.]
[0. 0. 0. 0.]
[0. 0. 0. 0.]],
_len=3)
As with from_numpy
we can use the ensure_copy
parameter to make sure we get a copy:
vector_array = space.random(3)
array = vector_array.to_numpy(ensure_copy=True)
array[:] = 0
vector_array
NumpyVectorArray(
NumpyVectorSpace(4),
[[0.466721 0.04380377 0.15428949 0.68304895]
[0.74476216 0.96750973 0.32582536 0.37045971]
[0.46955581 0.18947136 0.12992151 0.47570493]],
_len=3)
Rows and columns¶
First time pyMOR users coming from numerical linear algebra often say that
pyMOR uses row vectors instead column vectors.
However, it is more useful to think of VectorArrays
as simple lists of vectors,
that do not have any notion of being a row or column vector.
When a matrix Operator
is applied to a VectorArray
, think of a for
-loop,
where the corresponding linear Operator
is applied individually to each vector in
the array, not of a matrix-matrix product.
What is true, however, is that
from_numpy
/
to_numpy
interpret VectorArrays
as matrices of row vectors.
The reason for that is that NumPy prefers a C-like memory layout for matrices, where the
individual rows are stored consecutively in memory.
(In contrast, Matlab uses a Fortran-like memory layout, where the columns are
stored consecutively in memory.)
Basic operations¶
space = NumpyVectorSpace(5)
U = space.ones(2)
V = space.full(2)
W = space.random(1)
print(U)
print(V)
print(W)
[[1. 1. 1. 1. 1.]
[1. 1. 1. 1. 1.]]
[[2 2 2 2 2]]
[[0.22690935 0.66981399 0.43715192 0.8326782 0.7002651 ]]
We can append
the vectors from one array
to another:
V.append(space.full(3))
print(V)
[[2 2 2 2 2]
[3 3 3 3 3]]
The original array is left unchanged.
If we do not want to copy the vectors but remove them from the original array, we can use the
remove_from_other
argument.
We can add VectorArrays
and multiply by a scalar:
print(U + V)
print(U * 2)
[[3. 3. 3. 3. 3.]
[4. 4. 4. 4. 4.]]
[[2. 2. 2. 2. 2.]
[2. 2. 2. 2. 2.]]
All operations are vectorized as they are for NumPy arrays. Addition and other operations also allow ‘broadcasting’ a single vector to the entire array:
print(V + W)
[[2.22690935 2.66981399 2.43715192 2.8326782 2.7002651 ]
[3.22690935 3.66981399 3.43715192 3.8326782 3.7002651 ]]
VectorArrays
can be scaled in place using the
scal
method:
U.scal(3)
print(U)
[[3. 3. 3. 3. 3.]
[3. 3. 3. 3. 3.]]
Adding a scalar multiple can be achieved using
axpy
:
U.axpy(2, V)
print(U)
[[7. 7. 7. 7. 7.]
[9. 9. 9. 9. 9.]]
The same could be achieved with:
U = space.ones(2)
U += 2 * V
print(U)
[[5. 5. 5. 5. 5.]
[7. 7. 7. 7. 7.]]
However, this would create an unnecessary temporary for 2 * V
.
axpy
also allows broadcasting
or specifying multiple coefficients:
U = space.ones(2)
U.axpy(2, W)
print(U)
U = space.ones(2)
U.axpy(np.array([1, -1]), V)
print(U)
[[1.4538187 2.33962799 1.87430384 2.66535639 2.4005302 ]
[1.4538187 2.33962799 1.87430384 2.66535639 2.4005302 ]]
[[ 3. 3. 3. 3. 3.]
[-2. -2. -2. -2. -2.]]
Often, a VectorArray
represents a basis, and we want to form linear combinations
w.r.t. this basis.
For this, we can use the lincomb
method:
V.lincomb(np.array([2,3]))
NumpyVectorArray(NumpyVectorSpace(5), [[13 13 13 13 13]], _len=1)
This can also be vectorized:
V.lincomb(np.array([[2,3],
[1,0],
[0,1]]))
NumpyVectorArray(
NumpyVectorSpace(5),
[[13 13 13 13 13]
[ 2 2 2 2 2]
[ 3 3 3 3 3]],
_len=3)
Inner products can be computed using the inner
method:
print(U)
print(V)
U.inner(V)
[[ 3. 3. 3. 3. 3.]
[-2. -2. -2. -2. -2.]]
[[2 2 2 2 2]
[3 3 3 3 3]]
array([[ 30., 45.],
[-20., -30.]])
inner
returns the matrix of all possible inner products between vectors in U
and V
.
pairwise_inner
only returns the inner product
between the i-th vector in U
and the i-th vector in V
:
U.pairwise_inner(V)
array([ 30., -30.])
Norms can be conveniently computed using using the
norm
or
norm2
methods:
print(U.norm())
print(np.sqrt(U.norm2()))
print(np.sqrt(U.pairwise_inner(U)))
[6.70820393 4.47213595]
[6.70820393 4.47213595]
[6.70820393 4.47213595]
The norm
and inner
methods have an optional product
argument that can be used to compute
norms and inner products w.r.t. the specified product Operator
:
from pymor.operators.numpy import NumpyMatrixOperator
mat = np.diag(np.arange(5) + 1.)
print(mat)
prod = NumpyMatrixOperator(mat)
print(U)
print(U.norm2(product=prod))
[[1. 0. 0. 0. 0.]
[0. 2. 0. 0. 0.]
[0. 0. 3. 0. 0.]
[0. 0. 0. 4. 0.]
[0. 0. 0. 0. 5.]]
[[ 3. 3. 3. 3. 3.]
[-2. -2. -2. -2. -2.]]
[135. 60.]
Finally, we can use the gramian
method as a
shorthand to compute the matrix of all inner products of vectors in U
:
print(U.gramian())
[[ 45. -30.]
[-30. 20.]]
Copies¶
Remember that assignment using =
never copies data in Python, but only assigns a new name
to an existing object:
space = NumpyVectorSpace(3)
U = space.ones()
V = U
U *= 0
print(U)
print(V)
[[0. 0. 0.]]
[[0. 0. 0.]]
To get a new array we use the copy
method:
U = space.ones()
V = U.copy()
U *= 0
print(U)
print(V)
[[0. 0. 0.]]
[[1. 1. 1.]]
VectorArrays
use copy-on-write semantics.
This means that just calling copy
will not
copy any actual data.
Only when one of the arrays is modified, the data is copied.
Sometimes, this behavior is not desired. Then deep=True
can be specified to force an immediate
copy of the data.
Indexing and views¶
Like NumPy arrays, VectorArrays
can be indexes with positive or negative integers,
slices or lists of integers.
This always results in a new VectorArray
that is a view onto the data in the original
array.
space = NumpyVectorSpace(4)
U = space.ones(5)
print(U)
U[1].scal(2)
U[-1].scal(42)
print(U)
print(U[:3].gramian())
print(U[[1,-1]].lincomb(np.array([1,1])))
[[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]
[1. 1. 1. 1.]]
[[ 1. 1. 1. 1.]
[ 2. 2. 2. 2.]
[ 1. 1. 1. 1.]
[ 1. 1. 1. 1.]
[42. 42. 42. 42.]]
[[ 4. 8. 4.]
[ 8. 16. 8.]
[ 4. 8. 4.]]
[[44. 44. 44. 44.]]
Note
Indexing with lists of indices creates a view as well. This is different from NumPy where ‘advanced indexing’ always yields a copy.
V = U[[3,0]]
print(V.is_view)
print(V.base is U)
V.scal(0)
print(U)
True
True
[[ 0. 0. 0. 0.]
[ 2. 2. 2. 2.]
[ 1. 1. 1. 1.]
[ 0. 0. 0. 0.]
[42. 42. 42. 42.]]
DOF access¶
All interface methods of VectorArray
we have covered so far operate on abstract vectors
and either return None
, new VectorArrays
or scalar quantities based on inner products.
In general, there is no way to extract the actual (potentially high-dimensional) data stored
in a VectorArray
.
This is a deliberate design decision in pyMOR as the computationally heavy operations on this
data should be handled by the respective backend.
In particular, note again that
from_numpy
/
to_numpy
should only be used for NumpyVectorArrays
, debugging or other special cases.
Usually, algorithms should not rely on these methods.
However, sometimes it is necessary to access selected degrees of freedom (entries) of the vectors
stored in the VectorArray
, in particular for techniques such as empirical interpolation.
For that reason, most VectorArrays
implement the
dofs
method, which allows extracting the
values of certain degrees of freedom:
space = NumpyVectorSpace(6)
U = space.from_numpy(np.arange(6) * 2.)
U.append(space.full(-1))
print(U)
print(U.dofs(np.array([3, 5])))
[[ 0. 2. 4. 6. 8. 10.]
[-1. -1. -1. -1. -1. -1.]]
[[ 6. 10.]
[-1. -1.]]
Related methods are sup_norm
and
amax
:
print(U.sup_norm())
dofs, values = U.amax()
print(dofs, values)
for i in range(len(U)):
print(np.abs(U[i].dofs([dofs[i]])) == values[i])
[10. 1.]
[5 0] [10. 1.]
[[ True]]
[[ True]]
By speaking of degrees of freedom, we assume that our vectors are coefficient vectors w.r.t.
some basis and that dofs
returns those coefficients.
We further assume that, if defined,
from_numpy
returns the NumPy array of the
values at all degrees of freedom and that inner
is just the Euclidean inner product of these coefficient vectors:
numpy_array = U.dofs(np.arange(U.dim))
print(numpy_array)
print(numpy_array == U.to_numpy())
print(numpy_array @ numpy_array.T == U.inner(U))
[[ 0. 2. 4. 6. 8. 10.]
[-1. -1. -1. -1. -1. -1.]]
[[ True True True True True True]
[ True True True True True True]]
[[ True True]
[ True True]]
Warning
Theoretically, dofs
allows to extract all
data from a VectorArray
by calling U.dofs(np.arange(U.dim))
as above.
However, doing so is strongly discouraged and might lead to bad performance as
dofs
is
designed to be used to only extract a small amount of degrees of freedom.
Complex Numbers¶
Like NumPy, VectorArrays
transparently handle complex entries and VectorSpaces
do not distinguish between being complex or not:
space = NumpyVectorSpace(3)
U = space.ones(2)
U[1].scal(1j)
print(U)
[[1.+0.j 1.+0.j 1.+0.j]
[0.+1.j 0.+1.j 0.+1.j]]
This also works for external solvers that do not support complex numbers themselves.
In that case, pyMOR automatically manages pairs of real vectors to represent the real and
imaginary parts and translates all operations on these ‘complexified’ vectors to the
corresponding operations on the real and complex parts.
For any VectorArray
, those can be accessed through the
real
and
imag
attributes:
U.real, U.imag
(NumpyVectorArray(
NumpyVectorSpace(3),
[[1. 1. 1.]
[0. 0. 0.]],
_len=2),
NumpyVectorArray(
NumpyVectorSpace(3),
[[0. 0. 0.]
[1. 1. 1.]],
_len=2))
Even when the array is real,
real
will alwasy return a copy of the original
array.
When it comes to inner products, the convention in pyMOR is that inner products are anti-linear
in the first argument:
print((U * 1j).inner(U) == U.inner(U) * (-1j))
print(U.inner(U * 1j) == U.inner(U) * 1j)
[[ True True]
[ True True]]
[[ True True]
[ True True]]
VectorSpace ids¶
VectorSpaces
can have an id
attached to them
to make different spaces with the same properties (e.g. dimension) distinguishable and protect
the user from potential errors.
space = NumpyVectorSpace(3)
different_space = NumpyVectorSpace(3, id='different')
print(space == different_space)
U = space.ones()
print(U in space, U in different_space)
False
True False
It is not allowed to combine arrays from different spaces, in particular from spaces with different ids:
V = different_space.zeros()
print(V.space.id)
U + V
different
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[43], line 3
1 V = different_space.zeros()
2 print(V.space.id)
----> 3 U + V
File /builds/pymor/pymor/src/pymor/vectorarrays/interface.py:693, in VectorArray.__add__(self, other)
691 assert other == 0
692 return self.copy()
--> 693 assert other in self.space
694 assert len(self) == len(other) or len(other) == 1
695 return type(self)(self.space, self.impl.axpy_copy(1, other.impl, self.ind, other.ind))
AssertionError:
Internals / other types of arrays¶
Finally, we want to take brief look at the internals of VectorArrays
and also try one other type of
array. Let’s first build another NumpyVectorArray
.
space = NumpyVectorSpace(4)
U = space.random(3)
So where does U
actually store its data?
NumpyVectorArray
is a subclass of the VectorArray
interface class, which takes care of
arguments checking and indexing.
The actual work, however, is performed by an implementation object, which also holds the data:
type(U.impl)
pymor.vectorarrays.numpy.NumpyVectorArrayImpl
In the case of NumpyVectorArrayImpl
, the data is stored in the
private _array
attribute:
U.impl._array
array([[0.31236664, 0.8322598 , 0.80476436, 0.38747838],
[0.2883281 , 0.6824955 , 0.13975248, 0.1999082 ],
[0.00736227, 0.78692438, 0.66485086, 0.70516538]])
However, you should never access this attribute directly and use
from_numpy
/
to_numpy
instead, as _array
may not be what you expect:
the array creation methods
empty
,
zeros
,
ones
,
full
have an additional reserve
argument,
which allows the user to give the VectorArray
implementation a hint how large the array
might grow.+
The implementation can use this to over-allocate memory to improve performance:
V = space.ones(reserve=5)
V.impl._array
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
Here, _array
is large enough to store 5 vectors, even though the array only contains one vector:
print(len(V))
print(V.to_numpy())
1
[[1. 1. 1. 1.]]
If we append one vector, the second row in the array is overwritten:
V.append(space.full(2))
V.impl._array
array([[1., 1., 1., 1.],
[2., 2., 2., 2.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
As NumPy arrays store their data consecutively in memory, this can significantly improve performance of iterative algorithms where new vectors are repeatedly appended to a given array. However, this is only useful for implementations which do store the data consecutively in memory.
We will now look at another type of VectorArray
, which uses Python lists of one-dimensional
NumPy arrays to store the data as a list of vectors:
from pymor.vectorarrays.list import NumpyListVectorSpace
space = NumpyListVectorSpace(4)
W = space.random(3)
W
behaves as any other array.
It also implements to_numpy
so that we can look at the data directly:
W.to_numpy()
array([[0.78072903, 0.45891578, 0.5687412 , 0.139797 ],
[0.11453007, 0.66840296, 0.47109621, 0.56523611],
[0.76499886, 0.63471832, 0.5535794 , 0.55920716]])
However, the implementation class now holds a list of vector objects:
print(type(W))
print(type(W.impl))
print(W.impl._list)
print(type(W.impl._list))
<class 'pymor.vectorarrays.list.ListVectorArray'>
<class 'pymor.vectorarrays.list.ListVectorArrayImpl'>
[NumpyVector('??'), NumpyVector('??'), NumpyVector('??')]
<class 'list'>
Each of these NumpyVector
objects now stores one of the vectors
in the array:
W.impl._list[1]._array
array([0.11453007, 0.66840296, 0.47109621, 0.56523611])
Ending this tutorial, we want to take a brief look how one of the interface methods is actually
implemented by the different array types.
We choose pairwise_inner
as the implementation is
particularly simple.
Here is the implementation for NumpyVectorArray
:
from pymor.tools.formatsrc import print_source
print_source(U.impl.pairwise_inner)
def pairwise_inner(self, other, ind, oind):
A = self._array[:self._len] if ind is None else self._array[ind]
B = other._array[:other._len] if oind is None else other._array[oind]
# .conj() is a no-op on non-complex data types
return np.sum(A.conj() * B, axis=1)
We see the actual call to NumPy that does all computations in the last line.
The implementation has to take into account that one of the arrays might be a view, in which case the
selected indices are given by ind
/oind
.
Also, reserve
might have been used when the array was created, making _array
larger than it
actually is.
The true length is stored in the _len
attribute, so that the first _len
vectors from the array
are selected.
Here is the code for the list
-based array:
print_source(W.impl.pairwise_inner)
def pairwise_inner(self, other, ind, oind):
return np.array([a.inner(b) for a, b in zip(self._indexed(ind), other._indexed(oind))])
W.impl
is a generic ListVectorArrayImpl
object, which provides a
common implementation for all VectorArrays
based on Python lists.
The implementation just loops over all pairs a, b
of vector objects in both arrays and calls a
’s
inner
method, which computes the actual inner product.
In case of NumpyVector
, the code looks like this:
print_source(W.impl._list[0].inner)
def inner(self, other):
assert self.dim == other.dim
return np.sum(self._array.conj() * other._array)
As most PDE solver libraries only know of single vectors and have no notion of an array of vectors,
VectorArrays
for external solvers are typically based on ListVectorArray
/
ListVectorArrayImpl
.
In that case, only the interface methods for a single Vector
object
have to be implemented.
Note that for NumPy-based data, NumpyVectorArray
is the default VectorArray
type used throughout
pyMOR, as it can benefit from vectorized operations on the underlying NumPy array.
NumpyListVectorSpace
is only implemented in pyMOR for testing purposes.