# The block assembly

The block matrix assembly in FEAT2 is an alternative assembly strategy for block matrices. Instead of setting up each submatrix in a block matrix separately, this technique directly works on the complete block matrix, thus saving a lot of computational time for the precomputation of all the basis functions. However, for applying this technique, one has to know a bit more about the assembly of matrices and vectors.

In contrast to the "scalar" assembly, the "block" assembly does not have separate structures for a linear form, bilinear form or trilinear form. Instead, appropriate callback routines directly calculate local matrix/vector entries which are imposed by the block assembly routines into the global matrices/vectors. This involves some additional loops in the callback routines, and the callback routines directly have to deal with finite element basis functions.

Contents

1. Involved modules and functionality
2. Basics
3. Providing nonlinearities / FE functions / temporary memory
4. Standard operators
5. Block matrices: Advanced operator assembly
6. Block vectors: Advanced vector assembly
7. Accessing FE functions and temporary arrays

## Involved modules and functionality

The following modules are involved in the block assembly:

| Module | Description |
| --- | --- |
| blockmatassemblybase.f90 | Definition of structures |
| blockmatassembly.f90 | The actual assembly routines |
| blockmatassemblystdop.f90 | Implementation of standard operators with block assembly routines |
| feevaluation2.f90 | Evaluation of finite element functions for nonlinearities |

Currently, the block assembly routines in blockmatassembly.f90 realise the following functionalities:

• Assembly of block matrices: Subroutine bma_buildMatrix

This subroutine realises the assembly of block matrices. A special callback routine has to be provided that calculates local matrix entries.

• Assembly of block vectors: Subroutine bma_buildVector

This subroutine realises the assembly of block vectors. A special callback routine has to be provided that calculates local vector entries.

• Assembly of integrals: Subroutines bma_buildIntegral and bma_buildIntegrals

These subroutines can be used to calculate one or multiple integrals or other distributed operators. They are usually used to compute $L_2$ or $H^1$ norms but can also be used to compute the $\sup$ norm of a vector or vector field.

The assembly routines support scalar-valued as well as vector-valued finite element functions and can also deal with interleaved matrices and vectors. However, each functionality needs a special implementation in the corresponding callback routines, see below.

## Basics

When a block assembly routine is invoked, the following steps are done:

• The routine loops over sets of cells. On every cell, there exists a set of cubature points.

• On every cell and in each cubature point on every cell, the assembly routine calculates the values of the basis functions of all finite element spaces that are involved in the assembly. This includes the basis functions that come from the submatrices in the block matrix as well as from the basis functions of all involved nonlinearities.

• If there are finite element nonlinearities specified (e.g., a convection given by a finite element function), their values in all the cubature points are calculated.

• Next, a callback routine is invoked. This callback routine has to compute the local matrices/vectors/integrals in all the cubature points on all the elements given.

• Finally, the assembly routines collect the calculated local matrix/vector/integral contributions and form a global matrix/vector/integral from them.
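
The steps above can be illustrated by a small standalone program. This is only a sketch under stated assumptions and not FEAT2 code: the module `sketch_assembly`, the routine `calcLocalMass` and all array names are hypothetical, the mesh is a uniform 1D mesh of the unit interval with linear elements, and the step that evaluates nonlinearities is left out. It precomputes basis values and cubature weights, calls a callback for the local matrices and finally scatters them into a global matrix.

```fortran
! Hypothetical standalone sketch (not FEAT2 code): 1D mass matrix, P1 elements.
module sketch_assembly
  implicit none
  integer, parameter :: DP = kind(1.0d0)
contains
  ! "Callback": computes the local matrices on a set of cells from
  ! precomputed basis function values and cubature weights.
  subroutine calcLocalMass(Dbas, DcubWeight, Dentry)
    real(DP), intent(in)    :: Dbas(:,:,:)      ! (idofe, icubp, iel)
    real(DP), intent(in)    :: DcubWeight(:,:)  ! (icubp, iel)
    real(DP), intent(inout) :: Dentry(:,:,:)    ! (jdofe, idofe, iel)
    integer :: iel, icubp, idofe, jdofe
    do iel = 1, size(Dentry,3)
      do icubp = 1, size(DcubWeight,1)
        do idofe = 1, size(Dentry,2)
          do jdofe = 1, size(Dentry,1)
            Dentry(jdofe,idofe,iel) = Dentry(jdofe,idofe,iel) + &
                DcubWeight(icubp,iel) * &
                Dbas(jdofe,icubp,iel) * Dbas(idofe,icubp,iel)
          end do
        end do
      end do
    end do
  end subroutine
end module

program assembly_sketch
  use sketch_assembly
  implicit none
  integer, parameter :: ncells = 4, ncubp = 2, ndof = 2
  real(DP) :: Dbas(ndof,ncubp,ncells), DcubWeight(ncubp,ncells)
  real(DP) :: Dentry(ndof,ndof,ncells), Dglobal(ncells+1,ncells+1)
  real(DP) :: dh, Dxi(ncubp)
  integer :: iel, icubp, i, j

  dh = 1.0_DP / real(ncells,DP)
  ! 2-point Gauss rule on the reference interval [0,1]
  Dxi = (/ 0.5_DP - 0.5_DP/sqrt(3.0_DP), 0.5_DP + 0.5_DP/sqrt(3.0_DP) /)

  ! Loop over the cell set: basis values and cubature weights
  ! (reference weight times Jacobian determinant dh) in all points.
  do iel = 1, ncells
    do icubp = 1, ncubp
      Dbas(1,icubp,iel) = 1.0_DP - Dxi(icubp)
      Dbas(2,icubp,iel) = Dxi(icubp)
      DcubWeight(icubp,iel) = 0.5_DP * dh
    end do
  end do

  ! Invoke the callback to compute all local matrices at once.
  Dentry = 0.0_DP
  call calcLocalMass(Dbas, DcubWeight, Dentry)

  ! Collect the local contributions in the global matrix.
  ! Cell iel owns the global degrees of freedom iel and iel+1.
  Dglobal = 0.0_DP
  do iel = 1, ncells
    do i = 1, ndof
      do j = 1, ndof
        Dglobal(iel+i-1,iel+j-1) = Dglobal(iel+i-1,iel+j-1) + Dentry(j,i,iel)
      end do
    end do
  end do

  ! The entries of a mass matrix sum up to the measure of the domain.
  print *, 'Sum of all matrix entries: ', sum(Dglobal)
end program
```

Since the basis functions sum to one on every cell, the printed sum should be close to 1, the length of the unit interval, which gives a quick plausibility check of the loops.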

### Simple block matrix assembly

In the simplest form, calling the matrix assembly reads as follows:

    type(t_matrixBlock) :: rmatrix
    ...
    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,fcalcLocalMatrices)


The routine fcalcLocalMatrices is a user-defined callback routine that calculates the local matrix contributions.

### Simple block vector assembly

In the simplest form, calling the vector assembly reads as follows:

    type(t_vectorBlock) :: rrhs
    ...
    call bma_buildVector (rrhs,BMA_CALC_STANDARD,fcalcLocalVectors)


The routine fcalcLocalVectors is a user-defined callback routine that calculates the local vector contributions.

### Simple integral assembly: General integral

In the simplest form, calling the integral assembly reads as follows:

    type(t_scalarCubatureInfo), target :: rmyCubature
    real(DP) :: dintvalue
    ...
    call bma_buildIntegral (dintvalue,BMA_CALC_STANDARD,fcalcLocalIntegral,&
        rtriangulation,rcubatureInfo=rmyCubature)


The routine fcalcLocalIntegral is a callback routine that calculates the local integral contribution on a set of cells. rtriangulation specifies the underlying mesh and rmyCubature the cubature formula to be used. The result of the integration is saved to dintvalue.

The alternative routine bma_buildIntegrals can be used to calculate multiple integrals at once. For example,

    type(t_scalarCubatureInfo), target :: rmyCubature
    real(DP), dimension(3) :: Dintvalues
    ...
    call bma_buildIntegrals (Dintvalues,BMA_CALC_STANDARD,fcalcLocalIntegral,&
        rtriangulation,rcubatureInfo=rmyCubature)


calculates three domain integrals at once into Dintvalues(:). The corresponding local contributions are calculated by fcalcLocalIntegral. rmyCubature defines the cubature rule to be used on the mesh rtriangulation.

Remark: This form of the integration does not require any finite element function and is rather seldom used. The original idea of the integral assembly is to calculate norms, which involve finite element functions. For such an application, the routine is used in a slightly different manner.

## Providing nonlinearities / FE functions / temporary memory

All assembly routines can be called with an optional parameter revalVectors. In combination with the module feevaluation2.f90, this parameter offers the possibility to specify finite element functions that live on the same mesh as the one on which the assembly is carried out. All finite element functions given in this structure are automatically evaluated in all cubature points, and the calculated values are provided to the callback routines for further processing.

Example: A discrete stationary Oseen equation has the form $$-\Delta u_h + v_h \nabla u_h + \nabla p_h = f_h$$ with $u_h$, $v_h$ and $f_h$ vector fields. The operator $n(v_h,\cdot):=v_h \nabla \cdot$ needs a finite element function $v_h$ for the assembly, which specifies the direction of a flow. Such a function can be provided to the callback routines via revalVectors.

Let a finite element function $v_h$ be given as a block vector structure rvectorVh. This vector can be provided to the assembly routines as follows. We demonstrate the approach for the matrix assembly; for vector or integral assembly, the approach is exactly the same.

Providing a scalar function: Let $v_h=(v_1)$ be a block vector containing only one subvector. Such a subvector is provided to the assembly routines as follows:

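    type(t_fev2Vectors) :: rmyVectors
    type(t_vectorBlock) :: rvectorVh

    type(t_matrixBlock) :: rmatrix
    ...
    ! Add the scalar component v_1; nmaxDerivative=0 requests function values only
    call fev2_addVectorToEvalList (rmyVectors,rvectorVh%RvectorBlock(1),0)

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)

    call fev2_releaseVectorList (rmyVectors)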


There is no special initialisation routine for the structure rmyVectors. One simply adds all scalar components of the vector $v_h$ to rmyVectors and provides this structure to the bma_buildXXX routine. The assembly routine will evaluate the provided functions automatically.

Derivatives: The last parameter nmaxDerivative in the routine fev2_addVectorToEvalList is a derivative quantifier. It advises the evaluation routines what to evaluate. nmaxDerivative=0 (as specified above) calculates the function values in the cubature points, =1 the 1st derivatives, =2 the 2nd derivatives and so on. This can be used for more complicated operators which also involve derivatives of nonlinearities.

The following example assumes that $v_h=(v_1,v_2,v_3)$ has multiple components. It adds all components, one by one, and advises the assembly routine to calculate also their derivatives in the cubature points. So in the callback function the values of $v_h$ as well as the values of $\nabla v_h$ are available.

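    type(t_fev2Vectors) :: rmyVectors
    type(t_vectorBlock) :: rvectorVh

    type(t_matrixBlock) :: rmatrix
    ...
    ! Add all three components; nmaxDerivative=1 also requests 1st derivatives
    call fev2_addVectorToEvalList (rmyVectors,rvectorVh%RvectorBlock(1),1)
    call fev2_addVectorToEvalList (rmyVectors,rvectorVh%RvectorBlock(2),1)
    call fev2_addVectorToEvalList (rmyVectors,rvectorVh%RvectorBlock(3),1)

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)

    call fev2_releaseVectorList (rmyVectors)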


Providing vector fields: Let $v_h=(v_1,v_2)$ be a block vector describing a vector field (for example, a velocity vector field of a flow in 2D). The following example provides the vector field to the evaluation routine via the function fev2_addVectorFieldToEvalList. This subroutine expects all components of the vector field in the parameters and allows 1D, 2D and 3D vector fields:

    type(t_fev2Vectors) :: rmyVectors
    type(t_vectorBlock) :: rvectorVh

    type(t_matrixBlock) :: rmatrix
    ...
    ! Add the 2D vector field (v_1,v_2), function values only
    call fev2_addVectorFieldToEvalList (rmyVectors,0,&
        rvectorVh%RvectorBlock(1),rvectorVh%RvectorBlock(2))

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)

    call fev2_releaseVectorList (rmyVectors)


Again, the parameter nmaxDerivative (which is set to nmaxDerivative=0 in this example) specifies up to which order derivatives of the vector field are calculated. A call of the form

    ...
    call fev2_addVectorFieldToEvalList (rmyVectors,1,&
        rvectorVh%RvectorBlock(1),rvectorVh%RvectorBlock(2))
    ...


would calculate $v_h$ as well as $\nabla v_h$ and provide both to the callback routine.

Providing temporary memory: In addition to vectors and vector fields, the user can also specify "dummy vectors" and "dummy vector fields" in the t_fev2Vectors structure. With this technique, it is possible to provide temporary memory in all cubature points on all elements to callback routines. Such temporary memory is needed in complicated operators for intermediate calculations which cannot be done in one simple step.

Example: A non-Newtonian fluid following the Power law model is given by the equation $$-\nu(x)\Delta u + u \nabla u + \nabla p = f$$ with a nonlinear viscosity $$\nu(x) = \nu_0 z^{e/2 - 1},$$ $e,\,\nu_0\in\mathbb{R}$ constants and $z$ given by $$z=||D(u)||^2+\varepsilon,\qquad D(u)=\frac{1}{2} (\nabla u + \nabla u^T),$$ with $\varepsilon\in\mathbb{R}$ a constant. Computation routines for this model at first evaluate $\nabla u$ and $\nabla u^T$, then $||D(u)||$ and at the end $\nu(x)$. For the intermediate steps, some temporary memory is necessary which can be provided by "dummy vectors".
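
The chain of intermediate steps in this example can be sketched for a single cubature point as follows. This is a standalone illustration, not FEAT2 code: the parameter values, the sample velocity gradient and the use of the Frobenius norm for $||D(u)||$ are assumptions made only for this sketch.

```fortran
! Hypothetical standalone sketch: power law viscosity in one cubature point.
program powerlaw_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  ! Model parameters nu_0, e and epsilon (values chosen for illustration)
  real(DP), parameter :: dnu0 = 1.0_DP, de = 1.5_DP, deps = 1.0e-8_DP
  real(DP) :: Dgrad(2,2), Dd(2,2), dz, dnu

  ! Sample velocity gradient, Dgrad(i,j) = d u_i / d x_j.
  ! Column-major fill: only d u_1 / d x_2 = 1 is nonzero (simple shear).
  Dgrad = reshape((/ 0.0_DP, 0.0_DP, 1.0_DP, 0.0_DP /), (/ 2, 2 /))

  ! Intermediate result 1: D(u) = 1/2 (grad u + grad u^T)
  Dd = 0.5_DP * (Dgrad + transpose(Dgrad))

  ! Intermediate result 2: z = ||D(u)||^2 + eps (Frobenius norm assumed)
  dz = sum(Dd**2) + deps

  ! Final result: nu = nu_0 * z^(e/2 - 1)
  dnu = dnu0 * dz**(0.5_DP*de - 1.0_DP)

  print *, 'z  = ', dz
  print *, 'nu = ', dnu
end program
```

In an actual callback, the intermediate quantities `Dd` and `dz` would have to be stored for every cubature point on every element, which is the kind of storage the dummy vectors are meant to provide.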

The following routines provide temporary memory:

fev2_addDummyVectorToEvalList

provides one or multiple temporary arrays with memory in every cubature point on every element. The following example adds one temporary array to the structure rmyVectors.

    type(t_fev2Vectors) :: rmyVectors
    type(t_matrixBlock) :: rmatrix
    ...

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)
    ...

fev2_addDummyVecFieldToEvalList

provides temporary memory for one or multiple vector fields with memory in every cubature point on every element. The following example adds one temporary vector field with 2 vector components to the structure rmyVectors.

    type(t_fev2Vectors) :: rmyVectors
    type(t_matrixBlock) :: rmatrix
    ...

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)
    ...


The parameter nsubarrays additionally reserves a number of subarrays associated with this temporary array. This is used, e.g., to allocate memory for a vector field as well as its derivatives. The following example allocates temporary memory for a temporary vector field $v_h=(v_1,v_2)$ and its derivative $\nabla v_h$ in all cubature points on all elements. We need nsubarrays=3, corresponding to the three terms $v_h$, $\partial_x v_h$ and $\partial_y v_h$.

    type(t_fev2Vectors) :: rmyVectors
    type(t_matrixBlock) :: rmatrix
    ...

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)
    ...

fev2_addCellDummyToEvalList

provides one or multiple temporary arrays with memory for one value per element (cell-based information). The following example adds one temporary cell-based array to the structure rmyVectors.

    type(t_fev2Vectors) :: rmyVectors
    type(t_matrixBlock) :: rmatrix
    ...

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)
    ...

fev2_addDummyVecFieldToEvalList

provides temporary memory for one or multiple vector fields with memory for one value per element (cell-based information). The following example adds one temporary vector field with 2 cell-based vector components to the structure rmyVectors.

    type(t_fev2Vectors) :: rmyVectors
    type(t_matrixBlock) :: rmatrix
    ...

    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcLocalMatrices,revalVectors=rmyVectors)
    ...


## Standard operators

The module blockmatassemblystdop.f90 provides standard implementations of the most important standard operators - matrices, vectors and integrals. These implementations are given as callback routines which can either be provided to the bma_buildXXXX routines directly, or which can be called inside of a callback function. The following naming conventions are used here:

• bma_fcalc_XXXX

Routines with this name can be used as callback function in a call to bma_buildXXXX. Parameters are passed via a t_collection structure and the vector evaluation structure revalVectors. There is no need for writing a user-defined callback function. It is a wrapper for the corresponding bma_docalc_XXXX subroutine.

• bma_docalc_XXXX

This is the immediate version of the bma_fcalc_XXXX routine from above. Routines of type bma_docalc_XXXX can be called inside of a callback function of bma_buildXXXX to calculate a specific operator. This makes it easy to compute multiple operators in one step.

### Using bma_fcalc_XXXX for simple calculations

With the routines bma_fcalc_XXXX from blockmatassemblystdop.f90 one can easily assemble all types of matrices with very few commands. Parameters for these routines are passed via the "QuickAccess" arrays of the collection structure in a specified manner. In the following, we give some examples how these routines are used.

Assembly of a Laplace matrix

The following code calculates a Laplace matrix at position (ix,iy) of a block matrix. We specify "1.0" as multiplier in front of the Laplace matrix:

    type(t_matrixBlock) :: rmatrix
    type(t_collection) :: rcoll
    ...
    rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
    rcoll%IquickAccess(1) = ix
    rcoll%IquickAccess(2) = iy

    call bma_buildMatrix (&
        rmatrix,BMA_CALC_STANDARD,bma_fcalc_laplace, rcoll)
    ...

Assembly of a tensor Laplace matrix

The following code calculates a tensor Laplace matrix at position (1,1) of a block matrix - i.e., a $d\times d$ block matrix with the Laplace operator on the diagonal. The dimension $d$ of the tensor is taken from the underlying triangulation. We specify "1.0" as multiplier in front of the tensor Laplace matrix and use a tensor size of 2. This is typically used for the Stokes equations in 2D where the Stokes operator is exactly a $2\times 2$ block matrix with Laplace on the diagonal:

    type(t_matrixBlock) :: rmatrix
    type(t_collection) :: rcoll
    ...
    rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
    rcoll%IquickAccess(1) = 1
    rcoll%IquickAccess(2) = 1
    rcoll%IquickAccess(3) = 2

    call bma_buildMatrix (&
        rmatrix,BMA_CALC_STANDARD,bma_fcalc_laplaceTensor, rcoll)
    ...
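
As a sketch of the resulting structure, and writing $L$ for the scalar Laplace matrix (a symbol introduced here only for illustration), the tensor Laplace matrix of size 2 has the block form

$$\begin{pmatrix} L & 0 \\ 0 & L \end{pmatrix},$$

so the same scalar operator appears in both diagonal blocks, while this operator contributes nothing to the off-diagonal blocks.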

Assembly of a Mass matrix

The following code calculates a mass matrix at position (ix,iy) of a block matrix. We specify "1.0" as multiplier in front of the mass matrix:

    type(t_matrixBlock) :: rmatrix
    type(t_collection) :: rcoll
    ...
    rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
    rcoll%IquickAccess(1) = ix
    rcoll%IquickAccess(2) = iy

    call bma_buildMatrix (&
        rmatrix,BMA_CALC_STANDARD,bma_fcalc_mass, rcoll)
    ...


A tensor mass matrix can be assembled via bma_fcalc_massTensor, similar to the tensor Laplace matrix above.

Assembly of a "scalar" convection operator

The following code assembles the convection operator $n(u_h,v_h,w_h)=( (u_h \nabla) v_h, w_h )$ at position (1,1) of a block matrix with convection (2.0,3.0). We apply the operator only to the scalar matrix at position (1,1) which realises the convection in a convection-diffusion-reaction equation of type $$-\Delta u + u \nabla u + u = f.$$

    type(t_matrixBlock) :: rmatrix
    type(t_collection) :: rcoll
    ...
    rcoll%IquickAccess(1) = 1  ! x-position
    rcoll%IquickAccess(2) = 1  ! y-position
    rcoll%IquickAccess(3) = 0  ! scalar destination
    rcoll%IquickAccess(4) = 0  ! constant direction

    rcoll%DquickAccess(1) = 1.0_DP  ! multiplier
    rcoll%DquickAccess(2) = 2.0_DP  ! x-direction
    rcoll%DquickAccess(3) = 3.0_DP  ! y-direction

    call bma_buildMatrix (&
    ...

Assembly of a convection operator

The following code assembles the convection operator $n(u_h,v_h,w_h)=( (u_h \nabla) v_h, w_h )$ at position (1,1) of a block matrix. The nonlinearity $u_h$ is given as a finite element function, realised by the block vector rvelocity. We apply the nonlinearity to the full velocity tensor; the dimension is derived from the underlying triangulation, which is assumed to be 2D here.

    type(t_matrixBlock) :: rmatrix
    type(t_vectorBlock) :: rvelocity

    type(t_fev2Vectors) :: rmyVectors
    type(t_collection) :: rcoll
    ...
    ! Parameters
    rcoll%IquickAccess(1) = 1  ! x-position
    rcoll%IquickAccess(2) = 1  ! y-position
    rcoll%IquickAccess(3) = 1  ! tensor destination
    rcoll%IquickAccess(4) = 1  ! nonconstant direction

    rcoll%DquickAccess(1) = 1.0_DP  ! multiplier

    ! Provide the vector field
        rvelocity%RvectorBlock(1),rvelocity%RvectorBlock(2))

    call bma_buildMatrix (&
        revalVectors = rmyVectors)

    call fev2_releaseVectorList (rmyVectors)
    ...

Assembly of a constant RHS vector

The following code assembles a RHS vector for the constant RHS function $f=1$. The RHS is assumed to have only one component:

    type(t_vectorBlock) :: rrhs
    type(t_collection) :: rcoll
    ...
    ! Parameters
    rcoll%IquickAccess(1) = 1       ! Number of components
    rcoll%DquickAccess(1) = 1.0_DP  ! The value of the RHS function

    call bma_buildVector (&
        rrhs,BMA_CALC_STANDARD,bma_fcalc_rhsConst, rcoll)
    ...

Assembly of a $||u_h||_{L_2}$

The following code assembles the $L_2$ norm of a finite element function $u_h=(u_1)$ given by rx.

    type(t_vectorBlock) :: rx
    type(t_fev2Vectors) :: rmyVectors
    real(DP) :: dintegral
    ...
    ! Provide the vector (function values only)
    call fev2_addVectorToEvalList (rmyVectors,rx%RvectorBlock(1),0)

    call bma_buildIntegral (&
        dintegral,BMA_CALC_STANDARD,bma_fcalc_L2norm,revalVectors=rmyVectors)

    call fev2_releaseVectorList (rmyVectors)
    ...


Remark: If a finite element function is specified via revalVectors, a triangulation, boundary definition and cubature rule do not have to be specified: triangulation and boundary description are taken from the first vector in the vector list, although it may be advisable to specify a cubature rule explicitly. This simplifies the call to bma_buildIntegral.

Assembly of a $||u_h||_{H^1}$

The following code assembles the $H^1$ norm of a finite element function $u_h=(u_1,u_2)$ given by rx. The callback routine needs the 1st derivatives of the function $u_h$, and thus, nmaxDerivative=1 is specified in fev2_addVectorToEvalList:

    type(t_vectorBlock) :: rx
    type(t_fev2Vectors) :: rmyVectors
    real(DP) :: dintegral
    ...
    ! Provide the vector and its derivative
    call fev2_addVectorToEvalList (rmyVectors,rx%RvectorBlock(1),1)
    call fev2_addVectorToEvalList (rmyVectors,rx%RvectorBlock(2),1)

    call bma_buildIntegral (&
        dintegral,BMA_CALC_STANDARD,bma_fcalc_H1norm,revalVectors=rmyVectors)

    call fev2_releaseVectorList (rmyVectors)
    ...

Assembly of $||u_h||_{\infty}$

The following call uses an extended syntax of bma_buildIntegral to calculate the $\sup$ norm of a finite element function. Normally, bma_buildIntegral calculates integrals by summing up the contributions of all subdomains. By modifying the optional parameter coperation, bma_buildIntegral can be told to compute the maximum of the values of the subdomains instead. In combination with bma_fcalc_MAXnorm, this makes it possible to compute the maximum norm:

    type(t_vectorBlock) :: rx
    type(t_fev2Vectors) :: rmyVectors
    real(DP) :: dmaxnorm
    ...
    ! Provide the vector

    call bma_buildIntegral (&
        dmaxnorm,BMA_CALC_STANDARD,bma_fcalc_MAXnorm,&
        revalVectors=rmyVectors,coperation=BMA_INT_MAX)

    call fev2_releaseVectorList (rmyVectors)
    ...
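
The difference between the two ways of combining subdomain contributions can be sketched with a small standalone program. It is not FEAT2 code: the constants `OP_SUM` and `OP_MAX`, the function `combineValues` and the sample values are hypothetical and only mimic the role of the parameter coperation.

```fortran
! Hypothetical standalone sketch: summing vs. taking the maximum of
! per-cell contributions.
program reduction_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  ! Stand-ins for the two reduction modes; these are not FEAT2 constants.
  integer, parameter :: OP_SUM = 0, OP_MAX = 1
  real(DP) :: Dlocal(4)

  ! Sample nonnegative contributions of four cells
  Dlocal = (/ 0.25_DP, 1.5_DP, 0.75_DP, 0.5_DP /)

  print *, 'sum: ', combineValues(Dlocal, OP_SUM)
  print *, 'max: ', combineValues(Dlocal, OP_MAX)

contains

  ! Combines the local values either by summation or by maximum.
  ! Starting from zero is sufficient because norms are nonnegative.
  function combineValues(Dvalues, coperation) result(dresult)
    real(DP), intent(in) :: Dvalues(:)
    integer, intent(in) :: coperation
    real(DP) :: dresult
    integer :: i
    dresult = 0.0_DP
    do i = 1, size(Dvalues)
      if (coperation == OP_MAX) then
        dresult = max(dresult, Dvalues(i))
      else
        dresult = dresult + Dvalues(i)
      end if
    end do
  end function

end program
```

For the sample data, the program reports the values 3.0 for the sum and 1.5 for the maximum.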


### Using bma_docalc_XXXX in callback routines

For more complex matrices and vectors, the user has to provide a user-defined callback routine that does the assembly. The actual calculation routines bma_docalc_XXXX are designed to be called in such user-defined callback routines for the assembly of standard operators. With these routines, many standard operators can be assembled with high efficiency.

The following example demonstrates how to set up the operator $\text{DR}:=-\Delta + I$ at position (1,1) of a block matrix. This operator realises the left-hand side of the diffusion-reaction equation $-\Delta u + u = f$:

    subroutine fcalc_DR(RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)
      ...

      call bma_docalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
          npointsPerElement,nelements,1.0_DP,1,1)

      call bma_docalc_mass(RmatrixData,rassemblyData,rmatrixAssembly,&
          npointsPerElement,nelements,1.0_DP,1,1)

    end subroutine

    ...
    type(t_matrixBlock) :: rmatrix
    ...
    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,fcalc_DR)
    ...


The callback routine just calls the bma_docalc_XXXX routines for all operators to be assembled. For more complicated operators which probably involve a nonlinearity, parameters must be passed to the user-defined callback routine via the collection structure and the vector evaluation structure revalVectors. The callback routine can then use this information and call the predefined calculation routines.
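
In terms of local matrix entries, and assuming the usual weak formulation with trial basis functions $\varphi_j$ and test basis functions $\psi_i$ on a cell $T$ (notation introduced here only for illustration), the two calls in fcalc_DR add up to

$$A^{T}_{ij} = \int_T \nabla\varphi_j\cdot\nabla\psi_i\,dx + \int_T \varphi_j\,\psi_i\,dx,$$

where the first term stems from bma_docalc_laplace and the second from bma_docalc_mass, both called with the multiplier 1.0.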

The following example demonstrates this approach. The callback routine assembles the operator for the convection-diffusion equation $$-\nu\Delta u_h + v_h \nabla u_h = f_h.$$ The convection is given as a finite element function $v_h$ which is available inside of the callback function via revalVectors%p_RvectorData(1). The constant $\nu$ is passed via the DquickAccess array of the collection.

    subroutine fcalc_DC(RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)
      ...
      real(DP) :: dnu

      ! Get the parameter
      dnu = rcollection%DquickAccess(1)

      ! Calculate nu*Laplace
      call bma_docalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
          npointsPerElement,nelements,dnu,1,1)

      ! Calculate the convection
          npointsPerElement,nelements,1.0_DP,1,1,&
          rvectorField=revalVectors%p_RvectorData(1))

    end subroutine

    type(t_matrixBlock) :: rmatrix
    type(t_vectorBlock) :: rvectorVh

    type(t_fev2Vectors) :: rmyVectors
    type(t_collection) :: rcoll
    ...
    ! Parameters
    rcoll%DquickAccess(1) = nu  ! multiplier

    ! Provide the nonlinearity

    call bma_buildMatrix (&
        rmatrix,BMA_CALC_STANDARD,fcalc_DC, rcoll,&
        revalVectors = rmyVectors)

    call fev2_releaseVectorList (rmyVectors)
    ...


## Block matrices: Advanced operator assembly

For complex operators, it is necessary to use an appropriately designed callback routine for the matrix assembly. While standard operators can be assembled with the techniques above, the implementation of complex operators involves loops over all elements, cubature points as well as trial and test basis functions - and probably, the dimension of the FE space. However, there is no "general" case as there is for the "scalar" assembly. For optimal efficiency and a maximum of generality, one has to implement an operator four times, once for each of the following cases:

1. Scalar valued trial/test FE space
2. Vector valued trial/test FE space
3. Scalar valued, interleaved trial/test FE space
4. Vector valued, interleaved trial/test FE space

The four implementations are rather similar but differ in some small parts of the loops. One should furthermore note that some operators cannot be implemented in all cases. The user has to decide which implementation to choose for the desired case.

Remark: The default implementations in blockmatassemblystdop.f90 do not cover all four cases for all types of finite element spaces either. Most implementations cover only case 1 while only the most important standard operators (like Laplace and Mass) are designed for all cases. For the implementation of a new operator, the user is advised to copy and adapt the code for case 1 and to generalise it later if necessary.

### Input/output of the callback routine

The callback routine used for the calculation of matrices has the following interface:

    subroutine fcalcLocalMatrices(RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)


The variables have the following meaning:

| Variable | Description |
| --- | --- |
| RmatrixData | An $m\times n$ array of t_bmaMatrixData structures, one for each submatrix in the global matrix |
| rassemblyData | Data necessary for the assembly. Contains determinants and cubature weights for the cubature,... |
| rmatrixAssembly | Structure with all data about the assembly |
| npointsPerElement | Number of points per element |
| nelements | Number of elements |
| revalVectors | Provided FE functions; encapsulates nonlinearities and temporary memory |
| rcollection | Collection structure for user-defined parameters |

The callback routine has to use the values of the finite element trial and test spaces, cubature points/weights and the values of all nonlinearities/coefficients to compute local matrices for a set of nelements elements. The main assembly routine will later on impose these local matrices into the global matrix. Depending on which of the above four cases the callback routine is used for, different variables of the above parameters can be used and have to be written to. The following gives a short overview of the input/output variables in the four cases.

Case 1: Scalar valued trial/test FE space

The following parameters hold:

| Variable | Input/Output | Description |
| --- | --- | --- |
| RmatrixData(:,:)%p_Dentry | Output | The local matrix entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RmatrixData(:,:)%p_DbasTrial | Input | For all matrices: values of the corresponding trial basis functions in all cubature points |
| RmatrixData(:,:)%p_DbasTest | Input | For all matrices: values of the corresponding test basis functions in all cubature points |
| RmatrixData(:,:)%ndofTrial | Input | Number of degrees of freedom in the trial space |
| RmatrixData(:,:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RmatrixData(:,:)%ndimfeTrial | Input | =1, indicates scalar-valued trial FE space |
| RmatrixData(:,:)%ndimfeTest | Input | =1, indicates scalar-valued test FE space |
| RmatrixData(:,:)%bisInterleaved | Input | =.false. here, the matrix is not interleaved |

• p_Dentry (1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.

• p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTrial (1:ndofTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.
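
Written as a formula, and assuming an operator whose integrand $a(\cdot,\cdot)$ is evaluated pointwise (the symbols $\omega$, $x$, $\varphi$, $\psi$ and $a$ are notation introduced only for this sketch), a callback routine for case 1 typically fills

$$\texttt{p\_Dentry}(j,i,e) = \sum_{q=1}^{\texttt{ncubp}} \omega_{q,e}\; a\big(\varphi_j(x_{q,e}),\,\psi_i(x_{q,e})\big).$$

Here $\omega_{q,e}$ corresponds to p_DcubWeight(q,e), the values and derivatives of the trial function $\varphi_j$ are taken from p_DbasTrial(j,:,q,e), and those of the test function $\psi_i$ from p_DbasTest(i,:,q,e).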

Case 2: Vector valued trial/test FE space

The following parameters hold:

| Variable | Input/Output | Description |
| --- | --- | --- |
| RmatrixData(:,:)%p_Dentry | Output | The local matrix entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RmatrixData(:,:)%p_DbasTrial | Input | For all matrices: values of the corresponding trial basis functions in all cubature points |
| RmatrixData(:,:)%p_DbasTest | Input | For all matrices: values of the corresponding test basis functions in all cubature points |
| RmatrixData(:,:)%ndofTrial | Input | Number of degrees of freedom in the trial space |
| RmatrixData(:,:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RmatrixData(:,:)%ndimfeTrial | Input | >1, Dimension of the trial FE space |
| RmatrixData(:,:)%ndimfeTest | Input | >1, Dimension of the test FE space |
| RmatrixData(:,:)%bisInterleaved | Input | =.false. here, the matrix is not interleaved |

• p_Dentry (1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.

• p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTrial (1:ndofTrial*ndimfeTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely

• p_DbasTrial (1:ndofTrial + 0*ndofTrial , :,:,:) = 1st dimension, trial space
• p_DbasTrial (1:ndofTrial + 1*ndofTrial , :,:,:) = 2nd dimension, trial space
• p_DbasTrial (1:ndofTrial + 2*ndofTrial , :,:,:) = 3rd dimension, trial space, ...
• p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
• p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
• p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...
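
The blockwise layout of the first array index can be sketched as follows. The program is a standalone illustration and not FEAT2 code; the sizes `ndof` and `ndimfe` are sample values, and the index formula is an interpretation of the list above, in which each component of the vector-valued basis occupies a contiguous block of ndof entries.

```fortran
! Hypothetical standalone sketch: position of component idimfe of the
! local basis function idofe in the first index of p_DbasTrial/p_DbasTest.
program layout_sketch
  implicit none
  integer, parameter :: ndof = 4, ndimfe = 2   ! sample sizes
  integer :: idofe, idimfe

  do idimfe = 1, ndimfe
    do idofe = 1, ndof
      ! Component idimfe is shifted by (idimfe-1)*ndof entries
      print '(A,I0,A,I0,A,I0)', 'component ', idimfe, ', DOF ', idofe, &
          ' -> first index ', idofe + (idimfe-1)*ndof
    end do
  end do
end program
```

With these sample sizes, the first component maps to the indices 1 to 4 and the second component to the indices 5 to 8.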
Case 3: Scalar valued, interleaved trial/test FE space

The following parameters hold:

| Variable | Input/Output | Description |
| --- | --- | --- |
| RmatrixData(:,:)%p_DentryIntl | Output | The local matrix entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RmatrixData(:,:)%p_DbasTrial | Input | For all matrices: values of the corresponding trial basis functions in all cubature points |
| RmatrixData(:,:)%p_DbasTest | Input | For all matrices: values of the corresponding test basis functions in all cubature points |
| RmatrixData(:,:)%ndofTrial | Input | Number of degrees of freedom in the trial space |
| RmatrixData(:,:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RmatrixData(:,:)%nvar | Input | Number of variables per matrix entry |
| RmatrixData(:,:)%ndimfeTrial | Input | =1, indicates scalar-valued trial FE space |
| RmatrixData(:,:)%ndimfeTest | Input | =1, indicates scalar-valued test FE space |
| RmatrixData(:,:)%bisInterleaved | Input | =.true. here, the matrix is interleaved |

• p_DentryIntl (1:nvar, 1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.

• p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTrial (1:ndofTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative.

Case 4: Vector valued, interleaved trial/test FE space

The following parameters hold:

| Variable | Input/Output | Description |
| --- | --- | --- |
| RmatrixData(:,:)%p_DentryIntl | Output | The local matrix entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RmatrixData(:,:)%p_DbasTrial | Input | For all matrices: values of the corresponding trial basis functions in all cubature points |
| RmatrixData(:,:)%p_DbasTest | Input | For all matrices: values of the corresponding test basis functions in all cubature points |
| RmatrixData(:,:)%ndofTrial | Input | Number of degrees of freedom in the trial space |
| RmatrixData(:,:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RmatrixData(:,:)%ndimfeTrial | Input | >1, Dimension of the trial FE space |
| RmatrixData(:,:)%ndimfeTest | Input | >1, Dimension of the test FE space |
| RmatrixData(:,:)%nvar | Input | Number of variables per matrix entry |
| RmatrixData(:,:)%bisInterleaved | Input | =.true. here, the matrix is interleaved |

• p_DentryIntl (1:nvar, 1:ndofTrial, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local matrices.

• p_DcubWeight (1:ncubp, 1:nelements)
specify the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTrial (1:ndofTrial*ndimfeTrial, :, 1:ncubp, 1:nelements) and
p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specify the values of the local basis functions. The 2nd dimension specifies a derivative quantifier DER_xxxx for the desired derivative. More precisely

• p_DbasTrial (1:ndofTrial + 0*ndofTrial , :,:,:) = 1st dimension, trial space
• p_DbasTrial (1:ndofTrial + 1*ndofTrial , :,:,:) = 2nd dimension, trial space
• p_DbasTrial (1:ndofTrial + 2*ndofTrial , :,:,:) = 3rd dimension, trial space, ...
• p_DbasTest (1:ndofTest + 0*ndofTest , :,:,:) = 1st dimension, test space
• p_DbasTest (1:ndofTest + 1*ndofTest , :,:,:) = 2nd dimension, test space
• p_DbasTest (1:ndofTest + 2*ndofTest , :,:,:) = 3rd dimension, test space, ...
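The block-wise layout of the vector-valued basis functions can be illustrated with a small stand-alone sketch. It assumes that the offsets listed above mean "component idim occupies rows (idim-1)*ndofTest+1 to idim*ndofTest"; the sizes ndofTest = 3 and ndimfeTest = 2 are hypothetical and chosen only for the printout.

```fortran
program vector_basis_layout
  implicit none
  ! Hypothetical sizes: 3 local DOFs, FE space with 2 components
  integer, parameter :: ndofTest = 3, ndimfeTest = 2
  integer :: idim, idofe, irow

  do idim = 1, ndimfeTest
    do idofe = 1, ndofTest
      ! Assumed row of p_DbasTest holding component idim of basis function idofe
      irow = (idim-1)*ndofTest + idofe
      print '(A,I0,A,I0,A,I0)', 'component ', idim, ', DOF ', idofe, ' -> row ', irow
    end do
  end do
end program vector_basis_layout
```

With these sizes, component 1 maps to rows 1 to 3 and component 2 to rows 4 to 6. The same index arithmetic would apply to p_DbasTrial with ndofTrial.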

### Basic structure of the callback routine

The structure of the callback routine is now demonstrated by an example.

Example: The following example demonstrates how to set up a Laplace operator in 2D at position (1,1) of a block matrix, here for scalar-valued FEM spaces like $Q_1$, $Q_2$, etc.:

    subroutine fcalc_laplace(RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)

    ...

      ! Local variables
      real(DP) :: dbasIx, dbasJx, dbasIy, dbasJy
      integer :: iel, icubp, idofe, jdofe, ndofTrial, ndofTest
      real(DP), dimension(:,:,:), pointer :: p_DlocalMat
      real(DP), dimension(:,:,:,:), pointer :: p_DbasTrial,p_DbasTest
      real(DP), dimension(:,:), pointer :: p_DcubWeight

      ! Get cubature weights data
      p_DcubWeight => rassemblyData%p_DcubWeight

      ! Get local data of the submatrix at position (1,1)
      ndofTrial = RmatrixData(1,1)%ndofTrial
      ndofTest  = RmatrixData(1,1)%ndofTest
      p_DbasTrial => RmatrixData(1,1)%p_DbasTrial
      p_DbasTest  => RmatrixData(1,1)%p_DbasTest
      p_DlocalMat => RmatrixData(1,1)%p_Dentry

      ! Loop over the elements in the current set.
      do iel = 1,nelements

        ! Loop over all cubature points on the current element
        do icubp = 1,npointsPerElement

          ! Loop over the test functions
          do idofe=1,ndofTest

            ! Get the derivatives of the test basis function
            dbasIx = p_DbasTest(idofe,DER_DERIV2D_X,icubp,iel)
            dbasIy = p_DbasTest(idofe,DER_DERIV2D_Y,icubp,iel)

            ! Loop over the trial basis functions
            do jdofe=1,ndofTrial

              ! Get the derivatives of the trial basis function
              dbasJx = p_DbasTrial(jdofe,DER_DERIV2D_X,icubp,iel)
              dbasJy = p_DbasTrial(jdofe,DER_DERIV2D_Y,icubp,iel)

              ! Add the contribution of this cubature point
              ! to the local matrix entry
              p_DlocalMat(jdofe,idofe,iel) = p_DlocalMat(jdofe,idofe,iel) + &
                  p_DcubWeight(icubp,iel) * ( dbasJx*dbasIx + dbasJy*dbasIy )

            end do ! jdofe

          end do ! idofe

        end do ! icubp

      end do ! iel

    end subroutine


The above example shows that setting up a Laplace matrix needs a callback routine with four nested DO loops:

• An outer DO loop over elements
• An inner DO loop over cubature points
• For every cubature point, an inner loop over the DOFs in the test space
• For every DOF in the test space, an inner loop over the DOFs in the trial space

In the innermost loop, the actual matrix entries are computed using cubature. The variables dbasIx, dbasIy, dbasJx and dbasJy hold the derivatives of the basis functions in the current cubature point. The sum in the innermost loop

    ...
    p_DlocalMat(...) = p_DlocalMat(...) + &
        p_DcubWeight(...) * ( dbasJx*dbasIx + dbasJy*dbasIy )
    ...

realises the summation $$\sum_{x_k}\sum_{i,j} \omega(x_k) \left( \partial_{x_1}\psi_j(x_k)\, \partial_{x_1}\varphi_i(x_k) + \partial_{x_2}\psi_j(x_k)\, \partial_{x_2}\varphi_i(x_k) \right)$$ with $x_k$ the cubature points, $\psi_j$ the local test functions and $\varphi_i$ the local trial functions on the cell $T$. This is the discrete counterpart to the integral $$(\nabla \psi_j, \nabla \varphi_i)_T = \int_T \partial_{x_1}\psi_j\, \partial_{x_1}\varphi_i + \partial_{x_2}\psi_j\, \partial_{x_2}\varphi_i$$ which realises the local Laplace operator on a cell $T$ in 2D. Summing up the contributions of all elements gives the global Laplace operator.

Remark: For the actual assembly, the order of the summation is changed. One pass through the above loops does not calculate $(\nabla \psi_j, \nabla \varphi_i)_T$ completely but only its contribution in one cubature point for one combination of trial/test functions; the contributions are accumulated over the cubature points. In this order, the assembly is faster.
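To see the loop structure at work outside of FEAT2, the following stand-alone sketch applies the same four nested loops to mock data. Everything in it is an assumption made for illustration: a single $Q_1$ element on the unit square (so the Jacobian determinant is 1), a 2x2 Gauss rule, plain arrays instead of the pointers from RmatrixData, and the local constants IDX_X and IDX_Y standing in for the derivative quantifiers DER_DERIV2D_X and DER_DERIV2D_Y.

```fortran
program local_laplace_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  ! Local stand-ins for the derivative quantifiers (values are assumptions)
  integer, parameter :: IDX_X = 1, IDX_Y = 2
  integer, parameter :: ndof = 4, ncubp = 4, nelements = 1
  real(DP) :: Dbas(ndof,2,ncubp,nelements)   ! same basis for trial and test space
  real(DP) :: DcubWeight(ncubp,nelements)
  real(DP) :: DlocalMat(ndof,ndof,nelements)
  real(DP) :: Dg(2), dx, dy
  integer :: iel, icubp, idofe, jdofe, i, j

  ! 2x2 Gauss rule on the unit square (Jacobian determinant = 1)
  Dg(1) = 0.5_DP - 0.5_DP/sqrt(3.0_DP)
  Dg(2) = 0.5_DP + 0.5_DP/sqrt(3.0_DP)
  icubp = 0
  do j = 1,2
    do i = 1,2
      icubp = icubp + 1
      dx = Dg(i)
      dy = Dg(j)
      DcubWeight(icubp,1) = 0.25_DP
      ! Derivatives of the Q1 functions (1-x)(1-y), x(1-y), xy, (1-x)y
      Dbas(:,IDX_X,icubp,1) = (/ -(1.0_DP-dy), 1.0_DP-dy, dy, -dy /)
      Dbas(:,IDX_Y,icubp,1) = (/ -(1.0_DP-dx), -dx, dx, 1.0_DP-dx /)
    end do
  end do

  ! Same loop nest as in the callback routine above
  DlocalMat = 0.0_DP
  do iel = 1,nelements
    do icubp = 1,ncubp
      do idofe = 1,ndof
        do jdofe = 1,ndof
          DlocalMat(jdofe,idofe,iel) = DlocalMat(jdofe,idofe,iel) + &
              DcubWeight(icubp,iel) * &
              ( Dbas(jdofe,IDX_X,icubp,iel)*Dbas(idofe,IDX_X,icubp,iel) + &
                Dbas(jdofe,IDX_Y,icubp,iel)*Dbas(idofe,IDX_Y,icubp,iel) )
        end do
      end do
    end do
  end do

  do idofe = 1,ndof
    print '(4F10.6)', DlocalMat(:,idofe,1)
  end do

  ! Sanity checks: diagonal entry 2/3, columns sum to zero
  if (abs(DlocalMat(1,1,1) - 2.0_DP/3.0_DP) > 1.0e-12_DP) error stop 'diagonal'
  if (maxval(abs(sum(DlocalMat(:,:,1),dim=1))) > 1.0e-12_DP) error stop 'column sums'
end program local_laplace_sketch
```

For this element the exact local matrix has 2/3 on the diagonal, -1/6 for edge neighbours and -1/3 for the opposite vertex; the 2x2 Gauss rule integrates these entries exactly, which is what the two checks at the end rely on.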

## Block vectors: Advanced vector assembly

The assembly of complex right-hand side vectors is very similar to the assembly of complex matrices. One needs an appropriately designed callback routine which loops over all elements, cubature points and test basis functions. For optimal efficiency and a maximum of generality, one again has to implement the vector assembly four times, once for each of the following cases:

1. Scalar valued test FE space
2. Vector valued test FE space
3. Scalar valued, interleaved test FE space
4. Vector valued, interleaved test FE space

### Input/output of the callback routine

The callback routine used for the calculation of vectors has the following interface:

    subroutine fcalcLocalVectors(RvectorData,rassemblyData,rvectorAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)


The variables have the following meaning:

| Variable | Description |
|---|---|
| RvectorData | An array of t_bmaVectorData structures, one for each subvector in the global vector |
| rassemblyData | Data necessary for the assembly. Contains determinants and cubature weights for the cubature, ... |
| rvectorAssembly | Structure with all data about the assembly |
| npointsPerElement | Number of points per element |
| nelements | Number of elements |
| revalVectors | Provided FE functions; encapsulates nonlinearities and temporary memory |
| rcollection | Collection structure for user-defined parameters |

The callback routine has to use the values of the finite element test spaces, cubature points/weights and the values of all nonlinearities/coefficients to compute local vectors for a set of nelements elements. The main assembly routine will later impose these local vectors into the global vector. Depending on which of the above four cases the callback routine is used for, different variables of the above parameters can be read and have to be written to. The following gives a brief overview of the input/output variables in the four cases.

Case 1: Scalar valued test FE space

The following parameters hold:

| Variable | Input/Output | Description |
|---|---|---|
| RvectorData(:)%p_Dentry | Output | The local vector entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RvectorData(:)%p_DbasTest | Input | For all vectors: values of the corresponding test basis functions in all cubature points |
| RvectorData(:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RvectorData(:)%ndimfeTest | Input | =1, indicates scalar-valued test FE space |
| RvectorData(:)%bisInterleaved | Input | =.false. here, the vector is not interleaved |

• p_Dentry (1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.

• p_DcubWeight (1:ncubp, 1:nelements)
specifies the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specifies the values of the local basis functions. The 2nd dimension selects the desired derivative via a derivative quantifier DER_xxxx.

Case 2: Vector valued test FE space

The following parameters hold:

| Variable | Input/Output | Description |
|---|---|---|
| RvectorData(:)%p_Dentry | Output | The local vector entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RvectorData(:)%p_DbasTest | Input | For all vectors: values of the corresponding test basis functions in all cubature points |
| RvectorData(:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RvectorData(:)%ndimfeTest | Input | >1, dimension of the test FE space |
| RvectorData(:)%bisInterleaved | Input | =.false. here, the vector is not interleaved |

• p_Dentry (1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.

• p_DcubWeight (1:ncubp, 1:nelements)
specifies the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specifies the values of the local basis functions. The 2nd dimension selects the desired derivative via a derivative quantifier DER_xxxx. More precisely, the first index is shifted by one block of ndofTest entries per dimension:

• p_DbasTest (0*ndofTest+1 : 1*ndofTest, :,:,:) = 1st dimension, test space
• p_DbasTest (1*ndofTest+1 : 2*ndofTest, :,:,:) = 2nd dimension, test space
• p_DbasTest (2*ndofTest+1 : 3*ndofTest, :,:,:) = 3rd dimension, test space, ...

Case 3: Scalar valued, interleaved test FE space

The following parameters hold:

| Variable | Input/Output | Description |
|---|---|---|
| RvectorData(:)%p_DentryIntl | Output | The local vector entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RvectorData(:)%p_DbasTest | Input | For all vectors: values of the corresponding test basis functions in all cubature points |
| RvectorData(:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RvectorData(:)%nvar | Input | Number of variables per vector entry |
| RvectorData(:)%ndimfeTest | Input | =1, indicates scalar-valued test FE space |
| RvectorData(:)%bisInterleaved | Input | =.true. here, the vector is interleaved |

• p_DentryIntl (1:nvar, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.

• p_DcubWeight (1:ncubp, 1:nelements)
specifies the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTest (1:ndofTest, :, 1:ncubp, 1:nelements)
specifies the values of the local basis functions. The 2nd dimension selects the desired derivative via a derivative quantifier DER_xxxx.
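A minimal stand-alone sketch of the interleaved case, assuming mock data instead of the FEAT2 structures: one $Q_1$ element on the unit square, the midpoint rule (one cubature point with weight 1, where all four basis functions equal 1/4), nvar = 2 with the hypothetical right-hand sides $f_1 = 1$ and $f_2 = x_1 x_2$, and the local constant IDX_FUNC standing in for DER_FUNC. The only structural difference to the non-interleaved case is the additional loop over the variables, which runs over the first index of the entry array.

```fortran
program interleaved_rhs_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  integer, parameter :: IDX_FUNC = 1   ! stand-in for DER_FUNC (assumed value)
  integer, parameter :: nvar = 2, ndofTest = 4, ncubp = 1, nelements = 1
  real(DP) :: DbasTest(ndofTest,1,ncubp,nelements)
  real(DP) :: Dpoints(2,ncubp,nelements)
  real(DP) :: DcubWeight(ncubp,nelements)
  real(DP) :: DentryIntl(nvar,ndofTest,nelements)
  real(DP) :: Dval(nvar)
  integer :: iel, icubp, idofe, ivar

  ! Midpoint rule on the unit square; all Q1 basis functions are 1/4 there
  Dpoints(:,1,1) = 0.5_DP
  DcubWeight = 1.0_DP
  DbasTest = 0.25_DP

  DentryIntl = 0.0_DP
  do iel = 1,nelements
    do icubp = 1,ncubp
      ! Hypothetical right-hand sides f_1 = 1 and f_2 = x_1*x_2
      Dval(1) = 1.0_DP
      Dval(2) = Dpoints(1,icubp,iel) * Dpoints(2,icubp,iel)
      do idofe = 1,ndofTest
        ! Innermost loop over the interleaved variables (first array index)
        do ivar = 1,nvar
          DentryIntl(ivar,idofe,iel) = DentryIntl(ivar,idofe,iel) + &
              DcubWeight(icubp,iel) * Dval(ivar) * DbasTest(idofe,IDX_FUNC,icubp,iel)
        end do
      end do
    end do
  end do

  do idofe = 1,ndofTest
    print '(A,I0,A,2F10.6)', 'DOF ', idofe, ': ', DentryIntl(:,idofe,1)
  end do
end program interleaved_rhs_sketch
```

Placing the loop over ivar innermost matches the memory layout, since the first index varies fastest in Fortran arrays.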

Case 4: Vector valued, interleaved test FE space

The following parameters hold:

| Variable | Input/Output | Description |
|---|---|---|
| RvectorData(:)%p_DentryIntl | Output | The local vector entries, to be computed |
| rassemblyData%p_DcubWeight | Input | Cubature weights in all cubature points |
| RvectorData(:)%p_DbasTest | Input | For all vectors: values of the corresponding test basis functions in all cubature points |
| RvectorData(:)%ndofTest | Input | Number of degrees of freedom in the test space |
| RvectorData(:)%ndimfeTest | Input | >1, dimension of the test FE space |
| RvectorData(:)%nvar | Input | Number of variables per vector entry |
| RvectorData(:)%bisInterleaved | Input | =.true. here, the vector is interleaved |

• p_DentryIntl (1:nvar, 1:ndofTest, 1:nelements)
has to be filled by the callback routine with the entries of the local vectors.

• p_DcubWeight (1:ncubp, 1:nelements)
specifies the cubature weights (including the Jacobian determinant of the mapping, etc.)

• p_DbasTest (1:ndofTest*ndimfeTest, :, 1:ncubp, 1:nelements)
specifies the values of the local basis functions. The 2nd dimension selects the desired derivative via a derivative quantifier DER_xxxx. More precisely, the first index is shifted by one block of ndofTest entries per dimension:

• p_DbasTest (0*ndofTest+1 : 1*ndofTest, :,:,:) = 1st dimension, test space
• p_DbasTest (1*ndofTest+1 : 2*ndofTest, :,:,:) = 2nd dimension, test space
• p_DbasTest (2*ndofTest+1 : 3*ndofTest, :,:,:) = 3rd dimension, test space, ...

### Basic structure of the callback routine

The structure of the callback routine is now demonstrated by an example.

Example: The following example demonstrates how to set up the right-hand side $f=x_1 x_2$ in 2D for the first subvector of a block vector, here for scalar-valued FEM spaces like $Q_1$, $Q_2$, etc.:

    subroutine fcalc_rhsxy(RvectorData,rassemblyData,rvectorAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)

    ...

      ! Local variables
      integer :: iel, icubp, idofe
      real(DP) :: dbasI, dval, dx, dy
      real(DP), dimension(:,:), pointer :: p_DlocalVector
      real(DP), dimension(:,:,:,:), pointer :: p_DbasTest
      real(DP), dimension(:,:), pointer :: p_DcubWeight
      type(t_bmaVectorData), pointer :: p_rvectorData
      real(DP), dimension(:,:,:), pointer :: p_Dpoints

      ! Get cubature weights data
      p_DcubWeight => rassemblyData%p_DcubWeight

      ! Get the coordinates of the cubature points
      p_Dpoints => rassemblyData%revalElementSet%p_DpointsReal

      ! Get the data arrays of the first subvector
      p_rvectorData => RvectorData(1)
      p_DlocalVector => p_rvectorData%p_Dentry
      p_DbasTest => p_rvectorData%p_DbasTest

      ! Loop over the elements in the current set.
      do iel = 1,nelements

        ! Loop over the cubature points
        do icubp = 1,npointsPerElement

          ! Get the coordinates of the cubature point.
          dx = p_Dpoints(1,icubp,iel)
          dy = p_Dpoints(2,icubp,iel)

          ! Value of the function f=x_1 x_2
          dval = dx * dy

          ! Loop over the test space
          do idofe=1,p_rvectorData%ndofTest

            ! Get the value of the basis function
            dbasI = p_DbasTest(idofe,DER_FUNC,icubp,iel)

            ! Multiply the value of the basis function by f and
            ! the cubature weight and sum up into the local vector.
            p_DlocalVector(idofe,iel) = p_DlocalVector(idofe,iel) + &
                p_DcubWeight(icubp,iel) * dval * dbasI

          end do ! idofe

        end do ! icubp

      end do ! iel

    end subroutine


The above example shows that setting up a right-hand side needs a callback routine with three nested DO loops:

• An outer DO loop over elements
• An inner DO loop over cubature points
• For every cubature point, an inner loop over the DOFs in the test space

In the innermost loop, the actual vector entries are computed using cubature. The variable dbasI holds the value of the basis function in the current cubature point. The sum in the innermost loop

    ...
    p_DlocalVector(...) = p_DlocalVector(...) + &
        p_DcubWeight(...) * dval * dbasI
    ...

realises the summation $$\sum_{x_k}\sum_{i} \omega(x_k)\, f(x_k)\, \psi_i(x_k)$$ with $x_k$ the cubature points and $\psi_i$ the test functions on the cell $T$. This is the discrete counterpart to the integral $$(f, \psi_i)_T = \int_T f\, \psi_i$$ which realises the local right-hand side on a cell $T$ in 2D. Summing up the contributions of all elements gives the global right-hand side.

Remark: For the actual assembly, the order of the summation is changed. One pass through the above loops does not calculate $(f, \psi_i)_T$ completely but only its contribution in one cubature point for one test function; the contributions are accumulated over the cubature points. In this order, the assembly is faster.
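The three-loop structure can be tried out with a stand-alone sketch on mock data. All of the following are assumptions for illustration and not part of FEAT2: a single $Q_1$ element on the unit square with Jacobian determinant 1, a 2x2 Gauss rule, plain arrays in place of the pointers, and index 1 of the second array dimension standing in for DER_FUNC.

```fortran
program local_rhs_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  integer, parameter :: IDX_FUNC = 1   ! stand-in for DER_FUNC (assumed value)
  integer, parameter :: ndof = 4, ncubp = 4, nelements = 1
  real(DP) :: DbasTest(ndof,1,ncubp,nelements)
  real(DP) :: Dpoints(2,ncubp,nelements)
  real(DP) :: DcubWeight(ncubp,nelements)
  real(DP) :: DlocalVector(ndof,nelements)
  real(DP) :: Dg(2), dx, dy, dval, dbasI
  integer :: iel, icubp, idofe, i, j

  ! 2x2 Gauss rule on the unit square
  Dg(1) = 0.5_DP - 0.5_DP/sqrt(3.0_DP)
  Dg(2) = 0.5_DP + 0.5_DP/sqrt(3.0_DP)
  icubp = 0
  do j = 1,2
    do i = 1,2
      icubp = icubp + 1
      dx = Dg(i)
      dy = Dg(j)
      Dpoints(:,icubp,1) = (/ dx, dy /)
      DcubWeight(icubp,1) = 0.25_DP
      ! Values of the Q1 functions (1-x)(1-y), x(1-y), xy, (1-x)y
      DbasTest(:,IDX_FUNC,icubp,1) = (/ (1.0_DP-dx)*(1.0_DP-dy), &
          dx*(1.0_DP-dy), dx*dy, (1.0_DP-dx)*dy /)
    end do
  end do

  ! Same loop nest as in the callback routine above
  DlocalVector = 0.0_DP
  do iel = 1,nelements
    do icubp = 1,ncubp
      dx = Dpoints(1,icubp,iel)
      dy = Dpoints(2,icubp,iel)
      dval = dx * dy
      do idofe = 1,ndof
        dbasI = DbasTest(idofe,IDX_FUNC,icubp,iel)
        DlocalVector(idofe,iel) = DlocalVector(idofe,iel) + &
            DcubWeight(icubp,iel) * dval * dbasI
      end do
    end do
  end do

  print '(4F10.6)', DlocalVector(:,1)

  ! Sanity checks: the basis functions sum to 1, so the entries sum to
  ! the integral of x*y over the unit square, which is 1/4
  if (abs(sum(DlocalVector(:,1)) - 0.25_DP) > 1.0e-12_DP) error stop 'sum'
  if (abs(DlocalVector(3,1) - 1.0_DP/9.0_DP) > 1.0e-12_DP) error stop 'entry 3'
end program local_rhs_sketch
```

The exact local vector for this element is (1/36, 1/18, 1/9, 1/18); the 2x2 Gauss rule reproduces it exactly because the integrands are at most quadratic in each coordinate.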

## Accessing FE functions and temporary arrays

As introduced above, it is possible to pass finite element functions and temporary arrays to the callback routine in order to do complex calculations there. To demonstrate the exact passing strategy, we start with an example based on a matrix assembly.

Example: Assume that there are two finite element functions to be passed to a callback routine as well as a set of temporary arrays.

• The first finite element function, $v_h=(v_1)$, is a scalar finite element function, denoted by rvectorVh, for which we need $v_h$ and $\nabla v_h$.

• The second one, $w_h=(w_1,w_2,w_3)$, is a vector field, denoted by rvectorWh, for which we also need $w_h$ and $\nabla w_h$.

• Furthermore, we need temporary arrays in which we store intermediate information in every cubature point for processing $v_h$, $\nabla v_h$, $w_h$ and $\nabla w_h$.

• Finally, we need cell based temporary arrays that store intermediate information on every element for processing $v_h$, $\nabla v_h$, $w_h$ and $\nabla w_h$.

The main program sets up a vector evaluation structure as follows. Here, each dummy vector field uses 4 temporary arrays per dimension, one for the function values and three for the x/y/z derivatives of the component. For the three components of $w_h$, this results in 12 temporary arrays in total.

    ...
    type(t_vectorBlock) :: rvectorVh
    type(t_vectorBlock) :: rvectorWh

    type(t_fev2Vectors) :: rmyVectors
    ...

    ! V_h, nabla V_h
    rmyVectors,rvectorVh%RvectorBlock(1),1)

    ! W_h, nabla W_h
    rvectorWh%RvectorBlock(1),&
    rvectorWh%RvectorBlock(2),&
    rvectorWh%RvectorBlock(3))

    ! Temporary memory, corresponding to V_h, nabla V_h.

    ! Temporary memory, corresponding to W_h, nabla W_h.

    ! Cell based temp memory, corresponding to V_h, nabla V_h.

    ! Cell based temp memory, corresponding to W_h, nabla W_h.

    ! Assembly
    call bma_buildMatrix (rmatrix,BMA_CALC_STANDARD,&
        fcalcMatrix,revalVectors=rmyVectors)

    ! Cleanup
    call fev2_releaseVectorList (rmyVectors)
    ...


The callback routine fcalcMatrix which is called in bma_buildMatrix receives a parameter revalVectors=rmyVectors which contains the vectors and temporary arrays in exactly the order they were added to rmyVectors. The data is accessible via revalVectors%p_RvectorData, which is an array of all data added to rmyVectors. revalVectors%p_RvectorData contains data arrays for all cubature points and all elements that are currently being processed by the assembly.

The content of the structure revalVectors%p_RvectorData(i) depends on the type of data it represents. The callback routine has to "know" which data was added to revalVectors and has to access it in the appropriate order. The following tables give an overview of which data is accessible in which situation:

Structural data

The following table lists general information that describes the type of data represented by revalVectors%p_RvectorData(i):

| Variable | Description |
|---|---|
| bisInterleaved | Specifies if the data is interleaved |
| bisVectorField | Specifies if the data is a vector field |
| bisCellBased | Specifies if the data is cell based |
| ndimVectorField | Dimension of the vector field |
| nvar | Number of entries per vector if the vector is interleaved |
| ndimfe | Dimension of the underlying FEM space (for vector valued FEM spaces; =1 for standard FEM spaces) |
| nmaxDerivativeIdx | Number of allocated subarrays. This corresponds to the maximum derivative quantifier DER_xxxx that can be used |

Scalar data in every cubature point

If the structure represents a scalar FE function in every cubature point:

| Variable | Description |
|---|---|
| p_Ddata | Array with values in every cubature point on every element |

Vector data in every cubature point

If the structure represents a vector field FE function in every cubature point:

| Variable | Description |
|---|---|
| p_DdataVec | Array with values in every dimension, cubature point and on every element |

Interleaved scalar data in every cubature point

If the structure represents an interleaved scalar FE function in every cubature point:

| Variable | Description |
|---|---|
| p_DdataIntl | Array with values in every dimension, cubature point and on every element |

Scalar data on every element

If the structure represents one data entry per element:

| Variable | Description |
|---|---|
| p_DcellData | Array with one value per element |

Vector data on every element

If the structure represents one vector field data entry per element:

| Variable | Description |
|---|---|
| p_DcellDataVec | Array with one value in every dimension on every element |

Interleaved scalar data on every element

If the structure represents one interleaved data entry per element:

| Variable | Description |
|---|---|
| p_DcellDataIntl | Array with one value per element |

For vector data added by fev2_addVector_XXXX, the data is initialised with the values in the cubature points on all elements. "Dummy" arrays are treated the same way as data arrays but are not initialised; they represent uninitialised temporary memory the callback routine can use arbitrarily.

Example: A callback routine fcalcMatrix corresponding to the above code may look like this:

    subroutine fcalcMatrix (RmatrixData,rassemblyData,rmatrixAssembly,&
        npointsPerElement,nelements,revalVectors,rcollection)
    ...

      real(DP), dimension(:,:,:), pointer   :: p_DdataVh
      real(DP), dimension(:,:,:,:), pointer :: p_DdataWh

      real(DP), dimension(:,:,:), pointer   :: p_DdataTempVh
      real(DP), dimension(:,:,:,:), pointer :: p_DdataTempWh

      real(DP), dimension(:,:), pointer     :: p_DdataCellVh
      real(DP), dimension(:,:,:), pointer   :: p_DdataCellWh

      ! Get the data arrays, in the order they were added to rmyVectors
      p_DdataVh     => revalVectors%p_RvectorData(1)%p_Ddata
      p_DdataWh     => revalVectors%p_RvectorData(2)%p_DdataVec

      p_DdataTempVh => revalVectors%p_RvectorData(3)%p_Ddata
      p_DdataTempWh => revalVectors%p_RvectorData(4)%p_DdataVec

      p_DdataCellVh => revalVectors%p_RvectorData(5)%p_DcellData
      p_DdataCellWh => revalVectors%p_RvectorData(6)%p_DcellDataVec

      ...
      do iel = 1,nelements
        do icubp = 1,npointsPerElement

          ! Get the value of V_h in the cubature point
          dvh = p_DdataVh(icubp,iel,DER_FUNC)

          ! Save for later use
          p_DdataTempVh(icubp,iel,DER_FUNC) = dvh**2

          ! Sum up to norm over the cell
          p_DdataCellVh(iel,DER_FUNC) = p_DdataCellVh(iel,DER_FUNC) + dvh**2

          do idofe=1,ndofTest

            ! Value of the test basis function
            dbasI = p_DbasTest(idofe,DER_FUNC,icubp,iel)

            do jdofe=1,ndofTrial

              ! Value of the trial basis function
              dbasJ = p_DbasTrial(jdofe,DER_FUNC,icubp,iel)

              ! Calculate the local matrix entries
              p_DlocalMat(jdofe,idofe,iel) = p_DlocalMat(jdofe,idofe,iel) + &
                  p_DcubWeight(icubp,iel) * ( dvh*dbasJ*dbasI )

            end do ! jdofe
          end do ! idofe
        end do ! icubp
      end do ! iel
      ...

    end subroutine


This code calculates the bilinear form $$m(v_h)(\varphi_h,\psi_h) = ( \psi_h, v_h \varphi_h )$$ which corresponds to the operator $v_h I$ with $I$ the identity operator (resulting in a mass matrix with nonconstant coefficient).

Using nonconstant data: The data arrays p_DdataXXXX in this subroutine are extracted from the revalVectors%p_RvectorData(:) structures in exactly the order they were added to rmyVectors above. The shape of each data array corresponds to the type of data that was added and is designed for fast data access inside the loops. In the above example, p_DdataVh contains the values of $v_1$ in all cubature points and all elements. As soon as iel and icubp identify one cubature point on one element, the value dvh of $v_1$ can be obtained via p_DdataVh and be used in all the loops over the test and trial functions that calculate the local matrices.

Using temporary arrays: The temp arrays can be arbitrarily used in the callback routine. In the above example, the value of $v_h^2$ is saved to p_DdataTempVh which has the same layout as p_DdataVh. The data can then be accessed in a later step of the assembly.

Cell based data is used in a similar way. Such temporary data provides one data entry per cell. The above example uses, e.g., p_DdataCellVh to compute the norm of $v_h$ on a cell $T$, $\|v_h\|_{L_2(T)}$, which may be used later, for example in a stabilisation technique.
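The access pattern for point based and cell based data can be sketched with mock arrays. The array shapes follow the index order used in the example above, (icubp,iel,derivative) for point data and (iel,derivative) for cell data; the constant IDX_FUNC is a local stand-in for DER_FUNC, and the values and weights are invented. In contrast to the example above, this sketch multiplies by the cubature weight and takes the square root at the end, which is one plausible way to turn the accumulated sum into an approximation of $\|v_h\|_{L_2(T)}$.

```fortran
program cell_norm_sketch
  implicit none
  integer, parameter :: DP = kind(1.0d0)
  integer, parameter :: IDX_FUNC = 1   ! stand-in for DER_FUNC (assumed value)
  integer, parameter :: ncubp = 4, nelements = 2
  real(DP) :: DdataVh(ncubp,nelements,1)   ! layout (icubp,iel,derivative)
  real(DP) :: DcubWeight(ncubp,nelements)
  real(DP) :: DdataCellVh(nelements,1)     ! layout (iel,derivative)
  real(DP) :: dvh
  integer :: iel, icubp

  ! Mock data: v_h = 2 on a cell of area 1, v_h = 3 on a cell of area 0.5
  DdataVh(:,1,IDX_FUNC) = 2.0_DP
  DdataVh(:,2,IDX_FUNC) = 3.0_DP
  DcubWeight(:,1) = 0.25_DP
  DcubWeight(:,2) = 0.125_DP

  DdataCellVh = 0.0_DP
  do iel = 1,nelements
    do icubp = 1,ncubp
      ! Value of v_h in the cubature point
      dvh = DdataVh(icubp,iel,IDX_FUNC)
      ! Weighted sum of squares over the cell
      DdataCellVh(iel,IDX_FUNC) = DdataCellVh(iel,IDX_FUNC) + &
          DcubWeight(icubp,iel) * dvh**2
    end do
    ! Square root of the accumulated integral gives the cell-wise norm
    DdataCellVh(iel,IDX_FUNC) = sqrt(DdataCellVh(iel,IDX_FUNC))
    print '(A,I0,A,F10.6)', 'cell ', iel, ': norm = ', DdataCellVh(iel,IDX_FUNC)
  end do

  ! Sanity checks: sqrt(4*1) = 2 and sqrt(9*0.5) = sqrt(4.5)
  if (abs(DdataCellVh(1,IDX_FUNC) - 2.0_DP) > 1.0e-12_DP) error stop 'cell 1'
  if (abs(DdataCellVh(2,IDX_FUNC) - sqrt(4.5_DP)) > 1.0e-12_DP) error stop 'cell 2'
end program cell_norm_sketch
```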