Parameters for the FPT method

The Fast Polynomial Transformation implements the approach described in B. K. Alpert und V. Rokhlin, „A Fast Algorithm for the Evaluation of Legendre Expansions“, SIAM Journal on Scientific and Statistical Computing, Vol. 12, Nr. 1, pp. 158–179, Jan. 1991, doi: 10.1137/0912009.

It utilizes the Fast Fourier Transformation by first converting the Legendre modes into Chebyshev modes. The conversion between Legendre and Chebyshev modes is done approximately by approximating increasingly larger blocks away from the diagonal. This method is only available if the executable is linked against the FFTW.

As with the other projection methods, a factor can be specified to use more points in the nodal representation and achieve some de-aliasing. Because the FFT works especially well for powers of two, it is possible to choose the oversampling such, that the oversampled modes have count of the next larger power of 2. To achieve this, the option adapt_factor_pow2 has to be set to true, by default it is assumed to be false. Further it is possible to make use of lobattoPoints with this transformation method. If lobattoPoints = true, the points on the interval boundary will be included in the set of points in the nodal representation. This is for example necessary when positivity is to be preserved for the numerical fluxes. By default this setting is false.

All other options tune the transformation algorithm:

• blocksize defines the minimal block size that is to be approximated in the transformation matrix. It defaults to 64, which is the recommendation for double precision computations, but requires high polynomial degrees to attain any approximation at all. The (oversampled) number of nodes needs to be larger than two times the blocksize, to have at least one approximated block. As long as the number modes is below this threshold the method is not "fast" and a computational complexity of number of modes squared is required for the operation. Smaller values push the approximation closer to the diagonal. This can speed up the computation for smaller number of modes but also detoriates the accuracy of the transformation.
• approx_terms number of terms to use in the approximation of blocks, defaults to 18, which is recommended for double precision computations. In each block only approx_terms will be used to represent rows in the block. Smaller values make the transformation faster, but less accurate (if blocks are actually approximated).
• implementation selects the implementation for the transformation. There are two variants of the implementation: 'scalar' this is the default and treats the transformations with an outer loop over the independent operations. The 'vector' implementation on the other hand gathers multiple independent operations together and performs them all at once with an inner loop over blocks length striplen. While the 'vector' variant may exploit vector instructions, it utilizes a larger amount of temporary memory. The default setting for this option is 'scalar'.
• striplen determines the length for vectorized loops to be used in the matrix operation. It defaults to the vlen setting defined in tem_compileconf_module during compilation. Depending on the computing architecture, different values may provide more efficient computations.
• subblockingWidth defines striding in the multiplication of the diagonal elements in the transformation matrix. The default for this setting is 8.

The configuration table for the FPT table may, for example, look as follows:

  projection = {
kind              = 'fpt',
factor            = 1.5,
lobattoPoints     = false,
blocksize         = 16,
approx_terms      = 12,
implementation    = 'scalar',
striplen          = 256,
subblockingWidth  = 8
}


## Variables

TypeVisibilityAttributesNameInitial
integer, public, parameter:: ply_fpt_default_blocksize =64

The recommended minimal blocksize for double precision.

integer, public, parameter:: ply_fpt_default_subblockingWidth =8

The default width of the subblocking of the diagonal calculation of the fpt projection

integer, public, parameter:: ply_fpt_default_approx_terms =18

Default number of terms to use in FPT blocks. 18 is recommended for double precision.

integer, public, parameter:: ply_fpt_scalar =1

Value to signify the use of the scalar FPT implementation.

integer, public, parameter:: ply_fpt_vector =2

Value to signify the use of the vector FPT implementation.

## Interfaces

• ### private pure subroutine Copy_fpt_header(left, right)

#### Arguments

TypeIntentOptionalAttributesName

fpt to copy to

fpt to copy from

• ### private pure function isEqual(left, right) result(equality)

This function provides the test for equality of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is equal??

• ### private pure function isUnequal(left, right) result(unequality)

This function provides the test for unequality of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is unequal??

• ### private pure function isSmaller(left, right) result(small)

This function provides a < comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is smaller??

• ### private pure function isSmallerOrEqual(left, right) result(small)

This function provides a <= comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is smaller??

• ### private pure function isGreater(left, right) result(great)

This function provides a > comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is greater??

• ### private pure function isGreaterOrEqual(left, right) result(great)

This function provides a >= comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is greater??

## Derived Types

Type for the fpt header, stores all information needed to initialize the fpt method later on

#### Components

TypeVisibilityAttributesNameInitial
real(kind=rk), private :: factor =1.0_rk

In case of nonlinear equations, aliasing occurs if the projections of the nonlinear terms on the testfunctions are not calculated accurately enough. To avoid these errors it is possible to extend the transformation vectors of the FPT with zeros. This factor determines by how many zeros the modal vector is extended before transformation. This factor has to be chosen properly with respect of the type of nonlinearity of your equation.

integer, private :: blocksize =ply_fpt_default_blocksize

The blockisze of the fast bases exchange algorithm from Legendre to Chebyshev polynomials. A negative number indicates to use the default blocksize of the algorithm.

integer, private :: approx_terms =ply_fpt_default_approx_terms

The number of approximation terms to use for blocks apart from the diagonal.

integer, private :: implementation

The implementation variant to use for the transformation computation.

integer, private :: striplen =vlen

The striplen, that should be used for vectorized simultaneous computations of the matrix operation.

integer, private :: subblockingWidth =ply_fpt_default_subblockingWidth

The width of the subblocks used during the unrolled base exchange to ensure a better cache usage.

Should the oversampling factor be adapted to ensure a power of 2 in the oversampled polynomial?

## Functions

### private pure function isEqual(left, right) result(equality)

This function provides the test for equality of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is equal??

### private pure function isUnequal(left, right) result(unequality)

This function provides the test for unequality of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is unequal??

### private pure function isSmaller(left, right) result(small)

This function provides a < comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is smaller??

### private pure function isSmallerOrEqual(left, right) result(small)

This function provides a <= comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is smaller??

### private pure function isGreater(left, right) result(great)

This function provides a > comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is greater??

### private pure function isGreaterOrEqual(left, right) result(great)

This function provides a >= comparison of two projections.

#### Arguments

TypeIntentOptionalAttributesName

projection to compare

projection to compare against

is greater??

## Subroutines

Read the FPT configuration options from the provided Lua script in conf.

#### Arguments

TypeIntentOptionalAttributesName
type(flu_State), intent(inout) :: conf
integer, intent(in) :: thandle

### public subroutine ply_fpt_header_define(me, blocksize, factor, approx_terms, implementation, striplen, subBlockingWidth, adapt_factor_pow2, lobattoPoints)

Define settings for the Fast Polynomial Transformation.

#### Arguments

TypeIntentOptionalAttributesName

FPT header to hold the defined settings.

integer, intent(in), optional :: blocksize

Blocksize to use in approximation algorithm. Defaults to ply_fpt_default_blocksize.

real(kind=rk), intent(in), optional :: factor

Oversampling factor to use.

This can be used to reduce aliasing when projecting functions that will be truncated in the polynomial series expansion. Default is a factor of 1, so no oversampling.

integer, intent(in), optional :: approx_terms

Number of approximation terms to use for the representation of the blocks in the Legendre to Chebyshev transformation algorithm. Defaults to ply_fpt_default_approx_terms.

integer, intent(in), optional :: implementation

Implementation to use in the computation.

Select the implementation variant to use. Either scalar (ply_fpt_scalar) or vectorized (ply_fpt_vector). Default is ply_fpt_scalar.

integer, intent(in), optional :: striplen

Length of strips to use in the transformation implementation. Defaults to vlen.

integer, intent(in), optional :: subBlockingWidth

Width for subblocks in unrolling the approximate Legendre to Chebyshev transformation. Defaults to ply_fpt_default_subblockingWidth.

Adapt the oversampling factor such, that oversampled space has a number of degrees of freedoms in one direction that is a power of 2. Default is .false..

logical, intent(in), optional :: lobattoPoints

Wether to use Chebyshev-Lobatto points (include boundary points) or not. Defaults to .false..

Write FPT settings into a Lua table.

#### Arguments

TypeIntentOptionalAttributesName
type(aot_out_type), intent(inout) :: conf

Print the FPT settings to the log output.

#### Arguments

TypeIntentOptionalAttributesName