Mathematical (BLAS/LAPACK) libraries for ProDiMo

TODO: Find a proper place for this. Maybe also accelerate ProDiMo

Linking with ifx/ifort MKL libraries

01/06/2018 Peter Woitke

You can speed up ProDiMo by linking it with highly optimised linear algebra libraries. If you are using ifx/ifort, I would recommend using -qmkl=sequential, set in the makefile as follows:

  LAPACK_INC=-qmkl=sequential

  FLAGS = -r8 -i4 -g -traceback -O3 -xHOST -prec-div -fp-model source -qopenmp
  FCRIT = -r8 -i4 -g -traceback -O3 -xHOST -prec-div -fp-model source -qopenmp
  DEFS  = -fpp -DIFORT -DEXTERNAL_LAPACK

The default compilation of ProDiMo uses the self-compiled slatec_routines.F. SLATEC is the older library, which was later superseded by LAPACK.

If you set the lapack_solver flag in Parameter.in, ProDiMo will use lapack2006.F instead. This file contains copies of the LAPACK routines from 2006, but only those actually required in ProDiMo, and it is still self-compiled. This flag is enabled by default if you use the quadp_solver.

If you compile with -DEXTERNAL_LAPACK, ProDiMo will not use either of these but will link to external BLAS and LAPACK libraries, which can result in a considerable acceleration, depending on the computer you are using. On the computers where I have tested it, the MKL library, which comes with the ifx/ifort compiler, seems to provide the fastest linear algebra routines. Here are some benchmarking results from ParallelTest; the numbers are user time [sec] measured for the chemistry and the line transfer parts, using 6 OpenMP threads:

FLAGS = -r8 -i4 -g -traceback -O3 -xHOST -prec-div -fp-model source -qopenmp
FCRIT = -r8 -i4 -g -traceback -O3 -xHOST -prec-div -fp-model source -qopenmp
setenv OMP_NUM_THREADS 6
parallel_chem=T

default compilation with internal linear algebra:   CHEM=97.10   LineRT=22.12
with lapack_solver flag:                            CHEM=97.07   LineRT=20.94
-DEXTERNAL_LAPACK -mkl=sequential:                  CHEM=88.64   LineRT=20.81
-DEXTERNAL_LAPACK -mkl=parallel:                    CHEM=93.74   LineRT=20.98
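
To put these numbers into perspective, the relative saving with respect to the default compilation can be computed directly from the table. The following is only a small sketch of that arithmetic; the helper name speedup is made up for this illustration and is not part of ProDiMo.

```shell
#!/bin/sh
# speedup REF VAL: percentage of user time saved by VAL relative to REF.
# (hypothetical helper, only for illustrating the arithmetic)
speedup() {
    awk -v ref="$1" -v val="$2" 'BEGIN { printf "%.1f\n", (ref - val) / ref * 100 }'
}

# CHEM, default vs. -DEXTERNAL_LAPACK with sequential MKL
speedup 97.10 88.64
# CHEM, default vs. -DEXTERNAL_LAPACK with parallel MKL
speedup 97.10 93.74
# LineRT, default vs. -DEXTERNAL_LAPACK with sequential MKL
speedup 22.12 20.81
```

For the table above this gives about 8.7% and 3.5% for the chemistry, and about 5.9% for the line transfer.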

If you are running parallel chemistry as here (parallel_chem=T), you should link with -qmkl=sequential (spelled -mkl=sequential in older ifort versions, as in the benchmark above). The linear algebra calls in ProDiMo are made from within the parallelised parts, where there are no other OpenMP threads waiting/available. You might, however, get an acceleration when going from (parallel_chem=F, -mkl=sequential) to (parallel_chem=F, -mkl=parallel), but I have not checked that. If you are using the gfortran compiler, you can download and compile OpenBLAS as described below.
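
The benchmark settings above use csh syntax (setenv). As a small sketch for bash or other sh-like shells, the same thread count can be set as follows before starting ProDiMo:

```shell
#!/bin/sh
# sh/bash equivalent of the csh line "setenv OMP_NUM_THREADS 6"
export OMP_NUM_THREADS=6
# print the value that the OpenMP runtime will pick up
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
```

The value 6 is simply the one used in the benchmark; adapt it to the number of cores on your machine.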

BLAS/LAPACK with gfortran

Linux

OpenBLAS is derived from the older, fast GotoBLAS library. Unlike the Intel MKL routines, OpenBLAS is an open-source project. It can be used with gfortran and, if it is installed, it may bring significant speed improvements.

Installation of the library

Detailed installation instructions for OpenBLAS can be found in the Installation Guide. Often, the library can be installed via a package manager (on both Linux and Mac); see the Guide mentioned above.

However, depending on the system, it might be necessary to download, compile and install the library manually (see the Installation Guide link).

Makefile setting

Similar to ifx/ifort, one has to enable the -DEXTERNAL_LAPACK compiler flag. In addition, settings like the following have to be added to the makefile (see makefile.gfortran in the src_develop directory for an example):

#-------------------------------------
# OPENBLAS for -DEXTERNAL_LAPACK ... uses package from ubuntu installation
# has to be adapted to your installation
#-------------------------------------
LAPACK_INC = -I/usr/include/x86_64-linux-gnu
LAPACK = -L/usr/lib/x86_64-linux-gnu/lapack -lopenblas -lpthread -lgfortran
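
The paths above depend on the installation. As a rough sketch for finding out what to put into LAPACK_INC and LAPACK, one can ask pkg-config or the dynamic linker where OpenBLAS lives. The function name find_openblas is made up for this example, and the pkg-config module name openblas is an assumption that only holds if your installation ships an openblas.pc file.

```shell
#!/bin/sh
# find_openblas: print candidate makefile settings for OpenBLAS.
# (hypothetical helper; the output still has to be checked by hand)
find_openblas() {
    # pkg-config only helps if an openblas.pc file is installed (assumption)
    if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists openblas 2>/dev/null; then
        echo "LAPACK_INC = $(pkg-config --cflags openblas)"
        # -lpthread -lgfortran are appended as in the makefile example above
        echo "LAPACK = $(pkg-config --libs openblas) -lpthread -lgfortran"
        return 0
    fi
    # Fallback (Linux only): list OpenBLAS libraries known to the dynamic linker
    if command -v ldconfig >/dev/null 2>&1; then
        libs="$(ldconfig -p 2>/dev/null | grep -i openblas || true)"
        if [ -n "$libs" ]; then
            echo "$libs"
            return 0
        fi
    fi
    echo "OpenBLAS not found via pkg-config or ldconfig"
    return 1
}
```

After sourcing this, calling find_openblas prints either ready-made LAPACK_INC/LAPACK lines, the library paths known to the linker, or a message that nothing was found. In the last case the library is probably not installed, or it sits in a non-standard location that has to be entered by hand.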

Mac OS X

Xcode includes several libraries for accelerating code. To use them, one can simply add -framework accelerate to the flags in the makefile, e.g.

LAPACK_INC = -framework accelerate

Do not forget -DEXTERNAL_LAPACK. This is for gfortran (ifort/ifx are not supported on Mac OS X). Please note that these settings tend to change with the OS X version.

Alternatively, one can also install OpenBLAS as on Linux (see the link above for installation instructions).