MPI all-2-all transposition
This module ed_hamiltonian_normal_common defines several variables shared across the Hamiltonian setup in the ed_mode = normal mode. It also contains the procedure vector_transpose_mpi() implementing the MPI Allv-2-Allv parallel transposition of a matrix. This is the key function of the massively parallel execution of matrix-vector products discussed in j.cpc.2021.108261.
Description
Global variables related to the construction of the sector Hamiltonian. The module also contains vector_transpose_mpi(), which implements the MPI Allv-2-Allv parallel matrix transposition.
Quick access
- Routines: vector_transpose_mpi()
Used modules
- sf_sp_linalg: sp_lanc_tridiag()
- ed_input_vars: Contains all global input variables which can be set by the user through the input file. A specific procedure, ed_read_input(), should be called to read the input file using the parse_input_variable() procedure from SciFortran. All variables are automatically set to a default value, then looked up and updated from the input file and, subsequently, from the command line (std.input) using the notation variable_name=variable_value(s) (case independent).
- ed_vars_global: Contains all variables, arrays and derived type instances shared throughout the code. Specifically, it contains the definitions of the effective_bath, the gfmatrix and the sector data structures.
- ed_bath: Contains routines for setting, accessing, manipulating and clearing the bath of the impurity problem.
- ed_aux_funx: Hosts a number of auxiliary procedures required in different parts of the code. Specifically, it implements fermionic creation/annihilation operators, the binary decomposition of the integer representation of Fock states, and the setup of the local impurity Hamiltonian.
- ed_sector: Contains procedures to construct the symmetry sectors corresponding to a given set of quantum numbers \(\vec{Q}\). In particular, it allocates and builds the sector_map connecting the states of a given sector with the corresponding Fock states.
- ed_setup: Contains procedures to set up the Exact Diagonalization calculation, executing all internal consistency checks and the allocation of the global memory.
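The binary decomposition of Fock states and the sector map mentioned for ed_aux_funx and ed_sector can be illustrated with a short Python sketch. It is not the library's implementation: the function names, the convention that bit k holds the occupation of orbital k, and the toy sector labelled only by the total particle number are all assumptions made for illustration.

```python
def bit_decomposition(state, nbits):
    """Occupation numbers of the Fock state `state`.

    Assumed convention: bit k of the integer is the occupation of orbital k.
    """
    return [(state >> k) & 1 for k in range(nbits)]


def build_sector_map(nbits, nparticles):
    """Toy sector map (hypothetical): the Fock-state integers with exactly
    `nparticles` occupied orbitals, in ascending order. Entry i of the map
    is the Fock state corresponding to the i-th state of the sector."""
    return [s for s in range(2 ** nbits) if bin(s).count("1") == nparticles]


if __name__ == "__main__":
    sector = build_sector_map(nbits=3, nparticles=2)
    for i, fock in enumerate(sector):
        print(i, fock, bit_decomposition(fock, 3))
```

Working inside a sector means indexing vectors by the position i in the map rather than by the Fock integer itself, which is why the map has to be built before the sector Hamiltonian.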
Subroutines and functions
- subroutine ed_hamiltonian_normal_common/vector_transpose_mpi(nrow, qcol, a, ncol, qrow, b)
Performs the parallel transposition of the vector a, viewed as a matrix of dimensions [nrow, qcol], using the MPI_Alltoallv procedure, which transfers data such that the j-th block sent from process i is received by process j and placed as block i. This parallel transposition involves the minimum amount of data transfer necessary to execute the matrix-vector product, removing communication congestion and unlocking optimal parallel scaling. See j.cpc.2021.108261 for a detailed description of the algorithm implemented in this procedure.
- Options:
nrow [integer] – Global number of rows
qcol [integer] – Local number of columns on each process
ncol [integer] – Global number of columns
qrow [integer] – Local number of rows on each process
- Parameters:
a (nrow, qcol) [real] – Input vector to be transposed
b (ncol, qrow) [real] – Output vector \(b = a^T\)
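As a minimal sketch of the data movement described above, the following serial Python emulation mimics what each MPI rank would send and receive. It performs no actual MPI communication and is not the module's Fortran code. The helper names (split_counts, offsets, emulate_transpose, gather) and the near-even distribution of rows and columns across ranks are assumptions made for illustration.

```python
def split_counts(n, nprocs):
    """Split n items over nprocs ranks as evenly as possible (assumed layout)."""
    base, extra = divmod(n, nprocs)
    return [base + (1 if r < extra else 0) for r in range(nprocs)]


def offsets(counts):
    """Starting index of each rank's chunk, given the chunk sizes."""
    out, acc = [], 0
    for c in counts:
        out.append(acc)
        acc += c
    return out


def emulate_transpose(A, nprocs):
    """Emulate the all-to-all transposition of A (nrow x ncol, list of rows).

    Rank i starts with a_i of shape (nrow, qcol_i), a slab of columns of A.
    Rank j ends with b_j of shape (ncol, qrow_j), a slab of columns of A^T.
    """
    nrow, ncol = len(A), len(A[0])
    qcols, qrows = split_counts(ncol, nprocs), split_counts(nrow, nprocs)
    coff, roff = offsets(qcols), offsets(qrows)

    # Local input on rank i: a_loc[i][r][c] = A[r][coff[i] + c]
    a_loc = [[row[coff[i]:coff[i] + qcols[i]] for row in A]
             for i in range(nprocs)]

    # "Send": rank i cuts its slab into row blocks; block j goes to rank j.
    # mailbox[j][i] is the block sent by rank i and received by rank j.
    mailbox = [[a_loc[i][roff[j]:roff[j] + qrows[j]] for i in range(nprocs)]
               for j in range(nprocs)]

    # "Receive": rank j transposes the block coming from rank i and stores
    # it as block i of its output, i.e. starting at row coff[i] of b_j.
    b_loc = []
    for j in range(nprocs):
        b = [[None] * qrows[j] for _ in range(ncol)]
        for i in range(nprocs):
            blk = mailbox[j][i]  # shape (qrow_j, qcol_i)
            for r in range(qrows[j]):
                for c in range(qcols[i]):
                    b[coff[i] + c][r] = blk[r][c]
        b_loc.append(b)
    return b_loc


def gather(b_loc, ncol):
    """Concatenate the local slabs of b to rebuild the global transpose."""
    return [sum((b[c] for b in b_loc), []) for c in range(ncol)]


if __name__ == "__main__":
    A = [[10 * r + c for c in range(3)] for r in range(5)]
    B = gather(emulate_transpose(A, nprocs=2), ncol=3)
    print(B == [list(col) for col in zip(*A)])
```

Each matrix element appears in exactly one block of the mailbox, so in this emulation every element is moved once. In a real MPI code the per-rank block sizes and offsets computed here would correspond to the count and displacement arrays passed to MPI_Alltoallv.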