2 Gb MPI Memory Limit Workaround
D-01137
Currently, multiple errors occur when using MPI to pass buffers larger than 2 GB. This affects both MPI communication with other processes and writing Lustre stripes larger than 2 GB. The errors stem from MPI's use of a signed 32-bit int for element counts, which caps a single operation at 2^31 - 1 elements (roughly 2 GB for 1-byte elements).
PreDev Notes:
With MPI-3, we can work around this by combining derived datatypes with the MPI_Count type. MPI_Count is at least 64 bits wide, so it can describe data volumes on the order of exabytes:
- Use MPI_Type_contiguous to create a derived datatype that is effectively a vector of an elementary datatype with more than 2 GB worth of entries. Note that in MPI-3 the count argument of MPI_Type_contiguous is still a plain int, so counts above 2^31 - 1 must be built hierarchically, e.g. chunks of 2^30 ints combined into a second contiguous type:

  MPI_Count n = 3000000000000LL;                    // 3*10^12 ints
  std::vector<int> vec(n);
  MPI_Datatype chunk_t, largeIntVec_t;
  MPI_Type_contiguous(1 << 30, MPI_INT, &chunk_t);  // 2^30-element chunk
  MPI_Type_contiguous((int)(n / (1 << 30)), chunk_t, &largeIntVec_t);
  MPI_Type_commit(&largeIntVec_t);
  // the n % 2^30 leftover elements must be handled separately (e.g. written as MPI_INT)
- This new datatype can then be used to write the data with a count of 1:

  MPI_File_write_at_all(file, offset, vec.data(), 1, largeIntVec_t, &status);
  MPI_Type_free(&largeIntVec_t);
See https://blogs.cisco.com/performance/new-things-in-mpi-3-mpi_count. All supported versions of Intel MPI, Open MPI, and MPICH are MPI-3.0+ compliant, so they should support MPI_Count.
Implementation Notes:
ADDME
System Test Changes:
ADDME
Bug Fixes:
ADDME
C++ API Changes:
ADDME
C API Changes:
ADDME
Success Criteria:
ADDME
CREATED 04/07/2017