(RESOLVED but needs test) Tpetra: `transferAndFillComplete` causes MPI error at large communicator sizes.
Created by: cgcgcg
@trilinos/tpetra @mhoemmen @DrBooom
## Current Behavior
Tpetra's `transferAndFillComplete` causes an MPI error at large communicator sizes:
```
Rank 31 [Thu Dec 13 20:42:18 2018] [c4-0c0s10n2] Fatal error in PMPI_Iallreduce: Invalid MPI_Op, error stack:
PMPI_Iallreduce(807).....: MPI_Iallreduce(sendbuf=0x7ffffffeb988, recvbuf=0x7ffffffeb98c, count=0, datatype=MPI_BYTE, op=MPI_MAX, comm=MPI_COMM_WORLD, request=0x7ffffffe72e8)
MPIR_MAXF_check_dtype(72): MPI_Op MPI_MAX operation not defined for this datatype
```
The error can also be reproduced at small communicator sizes by passing

```xml
<Parameter name="MM_TAFC_OptimizationCoreCount" type="int" value="1"/>
```

in the parameter list.
I believe the issue might be here: https://github.com/trilinos/Trilinos/blob/2da4a8a6dca7d679b24215151079c423fd4f664f/packages/tpetra/core/src/Tpetra_CrsMatrix_def.hpp#L93-L94 Shouldn't the views be created with a size argument? It seems that the `MPI_BYTE` datatype that MPI complains about is the default chosen for views of length 0, which would also explain the `count=0` in the error trace.