Tpetra: unreliable test condition in ScopeGuard tests
Created by: kddevin
@trilinos/tpetra @trilinos/kokkos
Current Behavior
The following tests fail on white for a CUDA build:
31 - TpetraCore_Core_initialize_where_user_initializes_mpi_MPI_4 (Failed)
32 - TpetraCore_Core_ScopeGuard_where_user_initializes_mpi_MPI_4 (Failed)
35 - TpetraCore_Core_initialize_where_tpetra_initializes_kokkos_MPI_1 (Failed)
36 - TpetraCore_Core_ScopeGuard_where_tpetra_initializes_kokkos_MPI_1 (Failed)
37 - TpetraCore_Core_initialize_where_user_initializes_kokkos_MPI_1 (Failed)
38 - TpetraCore_Core_ScopeGuard_where_user_initializes_kokkos_MPI_1 (Failed)
39 - TpetraCore_Core_initialize_where_tpetra_initializes_mpi_and_user_initializes_kokkos_MPI_2 (Failed)
40 - TpetraCore_Core_ScopeGuard_where_tpetra_initializes_mpi_and_user_initializes_kokkos_MPI_2 (Failed)
These tests rely on Kokkos not writing to std::cerr during Kokkos::initialize. However, for reasons unrelated to proper/improper use of Kokkos::initialize, Kokkos may write to std::cerr. E.g.,
"Captured output: Kokkos::Cuda::initialize WARNING: running kernels compiled for compute capability 3.5 on device with compute capability 3.7 , this will likely reduce potential performance."
In this case, all the initialization took place correctly (so the test should have passed), but Kokkos issued a warning to std::cerr (so the test failed).
Since Tpetra cannot control what Kokkos writes to std::cerr, this condition is not a reliable way to determine whether these tests pass or fail.
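To illustrate the problem, here is a minimal, self-contained sketch that does not use Kokkos or Tpetra. `mockInitialize`, `mockIsInitialized`, `CerrCapture`, and `runInitCheck` are hypothetical names invented for this example. `mockInitialize` stands in for Kokkos::initialize: it succeeds but prints a benign warning to std::cerr. The sketch compares the current pass condition (captured std::cerr is empty) with a state-based condition, which in real code might query something like Kokkos::is_initialized(). It is not a proposed patch.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the library's initialization state.
static bool g_initialized = false;

// Hypothetical stand-in for Kokkos::initialize: initialization succeeds,
// but a benign performance warning is written to std::cerr.
void mockInitialize() {
  std::cerr << "mock WARNING: running kernels compiled for compute "
               "capability 3.5 on device with compute capability 3.7\n";
  g_initialized = true;
}

bool mockIsInitialized() { return g_initialized; }

// RAII helper that redirects std::cerr into a string buffer for its lifetime
// and restores the original stream buffer on destruction.
class CerrCapture {
public:
  CerrCapture() : old_(std::cerr.rdbuf(buffer_.rdbuf())) {}
  ~CerrCapture() { std::cerr.rdbuf(old_); }
  std::string str() const { return buffer_.str(); }

private:
  std::ostringstream buffer_;  // declared first so it exists before the redirect
  std::streambuf* old_;
};

struct TestResult {
  bool passedByOutput;  // current condition: nothing was written to std::cerr
  bool passedByState;   // alternative condition: initialization state is correct
};

TestResult runInitCheck() {
  g_initialized = false;
  std::string captured;
  {
    CerrCapture capture;
    mockInitialize();
    captured = capture.str();
  }
  return TestResult{captured.empty(), mockIsInitialized()};
}
```

With the mock warning in place, the output-based condition reports failure even though initialization succeeded, while the state-based condition is unaffected by what is printed.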
A side note: the tests go on to say "Captured output is empty!" when they should say "Captured output is NOT empty!" The incorrect message is confusing, but it is secondary to the unreliable test condition.
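As a small sketch of the message fix, the diagnostic could be derived from the same condition that is being checked, so the two cannot disagree. `describeCapturedOutput` is a hypothetical helper name, not a function in the Tpetra tests.

```cpp
#include <string>

// Hypothetical helper: choose the diagnostic text from the actual state of
// the captured output instead of hard-coding one message.
std::string describeCapturedOutput(const std::string& captured) {
  return captured.empty() ? "Captured output is empty!"
                          : "Captured output is NOT empty!";
}
```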
Steps to Reproduce
On white:
module purge
module load openmpi/2.1.2/gcc/7.2.0/cuda/9.2.88
module load cmake/3.9.6
module load openblas/0.2.20/gcc/7.2.0
module load boost/1.65.1/gcc/7.2.0
module load cuda/9.2.88
module load netcdf-exo/4.4.1.1/openmpi/2.1.2/gcc/7.2.0/cuda/9.0.176
export NVCC_WRAPPER_DEFAULT_COMPILER=$(which g++)
echo ${NVCC_WRAPPER_DEFAULT_COMPILER}
TRILINOS_SRC="/home/Trilinos"
export OMPI_CXX=${TRILINOS_SRC}/packages/kokkos/bin/nvcc_wrapper
which mpic++
mpic++ --version
export PATH=${PATH}:${TRILINOS_SRC}/packages/kokkos/bin
cmake \
-DTPL_ENABLE_MPI=ON \
-DMPI_BASE_DIR=${MPI_ROOT} \
-DBLAS_LIBRARY_DIRS=${OPENBLAS_ROOT}/lib \
-DLAPACK_LIBRARY_DIRS=${OPENBLAS_ROOT}/lib \
-DNetcdf_LIBRARY_DIRS=${NETCDF_ROOT}/lib \
-DBoostLib_LIBRARY_DIRS=${BOOST_ROOT}/lib \
-DTPL_ENABLE_Matio=OFF \
-DTrilinos_ENABLE_ALL_PACKAGES=OFF \
-DTrilinos_ENABLE_Tpetra=ON \
-DTpetra_ENABLE_TESTS=ON \
-DTpetra_ENABLE_EXAMPLES=ON \
-DCMAKE_INSTALL_PREFIX=${TRILINOS_SRC}/tmp \
-D Trilinos_ENABLE_CUDA=ON \
-D TPL_ENABLE_CUDA=ON \
-D Tpetra_INST_CUDA:BOOL=ON \
-DCMAKE_CXX_FLAGS="--expt-extended-lambda" \
-DKokkos_ENABLE_Cuda_UVM:BOOL=ON \
$TRILINOS_SRC
make -j 8
ctest -j4
Related Issues
Additional Information
This issue is low priority and will likely not be fixed unless it becomes a blocker for other developers.