KokkosKernels_sparse_* tests are failing on ATDM cuda builds
Created by: fryeguy52
CC: @trilinos/kokkos-kernels, @kddevin (Trilinos Data Services Product Lead)
Next Action Status
The test KokkosKernels_sparse_serial_MPI_1 has been passing without timing out in every 'debug' build on 'waterman' since 10/9/2018, as shown here.
Description
As shown in this query, the tests:
- KokkosKernels_sparse_serial_MPI_1
- KokkosKernels_sparse_cuda_MPI_1
are failing in all the cuda builds on white, ride, hansen, and waterman:
- Trilinos-atdm-waterman-gnu-debug-openmp
- Trilinos-atdm-waterman-cuda-9.2-debug
- Trilinos-atdm-waterman-cuda-9.2-opt
- Trilinos-atdm-white-ride-cuda-9.2-opt
- Trilinos-atdm-white-ride-cuda-9.2-debug-pt
- Trilinos-atdm-white-ride-cuda-9.2-debug
- Trilinos-atdm-hansen-shiller-cuda-8.0-opt
- Trilinos-atdm-hansen-shiller-cuda-9.0-debug
- Trilinos-atdm-hansen-shiller-cuda-9.0-opt
Here you can see that these tests started failing on 9/7/2018. At the bottom of that page is a list of the commits that were new on that day.
Steps to Reproduce
One should be able to reproduce this failure as described in:
- https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system white are provided at:
- https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite
The exact commands to reproduce this issue should be:
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_KokkosKernels=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
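To avoid running the whole KokkosKernels test suite while debugging, it may help to restrict ctest to the two failing tests. The following is a minimal sketch, not part of the documented ATDM workflow: the variable `TEST_REGEX` and the function `run_failing_tests` are hypothetical names introduced here, and the bsub options are simply copied from the command above. It relies only on the standard ctest options `-R` (select tests by regular expression) and `--output-on-failure`.

```shell
# Anchored regex that selects only the two tests named in this issue.
TEST_REGEX='^KokkosKernels_sparse_(serial|cuda)_MPI_1$'

# Hypothetical helper: run from <some_build_dir>/ after the build finishes.
# Same bsub queue and options as the full ctest run above; only the
# test selection and the output option differ.
run_failing_tests() {
  bsub -x -Is -q rhel7F -n 16 ctest -R "$TEST_REGEX" --output-on-failure
}
```

Calling `run_failing_tests` from the build directory should then submit a job that runs just KokkosKernels_sparse_serial_MPI_1 and KokkosKernels_sparse_cuda_MPI_1 and prints the output of any test that fails.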