KokkosKernels tests randomly timing out on ATDM mutrino KNL build
Created by: fryeguy52
CC: @trilinos/, @kddevin (Trilinos Data Services Product Lead), @bartlettroscoe
Next Action Status
Description
As shown in this query, the tests:
- KokkosKernels_sparse_serial_MPI_1
- KokkosKernels_sparse_openmp_MPI_1
- KokkosKernels_graph_serial_MPI_1
- KokkosKernels_graph_openmp_MPI_1
- KokkosKernels_common_serial_MPI_1
- KokkosKernels_common_openmp_MPI_1
- KokkosKernels_blas_serial_MPI_1
- KokkosKernels_blas_openmp_MPI_1
are randomly timing out in the build:
- Trilinos-atdm-mutrino-intel-opt-openmp-KNL
Links above are to a 30-day history of each test. Over the last 30 days, the KokkosKernels_*_serial_MPI_1
tests have been nearly 3 times as likely to time out as the KokkosKernels_*_openmp_MPI_1
tests. On average, 3 of the 8 tests fail each day, and there has been only one day in the last 30 when all passed.
Current Status on CDash
The current status of these tests/builds for the current testing day can be found at:
Steps to Reproduce
One should be able to reproduce this failure on the machine mutrino as described in:
- https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system mutrino are provided at:
- https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#<mutrino
The exact commands to reproduce this issue should be:
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-mutrino-intel-opt-openmp-KNL
$ cmake \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_KokkosKernels=ON \
  $TRILINOS_DIR
$ make -j16
$ salloc -N 1 -p standard -J Trilinos-atdm-mutrino-intel-opt-openmp-KNL ctest -j16
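Since the timeouts are intermittent, it may be quicker to rerun only the eight affected tests than the full suite. The following is a minimal sketch, not part of the ATDM instructions: `KK_REGEX` is a variable name chosen here for illustration, and running with `-j1` is only a suggestion for checking whether the timeouts persist when no other tests run concurrently. It relies on the standard `ctest` options `-R` (select tests by regex) and `--output-on-failure`.

```shell
# Regex intended to match only the eight tests listed in this issue.
# (KK_REGEX is a name made up for this sketch; Trilinos does not define it.)
KK_REGEX='^KokkosKernels_(sparse|graph|common|blas)_(serial|openmp)_MPI_1$'

# Sanity check: generate the eight test names and count how many the regex matches.
for t in sparse graph common blas; do
  for b in serial openmp; do
    echo "KokkosKernels_${t}_${b}_MPI_1"
  done
done | grep -Ec "$KK_REGEX"   # prints 8

# On mutrino, from the build directory (shown as a comment, not executed here):
# salloc -N 1 -p standard -J Trilinos-atdm-mutrino-intel-opt-openmp-KNL \
#   ctest -j1 -R "$KK_REGEX" --output-on-failure
```

If the tests pass reliably with `-j1` but time out with `-j16`, that would point toward contention between concurrently running tests rather than a problem in the tests themselves. This is one hypothesis to check, not a confirmed diagnosis.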