Trilinos issue #3864

Created Nov 13, 2018 by James Willenbring (@jmwille, Maintainer)

KokkosKernels tests randomly timing out on ATDM mutrino KNL build

Created by: fryeguy52

CC: @trilinos/, @kddevin (Trilinos Data Services Product Lead), @bartlettroscoe

Next Action Status

Description

As shown in this query, the tests:

  • KokkosKernels_sparse_serial_MPI_1
  • KokkosKernels_sparse_openmp_MPI_1
  • KokkosKernels_graph_serial_MPI_1
  • KokkosKernels_graph_openmp_MPI_1
  • KokkosKernels_common_serial_MPI_1
  • KokkosKernels_common_openmp_MPI_1
  • KokkosKernels_blas_serial_MPI_1
  • KokkosKernels_blas_openmp_MPI_1

are randomly timing out in the build:

  • Trilinos-atdm-mutrino-intel-opt-openmp-KNL

The links above are to a 30-day history of each test. Over the last 30 days, the KokkosKernels_*_serial_MPI_1 tests have been nearly 3 times as likely to time out as the KokkosKernels_*_openmp_MPI_1 tests. On average, 3 of the 8 tests fail, and there has been only one day in the last 30 when all of them passed.
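
A comparison like the serial vs. openmp one above can be tallied from a plain list of per-day test results. The following is only a minimal sketch: the helper name timeout_rate_by_backend, the input format (one "<test name> <status>" pair per line), and the status string "Timeout" are assumptions for illustration, not something CDash exports in this form.

```shell
# Hypothetical helper (not part of Trilinos or CDash).
# Reads lines of the form "<test name> <status>" on stdin and prints, for each
# backend (serial or openmp), how many results had the assumed status "Timeout"
# out of all results seen for that backend.
timeout_rate_by_backend() {
  awk '
    # Only consider tests named like KokkosKernels_<group>_<backend>_MPI_1.
    $1 ~ /_(serial|openmp)_MPI_1$/ {
      backend = ($1 ~ /_serial_MPI_1$/) ? "serial" : "openmp"
      total[backend]++
      if ($2 == "Timeout") timeouts[backend]++
    }
    END {
      for (b in total) printf "%s %d/%d\n", b, timeouts[b], total[b]
    }
  ' | sort   # awk iteration order is unspecified, so sort for stable output
}
```

With a hypothetical file results.txt in that format, the helper would be used as:

$ timeout_rate_by_backend < results.txt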

Current Status on CDash

The current status of these tests/builds for the current testing day can be found at:

Current test status

Steps to Reproduce

One should be able to reproduce this failure on the machine mutrino as described in:

  • https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md

More specifically, the commands given for the system mutrino are provided at:

  • https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#<mutrino

The exact commands to reproduce this issue should be:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-mutrino-intel-opt-openmp-KNL

$ cmake \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_KokkosKernels=ON \
  $TRILINOS_DIR

$ make -j16

$ salloc -N 1 -p standard -J Trilinos-atdm-mutrino-intel-opt-openmp-KNL ctest -j16
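
Because the timeouts are intermittent, a single ctest run may pass cleanly. One way to get a rough failure frequency is to repeat the run several times and count how many runs fail. This is only a sketch: the helper name count_failing_runs is hypothetical and not part of the ATDM scripts.

```shell
# Hypothetical helper (not part of Trilinos or the ATDM scripts).
# Usage: count_failing_runs <number of runs> <command> [args...]
# Runs the command the given number of times and prints how many runs
# exited with a nonzero status.
count_failing_runs() {
  runs=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # ctest exits nonzero when at least one test fails or times out,
    # so each such run is counted as a failing run.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  echo "$failures"
}
```

From a shell that is already running inside the allocation on mutrino, it could be used along these lines, where the run count of 10 and the 600-second per-test timeout are arbitrary example values:

$ count_failing_runs 10 ctest -R '^KokkosKernels_' --timeout 600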