Trilinos issue #2699

Closed
Created May 08, 2018 by James Willenbring (@jmwille, Owner)

Address initial ATDM Trilinos failing tests on 'serrano' due to 'found no available cpus' and other system issues

Created by: bartlettroscoe

CC: @fryeguy52, @jmgate

Next Action Status

PR #2703, merged on 5/9/2018, should greatly reduce the test failures caused by this issue. None of these failures were observed in any 'serrano' test from 6/15/2018 to 6/20/2018.

Description

With the ATDM Trilinos tests now submitting from 'serrano' to CDash (see TRIL-204), we are seeing many test failures, for example today (2018-05-08) at:

  • https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&date=2018-05-08&filtercount=1&showfilters=1&field1=buildname&compare1=65&value1=Trilinos-atdm-toss3-intel-

with the number of failing and passing tests:

Build Name                                     Fail  Pass
Trilinos-atdm-toss3-intel-debug-openmp          402  1377
Trilinos-atdm-toss3-intel-debug-openmp-panzer    29   127
Trilinos-atdm-toss3-intel-opt-openmp            402  1379
Trilinos-atdm-toss3-intel-opt-openmp-panzer      29   127
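
To put the counts above in proportion, a small Python sketch can compute the failure rate per build. The fail/pass numbers are copied from the table; the names `results` and `failure_rate` are illustrative only and not part of any Trilinos or CDash tooling.

```python
# Fail/pass counts copied from the CDash table above (2018-05-08).
results = {
    "Trilinos-atdm-toss3-intel-debug-openmp": (402, 1377),
    "Trilinos-atdm-toss3-intel-debug-openmp-panzer": (29, 127),
    "Trilinos-atdm-toss3-intel-opt-openmp": (402, 1379),
    "Trilinos-atdm-toss3-intel-opt-openmp-panzer": (29, 127),
}


def failure_rate(fail: int, passed: int) -> float:
    """Return the percentage of executed tests that failed."""
    return 100.0 * fail / (fail + passed)


for build, (fail, passed) in results.items():
    print(f"{build}: {failure_rate(fail, passed):.1f}% failing")
```

On these numbers, roughly one test in five fails in each build.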

Many of these failing tests show failures like:

--------------------------------------------------------------------------
While computing bindings, we found no available cpus on
the following node:

  Node:  ser58

Please check your allocation.
--------------------------------------------------------------------------

The MPI test executables are being run with:

mpiexec -np <NP> -map-by socket:PE=16 --oversubscribe <exec-name>

and ctest is running with a parallel level of only CTEST_PARALLEL_LEVEL=8.
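
As a rough back-of-the-envelope check on the command line above, the following Python sketch counts how many cores a single launch asks to bind and whether each rank could get its own disjoint set of cores on one node. It is only an illustration of the arithmetic, not a diagnosis of the failures: the node layout (2 sockets with 18 cores each) is an assumption that does not come from this issue, and the helper names are hypothetical.

```python
def cores_requested(num_ranks: int, pe_per_rank: int) -> int:
    """Cores one mpiexec launch asks to bind: ranks times PE per rank."""
    return num_ranks * pe_per_rank


def fits_without_overlap(num_ranks: int, pe_per_rank: int,
                         sockets: int, cores_per_socket: int) -> bool:
    """True if every rank can get its own pe_per_rank cores within a socket."""
    ranks_per_socket = cores_per_socket // pe_per_rank
    return num_ranks <= sockets * ranks_per_socket


# PE=16 comes from the mpiexec line above; the node layout is ASSUMED.
PE = 16
SOCKETS = 2            # assumption, not stated in the issue
CORES_PER_SOCKET = 18  # assumption, not stated in the issue

for num_ranks in (1, 2, 4):
    need = cores_requested(num_ranks, PE)
    ok = fits_without_overlap(num_ranks, PE, SOCKETS, CORES_PER_SOCKET)
    print(f"NP={num_ranks}: requests {need} cores, disjoint binding fits: {ok}")
```

Under these assumed numbers, a 4-rank test requests 64 cores on a 36-core node, so the ranks cannot all be bound to separate cores. Each concurrently running test launches its own mpiexec, so with CTEST_PARALLEL_LEVEL=8 the combined demand on a node can be higher still.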
