  • James Willenbring
  • Trilinos
  • Issues
  • #2247

Closed
Created Feb 16, 2018 by James Willenbring (@jmwille, Owner)

Test NOX_Thyra_Heq_MPI_1 failing in new Trilinos-atdm-hansen-shiller-intel-opt-serial and XXX-openmp builds

Created by: bartlettroscoe

CC: @trilinos/nox

Description

The test NOX_Thyra_Heq_MPI_1 fails in the new ATDM build Trilinos-atdm-hansen-shiller-intel-opt-serial on hansen as shown yesterday at:

  • https://testing-vm.sandia.gov/cdash/testDetails.php?test=43469004&build=3316785

and for the build Trilinos-atdm-hansen-shiller-intel-opt-openmp yesterday at:

  • https://testing-vm.sandia.gov/cdash/testDetails.php?test=43472316&build=3317072

In both cases, the end of the test shows:

************************************************************************
-- Final Status Test Results --
Converged....OR Combination -> 
  **...........Finite Number Check (Two-Norm F) = Finite
  Converged....AND Combination -> 
    Converged....F-Norm = 6.749e-09 < 1.000e-08
                 (Length-Scaled Two-Norm, Absolute Tolerance)
    Converged....WRMS-Norm = 5.011e-05 < 1
  ??...........Number of Iterations = -1 < 100
************************************************************************
Test failed!
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpiexec detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[42973,1],0]
  Exit code:    1
--------------------------------------------------------------------------
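
For anyone triaging this from a saved copy of the test output, a small filter along the following lines might help pull out just the relevant parts. This is only a sketch: `nox_failure_summary` is a hypothetical helper name (not part of Trilinos or NOX), and it assumes the log has the same layout as the output quoted above, with the status block closed by a row of asterisks.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trilinos): print the NOX final status
# block, the "Test failed!" line, and the exit code from a saved test log.
nox_failure_summary() {
  log="$1"
  # Print from the "Final Status Test Results" header up to, but not
  # including, the closing row of asterisks.
  awk '/-- Final Status Test Results --/ { show = 1 }
       show && /^\*+$/ { show = 0 }
       show' "$log"
  # Print the verdict line and the exit code reported by mpiexec.
  grep -E 'Test failed!|Exit code:' "$log"
}
```

Usage would be something like `nox_failure_summary heq_output.log`, where `heq_output.log` is an assumed file name holding the captured test output.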

Steps to Reproduce

Anyone with access to the SNL test bed machines hansen (SON) or shiller (SRN) should be able to reproduce this failure using the instructions linked from the page:

  • https://snl-wiki.sandia.gov/display/CoodinatedDevOpsATDM/ATDM+Builds+of+Trilinos

The link from there to the README file:

  • https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md

should provide the info. In short, once you clone Trilinos on hansen or shiller into your home directory (with the env var TRILINOS_DIR pointing to the clone), you should be able to reproduce the failure with:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh intel-opt-serial

$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_NOX=ON \
  $TRILINOS_DIR

$ make NP=16  NOX_Thyra_Heq

$ ctest -R NOX_Thyra_Heq_MPI_1

I ran the above on hansen just now for the Trilinos version:

a39e44b "Merge branch 'develop' of github.com:trilinos/Trilinos into develop"
Author: Curtis C. Ober <ccober@sandia.gov>
Date:   Fri Feb 16 11:38:11 2018 -0700 (19 minutes ago)

and it produced the same failure.
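
When rerunning the failing test, it may be convenient to keep a copy of the full output for attaching to this issue. The following is a minimal sketch, not part of the ATDM scripts: `rerun_and_log` and the default log file name are hypothetical, and it assumes it is run from the build directory after sourcing `load-env.sh`. The only ctest options used are `-R` and `--output-on-failure`.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trilinos or the ATDM scripts).
# Reruns the tests matching a regex and keeps a copy of everything ctest prints.
rerun_and_log() {
  test_regex="$1"
  log_file="${2:-ctest_rerun.log}"   # assumed default log name
  # --output-on-failure makes ctest print a test's own output when it fails.
  ctest -R "$test_regex" --output-on-failure 2>&1 | tee "$log_file"
}
```

For example, `rerun_and_log NOX_Thyra_Heq_MPI_1 heq_output.log` would show the test output on the terminal and also save it to `heq_output.log`.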
