Trilinos issue #2410 (Closed)
Issue created Mar 17, 2018 by James Willenbring (@jmwille, Owner)

Test TeuchosNumerics_LAPACK_test_MPI_1 fails in all 'debug' builds on power8 'ride'

Created by: bartlettroscoe

CC: @trilinos/teuchos

Next Action Status:

PR #2447, merged on 3/23/2018, disabled the test. PR #4064, which re-enables the whole test TeuchosNumerics_LAPACK_test_MPI_1 but disables the single unit test for STEQR(), was merged to 'develop' on 12/18/2018. Next: Watch for the test running and passing (minus the STEQR() unit test) on the 'release-debug' and 'opt' builds on 'white', 'ride', and 'waterman' on 12/19/2018 ...

Description

The test TeuchosNumerics_LAPACK_test_MPI_1 segfaults in the 'debug' builds Trilinos-atdm-white-ride-cuda-debug and Trilinos-atdm-white-ride-gnu-debug-openmp on 'ride' and 'white'. It passes in all of the 'opt' builds on these same machines, as well as in all of the builds on hansen, as shown this morning in:

  • https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-17&filtercombine=and&filtercombine=and&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=TeuchosNumerics_LAPACK_test_MPI_1
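For anyone who wants to rerun this query for another date or test, the URL above can be rebuilt from its parts. This is only a sketch: the helper name `cdash_query_url` is hypothetical, the parameter names and the comparison codes 65 and 61 are copied from the link above, and reading them as "starts with" and "is" is an assumption rather than something confirmed here.

```python
from urllib.parse import urlencode

# Base URL taken from the query link in this issue.
CDASH_QUERY_TESTS = "https://testing-vm.sandia.gov/cdash/queryTests.php"


def cdash_query_url(project, date, filters):
    """Build a queryTests.php URL from (field, compare, value) triples.

    Hypothetical helper: the parameter layout mirrors the link in this
    issue; the compare codes are passed through unchanged.
    """
    params = [
        ("project", project),
        ("date", date),
        ("filtercombine", "and"),
        ("filtercount", str(len(filters))),
        ("showfilters", "1"),
    ]
    # Filters are numbered from 1: field1/compare1/value1, field2/...
    for i, (field, compare, value) in enumerate(filters, start=1):
        params.append((f"field{i}", field))
        params.append((f"compare{i}", str(compare)))
        params.append((f"value{i}", value))
    return CDASH_QUERY_TESTS + "?" + urlencode(params)


url = cdash_query_url(
    "Trilinos",
    "2018-03-17",
    [
        ("buildname", 65, "Trilinos-atdm-"),  # 65: assumed "starts with"
        ("testname", 61, "TeuchosNumerics_LAPACK_test_MPI_1"),  # 61: assumed "is"
    ],
)
print(url)
```

Changing the date argument or appending a third triple (for example one on `status`) should give queries of the same shape as the other links in this issue.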

The failing tests all segfault, producing the output:

Teuchos in Trilinos 12.13 (Dev)

GESV test ... passed!
LAPY2 test ... passed!
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 16320 on node white24 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

What is interesting is that this test only failed in all of the Trilinos builds that were done yesterday in the query:

  • https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-16&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=TeuchosNumerics_LAPACK_test_MPI_1&field3=status&compare3=62&value3=passed

Might this be the same error reported in #1208 (closed), which we basically gave up on?

Steps to Reproduce

Following the instructions at:

  • https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite

one can reproduce this failing test by enabling the Teuchos package for the builds gnu-debug-openmp or cuda-debug and running the failing test.

Related issues

  • Related to: #1208 (closed)?