Test TeuchosCore_testTeuchosTestForTermination_2_MPI_4 randomly failing due to jumbled output regex failures
Created by: bartlettroscoe
CC: @trilinos/teuchos
Next Action Status
PR #3164 was merged to develop on 7/21/2018, which should fix the issue. As of 10/2/2018, there have been no failures in any promoted ATDM Trilinos build since 7/16/2018.
Description
As shown in this query, the test TeuchosCore_testTeuchosTestForTermination_2_MPI_4 fails randomly in the following builds on hansen:
Trilinos-atdm-hansen-shiller-cuda-8.0-opt
Trilinos-atdm-hansen-shiller-cuda-9.0-debug
As shown here, for example, the test fails due to the jumbled output:
p=[hansen03:38807] *** Process received signal ***
[hansen03:38807] Signal: Aborted (6)
[hansen03:38807] Signal code: (-6)
[hansen03:38807] 2: /home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-8.0-opt/SRC_AND_BUILD/Trilinos/packages/teuchos/core/test/MemoryManagement/testTeuchosTestForTermination.cpp:63:
Terminate test that evaluated to true: GlobalMPISession::getRank() == terminate_on_procid
Bingo, we are terminating on procid == terminate_on_procid = 2!
terminate called without an active exception
which breaks the required regex:
TEST_0: Pass criteria = Match REGEX {p=2: .*/testTeuchosTestForTermination.cpp} [FAILED]
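because the system-generated message [hansen03:38807] *** Process received signal *** is printed in the middle of that line, between the "p=" prefix and the rank "2: ", so the contiguous text "p=2: " never appears in the output.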
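The mismatch can be illustrated with a minimal sketch. It uses Python's `re` module as a stand-in for the regex engine that CTest actually applies, so it only approximates the real pass criterion. The `clean_output` string is an assumption about what the line looks like when nothing interleaves with it; the `jumbled_output` lines are taken from the failing output above. The helper name `matches` is hypothetical.

```python
import re

# Pass criterion copied from the failing test report.
PASS_REGEX = re.compile(r"p=2: .*/testTeuchosTestForTermination.cpp")

SRC = (
    "/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-8.0-opt/"
    "SRC_AND_BUILD/Trilinos/packages/teuchos/core/test/MemoryManagement/"
    "testTeuchosTestForTermination.cpp:63:"
)

# Assumed shape of the line when no other output interleaves with it.
clean_output = "p=2: " + SRC + "\n"

# Lines from the failing run: the signal message lands right after "p=".
jumbled_output = (
    "p=[hansen03:38807] *** Process received signal ***\n"
    "[hansen03:38807] Signal: Aborted (6)\n"
    "[hansen03:38807] Signal code: (-6)\n"
    "[hansen03:38807] 2: " + SRC + "\n"
)


def matches(output: str) -> bool:
    """Return True if the pass-criterion regex is found anywhere in output."""
    return PASS_REGEX.search(output) is not None


if __name__ == "__main__":
    print("clean output matches:  ", matches(clean_output))
    print("jumbled output matches:", matches(jumbled_output))
```

In the jumbled case, "p=" and "2: " end up on different lines, so no regex anchored on the contiguous prefix "p=2: " can match, regardless of how the rest of the pattern is written.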
Steps to reproduce
Since this error seems to only be showing up in the CUDA builds on 'hansen', one might be able to reproduce this error on 'hansen' or 'shiller' using the instructions at:
But since this is a random failure that occurs rarely, it may be hard to reproduce.