Numerous failing tests in 'opt' builds of Trilinos on Power8 'white' and 'ride'
Created by: bartlettroscoe
CC: @trilinos/anasazi, @trilinos/belos, @trilinos/ifpack2, @trilinos/nox, @trilinos/panzer, @trilinos/stratimikos, @nmhamster, @crtrott, @mhoemmen
Next Action Status
Moving to NETLIB BLAS and LAPACK fixes all of these failing tests in the 'opt' builds on 'white' and 'ride' (zero failing tests 6/5-6/6/2018).
Description
This story is to address the many Trilinos tests that are segfaulting in the builds Trilinos-atdm-white-ride-cuda-opt and Trilinos-atdm-white-ride-gnu-opt-openmp, shown at, for example:
- https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&parentid=3389461
- https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&parentid=3391722
(The results shown are from different days on 'ride' because bsub commands are getting terminated early on 'white' and 'ride', as is being tracked in TRIL-198.)
These builds show about 70 failing tests, while the debug builds Trilinos-atdm-white-ride-cuda-debug and Trilinos-atdm-white-ride-gnu-debug-openmp on these same machines show very few failing tests.
The full lists of failing tests for these two builds, as shown at the above links, are:
Failing tests for build Trilinos-atdm-white-ride-cuda-opt:
Name | Status | Time | Proc Time | Details |
---|---|---|---|---|
Anasazi_BlockDavidsonThyra_test_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_BKS_nh_test_1_MPI_4 | Failed | 760ms | 3s 40ms | Completed (Failed) |
Anasazi_Epetra_BKS_solvertest_MPI_4 | Failed | 980ms | 3s 920ms | Completed (Failed) |
Anasazi_Epetra_BKS_test_1_MPI_4 | Failed | 990ms | 3s 960ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_auxtest_MPI_4 | Failed | 700ms | 2s 800ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_solvertest_MPI_4 | Failed | 770ms | 3s 80ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_0_MPI_4 | Failed | 770ms | 3s 80ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_1_MPI_4 | Failed | 820ms | 3s 280ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_2_MPI_4 | Failed | 920ms | 3s 680ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_3_MPI_4 | Failed | 750ms | 3s | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_4_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_5_MPI_4 | Failed | 1s 50ms | 4s 200ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_6_MPI_4 | Failed | 900ms | 3s 600ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_7_MPI_4 | Failed | 890ms | 3s 560ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_8_MPI_4 | Failed | 730ms | 2s 920ms | Completed (Failed) |
Anasazi_Epetra_GeneralizedDavidson_solvertest_MPI_4 | Failed | 880ms | 3s 520ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_0_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_1_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_2_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_3_MPI_4 | Failed | 650ms | 2s 600ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_4_MPI_4 | Failed | 690ms | 2s 760ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_0_MPI_4 | Failed | 1s 50ms | 4s 200ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_1_MPI_4 | Failed | 1s 50ms | 4s 200ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_2_MPI_4 | Failed | 740ms | 2s 960ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_3_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_auxtest_MPI_4 | Failed | 1s 60ms | 4s 240ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_simpletest_MPI_4 | Failed | 790ms | 3s 160ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_solvertest_MPI_4 | Failed | 900ms | 3s 600ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_test_MPI_4 | Failed | 930ms | 3s 720ms | Completed (Failed) |
Anasazi_Epetra_ModalSolversTester_MPI_4 | Failed | 890ms | 3s 560ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 | Failed | 920ms | 3s 680ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerGenTester_1_MPI_4 | Failed | 1s 20ms | 4s 80ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_0_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_1_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_2_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_3_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_4_MPI_4 | Failed | 750ms | 3s | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_5_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_0_MPI_4 | Failed | 880ms | 3s 520ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_1_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_2_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_3_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_4_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_5_MPI_4 | Failed | 860ms | 3s 440ms | Completed (Failed) |
Anasazi_IRTRThyra_test_0_MPI_4 | Failed | 880ms | 3s 520ms | Completed (Failed) |
Anasazi_IRTRThyra_test_1_MPI_4 | Failed | 790ms | 3s 160ms | Completed (Failed) |
Anasazi_IRTRThyra_test_2_MPI_4 | Failed | 780ms | 3s 120ms | Completed (Failed) |
Anasazi_IRTRThyra_test_3_MPI_4 | Failed | 990ms | 3s 960ms | Completed (Failed) |
Anasazi_LOBPCGThyra_test_MPI_4 | Failed | 10m 20ms | 40m 80ms | Completed (Timeout) |
Anasazi_Tpetra_TraceMinDavidson_largest_standard_test_MPI_4 | Failed | 720ms | 2s 880ms | Completed (Failed) |
Belos_bl_fgmres_hb_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Belos_bl_pgmres_hb_0_MPI_4 | Failed | 820ms | 3s 280ms | Completed (Failed) |
Belos_bl_pgmres_hb_1_MPI_4 | Failed | 1s 30ms | 4s 120ms | Completed (Failed) |
Belos_bl_pgmres_hb_2_MPI_4 | Failed | 1s 30ms | 4s 120ms | Completed (Failed) |
Belos_gcrodr_hb_MPI_4 | Failed | 1s 280ms | 5s 120ms | Completed (Failed) |
Belos_prec_gcrodr_hb_1_MPI_4 | Failed | 1s 580ms | 6s 320ms | Completed (Failed) |
Belos_pseudo_gmres_hb_MPI_4 | Failed | 1s 190ms | 4s 760ms | Completed (Failed) |
Belos_pseudo_pgmres_hb_MPI_4 | Failed | 930ms | 3s 720ms | Completed (Failed) |
Ifpack2_GS_belos_MPI_1 | Failed | 1s 190ms | 1s 190ms | Completed (Failed) |
Ifpack2_Jacobi_hb_belos_MPI_1 | Failed | 1s 830ms | 1s 830ms | Completed (Failed) |
Ifpack2_Jacobi_hb_belos_MPI_2 | Failed | 2s 340ms | 4s 680ms | Completed (Failed) |
Ifpack2_SGS_belos_MPI_1 | Failed | 970ms | 970ms | Completed (Failed) |
Intrepid2_unit-test_Orientation_Serial_Test_Orientation_TET_MPI_1 | Failed | 880ms | 880ms | Completed (Failed) |
NOX_LOCA_AnasaziJacobianInverse_MPI_1 | Failed | 10m 30ms | 10m 30ms | Completed (Timeout) |
NOX_LOCA_AnasaziNotConverged_MPI_1 | Failed | 10m 60ms | 10m 60ms | Completed (Timeout) |
NOX_LOCA_MultiPointTcubed_MPI_2 | Failed | 1s 590ms | 3s 180ms | Completed (Failed) |
PanzerAdaptersSTK_main_driver_energy-ss-loca-eigenvalue | Failed | 1s 530ms | 6s 120ms | Completed (Failed) |
Piro_MatrixFreeDecorator_UnitTests_MPI_4 | Failed | 1s 70ms | 4s 280ms | Completed (Failed) |
Stratimikos_Belos_GCRODR_strattest_MPI_4 | Failed | 1s 70ms | 4s 280ms | Completed (Failed) |
Stratimikos_Thyra_Belos_StatusTest_UnitTests_MPI_1 | Failed | 1s 750ms | 1s 750ms | Completed (Failed) |
Failing tests for build Trilinos-atdm-white-ride-gnu-opt-openmp:
Name | Status | Time | Proc Time | Details |
---|---|---|---|---|
Anasazi_BlockDavidsonThyra_test_MPI_4 | Failed | 820ms | 3s 280ms | Completed (Failed) |
Anasazi_Epetra_BKS_nh_test_1_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Anasazi_Epetra_BKS_solvertest_MPI_4 | Failed | 740ms | 2s 960ms | Completed (Failed) |
Anasazi_Epetra_BKS_test_1_MPI_4 | Failed | 2s 210ms | 8s 840ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_auxtest_MPI_4 | Failed | 790ms | 3s 160ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_solvertest_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_0_MPI_4 | Failed | 880ms | 3s 520ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_1_MPI_4 | Failed | 880ms | 3s 520ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_2_MPI_4 | Failed | 840ms | 3s 360ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_3_MPI_4 | Failed | 900ms | 3s 600ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_4_MPI_4 | Failed | 890ms | 3s 560ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_5_MPI_4 | Failed | 990ms | 3s 960ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_6_MPI_4 | Failed | 990ms | 3s 960ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_7_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Anasazi_Epetra_BlockDavidson_test_8_MPI_4 | Failed | 860ms | 3s 440ms | Completed (Failed) |
Anasazi_Epetra_GeneralizedDavidson_solvertest_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_0_MPI_4 | Failed | 700ms | 2s 800ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_1_MPI_4 | Failed | 730ms | 2s 920ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_2_MPI_4 | Failed | 630ms | 2s 520ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_3_MPI_4 | Failed | 670ms | 2s 680ms | Completed (Failed) |
Anasazi_Epetra_IRTR_auxtest_4_MPI_4 | Failed | 640ms | 2s 560ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_0_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_1_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_2_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_Epetra_IRTR_test_3_MPI_4 | Failed | 610ms | 2s 440ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_auxtest_MPI_4 | Failed | 890ms | 3s 560ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_simpletest_MPI_4 | Failed | 1s 30ms | 4s 120ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_solvertest_MPI_4 | Failed | 1s 620ms | 6s 480ms | Completed (Failed) |
Anasazi_Epetra_LOBPCG_test_MPI_4 | Failed | 900ms | 3s 600ms | Completed (Failed) |
Anasazi_Epetra_ModalSolversTester_MPI_4 | Failed | 780ms | 3s 120ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerGenTester_1_MPI_4 | Failed | 640ms | 2s 560ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_0_MPI_4 | Failed | 860ms | 3s 440ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_1_MPI_4 | Failed | 720ms | 2s 880ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_2_MPI_4 | Failed | 650ms | 2s 600ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_3_MPI_4 | Failed | 690ms | 2s 760ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_4_MPI_4 | Failed | 650ms | 2s 600ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerMatTester_5_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_0_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_1_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_2_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_3_MPI_4 | Failed | 690ms | 2s 760ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_4_MPI_4 | Failed | 690ms | 2s 760ms | Completed (Failed) |
Anasazi_Epetra_OrthoManagerTester_5_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_IRTRThyra_test_0_MPI_4 | Failed | 710ms | 2s 840ms | Completed (Failed) |
Anasazi_IRTRThyra_test_1_MPI_4 | Failed | 670ms | 2s 680ms | Completed (Failed) |
Anasazi_IRTRThyra_test_2_MPI_4 | Failed | 660ms | 2s 640ms | Completed (Failed) |
Anasazi_IRTRThyra_test_3_MPI_4 | Failed | 640ms | 2s 560ms | Completed (Failed) |
Anasazi_LOBPCGThyra_test_MPI_4 | Failed | 910ms | 3s 640ms | Completed (Failed) |
Anasazi_Tpetra_TraceMinDavidson_largest_standard_test_MPI_4 | Failed | 6s 800ms | 27s 200ms | Completed (Failed) |
Belos_bl_fgmres_hb_MPI_4 | Failed | 740ms | 2s 960ms | Completed (Failed) |
Belos_bl_pgmres_hb_0_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Belos_bl_pgmres_hb_1_MPI_4 | Failed | 760ms | 3s 40ms | Completed (Failed) |
Belos_bl_pgmres_hb_2_MPI_4 | Failed | 720ms | 2s 880ms | Completed (Failed) |
Belos_gcrodr_hb_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Belos_prec_gcrodr_hb_1_MPI_4 | Failed | 700ms | 2s 800ms | Completed (Failed) |
Belos_pseudo_gmres_hb_MPI_4 | Failed | 810ms | 3s 240ms | Completed (Failed) |
Belos_pseudo_pgmres_hb_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Intrepid2_unit-test_Orientation_Serial_Test_Orientation_TET_MPI_1 | Failed | 1s 180ms | 1s 180ms | Completed (Failed) |
NOX_LOCA_AnasaziJacobianInverse_MPI_1 | Failed | 10m 30ms | 10m 30ms | Completed (Timeout) |
NOX_LOCA_MultiPointTcubed_MPI_2 | Failed | 990ms | 1s 980ms | Completed (Failed) |
Piro_MatrixFreeDecorator_UnitTests_MPI_4 | Failed | 850ms | 3s 400ms | Completed (Failed) |
Stratimikos_Belos_GCRODR_strattest_MPI_4 | Failed | 1s 310ms | 5s 240ms | Completed (Failed) |
Stratimikos_Thyra_Belos_StatusTest_UnitTests_MPI_1 | Failed | 1s 110ms | 1s 110ms | Completed (Failed) |
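To see at a glance which packages account for most of the failures, the tables above can be tallied by package prefix. The following is only a rough sketch, assuming the rows are pasted in as pipe-delimited text and that the package name is whatever precedes the first underscore in the test name. The function name `tally_by_package` and the `sample` variable are hypothetical, and the sample contains just a few rows copied from the tables above, not the full list.

```python
from collections import Counter


def tally_by_package(table_text):
    """Count failing tests per package from pipe-delimited CDash-style rows."""
    counts = Counter()
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Skip the header row, the separator row, and any row whose
        # Status column is not "Failed".
        if len(cells) < 2 or cells[1] != "Failed":
            continue
        # Assumption: test names begin with the package name,
        # e.g. "Belos_gcrodr_hb_MPI_4" belongs to "Belos".
        package = cells[0].split("_", 1)[0]
        counts[package] += 1
    return counts


# A few rows copied from the tables above (not the full list).
sample = """
Name | Status | Time | Proc Time | Details |
---|---|---|---|---|
Anasazi_BlockDavidsonThyra_test_MPI_4 | Failed | 1s 10ms | 4s 40ms | Completed (Failed) |
Anasazi_LOBPCGThyra_test_MPI_4 | Failed | 10m 20ms | 40m 80ms | Completed (Timeout) |
Belos_gcrodr_hb_MPI_4 | Failed | 1s 280ms | 5s 120ms | Completed (Failed) |
Ifpack2_SGS_belos_MPI_1 | Failed | 970ms | 970ms | Completed (Failed) |
NOX_LOCA_MultiPointTcubed_MPI_2 | Failed | 1s 590ms | 3s 180ms | Completed (Failed) |
"""

for package, count in tally_by_package(sample).most_common():
    print(f"{package}: {count}")
```

The prefix heuristic is crude: a test such as PanzerAdaptersSTK_main_driver_energy-ss-loca-eigenvalue would be counted under "PanzerAdaptersSTK" rather than "Panzer", so the result is a triage aid rather than an exact per-package count.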
It is strongly suspected that most of these failing tests are due to the compiler defect on this system that was studied in detail in #1208 (closed). The failing Belos tests shown above look the same as those reported back in #1191 (closed). (These are simply the first automated builds of Trilinos on this system that post results to the Trilinos CDash site.)
Related Issues:
- Related to: #1208 (closed), #1191 (closed)