Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2018-11-30T11:16:53Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2691
Three ShyLU_DDFROSch_test_frosch_XXX tests failing in new GCC 4.8.4 + OpenMPI 1.10.1 + OpenMP build
2018-11-30T11:16:53Z
James Willenbring
*Created by: bartlettroscoe*
**CC:** @trilinos/shylu, @trilinos/framework , @srajama1
## Description
As shown at:
* https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&parentid=3490484
* https://testing-vm.sandia.gov/cdash/viewTest.php?onlyfailed&buildid=3490522
the tests:
* `ShyLU_DDFROSch_test_frosch_interfacesets_2D_MPI_4`
* `ShyLU_DDFROSch_test_frosch_laplacian_epetra_2d_gdsw_MPI_4`
* `ShyLU_DDFROSch_test_frosch_laplacian_epetra_2d_rgdsw_MPI_4`
are failing in the new GCC 4.8.4 + OpenMPI 1.10.1 + OpenMP build (run on the SNL COE RHEL6 machine crf450, with results submitted to CDash).
This build is getting cleaned up to provide the GCC 4.8.4 auto PR build described in #2317 and #2462.
These tests all fail by throwing the exception shown below:
```
terminate called after throwing an instance of 'Xpetra::Exceptions::RuntimeError'
Xpetra::Exceptions::RuntimeError'
what(): /ascldap/users/rabartl/Trilinos.base/NightlyBuilds/SRC_AND_BUILD/Trilinos/packages/xpetra/src/CrsMatrix/Xpetra_EpetraCrsMatrix.hpp:222:
Throw number = 1
Throw test that evaluated to true: true
Xpetra::EpetraCrsMatrix only available for GO=int or GO=long long with EpetraNode (Serial or OpenMP depending on configuration)
```
This then terminates the test program.
## Steps to reproduce
One should be able to reproduce these failing tests on any SNL COE RHEL6 machine that has the SEMS env. For example, on the CEE machine 'ceerws1113', I reproduced this by updating Trilinos and then doing:
```
$ cd <some-build-dir>/
$ source <trilinos-dir>/cmake/std/GCC-4.8.4-OpenMPI-1.10.1-MpiReleaseDebugSharedPtOpenMP_env.sh
$ module list
Currently Loaded Modulefiles:
1) sems-env
2) atdm-env
3) sems-python/2.7.9
4) atdm-cmake/3.11.1
5) sems-git/2.10.1
6) atdm-ninja_fortran/1.7.2
7) sems-gcc/4.8.4
8) sems-openmpi/1.10.1
9) sems-boost/1.63.0/base
10) sems-zlib/1.2.8/base
11) sems-hdf5/1.8.12/parallel
12) sems-netcdf/4.4.1/exo_parallel
13) sems-parmetis/4.0.3/parallel
14) sems-scotch/6.0.3/nopthread_64bit_parallel
15) sems-superlu/4.3/base
$ which cmake
/projects/sems/install/rhel6-x86_64/atdm/binary-install/cmake-3.11.1-Linux-x86_64/bin/cmake
$ rm -r CMake*
$ time cmake \
-C <trilinos-dir>/cmake/std/GCC-4.8.4-OpenMPI-1.10.1-MpiReleaseDebugSharedPtOpenMP.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_ShyLU_DD=ON \
<trilinos-dir> \
&> configure.out
real 0m22.379s
user 0m13.932s
sys 0m5.872s
$ time make -j16 &> make.out
real 34m48.506s
user 310m18.610s
sys 19m41.674s
$ time ctest -j16 &> ctest.out
real 0m4.584s
user 0m17.113s
sys 0m4.140s
```
This produced the test results:
```
$ grep -A 100 "tests failed out of" ctest.out
40% tests passed, 3 tests failed out of 5
Label Time Summary:
ShyLU_DD = 14.19 sec (5 tests)
Total Test time (real) = 4.56 sec
The following tests FAILED:
1 - ShyLU_DDFROSch_test_frosch_laplacian_epetra_2d_gdsw_MPI_4 (Failed)
2 - ShyLU_DDFROSch_test_frosch_laplacian_epetra_2d_rgdsw_MPI_4 (Failed)
5 - ShyLU_DDFROSch_test_frosch_interfacesets_2D_MPI_4 (Failed)
Errors while running CTest
```
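To pull just the failing test names out of a captured log like the one above, something along these lines may help. This is a minimal sketch, not part of the Trilinos tooling: the function name `list_failed_tests` is hypothetical, and it assumes the summary format shown above, where each failed test appears as `<number> - <name> (Failed)` after the `The following tests FAILED:` line.

```shell
# Hypothetical helper (not part of Trilinos): print the names of the tests
# that a captured ctest log reports as failed, one name per line.
# Assumes the summary layout "<number> - <name> (Failed)".
list_failed_tests() {
  # $1: path to the captured ctest output (e.g. ctest.out)
  awk '/The following tests FAILED:/ { in_list = 1; next }
       in_list && $2 == "-"          { print $3 }' "$1"
}
```

Running `list_failed_tests ctest.out` gives a plain list that can be passed to other tools. If the goal is only to see the output of the failing tests again, the standard `ctest --rerun-failed --output-on-failure` options may be enough on their own.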
The output from each of these failing tests seems to show the same throw and terminate:
```
terminate called after throwing an instance of 'Xpetra::Exceptions::RuntimeError'
what(): /scratch/rabartl/Trilinos.base/Trilinos/packages/xpetra/src/CrsMatrix/Xpetra_EpetraCrsMatrix.hpp:222:
Throw number = 1
Throw test that evaluated to true: true
Xpetra::EpetraCrsMatrix only available for GO=int or GO=long long with EpetraNode (Serial or OpenMP depending on configuration)
```
## Related Issues
* Blocking Issues: #2462

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2455
Address timing out test Anasazi_Epetra_BlockDavidson_auxtest_MPI_4 in ATDM builds of Trilinos
2018-11-30T11:16:52Z
James Willenbring
*Created by: bartlettroscoe*
**CC:** @trilinos/anasazi
## Next Action Status
Tests `Anasazi_Epetra_BlockDavidson_auxtest_MPI_4` and `Anasazi_Epetra_LOBPCG_auxtest_MPI_4` are disabled in several builds in the commits 8f23641 and c66a268 and did not time out in any builds on 3/27/2018 (see [below](https://github.com/trilinos/Trilinos/issues/2455#issuecomment-376619629)).
## Description
This Story is to address the test `Anasazi_Epetra_BlockDavidson_auxtest_MPI_4` that times out in several builds as shown in results yesterday on CDash at:
* https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-25&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=testname&compare1=61&value1=Anasazi_Epetra_BlockDavidson_auxtest_MPI_4&field2=status&compare2=62&value2=passed&field3=status&compare3=62&value3=notrun
This shows the test timing out at 10 minutes on the builds:
* `Trilinos-atdm-hansen-shiller-cuda-debug`
* `Trilinos-atdm-hansen-shiller-cuda-opt`
* `Trilinos-atdm-hansen-shiller-gnu-debug-serial`
* `Trilinos-atdm-hansen-shiller-gnu-opt-serial`
and failing in the builds:
* `Trilinos-atdm-white-ride-cuda-opt`
* `Trilinos-atdm-white-ride-gnu-opt-openmp`
These failures show segfaults and are likely due to the compiler defect reported in #1208. Many Anasazi and Belos tests segfault due to this defect, as shown in #2454.
Therefore, this Story will only consider the timing-out tests, not the failing tests in the 'opt' builds on 'white' and 'ride' (since that is being covered in #2454).
Initial cleanup of new ATDM builds of Trilinos

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/2270
New timed-out Amesos2 tests in Trilinos-atdm-sems-gcc-7-2-0 build on 2/20/2018
2019-01-24T23:43:03Z
James Willenbring
*Created by: bartlettroscoe*
**CC:** @trilinos/amesos2
## Next Action Status
The tests were disabled in all `Trilinos_ENABLE_DEBUG=ON` builds on 2/22/2018 (see [below](https://github.com/trilinos/Trilinos/issues/2270#issuecomment-367815146)). Next: Fix the tests so that they pass?
## Description
The tests:
* Amesos2_KLU2_UnitTests_MPI_2
* Amesos2_Superlu_UnitTests_MPI_2
timed out at 10 minutes in the build `Trilinos-atdm-sems-gcc-7-2-0` this morning as shown at:
* https://testing.sandia.gov/cdash/index.php?project=Trilinos&parentid=3396946
Prior to this morning, these tests were taking:
* Amesos2_KLU2_UnitTests_MPI_2: 1.5s
* Amesos2_Superlu_UnitTests_MPI_2: 1.7s
It looks like these tests are hanging due to an exception being thrown?
## Steps to Reproduce
Using the `do-configure` script:
```
#!/bin/bash
cmake \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/sems/atdm/SEMSATDMSettings.cmake,cmake/std/MpiReleaseDebugSharedPtSettings.cmake,cmake/std/BasicCiTestingSettings.cmake \
-DDART_TESTING_TIMEOUT:STRING=300.0 \
-DTrilinos_ENABLE_TESTS:BOOL=ON \
-DCTEST_BUILD_FLAGS=-j10 \
-DCTEST_PARALLEL_LEVEL=10 \
"$@" \
$TRILINOS_DIR
```
Anyone should be able to reproduce these failures on any SNL COE RHEL6 machine as shown below:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/sems/atdm/load_atdm_7.2_dev_env.sh
$ ./do-configure -DTrilinos_ENABLE_Amesos2=ON
$ make -j16
$ ctest -j16
```
NOTE: Setting a timeout such as `-DDART_TESTING_TIMEOUT:STRING=300.0` is important; otherwise ctest will never finish.
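
To illustrate why a hard limit matters for a hanging process, here is a minimal sketch using the GNU coreutils `timeout` command. The wrapper name `run_with_limit` is hypothetical and not part of Trilinos or CTest; it only shows how a bounded run reports a hang instead of blocking forever. GNU `timeout` exits with status 124 when the limit is reached.

```shell
# Hypothetical wrapper (not part of Trilinos or CTest): run a command under a
# hard wall-clock limit and report when the limit is what stopped it.
run_with_limit() {
  limit_seconds=$1
  shift
  timeout "$limit_seconds" "$@"
  limit_status=$?
  # GNU timeout returns 124 when the command was killed for exceeding the limit.
  if [ "$limit_status" -eq 124 ]; then
    echo "TIMEOUT after ${limit_seconds}s: $*"
  fi
  return "$limit_status"
}
```

For example, `run_with_limit 300 ctest -j16` would bound the whole test run at 300 seconds. For per-test limits, the `DART_TESTING_TIMEOUT` setting shown above or the `ctest --timeout <seconds>` option is the more direct choice.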
Keep promoted "ATDM" builds of Trilinos clean