Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
Updated: 2019-06-08T15:27:25Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4989
MueLu: MueLu_Maxwell3D-Tpetra_MPI_4 failing on atdm complex build
Updated: 2019-06-08T15:27:25Z
Author: James Willenbring

*Created by: fryeguy52*
## Bug Report
CC: @trilinos/muelu, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-release-debug&field2=testname&compare2=61&value2=MueLu_Maxwell3D-Tpetra_MPI_4&field3=site&compare3=61&value3=sems-rhel7&field4=buildstarttime&compare4=84&value4=2019-04-22T00%3A00%3A00&field5=buildstarttime&compare5=83&value5=2019-03-23T00%3A00%3A00) the test:
* MueLu_Maxwell3D-Tpetra_MPI_4
is failing in the build:
* Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-release-debug
## Current Status on CDash
[Test results last 5 days](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-release-debug&field2=testname&compare2=61&value2=MueLu_Maxwell3D-Tpetra_MPI_4&field3=site&compare3=61&value3=sems-rhel7&field4=buildstarttime&compare4=83&value4=5%20days%20ago)
## Steps to Reproduce
One should be able to reproduce this failure on a machine with a sems rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a sems rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#sems-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-release-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j8
```
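Once the build above has finished, it may be quicker to rerun only the failing test than the whole `ctest -j8` suite. The sketch below is not from the issue: `rerun_test` and `DRY_RUN` are hypothetical names, and it relies only on ctest's standard `-R <regex>` and `--output-on-failure` options.

```shell
# Hypothetical helper (not part of the ATDM scripts): rerun a single test by exact name.
rerun_test() {
  # Anchor the regex so that only the exact test name is selected.
  set -- ctest -R "^$1\$" --output-on-failure
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"                 # run ctest; call this from the build directory
  fi
}

# Dry run: show the command that would be executed for the test in this issue.
DRY_RUN=1 rerun_test MueLu_Maxwell3D-Tpetra_MPI_4
```

Without `DRY_RUN` set, the helper runs ctest directly, so it should be called from `<some_build_dir>/` after sourcing `load-env.sh` as in the steps above.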
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4982
MueLu: MueLu_Helmholtz2DParallel_MPI_4 failing on ATDM complex builds
Updated: 2019-06-08T15:27:25Z
Author: James Willenbring

*Created by: fryeguy52*
## Bug Report
CC: @trilinos/muelu, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=buildname&compare2=63&value2=-complex-&field3=testname&compare3=65&value3=MueLu_Helmholtz2DParallel_MPI_4&field4=buildstarttime&compare4=84&value4=2019-04-22T00%3A00%3A00&field5=buildstarttime&compare5=83&value5=2019-03-23T00%3A00%3A00) the test:
* MueLu_Helmholtz2DParallel_MPI_4
has been failing since 2019-01-11 in the builds:
* Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-debug
* Trilinos-atdm-sems-rhel7-gnu-7.2.0-openmp-complex-shared-release-debug
* Trilinos-atdm-sems-rhel7-intel-17.0.1-openmp-complex-shared-release-debug
* Trilinos-atdm-sems-rhel7-clang-3.9.0-openmp-complex-shared-release-debug
New commits on 2019-04-11 can be found [here](https://testing.sandia.gov/cdash/viewNotes.php?buildid=4867040#!#note2).
## Current Status on CDash
[results for the current testing day](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=buildname&compare2=63&value2=complex&field3=testname&compare3=65&value3=MueLu_Helmholtz2DParallel_MPI_4&field4=buildstarttime&compare4=84&value4=today&field5=buildstarttime&compare5=83&value5=yesterday)
## Steps to Reproduce
One should be able to reproduce this failure on a machine with a sems rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a sems rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#sems-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-sems-rhel6-gnu-7.2.0-openmp-complex-shared-release-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j8
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4929
Link problems with libmuelu breaking most ATDM Trilinos builds starting 4/17/2019
Updated: 2020-07-22T01:04:27Z
Author: James Willenbring

*Created by: bartlettroscoe*
CC: @trilinos/muelu , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/index.php?project=Trilinos&date=2019-04-17&filtercount=1&showfilters=1&field1=buildname&compare1=65&value1=Trilinos-atdm-), there are link errors related to the MueLu library. For example, [this build](https://testing.sandia.gov/cdash-dev-view/viewBuildError.php?buildid=4904861) shows link errors like:
```
packages/muelu/src/libmuelu.a(MueLu_CoalesceDropFactory.cpp.o):(.rodata._ZTVN5MueLu7LWGraphIixN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE[_ZTVN5MueLu7LWGraphIixN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE]+0xa8): undefined reference to `MueLu::LWGraph<int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::print(Teuchos::basic_FancyOStream<char, std::char_traits<char> >&, int) const'
packages/muelu/src/libmuelu.a(MueLu_CoalesceDropFactory.cpp.o):(.rodata._ZTVN5MueLu5GraphIixN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE[_ZTVN5MueLu5GraphIixN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE]+0xa8): undefined reference to `MueLu::Graph<int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::print(Teuchos::basic_FancyOStream<char, std::char_traits<char> >&, int) const'
packages/muelu/src/libmuelu.a(MueLu_CoalesceDropFactory.cpp.o):(.rodata._ZTVN5MueLu7LWGraphIiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE[_ZTVN5MueLu7LWGraphIiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE]+0xa8): undefined reference to `MueLu::LWGraph<int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::print(Teuchos::basic_FancyOStream<char, std::char_traits<char> >&, int) const'
packages/muelu/src/libmuelu.a(MueLu_CoalesceDropFactory.cpp.o):(.rodata._ZTVN5MueLu5GraphIiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE[_ZTVN5MueLu5GraphIiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE]+0xa8): undefined reference to `MueLu::Graph<int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::OpenMP, Kokkos::HostSpace> >::print(Teuchos::basic_FancyOStream<char, std::char_traits<char> >&, int) const'
collect2: error: ld returned 1 exit status
```
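Each undefined reference above is reported from a `.rodata._ZTV...` section of `MueLu_CoalesceDropFactory.cpp.o`. In the Itanium C++ ABI the `_ZTV` prefix marks a vtable, so the linker is reporting that the vtables for `MueLu::LWGraph` and `MueLu::Graph` refer to a virtual `print(...)` member for which no definition was linked in. The issue does not state the root cause. The sketch below only shows how such a name can be demangled, assuming binutils' `c++filt` is on the `PATH`; `demangle` is a hypothetical helper name.

```shell
# Hypothetical helper: demangle one Itanium-ABI symbol name with c++filt.
demangle() {
  printf '%s\n' "$1" | c++filt
}

# Section name taken from the first link error above (without the ".rodata." prefix).
demangle '_ZTVN5MueLu7LWGraphIixN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6OpenMPENS1_9HostSpaceEEEEE'
```

The output should begin with `vtable for MueLu::LWGraph<int, long long, ...`, matching the class named in the first error message.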
## Steps to Reproduce
One should be able to reproduce this failure on many of the systems as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4678
Stratimikos and Rythmos tests failing on many ATDM builds
Updated: 2019-03-26T15:04:05Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/stratimikos, @srajama1 (Trilinos Linear Solvers Product Lead), @rppawlo (Trilinos Nonlinear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-20&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=status&compare2=61&value2=Failed&field3=testname&compare3=62&value3=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field4=buildstarttime&compare4=83&value4=2019-03-20&field5=buildstarttime&compare5=84&value5=2019-03-21) the tests:
* Stratimikos_test_single_stratimikos_solver_driver_belos_np_MPI_1
* Stratimikos_test_single_stratimikos_solver_driver_belos_ml_MPI_1
* Stratimikos_test_single_stratimikos_solver_driver_belos_ifpack_MPI_1
* Rythmos_timeDiscretizedBackwardEuler_amesos_MPI_1
are failing in many ATDM builds.
[new commits when these started failing](https://testing.sandia.gov/cdash/viewNotes.php?buildid=4754139#!#note4)
## Current Status on CDash
currently failing tests in ATDM builds can be seen [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-20&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=status&compare2=61&value2=Failed&field3=buildstarttime&compare3=83&value3=today&field4=buildstarttime&compare4=84&value4=tomorrow)
## Steps to Reproduce
One should be able to reproduce this failure on a machine with a sems rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a sems rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#sems-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-sems-rhel6-gnu-7.2.0-openmp-release
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_ENABLE_Stratimikos=ON \
-DTrilinos_ENABLE_Rythmos=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j8
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4646
Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 timing out on waterman cuda builds
Updated: 2019-04-21T01:44:26Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/ifpack2, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-17&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=testname&compare2=61&value2=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildname&compare4=63&value4=cuda&field5=buildstarttime&compare5=83&value5=2019-03-15&field6=buildstarttime&compare6=84&value6=today) the tests:
* Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4
are timing out in the builds:
* Trilinos-atdm-waterman-cuda-9.2-opt
* Trilinos-atdm-waterman-cuda-9.2-debug
* Trilinos-atdm-waterman-cuda-9.2-release-debug
The same test in the white cuda builds finished in about 30 seconds, as shown [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-17&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=testname&compare2=61&value2=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field3=site&compare3=61&value3=white&field4=buildname&compare4=63&value4=cuda&field5=buildstarttime&compare5=83&value5=2019-03-15&field6=buildstarttime&compare6=84&value6=today).
## Current Status on CDash
[Current status](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-17&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=testname&compare2=61&value2=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildname&compare4=63&value4=cuda&field5=buildstarttime&compare5=83&value5=yesterday&field6=buildstarttime&compare6=84&value6=today)
## Steps to Reproduce
One should be able to reproduce this failure on waterman as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-waterman-cuda-9.2-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Ifpack2=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -n 20 ctest -j20
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4622
Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4 failing in ATDM cuda builds
Updated: 2019-03-18T16:12:43Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/ifpack2, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2019-03-14&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=buildname&compare2=64&value2=-rdc-&field3=testname&compare3=61&value3=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field4=status&compare4=62&value4=Passed&field5=buildstarttime&compare5=83&value5=2019-03-14&field6=buildstarttime&compare6=84&value6=2019-03-15) the tests:
* Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4
are failing in the builds:
* Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug
* Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-rdc-shared-release-debug
* Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-rdc-release-debug
* Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-debug
* Trilinos-atdm-waterman-cuda-9.2-release-debug
* Trilinos-atdm-waterman-cuda-9.2-rdc-shared-release-debug
* Trilinos-atdm-waterman-cuda-9.2-rdc-release-debug
* Trilinos-atdm-waterman-cuda-9.2-opt
* Trilinos-atdm-waterman-cuda-9.2-debug
* Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-static-release-debug
* Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-shared-release-debug
## Current Status on CDash
[Current status on CDash for all ATDM builds](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Ifpack2_BlockTriDiContainerUnitAndPerfTests_MPI_4&field3=buildstarttime&compare3=84&value3=today&field4=buildstarttime&compare4=83&value4=yesterday)
## Steps to Reproduce on waterman
One should be able to reproduce this failure on waterman as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-waterman-cuda-9.2-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Ifpack2=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -n 20 ctest -j20
```
## Steps to Reproduce on white/ride
One should be able to reproduce this failure on ride or white as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for ride or white are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Ifpack2=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4353
Ifpack2_unit_tests_MPI_4 randomly failing on ATDM waterman build
Updated: 2019-04-21T01:32:25Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/ifpack2, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-waterman-cuda-9.2-opt&field2=testname&compare2=61&value2=Ifpack2_unit_tests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildstarttime&compare4=84&value4=2019-02-08T00%3A00%3A00&field5=buildstarttime&compare5=83&value5=2018-12-27T00%3A00%3A00) the test:
* Ifpack2_unit_tests_MPI_4
is randomly failing in the build:
* Trilinos-atdm-waterman-cuda-9.2-opt
It has failed roughly 6 times in the last month. Here are some examples of the output when it fails:
```
Error, relErr(Y.get1dView ()[9932],Z.get1dView ()[9932]) = relErr(29832,0) = 1 <= tol = 2.22045e-12: failed!
```
```
p=0 | The following tests FAILED:
p=0 | 48. Ifpack2OverlappingRowMatrix_default_scalar_type_default_local_ordinal_type_default_global_ordinal_type_Test0_UnitTest ...
p=0 |
p=0 | Total Time: 6.49 sec
p=0 |
p=1 | Summary: total = 82, run = 82, passed = 81, failed = 1
p=1 |
p=1 | End Result: TEST FAILED
```
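The first message reports `relErr(29832,0) = 1`, which is what a relative error of the form |a - b| / max(|a|, |b|) yields whenever exactly one operand is zero. That form is an assumption here (the Teuchos implementation may differ in detail, for example by guarding the denominator), and `rel_err` is a hypothetical helper. If the assumption holds, the failure would point to an entry that is zero on one side rather than to a small round-off difference against `tol = 2.22045e-12`.

```shell
# Sketch: relative error |a - b| / max(|a|, |b|) (assumed form, not copied from Teuchos).
rel_err() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    d  = a - b; if (d < 0) d = -d      # |a - b|
    aa = (a < 0) ? -a : a              # |a|
    bb = (b < 0) ? -b : b              # |b|
    m  = (aa > bb) ? aa : bb           # max(|a|, |b|)
    if (m == 0) print 0; else print d / m
  }'
}

# Values taken from the failure message above.
rel_err 29832 0
```

With one operand equal to zero the numerator and denominator coincide, so the result is 1 regardless of the magnitude of the other value.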
## Current Status on CDash
[2 Week history of this test](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-waterman-cuda-9.2-opt&field2=testname&compare2=61&value2=Ifpack2_unit_tests_MPI_4&field3=site&compare3=61&value3=waterman&field4=buildstarttime&compare4=84&value4=tomorrow&field5=buildstarttime&compare5=83&value5=2%20weeks%20ago)
## Steps to Reproduce
One should be able to reproduce the build on waterman where this test is randomly failing as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-waterman-cuda-9.2-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Ifpack2=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -n 20 ctest -j20
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/4260
Belos tests timing out on ATDM intel-18 mpich build
Updated: 2019-03-27T20:41:42Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/belos, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
## Description
As shown in the links below the tests:
* [Belos_rcg_hb_MPI_4](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6_intel-18.0.2_mpich2-3.2_openmp_static_opt&field2=buildstarttime&compare2=84&value2=2019-01-25&field3=buildstarttime&compare3=83&value3=2019-01-01&field4=testname&compare4=61&value4=Belos_rcg_hb_MPI_4)
* [Belos_gcrodr_hb_MPI_4](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6_intel-18.0.2_mpich2-3.2_openmp_static_opt&field2=buildstarttime&compare2=84&value2=2019-01-25&field3=buildstarttime&compare3=83&value3=2019-01-01&field4=testname&compare4=61&value4=Belos_gcrodr_hb_MPI_4)
are randomly timing out in the build:
* Trilinos-atdm-cee-rhel6_intel-18.0.2_mpich2-3.2_openmp_static_opt
## Current Status on CDash
The current status of the Belos tests on this build for the current testing day can be found [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6_intel-18.0.2_mpich2-3.2_openmp_static_opt&field2=buildstarttime&compare2=84&value2=today&field3=buildstarttime&compare3=83&value3=yesterday&field4=testname&compare4=65&value4=Belos_)
## Steps to Reproduce
One should be able to reproduce a build where this is randomly failing on a machine with a cee rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a cee rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce the build where this issue is randomly occurring should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-cee-rhel6_intel-18.0.2_mpich2-3.2_openmp_static_opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Belos=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
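Because these timeouts are intermittent, a single `ctest` run may well pass. One way to chase them, not described in the issue, is ctest's `--repeat-until-fail <n>` option, which requires each selected test to pass `n` times in a row and stops at the first failure. In the sketch below `repeat_test`, `DRY_RUN`, and the default repeat count of 20 are illustrative assumptions.

```shell
# Hypothetical helper: rerun one test until it fails or n consecutive runs pass.
repeat_test() {
  # $1 = exact test name, $2 = optional repeat count (defaults to 20).
  set -- ctest -R "^$1\$" --repeat-until-fail "${2:-20}" --output-on-failure
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"                 # run ctest; call this from the build directory
  fi
}

# Dry run for one of the two tests named in this issue.
DRY_RUN=1 repeat_test Belos_rcg_hb_MPI_4 20
```

Running one test in isolation may not reproduce a timeout that only appears when the machine is loaded by the full `ctest -j16` run, so a clean result here does not rule the problem out.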
Milestone: Initial cleanup of new ATDM builds of Trilinos

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3994
MueLu_Maxwell3D- tests not run due to build failure in ATDM build
Updated: 2018-12-21T02:48:28Z
Author: James Willenbring

*Created by: fryeguy52*
CC: @trilinos/muelu, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
Merge of PR #3993 on 12/4/2018 resulted in passing build on [12/5/2018](https://testing.sandia.gov/cdash-dev-view/index.php?project=Trilinos&parentid=4253654).
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt&field2=testname&compare2=65&value2=MueLu_Maxwell3D-&field3=testname&compare3=66&value3=_MPI_4&field4=site&compare4=61&value4=cee-rhel6&field5=buildstarttime&compare5=84&value5=2018-12-04T00%3A00%3A00&field6=buildstarttime&compare6=83&value6=2018-11-04T00%3A00%3A00) the following tests are not being run due to a [build failure](https://testing.sandia.gov/cdash/viewBuildError.php?buildid=4245202) that started on 12/01/2018:
* MueLu_Maxwell3D-Epetra_MPI_4
* MueLu_Maxwell3D-Tpetra-Stratimikos_MPI_4
* MueLu_Maxwell3D-Tpetra_MPI_4
in the build:
* Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt
The error occurs when building `packages/muelu/test/maxwell/CMakeFiles/MueLu_Maxwell3D.dir/Maxwell3D.cpp.o`
Standard error:
```
/scratch/rabartl/Trilinos.base/NightlyBuilds/Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt/SRC_AND_BUILD/Trilinos/packages/muelu/test/maxwell/Maxwell3D.cpp:262:11: error: no viable overloaded '='
tm2 = Teuchos::null;
~~~ ^ ~~~~~~~~~~~~~
/scratch/rabartl/Trilinos.base/NightlyBuilds/Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/src/Teuchos_TimeMonitor.hpp:178:34: note: candidate function (the implicit copy assignment operator) not viable: no known conversion from 'Teuchos::ENull' to 'const Teuchos::TimeMonitor' for 1st argument
class TEUCHOSCOMM_LIB_DLL_EXPORT TimeMonitor :
^
/scratch/rabartl/Trilinos.base/NightlyBuilds/Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt/SRC_AND_BUILD/Trilinos/packages/muelu/test/maxwell/Maxwell3D.cpp:274:11: error: no viable overloaded '='
tm3 = Teuchos::null;
~~~ ^ ~~~~~~~~~~~~~
/scratch/rabartl/Trilinos.base/NightlyBuilds/Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt/SRC_AND_BUILD/Trilinos/packages/teuchos/comm/src/Teuchos_TimeMonitor.hpp:178:34: note: candidate function (the implicit copy assignment operator) not viable: no known conversion from 'Teuchos::ENull' to 'const Teuchos::TimeMonitor' for 1st argument
class TEUCHOSCOMM_LIB_DLL_EXPORT TimeMonitor :
^
2 errors generated.
```
## Current Status on CDash
The current status of these tests/builds for the current testing day can be found [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt&field2=testname&compare2=65&value2=MueLu_Maxwell3D-&field3=testname&compare3=66&value3=_MPI_4&field4=site&compare4=61&value4=cee-rhel6)
## Steps to Reproduce
One should be able to reproduce this failure on a machine with a cee rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a cee rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3992
Anasazi_Epetra_BKS_norestart_test_MPI_4 failing in several ATDM builds.
2018-12-20T18:04:13Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/anasazi, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
Triggered by the PR #3951 merged to 'develop' on 11/28/2018 that worked around the Intel 18.0.2 MKL GEEV defect. Next: Try the updated Intel MKL 18.0.5 on 'mutrino' (with a local revert of #3951) and see whether all of these failures go away (@fryeguy52) ...
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_BKS_norestart_test_MPI_4&field3=buildstarttime&compare3=83&value3=2018-11-04T00%3A00%3A00&field4=status&compare4=61&value4=Failed) the test:
* Anasazi_Epetra_BKS_norestart_test_MPI_4
is failing in the builds:
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW (since ???)
* Trilinos-atdm-mutrino-intel-opt-openmp-KNL (since ???)
* Trilinos-atdm-cee-rhel6-intel-17.0.1-intelmpi-5.1.2-serial-static-opt (since 11/30/2018)
* Trilinos-atdm-cee-rhel6-gnu-7.2.0-openmpi-1.10.2-serial-static-opt (11/29/2018 & 12/1/2018)
* Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt (on 12/2/2018)
* Trilinos-atdm-cee-rhel6-gnu-4.9.3-openmpi-1.10.2-serial-static-opt (on 12/10/2018)
Some of these failures appear to be random, as shown for the build [Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt&field2=testname&compare2=61&value2=Anasazi_Epetra_BKS_norestart_test_MPI_4&field3=site&compare3=61&value3=cee-rhel6&field4=buildstarttime&compare4=83&value4=2018-11-11T00%3A00%3A00) and the build [Trilinos-atdm-cee-rhel6-gnu-7.2.0-openmpi-1.10.2-serial-static-opt](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6-gnu-7.2.0-openmpi-1.10.2-serial-static-opt&field2=testname&compare2=61&value2=Anasazi_Epetra_BKS_norestart_test_MPI_4&field3=site&compare3=61&value3=cee-rhel6&field4=buildstarttime&compare4=84&value4=2018-12-11T00%3A00%3A00&field5=buildstarttime&compare5=83&value5=2018-11-11T00%3A00%3A00).
For example, the errors look like the output shown [here](https://testing.sandia.gov/cdash/testDetails.php?test=61150478&build=4276066):
```
Number of iterations performed in BlockKrylovSchur_test.exe: 30
Direct residual norms computed in BlockKrylovSchur_test.exe
Eigenvalue Residual
----------------------------------------
1.199112e+05 1.296543e-07
1.196455e+05 1.185550e-07
1.192047e+05 4.530562e-04
1.185918e+05 1.497329e-04
1.178109e+05 4.552932e-04
End Result: TEST FAILED
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[25128,1],1]
Exit code: 255
--------------------------------------------------------------------------
...
```
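In the output above, two residuals are near 1e-07 while the other three are near 1e-04, about three orders of magnitude larger. A small sketch for pulling the eigenvalue/residual table out of such a log and flagging the large residuals is shown here. The threshold `TOL` is a hypothetical value chosen only to separate the two groups; the tolerance actually used by `BlockKrylovSchur_test.exe` is not given in this issue.

```python
import re

# Excerpt of the test output quoted above.
LOG = """\
Eigenvalue Residual
----------------------------------------
1.199112e+05 1.296543e-07
1.196455e+05 1.185550e-07
1.192047e+05 4.530562e-04
1.185918e+05 1.497329e-04
1.178109e+05 4.552932e-04
End Result: TEST FAILED
"""

# A table row is two numbers in exponent notation separated by whitespace.
ROW = re.compile(r"^\s*([0-9.]+e[+-]\d+)\s+([0-9.]+e[+-]\d+)\s*$")

# Hypothetical threshold, NOT the tolerance used by the real test.
TOL = 1.0e-6


def parse_residuals(text):
    """Return a list of (eigenvalue, residual) pairs found in the log text."""
    pairs = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            pairs.append((float(match.group(1)), float(match.group(2))))
    return pairs


def unconverged(pairs, tol):
    """Return the pairs whose residual exceeds tol."""
    return [(ev, res) for ev, res in pairs if res > tol]


for eigenvalue, residual in unconverged(parse_residuals(LOG), TOL):
    print(f"eigenvalue {eigenvalue:.6e} has residual {residual:.6e} > {TOL:.1e}")
```

Running this over the logs from passing and failing days might show whether the same eigenpairs are always the ones with large residuals.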
## Current Status on CDash
The current status of these tests/builds for the current testing day can be found [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=6&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=buildname&compare2=62&value2=Trilinos-atdm-cee-rhel6-intel-18.0.2-mpich2-3.2-serial-static-opt&field3=testname&compare3=61&value3=Anasazi_Epetra_BKS_norestart_test_MPI_4&field4=buildstarttime&compare4=83&value4=1%20day%20ago&field5=status&compare5=61&value5=Failed&field6=site&compare6=62&value6=mutrino)
## Steps to Reproduce
One should be able to reproduce this failure on a machine with a cee rhel6 environment as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a cee rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-cee-rhel6-intel-17.0.1-intelmpi-5.1.2-serial-static-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Anasazi=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3989
Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4 in many ATDM builds
2018-12-20T17:28:41Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/anasazi, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
## Next Action Status
The merge of PR #4031 to 'develop' on 12/13/2018 seems to have resulted in the test `Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4` passing in all ATDM Trilinos builds. It passed in all 41 ATDM Trilinos builds on 2018-12-19 as shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&date=2018-12-19&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4) (and there were no missing builds for testing day 2018-12-19 so this should be complete test results).
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4&field3=buildstarttime&compare3=84&value3=2018-12-04T00%3A00%3A00&field4=buildstarttime&compare4=83&value4=2018-11-04T00%3A00%3A00&field5=status&compare5=61&value5=Failed) the test `Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4` has failed in many ATDM builds since 11/24/2018. The builds where it has failed in that time are:
* Trilinos-atdm-sems-rhel6-intel-opt-openmp
* Trilinos-atdm-mutrino-intel-opt-openmp-KNL
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW
* Trilinos-atdm-chama-intel-opt-openmp
* Trilinos-atdm-chama-intel-debug-openmp
* Trilinos-atdm-cee-rhel6-intel-17.0.1-intelmpi-5.1.2-serial-static-opt
* Trilinos-atdm-cee-rhel6-gnu-7.2.0-openmpi-1.10.2-serial-static-opt
* Trilinos-atdm-cee-rhel6-gnu-4.9.3-openmpi-1.10.2-serial-static-opt
* Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt
The test has been failing every day since 11/29/2018 in the builds:
* Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt
* Trilinos-atdm-cee-rhel6-gnu-4.9.3-openmpi-1.10.2-serial-static-opt
* Trilinos-atdm-cee-rhel6-gnu-7.2.0-openmpi-1.10.2-serial-static-opt
The test output looks like this in these cases:
```
Building Map
Setting up info for filling matrix
Creating matrix
Filling matrix
Calling FillComplete on matrix
Setting Anasazi parameters
Creating initial vector for solver
Creating eigenproblem
Creating eigensolver (GeneralizedDavidsonSolMgr)
Solving eigenproblem
[ceerws1113:51638] *** An error occurred in MPI_Allreduce
[ceerws1113:51638] *** reported by process [999489537,2]
[ceerws1113:51638] *** on communicator MPI_COMM_WORLD
[ceerws1113:51638] *** MPI_ERR_IN_STATUS: error code in status
[ceerws1113:51638] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[ceerws1113:51638] *** and potentially your MPI job)
[ceerws1113:51629] 1 more process has sent help message help-mpi-errors.txt / mpi_errors_are_fatal
[ceerws1113:51629] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
```
## Current Status on CDash
The current status of this test on all ATDM builds can be found [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4)
History for the last week on ATDM builds can be seen [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_GeneralizedDavidson_nh_test_MPI_4&field3=buildstarttime&compare3=83&value3=7%20days%20ago)
## Steps to Reproduce on CEE RHEL6
One should be able to reproduce this failure on a machine with a cee rhel6 environment because it has been failing there every day. The process is described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for a machine with a cee rhel6 environment are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-cee-rhel6-gnu-4.9.3-openmpi-1.10.2-serial-static-opt
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Anasazi=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3897
MueLu_UnitTests[Blocked][Epetra|Tpetra]_MPI_4 failing randomly on several ATDM builds
2019-02-11T17:29:49Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/MueLu, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
PR #4046 merged to 'develop' on 12/18/2018 may fix these random failures. Next: Watch over the coming days and weeks to see if there are any more failures ...
## Description
As shown in the links below the tests:
* [MueLu_UnitTestsBlockedEpetra_MPI_4](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=buildname&compare2=62&value2=Trilinos-atdm-cee-rhel6-gnu-7.2.0-opt-serial&field3=testname&compare3=61&value3=MueLu_UnitTestsBlockedEpetra_MPI_4&field4=status&compare4=61&value4=failed&field5=buildstarttime&compare5=83&value5=2018-10-17)
* [MueLu_UnitTestsEpetra_MPI_4](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=MueLu_UnitTestsEpetra_MPI_4&field3=status&compare3=61&value3=failed&field4=details&compare4=64&value4=Timeout&field5=buildstarttime&compare5=83&value5=2018-10-16)
* [MueLu_UnitTestsEpetra_MPI_1](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=MueLu_UnitTestsEpetra_MPI_1&field3=status&compare3=61&value3=failed&field4=details&compare4=64&value4=Timeout&field5=buildstarttime&compare5=83&value5=2018-10-16)
* [MueLu_UnitTestsTpetra_MPI_1](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=MueLu_UnitTestsTpetra_MPI_1&field3=status&compare3=61&value3=failed&field4=details&compare4=64&value4=Timeout&field5=buildstarttime&compare5=83&value5=2018-10-16)
* [MueLu_UnitTestsTpetra_MPI_1](https://testing.sandia.gov/cdash/testDetails.php?test=61339618&build=4287170)
are randomly failing across several builds. They have failed several times in the last month on different builds. The builds where we have seen failures are:
* Trilinos-atdm-cee-rhel6-gnu-4.9.3-opt-serial
* Trilinos-atdm-cee-rhel6-gnu-opt-serial
* Trilinos-atdm-cee-rhel6-intel-opt-serial
* Trilinos-atdm-hansen-shiller-gnu-opt-openmp
* Trilinos-atdm-hansen-shiller-gnu-opt-serial
* Trilinos-atdm-hansen-shiller-intel-debug-openmp
* Trilinos-atdm-hansen-shiller-intel-debug-serial
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW
* Trilinos-atdm-mutrino-intel-opt-openmp-KNL
* Trilinos-atdm-sems-rhel6-gnu-debug-openmp
* Trilinos-atdm-sems-rhel6-intel-opt-openmp
* Trilinos-atdm-serrano-intel-opt-openmp
* Trilinos-atdm-waterman-gnu-opt-openmp
* Trilinos-atdm-waterman-gnu-release-debug-openmp
* Trilinos-atdm-white-ride-cuda-9.2-opt
* Trilinos-atdm-white-ride-gnu-opt-openmp
It looks like, in each case, something similar to the following appears in the 'openmp' builds:
```
...
p=0: *** Caught standard std::exception of type 'Xpetra::Exceptions::RuntimeError' :
EpetraExt::MatrixMarketFileToCrsMatrix return value of -1
[FAILED] (0.0902 sec) Hierarchy_double_int_int_Kokkos_Compat_KokkosOpenMPWrapperNode_Write_UnitTest
Location: /home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-intel-debug-openmp/SRC_AND_BUILD/Trilinos/packages/muelu/test/unit_tests/Hierarchy.cpp:889
...
The following tests FAILED:
116. Hierarchy_double_int_int_Kokkos_Compat_KokkosOpenMPWrapperNode_Write_UnitTest ...
...
```
and the 'serial' builds show:
```
...
p=0: *** Caught standard std::exception of type 'Xpetra::Exceptions::RuntimeError' :
EpetraExt::MatrixMarketFileToCrsMatrix return value of -1
[FAILED] (0.00618 sec) Hierarchy_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_Write_UnitTest
Location: /jenkins/slave/workspace/Trilinos-atdm-sems-rhel6-gnu-debug-serial/SRC_AND_BUILD/Trilinos/packages/muelu/test/unit_tests/Hierarchy.cpp:889
...
The following tests FAILED:
116. Hierarchy_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_Write_UnitTest ...
...
```
Only one unit test fails, test 116, which is called `Hierarchy_double_int_int_Kokkos_Compat_KokkosSerialWrapperNode_Write_UnitTest` in the 'serial' builds and `Hierarchy_double_int_int_Kokkos_Compat_KokkosOpenMPWrapperNode_Write_UnitTest` in the 'openmp' builds.
The first failure showed up on 2018-10-21.
## Current Status on CDash
To see failures for these tests in the last month click [here](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=65&value2=MueLu_UnitTests&field3=status&compare3=61&value3=failed&field4=details&compare4=64&value4=Timeout&field5=buildstarttime&compare5=83&value5=30%20days%20ago).
## Steps to Reproduce
This may be very difficult to reproduce because the test fails infrequently in any single build, even though it fails nearly every other day across all of the builds. Instructions for reproducing ATDM builds can be found at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
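To get a feel for how hard a local reproduction might be, here is a rough back-of-the-envelope sketch. It assumes, purely for illustration, that each of the 16 distinct builds listed in the description fails independently with the same daily probability, and that "nearly every other day" means an aggregate daily failure probability of about 0.5. Neither assumption is verified by the CDash data, and the function names are hypothetical.

```python
def per_build_failure_rate(aggregate_rate, num_builds):
    """Daily failure probability of one build, given P(at least one build fails).

    Assumes independent, identically behaving builds:
        aggregate_rate = 1 - (1 - p) ** num_builds
    """
    return 1.0 - (1.0 - aggregate_rate) ** (1.0 / num_builds)


def aggregate_failure_rate(per_build_rate, num_builds):
    """Probability that at least one of num_builds independent builds fails."""
    return 1.0 - (1.0 - per_build_rate) ** num_builds


NUM_BUILDS = 16       # distinct build names listed in the description
AGGREGATE_RATE = 0.5  # assumed reading of "nearly every other day"

p = per_build_failure_rate(AGGREGATE_RATE, NUM_BUILDS)
print(f"per-build daily failure probability: {p:.3f}")
print(f"expected runs of one build until a failure: {1.0 / p:.1f}")
```

Under these assumptions a single build would fail only a few percent of the time, so repeating the test run many times in one build directory is probably needed before drawing conclusions from a local pass.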
More specifically, the commands given for ride or white are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite
The exact commands to reproduce one build where this has failed on white or ride are:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-white-ride-gnu-opt-openmp
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
Milestone: Keep promoted "ATDM" builds of Trilinos clean
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3655
MueLu: CreatePreconditioner_PDESystem hangs/fails in fused Jacobi under CUDA
2018-11-30T03:12:10Z
James Willenbring
*Created by: jhux2*
I first saw this on waterman, but it's also happening on geminga:
```
[snip]
Eigenvalue estimate
Calculating max eigenvalue estimate now (max iters = 10)
Prolongator damping factor = 0.68 (1.33 / 1.94)
Fused (I-omega*D^{-1} A)*Ptent
[hang]
```
2018-10-17 MueLu dashboard
Linux-gcc-5.3.0-OPENMPI-1.8.7_RELEASE_KOKKOS-REFACTOR_EXPERIMENTAL_CUDA-8.0.44
[MueLu_UnitTestsTpetra_MPI_1](https://testing.sandia.gov/cdash/testDetails.php?test=56901030&build=4060598)
[MueLu_UnitTestsTpetra_MPI_4](https://testing.sandia.gov/cdash/testDetails.php?test=56901000&build=4060598)
Blocks: #2674, #3482
@trilinos/muelu @csiefer2
Milestone: Initial cleanup of new ATDM builds of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3640
MueLu_UnitTestsBlockedEpetra_MPI_1 failing on ATDM cee-rhel6-clang-opt-serial build
2018-12-20T18:23:36Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/muelu , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
PR #4072 merged to 'develop' on 12/19/2018 for test `MueLu_UnitTestsBlockedEpetra_MPI_1` that was failing every day. Test passed on 12/20/2018.
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-15&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-cee-rhel6-&field2=status&compare2=62&value2=passed) the tests:
* MueLu_UnitTestsBlockedEpetra_MPI_1
are failing in the builds:
* Trilinos-atdm-cee-rhel6-clang-opt-serial
The failure is due to a segmentation fault:
```
[ceerws1113:37972] *** Process received signal ***
[ceerws1113:37972] Signal: Segmentation fault (11)
[ceerws1113:37972] Signal code: Address not mapped (1)
[ceerws1113:37972] Failing at address: (nil)
```
## Steps to Reproduce
One should be able to reproduce this failure on any CEE LAN RHEL6 SRN machine as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the CEE LAN RHEL6 SRN machines are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cee-rhel6-clang-opt-serial
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Milestone: Initial cleanup of new ATDM builds of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3637
Teko_testdriver_tpetra_MPI_1 Failing in ATDM cee-rhel6-clang-opt-serial build
2018-11-30T03:18:02Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/teko , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
Test failed every day for 4 days starting 10/12/2018 when this build was set up till 10/17/2018 but started passing on 10/18/2018 and has passed for 3 consecutive days as of 10/20/2018 as shown [here](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-18&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-cee-rhel6-clang-opt-serial&field2=testname&compare2=61&value2=Teko_testdriver_tpetra_MPI_1&field3=buildstarttime&compare3=83&value3=2018-10-10). (It is not clear what new changes occurred on 10/18/2018 to allow this test to start passing but let's not look a gift horse in the mouth.)
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-15&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-cee-rhel6-&field2=status&compare2=62&value2=passed) the tests:
* Teko_testdriver_tpetra_MPI_1
are failing in the builds:
* Trilinos-atdm-cee-rhel6-clang-opt-serial
This was failing on the rhel6 sems environment jobs earlier and the solution was a targeted disabling of part of the test. Is this the same issue? See #2656.
```
Test "LSCStabilized_tpetra" completed ... FAILED (1)
```
...
```
Tests Passed: 85, Tests Failed: 1
(Incidently, you want no failures)
Teko tests failed
```
## Steps to Reproduce
One should be able to reproduce this failure on any CEE LAN RHEL6 SRN machine as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the CEE LAN RHEL6 SRN machines are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cee-rhel6-clang-opt-serial
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Teko=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Milestone: Initial cleanup of new ATDM builds of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3556
MueLu: uncoupled aggregation phase 3 failing on CUDA builds
2018-11-30T03:12:10Z
James Willenbring
*Created by: lucbv*
@trilinos/muelu
@rstumin
## Expectations
All tests should pass on the [CUDA build](https://testing.sandia.gov/cdash/buildSummary.php?buildid=4003452) especially since they might be used by the auto-tester...
## Current Behavior
The two CUDA builds with `KOKKOS_REFACTOR=ON` on geminga are failing because not all nodes get aggregated. This seems like a bug newly introduced in phase 3 of uncoupled aggregation.
## Motivation and Context
Mainly we want to maintain a clean dashboard.
## Definition of Done
- [ ] The CUDA builds with `KOKKOS_REFACTOR=ON` are clean and no tests fail
## Possible Solution
I am not sure, but it looks like a logic error, so looking at aggregation phase 3 will probably be enough.
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3541
ShyLU_DD tests build failure in targeted CUDA PR build Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt starting 10/2/2018
2019-01-28T18:04:16Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/shylu, @srajama1 (Trilinos Linear Solvers Product Area Lead), @fryeguy52, @roeverf, @searhein
## Next Action Status
After merge of PR #4248 to 'develop' on 1/23/2019, the ShyLU_DD build and tests in build `Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug-pt` on 'ride' were 100% clean on 1/24/2019.
## Description
Starting today, there are two build errors for the ShyLU_DD package in the build `Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt` on 'white' and 'ride' as shown [here](https://testing.sandia.gov/cdash-dev-view/viewBuildError.php?buildid=3998259) which shows build errors starting with:
```
/home/jenkins/white/workspace/Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt/SRC_AND_BUILD/Trilinos/packages/shylu/shylu_dd/frosch/test/Thyra_Tpetra/main.cpp(104): error: "EpetraNode" is ambiguous
```
ShyLU_DD was building just fine yesterday in this build as shown [here](https://testing.sandia.gov/cdash-dev-view/viewBuildError.php?type=0&buildid=3995410).
Looking at the commits pulled today shown [here](https://testing.sandia.gov/cdash-dev-view/viewNotes.php?buildid=3998221#!#note6), it seems likely that this was caused by one of the commits from @roeverf to the ShyLU_DD package in PR #3472, which was merged to 'develop' by @searhein on 10/1/2018 as shown [here](https://github.com/trilinos/Trilinos/pull/3472#event-1876847991).
This is an important build because we are targeting this build on 'white' and 'ride' as a Trilinos PR testing build (see #2464).
## Current Status on CDash
The current status of `ShyLU_DD` in this build and tests over the last few days can be seen in [this CDash query](https://testing.sandia.gov/cdash-dev-view/index.php?project=Trilinos&date=2019-01-22&filtercount=4&showfilters=1&filtercombine=and&field1=subprojects&compare1=93&value1=ShyLU_DD&field2=buildname&compare2=61&value2=Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-release-debug-pt&field3=site&compare3=61&value3=ride&field4=buildstarttime&compare4=83&value4=1%20week%20ago).
## Steps to reproduce
One should be able to reproduce these build errors on either 'white' or 'ride' by cloning the Trilinos git repo, checking out the 'develop' branch, creating a build directory, and then doing:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-release-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnvAllPtPackages.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_ShyLU_DD=ON \
$TRILINOS_DIR
$ make NP=16
```
Milestone: Initial cleanup of new ATDM builds of Trilinos
https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3497
Belos_gcrodr_hb_MPI_4 failing in ATDM builds on mutrino
2018-12-12T21:22:57Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/belos , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
PR #3951 merged to 'develop' on 11/28/2018 resulted in this test passing in the Intel 18.0.2 builds on 'mutrino' and the 'cee-rhel6' builds on 12/1/2018 and in all builds for several days as of 12/3/2018.
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-09-24&filtercount=5&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=site&compare2=61&value2=mutrino&field3=status&compare3=62&value3=passed&field4=buildstarttime&compare4=83&value4=2018-09-01&field5=testname&compare5=63&value5=Belos) the test:
* Belos_gcrodr_hb_MPI_4
is failing in the builds:
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW
* Trilinos-atdm-mutrino-intel-opt-openmp-KNL
Some test output:
```
*** Error in `/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/BUILD/packages/belos/epetra/test/GCRODR/Belos_gcrodr_hb.exe': free(): invalid pointer: 0x000001000011bba0 ***
*** Error in `/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/BUILD/packages/belos/epetra/test/GCRODR/Belos_gcrodr_hb.exe': free(): invalid pointer: 0x00000100004b4980 ***
*** Error in `/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/BUILD/packages/belos/epetra/test/GCRODR/Belos_gcrodr_hb.exe': free(): invalid pointer: 0x00000100004b4980 ***
*** Error in `/lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/BUILD/packages/belos/epetra/test/GCRODR/Belos_gcrodr_hb.exe': free(): invalid pointer: 0x00000100004b4980 ***
```
## Steps to Reproduce
One should be able to reproduce this failure on the machine mutrino as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system mutrino are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#mutrino
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh intel-opt-openmp-HSW
$ cmake \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Belos=ON \
$TRILINOS_DIR
$ make -j16
$ salloc -N 1 -p standard -J $JOB_NAME ctest -j16
```
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3491
MueLu_UnitTestsTpetra_MPI_ tests timing out on ATDM cuda 9.2 builds on waterman, ride, and white
2018-11-30T03:12:10Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/muelu , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
PR #3498, merged on 9/25/2018, reduced the cost of the expensive BlockCrs unit tests, and PR #3517, merged on 9/26/2018, split MueLu_UnitTests into multiple executables. On 2/27/2018 all MueLu tests (including the new split-up `MueLu_UnitTests*` tests) passed on all promoted "ATDM" builds and all 'waterman' builds.
## Description
As shown in [this query](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-09-04&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=63&value1=Trilinos-atdm-&field2=testname&compare2=65&value2=MueLu_UnitTestsTpetra_MPI_&field3=status&compare3=62&value3=Passed&field4=buildstarttime&compare4=83&value4=2018-09-20) the tests:
* MueLu_UnitTestsTpetra_MPI_1
* MueLu_UnitTestsTpetra_MPI_4
are failing often in the builds:
* Trilinos-atdm-white-ride-cuda-9.2-opt
* Trilinos-atdm-white-ride-cuda-9.2-debug
the test:
* MueLu_UnitTestsTpetra_MPI_4
is also failing every night on waterman in the builds:
* Trilinos-atdm-waterman-cuda-9.2-opt
* Trilinos-atdm-waterman-cuda-9.2-debug
All of the failures are timeouts.
## Steps to Reproduce on white
One should be able to reproduce this failure on the machine white as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system white are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
## Steps to Reproduce on waterman
One should be able to reproduce this failure on the machine waterman as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
$TRILINOS_DIR
$ make NP=20
$ bsub -x -Is -n 20 ctest -j20
```
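Because the failures here are timeouts in a small set of tests, it can help to rerun only those tests instead of the whole suite. The following is a minimal sketch, not part of the ATDM scripts: `ctest_regex` is a hypothetical helper that joins test names into an anchored regular expression for `ctest -R`, and the guarded `ctest` call assumes it is run from a configured build directory (wrap it with `bsub` or `salloc` as in the commands above).

```shell
# Hypothetical helper (not part of Trilinos): join test names into an
# anchored regex suitable for `ctest -R`.
ctest_regex() {
  # The subshell keeps the IFS change local; "$*" joins args with '|'.
  ( IFS='|'; printf '^(%s)$\n' "$*" )
}

regex=$(ctest_regex MueLu_UnitTestsTpetra_MPI_1 MueLu_UnitTestsTpetra_MPI_4)
echo "$regex"
# prints: ^(MueLu_UnitTestsTpetra_MPI_1|MueLu_UnitTestsTpetra_MPI_4)$

# Only run ctest when it is installed and this looks like a build directory.
if command -v ctest >/dev/null 2>&1 && [ -f CTestTestfile.cmake ]; then
  ctest -R "$regex" --output-on-failure
fi
```

Anchoring the pattern with `^` and `$` keeps `ctest -R` from also selecting other tests whose names merely contain one of the listed names.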
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3417
Test failures in ATDM config gnu debug builds on Power8/9 machines
2018-11-30T03:12:10Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/tpetra, @trilinos/belos ,
@srajama1 (Trilinos Linear Solvers Product Lead)
@kddevin (Trilinos Data Services Product Lead)
@bartlettroscoe
## Next Action Status
PR #3420, merged on 9/10/2018, fixed all but one test in the `gnu-debug-openmp` builds on 'white'/'ride' as of 9/11/2018; the remaining test, `TpetraCore_gemm_MPI_1`, is timing out. Next: make `TpetraCore_gemm_MPI_1` run faster in that build, or disable it?
## Description
As shown in [this query(white/ride)](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-09-04&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-white-ride-gnu-debug-openmp&field2=status&compare2=62&value2=Passed&field3=buildstarttime&compare3=83&value3=2018-09-10) and [this query(waterman)](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-09-04&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-waterman-gnu-debug-openmp&field2=status&compare2=62&value2=Passed&field3=buildstarttime&compare3=83&value3=2018-09-10) the tests:
- Anasazi_Tpetra_BlockDavidson_Lap_test_MPI_4
- Anasazi_Tpetra_BlockKrylovSchur_Lap_test_MPI_4
- Anasazi_Tpetra_IRTR_Lap_test_MPI_4
- Anasazi_Tpetra_MVOPTester_MPI_4
- Anasazi_Tpetra_TraceMin_largest_standard_test_MPI_4
- Anasazi_Tpetra_TraceMin_smallest_proj_test_MPI_4
- Anasazi_Tpetra_TraceMin_smallest_schur_test_MPI_4
- Anasazi_Tpetra_TraceMinDavidson_largest_standard_test_MPI_4
- Belos_Issue_3235_MPI_2
- Belos_SolverFactory_MPI_4
- Belos_Tpetra_BlockGMRES_hb_test_MPI_4
- Belos_Tpetra_MultipleSolves_MPI_4
- Belos_Tpetra_MVOPTester_complex_test_MPI_4
- Ifpack2_AdditiveSchwarz_RILUK_MPI_4
- Ifpack2_Cheby_belos_MPI_1
- Ifpack2_GS_belos_MPI_1
- Ifpack2_ILUT_5w_2_MPI_1
- Ifpack2_ILUT_5w_no_diag_MPI_1
- Ifpack2_ILUT_belos_MPI_1
- Ifpack2_ILUT_hb_belos_MPI_2
- Ifpack2_ILUT_hb_belos_MPI_4
- Ifpack2_Jac_sm_belos_MPI_1
- Ifpack2_Jacobi_belos_constGraph_MPI_4
- Ifpack2_Jacobi_belos_MPI_1
- Ifpack2_Jacobi_hb_belos_MPI_1
- Ifpack2_Jacobi_hb_belos_MPI_2
- Ifpack2_RILUK_hb_belos_MPI_2
- Ifpack2_RILUK_hb_belos_MPI_4
- Ifpack2_SGS_belos_MPI_1
- Ifpack2_small_gmres_belos_MPI_1
- MueLu_Maxwell3D-Tpetra-Stratimikos_MPI_4
- MueLu_Stratimikos_MPI_4
- MueLu_Stratimikos2_MPI_4
- NOX_Tpetra_1DFEM_MPI_4
- NOX_Tpetra_Heq_MPI_4
- NOX_Tpetra_MultiVectorOpsTests_MPI_4
- PanzerAdaptersSTK_CurlLaplacianExample
- PanzerAdaptersSTK_main_driver_energy-ss
- PanzerAdaptersSTK_main_driver_energy-ss-blocked-tp
- PanzerAdaptersSTK_MixedPoissonExample
- PanzerAdaptersSTK_projection_MPI_2
- PanzerMiniEM_MiniEM-BlockPrec_Augmentation_MPI_1
- PanzerMiniEM_MiniEM-BlockPrec_Augmentation_MPI_4
- PanzerMiniEM_MiniEM-BlockPrec_RefMaxwell_MPI_1
- PanzerMiniEM_MiniEM-BlockPrec_RefMaxwell_MPI_4
- Teko_DiagonallyScaledPreconditioner_MPI_1
- Teko_testdriver_tpetra_MPI_1
- Teko_testdriver_tpetra_MPI_2
- ThyraTpetraAdapters_TpetraThyraWrappersUnitTests_MPI_4
- ThyraTpetraAdapters_TpetraThyraWrappersUnitTests_serial_MPI_1
- TpetraCore_gemm_MPI_1
- TpetraCore_MultiVector_UnitTests_MPI_4
are failing in the builds:
* Trilinos-atdm-waterman-gnu-debug-openmp
* Trilinos-atdm-white-ride-gnu-debug-openmp (on both white and ride)
Many of the tests share the following output:
```
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
--------------------------------------------------------------------------
mpiexec has exited due to process rank 0 with PID 0 on
node waterman2 exiting improperly. There are three reasons this could occur:
1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.
2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"
3. this process called "MPI_Abort" or "orte_abort" and the mca parameter
orte_create_session_dirs is set to false. In this case, the run-time cannot
detect that the abort call was an abnormal termination. Hence, the only
error message you will receive is this one.
This may have caused other processes in the application to be
terminated by signals sent by mpiexec (as reported here).
You can avoid this message by specifying -quiet on the mpiexec command line.
--------------------------------------------------------------------------
```
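To check how many of the failing tests share this message, one option is to scan saved test output for the BLAS argument-error line. The sketch below is an illustration only: `summarize_blas_errors` is a hypothetical helper, not part of Trilinos or CTest. For reference, argument 8 of DGEMM in the reference BLAS interface is `LDA` (the leading dimension of `A`); whether a bad `LDA` is the root cause here is not established by the output above.

```shell
# Hypothetical helper (not part of Trilinos): read test output on stdin and
# count BLAS/LAPACK "illegal value" reports per routine and argument number.
summarize_blas_errors() {
  sed -n 's/.*On entry to  *\([A-Z0-9_]*\)  *parameter number  *\([0-9][0-9]*\) had an illegal value.*/\1 \2/p' \
    | sort | uniq -c | awk '{print $2, "param", $3, "count", $1}'
}

# Example using the lines quoted above.
summarize_blas_errors <<'EOF'
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
** On entry to DGEMM parameter number 8 had an illegal value
mpiexec has exited due to process rank 0 with PID 0 on
EOF
# prints: DGEMM param 8 count 4
```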
## Steps to Reproduce (white/ride)
One should be able to reproduce this failure on the machine ride/white as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system ride/white are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#ridewhite
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh gnu-debug-openmp
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_ENABLE_<PACKAGE_NAME>=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
(where `<PACKAGE_NAME>` is some package you want to build and run tests for.)
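As a rough way to pick `<PACKAGE_NAME>` for one of the failing tests listed above, the sketch below derives an enable flag from a test name. It rests on an assumption that is not stated in this issue: that the text before the first underscore in a test name is the package or subpackage that defines it. `enable_flag_for_test` is a hypothetical helper, not part of the ATDM scripts, so check its result against the package list before relying on it.

```shell
# Hypothetical helper (not part of Trilinos): guess the enable flag from a
# test name, ASSUMING the text before the first underscore names the
# package or subpackage that owns the test.
enable_flag_for_test() {
  printf -- '-DTrilinos_ENABLE_%s=ON\n' "${1%%_*}"
}

for t in TpetraCore_gemm_MPI_1 Belos_SolverFactory_MPI_4; do
  enable_flag_for_test "$t"
done
# prints:
# -DTrilinos_ENABLE_TpetraCore=ON
# -DTrilinos_ENABLE_Belos=ON
```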
## Steps to Reproduce (waterman)
One should be able to reproduce this failure on the machine waterman as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system waterman are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#waterman
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh gnu-debug-openmp
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_ENABLE_<PACKAGE_NAME>=ON \
$TRILINOS_DIR
$ make NP=20
$ bsub -x -Is -n 20 ctest -j20
```
(where `<PACKAGE_NAME>` is some package you want to build and run tests for.)
Keep promoted "ATDM" builds of Trilinos clean