Trilinos merge requestshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests2019-01-10T16:20:57Zhttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/4164Return 100% passing cee-hrl6-gnu-7.2.0 to 'ATDM' CDash group (TRIL-212)2019-01-10T16:20:57ZJames WillenbringReturn 100% passing cee-hrl6-gnu-7.2.0 to 'ATDM' CDash group (TRIL-212)*Created by: bartlettroscoe*
This build was 100% passing after the switch from static to shared libs.
Therefore, we can promote back to the 'ATDM' CDash group.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/4064Enable TeuchosNumerics_LAPACK_test_MPI_1 test in 'debug' Power8 and Power8 bu...2018-12-18T09:10:21ZJames WillenbringEnable TeuchosNumerics_LAPACK_test_MPI_1 test in 'debug' Power8 and Power8 builds and disable just the STEQR() test*Created by: bartlettroscoe*
CC: @trilinos/teuchos, @fryeguy52
## Description
Enable TeuchosNumerics_LAPACK_test_MPI_1 test in 'debug' Power8 and Power8 builds and disable just the STEQR() test.
NOTE: This test was being disabled in some `release-debug` builds where it did not need to be disabled. In fact, the `STEQR()` function does not segfault when using optimized compiler options (which are used in a `release-debug` build).
## Motivation and Context
We need to be running LAPACK tests for the functions being used by Trilinos and not disable them all (see #2410).
## How Has This Been Tested?
On 'ride' I did:
```
$ bsub -x -Is -q rhel7F -n 16 \
./checkin-test-atdm.sh \
cuda-9.2-debug gnu-openmp-debug \
--enable-packages=TeuchosNumerics --local-do-all
```
which returned:
```
PASSED (NOT READY TO PUSH): Trilinos: ride10
Mon Dec 17 08:51:43 MST 2018
Enabled Packages: TeuchosNumerics
Build test results:
-------------------
1) cuda-9.2-debug => passed: passed=16,notpassed=0 (1.50 min)
2) gnu-openmp-debug => passed: passed=16,notpassed=0 (0.60 min)
```
and showed:
```
$ for build_name in cuda-9.2-debug gnu-openmp-debug ; do grep -nH TeuchosNumerics_LAPACK_test ${build_name}/ctest.out ; done | grep "Pass"
cuda-9.2-debug/ctest.out:33:16/16 Test #9: TeuchosNumerics_LAPACK_test_MPI_1 .................. Passed 0.43 sec
gnu-openmp-debug/ctest.out:30:13/16 Test #9: TeuchosNumerics_LAPACK_test_MPI_1 .................. Passed 1.88 sec
```
And I ran this on 'waterman' using:
```
$ bsub -x -Is -n 20 \
./checkin-test-atdm.sh \
cuda-9.2-debug cuda-9.2-release-debug gnu-openmp-release-debug \
--enable-packages=TeuchosNumerics --local-do-all
```
which returned:
```
PASSED (NOT READY TO PUSH): Trilinos: waterman1
Mon Dec 17 08:56:33 MST 2018
Enabled Packages: TeuchosNumerics
Build test results:
-------------------
1) cuda-9.2-debug => passed: passed=16,notpassed=0 (1.20 min)
2) cuda-9.2-release-debug => passed: passed=16,notpassed=0 (1.34 min)
3) gnu-openmp-release-debug => passed: passed=16,notpassed=0 (1.08 min)
```
and showed:
```
$ for build_name in cuda-9.2-debug cuda-9.2-release-debug gnu-openmp-release-debug ; do grep -nH TeuchosNumerics_LAPACK_test ${build_name}/ctest.out ; done | grep "Pass"
cuda-9.2-debug/ctest.out:32:15/16 Test #9: TeuchosNumerics_LAPACK_test_MPI_1 .................. Passed 2.20 sec
cuda-9.2-release-debug/ctest.out:29:12/16 Test #9: TeuchosNumerics_LAPACK_test_MPI_1 .................. Passed 2.24 sec
gnu-openmp-release-debug/ctest.out:26: 9/16 Test #9: TeuchosNumerics_LAPACK_test_MPI_1 .................. Passed 4.10 sec
```
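The per-build `grep` checks above can be wrapped in a small helper. This is a minimal sketch, assuming each build directory holds a `ctest.out` file as produced by the `checkin-test-atdm.sh` runs shown here; the function name `check_test_passed` is hypothetical and not part of the Trilinos scripts.

```shell
# Sketch: confirm that a named test shows "Passed" in each build's ctest.out.
# Assumes <build_name>/ctest.out exists for every build name passed in.
# check_test_passed is a hypothetical helper name.
check_test_passed() {
  local test_name="$1"; shift
  local status=0
  local build_name
  for build_name in "$@"; do
    # ctest prints: "<n>/<N> Test #<id>: <name> ....... Passed <time> sec"
    if grep -q "${test_name} .* Passed" "${build_name}/ctest.out" 2>/dev/null; then
      echo "${build_name}: passed"
    else
      echo "${build_name}: NOT passed"
      status=1
    fi
  done
  return "${status}"
}
```

For example, `check_test_passed TeuchosNumerics_LAPACK_test_MPI_1 cuda-9.2-debug gnu-openmp-debug` returns non-zero if any listed build lacks a passing line for that test.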
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
- [x] All new and existing tests passed.
- [x] No new compiler warnings were introduced.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/4040Set CMAKE_SKIP_INSTALL_RPATH=ON for static builds on 'mutrino' (TRIL-241)2018-12-13T11:41:11ZJames WillenbringSet CMAKE_SKIP_INSTALL_RPATH=ON for static builds on 'mutrino' (TRIL-241)*Created by: bartlettroscoe*
CC: @bathmatt, @jmgate, @krcb
This is needed to get around an install error on 'mutrino' (see [TRIL-241](https://sems-atlassian-son.sandia.gov/jira/browse/TRIL-241)).
NOTE: This needed a FORCE set for the cache var since CMake seems to set the
default as 'NO'!
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3723Remove MueLu_ENABLE_Epetra=OFF for EMPIRE ATDM Trilinos config (#2674, #2319)2018-10-24T02:34:27ZJames WillenbringRemove MueLu_ENABLE_Epetra=OFF for EMPIRE ATDM Trilinos config (#2674, #2319)*Created by: bartlettroscoe*
@trilinos/muelu, @jhux2
## Description
The CUDA build for MueLu was fixed, so this disable should no longer be
needed.
## Motivation and Context
This test was disabled on CUDA builds because it was broken as described in https://github.com/trilinos/Trilinos/issues/2319#issuecomment-373118441.
## How Has This Been Tested?
I tested this branch as described in https://github.com/trilinos/Trilinos/issues/2674#issuecomment-432292545. I think there are no new failures except for the test `PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-3` in the 'hansen' build:
```
12) cuda-9.0-opt-Kepler37 Results:
----------------------------------
99% tests passed, 2 tests failed out of 253
Subproject Time Summary:
MueLu = 3549.34 sec*proc (96 tests)
Panzer = 9395.80 sec*proc (157 tests)
Total Test time (real) = 1656.76 sec
The following tests FAILED:
227 - PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-3 (Failed)
228 - PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4 (Timeout)
Errors while running CTest
```
I hope we don't get bit by this too bad.
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3614Fix path to cmake on 'shiller' (TRIL-222)2018-10-12T16:23:51ZJames WillenbringFix path to cmake on 'shiller' (TRIL-222)*Created by: bartlettroscoe*
CC: @jmgate, @fryeguy52
## Description
Looks like 'shiller' does not mount /home at /ascldap/users/ while 'hansen'
does (where we run ATDM Trilinos builds). This is fixed by switching to
/home/rabartl/, which works on both 'hansen' and 'shiller'.
## Motivation and Context
Builds on 'shiller' can't find CMake 3.11.2 without this change. Needed for EMPIRE adoption as per https://software-sandbox.sandia.gov/jira/browse/TRIL-222.
## How Has This Been Tested?
On this branch on 'shiller' I did:
```
$ . cmake/std/atdm/load-env.sh gnu-opt-openmp
Hostname 'shiller01' matches known ATDM host 'shiller' and system 'shiller'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for ATDM_CONFIG_JOB_NAME='gnu-opt-openmp'
No KOKKOS_ARCH specified so using system default
Using hansen/shiller compiler stack GNU to build RELEASE code with Kokkos node type OPENMP and KOKKOS_ARCH=HSW
$ which cmake
~/install/hansen-shiller/cmake-3.11.2/bin/cmake
$ cmake --version
cmake version 3.11.2
CMake suite maintained and supported by Kitware (kitware.com/cmake).
```
and on 'hansen' I did:
```
$ . cmake/std/atdm/load-env.sh gnu-opt-openmp
Hostname 'hansen01' matches known ATDM host 'hansen' and system 'shiller'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for ATDM_CONFIG_JOB_NAME='gnu-opt-openmp'
No KOKKOS_ARCH specified so using system default
Using hansen/shiller compiler stack GNU to build RELEASE code with Kokkos node type OPENMP and KOKKOS_ARCH=HSW
$ which cmake
/home/rabartl/install/hansen-shiller/cmake-3.11.2/bin/cmake
$ cmake --version
cmake version 3.11.2
CMake suite maintained and supported by Kitware (kitware.com/cmake).
```
Looks good.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3559Atdm waterman disables2018-10-08T19:39:31ZJames WillenbringAtdm waterman disables*Created by: fryeguy52*
@trilinos/framework @bartlettroscoe
## Description
This disables the test `PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3` on the atdm waterman builds:
* Trilinos-atdm-waterman-cuda-9.2-debug
* Trilinos-atdm-waterman-cuda-9.2-opt
see #2751
This also disables a unit test that is timing out in the test `KokkosKernels_sparse_serial_MPI_1` on the atdm waterman builds:
* Trilinos-atdm-waterman-cuda-9.2-debug
* Trilinos-atdm-waterman-gnu-debug-openmp
see #3438 and #2964
`KokkosKernels_sparse_serial_MPI_1` no longer timed out on these builds when I tested it:
Trilinos-atdm-waterman-cuda-9.2-debug
```
7/8 Test #6: KokkosKernels_sparse_serial_MPI_1 ... Passed 230.71 sec
```
Trilinos-atdm-waterman-gnu-debug-openmp
```
6/7 Test #5: KokkosKernels_sparse_serial_MPI_1 ... Passed 217.19 sec
```
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3522Exclude EpetraExt/HDF5 test if EpetraExt_ENABLE_HDF5 is not true (#3484)2018-09-27T16:17:29ZJames WillenbringExclude EpetraExt/HDF5 test if EpetraExt_ENABLE_HDF5 is not true (#3484)*Created by: bartlettroscoe*
@trilinos/epetraext
## Description
The current ATDM Trilinos build sets EpetraExt_ENABLE_HDF5=OFF even though
HDF5 is enabled. This test was being unconditionally added incorrectly. With
EpetraExt_ENABLE_HDF5=OFF, TriBITS does not add the HDF5 include directories
and therefore the test does not build.
This commit corrects the problem by only including this test if
EpetraExt_ENABLE_HDF5 is TRUE.
## Motivation and Context
This breaks the build in the ATDM-based build `Trilinos-atdm-white-ride-cuda-9.2-debug-pt` (see #3484) that we are trying to get cleaned up for a Trilinos PR CUDA build (see #2464).
## How Has This Been Tested?
I tested this locally on 'ride' using:
```
$ cd ~/Trilinos.base/BUILDS/RIDE/CUDA/ATDM_CUDA_OPT/
$ . ~/Trilinos.base/Trilinos/cmake/std/atdm/load-env.sh cuda-opt
Hostname 'ride6' matches known ATDM host 'ride' and system 'ride'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for JOB_NAME='cuda-opt'
No KOKKOS_ARCH specified so using system default
Using white/ride compiler stack CUDA to build RELEASE code with Kokkos node type CUDA and KOKKOS_ARCH=Power8,Kepler37
ModuleCmd_Switch.c(215):ERROR:105: Unable to locate a modulefile for 'openmpi/2.1.2/gcc/7.2.0/cuda/9.2.88'
$ rm -r CMake*
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnvAllPtPackages.cmake \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_ENABLE_EpetraExt=ON \
"$@" \
$HOME/Trilinos.base/Trilinos
$ make NP=16 &> make.out
$ bsub -x -Is -q rhel7F -n 16 ctest -j16 &> ctest.out
```
this built and gave the test results:
```
100% tests passed, 0 tests failed out of 10
Subproject Time Summary:
EpetraExt = 46.19 sec*proc (10 tests)
Total Test time (real) = 3.47 sec
```
The PR builds will test to see that this test runs when `EpetraExt_ENABLE_HDF5` is **not** set to `OFF`. (Therefore, I will not set `AT: AUTOMERGE` because I want to inspect the test results before I merge.)
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
- [x] All new and existing tests passed.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3516Disable PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4 on water...2018-09-26T22:22:15ZJames WillenbringDisable PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4 on waterman CUDA (#3340)*Created by: bartlettroscoe*
@trilinos/panzer
## Description
This test runs out of CUDA memory even when run entirely on its own on 'waterman'.
We are disabling it for now so the Panzer developers can debug it
offline.
## Motivation and Context
There is little value in running tests every day that we know are going to fail (see #3340). This test can be fixed offline.
I was given the okay to disable this test in https://github.com/trilinos/Trilinos/issues/3340#issuecomment-424699362.
## How Has This Been Tested?
On 'waterman' I ran:
```
$ ./checkin-test-atdm.sh all --enable-packages=Panzer --configure
```
which returned:
```
PASSED (NOT READY TO PUSH): Trilinos: waterman11
Wed Sep 26 13:02:59 MDT 2018
Enabled Packages: Panzer
Build test results:
-------------------
0) MPI_RELEASE_DEBUG_SHARED_PT_OPENMP => Test case MPI_RELEASE_DEBUG_SHARED_PT_OPENMP was not run! => Does not affect push readiness! (-1.00 min)
1) gnu-debug-openmp-Power9-Volta70 => passed: configure-only passed => Not ready to push! (1.05 min)
2) gnu-opt-openmp-Power9-Volta70 => passed: configure-only passed => Not ready to push! (1.04 min)
3) cuda-debug-Power9-Volta70 => passed: configure-only passed => Not ready to push! (1.79 min)
4) cuda-opt-Power9-Volta70 => passed: configure-only passed => Not ready to push! (1.83 min)
```
I then verified that this test was correctly disabled in the correct builds with:
```
$ find . -maxdepth 2 -name configure.out \
-exec grep -nH PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4_DISABLE {} \; \
| grep "NOT added"
./cuda-debug-Power9-Volta70/configure.out:851:-- PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4: NOT added test because PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4_DISABLE='ON'!
./cuda-opt-Power9-Volta70/configure.out:849:-- PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4: NOT added test because PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4_DISABLE='ON'!
```
The fact that it did not find this in the two GNU builds shows that the test is not disabled in those.
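The same check can be phrased so that every build directory reports its status explicitly, rather than inferring "not disabled" from missing output. This is a minimal sketch, assuming the `NOT added test because <name>_DISABLE=` message format shown in the output above; the function name `report_test_disabled` is hypothetical.

```shell
# Sketch: report, per build directory, whether configure.out records the
# given test as disabled. The matched message text is copied from the
# configure output above; report_test_disabled is a hypothetical helper.
report_test_disabled() {
  local test_name="$1"; shift
  local build_dir
  for build_dir in "$@"; do
    if [ ! -f "${build_dir}/configure.out" ]; then
      echo "${build_dir}: no configure.out"
      continue
    fi
    # -F: match the message as a fixed string, not as a regex.
    if grep -qF -- "${test_name}: NOT added test because ${test_name}_DISABLE=" \
         "${build_dir}/configure.out"; then
      echo "${build_dir}: disabled"
    else
      echo "${build_dir}: enabled"
    fi
  done
}
```

Called with the four build directory names from the run above, this would be expected to print `disabled` for the two CUDA builds and `enabled` for the two GNU builds.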
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3506Change parallel build level on white/ride from 128 to 64 (#2464)2018-09-26T01:22:01ZJames WillenbringChange parallel build level on white/ride from 128 to 64 (#2464)*Created by: bartlettroscoe*
From talking with Si Hammond, he suggests that you will not get any real
speedup going over 64 build processes on 'white' and 'ride', and this might help
to reduce the random 'bsub' crashes during building on white/ride (see #2464).
I did not test this at all but this change is so simple and basic I think it would be very hard for this to break anything on white/ride.
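The change itself only replaces one number with another. As an illustration of the idea, here is a minimal sketch of how a script could enforce such a ceiling on the parallel build level; `cap_build_jobs` is a hypothetical helper and is not part of the ATDM configuration scripts.

```shell
# Sketch: cap a requested parallel build level at a maximum (default 64,
# the value suggested in the text). cap_build_jobs is a hypothetical helper.
cap_build_jobs() {
  local requested="$1"
  local max_jobs="${2:-64}"
  if [ "${requested}" -gt "${max_jobs}" ]; then
    echo "${max_jobs}"
  else
    echo "${requested}"
  fi
}
```

With this helper, a request for 128 build processes would be reduced to 64, while a request for 16 would pass through unchanged.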
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3481ATDM - rename the gnu waterman tweaks files2018-09-21T14:29:28ZJames WillenbringATDM - rename the gnu waterman tweaks files*Created by: fryeguy52*
@trilinos/framework @bartlettroscoe
## Description
There is a "tweaks" file for each ATDM build that needs to have a specific name. This fixes the names of those files for the gnu waterman builds.
## Related Issues
#2474
#3454
## How Has This Been Tested?
```
-- Reading in configuration options from cmake/std/atdm/ATDMDevEnv.cmake ...
-- ATDM_JOB_NAME_KEYS_STR='GNU-RELEASE-OPENMP-POWER9'
-- ATDM_TWEAKS_FILES='/ascldap/users/jfrye/Trilinos/cmake/std/atdm/waterman/tweaks/GNU-RELEASE-OPENMP-POWER9.cmake'
-- Including ATDM build treaks file /ascldap/users/jfrye/Trilinos/cmake/std/atdm/waterman/tweaks/GNU-RELEASE-OPENMP-POWER9.cmake ...
-- Setting default Piro_MatrixFreeDecorator_UnitTests_MPI_4_DISABLE=ON
```
```
Test project /ascldap/users/jfrye/test_build
Start 1: Piro_UnitTests_MPI_1
Start 2: Piro_Epetra_MatrixFreeOperator_UnitTests_MPI_4
Start 3: Piro_EvalModel_MPI_4
Start 4: Piro_ThyraSolver_MPI_4
Start 5: Piro_AnalysisDriver_MPI_4
Start 6: Piro_SecondOrderIntegrator_MPI_1
1/11 Test #2: Piro_Epetra_MatrixFreeOperator_UnitTests_MPI_4 ... Passed 1.42 sec
Start 7: Piro_NOXSolver_UnitTests_MPI_4
2/11 Test #3: Piro_EvalModel_MPI_4 ............................. Passed 1.42 sec
Start 8: Piro_LOCASolver_UnitTests_MPI_4
3/11 Test #6: Piro_SecondOrderIntegrator_MPI_1 ................. Passed 1.46 sec
4/11 Test #1: Piro_UnitTests_MPI_1 ............................. Passed 1.49 sec
Start 9: Piro_RythmosSolver_UnitTests_MPI_4
5/11 Test #5: Piro_AnalysisDriver_MPI_4 ........................ Passed 1.50 sec
Start 10: Piro_Epetra_RythmosSolver_UnitTests_MPI_4
6/11 Test #4: Piro_ThyraSolver_MPI_4 ........................... Passed 1.60 sec
Start 11: Piro_TempusSolver_UnitTests_MPI_4
7/11 Test #10: Piro_Epetra_RythmosSolver_UnitTests_MPI_4 ........ Passed 0.97 sec
8/11 Test #11: Piro_TempusSolver_UnitTests_MPI_4 ................ Passed 0.90 sec
9/11 Test #7: Piro_NOXSolver_UnitTests_MPI_4 ................... Passed 1.11 sec
10/11 Test #9: Piro_RythmosSolver_UnitTests_MPI_4 ............... Passed 1.10 sec
11/11 Test #8: Piro_LOCASolver_UnitTests_MPI_4 .................. Passed 1.35 sec
100% tests passed, 0 tests failed out of 11
```
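The naming requirement can be checked before configuring. This is a minimal sketch, assuming the path layout seen in the `ATDM_TWEAKS_FILES` line of the configure output above (`<Trilinos>/cmake/std/atdm/<system>/tweaks/<job-name-keys>.cmake`); both function names are hypothetical.

```shell
# Sketch: build the tweaks-file path for a system name and job-name keys
# string, following the layout in the ATDM_TWEAKS_FILES log line above,
# then check that the file exists. Function names are hypothetical.
tweaks_file_path() {
  local trilinos_dir="$1" system_name="$2" job_keys_str="$3"
  echo "${trilinos_dir}/cmake/std/atdm/${system_name}/tweaks/${job_keys_str}.cmake"
}

check_tweaks_file() {
  local path
  path="$(tweaks_file_path "$@")"
  if [ -f "${path}" ]; then
    echo "found: ${path}"
  else
    echo "missing: ${path}"
    return 1
  fi
}
```

For example, `check_tweaks_file ~/Trilinos waterman GNU-RELEASE-OPENMP-POWER9` reports whether a tweaks file with the expected name is present for that build.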
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3454Atdm waterman disables2018-09-19T14:54:06ZJames WillenbringAtdm waterman disables*Created by: fryeguy52*
@trilinos/framework, @bartlettroscoe
## Description
The following tests are being disabled in waterman builds where they are perpetually failing:
* Belos_Tpetra_PseudoBlockCG_hb_test_MPI_4
* Piro_MatrixFreeDecorator_UnitTests_MPI_4
* KokkosKernels_sparse_openmp_MPI_1
* TeuchosNumerics_LAPACK_test_MPI_1
## Motivation and Context
These tests are continually failing on waterman and are being disabled to get the waterman builds clean.
## Related Issues
* #3173
* #2466
* #2474
* #2410
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3395Disable piro test that is failing on waterman ATDM builds2018-09-06T22:05:31ZJames WillenbringDisable piro test that is failing on waterman ATDM builds*Created by: fryeguy52*
Disable the test:
Piro_MatrixFreeDecorator_UnitTests_MPI_4
On the builds:
Trilinos-atdm-waterman-gnu-opt-openmp
Trilinos-atdm-waterman-cuda-9.2-opt
@trilinos/framework, @bartlettroscoe
## Description
The test `Piro_MatrixFreeDecorator_UnitTests_MPI_4` has been failing consistently on waterman builds, and the agreed course of action is to disable the test for these two builds (see issue #2474).
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3363Switch from OpenMPI 3.1.0 to 2.1.2 on 'waterman' (TRIL-213)2018-08-28T01:33:05ZJames WillenbringSwitch from OpenMPI 3.1.0 to 2.1.2 on 'waterman' (TRIL-213)*Created by: bartlettroscoe*
CC: @fryeguy52, @mhoemmen, @kddevin, @rppawlo
## Description
Switches from OpenMPI 3.1.0 to OpenMPI 2.1.2 env on Power9 'waterman'. We were told by @nmhamster today that OpenMPI 3.1 really does not work on the Power9 (and is a known issue apparently) and to use the OpenMPI 2.1.2 env instead.
## Motivation and Context
This appears to fix a bunch of failing tests including those in #3344, #3331 and perhaps others.
## How Has This Been Tested?
On 'white' I ran:
```
$ bsub -x -Is -n 20 \
./checkin-test-atdm.sh cuda-opt-Power9-Volta70 \
--enable-packages=Kokkos,Teuchos,Zoltan2,Ifpack2,Tpetra,SEACAS,Panzer \
--local-do-all
```
and it returned:
```
99% tests passed, 2 tests failed out of 645
Subproject Time Summary:
Ifpack2 = 646.49 sec*proc (36 tests)
Kokkos = 475.50 sec*proc (27 tests)
Panzer = 7880.59 sec*proc (158 tests)
SEACAS = 24.01 sec*proc (20 tests)
Teuchos = 207.66 sec*proc (129 tests)
Tpetra = 2326.59 sec*proc (173 tests)
Zoltan2 = 1269.06 sec*proc (102 tests)
Total Test time (real) = 1888.47 sec
The following tests FAILED:
619 - PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4 (Failed)
623 - PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3 (Timeout)
```
This is worth trying on the full ATDM Trilinos build.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3236Increase KNL testing timeout from 5 to 9 hours (TRIL-196)2018-08-06T14:02:44ZJames WillenbringIncrease KNL testing timeout from 5 to 9 hours (TRIL-196)*Created by: bartlettroscoe*
@fryeguy52
## Description
The tests on the KNL build are taking an absurd amount of time, but let's see
if we can get them to complete in 9 hours! At the old timeout of 5 hours,
the debug build ran 1504 out of 1801 tests, so I am hopeful this will allow
them to complete.
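A rough check of that hope, as a minimal sketch: if the tests completed at a uniform rate (a simplifying assumption that the source does not state, and the remaining tests may well be slower), the full suite would need about 5 × 1801 / 1504 hours. The helper name `estimate_hours` is hypothetical.

```shell
# Sketch: estimate the hours needed to run all tests, assuming (as a
# simplification) that tests complete at a uniform rate.
# estimate_hours is a hypothetical helper.
estimate_hours() {
  local tests_run="$1" tests_total="$2" hours_used="$3"
  awk -v r="${tests_run}" -v t="${tests_total}" -v h="${hours_used}" \
    'BEGIN { printf "%.2f\n", h * t / r }'
}
```

Calling `estimate_hours 1504 1801 5` gives roughly 6 hours under that assumption, which leaves some margin within the new 9-hour timeout.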
This is only taking up one compute node on 'mutrino' so this is not such a
crime at this point.
This is being driven by [TRIL-196](https://software-sandbox.sandia.gov/jira/browse/TRIL-196)
## Motivation and Context
Want to see if we can get these tests to complete to see if there are errors. We can fix the runtime problem later.
## How Has This Been Tested?
I did not test this, but the changes are simple and super safe and can only impact these two KNL builds (which are just "Specialized" builds currently).
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/31003069 white cuda debug stokhos build error2018-07-13T18:12:30ZJames Willenbring3069 white cuda debug stokhos build error*Created by: bartlettroscoe*
CC: @trilinos/stokhos
## Description
The main contribution of the PR is that it fixes the build error for the stokhos_muelu lib in #3069. It also contains updated build-reference documentation on the causes (see commits).
## Motivation and Context
We need the build error #3069 to be fixed and we want to provide documentation so that other people can avoid this.
## How Has This Been Tested?
I tested this locally on `white` as described below. The full build of Stokhos passes now but there are several test failures. (But we will create new GitHub issues for those once this posts to CDash after the merge.)
<details>
<summary>
<b>DETAILED TEST RESULTS:</b> (click to expand)
</summary>
Testing on 'white':
```
$ cd ~/Trilinos.base/BUILD/WHITE/CUDA/CUDA-DEBUG/
$ . load-env.sh
Hostname 'white11' matches known ATDM host 'white' and system 'ride'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for JOB_NAME='cuda-debug'
Using white/ride compiler stack CUDA to build DEBUG code with Kokkos node type CUDA
$ rm -r CMake*
$ rm -r packages/
$ time cmake -GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnvAllPtPackages.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Stokhos=ON \
~/Trilinos.base/Trilinos \
&> configure.out
real 1m37.779s
user 0m59.169s
sys 0m17.638s
$ time make NP=32 &> make.out
real 51m55.785s
user 1320m38.264s
sys 233m46.608s
$ time bsub -x -Is -q rhel7F -n 16 ctest -j8 --timeout 600 &> ctest.out
real 10m44.521s
user 0m0.013s
sys 0m0.040s
```
This returned the result:
```
20% tests passed, 67 tests failed out of 84
Subproject Time Summary:
Stokhos = 2648.79 sec*proc (84 tests)
Total Test time (real) = 643.47 sec
The following tests FAILED:
1 - Stokhos_LegendreBasisUnitTest_MPI_1 (Failed)
2 - Stokhos_NormalizedLegendreBasisUnitTest_MPI_1 (Failed)
3 - Stokhos_HermiteBasisUnitTest_MPI_1 (Failed)
4 - Stokhos_NormalizedHermiteBasisUnitTest_MPI_1 (Failed)
5 - Stokhos_JacobiBasisUnitTest_MPI_1 (Failed)
6 - Stokhos_QuadExpansionUnitTest_MPI_1 (Failed)
7 - Stokhos_QuadraturePseudoSpectralExpansionUnitTest_MPI_1 (Failed)
8 - Stokhos_TensorProductPseudoSpectralExpansionUnitTest_MPI_1 (Failed)
9 - Stokhos_SmolyakPseudoSpectralExpansionUnitTest_MPI_1 (Failed)
10 - Stokhos_AlgebraicExpansionUnitTest_MPI_1 (Failed)
12 - Stokhos_DivisionOperatorUnitTest_MPI_1 (Failed)
13 - Stokhos_StieltjesUnitTest_MPI_1 (Failed)
14 - Stokhos_LanczosUnitTest_MPI_1 (Failed)
15 - Stokhos_GramSchmidtUnitTest_MPI_1 (Failed)
16 - Stokhos_Sparse3TensorUnitTest_MPI_1 (Failed)
17 - Stokhos_ExponentialRandomFieldUnitTest_MPI_1 (Failed)
18 - Stokhos_LogNormalUnitTest_MPI_1 (Failed)
20 - Stokhos_ProductBasisUtilsUnitTest_MPI_1 (Failed)
21 - Stokhos_TensorProductBasisUnitTest_MPI_1 (Failed)
22 - Stokhos_TotalOrderBasisUnitTest_MPI_1 (Failed)
23 - Stokhos_SmolyakBasisUnitTest_MPI_1 (Failed)
24 - Stokhos_TensorProductPseudoSpectralOperatorUnitTest_MPI_1 (Failed)
25 - Stokhos_LexicographicTreeBasisUnitTest_MPI_1 (Failed)
26 - Stokhos_SparseGridQuadratureUnitTest_MPI_1 (Failed)
27 - Stokhos_MatrixFreeOperatorUnitTest_MPI_1 (Failed)
28 - Stokhos_InterlacedOpUnitTest_MPI_2 (Failed)
29 - Stokhos_BasisInteractionGraphUnitTest_MPI_1 (Failed)
30 - Stokhos_AdaptivityToolsUnitTest_MPI_1 (Failed)
32 - Stokhos_InterlacedMapUnitTest_MPI_2 (Failed)
35 - Stokhos_SacadoPCEUnitTest_MPI_1 (Failed)
36 - Stokhos_SacadoETPCEUnitTest_MPI_1 (Failed)
37 - Stokhos_SacadoPCESerializationTests_MPI_1 (Failed)
38 - Stokhos_SacadoPCECommTests_MPI_1 (Failed)
39 - Stokhos_SacadoUQPCEUnitTest_MPI_1 (Failed)
40 - Stokhos_SacadoUQPCESerializationTests_MPI_1 (Failed)
41 - Stokhos_SacadoUQPCECommTests_MPI_1 (Failed)
42 - Stokhos_KokkosViewUQPCEUnitTest_Serial_MPI_1 (Failed)
43 - Stokhos_KokkosViewUQPCEUnitTest_Cuda_MPI_1 (Failed)
44 - Stokhos_KokkosCrsMatrixUQPCEUnitTest_Serial_MPI_1 (Failed)
45 - Stokhos_KokkosCrsMatrixUQPCEUnitTest_Cuda_MPI_1 (Failed)
46 - Stokhos_TpetraCrsMatrixUQPCEUnitTest_Serial_MPI_4 (Failed)
47 - Stokhos_TpetraCrsMatrixUQPCEUnitTest_Cuda_MPI_4 (Failed)
59 - Stokhos_TpetraCrsMatrixMPVectorUnitTest_Cuda_MPI_4 (Timeout)
60 - Stokhos_KokkosArrayKernelsUnitTest_Serial_MPI_1 (Failed)
61 - Stokhos_KokkosArrayKernelsUnitTest_Cuda_MPI_1 (Failed)
63 - Stokhos_hermite_example_MPI_1 (Failed)
64 - Stokhos_Linear2D_Diffusion_PCE_Example_MPI_2 (Failed)
65 - Stokhos_Linear2D_Diffusion_PCE_Interlaced_Example_MPI_2 (Failed)
66 - Stokhos_nox_example_MPI_1 (Failed)
67 - Stokhos_Linear2D_Diffusion_PCE_NOX_Example_MPI_2 (Failed)
68 - Stokhos_Linear2D_Diffusion_GMRES_Mean_Based_MPI_2 (Failed)
69 - Stokhos_Linear2D_Diffusion_GMRES_AGS_MPI_2 (Failed)
70 - Stokhos_Linear2D_Diffusion_CG_AGS_MPI_2 (Failed)
71 - Stokhos_Linear2D_Diffusion_GMRES_GS_MPI_2 (Failed)
72 - Stokhos_Linear2D_Diffusion_GMRES_AJ_MPI_2 (Failed)
73 - Stokhos_Linear2D_Diffusion_GMRES_KP_MPI_2 (Failed)
74 - Stokhos_Linear2D_Diffusion_GS_MPI_2 (Failed)
75 - Stokhos_Linear2D_Diffusion_JA_MPI_2 (Failed)
76 - Stokhos_Linear2D_Diffusion_LN_MPI_2 (Failed)
77 - Stokhos_Linear2D_Diffusion_GSLN_MPI_2 (Failed)
78 - Stokhos_Linear2D_Diffusion_GMRES_FA_MPI_2 (Failed)
79 - Stokhos_Linear2D_Diffusion_GMRES_KL_MPI_2 (Failed)
80 - Stokhos_Linear2D_Diffusion_GMRES_KLR_MPI_2 (Failed)
81 - Stokhos_uq_handbook_nonlinear_sg_example_MPI_1 (Failed)
82 - Stokhos_sacado_example_MPI_1 (Failed)
83 - Stokhos_division_example_MPI_1 (Failed)
84 - Stokhos_sacado_ensemble_example_MPI_1 (Failed)
Errors while running CTest
```
So that passed the build, but there are a bunch of failing Stokhos tests. We will deal with that in a new issue.
</details>
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
- [x] I have updated the documentation accordingly.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3081Fix ROL CUDA build failure (#3072)2018-07-10T01:37:34ZJames WillenbringFix ROL CUDA build failure (#3072)*Created by: bartlettroscoe*
CC: @trilinos/rol, @dridzal (ROL package lead)
## Description
Fixes the ROL CUDA build failure described in #3072. The fix was trivial (not sure why other compilers did not catch this or at least provide a warning).
I also included a commit to add debug print info for `nvcc_wrapper` (see kokkos/nvcc_wrapper#19 and kokkos/nvcc_wrapper#20).
## Motivation and Context
ROL was not building for a CUDA build (see #3072). We would like an auto PR CUDA build that includes all Primary Tested packages, and ROL is a PT package (see #2464). Also, SPARC uses ROL, and adding support for SPARC means testing ROL on all of the platforms where SPARC uses ROL; CUDA is an important build on many of those platforms.
## How Has This Been Tested?
I tested this on 'white' with:
```
$ cd ~/Trilinos.base/BUILD/WHITE/CUDA/CUDA-DEBUG/
$ source ~/Trilinos.base/Trilinos/cmake/std/atdm/load-env.sh cuda-debug
Hostname 'white11' matches known ATDM host 'white' and system 'ride'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for JOB_NAME='cuda-debug'
Using white/ride compiler stack CUDA to build DEBUG code with Kokkos node type CUDA
$ time cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnvAllPtPackages.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_ROL=ON \
~/Trilinos.base/Trilinos \
&> configure.out
real 1m43.759s
user 0m58.268s
sys 0m17.081s
$ time make NP=16 &> make.out
real 54m28.573s
user 696m12.668s
sys 80m53.877s
$ time bsub -x -Is -q rhel7F -n 16 ctest -j16 --timeout 600 &> ctest.out
real 14m51.969s
user 0m0.032s
sys 0m0.035s
```
and the build passed and the test results were:
```
90% tests passed, 16 tests failed out of 156
Subproject Time Summary:
ROL = 11219.28 sec*proc (156 tests)
Total Test time (real) = 890.82 sec
The following tests FAILED:
32 - ROL_test_elementwise_TpetraMultiVector_MPI_4 (Failed)
130 - ROL_example_PDE-OPT_0ld_poisson_example_01_MPI_4 (Failed)
131 - ROL_example_PDE-OPT_0ld_stefan-boltzmann_example_03_MPI_4 (Failed)
134 - ROL_example_PDE-OPT_0ld_adv-diff-react_example_01_MPI_4 (Failed)
135 - ROL_example_PDE-OPT_0ld_adv-diff-react_example_02_MPI_4 (Timeout)
136 - ROL_example_PDE-OPT_0ld_stoch-adv-diff_example_01_MPI_4 (Timeout)
137 - ROL_example_PDE-OPT_poisson_example_01_MPI_4 (Failed)
139 - ROL_example_PDE-OPT_stefan-boltzmann_example_01_MPI_4 (Failed)
141 - ROL_example_PDE-OPT_stefan-boltzmann_example_03_MPI_4 (Failed)
142 - ROL_example_PDE-OPT_adv-diff-react_example_02_MPI_4 (Failed)
143 - ROL_example_PDE-OPT_navier-stokes_example_01_MPI_4 (Timeout)
144 - ROL_example_PDE-OPT_navier-stokes_example_02_MPI_4 (Failed)
145 - ROL_example_PDE-OPT_obstacle_example_01_MPI_4 (Failed)
150 - ROL_example_PDE-OPT_nonlinear-elliptic_example_01_MPI_4 (Failed)
151 - ROL_example_PDE-OPT_nonlinear-elliptic_example_02_MPI_4 (Failed)
152 - ROL_example_PDE-OPT_topo-opt_poisson_example_01_MPI_4 (Failed)
Errors while running CTest
```
Those are the same 16 tests already shown failing in the build `Trilinos-atdm-white-ride-cuda-debug-pt-all-at-once`, as shown, for example, [here](https://testing-vm.sandia.gov/cdash/viewTest.php?onlyfailed&buildid=3698659). (I will create a new GitHub issue for those failing tests once this PR is merged.)
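The comparison against the already-failing set can be sketched in shell. This is only an illustration: the helper name `failed_tests` and the two sample inputs are assumptions made for the example (the lines are copied from the summary above), and the actual check was done by eye against CDash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a CTest summary on stdin and print the names of
# the tests marked (Failed) or (Timeout), sorted so two lists can be compared.
failed_tests() {
  sed -nE 's/^[[:space:]]*[0-9]+ - ([^ ]+) \((Failed|Timeout)\).*$/\1/p' | sort
}

# Sample data only: two lines from the local run and the same two tests,
# in a different order, standing in for the list of known failures.
local_out='  32 - ROL_test_elementwise_TpetraMultiVector_MPI_4 (Failed)
 135 - ROL_example_PDE-OPT_0ld_adv-diff-react_example_02_MPI_4 (Timeout)'
baseline_out=' 135 - ROL_example_PDE-OPT_0ld_adv-diff-react_example_02_MPI_4 (Timeout)
  32 - ROL_test_elementwise_TpetraMultiVector_MPI_4 (Failed)'

local_failed=$(printf '%s\n' "$local_out" | failed_tests)
baseline_failed=$(printf '%s\n' "$baseline_out" | failed_tests)

# The sorted name lists match exactly when both runs fail the same tests.
if [ "$local_failed" = "$baseline_failed" ]; then
  verdict="same failing tests"
else
  verdict="failing tests differ"
fi
echo "$verdict"
```

With a real run, the first input would come from `ctest.out` and the second from a saved list of known failures.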
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/3011Disable several failing tests on different ATDM platforms and promote full bu...2018-06-25T14:07:04ZJames WillenbringDisable several failing tests on different ATDM platforms and promote full builds on 'serrano' and 'mutrino' to "ATDM" CDash Track/Group*Created by: bartlettroscoe*
CC: @fryeguy52
## Description
Contains commits to disable some long-failing (or randomly failing) tests on several platforms. See the commits themselves for details.
## Motivation and Context
Need to clean up existing promoted builds and clean up builds we are trying to promote.
## How Has This Been Tested?
I ran configures on all of the impacted machines for all of the impacted builds and verified that the expected tests are now being disabled.
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/2946Atdm mutrino use sbatch2018-06-15T12:56:05ZJames WillenbringAtdm mutrino use sbatch*Created by: fryeguy52*
## Description
This does a few small things for the ATDM configuration:
1. Switch the mutrino jobs to use sbatch. This change seems to fix several of the failing SEACAS tests in #2815.
2. Promote the full configuration jobs on `chama` to the ATDM CDash Group. Both are passing all tests today, as shown [here](https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&date=&filtercombine=and&filtercount=3&showfilters=1&filtercombine=and&field1=buildname&compare1=63&value1=Trilinos-atdm-chama&field2=buildname&compare2=64&value2=panzer&field3=buildstarttime&compare3=83&value3=Jun%2014%2C%202018).
3. Disable the test `Piro_MatrixFreeDecorator_UnitTests_MPI_4_DISABLE` on both `serrano` and `mutrino` because it is failing randomly, as described in #2474.
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/2860Some small 'serrano' updates for the ATDM Trilinos configuration2018-06-01T12:21:06ZJames WillenbringSome small 'serrano' updates for the ATDM Trilinos configuration*Created by: bartlettroscoe*
CC: @fryeguy52
The commits are self-explanatory.
I tested that the exported function `atdm_run_script_on_compute_node` can be used in child process scripts.
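For readers unfamiliar with exported shell functions, the mechanism being tested can be sketched as follows. This is a minimal illustration under assumptions: `demo_run_script_on_compute_node` and `my-build.sh` are hypothetical stand-ins, not the real `atdm_run_script_on_compute_node`, and the sketch relies on bash (`export -f` is not available in plain POSIX `sh`).

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for illustration only; the real
# atdm_run_script_on_compute_node is defined by the ATDM environment scripts.
demo_run_script_on_compute_node() {
  echo "would run on compute node: $1"
}

# export -f puts the function definition into the environment, so child
# bash processes inherit it just as they inherit exported variables.
export -f demo_run_script_on_compute_node

# A child bash process (standing in for a child script) can now call it.
child_output=$(bash -c 'demo_run_script_on_compute_node my-build.sh')
echo "$child_output"
```

Without the `export -f` line, the child process would report the function as a missing command.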
Initial cleanup of new ATDM builds of Trilinoshttps://gitlab.osti.gov/jmwille/Trilinos/-/merge_requests/2824Tril 198 cmake 3.11.2 ninja 1.8.22018-05-25T17:34:40ZJames WillenbringTril 198 cmake 3.11.2 ninja 1.8.2*Created by: bartlettroscoe*
CC: @fryeguy52
## Description
Updated ATDM Trilinos builds on 'white' and 'ride' to use manually installed CMake 3.11.2 and Ninja 1.8.2 (which supports CMake Fortran builds). I also did some refactoring to remove duplication. See the commit log messages and diffs. There are not many changes here.
## Motivation and Context
The hope is that this will address several issues:
* The ninja module load failures in the Jenkins builds on 'ride' (see [TRIL-208](https://software-sandbox.sandia.gov/jira/browse/TRIL-208)).
* Crashing of 'bsub' on 'white' and 'ride' (see [TRIL-198](https://software-sandbox.sandia.gov/jira/browse/TRIL-198)). (Hoping the new libuv implementation of the ctest test job runner may magically fix the problems with 'bsub' crashing while running tests with `ctest -S`.)
## How Has This Been Tested?
I did quite a bit of manual testing for these changes. I think this is pretty solid.
* Do local testing on 'white':
- checkin-test-atdm.sh [Done]
- Jenkins driver [Done]
* Set up experimental Jenkins job 'Trilinos-atdm-white-ride-cuda-opt' on jenkins-son.sandia.gov that runs a build on branch tril-198-cmake-3.11.2-ninja-1.8.2 of my fork. => See [Jenkins log](https://jenkins-son.sandia.gov/view/Trilinos%20ATDM/job/Trilinos-atdm-white-ride-cuda-opt/9/console) and [CDash build](https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&parentid=3546221).
* Do local testing on 'ride':
- checkin-test-atdm.sh [Done]
- Jenkins driver [Done]
* Set up experimental Jenkins jobs on jenkins-srn.sandia.gov that run a build on branch tril-198-cmake-3.11.2-ninja-1.8.2. => See [Jenkins log](https://jenkins-srn.sandia.gov/view/Trilinos%20ATDM/job/Trilinos-atdm-white-ride-cuda-debug-exp/1/console) and [CDash build](https://testing-vm.sandia.gov/cdash/index.php?project=Trilinos&parentid=3546254).
* Do local testing on 'serrano' (to test change in split Jenkins driver):
- checkin-test-atdm.sh [Done]
- Jenkins driver [Done]
## Checklist
- [x] My commit messages mention the appropriate GitHub issue numbers.
- [x] All new and existing tests passed.
Initial cleanup of new ATDM builds of Trilinos