Trilinos issues
https://gitlab.osti.gov/jmwille/Trilinos/-/issues
2018-12-20T20:25:33Z

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3638
Teko_ModALPreconditioner_MPI_1 Failing in build Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt
2018-12-20T20:25:33Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/teko , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
With the merge of PR #4079 to 'develop' on 12/19/2018, this test `Teko_ModALPreconditioner_MPI_1` is disabled and shown as missing in the build `Trilinos-atdm-cee-rhel6-clang-5.0.1-openmpi-1.10.2-serial-static-opt` on 12/20/2018.
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-15&filtercount=2&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-cee-rhel6-&field2=status&compare2=62&value2=passed) the tests:
* Teko_ModALPreconditioner_MPI_1
are failing in the builds:
* Trilinos-atdm-cee-rhel6-clang-opt-serial
failing from a seg fault:
```
[ceerws1113:36105] *** Process received signal ***
[ceerws1113:36105] Signal: Segmentation fault (11)
[ceerws1113:36105] Signal code: Address not mapped (1)
[ceerws1113:36105] Failing at address: (nil)
```
## Steps to Reproduce
One should be able to reproduce this failure on any CEE LAN RHEL6 SRN machine as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the CEE LAN RHEL6 SRN machines are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#cee-rhel6-environment
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cee-rhel6-clang-opt-serial
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Teko=ON \
$TRILINOS_DIR
$ make NP=16
$ ctest -j16
```
Initial cleanup of new ATDM builds of Trilinos

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3629
Amesos2 doesn't build most of its examples
2018-10-15T22:34:18Z
James Willenbring
*Created by: mhoemmen*
@trilinos/amesos2
All but one of the Amesos2 examples in `amesos2/example/CMakeLists.txt` are commented out. That means they aren't tested and we don't even know if they build. Did somebody not know what CMake options they needed to use?

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3628
Did CMake 3.10.0 requirement break the check-in test script?
2018-10-31T23:57:31Z
James Willenbring
*Created by: mhoemmen*
@trilinos/framework @bartlettroscoe
The latest changes that require CMake 3.10.0 seem to have broken the check-in test script. I invoked the script like this:
```
.../Trilinos/checkin-test.py --ctest-timeout=400 --disable-packages=PyTrilinos,Claps,TriKota,Domi,STKSearch,Moertel,Shards --skip-case-no-email --allow-no-pull --enable-all-packages=off --default-builds= --extra-builds=MPI_DEBUG_EX --enable-packages=TpetraCore,Zoltan2,Amesos2 --configure
```
with the following modules loaded:
```
1) sems-env 3) sems-cmake/3.12.2 5) sems-openmpi/1.10.1 7) sems-boost/1.59.0/base 9) sems-hdf5/1.8.12/parallel 11) sems-zlib/1.2.8/base
2) kokkos-env 4) sems-gcc/4.9.3 6) sems-python/2.7.9 8) sems-superlu/4.3/base 10) sems-netcdf/4.4.1/exo_parallel 12) sems-parmetis/4.0.3/parallel
```
I get the following output:
```
...
B) Do the configuration with CMake (MPI_DEBUG_EX) ...
Running: rm CMakeCache.txt
Running: rm -rf CMakeFiles
Running: ./do-configure
Writing console output to file configure.out ...
Runtime for command = 0.302733 minutes
Configure failed returning 1!
Traceback (most recent call last):
File "/scratch/prj/Trilinos/Trilinos/cmake/tribits/ci_support/CheckinTest.py", line 1563, in runBuildTestCase
raise Exception("Configure failed!")
Exception: Configure failed!
E) Analyze the overall results and send email notification (MPI_DEBUG_EX) ...
E.1) Determine what passed and failed ...
The pull step was not performed!
The configure FAILED!
```
Should I consider the check-in test script dead? It was a useful tool & I'm sad to see it go.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3622
Trilinos::Details::LinearSolver: Add ability to get final residual norm/vector on request
2018-10-15T05:01:39Z
James Willenbring
*Created by: mhoemmen*
@trilinos/teuchos @trilinos/tpetra @vbrunini
This makes sense if the solver already computes the final residual norm/vector, as it would avoid users needing to recompute it.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3618
Is NOX buildable with python?
2019-02-28T16:53:05Z
James Willenbring
*Created by: VictorEijkhout*
```
/admin/rpms/BUILD/trilinos-git/packages/PyTrilinos/src/NOX.__init__.i:130: Error: Unable to find 'NOX_Version.H'
packages/PyTrilinos/src/CMakeFiles/PyTrilinos_NOX___init__.dir/build.make:64: recipe for target 'packages/PyTrilinos/src/NOX.__init__PYTHON_wrap.cpp' failed
make[2]: *** [packages/PyTrilinos/src/NOX.__init__PYTHON_wrap.cpp] Error 1
[trilinos.log.gz](https://github.com/trilinos/Trilinos/files/2474856/trilinos.log.gz)
```

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3596
Amesos2: Build warnings with all Scalar types enabled, Node=Serial only
2018-10-10T22:31:59Z
James Willenbring
*Created by: mhoemmen*
@trilinos/amesos2
When I enable all four supported Scalar types, I get build warnings that look like this:
```
.../Trilinos/packages/xpetra/src/MultiVector/Xpetra_EpetraMultiVector.cpp:123:16: warning: explicit instantiation of 'EpetraMultiVectorT<int,
Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >' that occurs after an explicit specialization has no effect [-Winstantiation-after-specialization]
template class EpetraMultiVectorT<int, Kokkos::Compat::KokkosSerialWrapperNode >;
^
.../Trilinos/packages/xpetra/src/MultiVector/Xpetra_EpetraMultiVector.hpp:340:9: note: previous template specialization is here
class EpetraMultiVectorT<int, EpetraNode>
^
.../Trilinos/packages/xpetra/src/MultiVector/Xpetra_EpetraMultiVector.cpp:167:16: warning: explicit instantiation of 'EpetraMultiVectorT<long long,
Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >' that occurs after an explicit specialization has no effect [-Winstantiation-after-specialization]
template class EpetraMultiVectorT<long long, Kokkos::Compat::KokkosSerialWrapperNode >;
^
.../Trilinos/packages/xpetra/src/MultiVector/Xpetra_EpetraMultiVector.hpp:754:9: note: previous template specialization is here
class EpetraMultiVectorT<long long, EpetraNode>
^
2 warnings generated.
```

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3585
Test Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 appears to be randomly failing in many builds including CI, PR, and ATDM builds
2019-04-02T18:21:50Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/framework, @trilinos/anasazi, @srajama1 (Trilinos Linear Solver Product Area Lead)
## Next Action Status
PR #4052 merged to 'develop' on 12/18/2018 but still failing after that. Next: Try to fix again?
## Description
It would seem that the test `Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4` is very occasionally randomly failing in various builds. As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-10-09&filtercount=4&showfilters=1&filtercombine=and&field1=testname&compare1=61&value1=Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4&field2=status&compare2=61&value2=failed&field3=details&compare3=64&value3=timeout&field4=buildstarttime&compare4=83&value4=2018-07-01), this test failed 10 times since 7/1/2018 in the builds:
* `Linux-GCC-4.8.4-MPI_RELEASE_DEBUG_SHARED_PT_OPENMP_CI` (post-push CI build): 1 time (today)
* `PR-XXXX-test-Trilinos_pullrequest_gcc_4.9.3-YYYY` (standard PR build): 4 times
* `PR-XXXX-test-Trilinos_pullrequest_gcc_4.8.4-YYYY` (standard PR build): 1 time
* `Trilinos-atdm-chama-intel-debug-openmp` (standard ATDM build): 1 time
* `Trilinos-atdm-rhel6-gnu-opt-openmp` (standard ATDM build): 2 times
* `Trilinos-atdm-waterman-cuda-9.2-debug` (standard ATDM build): 1 time
Each of these 10 failures in the last 3 months, such as the CI failure today shown [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=56264374&build=4031303), shows output like:
```
projectAndNormalizeGen() returned rank 5
|| <S,S> - I || after : 2.65912e-11
1|| S_in - X1*C1 - X2*C2 - S_out*B || : 1.70776e-09
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv tolerance exceeded! test failed!
```
The location of these failures seems to change in this test but all of the failures appear to be "tolerance exceeded! test failed!"
Is there some type of non-deterministic behavior in this test or in the underlying Anasazi code that allows for these types of random failures?
## Steps to Reproduce
Given that this test seems to be failing randomly only very occasionally, this might be hard to reproduce locally. But given that this has failed in the post-push GCC 4.8.4 CI build and the GCC 4.9.3 PR build one might be able to use one of those.
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3580
Tpetra::Distributor: Fix "slow path" so we can use MPI_Isend
2019-02-17T23:34:06Z
James Willenbring
*Created by: mhoemmen*
@trilinos/tpetra @jjellio @csiefer2
Fix the "slow path" of `Distributor::doPosts`, so we can use nonblocking sends (`MPI_Isend`). The "slow path" kicks in when the data to send are not neatly grouped in contiguous chunks per process. It permutes the data into contiguous-by-target-process-rank chunks for sending. Currently, the slow path uses the same send buffer for all the messages. This means that it cannot use nonblocking sends.
We must fix both the "three-argument" (all messages have the same size) and "four-argument" (different messages may have different sizes) overloads of `doPosts`, and both the `Teuchos::ArrayRCP` and `Kokkos::View` versions of each.
## Motivation and Context
This is part of the overall effort to improve MPI+CUDA performance and make Tpetra's boundary exchange and sparse matrix-vector multiply communication nonblocking.
## Definition of Done
- [x] Fix 3-argument `Teuchos::ArrayRCP` overload of `doPosts`
- [x] Fix 3-argument `Kokkos::View` overload of `doPosts`
- [x] Fix 4-argument `Teuchos::ArrayRCP` overload of `doPosts`
- [x] Fix 4-argument `Kokkos::View` overload of `doPosts`
## Related Issues
* Part of #383

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3574
Ifpack2 + Kokkos + complex float : compilation error
2019-03-08T14:28:23Z
James Willenbring
*Created by: davydden*
## Expectations
Trilinos `trilinos-release-12-14-branch` builds with ETI and complex and float types
## Current Behavior
```
8051 In file included from /Users/davydden/spack/var/spack/stage/trilinos-12.14-hpmbucm6pphwdcf3p6hhqcac2cm5qts6/Trilinos/spack-build/packages/ifpack2/src/Ifpack2_BlockTriDiContainer_Serial.cpp:50:
8052 In file included from /Users/davydden/spack/var/spack/stage/trilinos-12.14-hpmbucm6pphwdcf3p6hhqcac2cm5qts6/Trilinos/packages/ifpack2/src/Ifpack2_BlockTriDiContainer_def.hpp:52:
>> 8053 /Users/davydden/spack/var/spack/stage/trilinos-12.14-hpmbucm6pphwdcf3p6hhqcac2cm5qts6/Trilinos/packages/kokkos-kernels/src/batched/KokkosBatched_Util.hpp:195:7: error: static_assert failed "KokkosKernels:: Invalid SIMD<> type."
8054 static_assert( std::is_same<T,bool>::value ||
8055 ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
8056 /Users/davydden/spack/var/spack/stage/trilinos-12.14-hpmbucm6pphwdcf3p6hhqcac2cm5qts6/Trilinos/packages/ifpack2/src/Ifpack2_BlockTriDiContainer_impl.hpp:1568:28: note: in instantiation of template class 'KokkosBatched::Experimental::SIMD<Kokkos::complex<float> >' requested here
8057 B.assign_data( &vector_values(i0+1,0,0) );
8058 ^
8059 /Users/davydden/spack/var/spack/stage/trilinos-12.14-hpmbucm6pphwdcf3p6hhqcac2cm5qts6/Trilinos/packages/ifpack2/src/Ifpack2_BlockTriDiContainer_impl.hpp:1651:9: note: in instantiation of member function 'Ifpack2::BlockTriDiContainerDetails::ExtractAndFactorizeTridiags<Tpetra::Classes::RowMatrix<std::__1::complex<float>, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::factorize' requested here
```
## Steps to Reproduce
configure and build with:
```
-DTrilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON
-DTpetra_INST_DOUBLE:BOOL=ON
-DTpetra_INST_INT_LONG:BOOL=ON
-DTpetra_INST_COMPLEX_DOUBLE=ON
-DTpetra_INST_COMPLEX_FLOAT=ON
-DTpetra_INST_FLOAT=ON
-DTpetra_INST_SERIAL=ON
-DTeuchos_ENABLE_COMPLEX=ON
-DTeuchos_ENABLE_FLOAT=ON
```
## Your Environment
macOS Mojave
Apple Clang 10.0.0
gfortran 8.2.0
## Additional Information
full config/build logs:
[logs.zip](https://github.com/trilinos/Trilinos/files/2454254/logs.zip)

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3543
ROL tests failing in targeted CUDA PR build Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt
2019-04-06T00:16:37Z
James Willenbring
*Created by: bartlettroscoe*
CC: @trilinos/rol , @rppawlo (Trilinos Nonlinear Solvers Product Area Lead)
## Next Action Status
## Description
The ROL package has 66 failing tests in the build `Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt` on 'white' and 'ride' as shown [here](https://testing.sandia.gov/cdash-dev-view/viewTest.php?onlyfailed&buildid=3998251) which shows the failing tests:
* ROL_example_PDE-OPT_0ld_adv-diff-react_example_01_MPI_4
* ROL_example_PDE-OPT_0ld_adv-diff-react_example_02_MPI_4
* ROL_example_PDE-OPT_0ld_poisson_example_01_MPI_4
* ROL_example_PDE-OPT_0ld_stefan-boltzmann_example_03_MPI_4
* ROL_example_PDE-OPT_navier-stokes_example_01_MPI_4
* ROL_example_PDE-OPT_navier-stokes_example_02_MPI_4
* ROL_example_PDE-OPT_nonlinear-elliptic_example_01_MPI_4
* ROL_example_PDE-OPT_nonlinear-elliptic_example_02_MPI_4
* ROL_example_PDE-OPT_obstacle_example_01_MPI_4
* ROL_example_PDE-OPT_stefan-boltzmann_example_01_MPI_4
* ROL_example_PDE-OPT_stefan-boltzmann_example_03_MPI_4
* ROL_example_PDE-OPT_topo-opt_poisson_example_01_MPI_4
* ROL_example_tempus_example_parabolic_modeleval_MPI_1
* ROL_example_tempus_example_parabolic_thyravec_MPI_1
* ROL_test_elementwise_TpetraMultiVector_MPI_4
The first failing test `ROL_example_PDE-OPT_0ld_adv-diff-react_example_01_MPI_4` with detailed output shown [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=55725358&build=3998251) shows:
```
Total number of processors: 4
Number of nodes = 1089
Number of cells = 1024
Number of edges = 2112
Cell offsets across processors: {0, 256, 512, 768}
terminate called after throwing an instance of 'std::runtime_error'
terminate called after throwing an instance of 'std::runtime_error'
what(): cudaGetLastError() error( cudaErrorIllegalAddress): an illegal memory access was encountered /home/jenkins/white/workspace/Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Cuda/Kokkos_CudaExec.hpp:401
Traceback functionality not available
what(): cudaGetLastError() error( cudaErrorIllegalAddress): an illegal memory access was encountered /home/jenkins/white/workspace/Trilinos-atdm-white-ride-cuda-9.2-release-debug-pt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Cuda/Kokkos_CudaExec.hpp:401
Traceback functionality not available
[white27:11203] *** Process received signal ***
[white27:11204] *** Process received signal ***
[white27:11203] Signal: Aborted (6)
[white27:11203] Signal code: (-6)
[white27:11204] Signal: Aborted (6)
[white27:11204] Signal code: (-6)
[white27:11203] [ 0] [white27:11204] [ 0] [0x3fff90070478]
[white27:11203] [ 1] [0x3fffa3f00478]
...
```
The output of several other tests that I spot-checked shows errors like those above.
This is an important build because we are targeting this build on 'white' and 'ride' as a Trilinos PR testing build (see #2464 ). Also, SPARC uses ROL and as part of https://software-sandbox.sandia.gov/jira/browse/TRIL-212 we are about to update the ATDM Trilinos configuration to test ROL on many platforms (including CUDA builds) so it is critical to get these tests cleaned up for ATDM.
## Steps to reproduce
One should be able to reproduce these test failures on either 'white' or 'ride' by cloning the Trilinos git repo, checking out the 'develop' branch, creating a build directory, and then doing:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-release-debug
$ cmake \
-GNinja \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_ROL=ON \
$TRILINOS_DIR
$ make NP=16
$ bsub -x -Is -q rhel7F -n 16 ctest -j16
```
Initial cleanup of new ATDM builds of Trilinos

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3523
install gitdist with trilinos install
2018-09-27T17:02:35Z
James Willenbring
*Created by: rppawlo*
Is there any way we can get gitdist installed into the bin directory during trilinos installs? Might need a configure flag to enable/disable this capability?
@bartlettroscoe
@bathmatt

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3515
Framework: POSIX compatibility of build scripts
2018-09-26T22:37:23Z
James Willenbring
*Created by: cgcgcg*
@trilinos/framework
## Expectations
It would be nice if the build scripts would work on all shells, and not just on bash. Examples of things that seem to create trouble are comparisons with `==` instead of `=` and usage of the variable `BASH_SOURCE`.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3469
Configuration of CUDA-enabled build fails
2019-04-18T18:03:54Z
James Willenbring
*Created by: pelesh*
Configuration of a CUDA-enabled Tpetra/Kokkos build fails at the point where CMake checks for C++ compiler. The error message is:
```
Linking CXX executable cmTC_babeb
/usr/tce/packages/cmake/cmake-3.5.2/bin/cmake -E cmake_link_script
CMakeFiles/cmTC_babeb.dir/link.txt --verbose=1
/usr/tce/packages/openmpi/openmpi-2.0.0-gcc-6.1.0/bin/mpicxx
CMakeFiles/cmTC_babeb.dir/testCXXCompiler.cxx.o -o cmTC_babeb -rdynamic
nvcc fatal : Unknown option 'rdynamic,-fexceptions,-pthread'
gmake[1]: *** [cmTC_babeb] Error 1
```
It seems as if Kokkos' `nvcc_wrapper` does not insert `-Xlinker` flags correctly.
I am building from the current master branch on a system with the following configuration:
```
$ module list
Currently Loaded Modules:
1) StdEnv 2) gcc/6.1.0 3) cuda/9.1.85 4) openmpi/2.0.0 5) cmake/3.5.2
```
The same issue occurs with different openmpi, CUDA, CMake and gcc versions.
I am following Tpetra build instructions from `$Trilinos/packages/tpetra/doc/FAQ.txt`, and I set environment variables like this:
```
export OMPI_CXX=${Trilinos}/packages/kokkos/bin/nvcc_wrapper
export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
```
I use the following configuration script to set the CMake options:
```
#!/bin/bash
cmake \
-D CMAKE_C_COMPILER="mpicc" \
-D CMAKE_CXX_COMPILER="mpicxx" \
-D CMAKE_Fortran_COMPILER="mpif90" \
\
-D Trilinos_CXX11_FLAGS="-std=c++11 --expt-extended-lambda" \
\
-D Trilinos_ENABLE_OpenMP=OFF \
-D TPL_ENABLE_Pthread=OFF \
\
-D Trilinos_ENABLE_Teuchos=ON \
-D Trilinos_ENABLE_Tpetra=ON \
-D Tpetra_INST_SERIAL=ON \
-D Tpetra_INST_OPENMP=OFF \
\
-D Trilinos_ENABLE_Kokkos=ON \
-D Trilinos_ENABLE_KokkosCore=ON \
-D Kokkos_ENABLE_Serial=ON \
-D Kokkos_ENABLE_OpenMP=OFF \
-D Kokkos_ENABLE_Pthread=OFF \
-D Kokkos_ENABLE_Cuda=ON \
-D Kokkos_ENABLE_Cuda_UVM=ON \
-D Kokkos_ENABLE_Cuda_Lambda=ON \
-D Kokkos_ENABLE_Cuda_Relocatable_Device_Code=OFF \
-D TPL_ENABLE_CUDA=ON \
\
-D TPL_ENABLE_MPI=ON \
-D MPI_USE_COMPILER_WRAPPERS=ON \
\
-D Tpetra_INST_CUDA:BOOL=ON \
../Trilinos
```
Any advice on how to get past this point would be most appreciated.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3471
Tpetra: Consider adding vector element-wise division function
2018-09-20T16:07:19Z
James Willenbring
*Created by: pelesh*
Please consider adding element-wise division function for Tpetra (multi)vector:
z(i) = x(i)/y(i) for all i
This function is needed for Tpetra interface to [SUNDIALS](http://github.com/LLNL/sundials), but I guess it may be useful elsewhere.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3501
Framework: Windows build not posting to dashboard
2018-09-29T03:06:13Z
James Willenbring
*Created by: csiefer2*
Again.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3499
Anasazi tests failing in intel-18.0.2 builds on 'mutrino' and 'cee-rhel6' envs
2019-03-26T16:13:53Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/anasazi , @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe
## Next Action Status
## Description
As shown in [this query](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-09-24&filtercount=5&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=site&compare2=61&value2=mutrino&field3=status&compare3=62&value3=passed&field4=buildstarttime&compare4=83&value4=2018-09-01&field5=testname&compare5=63&value5=Anasazi) the tests:
* `Anasazi_MultiVecTraitsTest2_MPI_4`
* `Anasazi_Epetra_BKS_norestart_test_MPI_4`
are failing in the builds:
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW
* Trilinos-atdm-mutrino-intel-opt-openmp-KNL
Both of these tests started failing on 9/22/2018.
The test `Anasazi_Epetra_BKS_norestart_test_MPI_4` has also been failing in the build `Trilinos-atdm-cee-rhel6-intel-18.0.2-mpich2-3.2-serial-static-opt` for the 'cee-rhel6' env since it was first set up.
The first failure of the test `Anasazi_MultiVecTraitsTest2_MPI_4` on 9/22/2018 is shown [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=55028214&build=3963372) which shows:
```
Check B_view = CloneViewNonConst(B, ind):
ind: [0, 2, 4, 6, 8]
static_cast<size_t> (B_view->getNumVectors ()) = 5 == static_cast<size_t> (ind.size ()) = 5 : passed
norms of CloneViewNonConst(B, ind): [2.42234, 2.43667, 2.43783, 2.39508, 2.97253]
B_view_norms[j] = 2.42233923795730766e+00 == normsB1[ind[j]] = 2.42233923795730766e+00 : passed
B_view_norms[j] = 2.43666989157577962e+00 == normsB1[ind[j]] = 2.43666989157578007e+00 : FAILED ==> /lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/Trilinos/packages/anasazi/tpetra/test/MVOPTester/MultiVecTraitsTest2.cpp:573
...
[FAILED] (0.158 sec) MultiVecTraits_TpetraSetBlock4_UnitTest
Location: /lscratch1/jenkins/mutrino-slave/workspace/Trilinos-atdm-mutrino-intel-opt-openmp-HSW/SRC_AND_BUILD/Trilinos/packages/anasazi/tpetra/test/MVOPTester/MultiVecTraitsTest2.cpp:434
The following tests FAILED:
3. MultiVecTraits_TpetraSetBlock4_UnitTest ...
Total Time: 1.55 sec
Summary: total = 4, run = 4, passed = 3, failed = 1
```
The first failure of the test `Anasazi_Epetra_BKS_norestart_test_MPI_4` on 9/22/2018 is shown [here](https://testing.sandia.gov/cdash-dev-view/testDetails.php?test=55028175&build=3963372) which shows:
```
Anasazi_Epetra_BKS_norestart_test.exe: malloc.c:2392: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
Anasazi_Epetra_BKS_norestart_test.exe: malloc.c:2392: sysmalloc: Assertion `(old_top == (((mbinptr) (((char *) &((av)->bins[((1) - 1) * 2])) - __builtin_offsetof (struct malloc_chunk, fd)))) && old_size == 0) || ((unsigned long) (old_size) >= (unsigned long)((((__builtin_offsetof (struct malloc_chunk, fd_nextsize))+((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1)) & ~((2 *(sizeof(size_t)) < __alignof__ (long double) ? __alignof__ (long double) : 2 *(sizeof(size_t))) - 1))) && ((old_top)->size & 0x1) && ((unsigned long) old_end & (pagesize - 1)) == 0)' failed.
srun: error: nid00012: tasks 0,3: Segmentation fault
srun: Terminating job step 11643442.1635
srun: error: nid00012: task 1: Aborted
slurmstepd: error: *** STEP 11643442.1635 ON nid00012 CANCELLED AT 2018-09-22T07:50:20 ***
srun: error: nid00012: task 2: Aborted (core dumped)
```
New commits for this build can be seen [here](https://testing.sandia.gov/cdash-dev-view/viewNotes.php?buildid=3963370#!#note6)
## Current Status on CDash
See:
* [Non passing Anasazi tests in 'mutrino' builds last 2 days](https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=5&showfilters=1&filtercombine=and&field1=testname&compare1=65&value1=Anasazi_&field2=buildname&compare2=65&value2=Trilinos-atdm-mutrino-&field3=groupname&compare3=62&value3=Experimental&field4=status&compare4=62&value4=passed&field5=buildstarttime&compare5=83&value5=2%20days%20ago)
## Steps to Reproduce
One should be able to reproduce this failure on the machine mutrino as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system mutrino are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#mutrino
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh intel-opt-openmp-HSW
$ cmake \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Anasazi=ON \
$TRILINOS_DIR
$ make -j16
$ salloc -N 1 -p standard -J $JOB_NAME ctest -j16
```
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3496
SEACAS tests 'Not Run' on ATDM mutrino builds
2018-11-30T11:15:41Z
James Willenbring
*Created by: fryeguy52*
CC: @trilinos/seacas , @kddevin (Trilinos Data Services Product Lead), @bartlettroscoe
## Next Action Status
Tests disabled in PR #3530, merged on 9/29/2018, and these tests went missing in this build on 9/30/2018 as shown [here](https://testing.sandia.gov/cdash-dev-view/viewTest.php?buildid=3993235). Next: Fix the tests?
## Description
Several SEACAS tests are showing up as "Not Run" in the ATDM builds on mutrino. As shown [here](https://testing.sandia.gov/cdash-dev-view/queryTests.php?project=Trilinos&date=2018-09-24&filtercount=4&showfilters=1&filtercombine=and&field1=groupname&compare1=61&value1=ATDM&field2=site&compare2=61&value2=mutrino&field3=status&compare3=62&value3=passed&field4=buildstarttime&compare4=83&value4=2018-09-01) the tests:
* SEACASAprepro_aprepro_test_exodus
* SEACASIoss_exodus32_to_exodus32
* SEACASIoss_exodus32_to_exodus32_pnetcdf
* SEACASIoss_exodus32_to_exodus64
are not run in the build:
* Trilinos-atdm-mutrino-intel-opt-openmp-HSW
the test output on CDash for all 4 of these is:
```
Unable to find required file: CMND_PATH-NOTFOUND
```
## Steps to Reproduce
One should be able to reproduce this failure on the machine mutrino as described in:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md
More specifically, the commands given for the system mutrino are provided at:
* https://github.com/trilinos/Trilinos/blob/develop/cmake/std/atdm/README.md#mutrino
The exact commands to reproduce this issue should be:
```
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh intel-opt-openmp-HSW
$ cmake \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_SEACAS=ON \
$TRILINOS_DIR
$ make -j16
$ salloc -N 1 -p standard -J $JOB_NAME ctest -j16
```
Keep promoted "ATDM" builds of Trilinos clean

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3447
download for Trilinos 12.12.1 is broken?
2018-09-20T16:12:53Z
James Willenbring
*Created by: boegel*
When I try to download Trilinos 12.12.1 via https://trilinos.org/download/, I end up with a broken link.
I considered downloading from https://github.com/trilinos/Trilinos/releases instead, but the source tarball tagged there for 12.12.1 seems to be something entirely different (e.g. `CTrilinos` is not in there).
Is the broken download via the website a known problem?
Is there another way to download the same `trilinos-12.12.1-Source.tar.gz` that was available via https://trilinos.org/download?
@trilinos/package

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3495
Back-port PR #3483 (#3480 fix) to 12.14?
2018-09-25T15:34:35Z
James Willenbring
*Created by: mhoemmen*
@trilinos/muelu @Rombur requested back-porting PR #3483 (fixing #3480) to 12.14.

https://gitlab.osti.gov/jmwille/Trilinos/-/issues/3494
Epetra lessons don't show code
2018-09-27T18:05:14Z
James Willenbring
*Created by: davydden*
PackageName: Epetra
HTML pages for Epetra Lessons (e.g. https://trilinos.org/docs/dev/packages/epetra/doc/html/Epetra_Lesson02.html) do not render the code. This works as expected for Tpetra (e.g. https://trilinos.org/docs/dev/packages/tpetra/doc/html/Tpetra_Lesson01.html).
@trilinos/epetra