MueLu build failures in new ATDM Trilinos sems-rhel7+cuda+complex builds
Created by: bartlettroscoe
CC: @trilinos/muelu, @srajama1 (Trilinos Linear Solvers Product Lead), @bartlettroscoe, @fryeguy52
Next Action Status
Description
As shown in this query, MueLu has build errors in library code in the new cuda+complex builds:
Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-shared-release-debug
Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-static-release-debug
using the 'sems-rhel7' env.
The build errors shown here and here occur when building the source files ExplicitInstantiation/MueLu_TentativePFactory_kokkos.cpp
with errors like:
Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-shared-release-debug/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Kokkos_View.hpp(816): error: calling a constexpr __host__ function("std::real<double> ") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy<int, ::Kokkos::Cuda > , ::Kokkos::Cuda> ::operator () const") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
and ExplicitInstantiation/MueLu_TentativePFactory_kokkos.cpp
showing errors like:
Trilinos-atdm-sems-rhel7-cuda-9.2-Volta70-complex-shared-release-debug/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Kokkos_View.hpp(971): error: calling a constexpr __host__ function("std::complex<double> ::complex") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy<int, ::Kokkos::Cuda > , ::Kokkos::Cuda> ::operator () const") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.
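Both messages have the same shape: a constexpr __host__ function from the C++ standard library is called from device code. To see which host functions are involved across a whole build, a small filter over a saved build log may help. The following is only a sketch: the helper name `summarize_constexpr_errors` and the log file name `build.log` are made up for illustration and are not part of Trilinos or the ATDM scripts, and it assumes GNU-style `grep -o` is available.

```shell
# Hypothetical helper (not part of Trilinos): count nvcc errors of the form
#   calling a constexpr __host__ function("<name> ") from a __device__ function(...)
# grouped by the offending host function name.
# Reads the log files given as arguments, or stdin if none are given.
summarize_constexpr_errors() {
  # -o prints only the matching part, -h suppresses file name prefixes.
  grep -oh 'calling a constexpr __host__ function("[^"]*")' "$@" \
    | sed -e 's/^calling a constexpr __host__ function("//' -e 's/ *")$//' \
    | sort | uniq -c | sort -rn
}

# Example usage (assumes the build output was saved to build.log first):
#   ninja -j16 2>&1 | tee build.log
#   summarize_constexpr_errors build.log
```

Each output line is a count followed by a host function name, for example std::real<double> or std::complex<double> ::complex for the two messages quoted above.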
Current Status on CDash
The current status of these builds over the last 7 days can be seen in this query.
Steps to Reproduce
These builds are from the CEE LAN machine 'ascicgpu14' and someone with access to the CEE LAN should be able to log onto 'ascicgpu15' and reproduce these failures as described in:
More specifically, the commands given for the system 'sems-rhel7' are provided at:
The exact commands to reproduce this issue should be:
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh \
  sems-rhel7-cuda-9.2-Volta70-complex-shared-release-debug
$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
  $TRILINOS_DIR
$ ninja -j16
Some developers do not have access to the SRN CEE LAN, but these build errors can likely also be reproduced on other machines that have a CUDA build. For example, one can likely reproduce these build errors on the SON machine 'white' as described at:
using the commands:
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-9.2-complex-release-debug
$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_MueLu=ON \
  $TRILINOS_DIR
$ ninja -j16