MueLu: Compiling MueLu header files fails with CUDA support enabled in Kokkos
*Created by: masterleinad*

@trilinos/muelu

## Current Behavior

Compiling a file just containing
```
#include <MueLu.hpp>
```
fails with
```
/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed

/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed

/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(2461): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelReduce< ::Kokkos::Impl::CudaFunctorAdapter< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda > , unsigned long, void> , ::Kokkos::RangePolicy< ::Kokkos::Cuda > , ::Kokkos::InvalidType, ::Kokkos::Cuda> ::operator ()") is not allowed

/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed

4 errors detected in the compilation of "/tmp/tmpxft_00006db9_00000000-7_trilinos_precondition_muelu.cpp1.ii".
```
Unfortunately, the error doesn't say which function it is complaining about (the name is reported as `""`), and there is no additional information about what tried to instantiate the template.
## Your Environment

The commit used is 4f15e6fb356295d8ba1e022e94d8b0bad732e082 and I configured `Trilinos` using
```
-DTrilinos_ENABLE_Amesos=ON \
-DTrilinos_ENABLE_Epetra=ON \
-DTrilinos_ENABLE_EpetraExt=ON \
-DTrilinos_ENABLE_Ifpack=ON \
-DTrilinos_ENABLE_AztecOO=ON \
-DTrilinos_ENABLE_Sacado=OFF \
-DTrilinos_ENABLE_Kokkos=ON \
-DTrilinos_ENABLE_Teuchos=ON \
-DTrilinos_ENABLE_MueLu=ON \
-DTrilinos_ENABLE_ML=ON \
-DTrilinos_ENABLE_ROL=ON \
-DTrilinos_ENABLE_Tpetra=ON \
-DTrilinos_ENABLE_Zoltan=ON \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_VERBOSE_CONFIGURE=OFF \
-DTPL_ENABLE_MPI=ON \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_VERBOSE_MAKEFILE=OFF \
-DCMAKE_BUILD_TYPE=RELEASE \
-DCMAKE_INSTALL_PREFIX:PATH=$HOME/Trilinos-dev \
-DCMAKE_CXX_FLAGS="-std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-DTPL_ENABLE_CUDA=ON \
-DKokkos_ENABLE_Cuda=ON \
-DKokkos_ENABLE_Cuda_Lambda:BOOL=ON \
-DKokkos_ENABLE_Cuda_UVM:BOOL=ON \
```
The compiler used (via `nvcc_wrapper`) is gcc-5.5 (`openmpi-3.1.3`) and `cuda-8.0`.

## Related Issues

Related to https://github.com/dealii/dealii/pull/7634 and https://github.com/dealii/dealii/issues/6856.
issue