MueLu: Compiling MueLu header files fails with CUDA support enabled in Kokkos
*Created by: masterleinad*
@trilinos/muelu
## Current Behavior
Compiling a file that contains only
```
#include <MueLu.hpp>
```
fails with
```
/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed
/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed
/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(2461): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelReduce< ::Kokkos::Impl::CudaFunctorAdapter< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda > , unsigned long, void> , ::Kokkos::RangePolicy< ::Kokkos::Cuda > , ::Kokkos::InvalidType, ::Kokkos::Cuda> ::operator ()") is not allowed
/export/home/darndt/Trilinos-dev/include/Cuda/Kokkos_Cuda_Parallel.hpp(451): error: calling a __host__ function("") from a __device__ function("Kokkos::Impl::ParallelFor< ::, ::Kokkos::RangePolicy< ::Kokkos::Cuda, ::Kokkos::IndexType<int> > , ::Kokkos::Cuda> ::operator ()") is not allowed
4 errors detected in the compilation of "/tmp/tmpxft_00006db9_00000000-7_trilinos_precondition_muelu.cpp1.ii".
```
Unfortunately, the error message does not name the function it is complaining about (it is reported as `""`), and it gives no additional information about which code instantiated the template.
## Your Environment
The commit used is 4f15e6fb356295d8ba1e022e94d8b0bad732e082 and I configured `Trilinos` using
```
-DTrilinos_ENABLE_Amesos=ON \
-DTrilinos_ENABLE_Epetra=ON \
-DTrilinos_ENABLE_EpetraExt=ON \
-DTrilinos_ENABLE_Ifpack=ON \
-DTrilinos_ENABLE_AztecOO=ON \
-DTrilinos_ENABLE_Sacado=OFF \
-DTrilinos_ENABLE_Kokkos=ON \
-DTrilinos_ENABLE_Teuchos=ON \
-DTrilinos_ENABLE_MueLu=ON \
-DTrilinos_ENABLE_ML=ON \
-DTrilinos_ENABLE_ROL=ON \
-DTrilinos_ENABLE_Tpetra=ON \
-DTrilinos_ENABLE_Zoltan=ON \
-DTrilinos_ENABLE_TESTS=ON \
-DTrilinos_VERBOSE_CONFIGURE=OFF \
-DTPL_ENABLE_MPI=ON \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_VERBOSE_MAKEFILE=OFF \
-DCMAKE_BUILD_TYPE=RELEASE \
-DCMAKE_INSTALL_PREFIX:PATH=$HOME/Trilinos-dev \
-DCMAKE_CXX_FLAGS="-std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-DTPL_ENABLE_CUDA=ON \
-DKokkos_ENABLE_Cuda=ON \
-DKokkos_ENABLE_Cuda_Lambda:BOOL=ON \
-DKokkos_ENABLE_Cuda_UVM:BOOL=ON \
```
The compiler used (via `nvcc_wrapper`) is gcc-5.5 (`openmpi-3.1.3`) and `cuda-8.0`.
## Related Issues
Related to https://github.com/dealii/dealii/pull/7634 and https://github.com/dealii/dealii/issues/6856.