Sacado: Unit Test Fad_KokkosTests.hpp linking error
*Created by: Yatagarasu50469*

@trilinos/sacado

## Expectations

Successful linkage and compilation of Trilinos with CUDA on a Mac OS X 10.13.6 system, through modification of the configuration listed and/or a patch for the Fad_KokkosTests.hpp file.

## Current Behavior

Salutations, I have been trying to compile Trilinos with CUDA acceleration on a Mac workstation for the past couple of weeks. The build fails with the error(s) below (directory names replaced for this post):

<pre><code>[ 72%] Linking CXX executable Sacado_CacheFadCommTests.exe
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests.hpp:1588:26: error: ambiguous partial specializations of 'inner_layout<Kokkos::LayoutContiguous<Kokkos::LayoutLeft, 32> >'
typedef typename Kokkos::inner_layout< Layout> ::type TestLayout;
                         ^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests_Cuda_Hierarchical_SFad.cpp:55:3805: note: in instantiation of member function 'Kokkos_View_Fad_Unmanaged_UnitTest<Sacado::Fad::SFad<double, 64>, Kokkos::LayoutContiguous<Kokkos::LayoutLeft, 32>, Kokkos::Cuda>::runUnitTestImpl' requested here
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:88:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
       ^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:93:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
       ^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:98:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
       ^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:103:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
       ^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests.hpp:1629:1: error: no matching function for call to 'deep_copy'
Kokkos::deep_copy(v, h_v);
^~~~~~~~~~~~~~~~~
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/kokkos/core/src/Kokkos_CopyViews.hpp:1216:1: note: candidate template ignored: could not match 'View<type-parameter-0-0, type-parameter-0-1...>' against 'int'
deep_copy(const View< DT, DP...> &
^</code></pre>

## Motivation and Context

Ultimately I am attempting to optimize my workstation for development work with Peridigm, but for now I am seeking the ability to use CUDA acceleration within a Mac Trilinos build.

## Your Environment

**Operating system and version:**
- OS X 10.13.6
- Trilinos-devel

**Compiler and TPL versions:**
- Apple LLVM version 9.0.0 (clang-900.0.39.2)
- CUDA Toolkit 9.1
- HDF5 1.10.2.1
- NetCDF 4.6.1.2
- Boost 1.59.0
- OpenBLAS 0.3.1
- CMake 3.12.0

**Relevant configure flags or configure script:** (directory names replaced for this post)

<pre><code>export OMPI_CXX=/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/kokkos/bin/nvcc_wrapper
export NVCC_WRAPPER_DEFAULT_COMPILER=clang++
export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
cmake -D CMAKE_INSTALL_PREFIX:PATH=/usr/local/trilinos \
-D CMAKE_CXX_FLAGS:STRING="--std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-D CMAKE_Fortran_COMPILER=/usr/local/Cellar/open-mpi/3.1.1/bin/mpif90 \
-D CMAKE_CXX_USE_RESPONSE_FILE_FOR_OBJECTS=ON \
-D TPL_ENABLE_MPI=ON \
-D TPL_ENABLE_CUDA=ON \
-D Kokkos_ENABLE_Serial=ON \
-D Kokkos_ENABLE_OpenMP=OFF \
-D Kokkos_ENABLE_Pthread=OFF \
-D Kokkos_ENABLE_Cuda=ON \
-D Kokkos_ENABLE_Cuda_UVM=ON \
-D Kokkos_ENABLE_Cuda_RDC=ON \
-D Kokkos_ENABLE_Cuda_Lambda:BOOL=ON \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D MPI_BASE_DIR:PATH=/usr/local/Cellar/open-mpi/3.1.1/ \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
-D Trilinos_ENABLE_Shards:BOOL=ON \
-D Trilinos_ENABLE_Sacado:BOOL=ON \
-D Trilinos_ENABLE_Epetra:BOOL=ON \
-D Trilinos_ENABLE_EpetraExt:BOOL=ON \
-D Trilinos_ENABLE_Ifpack:BOOL=ON \
-D Trilinos_ENABLE_AztecOO:BOOL=ON \
-D Trilinos_ENABLE_Amesos:BOOL=ON \
-D Trilinos_ENABLE_Anasazi:BOOL=ON \
-D Trilinos_ENABLE_Belos:BOOL=ON \
-D Trilinos_ENABLE_ML:BOOL=ON \
-D Trilinos_ENABLE_Phalanx:BOOL=ON \
-D Trilinos_ENABLE_Intrepid:BOOL=ON \
-D Trilinos_ENABLE_NOX:BOOL=ON \
-D Trilinos_ENABLE_Stratimikos:BOOL=ON \
-D Trilinos_ENABLE_Thyra:BOOL=ON \
-D Trilinos_ENABLE_Rythmos:BOOL=ON \
-D Trilinos_ENABLE_MOOCHO:BOOL=ON \
-D Trilinos_ENABLE_TriKota:BOOL=OFF \
-D Trilinos_ENABLE_Stokhos:BOOL=ON \
-D Trilinos_ENABLE_Zoltan:BOOL=ON \
-D Trilinos_ENABLE_Piro:BOOL=ON \
-D Trilinos_ENABLE_Teko:BOOL=ON \
-D Trilinos_ENABLE_SEACASIoss:BOOL=ON \
-D Trilinos_ENABLE_SEACAS:BOOL=ON \
-D Trilinos_ENABLE_SEACASBlot:BOOL=ON \
-D Trilinos_ENABLE_Pamgen:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D TPL_ENABLE_HDF5:BOOL=ON \
-D HDF5_INCLUDE_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/include/ \
-D HDF5_LIBRARY_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/lib/ \
-D TPL_ENABLE_Netcdf:BOOL=ON \
-D Netcdf_INCLUDE_DIRS:PATH=/usr/local/Cellar/netcdf/4.3.3.1/include \
-D Netcdf_LIBRARY_DIRS:PATH=/usr/local/Cellar/netcdf/4.3.3.1/lib \
-D TPL_ENABLE_BLAS:BOOL=ON \
-D TPL_ENABLE_LAPACK:BOOL=ON \
-D TPL_ENABLE_Boost:BOOL=ON \
-D Boost_INCLUDE_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/include \
-D Boost_LIBRARY_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/lib \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
-D Trilinos_VERBOSE_CONFIGURE:BOOL=OFF \
../</code></pre>

## Additional Information

**What I have tried so far**
- Multiple versions of the CUDA toolkit going back to 8.0, with the same results (I regressed the Command Line Tools to match the recommendations in NVIDIA's online documentation).
- Using `mpicc` and `g++` as the NVCC Wrapper Default Compiler, as well as `mpif77` as the Fortran compiler.
- Trilinos will compile with `-D Trilinos_ENABLE_TESTS:BOOL=OFF`, but then fails to function correctly.
- Trilinos will compile with `-D Sacado_ENABLE_TESTS:STRING=OFF`, but then fails to run any tests or to function correctly when linked with Peridigm.
- Installations using the 12.12.1 release and the master branch, which result in the same errors.

**What does compile and function**
- A configuration without CUDA enabled:

<pre><code>cmake -D CMAKE_INSTALL_PREFIX:PATH=/usr/local/trilinos \
-D CMAKE_CXX_FLAGS:STRING="-O2 -std=c++11 -pedantic -ftrapv -Wall -Wno-long-long" \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D MPI_BASE_DIR:PATH=/usr/local/Cellar/open-mpi/3.1.1/ \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
-D Trilinos_ENABLE_Shards:BOOL=ON \
-D Trilinos_ENABLE_Sacado:BOOL=ON \
-D Trilinos_ENABLE_Epetra:BOOL=ON \
-D Trilinos_ENABLE_EpetraExt:BOOL=ON \
-D Trilinos_ENABLE_Ifpack:BOOL=ON \
-D Trilinos_ENABLE_AztecOO:BOOL=ON \
-D Trilinos_ENABLE_Amesos:BOOL=ON \
-D Trilinos_ENABLE_Anasazi:BOOL=ON \
-D Trilinos_ENABLE_Belos:BOOL=ON \
-D Trilinos_ENABLE_ML:BOOL=ON \
-D Trilinos_ENABLE_Phalanx:BOOL=ON \
-D Trilinos_ENABLE_Intrepid:BOOL=ON \
-D Trilinos_ENABLE_NOX:BOOL=ON \
-D Trilinos_ENABLE_Stratimikos:BOOL=ON \
-D Trilinos_ENABLE_Thyra:BOOL=ON \
-D Trilinos_ENABLE_Rythmos:BOOL=ON \
-D Trilinos_ENABLE_MOOCHO:BOOL=ON \
-D Trilinos_ENABLE_TriKota:BOOL=OFF \
-D Trilinos_ENABLE_Stokhos:BOOL=ON \
-D Trilinos_ENABLE_Zoltan:BOOL=ON \
-D Trilinos_ENABLE_Piro:BOOL=ON \
-D Trilinos_ENABLE_Teko:BOOL=ON \
-D Trilinos_ENABLE_SEACASIoss:BOOL=ON \
-D Trilinos_ENABLE_SEACAS:BOOL=ON \
-D Trilinos_ENABLE_SEACASBlot:BOOL=ON \
-D Trilinos_ENABLE_Pamgen:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D TPL_ENABLE_HDF5:BOOL=ON \
-D HDF5_INCLUDE_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/include/ \
-D HDF5_LIBRARY_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/lib/ \
-D TPL_ENABLE_Netcdf:BOOL=ON \
-D Netcdf_INCLUDE_DIRS:PATH=/usr/local/Cellar/netcdf/4.6.1_2/include \
-D Netcdf_LIBRARY_DIRS:PATH=/usr/local/Cellar/netcdf/4.6.1_2/lib \
-D TPL_ENABLE_MPI:BOOL=ON \
-D TPL_ENABLE_BLAS:BOOL=ON \
-D TPL_ENABLE_LAPACK:BOOL=ON \
-D TPL_ENABLE_Boost:BOOL=ON \
-D Boost_INCLUDE_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/include \
-D Boost_LIBRARY_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/lib \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
-D Trilinos_VERBOSE_CONFIGURE:BOOL=OFF \
../</code></pre>

- A Kokkos 2.7.00 build outside of Trilinos (though its tests were not run):

<pre><code>export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
cmake \
-D CMAKE_CXX_COMPILER=/Users/NAME/DIRECTORY/tmp/kokkos-2.7.00/bin/nvcc_wrapper \
-D CMAKE_CXX_FLAGS:STRING="-O2 -std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-D CMAKE_CXX_USE_RESPONSE_FILE_FOR_OBJECTS=ON \
-D KOKKOS_ENABLE_CUDA=ON \
-D Kokkos_ENABLE_Cuda_UVM:BOOL=ON \
-D KOKKOS_ENABLE_CUDA_LAMBDA=ON \
-D Kokkos_ENABLE_LIBRT=OFF \
../</code></pre>
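
**Standalone sketch of the `inner_layout` lookup**

For reference, the first error above concerns the `inner_layout` trait: the compiler reports four partial specializations (at lines 88, 93, 98 and 103 of Kokkos_LayoutContiguous.hpp) that all match `LayoutContiguous<Kokkos::LayoutLeft, 32>` equally well. Below is a minimal sketch of what I would expect the lookup to do when only one partial specialization matches. The types `LayoutLeft` and `LayoutContiguous` here are hypothetical stand-ins written for this post, not the real Kokkos/Sacado definitions, and `unwraps_contiguous` is a helper name I made up for illustration.

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration only; NOT the Kokkos/Sacado types.
struct LayoutLeft {};

template <typename Layout, unsigned Stride = 1>
struct LayoutContiguous {};

// Primary template: a plain layout is its own inner layout.
template <typename Layout>
struct inner_layout {
  typedef Layout type;
};

// Exactly one partial specialization that unwraps LayoutContiguous.
// If several specializations matched this pattern equally well, the
// compiler would have no way to prefer one of them.
template <typename Layout, unsigned N>
struct inner_layout< LayoutContiguous<Layout, N> > {
  typedef Layout type;
};

// Hypothetical helper: true when the wrapped layout is recovered.
constexpr bool unwraps_contiguous() {
  return std::is_same<
      inner_layout< LayoutContiguous<LayoutLeft, 32> >::type,
      LayoutLeft>::value;
}

static_assert(unwraps_contiguous(),
              "inner_layout should unwrap LayoutContiguous");
static_assert(std::is_same<inner_layout<LayoutLeft>::type, LayoutLeft>::value,
              "inner_layout should leave a plain layout unchanged");
```

This compiles with `-std=c++11` on the same Apple clang without CUDA involved. It only illustrates the behavior I expected and does not reproduce the failure itself.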