Sacado: Unit Test Fad_KokkosTests.hpp linking error
Created by: Yatagarasu50469
@trilinos/sacado
Expectations
Successful compilation and linking of Trilinos with CUDA enabled on a Mac OS X 10.13.6 system, either through a change to the configuration listed below or through a patch to the Fad_KokkosTests.hpp file.
Current Behavior
Salutations, I have been trying for the past couple of weeks to compile Trilinos with CUDA acceleration on a Mac workstation. The build fails with the errors below (directory names replaced for this post):
[ 72%] Linking CXX executable Sacado_CacheFadCommTests.exe
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests.hpp:1588:26: error: ambiguous partial specializations of 'inner_layout >'
typedef typename Kokkos::inner_layout< Layout> ::type TestLayout;
^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests_Cuda_Hierarchical_SFad.cpp:55:3805: note: in instantiation of member function
'Kokkos_View_Fad_Unmanaged_UnitTest, Kokkos::LayoutContiguous, Kokkos::Cuda>::runUnitTestImpl' requested here
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:88:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:93:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:98:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/src/Kokkos_LayoutContiguous.hpp:103:8: note: partial specialization matches [with Layout = Kokkos::LayoutLeft, N = 32]
struct inner_layout< LayoutContiguous< Layout, N> > {
^
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/sacado/test/UnitTests/Fad_KokkosTests.hpp:1629:1: error: no matching function for call to 'deep_copy'
Kokkos::deep_copy(v, h_v);
^~~~~~~~~~~~~~~~~
/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/kokkos/core/src/Kokkos_CopyViews.hpp:1216:1: note: candidate template ignored: could not match 'View' against 'int'
deep_copy(const View< DT, DP...> &
^
Motivation and Context
Ultimately I am attempting to optimize my workstation for development work with Peridigm, but for now I am only seeking the ability to use CUDA acceleration within a Mac Trilinos build.
Your Environment
Operating system and version:
- OS X 10.13.6
- Trilinos-develop
Compiler and TPL versions:
- Apple LLVM version 9.0.0 (clang-900.0.39.2)
- CUDA Toolkit 9.1
- HDF5 1.10.2.1
- NetCDF 4.6.1.2
- Boost 1.59.0
- Openblas 0.3.1
- CMake 3.12.0
Relevant configure flags or configure script:
(Directory names replaced for post)
export OMPI_CXX=/Users/NAME/DIRECTORY/tmp/Trilinos-develop/packages/kokkos/bin/nvcc_wrapper
export NVCC_WRAPPER_DEFAULT_COMPILER=clang++
export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
cmake -D CMAKE_INSTALL_PREFIX:PATH=/usr/local/trilinos \
-D CMAKE_CXX_FLAGS:STRING="--std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-D CMAKE_Fortran_COMPILER=/usr/local/Cellar/open-mpi/3.1.1/bin/mpif90 \
-D CMAKE_CXX_USE_RESPONSE_FILE_FOR_OBJECTS=ON \
-D TPL_ENABLE_MPI=ON \
-D TPL_ENABLE_CUDA=ON \
-D Kokkos_ENABLE_Serial=ON \
-D Kokkos_ENABLE_OpenMP=OFF \
-D Kokkos_ENABLE_Pthread=OFF \
-D Kokkos_ENABLE_Cuda=ON \
-D Kokkos_ENABLE_Cuda_UVM=ON \
-D Kokkos_ENABLE_Cuda_RDC=ON \
-D Kokkos_ENABLE_Cuda_Lambda:BOOL=ON \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D MPI_BASE_DIR:PATH=/usr/local/Cellar/open-mpi/3.1.1/ \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
-D Trilinos_ENABLE_Shards:BOOL=ON \
-D Trilinos_ENABLE_Sacado:BOOL=ON \
-D Trilinos_ENABLE_Epetra:BOOL=ON \
-D Trilinos_ENABLE_EpetraExt:BOOL=ON \
-D Trilinos_ENABLE_Ifpack:BOOL=ON \
-D Trilinos_ENABLE_AztecOO:BOOL=ON \
-D Trilinos_ENABLE_Amesos:BOOL=ON \
-D Trilinos_ENABLE_Anasazi:BOOL=ON \
-D Trilinos_ENABLE_Belos:BOOL=ON \
-D Trilinos_ENABLE_ML:BOOL=ON \
-D Trilinos_ENABLE_Phalanx:BOOL=ON \
-D Trilinos_ENABLE_Intrepid:BOOL=ON \
-D Trilinos_ENABLE_NOX:BOOL=ON \
-D Trilinos_ENABLE_Stratimikos:BOOL=ON \
-D Trilinos_ENABLE_Thyra:BOOL=ON \
-D Trilinos_ENABLE_Rythmos:BOOL=ON \
-D Trilinos_ENABLE_MOOCHO:BOOL=ON \
-D Trilinos_ENABLE_TriKota:BOOL=OFF \
-D Trilinos_ENABLE_Stokhos:BOOL=ON \
-D Trilinos_ENABLE_Zoltan:BOOL=ON \
-D Trilinos_ENABLE_Piro:BOOL=ON \
-D Trilinos_ENABLE_Teko:BOOL=ON \
-D Trilinos_ENABLE_SEACASIoss:BOOL=ON \
-D Trilinos_ENABLE_SEACAS:BOOL=ON \
-D Trilinos_ENABLE_SEACASBlot:BOOL=ON \
-D Trilinos_ENABLE_Pamgen:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D TPL_ENABLE_HDF5:BOOL=ON \
-D HDF5_INCLUDE_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/include/ \
-D HDF5_LIBRARY_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/lib/ \
-D TPL_ENABLE_Netcdf:BOOL=ON \
-D Netcdf_INCLUDE_DIRS:PATH=/usr/local/Cellar/netcdf/4.3.3.1/include \
-D Netcdf_LIBRARY_DIRS:PATH=/usr/local/Cellar/netcdf/4.3.3.1/lib \
-D TPL_ENABLE_BLAS:BOOL=ON \
-D TPL_ENABLE_LAPACK:BOOL=ON \
-D TPL_ENABLE_Boost:BOOL=ON \
-D Boost_INCLUDE_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/include \
-D Boost_LIBRARY_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/lib \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
-D Trilinos_VERBOSE_CONFIGURE:BOOL=OFF \
../
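Because the configure line above hard-codes several Homebrew Cellar paths plus the nvcc_wrapper location, a stale or mistyped path is worth ruling out before digging into the template errors. The following is only a minimal sketch of such a check; check_paths is a hypothetical helper written for this post, not part of Trilinos or Kokkos, and the paths to pass are whichever ones appear in your own configure script.

```shell
#!/bin/sh
# Hypothetical helper (not part of Trilinos or Kokkos): report every
# listed path that does not exist on disk.
# Prints one "missing: <path>" line per absent path and returns
# non-zero if at least one path is absent.
check_paths() {
    rc=0
    for p in "$@"; do
        if [ ! -e "$p" ]; then
            echo "missing: $p"
            rc=1
        fi
    done
    return "$rc"
}

# Example usage with paths taken from the configure line in this post:
#   check_paths \
#       "$OMPI_CXX" \
#       /usr/local/Cellar/open-mpi/3.1.1 \
#       /usr/local/Cellar/hdf5/1.10.2_1/include \
#       /usr/local/Cellar/netcdf/4.3.3.1/include \
#       || echo "fix the paths above before running cmake"
```

A clean run of this check does not say anything about the inner_layout ambiguity itself; it only removes missing directories as a possible source of confusion.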
Additional Information
What I have tried so far
- Multiple versions of the CUDA toolkit going back to 8.0, with the same results (Command Line Tools were downgraded to match the recommendations in NVIDIA's online documentation)
- Using mpicc and g++ for the NVCC Wrapper Default Compiler, as well as mpif77 for the Fortran compiler
- Trilinos will compile with -D Trilinos_ENABLE_TESTS:BOOL=OFF, but then fails to function correctly.
- Trilinos will compile with -D Sacado_ENABLE_TESTS:STRING=OFF, but then fails to run any tests, or function correctly when linked with Peridigm.
- Attempted installations using the 12.12.1 release and master branch resulting in the same errors.
What does compile and function
- A configuration without CUDA enabled:
cmake -D CMAKE_INSTALL_PREFIX:PATH=/usr/local/trilinos \
-D CMAKE_CXX_FLAGS:STRING="-O2 -std=c++11 -pedantic -ftrapv -Wall -Wno-long-long" \
-D CMAKE_BUILD_TYPE:STRING=RELEASE \
-D MPI_BASE_DIR:PATH=/usr/local/Cellar/open-mpi/3.1.1/ \
-D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
-D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
-D Trilinos_ENABLE_Teuchos:BOOL=ON \
-D Trilinos_ENABLE_Shards:BOOL=ON \
-D Trilinos_ENABLE_Sacado:BOOL=ON \
-D Trilinos_ENABLE_Epetra:BOOL=ON \
-D Trilinos_ENABLE_EpetraExt:BOOL=ON \
-D Trilinos_ENABLE_Ifpack:BOOL=ON \
-D Trilinos_ENABLE_AztecOO:BOOL=ON \
-D Trilinos_ENABLE_Amesos:BOOL=ON \
-D Trilinos_ENABLE_Anasazi:BOOL=ON \
-D Trilinos_ENABLE_Belos:BOOL=ON \
-D Trilinos_ENABLE_ML:BOOL=ON \
-D Trilinos_ENABLE_Phalanx:BOOL=ON \
-D Trilinos_ENABLE_Intrepid:BOOL=ON \
-D Trilinos_ENABLE_NOX:BOOL=ON \
-D Trilinos_ENABLE_Stratimikos:BOOL=ON \
-D Trilinos_ENABLE_Thyra:BOOL=ON \
-D Trilinos_ENABLE_Rythmos:BOOL=ON \
-D Trilinos_ENABLE_MOOCHO:BOOL=ON \
-D Trilinos_ENABLE_TriKota:BOOL=OFF \
-D Trilinos_ENABLE_Stokhos:BOOL=ON \
-D Trilinos_ENABLE_Zoltan:BOOL=ON \
-D Trilinos_ENABLE_Piro:BOOL=ON \
-D Trilinos_ENABLE_Teko:BOOL=ON \
-D Trilinos_ENABLE_SEACASIoss:BOOL=ON \
-D Trilinos_ENABLE_SEACAS:BOOL=ON \
-D Trilinos_ENABLE_SEACASBlot:BOOL=ON \
-D Trilinos_ENABLE_Pamgen:BOOL=ON \
-D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
-D Trilinos_ENABLE_TESTS:BOOL=ON \
-D TPL_ENABLE_HDF5:BOOL=ON \
-D HDF5_INCLUDE_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/include/ \
-D HDF5_LIBRARY_DIRS:PATH=/usr/local/Cellar/hdf5/1.10.2_1/lib/ \
-D TPL_ENABLE_Netcdf:BOOL=ON \
-D Netcdf_INCLUDE_DIRS:PATH=/usr/local/Cellar/netcdf/4.6.1_2/include \
-D Netcdf_LIBRARY_DIRS:PATH=/usr/local/Cellar/netcdf/4.6.1_2/lib \
-D TPL_ENABLE_MPI:BOOL=ON \
-D TPL_ENABLE_BLAS:BOOL=ON \
-D TPL_ENABLE_LAPACK:BOOL=ON \
-D TPL_ENABLE_Boost:BOOL=ON \
-D Boost_INCLUDE_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/include \
-D Boost_LIBRARY_DIRS:PATH=/usr/local/Cellar/boost\@1.59/1.59.0/lib \
-D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
-D Trilinos_VERBOSE_CONFIGURE:BOOL=OFF \
../
- A Kokkos 2.7.00 build outside of Trilinos (though tests were not performed)
export CUDA_LAUNCH_BLOCKING=1
export CUDA_MANAGED_FORCE_DEVICE_ALLOC=1
cmake \
-D CMAKE_CXX_COMPILER=/Users/NAME/DIRECTORY/tmp/kokkos-2.7.00/bin/nvcc_wrapper \
-D CMAKE_CXX_FLAGS:STRING="-O2 -std=c++11 --expt-extended-lambda -g -lineinfo -Xcudafe --diag_suppress=conversion_function_not_usable -Xcudafe --diag_suppress=cc_clobber_ignored -Xcudafe --diag_suppress=code_is_unreachable" \
-D CMAKE_CXX_USE_RESPONSE_FILE_FOR_OBJECTS=ON \
-D KOKKOS_ENABLE_CUDA=ON \
-D Kokkos_ENABLE_Cuda_UVM:BOOL=ON \
-D KOKKOS_ENABLE_CUDA_LAMBDA=ON \
-D Kokkos_ENABLE_LIBRT=OFF \
../