Panzer examples failures with new ATDM CUDA builds on hansen/shiller
Created by: bartlettroscoe
Summary
CC: @trilinos/nox, @fryeguy52
Next Action Status
Panzer examples build as of 3/29/2018 and any remaining test/example failures are being addressed in other issues #2454 (closed) and #2471 (closed).
Description
The Panzer examples do not currently build with the ATDM CUDA build configuration. There are build failures shown at:
for the builds:
- Trilinos-atdm-hansen-shiller-cuda-debug: https://testing.sandia.gov/cdash/index.php?project=Trilinos&parentid=3412693
- Trilinos-atdm-hansen-shiller-cuda-opt: https://testing.sandia.gov/cdash/index.php?project=Trilinos&parentid=3412702
This shows a build failure for the file packages/panzer/adapters-stk/tutorial/siamCse17/mySourceTerm.cpp, the beginning of which looks like:
/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-opt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Cuda/Kokkos_CudaExec.hpp(397): error: The closure type for a lambda ("lambda [](panzer::index_t)->void") cannot be used in the template argument type of a __global__ function template instantiation, unless the lambda is defined within a __device__ or __global__ function, or the lambda is a 'extended lambda' and the flag --expt-extended-lambda is specified
detected during:
instantiation of "Kokkos::Impl::cuda_parallel_launch_local_memory" based on template argument <Kokkos::Impl::ParallelFor<lambda [](panzer::index_t)->void, Kokkos::RangePolicy<Kokkos::Cuda>, Kokkos::Cuda>>
(398): here
instantiation of "Kokkos::Impl::CudaParallelLaunch<DriverType, Kokkos::LaunchBounds<0U, 0U>, false>::CudaParallelLaunch(const DriverType &, const dim3 &, const dim3 &, int, cudaStream_t) [with DriverType=Kokkos::Impl::ParallelFor<lambda [](panzer::index_t)->void, Kokkos::RangePolicy<Kokkos::Cuda>, Kokkos::Cuda>]"
/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-opt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Cuda/Kokkos_Cuda_Parallel.hpp(370): here
instantiation of "void Kokkos::Impl::ParallelFor<FunctorType, Kokkos::RangePolicy<Traits...>, Kokkos::Cuda>::execute() const [with FunctorType=lambda [](panzer::index_t)->void, Traits=<Kokkos::Cuda>]"
/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-opt/SRC_AND_BUILD/Trilinos/packages/kokkos/core/src/Kokkos_Parallel.hpp(224): here
instantiation of "void Kokkos::parallel_for(size_t, const FunctorType &, const std::string &) [with FunctorType=lambda [](panzer::index_t)->void]"
/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-opt/SRC_AND_BUILD/Trilinos/packages/panzer/adapters-stk/tutorial/siamCse17/mySourceTermImpl.hpp(143): here
instantiation of "void MySourceTerm<EvalT, Traits>::evaluateFields(Traits::EvalData) [with EvalT=panzer::Traits::Residual, Traits=panzer::Traits]"
/home/jenkins/hansen/workspace/Trilinos-atdm-hansen-shiller-cuda-opt/SRC_AND_BUILD/Trilinos/packages/panzer/adapters-stk/tutorial/siamCse17/mySourceTerm.cpp(56): here
...
6 errors detected in the compilation of "/tmp/tmpxft_000011a6_00000000-7_mySourceTerm.cpp1.ii".
The rest of the build failures are link failures for the executables:
PanzerAdaptersSTK_step01.exe
PanzerAdaptersSTK_me_main_driver.exe
PanzerAdaptersSTK_main_driver.exe
PanzerMiniEM_BlockPrec.exe
All of these show similar link failures that look like:
../../../../muelu/adapters/libmuelu-adapters.a(MueLu_RefMaxwell.cpp.o): In function `MueLu::RefMaxwell<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::compute()':
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZN5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE7computeEv[_ZN5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE7computeEv]+0x3f37): undefined reference to `Ifpack2::Hiptmair<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::Hiptmair(Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&, Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&, Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&)'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZN5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE7computeEv[_ZN5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE7computeEv]+0x418d): undefined reference to `Ifpack2::Hiptmair<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > >::Hiptmair(Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&, Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&, Teuchos::RCP<Tpetra::RowMatrix<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const> const&)'
../../../../muelu/adapters/libmuelu-adapters.a(MueLu_RefMaxwell.cpp.o): In function `MueLu::RefMaxwell<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::apply(Xpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const&, Xpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >&, Teuchos::ETransp, double, double) const':
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x44b): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::MultiVector(Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const&)'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x466): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::MultiVector(Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const&)'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x48c): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x494): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x4e3): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::MultiVector(Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const&)'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x4fe): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::MultiVector(Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> > const&)'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x527): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x52f): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x632): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
tmpxft_00002f35_00000000-4_MueLu_RefMaxwell.cudafe1.cpp:(.text._ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd[_ZNK5MueLu10RefMaxwellIdiiN6Kokkos6Compat23KokkosDeviceWrapperNodeINS1_6SerialENS1_9HostSpaceEEEE5applyERKN6Xpetra11MultiVectorIdiiS6_EERSA_N7Teuchos7ETranspEdd]+0x63a): undefined reference to `Tpetra::MultiVector<double, int, int, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >::~MultiVector()'
collect2: error: ld returned 1 exit status
The EMPIRE build of Panzer does not show this build failure because it sets -DPanzer_ENABLE_TESTS=ON, which does not enable the Panzer examples. The option -DTrilinos_ENABLE_TESTS=ON, by contrast, enables both the Panzer tests and the Panzer examples by default (yes, that is confusing behavior, but that is what it is).
The options for addressing this are:
- Fix the build failures for these CUDA builds, or
- Disable Panzer examples for these specific ATDM CUDA builds and fix them later (if desired)
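As a rough sketch of the second option: assuming Panzer honors the usual TriBITS per-package option Panzer_ENABLE_EXAMPLES, and that an explicit OFF takes precedence over the default implied by -DTrilinos_ENABLE_TESTS=ON, the examples could be switched off at configure time while the tests stay on. The function name panzer_no_examples_args and the placeholder default for TRILINOS_DIR are made up for illustration, and this is not necessarily the change that was made to the ATDM configuration. The script only prints the configure command so it can be inspected before being run.

```shell
# Sketch only: prints the configure command instead of running it.
# TRILINOS_DIR is the Trilinos clone; the default below is a placeholder.
TRILINOS_DIR="${TRILINOS_DIR:-/path/to/Trilinos}"

# Emit one cmake argument per line so the list is easy to read and check.
# Assumption: Panzer_ENABLE_EXAMPLES=OFF overrides the examples default
# that Trilinos_ENABLE_TESTS=ON would otherwise turn on.
panzer_no_examples_args() {
  printf '%s\n' \
    "-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake" \
    "-DTrilinos_ENABLE_TESTS=ON" \
    "-DTrilinos_ENABLE_Panzer=ON" \
    "-DPanzer_ENABLE_EXAMPLES=OFF" \
    "$TRILINOS_DIR"
}

# Join the arguments onto one line for display.
echo "cmake $(panzer_no_examples_args | tr '\n' ' ')"
```

If the printed command looks right, it can be run from the build directory after sourcing the ATDM environment for the build in question.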
Steps to Reproduce:
The instructions to reproduce these build failures can be found starting at:
and clicking "Reproducing ATDM builds locally" which takes you to:
Basically, on hansen or shiller, clone the Trilinos repo (with its location denoted $TRILINOS_DIR below) and check out the develop branch. Then create a build directory and configure and build as:
$ cd <some_build_dir>/
$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh cuda-opt
$ cmake \
-DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
-DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Panzer=ON \
-DATDM_TWEAKS_FILES = \
$TRILINOS_DIR
$ make -j16