3069 white cuda debug stokhos build error

Created by: bartlettroscoe

CC: @trilinos/stokhos

Description

The main contribution of this PR is that it fixes the build error for the stokhos_muelu lib in #3069 (closed). It also updates the build-reference documentation to describe the causes (see commits).

Motivation and Context

We need the build error #3069 (closed) to be fixed, and we want to provide documentation so that other people can avoid it.

How Has This Been Tested?

I tested this locally on white as described below. The full build of Stokhos now passes, but 67 of the 84 Stokhos tests fail. We will create new GitHub issues for those once the results post to CDash after the merge.

DETAILED TEST RESULTS:

Testing on 'white':

$ cd ~/Trilinos.base/BUILD/WHITE/CUDA/CUDA-DEBUG/

$ . load-env.sh
Hostname 'white11' matches known ATDM host 'white' and system 'ride'
ATDM_CONFIG_TRILNOS_DIR = /home/rabartl/Trilinos.base/Trilinos
Setting default compiler and build options for JOB_NAME='cuda-debug'
Using white/ride compiler stack CUDA to build DEBUG code with Kokkos node type CUDA

$ rm -r CMake*

$ rm -r packages/

$ time cmake -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnvAllPtPackages.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Stokhos=ON \
  ~/Trilinos.base/Trilinos \
  &> configure.out

real    1m37.779s
user    0m59.169s
sys     0m17.638s

$ time make NP=32 &> make.out

real    51m55.785s
user    1320m38.264s
sys     233m46.608s

$ time bsub -x -Is -q rhel7F -n 16 ctest -j8 --timeout 600 &> ctest.out

real    10m44.521s
user    0m0.013s
sys     0m0.040s

This returned the result:

20% tests passed, 67 tests failed out of 84

Subproject Time Summary:
Stokhos    = 2648.79 sec*proc (84 tests)

Total Test time (real) = 643.47 sec

The following tests FAILED:
	  1 - Stokhos_LegendreBasisUnitTest_MPI_1 (Failed)
	  2 - Stokhos_NormalizedLegendreBasisUnitTest_MPI_1 (Failed)
	  3 - Stokhos_HermiteBasisUnitTest_MPI_1 (Failed)
	  4 - Stokhos_NormalizedHermiteBasisUnitTest_MPI_1 (Failed)
	  5 - Stokhos_JacobiBasisUnitTest_MPI_1 (Failed)
	  6 - Stokhos_QuadExpansionUnitTest_MPI_1 (Failed)
	  7 - Stokhos_QuadraturePseudoSpectralExpansionUnitTest_MPI_1 (Failed)
	  8 - Stokhos_TensorProductPseudoSpectralExpansionUnitTest_MPI_1 (Failed)
	  9 - Stokhos_SmolyakPseudoSpectralExpansionUnitTest_MPI_1 (Failed)
	 10 - Stokhos_AlgebraicExpansionUnitTest_MPI_1 (Failed)
	 12 - Stokhos_DivisionOperatorUnitTest_MPI_1 (Failed)
	 13 - Stokhos_StieltjesUnitTest_MPI_1 (Failed)
	 14 - Stokhos_LanczosUnitTest_MPI_1 (Failed)
	 15 - Stokhos_GramSchmidtUnitTest_MPI_1 (Failed)
	 16 - Stokhos_Sparse3TensorUnitTest_MPI_1 (Failed)
	 17 - Stokhos_ExponentialRandomFieldUnitTest_MPI_1 (Failed)
	 18 - Stokhos_LogNormalUnitTest_MPI_1 (Failed)
	 20 - Stokhos_ProductBasisUtilsUnitTest_MPI_1 (Failed)
	 21 - Stokhos_TensorProductBasisUnitTest_MPI_1 (Failed)
	 22 - Stokhos_TotalOrderBasisUnitTest_MPI_1 (Failed)
	 23 - Stokhos_SmolyakBasisUnitTest_MPI_1 (Failed)
	 24 - Stokhos_TensorProductPseudoSpectralOperatorUnitTest_MPI_1 (Failed)
	 25 - Stokhos_LexicographicTreeBasisUnitTest_MPI_1 (Failed)
	 26 - Stokhos_SparseGridQuadratureUnitTest_MPI_1 (Failed)
	 27 - Stokhos_MatrixFreeOperatorUnitTest_MPI_1 (Failed)
	 28 - Stokhos_InterlacedOpUnitTest_MPI_2 (Failed)
	 29 - Stokhos_BasisInteractionGraphUnitTest_MPI_1 (Failed)
	 30 - Stokhos_AdaptivityToolsUnitTest_MPI_1 (Failed)
	 32 - Stokhos_InterlacedMapUnitTest_MPI_2 (Failed)
	 35 - Stokhos_SacadoPCEUnitTest_MPI_1 (Failed)
	 36 - Stokhos_SacadoETPCEUnitTest_MPI_1 (Failed)
	 37 - Stokhos_SacadoPCESerializationTests_MPI_1 (Failed)
	 38 - Stokhos_SacadoPCECommTests_MPI_1 (Failed)
	 39 - Stokhos_SacadoUQPCEUnitTest_MPI_1 (Failed)
	 40 - Stokhos_SacadoUQPCESerializationTests_MPI_1 (Failed)
	 41 - Stokhos_SacadoUQPCECommTests_MPI_1 (Failed)
	 42 - Stokhos_KokkosViewUQPCEUnitTest_Serial_MPI_1 (Failed)
	 43 - Stokhos_KokkosViewUQPCEUnitTest_Cuda_MPI_1 (Failed)
	 44 - Stokhos_KokkosCrsMatrixUQPCEUnitTest_Serial_MPI_1 (Failed)
	 45 - Stokhos_KokkosCrsMatrixUQPCEUnitTest_Cuda_MPI_1 (Failed)
	 46 - Stokhos_TpetraCrsMatrixUQPCEUnitTest_Serial_MPI_4 (Failed)
	 47 - Stokhos_TpetraCrsMatrixUQPCEUnitTest_Cuda_MPI_4 (Failed)
	 59 - Stokhos_TpetraCrsMatrixMPVectorUnitTest_Cuda_MPI_4 (Timeout)
	 60 - Stokhos_KokkosArrayKernelsUnitTest_Serial_MPI_1 (Failed)
	 61 - Stokhos_KokkosArrayKernelsUnitTest_Cuda_MPI_1 (Failed)
	 63 - Stokhos_hermite_example_MPI_1 (Failed)
	 64 - Stokhos_Linear2D_Diffusion_PCE_Example_MPI_2 (Failed)
	 65 - Stokhos_Linear2D_Diffusion_PCE_Interlaced_Example_MPI_2 (Failed)
	 66 - Stokhos_nox_example_MPI_1 (Failed)
	 67 - Stokhos_Linear2D_Diffusion_PCE_NOX_Example_MPI_2 (Failed)
	 68 - Stokhos_Linear2D_Diffusion_GMRES_Mean_Based_MPI_2 (Failed)
	 69 - Stokhos_Linear2D_Diffusion_GMRES_AGS_MPI_2 (Failed)
	 70 - Stokhos_Linear2D_Diffusion_CG_AGS_MPI_2 (Failed)
	 71 - Stokhos_Linear2D_Diffusion_GMRES_GS_MPI_2 (Failed)
	 72 - Stokhos_Linear2D_Diffusion_GMRES_AJ_MPI_2 (Failed)
	 73 - Stokhos_Linear2D_Diffusion_GMRES_KP_MPI_2 (Failed)
	 74 - Stokhos_Linear2D_Diffusion_GS_MPI_2 (Failed)
	 75 - Stokhos_Linear2D_Diffusion_JA_MPI_2 (Failed)
	 76 - Stokhos_Linear2D_Diffusion_LN_MPI_2 (Failed)
	 77 - Stokhos_Linear2D_Diffusion_GSLN_MPI_2 (Failed)
	 78 - Stokhos_Linear2D_Diffusion_GMRES_FA_MPI_2 (Failed)
	 79 - Stokhos_Linear2D_Diffusion_GMRES_KL_MPI_2 (Failed)
	 80 - Stokhos_Linear2D_Diffusion_GMRES_KLR_MPI_2 (Failed)
	 81 - Stokhos_uq_handbook_nonlinear_sg_example_MPI_1 (Failed)
	 82 - Stokhos_sacado_example_MPI_1 (Failed)
	 83 - Stokhos_division_example_MPI_1 (Failed)
	 84 - Stokhos_sacado_ensemble_example_MPI_1 (Failed)
Errors while running CTest

So the build passed, but most of the Stokhos tests fail. We will deal with those in a new issue.
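
When triaging a failure list like the one above, it can help to tally the entries by status so that timeouts are separated from outright failures. The following is only a sketch: the function name `summarize_ctest_failures` is hypothetical (it is not part of Trilinos or CTest), and it assumes the list lines keep the `<number> - <test name> (<status>)` shape shown in the output above.

```shell
# Hypothetical helper (not part of Trilinos or CTest): count the entries in
# the "The following tests FAILED:" list of a captured ctest log by status.
# Assumes each list line looks like: "<number> - <test name> (<status>)".
summarize_ctest_failures() {
  # $1: path to the captured ctest output (e.g. the ctest.out written above)
  awk '
    /^[ \t]*[0-9]+ - .+ \(.+\)$/ {
      status = $0
      sub(/^.*\(/, "", status)   # drop everything up to the last "("
      sub(/\)$/, "", status)     # drop the trailing ")"
      count[status]++
    }
    END { for (s in count) printf "%s %d\n", s, count[s] }
  ' "$1" | sort
}
```

Calling `summarize_ctest_failures ctest.out` prints one line per status with its count. For the list above, that should come to 66 entries marked Failed and 1 marked Timeout.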

Checklist

  • My commit messages mention the appropriate GitHub issue numbers.
  • I have updated the documentation accordingly.
