Finish robust install of Trilinos when there are individual package build or install failures
Created by: bartlettroscoe
This should finally make installs of Trilinos robust when there are build failures (as part of #2689). For example, a build failure in Panzer would not break the installs of packages used by SPARC. I tested a real-life use case with Trilinos and SPARC and verified that this really works (see details below).
Origin repo remote tracking branch: 'github/master'
Origin repo remote repo URL: 'github = git@github.com:TriBITSPub/TriBITS.git'
At commit:
commit 3b02ce896ad948e7505804caa86e18835b677d3e
Author: Roscoe A. Bartlett <rabartl@sandia.gov>
Date: Thu Apr 18 17:36:32 2019 -0600
Summary: Make usage of <Package>Config.cmake robust when there are broken packages (trilinos/Trilinos#2689)
How was this tested?
There are strong automated tests in TriBITS for this, but I also ran a real-life use case in which I broke Phalanx (and therefore also Panzer) and verified that SPARC built correctly and passed its tests against the remaining packages (because SPARC does not use Phalanx or Panzer).
Detailed manual test details
(4/19/2019)
Testing SPARC against an install of Trilinos where Phalanx is broken. To do that, I basically need to build and install Trilinos manually and then build and test SPARC against that Trilinos install. I will use the build: cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt
First, get Trilinos and TriBITS repos into the right state:
$ cd /scratch/rabartl/Trilinos.base/Trilinos/
$ git checkout atdm-nightly
$ git pull
$ cd TriBITS/
$ git fetch github
$ git checkout --track github/trilinos-2689-robust-proj-config-install
Now break the Phalanx build so that Phalanx will not produce libphalanx.a:
$ cd /scratch/rabartl/Trilinos.base/Trilinos/
$ echo "This file is broken" >> packages/phalanx/cmake/Phalanx_config.hpp.in
$ git diff
diff --git a/packages/phalanx/cmake/Phalanx_config.hpp.in b/packages/phalanx/cmake/Phalanx_config.hpp.in
index 31af6f4..8f690c0 100644
--- a/packages/phalanx/cmake/Phalanx_config.hpp.in
+++ b/packages/phalanx/cmake/Phalanx_config.hpp.in
@@ -27,3 +27,4 @@
@PHALANX_DEPRECATED_DECLARATIONS@
#endif
+This file is broken
Now to build and install Trilinos using this version of TriBITS:
$ cd /scratch/rabartl/Trilinos.base/BUILDS/ATDM/CEE-RHEL6/CHECKIN/
$ ./checkin-test-atdm-cee-rhel6.sh \
cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt \
--enable-all-packages=on \
--configure
$ cd cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt/
$ . load-env.sh
Hostname 'ceerws1113' matches known ATDM host 'cee-rhel6' and system 'cee-rhel6'
Setting compiler and build options for buld name 'cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt'
Using CEE RHEL6 compiler stack CLANG-5.0.1_OPENMPI-1.10.2 to build RELEASE code with Kokkos node type SERIAL
$ rm -r CMake*
$ time ./do-configure \
-DCMAKE_INSTALL_PREFIX=install \
-DCMAKE_SKIP_INSTALL_ALL_DEPENDENCY=ON \
-DTrilinos_TRIBITS_DIR:STRING=TriBITS/tribits \
-DTrilinos_ENABLE_ALL_PACKAGES=ON \
-DTrilinos_ENABLE_TESTS=OFF \
&> configure.out
real 0m22.984s
user 0m16.525s
sys 0m17.262s
$ time ninja -j16 -k 999999 &> make.out
real 14m52.077s
user 223m8.497s
sys 8m12.935s
$ time ninja install_package_by_package &> make.install.out
real 0m12.278s
user 0m9.684s
sys 0m1.507s
This created the installation with a lot of libraries:
$ ls install/lib/ | wc -l
132
This did not create the Phalanx library, but it did create a couple of Panzer libraries:
$ ls install/lib/ | grep phalanx
[empty]
$ ls install/lib/ | grep panzer
libpanzer-core.a
libpanzer-dof-mgr.a
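The two greps above can be folded into one loop when several packages need checking at once. This is a minimal sketch, assuming installed library file names begin with `lib<package>` (true of the `libpanzer-*.a` files listed here, but not guaranteed for every package). The function name `report_installed_libs` is hypothetical and is not part of TriBITS or Trilinos.

```shell
#!/bin/sh
# Hypothetical helper: for each package name, report how many files in the
# given lib directory have names starting with "lib<package>".
# Assumption: library file names start with the lowercase package name.
report_installed_libs() {
  lib_dir="$1"
  shift
  for pkg in "$@"; do
    # grep -c prints 0 and exits non-zero when nothing matches; keep going.
    count=$(ls "$lib_dir" | grep -c "^lib${pkg}") || true
    echo "${pkg}: ${count}"
  done
}
```

For the listing above, `report_installed_libs install/lib phalanx panzer` would print `phalanx: 0` and `panzer: 2`.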
We see a lot of build errors in Phalanx and Panzer:
$ grep FAILED make.out | grep /phalanx/ | wc -l
4
$ grep FAILED make.out | grep /panzer/ | wc -l
155
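To see every broken package at once instead of grepping one package at a time, the failures can be tallied by package directory. This is a rough sketch, assuming the failing targets in make.out appear under a `packages/<name>/` path (as the `/phalanx/` and `/panzer/` greps above suggest). The function name `count_failed_by_package` is hypothetical, and the counts may differ slightly from the greps above because lines without a `packages/` path are skipped.

```shell
#!/bin/sh
# Hypothetical helper: tally "FAILED" lines in a ninja log by package directory.
# Assumption: each failing target path contains "packages/<name>/...".
count_failed_by_package() {
  log_file="$1"
  grep 'FAILED' "$log_file" \
    | sed -n 's|.*packages/\([^/]*\)/.*|\1|p' \
    | sort | uniq -c | sort -rn
}
```

Running `count_failed_by_package make.out` prints one line per package with its failure count, largest first.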
Set up for the standard install format:
$ cd /scratch/rabartl/Trilinos.base/BUILDS/ATDM/CEE-RHEL6/CHECKIN/cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt/
$ ln -s install cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt
Now to test SPARC 'master' against this:
$ env \
ATDM_TRIL_SPARC_BUILDS_LIST=cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt \
ATDM_TRIL_SPARC_SKIP_NATIVE_BUILD=1 \
ATDM_TRIL_SPARC_ATDM_USE_INSTALL_DIR=/scratch/rabartl/Trilinos.base/BUILDS/ATDM/CEE-RHEL6/CHECKIN/cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt \
./sparc-tril-dev-scripts/run_builds_and_tests.sh
Shoot, this gave the same build error:
/scratch/rabartl/Trilinos.base/BUILDS/ATDM/CEE-RHEL6/CHECKIN/cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt/cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt/include/Ifpack2_Relaxation_def.hpp:147:6: error: variable templates are a C++14 extension [-Werror,-Wc++14-extensions]
void Relaxation<MatrixType>::updateCachedMultiVector(const Teuchos::RCP<const Tpetra::Map<local_ordinal_type,global_ordinal_type,node_type> > & map, size_t numVecs) const{
^
reported in:
I will turn off -Werror and see what happens. I added the
$ env \
ATDM_TRIL_SPARC_BUILDS_LIST=cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt \
SPARC_CONFIG_NO_WERROR=1 \
ATDM_TRIL_SPARC_SKIP_NATIVE_BUILD=1 \
ATDM_TRIL_SPARC_ATDM_USE_INSTALL_DIR=/scratch/rabartl/Trilinos.base/BUILDS/ATDM/CEE-RHEL6/CHECKIN/cee-rhel6_clang-5.0.1_openmpi-1.10.2_serial_static_opt \
./sparc-tril-dev-scripts/run_builds_and_tests.sh
I had to fix other problems with SPARC using Trilinos as well (with Ifpack2 breaking backward compatibility). That returned the test result:
100% tests passed, 0 tests failed out of 280
...
Total Test time (real) = 665.91 sec
So that actually passed!
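If this check were repeated in a script, the pass/fail decision could be read from the ctest summary line rather than by eye. This is a minimal sketch, assuming the test output was captured to a log file (the name `ctest.out` used below is hypothetical) and that it contains a summary line in the form shown above. The function name `ctest_all_passed` is also hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a ctest log reports zero failed tests.
# Assumption: the log contains a summary line of the form
#   "100% tests passed, 0 tests failed out of 280"
ctest_all_passed() {
  log_file="$1"
  # Extract the failed-test count from the last summary line in the log.
  failed=$(sed -n 's/.*tests passed, \([0-9][0-9]*\) tests failed out of [0-9][0-9]*.*/\1/p' "$log_file" | tail -n 1)
  # No summary line found: treat as failure rather than guess.
  [ -n "$failed" ] && [ "$failed" -eq 0 ]
}
```

A call such as `ctest_all_passed ctest.out && echo "all tests passed"` would then gate any follow-on step.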
That means that I have verified that the TriBITS 'install_package_by_package' target is robust to package build errors for clients that don't use the broken packages!