Exclude using the node 'white26' (TRIL-253)

James Willenbring requested to merge bartlettroscoe:tril-253-no-white26 into develop

Created by: bartlettroscoe

The build Trilinos-atdm-white-ride-cuda-9.2-gnu-7.2.0-debug has 11 tests that time out when run on node 'white26' but do not time out when run on 'white24'. We therefore assume that 'white26' is not working correctly and exclude it from the pool of nodes used to run jobs on 'white'.

NOTE: At this point, three nodes are out of the pool: 'white25' (removed from the queue by the 'white' admins), 'white26' and 'white27'. I am not sure how many usable nodes that leaves.

I tested this on 'white' using:

$ cd /home/rabartl/Trilinos.base/BUILD/WHITE/JENKINS/

$ env Trilinos_PACKAGES=Kokkos,Teuchos \
   ./ gnu-7.2.0-openmp-debug

*** ./  gnu-7.2.0-openmp-debug

ATDM_TRILINOS_DIR = '/home/rabartl/Trilinos.base/Trilinos'

Load some env to get python, cmake, etc ...

Hostname 'white11' matches known ATDM host 'white' and system 'ride'
Setting compiler and build options for buld name 'default'
Using white/ride compiler stack GNU-7.2.0 to build DEBUG code with Kokkos node type SERIAL and KOKKOS_ARCH=Power8

Running builds: gnu-7.2.0-openmp-debug

Running Jenkins driver ...

real    4m48.124s
user    0m0.682s
sys     0m0.299s

The output file Trilinos-atdm-white-ride-gnu-7.2.0-openmp-debug/smart-jenkins-driver.out showed:

+ bsub -x -Is -q rhel7F -n 16 -J Trilinos-atdm-white-ride-gnu-7.2.0-openmp-debug -W 12:00 -R 'hname!=white26&&hname!=white27' /ascldap/users/rabartl/Trilinos.base/BUILD/WHITE/JENKINS/Trilinos-atdm-white-ride-gnu-7.2.0-openmp-debug/Trilinos/cmake/ctest/drivers/atdm/
***Forced exclusive execution
Job <44407> is submitted to queue <rhel7F>.
<<Waiting for dispatch ...>>
<<Starting on white24>>

While the job was running, bjobs -l 44407 showed:

 Combined: select[(hname != white26 &&hname != white27 ) && (type == any)] orde
 Effective: select[(hname != white26 &&hname != white27 ) && (type == any)] ord
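
For illustration only, the 'hname!=...' expression passed to bsub -R above could be assembled from a list of excluded nodes rather than written by hand. The following is a minimal sketch; the function name build_exclude_select is hypothetical and is not part of the Trilinos ATDM driver scripts, and it assumes the node names contain no characters that need escaping.

```shell
#!/usr/bin/env bash

# Hypothetical helper (an assumption, not from the ATDM scripts):
# join node names into an LSF select expression of the form
# 'hname!=nodeA&&hname!=nodeB'.
build_exclude_select() {
  local expr="" node
  for node in "$@"; do
    # Add the '&&' separator before every term except the first.
    if [ -n "$expr" ]; then
      expr="${expr}&&"
    fi
    expr="${expr}hname!=${node}"
  done
  printf '%s\n' "$expr"
}

# Build the same expression that appears in the bsub call above.
build_exclude_select white26 white27
```

The resulting string would be passed as the argument to bsub -R, so adding another bad node later would only mean adding one more name to the list.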
