Closed
Issue created Mar 23, 2018 by James Willenbring (@jmwille, Owner)

Address expensive Panzer tests that timeout at 10 minutes in ATDM builds

Created by: bartlettroscoe

CC: @trilinos/panzer, @bathmatt, @fryeguy52

Next Action Status

Pushed commits 245e01d9 and d852fa33 to 'develop' to address the timeouts, which removed the timing-out tests on 3/25/2018. Addressing memory issues and re-enabling these tests will be done in follow-on issues.

Description

This story is to analyze and then address some expensive Panzer tests that routinely time out in the ATDM Trilinos builds. For example, the following CDash query lists all of the Panzer tests that timed out in these builds over the last week:

  • https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-21&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=7&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=65&value2=Panzer&field3=status&compare3=62&value3=passed&field4=status&compare4=62&value4=notrun&field5=buildstarttime&compare5=84&value5=2018-03-23&field6=buildstarttime&compare6=83&value6=2018-03-16&field7=details&compare7=63&value7=timeout
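The filter structure of this query is easier to read when it is built programmatically. The following is a minimal Python sketch that reassembles an equivalent query string from the (field, compare, value) triples that appear in the URL above. The helper name `build_query_url` is hypothetical, the numeric `compare` codes are copied verbatim from the URL rather than interpreted, and the repeated `filtercombine=and` parameter is emitted only once.

```python
from urllib.parse import urlencode

BASE_URL = "https://testing-vm.sandia.gov/cdash/queryTests.php"

# (field, compare, value) triples copied from the query URL in this issue.
FILTERS = [
    ("buildname", 65, "Trilinos-atdm-"),
    ("testname", 65, "Panzer"),
    ("status", 62, "passed"),
    ("status", 62, "notrun"),
    ("buildstarttime", 84, "2018-03-23"),
    ("buildstarttime", 83, "2018-03-16"),
    ("details", 63, "timeout"),
]


def build_query_url(project, date, filters):
    """Assemble a queryTests.php URL from a list of filter triples (hypothetical helper)."""
    params = [
        ("project", project),
        ("date", date),
        ("filtercombine", "and"),
        ("filtercount", len(filters)),
        ("showfilters", 1),
    ]
    # Filters are numbered from 1: field1/compare1/value1, field2/..., and so on.
    for i, (field, compare, value) in enumerate(filters, start=1):
        params.append((f"field{i}", field))
        params.append((f"compare{i}", compare))
        params.append((f"value{i}", value))
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("Trilinos", "2018-03-21", FILTERS)
print(url)
```

Changing the two `buildstarttime` values shifts the one-week window without touching the other filters.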

This query shows the following 6 timing out tests:

  • PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4
  • PanzerAdaptersSTK_main_driver_energy-ss-loca-eigenvalue
  • PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2
  • PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3
  • PanzerAdaptersSTK_PoissonInterfaceExample_2d_diffsideids_MPI_1
  • PanzerAdaptersSTK_PoissonInterfaceExample_2d_MPI_4

These timeouts occur in the following builds:

  • Trilinos-atdm-hansen-shiller-cuda-debug
  • Trilinos-atdm-hansen-shiller-cuda-opt
  • Trilinos-atdm-hansen-shiller-intel-debug-serial
  • Trilinos-atdm-white-ride-cuda-debug
  • Trilinos-atdm-white-ride-cuda-opt
  • Trilinos-atdm-white-ride-gnu-debug-openmp

As was discovered in https://github.com/trilinos/Trilinos/issues/2318#issuecomment-375494367, many of these tests will actually complete if the timeouts are increased. In particular, for the CUDA builds on hansen/shiller, the following 5 tests all passed once the timeouts were increased to over 40 minutes:

  • PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4
  • PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2
  • PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3
  • PanzerAdaptersSTK_PoissonInterfaceExample_2d_diffsideids_MPI_1
  • PanzerAdaptersSTK_PoissonInterfaceExample_2d_MPI_4

The only test missing from the above list for CUDA builds on hansen/shiller was PanzerAdaptersSTK_main_driver_energy-ss-loca-eigenvalue, and that test timed out only on the Trilinos-atdm-white-ride-cuda-opt build.

This issue is to investigate these tests further and then decide how to address them.

Tasks:

  1. Inspect the timing out tests in the last week on all builds of Trilinos ... All can be addressed by increasing timeouts, plus one disable (see below) [DONE]
  2. Increase timeouts on all of the timing out Panzer tests in the last week to 45 minutes and set CATEGORIES NIGHTLY ...
  3. See if these tests pass with longer timeouts in automated builds and see what their runtimes are when they are displayed on CDash ...
  4. Decrease the timeouts for some of the tests that are not taking 45 minutes to complete ...
  5. ???
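
Task 4 amounts to picking a per-test timeout from its measured runtime. The following is a minimal Python sketch of one possible rule, not the rule used in the actual commits. It assumes a safety factor of 1.5 and rounding up to whole minutes (both assumptions, not taken from this issue). The 600 s floor and 2700 s cap correspond to the 10-minute and 45-minute timeouts mentioned above. The function name `suggest_timeout_seconds` is hypothetical.

```python
import math

DEFAULT_TIMEOUT_S = 10 * 60  # 10-minute timeout named in the issue title
MAX_TIMEOUT_S = 45 * 60      # 45-minute timeout from task 2


def suggest_timeout_seconds(runtime_s, safety_factor=1.5):
    """Suggest a timeout in seconds for a test with the given measured runtime.

    The safety factor and the round-up-to-a-minute rule are illustrative
    assumptions; the result is clamped to [DEFAULT_TIMEOUT_S, MAX_TIMEOUT_S].
    """
    padded = runtime_s * safety_factor
    rounded = math.ceil(padded / 60) * 60
    return max(DEFAULT_TIMEOUT_S, min(rounded, MAX_TIMEOUT_S))


# Illustrative runtimes only, not measurements from CDash.
for runtime in (120, 900, 2400):
    print(runtime, "->", suggest_timeout_seconds(runtime))
```

With this rule, a test that never exceeds the default keeps the 10-minute timeout, and no test is given more than the 45 minutes set in task 2.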

Related Issues

  • Related to #2318 (closed)