Skip to content
GitLab
Projects Groups Topics Snippets
  • /
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
  • T Trilinos
  • Project information
    • Project information
    • Activity
    • Labels
    • Members
  • Repository
    • Repository
    • Files
    • Commits
    • Branches
    • Tags
    • Contributor statistics
    • Graph
    • Compare revisions
  • Issues 936
    • Issues 936
    • List
    • Boards
    • Service Desk
    • Milestones
  • Merge requests 22
    • Merge requests 22
  • CI/CD
    • CI/CD
    • Pipelines
    • Jobs
    • Schedules
  • Deployments
    • Deployments
    • Environments
    • Releases
  • Monitor
    • Monitor
    • Incidents
  • Analytics
    • Analytics
    • Value stream
    • CI/CD
    • Repository
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Activity
  • Graph
  • Create a new issue
  • Jobs
  • Commits
  • Issue Boards
Collapse sidebar
  • James Willenbring
  • Trilinos
  • Issues
  • #229
Closed
Open
Issue created Mar 18, 2016 by James Willenbring@jmwilleOwner

Make Teuchos Memory Management Classes thread-safe

Created by: bartlettroscoe

This story is to address the long-standing problem that the Teuchos Memory management Classes which use reference-counting are not thread safe.

CC: @MicheldeMessieres, @jwillenbring,

Next Action Status: See tasks ...

Tasks:

  1. Initial development and testing for multi-thread correctness [Done]
  2. Add configure time switch for thread safety: Define configure-time options Trilinos_ENABLE_THREAD_SAFE and Teuchos_ENABLE_THREAD_SAFE (latter is given the given the default of the former value).
  3. Turn off for Trilinos_ENABLE_CXX11=OFF: That is, set Teuchos_ENABLE_THREAD_SAFE=OFF in this case. Run full Trilinos test suite with -DTrilinos_ENABLE_CXX11=OFF.
  4. Update the Teuchos test suite:
    • Inform CTest of number of threads for thread-safe tests: Figure this out at configure time and then set NUM_TOTAL_CORES_USED (see [TRIBITS_ADD_TEST())(https://tribits.org/doc/TribitsDevelopersGuide.html#formal-arguments-tribits-add-test))
    • Make pre-push BASIC test suite fast: Make the longer running threading tests NIGHTLY.
  5. Performance testing:
    • For builds:
      • -DCMAKE_BUILD_TYPE=RELEASE -DTrilinos_ENABLE_DEBUG=ON (Trilinos_ENABLE_THREAD_SAFE on and off)
      • -DCMAKE_BUILD_TYPE=RELEASE -DTrilinos_ENABLE_DEBUG=OFF (Trilinos_ENABLE_THREAD_SAFE on and off)
    • For compilers:
      • GCC version 4.8.x .
      • Intel version 15.x
      • Clang X
    • Run Trilinos (nearly full) test suite with and without thread-safety turned on.
    • Run Nalu, Albany, and Drekar test suites with thread safety on and off and see the performance impact with debug-mode checking turned on.
    • Request report from Cedric about usage and performance.
    • If performance okay, continue. Otherwise, decide what to do.
  6. Disallow throwing exceptions from destructors: We just need to disallow exceptions and make Teuchos MM classes abort in destructors when errors occur. Update unit tests for the case of circular references and exceptions. Need to provide TEUCHOS_ABORT_IF(<condition>) that will print and then call abort.
  7. Merge into develop branch with Trilinos_ENABLE_THREAD_SAFE=OFF by default:
    • Update teuchos/ReleaseNotes.txt to discuss exception destructor difference.
    • Announce time schedule for turning this on by default.
  8. Update documentation / Code review:
    • Update unit test documentation: With final tests in place, will create a uniformly formatted summary for each in code to describe it’s purpose.
    • Update RCP documentation: Need to update RCP documents to reflect these changes
    • Ross reviews code, tests, and updated documentation.
  9. Turn on Trilinos_ENABLE_THREAD_SAFE=OFF by default:
    • Update teuchos/ReleaseNotes.txt
    • Send out announcement
  10. Other considerations and improvements: (move to new stories?)
    1. Review Array.h mutex implementation: This was new code I added after our last review to make Array.h thread safe - I have implemented suggested tests we discussed on Github.
    2. Discuss plan for debug detection of dangling weak ptr. Debug builds have checks to validate weak ptrs but those checks can fail if another thread kills the data. I’ve got tests in place which detect and demonstrate this issue but need to discuss further how we would like to address this.
    3. Consider additional changes for ArrayView, ArrayRCP, Tuple, Ptr: Implemented fairly limited sanity checks on these.
    4. Weak to strong conversion: Have code in place which implements thread safe upgrade of a weak ptr to a strong ptr, along with a unit test, but the role of this is unclear at the moment.
    5. Make tests have inverted case: Tests should demonstrate they can detect thread problems when the fix is not applied - the inverted case. I’ve got some #defines set up to do this but wanted to discuss how to best organize those. Many of the inverted tests will need separate main functions.
Assignee
Assign to
Time tracking