Create GPU-capable Thyra/Tpetra adapters for linear iterative algorithms
Created by: bartlettroscoe
Next Action Status:
PR #1442 implementing this merged to 'develop'. Next: Wait for feedback from automated testing and customers ...
CC List: @rppawlo
Description:
The current Thyra-Tpetra adapters simply inherit from Thyra::SpmdMultiVectorBase and implement the getNonconstLocalDataImpl() and getLocalDataImpl() methods to return pointers to the Tpetra data. For a GPU node, this has the effect of copying the data to the host so that Thyra (or, more likely, an RTOp) can manipulate it there, and then copying it back if it was modified. What this work needs is a new Thyra-Tpetra adapter that calls native Tpetra methods on Tpetra::MultiVector.
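To make the cost difference concrete, here is a minimal sketch in plain C++ with no Trilinos dependency. MockDeviceVector, scaleViaHost() and the transfer counter are hypothetical stand-ins invented for illustration; they are not Thyra or Tpetra classes, and the "device" storage is just a std::vector. The sketch only shows that the host-access path costs two transfers per modifying operation while a native call costs none.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a vector whose data lives in device memory.
// It only counts host<->device transfers; it is not a Tpetra class.
class MockDeviceVector {
public:
  explicit MockDeviceVector(std::vector<double> values)
    : device_(std::move(values)) {}

  // Host-access path: hand out a host copy of the data (one transfer).
  std::vector<double> copyToHost() const {
    ++transfers_;
    return device_;
  }

  // Write modified host data back to the device (one more transfer).
  void copyFromHost(const std::vector<double>& host) {
    assert(host.size() == device_.size());
    ++transfers_;
    device_ = host;
  }

  // Native path: the operation runs where the data lives, no transfer.
  double nativeNorm2() const {
    double sum = 0.0;
    for (double x : device_) sum += x * x;
    return std::sqrt(sum);
  }

  void nativeScale(double alpha) {
    for (double& x : device_) x *= alpha;
  }

  int transfers() const { return transfers_; }

private:
  std::vector<double> device_;
  mutable int transfers_ = 0;
};

// Roughly what a host-side transformation operator does through the
// local-data accessors: copy out, modify on the host, copy back.
void scaleViaHost(MockDeviceVector& v, double alpha) {
  std::vector<double> host = v.copyToHost();
  for (double& x : host) x *= alpha;
  v.copyFromHost(host);
}
```

Calling scaleViaHost() once raises the transfer count by two, whereas nativeScale() and nativeNorm2() leave it unchanged. In an iterative solver these operations run every iteration, so the transfers accumulate quickly.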
The initial scope of this work is limited to avoiding RTOp calls in the linear iterative algorithms in Belos and Anasazi. In particular, the goal is to avoid data copies and RTOps for:
- Anasazi::BlockKrylovSchur and Anasazi::BlockKrylovSchurSolverManager
- Belos::BlockGmresSolMgr via Stratimikos DefaultLinearSolverBuilder
A tentative list of arithmetic methods needed by Anasazi BKS is:
- MVT::SetBlock(), which calls Thyra::assign()
- MVT::MvRandom(), which calls Thyra::randomize()
- MVT::MvTimesMatAddMv(), which calls MultiVectorBase::apply()
- MVT::MvNorm(), which calls Thyra::norms_2()
- MVT::MvAddMv(), which calls Thyra::linear_combination()
Belos::BlockGmresIter and BlockGmresSolMgr require similar functionality.
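For reference, the arithmetic behind four of the MVT operations listed above can be written out in plain C++. This is only a sketch of the math, assuming the usual traits semantics (MvAddMv computes mv = alpha*A + beta*B, MvTimesMatAddMv computes mv = alpha*A*B + beta*mv, MvNorm returns the 2-norm of each column). DenseMV and the free functions are hypothetical names, not Anasazi or Thyra code; the small dense matrix B is represented by the same struct for brevity, and MvRandom is omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense multivector: rows x cols, stored column by column.
struct DenseMV {
  std::size_t rows, cols;
  std::vector<double> data;  // entry (i, j) is data[j * rows + i]
  DenseMV(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return data[j * rows + i]; }
  double operator()(std::size_t i, std::size_t j) const { return data[j * rows + i]; }
};

// SetBlock: copy the leading columns of A into the columns of mv named by index.
void setBlock(const DenseMV& A, const std::vector<std::size_t>& index, DenseMV& mv) {
  assert(A.rows == mv.rows && index.size() <= A.cols);
  for (std::size_t k = 0; k < index.size(); ++k) {
    assert(index[k] < mv.cols);
    for (std::size_t i = 0; i < A.rows; ++i) mv(i, index[k]) = A(i, k);
  }
}

// MvAddMv: mv = alpha * A + beta * B.
void mvAddMv(double alpha, const DenseMV& A, double beta, const DenseMV& B, DenseMV& mv) {
  assert(A.rows == B.rows && A.cols == B.cols);
  assert(A.rows == mv.rows && A.cols == mv.cols);
  for (std::size_t n = 0; n < mv.data.size(); ++n)
    mv.data[n] = alpha * A.data[n] + beta * B.data[n];
}

// MvTimesMatAddMv: mv = alpha * A * B + beta * mv (mv must not alias A).
void mvTimesMatAddMv(double alpha, const DenseMV& A, const DenseMV& B,
                     double beta, DenseMV& mv) {
  assert(A.cols == B.rows && mv.rows == A.rows && mv.cols == B.cols);
  for (std::size_t j = 0; j < mv.cols; ++j) {
    for (std::size_t i = 0; i < mv.rows; ++i) {
      double sum = 0.0;
      for (std::size_t k = 0; k < A.cols; ++k) sum += A(i, k) * B(k, j);
      mv(i, j) = alpha * sum + beta * mv(i, j);
    }
  }
}

// MvNorm: 2-norm of each column.
std::vector<double> mvNorm(const DenseMV& mv) {
  std::vector<double> norms(mv.cols, 0.0);
  for (std::size_t j = 0; j < mv.cols; ++j) {
    double sum = 0.0;
    for (std::size_t i = 0; i < mv.rows; ++i) sum += mv(i, j) * mv(i, j);
    norms[j] = std::sqrt(sum);
  }
  return norms;
}
```

Each of these is a simple loop or small matrix product, so each has an obvious native counterpart that can run where the data lives instead of going through a host copy.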
It appears that the existing adapter between Thyra::LinearOpBase and Tpetra::Operator is fine as it is.
NOTE: This is the GitHub version of the Trilinos Bugzilla ticket #5837 "Create GPU-capable Thyra adaptors for Tpetra". This work will now be tracked in this GitHub issue; the Bugzilla ticket is referenced only for historical purposes.
Definition of Done:
???
Tasks:
- Provide a complete accounting of all of the Belos MultiVector traits functions that are called for the targeted Belos GMRES solver (set up a test case for this purpose). (Hint: Run a profiler or use Teuchos::TimeMonitor to determine this.)
- Determine plan for adding minimal new virtual functions to Thyra::MultiVectorBase to replace the majority of the calls to RTOps.
- Add new pure virtual functions assign() and linearCombination() to the Thyra::MultiVectorBase interface to replace as many RTOps as possible.
- ???
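One possible shape for the interface tasks above, sketched in plain C++ without Trilinos. Everything here is hypothetical: MultiVecSketch, ElementwiseDefault, NativeVec and HostVec are illustration-only names, linearCombination() is reduced to a single term (this = alpha*x + beta*this), and the element-wise loops merely stand in for an RTOp-based default. The idea is that the base interface declares the operations as pure virtual, a default layer supplies a generic fallback, and the adapter overrides them natively when the argument is of its own type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced stand-in for a multivector interface.
class MultiVecSketch {
public:
  virtual ~MultiVecSketch() = default;
  virtual std::size_t size() const = 0;
  virtual double get(std::size_t i) const = 0;  // generic element access
  virtual void set(std::size_t i, double v) = 0;
  // New pure virtual operations that every concrete class must provide.
  virtual void assign(const MultiVecSketch& src) = 0;
  // this = alpha * x + beta * this
  virtual void linearCombination(double alpha, const MultiVecSketch& x, double beta) = 0;
};

// Generic fallback, standing in for an RTOp-based default implementation.
class ElementwiseDefault : public MultiVecSketch {
public:
  void assign(const MultiVecSketch& src) override {
    assert(src.size() == size());
    for (std::size_t i = 0; i < size(); ++i) set(i, src.get(i));
  }
  void linearCombination(double alpha, const MultiVecSketch& x, double beta) override {
    assert(x.size() == size());
    for (std::size_t i = 0; i < size(); ++i) set(i, alpha * x.get(i) + beta * get(i));
  }
};

// Plain host vector that relies entirely on the generic fallback.
class HostVec : public ElementwiseDefault {
public:
  explicit HostVec(std::vector<double> v) : data_(std::move(v)) {}
  std::size_t size() const override { return data_.size(); }
  double get(std::size_t i) const override { return data_[i]; }
  void set(std::size_t i, double v) override { data_[i] = v; }
private:
  std::vector<double> data_;
};

// Adapter that overrides the operations to work on its own storage directly.
class NativeVec : public ElementwiseDefault {
public:
  explicit NativeVec(std::vector<double> v) : data_(std::move(v)) {}
  std::size_t size() const override { return data_.size(); }
  double get(std::size_t i) const override { return data_[i]; }
  void set(std::size_t i, double v) override { data_[i] = v; }

  void assign(const MultiVecSketch& src) override {
    if (const auto* n = dynamic_cast<const NativeVec*>(&src)) {
      assert(n->data_.size() == data_.size());
      data_ = n->data_;  // native path: whole-array copy
      ++nativeCalls_;
    } else {
      ElementwiseDefault::assign(src);  // unknown type: generic fallback
    }
  }

  void linearCombination(double alpha, const MultiVecSketch& x, double beta) override {
    if (const auto* n = dynamic_cast<const NativeVec*>(&x)) {
      assert(n->data_.size() == data_.size());
      for (std::size_t i = 0; i < data_.size(); ++i)
        data_[i] = alpha * n->data_[i] + beta * data_[i];
      ++nativeCalls_;
    } else {
      ElementwiseDefault::linearCombination(alpha, x, beta);
    }
  }

  int nativeCalls() const { return nativeCalls_; }

private:
  std::vector<double> data_;
  int nativeCalls_ = 0;
};
```

Code written against MultiVecSketch needs no changes: a call through the base reference takes the native branch when both operands are NativeVec and falls back to the generic loop for mixed types. Keeping a default layer between the interface and the adapters would also let existing subclasses compile without implementing the new functions themselves, though whether that is the right trade-off here is an open design question.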